From idea to production in a day
Leveraging Azure ML and Streamlit to build and user test machine learning ideas quickly
Florian Roscheck
PyCon DE & PyData Berlin 2024
What are we building?
What is our tech stack?
How do we use it to build + test quickly?
Hi, I’m Florian!
Florian Roscheck
• Sr. Data Scientist at Henkel
• Instructor for Apache Spark with 7k+ students
• Vice President, NumFOCUS Affiliated Project Selection Committee
• Active on LinkedIn
WHAT TO BUILD IN ONE DAY?
A Minimum Viable Product
• Enough features to be usable
• Ability to collect user feedback
BUILD, MEASURE, LEARN
• To learn about users quickly, we want to implement a build-measure-learn loop
• To make users happier over time, we aim to create a data flywheel
Ready?
• Data not in place
• Environment issues
• Lost in modeling
• Inappropriate user interface
• No feedback about use
• Difficult collaboration
Data not in place: GET DATA BEFOREHAND
HOW TO BUILD IN ONE DAY?
A Time-Saving Stack
• Environment issues, difficult collaboration: Azure ML Notebooks
• Lost in modeling: Automated ML on Azure
• Inappropriate user interface: Streamlit
• No feedback about use: Azure Application Insights + Streamlit
Let’s build!
Example: Trash Recognizer App
• Customer: Waste management company
• Need: Evaluate computer vision solutions for recognizing trash
• Idea: Waste management professionals manually evaluate performance through an app with feedback functionality
Our Plan (BUILD, MEASURE, LEARN)
1. Get Data
2. Train Model
3. Build App
4. Deploy App with Model
5. Collect Feedback
Training Data: TACO Trash Image Dataset
• TACO: Trash Annotations in Context
• Dataset of 1.5k images with 4.7k+ annotations
• Annotations for 60 categories, incl. backgrounds
• Open source
Proença, P. F., & Simões, P. (2020). TACO: Trash Annotations in Context for Litter Detection. arXiv preprint arXiv:2003.06975.
Source: tacodataset.org
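A quick sanity check of the annotations before training could look like the sketch below. It assumes the annotations are stored in the COCO JSON layout (images, categories, annotations). The tiny in-memory sample, its file names, and its category names are made up for illustration and are not taken from TACO.

```python
import json
from collections import Counter

# Tiny stand-in for an annotations file in COCO layout (hypothetical values).
SAMPLE = json.loads("""
{
  "images": [{"id": 1, "file_name": "batch_1/000001.jpg"},
             {"id": 2, "file_name": "batch_1/000002.jpg"}],
  "categories": [{"id": 0, "name": "Bottle"}, {"id": 1, "name": "Can"}],
  "annotations": [
    {"id": 10, "image_id": 1, "category_id": 0, "bbox": [10, 20, 50, 80]},
    {"id": 11, "image_id": 1, "category_id": 1, "bbox": [70, 30, 40, 40]},
    {"id": 12, "image_id": 2, "category_id": 0, "bbox": [5, 5, 60, 90]}
  ]
}
""")


def annotations_per_category(coco: dict) -> Counter:
    """Count annotations per category name in a COCO-style dict."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    return Counter(names[ann["category_id"]] for ann in coco["annotations"])


counts = annotations_per_category(SAMPLE)
print(counts.most_common())
```

With a real file, the same function could be fed the result of `json.load(open(path))`. Rare categories show up at the end of `most_common()`.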
What is Azure ML?
• Cloud-based ML platform by Microsoft
• Run ad-hoc analyses with Jupyter Notebooks
• Run and track machine learning experiments through tight integration with MLflow
• Version data and models
• Build complex and reproducible modeling pipelines
• Deploy models as APIs
[Screenshot of Azure ML web app]
Basics of Getting Data
Diagram: TACO GitHub repo, Azure ML Workspace, Azure blob storage
Azure ML Notebook: 0_prepare_dataset.ipynb
• Like Jupyter Notebook
• Managed environment
• Sharable
• Runs on compute in workspace
Data Asset: TACO-annotations
• Azure ML-managed
• Like a mask for files
• Sharable
• Version controlled
• Interactively explorable
ENABLERS: Reproducible environment, Collaboration-ready
[Movie]
Automated Machine Learning on Azure ML
• Automated ML: Tries different models and hyperparameters that are selected automatically
• We have very little time for modeling!
• Depending on data and problem type, automated machine learning can provide a reasonable starting point for modeling with a high return on time investment
• Azure ML offers automated ML pipelines for several common tasks, incl. classification, regression, forecasting, NLP, and computer vision
ENABLERS: Modeling time saver
Setting Up AutoML Through Code
Azure ML Notebook: 1_training.ipynb
1. Create compute cluster
2. Define training job
3. Submit job to compute cluster
Azure ML Workspace: TACO-annotations, TACO-training
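The three steps could be sketched with the Azure ML Python SDK v2 (`azure-ai-ml`) roughly as follows. This is not the code from `1_training.ipynb`. The task type (object detection), the cluster name, the VM size, the job limits, and the use of `TACO-training` as the experiment name are all assumptions made for illustration. The SDK is imported lazily, so the settings can be read without it installed.

```python
# Cluster settings: numbers mirror the talk's cluster tips; names are placeholders.
CLUSTER_SETTINGS = {
    "name": "gpu-cluster",               # hypothetical cluster name
    "size": "Standard_NC8as_T4_v3",      # assumed VM size (a T4 GPU SKU)
    "min_instances": 0,                  # scale to zero when idle
    "max_instances": 4,                  # up to 4 trials in parallel
    "idle_time_before_scale_down": 120,  # seconds before auto shutdown
    "tier": "low_priority",              # evictable, cheaper nodes
}

# Hypothetical job limits, not taken from the talk.
JOB_LIMITS = {"timeout_minutes": 180, "max_trials": 10, "max_concurrent_trials": 4}


def submit_automl_job(ml_client, training_data_path: str):
    """Create the cluster, define an AutoML image job, and submit it.

    `ml_client` is an authenticated azure.ai.ml.MLClient;
    `training_data_path` points to an MLTable with the training data.
    """
    from azure.ai.ml import Input, automl
    from azure.ai.ml.constants import AssetTypes
    from azure.ai.ml.entities import AmlCompute

    # 1. Create compute cluster
    cluster = AmlCompute(**CLUSTER_SETTINGS)
    ml_client.compute.begin_create_or_update(cluster).result()

    # 2. Define training job (object detection is an assumption)
    job = automl.image_object_detection(
        compute=CLUSTER_SETTINGS["name"],
        experiment_name="TACO-training",  # assumed to be the experiment name
        training_data=Input(type=AssetTypes.MLTABLE, path=training_data_path),
        target_column_name="label",
    )
    job.set_limits(**JOB_LIMITS)

    # 3. Submit job to compute cluster
    return ml_client.jobs.create_or_update(job)
```

Keeping the settings in plain dictionaries makes them easy to review with colleagues before any compute is started.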
Compute Cluster Creation Tips & Tricks
Save Money
• Shut down unused instances: 120 seconds to auto shutdown
• Pick auto-evictable machine (own case: 80% cheaper)
Pick a Fitting Compute
• 4 experiments can run in parallel
• Tesla T4 GPU w/ 16 GB memory, 56 GB RAM, 8 vCPUs, but many options available!
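To get a feeling for what these two money-saving settings mean, here is a small back-of-the-envelope calculation. The hourly price of 1.00 and the 3 hours of busy time are hypothetical. Only the 80% discount, the 120-second shutdown delay, and the 4 nodes come from the tips above.

```python
def cluster_cost(busy_hours, hourly_price, nodes=1,
                 low_priority_discount=0.0, idle_seconds=0):
    """Estimate cost: each node is billed for busy time plus idle time before shutdown."""
    billed_hours = busy_hours + idle_seconds / 3600
    return nodes * billed_hours * hourly_price * (1 - low_priority_discount)


# Hypothetical price of 1.00 per node-hour; 3 h of training on 4 nodes.
dedicated = cluster_cost(3, 1.00, nodes=4, idle_seconds=120)
low_priority = cluster_cost(3, 1.00, nodes=4,
                            low_priority_discount=0.8, idle_seconds=120)
# Same job, but the cluster is left running for 8 more hours instead of 120 s.
left_running = cluster_cost(3, 1.00, nodes=4, idle_seconds=8 * 3600)

savings = 1 - low_priority / dedicated
print(f"dedicated: {dedicated:.2f}, low priority: {low_priority:.2f}, "
      f"left running: {left_running:.2f}")
```

The estimate ignores that evictable nodes can be preempted, which may lengthen a job.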
Increasing Efficiency for Automated ML on Azure
• Set ML parameters based on your data science knowledge
  - Train/test/validation split, cross-validation, etc.
  - Hyperparameter selection strategy
  - Restrict hyperparameter search space
• Set job limits
  - Max no. of trials
  - Max runtime per trial or of all trials
  - Termination based on score
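How the three job limits bound a search can be illustrated with a toy random search in plain Python. This is not how Azure AutoML is implemented. The function, its parameter names, and the fake scoring function are made up for illustration.

```python
import random
import time


def run_search(train_trial, search_space, max_trials, max_total_seconds, target_score):
    """Random search over a restricted space that stops at the first limit reached."""
    rng = random.Random(0)
    start = time.monotonic()
    best_params, best_score = None, float("-inf")
    for _ in range(max_trials):                            # limit: max no. of trials
        if time.monotonic() - start > max_total_seconds:   # limit: runtime of all trials
            return best_params, best_score, "time limit"
        params = {name: rng.choice(values) for name, values in search_space.items()}
        score = train_trial(params)
        if score > best_score:
            best_params, best_score = params, score
        if score >= target_score:                          # limit: termination on score
            return best_params, best_score, "target score reached"
    return best_params, best_score, "max trials"


# A deliberately small search space and a fake scoring function instead of training.
SPACE = {"learning_rate": [0.001, 0.01], "epochs": [5, 10]}
calls = []


def fake_trial(params):
    calls.append(params)
    return 0.5


_, _, reason = run_search(fake_trial, SPACE, max_trials=5,
                          max_total_seconds=60, target_score=0.9)
print(reason, len(calls))
```

Whichever limit triggers first ends the search, so the worst-case cost is known before the job starts.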
3 Hours Later
• Our annotations, linked to the job
• MLflow model!
• Models ordered by performance
• Azure AutoML experimented with a single model type
Training Results
[Movie]
How to Dig Deeper
• Metrics are comprehensive and look great, but what are we looking at?
• More details in the logs: [Section of std_log.txt file in Outputs + logs tab]
• Tip: Read Azure ML documentation!
• You still need data science knowledge to understand what Azure ML is doing here.
We have a model!
The Power of ONNX for Model Packaging
• Great: Azure ML packaged model in MLflow format
• The Issue: Tight MLflow model dependencies restrict platforms where the model can be used
  - 204 (!) pinned dependencies, incl. 31 Azure-specific packages
  - Experienced issues installing some (azureml-dataprep-native) on macOS (M1)
• The Solution: Use the ONNX model file (byproduct of Azure AutoML training) with a single dependency: onnxruntime
  - ONNX (Open Neural Network Exchange): Open standard for deep learning models, makes models work across frameworks
  - ONNX Runtime: Cross-platform, open source ML model accelerator
You can now use your AutoML-trained model outside of Azure!
ENABLERS: Flexible model deployment
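Loading and calling the ONNX file could look like the sketch below. The function names and the model path are placeholders. The expected input shape and dtype, and the meaning of the outputs, depend on the exported model and are not shown in the talk. `onnxruntime` is imported lazily inside `load_session`.

```python
def load_session(model_path: str):
    """Open an ONNX model with ONNX Runtime on the CPU."""
    import onnxruntime as ort  # the single runtime dependency

    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])


def run_inference(session, batch):
    """Feed one preprocessed batch (a NumPy array) to the model's first input."""
    input_name = session.get_inputs()[0].name
    # Passing None as output names returns all model outputs as a list.
    return session.run(None, {input_name: batch})


# Usage (hypothetical path; `batch` must match the model's input shape and dtype):
# session = load_session("model.onnx")
# outputs = run_inference(session, batch)
```

Because `run_inference` only relies on `get_inputs()` and `run()`, it can be unit-tested with a stand-in session object.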
Building an App with Streamlit
• Streamlit is an open-source app framework for creating data-based web apps in Python
• My experience with Streamlit:
  - The Good: Very easy and fast to code and use, apps look great and work. Wow!
  - The Good-to-Know: Complex workflows with state management are harder to program; apps may be perceived as slow by users in comparison to “professional” web apps
• Streamlit is perfect for getting a user-facing app off the ground and testing your data-based product ideas!
[Streamlit logo, see streamlit.io]
Streamlit App Example
Streamlit App Blueprint
What the user sees:
• Trash Recognizer
• [Model + data description]
• Upload image(s)
• Detected Trash
  - 2 items for yellow trash can
  - 1 item for blue trash can
• No trash detected.
• Detected Trash
  - 1 item for other trash can
What the app does:
1. Load ONNX model
2. Preprocess images
3. Run model inference
4. Postprocess images
5. Display results
ENABLERS: Easy-to-use interface
[Movie]
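The blueprint could translate into a Streamlit script along these lines. This is a sketch, not the app from the talk. The label-to-trash-can mapping and the `detect` callable (standing in for steps 1 to 4) are hypothetical. Streamlit is imported lazily, so the summary logic can be tested on its own.

```python
from collections import Counter

# Hypothetical mapping from detected labels to trash cans.
BIN_FOR_LABEL = {"bottle": "yellow", "can": "yellow", "paper": "blue"}


def summarize(labels):
    """Turn detected labels into lines like '2 items for yellow trash can'."""
    if not labels:
        return ["No trash detected."]
    bins = Counter(BIN_FOR_LABEL.get(label, "other") for label in labels)
    return [
        f"{count} item{'s' if count != 1 else ''} for {bin_name} trash can"
        for bin_name, count in bins.most_common()
    ]


def main(detect):
    """Render the page; `detect` maps an uploaded file to a list of labels."""
    import streamlit as st

    st.title("Trash Recognizer")
    uploads = st.file_uploader(
        "Upload image(s)", type=["jpg", "jpeg", "png"], accept_multiple_files=True
    )
    for upload in uploads or []:
        st.image(upload)
        # 5. Display results
        for line in summarize(detect(upload)):
            st.write(line)
```

Saved as a script that ends with a call to `main(...)`, this would be started with `streamlit run`.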
Thanks, Data Science Engineering Team!
Deployment Pipeline
• Henkel Data Science Engineering team developed a pipeline for one-click deployment of data science infrastructure, incl. Streamlit apps, on secure Azure cloud infrastructure
• Open sourced via article series “Kickstarting Data Science Projects in Azure DevOps” by Roberto Alonso
• Part 1 and 2 already available on Henkel Data & Analytics Blog: medium.com/henkel-data-and-analytics
Collecting Feedback
• streamlit-feedback: Open-source feedback plugin for Streamlit
• Python logging: Use AzureLogHandler through opencensus logging extension
• Azure Application Insights: Collect logs from application, query with Kusto Query Language (KQL)
• Azure Dashboards: Interactive live dashboards on Azure for application metrics
[Movie]
ENABLERS: Easy and fast measurement
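The logging part of this chain could be wired up as in the following sketch. The logger name, the event name `user_feedback`, and the logged fields are placeholders. The connection string would come from your own Application Insights resource. `AzureLogHandler` is only imported when a connection string is given, so the code also runs locally without Azure.

```python
import logging


def build_logger(connection_string=None):
    """Return a logger that optionally forwards records to Application Insights."""
    logger = logging.getLogger("trash_recognizer.feedback")  # hypothetical name
    logger.setLevel(logging.INFO)
    if connection_string:
        from opencensus.ext.azure.log_exporter import AzureLogHandler

        logger.addHandler(AzureLogHandler(connection_string=connection_string))
    return logger


def log_feedback(logger, image_name, predicted_labels, thumbs_up):
    """Log one feedback event; custom_dimensions become queryable fields."""
    logger.info(
        "user_feedback",
        extra={
            "custom_dimensions": {
                "image": image_name,
                "predicted": ",".join(predicted_labels),
                "thumbs_up": str(thumbs_up),
            }
        },
    )


logger = build_logger()  # no connection string: nothing is sent to Azure
log_feedback(logger, "example.jpg", ["bottle", "can"], True)
```

In the app, `log_feedback` would be called with whatever the feedback widget returns, for example a thumbs-up or thumbs-down from streamlit-feedback.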
BUILD, MEASURE, LEARN
ENABLERS:
• Reproducible environment
• Collaboration-ready
• Modeling time saver
• Flexible model deployment
• Easy-to-use interface
• Easy and fast measurement
• Learning culture
Code, Slides, Details: github.com/flrs/build_and_test_ml_quickly
Thanks
PyData team + sponsors, Henkel, incl. Henkel Data Science CoE team, open-source contributors for TACO, ONNX, ONNX Runtime, Streamlit, streamlit-feedback, Streamlit for reaching out before talk
Learn More
• Build-Measure-Learn Loop: The Lean Startup | Methodology
• Data Flywheel: Data Flywheel: Scaling a world-class data strategy
• Dataset: tacodataset.org
• Automated Machine Learning on Azure: What is automated ML?
• ONNX: ONNX Runtime, ONNX File Format
• Streamlit: Get started with Streamlit, streamlit-feedback
• Azure Tricks for Data Science: Henkel Data & Analytics Blog
• Logging to Azure from Python: Monitor Python applications
• Azure Dashboards: Dashboards of Azure Log Analytics data
• A similar project: Instance Segmentation with Azure Machine Learning
Photo credits, in order of appearance: Greg Rakozy, Canva Studio, Sewupari Studio, Massimo Botturi, Charlotte Coneybeer, Desola Landre Ologun, Studio Saiz, Claudio Schwarz, NASA, Alena Darmel, Anna Shvetz, Vadim B, The Lucky Neko, Visual Tag Mx; User icon from “Redefining Women” icon collection by Iconathon
What are your questions?
Let’s connect!
Florian Roscheck, Sr. Data Scientist
linkedin.com/in/florianroscheck
github.com/flrs

More Related Content

Similar to From idea to production in a day – Leveraging Azure ML and Streamlit to build and user test machine learning ideas quickly

2020 10 22 AI Fundamentals - Azure Machine Learning
2020 10 22 AI Fundamentals - Azure Machine Learning2020 10 22 AI Fundamentals - Azure Machine Learning
2020 10 22 AI Fundamentals - Azure Machine LearningBruno Capuano
 
innovations born in the cloud - cloud data services from IBM to prototype you...
innovations born in the cloud - cloud data services from IBM to prototype you...innovations born in the cloud - cloud data services from IBM to prototype you...
innovations born in the cloud - cloud data services from IBM to prototype you...Wilfried Hoge
 
IBM Bluemix OpenWhisk: Interconnect 2016, Las Vegas: CCD-1088: The Future of ...
IBM Bluemix OpenWhisk: Interconnect 2016, Las Vegas: CCD-1088: The Future of ...IBM Bluemix OpenWhisk: Interconnect 2016, Las Vegas: CCD-1088: The Future of ...
IBM Bluemix OpenWhisk: Interconnect 2016, Las Vegas: CCD-1088: The Future of ...OpenWhisk
 
Day 13 - Creating Data Processing Services | Train the Trainers Program
Day 13 - Creating Data Processing Services | Train the Trainers ProgramDay 13 - Creating Data Processing Services | Train the Trainers Program
Day 13 - Creating Data Processing Services | Train the Trainers ProgramFIWARE
 
SplunkLive London 2014 Developer Presentation
SplunkLive London 2014  Developer PresentationSplunkLive London 2014  Developer Presentation
SplunkLive London 2014 Developer PresentationDamien Dallimore
 
Consolidating MLOps at One of Europe’s Biggest Airports
Consolidating MLOps at One of Europe’s Biggest AirportsConsolidating MLOps at One of Europe’s Biggest Airports
Consolidating MLOps at One of Europe’s Biggest AirportsDatabricks
 
.NET per la Data Science e oltre
.NET per la Data Science e oltre.NET per la Data Science e oltre
.NET per la Data Science e oltreMarco Parenzan
 
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning InfrastructureML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning InfrastructureFei Chen
 
Building a MLOps Platform Around MLflow to Enable Model Productionalization i...
Building a MLOps Platform Around MLflow to Enable Model Productionalization i...Building a MLOps Platform Around MLflow to Enable Model Productionalization i...
Building a MLOps Platform Around MLflow to Enable Model Productionalization i...Databricks
 
Legion - AI Runtime Platform
Legion -  AI Runtime PlatformLegion -  AI Runtime Platform
Legion - AI Runtime PlatformAlexey Kharlamov
 
Elyra - a set of AI-centric extensions to JupyterLab Notebooks.
Elyra - a set of AI-centric extensions to JupyterLab Notebooks.Elyra - a set of AI-centric extensions to JupyterLab Notebooks.
Elyra - a set of AI-centric extensions to JupyterLab Notebooks.Luciano Resende
 
Bodywork - GitOps for Machine Learning
Bodywork - GitOps for Machine LearningBodywork - GitOps for Machine Learning
Bodywork - GitOps for Machine LearningAlex Ioannides
 
Global AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure DatabricksGlobal AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure DatabricksAlberto Diaz Martin
 
Kubeflow: portable and scalable machine learning using Jupyterhub and Kuberne...
Kubeflow: portable and scalable machine learning using Jupyterhub and Kuberne...Kubeflow: portable and scalable machine learning using Jupyterhub and Kuberne...
Kubeflow: portable and scalable machine learning using Jupyterhub and Kuberne...Akash Tandon
 
UI5con 2018 - Keynote
UI5con 2018 - KeynoteUI5con 2018 - Keynote
UI5con 2018 - KeynotePeter Muessig
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Mule any pointstudio
Mule any pointstudioMule any pointstudio
Mule any pointstudiohimajareddys
 

Similar to From idea to production in a day – Leveraging Azure ML and Streamlit to build and user test machine learning ideas quickly (20)

2020 10 22 AI Fundamentals - Azure Machine Learning
2020 10 22 AI Fundamentals - Azure Machine Learning2020 10 22 AI Fundamentals - Azure Machine Learning
2020 10 22 AI Fundamentals - Azure Machine Learning
 
innovations born in the cloud - cloud data services from IBM to prototype you...
innovations born in the cloud - cloud data services from IBM to prototype you...innovations born in the cloud - cloud data services from IBM to prototype you...
innovations born in the cloud - cloud data services from IBM to prototype you...
 
IBM Bluemix OpenWhisk: Interconnect 2016, Las Vegas: CCD-1088: The Future of ...
IBM Bluemix OpenWhisk: Interconnect 2016, Las Vegas: CCD-1088: The Future of ...IBM Bluemix OpenWhisk: Interconnect 2016, Las Vegas: CCD-1088: The Future of ...
IBM Bluemix OpenWhisk: Interconnect 2016, Las Vegas: CCD-1088: The Future of ...
 
IBM Bluemix Openwhisk
IBM Bluemix OpenwhiskIBM Bluemix Openwhisk
IBM Bluemix Openwhisk
 
Day 13 - Creating Data Processing Services | Train the Trainers Program
Day 13 - Creating Data Processing Services | Train the Trainers ProgramDay 13 - Creating Data Processing Services | Train the Trainers Program
Day 13 - Creating Data Processing Services | Train the Trainers Program
 
SplunkLive London 2014 Developer Presentation
SplunkLive London 2014  Developer PresentationSplunkLive London 2014  Developer Presentation
SplunkLive London 2014 Developer Presentation
 
Consolidating MLOps at One of Europe’s Biggest Airports
Consolidating MLOps at One of Europe’s Biggest AirportsConsolidating MLOps at One of Europe’s Biggest Airports
Consolidating MLOps at One of Europe’s Biggest Airports
 
.NET per la Data Science e oltre
.NET per la Data Science e oltre.NET per la Data Science e oltre
.NET per la Data Science e oltre
 
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning InfrastructureML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
 
Building a MLOps Platform Around MLflow to Enable Model Productionalization i...
Building a MLOps Platform Around MLflow to Enable Model Productionalization i...Building a MLOps Platform Around MLflow to Enable Model Productionalization i...
Building a MLOps Platform Around MLflow to Enable Model Productionalization i...
 
Legion - AI Runtime Platform
Legion -  AI Runtime PlatformLegion -  AI Runtime Platform
Legion - AI Runtime Platform
 
Elyra - a set of AI-centric extensions to JupyterLab Notebooks.
Elyra - a set of AI-centric extensions to JupyterLab Notebooks.Elyra - a set of AI-centric extensions to JupyterLab Notebooks.
Elyra - a set of AI-centric extensions to JupyterLab Notebooks.
 
Bodywork - GitOps for Machine Learning
Bodywork - GitOps for Machine LearningBodywork - GitOps for Machine Learning
Bodywork - GitOps for Machine Learning
 
Global AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure DatabricksGlobal AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure Databricks
 
Kubeflow: portable and scalable machine learning using Jupyterhub and Kuberne...
Kubeflow: portable and scalable machine learning using Jupyterhub and Kuberne...Kubeflow: portable and scalable machine learning using Jupyterhub and Kuberne...
Kubeflow: portable and scalable machine learning using Jupyterhub and Kuberne...
 
MLOps in action
MLOps in actionMLOps in action
MLOps in action
 
UI5con 2018 - Keynote
UI5con 2018 - KeynoteUI5con 2018 - Keynote
UI5con 2018 - Keynote
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Mule any pointstudio
Mule any pointstudioMule any pointstudio
Mule any pointstudio
 
Mule any pointstudio
Mule any pointstudioMule any pointstudio
Mule any pointstudio
 

Recently uploaded

Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 

Recently uploaded (20)

Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 

From idea to production in a day – Leveraging Azure ML and Streamlit to build and user test machine learning ideas quickly

  • 1. 1 From idea to production in a day Leveraging Azure ML and Streamlit to build and user test machine learning ideas quickly Florian Roscheck PyCon DE & PyData Berlin 2024
  • 2. 2
  • 3. 3
  • 4. 4 How do we use it to build + test quickly? What is our tech stack? What are we building?
  • 5. 5 Hi, I’m Florian! Sr. Data Scientist Florian Roscheck • Sr. Data Scientist at Henkel • Instructor for Apache Spark with 7k+ students • Vice President NumFOCUS Affiliated Project Selection Committee • Active on LinkedIn
  • 6. 6 WHAT TO BUILD IN ONE DAY? A Minimum Viable Product • Enough features to be usable • Ability to collect user feedback
  • 7. 7 WHAT TO BUILD IN ONE DAY? A Minimum Viable Product BUILD M E A - S U R E LEARN To learn about users quickly, we want to implement build-measure-learn loop To make users happier over time, we aim to create data flywheel
  • 9. 9 Ready? Data not in place Environment issues Lost in modeling Inappropriate user interface No feedback about use Difficult collaboration
  • 10. 10 BUILD M E A - S U R E LEARN Data not in place Environment issues Lost in modeling Inappropriate user interface No feedback about use Difficult collaboration GET DATA BEFOREHAND
  • 11. 11 HOW TO BUILD IN ONE DAY? A Time-Saving Stack Environment issues Lost in modeling Inappropriate user interface No feedback about use Difficult collaboration Azure ML Notebooks Automated ML on Azure Streamlit Azure Application Insights + Streamlit
  • 13. 13 Example: Trash Recognizer App bottle • Customer: Waste management company • Need: Want to evaluate computer vision solutions for recognizing trash • Idea: Waste management professionals manually evaluate performance through app with feedback functionality
  • 14. 14 BUILD M E A - S U R E LEARN 1 Get Data 2 Train Model 3 Build App 4 Deploy App with Model 5 Collect Feedback Our Plan
  • 15. 15 BUILD M E A - S U R E LEARN 1 Get Data 2 Train Model 3 Build App 4 Deploy App with Model 5 Collect Feedback Our Plan
  • 16. 16 Training Data: TACO Trash Image Dataset • TACO: Trash Annotations in Context • Dataset of 1.5k images with 4.7k+ annotations • Annotations for 60 categories, incl. backgrounds • Open source Proença, P . F., & Simões, P . (2020). TACO: Trash Annotations in Context for Litter Detection. arXiv Preprint arXiv:2003.06975. tacodataset.org Source: tacodataset.org
  • 18. 18 What is Azure ML? • Cloud-based ML platform by Microsoft • Run ad-hoc analyses with Jupyter Notebooks • Run and track machine learning experiments through tight integration with MLFlow • Version data and models • Build complex and reproducible modeling pipelines • Deploy models as API Screenshot of Azure ML web app
  • 19. 19 Basics of Getting Data Data Asset TACO-annotations • Azure ML-managed • Like a mask for files • Sharable • Version controlled • Interactively explorable TACO GitHub repo ! Azure ML Workspace Azure blob storage Azure ML Notebook 0_prepare_dataset.ipynb • Like Jupyter Notebook • Managed environment • Sharable • Runs on compute in workspace Reproducible environment Collaboration- ready ENABLERS
  • 21. 21 BUILD M E A - S U R E LEARN 1 Get Data 2 Train Model 3 Build App 4 Deploy App with Model 5 Collect Feedback Our Plan
  • 23. 23 !
  • 24. 24 Automated Machine Learning on Azure ML • Automated ML: Try different models and hyperparameters that are automatically selected • We have very little time for modeling! • Depending on data and problem type, automated machine learning can provide a reasonable starting point for modeling with a high return on time investment • Azure ML offers automated ML pipelines for several common tasks, incl. classification, regression, forecasting, NLP , and computer vision Modeling time saver ENABLERS
  • 25. 25 Setting Up AutoML Through Code Create compute cluster 1 Define training job 2 Submit job to compute cluster 3 Azure ML Notebooks 1_training.ipynb Azure ML Workspace TACO-annotations TACO-training
  • 26. 26 Compute Cluster Creation Tips & Tricks Save Money Shut down unused instances 120 seconds to auto shutdown Pick auto-evictable machine (own case: 80% cheaper) 4 experiments can run in parallel Tesla T4 GPU w/ 16 GB memory, 56 GB RAM, 8 vCPUs, but many options available! Pick a Fitting Compute
  • 27. 27 Increasing Efficiency for Automated ML on Azure • Set ML parameters based on your data science knowledge • Train/test/validation split, cross-validation, etc. • Hyperparameter selection strategy • Restrict hyperparameter search space • Set job limits • Max no. of trials • Max runtime per trial or of all trials • Termination based on score
  • 29. 29 Our annotations, linked to the job MLFlow model!
  • 30. 30 Models ordered by performance Azure AutoML experimented with a single model type
  • 32. How to Dig Deeper • Metrics are comprehensive and look great, but what are we looking at? • More details in the logs: section of std_log.txt file in Outputs + logs tab • Tip: Read Azure ML documentation! • You still need data science knowledge to understand what Azure ML is doing here.
  • 35. The Power of ONNX for Model Packaging • Great: Azure ML packaged the model in MLflow format • The Issue: Tight MLflow model dependencies restrict the platforms where the model can be used: 204 (!) pinned dependencies, incl. 31 Azure-specific packages; experienced issues installing some (azureml-dataprep-native) on macOS (M1) • The Solution: Use the ONNX model file (a byproduct of Azure AutoML training) with a single dependency: onnxruntime • ONNX (Open Neural Network Exchange): Open standard for deep learning models, makes models work across frameworks • ONNX Runtime: Cross-platform, open-source ML model accelerator • You can now use your AutoML-trained model outside of Azure! • Enabler: flexible model deployment
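A minimal sketch of running an exported ONNX file with onnxruntime as the only dependency. The helper name `run_onnx_model` is hypothetical, and the model path is whatever file the training job produced. The input batch must already be a NumPy array with the shape and dtype the model expects; that preprocessing depends on the trained model and is not shown here.

```python
def run_onnx_model(model_path, batch):
    """Run one inference pass and return the list of model outputs.

    `batch` is assumed to be a NumPy array matching the model's input signature.
    """
    import onnxruntime as ort  # the single runtime dependency

    # CPU provider keeps the example portable; GPU providers are optional.
    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name  # read the input name from the model
    # Passing None as the first argument returns all model outputs.
    return session.run(None, {input_name: batch})
```

Reading the input name from the session rather than hard-coding it keeps the helper usable with other exported models.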
  • 37. Building an App with Streamlit • Streamlit is an open-source app framework for creating data-based web apps in Python • My experience with Streamlit: The Good: Very easy and fast to code and use, apps look great and work. Wow! The Good-to-Know: Complex workflows with state management are harder to program; apps may be perceived as slow by users in comparison to “professional” web apps • Streamlit is perfect for getting a user-facing app off the ground and testing your data-based product ideas! (Streamlit logo: see streamlit.io)
  • 39. Streamlit App Blueprint: Trash Recognizer. What the user sees: Upload image(s); [Model + data description]; example results: “Detected Trash: 2 items for yellow trash can, 1 item for blue trash can”, “Detected Trash: 1 item for other trash can”, “No trash detected.” What the app does: 1 Load ONNX model; 2 Preprocess images; 3 Run model inference; 4 Postprocess images; 5 Display results
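The last two steps of the blueprint can be sketched without any framework: turn the labels a model detected into the summary lines shown to the user. The label-to-trash-can mapping below is purely hypothetical, and `summarize_detections` is an illustrative helper, not code from the talk.

```python
from collections import Counter

# Hypothetical mapping from detected class labels to trash can colors.
BIN_FOR_LABEL = {"bottle": "yellow", "can": "yellow", "paper": "blue"}


def summarize_detections(labels, bin_for_label=BIN_FOR_LABEL):
    """Return the lines to display for one image, given its detected labels."""
    if not labels:
        return ["No trash detected."]
    # Unknown labels fall back to the "other" trash can.
    counts = Counter(bin_for_label.get(label, "other") for label in labels)
    return [
        f"{n} item{'s' if n != 1 else ''} for {color} trash can"
        for color, n in counts.items()
    ]


lines = summarize_detections(["bottle", "can", "paper"])
```

In a Streamlit app, each returned line could then be written to the page under a "Detected Trash" heading.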
  • 43. Deployment Pipeline • Henkel Data Science Engineering team developed a pipeline for one-click deployment of data science infrastructure, incl. Streamlit apps, on secure Azure cloud infrastructure • Open-sourced via the article series “Kickstarting Data Science Projects in Azure DevOps” by Roberto Alonso • Parts 1 and 2 already available on the Henkel Data & Analytics Blog: medium.com/henkel-data-and-analytics
  • 46. Collecting Feedback • streamlit-feedback: Open-source feedback plugin for Streamlit • Python logging: Use AzureLogHandler through opencensus logging extension • Azure Application Insights: Collect logs from application, query with Kusto language • Azure Dashboards: Interactive live dashboards on Azure for application metrics
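A minimal sketch of the logging side, using only the standard library so it runs locally. In the deployed app, an AzureLogHandler from the opencensus-ext-azure package would be attached instead of the in-memory stand-in; the `custom_dimensions` key follows the convention documented for that handler. The logger name, the field names, and the `log_feedback` helper are assumptions for illustration.

```python
import logging

logger = logging.getLogger("trash_recognizer.feedback")  # hypothetical logger name
logger.setLevel(logging.INFO)


class ListHandler(logging.Handler):
    """Stand-in handler that keeps records in memory for local testing."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def log_feedback(image_name, detected_labels, rating, comment=""):
    """Log one piece of user feedback as a structured record."""
    logger.info(
        "user_feedback",
        extra={
            "custom_dimensions": {  # structured fields, queryable later
                "image_name": image_name,
                "detected_labels": ",".join(detected_labels),
                "rating": rating,
                "comment": comment,
            }
        },
    )


handler = ListHandler()  # on Azure: replace with an AzureLogHandler
logger.addHandler(handler)
log_feedback("street.jpg", ["bottle", "can"], rating="thumbs_down", comment="missed a cup")
```

Keeping the feedback in structured fields rather than in the message string is what makes it easy to filter and aggregate afterwards, for example by rating.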
  • 50. Code, Slides, Details: github.com/flrs/build_and_test_ml_quickly. Thanks: PyData team + sponsors; Henkel, incl. Henkel Data Science CoE team; open-source contributors for TACO, ONNX, ONNX Runtime, Streamlit, streamlit-feedback; Streamlit for reaching out before the talk. Learn More • Build-Measure-Learn Loop: The Lean Startup | Methodology • Data Flywheel: Data Flywheel: Scaling a world-class data strategy • Dataset: Tacodataset.org • Automated Machine Learning on Azure: What is automated ML? • ONNX: ONNX Runtime, ONNX File Format • Streamlit: Get started with Streamlit, streamlit-feedback • Azure Tricks for Data Science: Henkel Data & Analytics Blog • Logging to Azure from Python: Monitor Python applications • Azure Dashboards: Dashboards of Azure Log Analytics data • A similar project: Instance Segmentation with Azure Machine Learning. Photo credits, in order of appearance: Greg Rakozy, Canva Studio, Sewupari Studio, Massimo Botturi, Charlotte Coneybeer, Desola Landre Ologun, Studio Saiz, Claudio Schwarz, NASA, Alena Darmel, Anna Shvetz, Vadim B, The Lucky Neko, Visual Tag Mx; user icon from “Redefining Women” icon collection by Iconathon
  • 51. What are your questions? Let’s connect! Florian Roscheck, Sr. Data Scientist • linkedin.com/in/florianroscheck • github.com/flrs