Foundation for ML
Your data +
Microsoft data
Breakthrough
advancements
Data Cloud Models
Power of Azure
Speech Vision Language
2016 2017 2018 2018
Microsoft ML breakthroughs
Microsoft 365
ML at Microsoft
| Research
ML at scale
Monthly active
Office 365 users
using AI
180
million
Questions Asked
of Cortana
18
Billion
Number of Signals
Analyzed to Block
Emerging Threats
DAILY
6.5
Trillion
But ML is HARD!
Building a model
Building
a model
Data ingestion Data analysis
Data
transformation
Data validation Data splitting
Trainer
Model
validation
Training
at scale
Logging Roll-out Serving Monitoring
Ok, but, like, I’m
a data scientist. IDGAF
I don’t care
about all that.
Yes You Do!
Cowboys and Ranchers Can Be Friends!
SRE/ML Engineers Data Scientist
• Quick iteration
• Frameworks they
understand
• Best of breed tools
• No management
headaches
• Unlimited scale
• Reuse of tooling and
platforms
• Corporate compliance
• Observability
• Uptime
Haven’t I Heard This Before?
GitOps = Git + Dev + Ops
GitOps
== VELOCITY and SECURITY
MLOps!
MLOps = ML + DEV + OPS
Experiment
Data Acquisition
Business Understanding
Initial Modeling
Develop
Modeling
Operate
Continuous Delivery
Data Feedback Loop
System + Model Monitoring
ML
+ Testing
Continuous Integration
Continuous Deployment
MLOps Benefits
• Code drives generation
and deployments
• Pipelines are
reproducible and
verifiable
• All artifacts can be
tagged and audited
• SWE best practices for
quality control
• Offline comparisons of
model quality
• Minimize bias and
enable explainability
• Controlled rollout
capabilities
• Live comparison of
predicted vs. expected
performance
• Results fed back to
watch for drift and
improve model
Automation / Observability
Validation
Reproducibility / Auditability
== VELOCITY and SECURITY (For ML)
Internal MLOps Platforms
FBLearner Flow
TensorFlow Extended
Uber’s Michelangelo
Microsoft Aether
But I Don’t Work at a
Big Company With
Thousands of
ML Engineers!
Build Your Own MLOps Platform
And many MANY more…
+ +
Cloud Provider
MLOps Platforms
Real World Multi-Cloud
CI/CD Pipeline
Process Train Stage Serve
Data
Distributed Cloud
SRE/ML Engineers
Data Scientist
ENV
#1
ENV
#2
Azure DevOps Pipelines
Cloud-hosted pipelines for Linux, Windows and macOS.
Any language, any platform, any cloud
Build, test, and deploy Node.js, Python, 
Java, PHP,
Ruby, C/C++, .NET, Android, and iOS apps. Run in
parallel on Linux, macOS, and Windows. Deploy to
Azure, AWS, GCP or on-premises
Extensible
Explore and implement a wide range of community-
built build, test, and deployment tasks, along with
hundreds of extensions from Slack to SonarCloud.
Support for YAML, reporting and more
Containers and Kubernetes
Easily build and push images to container registries
like Docker Hub and Azure Container Registry.
Deploy containers to individual hosts or Kubernetes.
Azure DevOps + Azure ML
First Class Model Training Tasks
CI pipeline captures:
1. Create sandbox
2. Run unit tests and code quality checks
3. Attach to compute
4. Run training pipeline
5. Evaluate model
6. Register model
Automated Deployment
CD pipeline captures:
1. Package model into container
image
2. Validate and profile model
3. Deploy model to DevTest (ACI)
4. If all is well, proceed to rollout
to AKS
Everything is done via the CLI
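The CD steps above amount to a gated sequence: each stage runs only if the previous one passed. Here is a minimal sketch of that idea in plain Python. The stage functions, the `accuracy` threshold and the context keys are all hypothetical stand-ins, not Azure DevOps or Azure ML calls.

```python
from typing import Callable, Dict, List, Tuple

# A stage is a name plus a function that returns True on success.
Stage = Tuple[str, Callable[[Dict], bool]]


def run_pipeline(stages: List[Stage], context: Dict) -> List[str]:
    """Run stages in order and stop at the first one that fails."""
    completed = []
    for name, step in stages:
        if not step(context):
            break
        completed.append(name)
    return completed


# Hypothetical stand-ins for the real packaging and deployment commands.
def package_model(ctx: Dict) -> bool:
    ctx["image"] = "model:" + ctx["version"]
    return True


def validate_and_profile(ctx: Dict) -> bool:
    # Assumed gate: a single accuracy threshold.
    return ctx["accuracy"] >= ctx["min_accuracy"]


def deploy_to_devtest(ctx: Dict) -> bool:
    ctx["devtest"] = ctx["image"]
    return True


def rollout_to_production(ctx: Dict) -> bool:
    ctx["production"] = ctx["devtest"]
    return True


CD_STAGES: List[Stage] = [
    ("package", package_model),
    ("validate", validate_and_profile),
    ("devtest", deploy_to_devtest),
    ("rollout", rollout_to_production),
]

good = {"version": "2", "accuracy": 0.95, "min_accuracy": 0.90}
bad = {"version": "3", "accuracy": 0.50, "min_accuracy": 0.90}

good_result = run_pipeline(CD_STAGES, good)
bad_result = run_pipeline(CD_STAGES, bad)
print(good_result)
print(bad_result)
```

The model that fails validation never reaches DevTest or production, and the list of completed stages doubles as a record of how far each run got.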
Model Versioning & Storage
• which data,
• which experiment / previous model(s),
• where’s the code / notebook
• Was it converted / quantized?
• Private / compliant data
Model Validation
• Data (changes to shape / profile)
• Model in isolation (offline A/B)
• Model + app (functional testing)
• Only deploy after initial validation passes
• Ramp up traffic to new model using A/B
experimentations
• Functional behavior
• Performance characteristics
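One common way to ramp up traffic to a new model is to hash each user into a stable bucket and send the lowest buckets to the new model. The sketch below assumes string user ids and a made-up ramp schedule; `bucket`, `choose_model` and `RAMP` are illustrative names, not part of any platform named in this deck.

```python
import hashlib


def bucket(user_id: str, buckets: int = 100) -> int:
    """Map a user id to a stable bucket in [0, buckets)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def choose_model(user_id: str, new_model_percent: int) -> str:
    """Send roughly new_model_percent of users to the new model."""
    return "new" if bucket(user_id) < new_model_percent else "current"


# Hypothetical ramp schedule: percent of traffic on the new model per step.
RAMP = [1, 5, 25, 50, 100]

users = ["user-%d" % i for i in range(10000)]
for percent in RAMP:
    on_new = sum(choose_model(u, percent) == "new" for u in users)
    print(percent, on_new / len(users))
```

Because the bucket depends only on the user id, a user who lands on the new model at one step stays on it at every later step, which keeps the live comparison between the two models clean.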
Model Profiling
Model Deployment
• Focus on ML, not DevOps
• Get telemetry for service health and model behavior
• code-generation
• API specifications / interfaces
• Cloud Services
• Mobile / Embedded Applications
• Edge Devices
• Quantize / optimize models for target platform
• Compliant + Safe
Seems Like a Lot of Work…
MLOps Gets You to Production
• End-to-end ownership by data science teams
using SWE best practices
• Continuous delivery of value to end users.
• Enables lineage, auditability and regulatory
compliance through consistency
Ok… but WHY?
What Does All This Stuff Solve For?
1. Does My Model Actually Work?
2. What Did My Customers See?
3. Is My Model Still Good?
Does My Model
Actually Work?
Does My Model Actually Work?
SRE/ML Engineers Data Scientist
Time to test out
my model…
Laptop The Cloud
Does My Model Actually Work?
SRE/ML Engineers Data Scientist
Looks good to
me! To Production!
Laptop The Cloud
Does My Model Actually Work?
SRE/ML Engineers Data Scientist
Laptop The Cloud
Wait, what?
Oh… oh no…
Does My Model Actually Work?
SRE/ML Engineers Data Scientist
Laptop The Cloud
WOAH there.
Source Control
What is
happening…
Source Control
A Small Example of Issues You Can Have…
• Inappropriate HW/SW stack
• Mismatched driver versions
• Crash looping deployment
• Data/model versioning [Nick Walsh]
• Non-standard images/OS version
• Pre-processing code doesn’t match
production pre-processing
• Production data doesn’t match
training/test data
• Output of the model doesn’t match
application expectations
• Hand-coded heuristics better than model
[Adam Laiacano]
• Model freshness (train on out-of-date
data/input shape changed)
• Test/production statistics/population
shape skew
• Overfitting on training/test data
• Bias introduction (or not tested)
• Over/under HW provisioning
• Latency issues
• Permissions/certs
• Failure to obey health checks
• Killed production model before roll out
of new/in wrong order
• Thundering herd for new model
• Logging to the wrong location
• Storage for model not allocated
properly/accessible by deployment
tooling
• Route to artifacts not available for
download
• API signature changes not
propagated/expected
• Cross-data center latency
• Expected benefit doesn’t materialize (e.g.
multiple components in the app change
simultaneously)
• Get wrong/no traffic because A/B config
didn’t roll out
• Get too much traffic too soon (expected
to canary/exponential roll out)
• Lack of visibility into real-time model
behavior (detecting data drift, live data
distribution vs train data, etc) [Nick
Walsh]
• Outliers not predicted [MikeBSilverman]
• Change was a good change, but didn’t
communicate with the rest of the team
(so you must roll back)
• No dates! (date to measure
impact/improvement against a pre-
agreed measure; date scheduled to
assess data changes) [Mary Branscombe]
• No CI/CD; manual changes untracked
[Jon Peck]
• LACK OF DOCUMENTATION!! (the
problem, the testing, the solution, lots
more) [Terry Christiani]
• Successful model causes pain elsewhere
in the organization (e.g. detecting faults
previously missed) [Mark Round]
Or It Just Doesn’t Work!
At All!
Does My Model Actually Work?
SRE/ML Engineers Data Scientist
Laptop The Cloud
Source Control
Automated
Validation &
Profiling
Package
For Rollout
Explain Model
& Look for
Bias
Clean/
Minimize
Code
Sane
Deployment
Nice. Nice.
✓
But I Can Do All
These Manually…
No.
MLOps is a Platform and a Philosophy
Even if:
o Every data scientist trained...
o And you had all the tools necessary...
o And they all worked together...
o And your SREs understood ML modeling...
o And and and and ...
You’d still need a permanent, repeatable
record of what you did
That’s MLOps!
What Does All This Stuff Solve For?
1. Does My Model Actually Work?
2. What Did My Customers See?
3. Is My Model Still Good?
What Did My
Customers See?
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
I’d like a loan,
please.
Source Control
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
No.
Source Control
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
Ok, but why?
Source Control
Source Control
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
Uh oh.
Lawyer (×18)
It’s Not Just About Explainability!
• Yes, models are complicated
• But, that’s not enough:
o What data did you train on?
o How did you transform/exclude outliers?
o What are the data statistics?
o Did anything change between code and production?
o What model did you actually serve (to this person)?
• MLOps can help!
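The questions above can all be answered if every served model points back to a record of what went into it. Here is a minimal sketch of such a record and an append-only store. The field names, the 13-character id and the `MetadataStore` class are assumptions for illustration, not a description of any particular platform.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass(frozen=True)
class ModelRecord:
    """What went into one trained model (hypothetical fields)."""
    training_data_sha: str
    code_commit: str
    preprocessing: str
    data_stats: str

    def record_id(self) -> str:
        # The id is derived from the content, so changing any field changes it.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:13]


class MetadataStore:
    """Append-only: entries can be added but never changed."""

    def __init__(self) -> None:
        self._records: Dict[str, ModelRecord] = {}
        self._served: Dict[str, str] = {}  # request id -> record id

    def register(self, record: ModelRecord) -> str:
        rid = record.record_id()
        self._records.setdefault(rid, record)
        return rid

    def log_prediction(self, request_id: str, record_id: str) -> None:
        if record_id not in self._records:
            raise KeyError("unknown model record: " + record_id)
        if request_id in self._served:
            raise ValueError("request already logged: " + request_id)
        self._served[request_id] = record_id

    def what_was_served(self, request_id: str) -> ModelRecord:
        return self._records[self._served[request_id]]


store = MetadataStore()
record = ModelRecord(
    training_data_sha="example-data-hash",
    code_commit="example-commit",
    preprocessing="drop rows with missing income",
    data_stats="rows=1000",
)
rid = store.register(record)
store.log_prediction("loan-request-1", rid)
print(store.what_was_served("loan-request-1"))
```

When the customer asks why the loan was refused, the request id leads to the exact data, code and preprocessing behind the model that answered.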
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
Source Control
Automated
Validation &
Profiling
Package
For Rollout
Explain Model
& Look for
Bias
Clean/
Minimize
Code
Sane
Deployment
32c04681d7573
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
Automated
Validation &
Profiling
Package
For Rollout
Explain Model
& Look for
Bias
Clean/
Minimize
Code
Sane
Deployment
Source Control
Immutable
Metadata Store
b151f8e65b32a c7f4e7607b4b7 0ef1d58921d89 e2e1e994c4251 786c8e57a6d51 9ce88802f0759
9ce88802f0759
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
Automated
Validation &
Profiling
Package
For Rollout
Explain Model
& Look for
Bias
Clean/
Minimize
Code
Sane
Deployment
Source Control
Immutable
Metadata Store
b151f8e65b32a c7f4e7607b4b7 0ef1d58921d89 e2e1e994c4251 786c8e57a6d51 9ce88802f0759
32c04681d7573
Why didn’t I get
a loan?
9ce88802f0759
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
Automated
Validation &
Profiling
Package
For Rollout
Explain Model
& Look for
Bias
Clean/
Minimize
Code
Sane
Deployment
Source Control
Immutable
Metadata Store
b151f8e65b32a c7f4e7607b4b7 0ef1d58921d89 e2e1e994c4251 786c8e57a6d51 9ce88802f0759
32c04681d7573
32c04681d7573
9ce88802f0759
What Does All This Stuff Solve For?
1. Does My Model Actually Work?
2. What Did My Customers See?
3. Is My Model Still Good?
Is My Model
Still Good?
Is My Model Still Good?
SRE/ML Engineers
The Cloud
There is a
blue or
orange
DUCK inside
this barn.
What color
is the duck?
Let’s Use Machine
Learning!!
Is My Model Still Good?
SRE/ML Engineers
The Cloud
Front End
Model Server
f7c5f9fe7b762
It’s a
duck!
BLUE
There is a
blue or
orange
DUCK inside
this barn.
What color
is the duck?
But wait...
Is My Model Still Good?
SRE/ML Engineers
The Cloud
Front End
Model Server
f7c5f9fe7b762
It’s a
duck!
BLUE
5 Blue Ducks
995 Yellow Ducks
Accuracy = 99%
False Positive = 1%
???????????????????
Thomas
Bayes
$$P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}$$
Bayes’ Theorem
Accuracy depends on
the population
distribution!
Is My Model Still Good?
SRE/ML Engineers
The Cloud
Front End
Model Server
f7c5f9fe7b762
It’s a
duck!
BLUE
995 Yellow Ducks
5 Blue Ducks
WRONG 2/3 of the Time!
Accuracy = 99%
False Positive = 1%
???????????????????
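The arithmetic behind this slide can be checked directly with Bayes' theorem. The sketch below assumes that "Accuracy = 99%" means a blue duck is called blue 99% of the time, and that "False Positive = 1%" means a non-blue duck is called blue 1% of the time. The function name and the balanced 500/500 comparison are illustrative additions.

```python
def chance_blue_call_is_right(n_blue: int, n_other: int,
                              true_positive_rate: float,
                              false_positive_rate: float) -> float:
    """P(duck is blue | model says blue), from expected counts."""
    true_positives = n_blue * true_positive_rate
    false_positives = n_other * false_positive_rate
    return true_positives / (true_positives + false_positives)


# The population on the slide: 5 blue ducks among 1000.
skewed = chance_blue_call_is_right(5, 995, 0.99, 0.01)

# Same model, balanced population, for contrast.
balanced = chance_blue_call_is_right(500, 500, 0.99, 0.01)

print(round(skewed, 2))    # 0.33
print(round(balanced, 2))  # 0.99
```

With only 5 blue ducks in 1000, about 4.95 "blue" calls are right and about 9.95 are wrong, so a "blue" answer is correct roughly one time in three even though the model itself has not changed.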
Who cares…
This Can Be
Addressed!
Is My Model Still Good?
SRE/ML Engineers
The Cloud
Front End
Model Server
f7c5f9fe7b762
It’s a
duck!
BLUE
995 Yellow Ducks
5 Blue Ducks
Model Server
d4093cc84b267
But…
Is My Model Still Good?
SRE/ML Engineers
The Cloud
Front End
Model Server
995 Yellow Ducks
5 Blue Ducks
d4093cc84b267
Is My Model Still Good?
SRE/ML Engineers
The Cloud
Front End
Model Server 500 Yellow Ducks
500 Blue Ducks
d4093cc84b267
Is My Model Still Good?
• Models != Code – they can go stale... QUICKLY.
• IMPORTANT:
o Watch your model & data for drift from training
o Regularly (if not continuously) retrain, even before
performance begins to fail
o Multiple-version rollbacks are not uncommon!
• Without an e2e MLOps pipeline, many of the
above are O(really really hard)!
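Watching for drift can start very simply: compare the class mix seen in training with the class mix seen live. This is a minimal sketch using total variation distance. The 0.1 threshold and the function names are assumptions, and a real pipeline would likely watch input features as well as labels.

```python
from collections import Counter
from typing import Dict, Iterable


def class_shares(labels: Iterable[str]) -> Dict[str, float]:
    """Fraction of examples per class."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labels given")
    return {label: n / total for label, n in counts.items()}


def total_variation_distance(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Half the sum of absolute differences between two distributions."""
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(l, 0.0) - q.get(l, 0.0)) for l in labels)


def has_drifted(train_labels, live_labels, threshold: float = 0.1) -> bool:
    # Assumed rule: flag drift when the distance exceeds a fixed threshold.
    distance = total_variation_distance(
        class_shares(train_labels), class_shares(live_labels))
    return distance > threshold


# The duck populations from the slides above.
training = ["yellow"] * 995 + ["blue"] * 5
live = ["yellow"] * 500 + ["blue"] * 500

distance = total_variation_distance(class_shares(training), class_shares(live))
print(round(distance, 3))           # 0.495
print(has_drifted(training, live))  # True
```

A check like this can run on a schedule inside the pipeline and trigger retraining before live performance visibly fails.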
What Does All This Stuff Solve For?
1. Does My Model Actually Work?
2. What Did My Customers See?
3. Is My Model Still Good?
Next for MLOps
MLOps Gives* You…
• Software best practices for building machine
learning solutions
• Repeatable workflow for training a model and
rolling it out to production
• An immutable record of what’s actually running
• Lineage of model creation including data sources
• Acceleration from code to customer benefits
* Requires some human and software work
What’s Next for MLOps
• Simplify monitoring and retraining
• Extend MLOps to data, including prep and profiling
• Enterprise features
o Test cases
o Auditing
o Security
o Resource management (bin packing / resource optimization)
o Network isolation
• Metadata and API standards
Or, better yet, you tell us!
It’s a whole new world
• Data science will touch
EVERY industry.
• We can’t ask people to
get a PhD in statistics,
though.
• How do WE help everyone
take advantage of this
transformation?
me: David Aronchick (david.aronchick@microsoft.com)
twitter: @aronchick
github:
• https://github.com/aronchick/kubeflow-and-mlops
• https://aka.ms/mlops
THANK YOU!
Using MLOps to Bring ML to Production/The Promise of MLOps

Weave AI Controllers (Weave GitOps Office Hours)
 
Flamingo: Expand ArgoCD with Flux (Office Hours)
Flamingo: Expand ArgoCD with Flux (Office Hours)Flamingo: Expand ArgoCD with Flux (Office Hours)
Flamingo: Expand ArgoCD with Flux (Office Hours)
 
Webinar: Capabilities, Confidence and Community – What Flux GA Means for You
Webinar: Capabilities, Confidence and Community – What Flux GA Means for YouWebinar: Capabilities, Confidence and Community – What Flux GA Means for You
Webinar: Capabilities, Confidence and Community – What Flux GA Means for You
 
Six Signs You Need Platform Engineering
Six Signs You Need Platform EngineeringSix Signs You Need Platform Engineering
Six Signs You Need Platform Engineering
 
SRE and GitOps for Building Robust Kubernetes Platforms.pdf
SRE and GitOps for Building Robust Kubernetes Platforms.pdfSRE and GitOps for Building Robust Kubernetes Platforms.pdf
SRE and GitOps for Building Robust Kubernetes Platforms.pdf
 
Webinar: End to End Security & Operations with Chainguard and Weave GitOps
Webinar: End to End Security & Operations with Chainguard and Weave GitOpsWebinar: End to End Security & Operations with Chainguard and Weave GitOps
Webinar: End to End Security & Operations with Chainguard and Weave GitOps
 
Flux Beyond Git Harnessing the Power of OCI
Flux Beyond Git Harnessing the Power of OCIFlux Beyond Git Harnessing the Power of OCI
Flux Beyond Git Harnessing the Power of OCI
 
Automated Provisioning, Management & Cost Control for Kubernetes Clusters
Automated Provisioning, Management & Cost Control for Kubernetes ClustersAutomated Provisioning, Management & Cost Control for Kubernetes Clusters
Automated Provisioning, Management & Cost Control for Kubernetes Clusters
 
How to Avoid Kubernetes Multi-tenancy Catastrophes
How to Avoid Kubernetes Multi-tenancy CatastrophesHow to Avoid Kubernetes Multi-tenancy Catastrophes
How to Avoid Kubernetes Multi-tenancy Catastrophes
 
Building internal developer platform with EKS and GitOps
Building internal developer platform with EKS and GitOpsBuilding internal developer platform with EKS and GitOps
Building internal developer platform with EKS and GitOps
 
GitOps Testing in Kubernetes with Flux and Testkube.pdf
GitOps Testing in Kubernetes with Flux and Testkube.pdfGitOps Testing in Kubernetes with Flux and Testkube.pdf
GitOps Testing in Kubernetes with Flux and Testkube.pdf
 
Intro to GitOps with Weave GitOps, Flagger and Linkerd
Intro to GitOps with Weave GitOps, Flagger and LinkerdIntro to GitOps with Weave GitOps, Flagger and Linkerd
Intro to GitOps with Weave GitOps, Flagger and Linkerd
 
Implementing Flux for Scale with Soft Multi-tenancy
Implementing Flux for Scale with Soft Multi-tenancyImplementing Flux for Scale with Soft Multi-tenancy
Implementing Flux for Scale with Soft Multi-tenancy
 
Accelerating Hybrid Multistage Delivery with Weave GitOps on EKS
Accelerating Hybrid Multistage Delivery with Weave GitOps on EKSAccelerating Hybrid Multistage Delivery with Weave GitOps on EKS
Accelerating Hybrid Multistage Delivery with Weave GitOps on EKS
 
The Story of Flux Reaching Graduation in the CNCF
The Story of Flux Reaching Graduation in the CNCFThe Story of Flux Reaching Graduation in the CNCF
The Story of Flux Reaching Graduation in the CNCF
 
Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...
Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...
Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...
 
Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...
Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...
Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...
 
Flux’s Security & Scalability with OCI & Helm Slides.pdf
Flux’s Security & Scalability with OCI & Helm Slides.pdfFlux’s Security & Scalability with OCI & Helm Slides.pdf
Flux’s Security & Scalability with OCI & Helm Slides.pdf
 
Flux Security & Scalability using VS Code GitOps Extension
Flux Security & Scalability using VS Code GitOps Extension Flux Security & Scalability using VS Code GitOps Extension
Flux Security & Scalability using VS Code GitOps Extension
 
Deploying Stateful Applications Securely & Confidently with Ondat & Weave GitOps
Deploying Stateful Applications Securely & Confidently with Ondat & Weave GitOpsDeploying Stateful Applications Securely & Confidently with Ondat & Weave GitOps
Deploying Stateful Applications Securely & Confidently with Ondat & Weave GitOps
 

Recently uploaded

A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 

Recently uploaded (20)

A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 

Using MLOps to Bring ML to Production/The Promise of MLOps

  • 2. Foundation for ML Your data + Microsoft data Breakthrough advancements Data Cloud Models Power of Azure
  • 3. Speech Vision Language 2016 2017 2018 2018 Microsoft ML breakthroughs
  • 4. Microsoft 365 ML at Microsoft | Research
  • 5. ML at scale Monthly active Office 365 users using AI 180 million Questions Asked of Cortana 18 Billion Number of Signals Analyzed to Block Emerging Threats DAILY 6.5 Trillion
  • 6. But ML is HARD!
  • 8. Building a model Data ingestion Data analysis Data transformation Data validation Data splitting Trainer Model validation Training at scale Logging Roll-out Serving Monitoring
  • 9. Ok, but, like, I’m a data scientist. IDGAF I don’t care about all that.
  • 12. Cowboys and Ranchers Can Be Friends! SRE/ML Engineers Data Scientist • Quick iteration • Frameworks they understand • Best of breed tools • No management headaches • Unlimited scale • Reuse of tooling and platforms • Corporate compliance • Observability • Uptime
  • 13. Haven’t I Heard This Before?
  • 14. GitOps = Git + Dev + Ops
  • 17. MLOps = ML + DEV + OPS Experiment Data Acquisition Business Understanding Initial Modeling Develop Modeling Operate Continuous Delivery Data Feedback Loop System + Model Monitoring ML + Testing Continuous Integration Continuous Deployment
  • 18. MLOps Benefits • Code drives generation and deployments • Pipelines are reproducible and verifiable • All artifacts can be tagged and audited • SWE best practices for quality control • Offline comparisons of model quality • Minimize bias and enable explainability • Controlled rollout capabilities • Live comparison of predicted vs. expected performance • Results fed back to watch for drift and improve model Automation / Observability Validation Reproducibility /Auditability == VELOCITY and SECURITY (For ML)
  • 19. Internal MLOps Platforms FBLearner Flow TensorFlow Extended Uber’s Michelangelo Microsoft Aether
  • 20. But I Don’t Work at a Big Company With Thousands of ML Engineers!
  • 21. Build Your Own MLOps Platform And many MANY more…
  • 23. Real World Multi-Cloud CI/CD Pipeline Process Train Stage Serve Data Distributed Cloud SRE/ML Engineers Data Scientist ENV #1 ENV #2
  • 24. Azure DevOps Pipelines Cloud-hosted pipelines for Linux, Windows and macOS. Any language, any platform, any cloud Build, test, and deploy Node.js, Python, Java, PHP, Ruby, C/C++, .NET, Android, and iOS apps. Run in parallel on Linux, macOS, and Windows. Deploy to Azure, AWS, GCP or on-premises Extensible Explore and implement a wide range of community-built build, test, and deployment tasks, along with hundreds of extensions from Slack to SonarCloud. Support for YAML, reporting and more Containers and Kubernetes Easily build and push images to container registries like Docker Hub and Azure Container Registry. Deploy containers to individual hosts or Kubernetes.
  • 25. Azure DevOps + Azure ML
  • 26. First Class Model Training Tasks CI pipeline captures: 1. Create sandbox 2. Run unit tests and code quality checks 3. Attach to compute 4. Run training pipeline 5. Evaluate model 6. Register model
  • 27. Automated Deployment CD pipeline captures: 1. Package model into container image 2. Validate and profile model 3. Deploy model to DevTest (ACI) 4. If all is well, proceed to rollout to AKS Everything is done via the CLI
  • 28. Model Versioning & Storage • Which data? • Which experiment / previous model(s)? • Where’s the code / notebook? • Was it converted / quantized? • Private / compliant data
  • 29. Model Validation • Data (changes to shape / profile) • Model in isolation (offline A/B) • Model + app (functional testing) • Only deploy after initial validation passes • Ramp up traffic to new model using A/B experimentations • Functional behavior • Performance characteristics
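The "only deploy after initial validation passes" and "ramp up traffic" bullets can be sketched as a gate followed by a stepped rollout. This is a minimal sketch, not the Azure ML implementation: the function name `ramp_traffic`, the step percentages, and the `live_check` callback are all hypothetical and stand in for whatever offline validation and live health checks a team actually runs.

```python
def ramp_traffic(validation_passed, live_check, steps=(1, 5, 25, 50, 100)):
    """Return the traffic percentages the new model reached.

    validation_passed: result of offline validation (data, model, model + app).
    live_check: callable taking a traffic percentage and returning True if
                the new model looks healthy at that level (hypothetical hook).
    """
    if not validation_passed:
        return []  # never deploy a model that failed initial validation
    reached = []
    for percent in steps:
        if not live_check(percent):
            break  # stop ramping; the caller can roll back
        reached.append(percent)
    return reached


# Example: the model looks healthy up to 25% of traffic, then a check fails.
print(ramp_traffic(True, lambda percent: percent <= 25))
```

Keeping the gate and the ramp in one function makes the ordering explicit: a model that fails offline validation never receives live traffic, and a failing live check halts the rollout at the last healthy level.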
  • 31. Model Deployment • Focus on ML, not DevOps • Get telemetry for service health and model behavior • code-generation • API specifications / interfaces • Cloud Services • Mobile / Embedded Applications • Edge Devices • Quantize / optimize models for target platform • Compliant + Safe
  • 32. Seems Like a Lot of Work…
  • 34. MLOps Gets You to Production • End-to-end ownership by data science teams using SWE best practices • Continuous delivery of value to end users • Enables lineage, auditability and regulatory compliance through consistency
  • 36. What Does All This Stuff Solve For? 1. Does My Model Actually Work? 2. What Did My Customers See? 3. Is My Model Still Good?
  • 39. Does My Model Actually Work? SRE/ML Engineers Data Scientist Time to test out my model… Laptop The Cloud
  • 40. Does My Model Actually Work? SRE/ML Engineers Data Scientist Laptop The Cloud
  • 41. Does My Model Actually Work? SRE/ML Engineers Data Scientist Looks good to me! To Production! Laptop The Cloud
  • 42. Does My Model Actually Work? SRE/ML Engineers Data Scientist Laptop The Cloud Wait, what? Oh… oh no…
  • 43. Does My Model Actually Work? SRE/ML Engineers Data Scientist Laptop The Cloud WOAH there.
  • 44. Does My Model Actually Work? SRE/ML Engineers Data Scientist Laptop The Cloud WOAH there. Source Control
  • 45. Does My Model Actually Work? SRE/ML Engineers Data Scientist Laptop The Cloud Source Control What is happening…
  • 46. A Small Example of Issues You Can Have… • Inappropriate HW/SW stack • Mismatched driver versions • Crash looping deployment • Data/model versioning [Nick Walsh] • Non-standard images/OS version • Pre-processing code doesn’t match production pre-processing • Production data doesn’t match training/test data • Output of the model doesn’t match application expectations • Hand-coded heuristics better than model [Adam Laiacano] • Model freshness (train on out-of-date data/input shape changed) • Test/production statistics/population shape skew • Overfitting on training/test data • Bias introduction (or not tested) • Over/under HW provisioning • Latency issues • Permissions/certs • Failure to obey health checks • Killed production model before roll out of new/in wrong order • Thundering herd for new model • Logging to the wrong location • Storage for model not allocated properly/accessible by deployment tooling • Route to artifacts not available for download • API signature changes not propagated/expected • Cross-data center latency • Expected benefit doesn’t materialize (e.g. multiple components in the app change simultaneously) • Get wrong/no traffic because A/B config didn’t roll out • Get too much traffic too soon (expected to canary/exponential roll out) • Lack of visibility into real-time model behavior (detecting data drift, live data distribution vs train data, etc.) [Nick Walsh] • Outliers not predicted [MikeBSilverman] • Change was a good change, but didn’t communicate with the rest of the team (so you must roll back) • No dates! (date to measure impact/improvement against a pre-agreed measure; date scheduled to assess data changes) [Mary Branscombe] • No CI/CD; manual changes untracked [Jon Peck] • LACK OF DOCUMENTATION!! (the problem, the testing, the solution, lots more) [Terry Christiani] • Successful model causes pain elsewhere in the organization (e.g. detecting faults previously missed) [Mark Round] Or It Just Doesn’t Work! At All!
  • 47. Does My Model Actually Work? SRE/ML Engineers Data Scientist Laptop The Cloud Source Control Automated Validation & Profiling Package For Rollout Explain Model & Look for Bias Clean/ Minimize Code Sane Deployment Nice. Nice. ✓
  • 48. But I Can Do All These Manually…
  • 49. No.
  • 50. MLOps is a Platform and a Philosophy Even if: o Every data scientist trained... o And you had all the tools necessary... o And they all worked together... o And your SREs understood ML modeling... o And and and and ... You’d still need a permanent, repeatable record of what you did
  • 52. What Does All This Stuff Solve For? 1. Does My Model Actually Work? 2. What Did My Customers See? 3. Is My Model Still Good?
  • 55. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer I’d Like a loan, please. Source Control
  • 56. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer No. Source Control
  • 57. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer Ok, but why? Source Control
  • 58. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer Source Control Uh oh. Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer Lawyer
  • 59. It’s Not Just About Explainability! • Yes, models are complicated • But, that’s not enough: o What data did you train on? o How did you transform/exclude outliers? o What are the data statistics? o Did anything change between code and production? o What model did you actually serve (to this person)? • MLOps can help!
  • 60. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer Source Control Automated Validation & Profiling Package For Rollout Explain Model & Look for Bias Clean/ Minimize Code Sane Deployment
  • 61. 32c04681d7573 What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer Automated Validation & Profiling Package For Rollout Explain Model & Look for Bias Clean/ Minimize Code Sane Deployment Source Control Immutable Metadata Store b151f8e65b32a c7f4e7607b4b7 0ef1d58921d89 e2e1e994c4251 786c8e57a6d51 9ce88802f0759 9ce88802f0759
  • 62. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer Automated Validation & Profiling Package For Rollout Explain Model & Look for Bias Clean/ Minimize Code Sane Deployment Source Control Immutable Metadata Store b151f8e65b32a c7f4e7607b4b7 0ef1d58921d89 e2e1e994c4251 786c8e57a6d51 9ce88802f0759 32c04681d7573 Why didn’t I get a loan? 9ce88802f0759
  • 63. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer Automated Validation & Profiling Package For Rollout Explain Model & Look for Bias Clean/ Minimize Code Sane Deployment Source Control Immutable Metadata Store b151f8e65b32a c7f4e7607b4b7 0ef1d58921d89 e2e1e994c4251 786c8e57a6d51 9ce88802f0759 32c04681d7573 32c04681d7573 9ce88802f0759
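The "Immutable Metadata Store" in slides 61 to 63 can be pictured as an append-only, content-addressed record of what went into each model, plus a log of which model answered each request. The following is a minimal sketch under stated assumptions: the class `MetadataStore`, the record fields, the dataset name, and the request id are all hypothetical, and the 13-character ids only imitate the style of the hashes on the slides.

```python
import hashlib
import json


class MetadataStore:
    """Append-only, content-addressed store: records are never updated in place."""

    def __init__(self):
        self._records = {}

    def put(self, record):
        # The key is a hash of the record's content, so any change yields a new key.
        payload = json.dumps(record, sort_keys=True).encode("utf-8")
        key = hashlib.sha256(payload).hexdigest()[:13]
        self._records.setdefault(key, payload)
        return key

    def get(self, key):
        return json.loads(self._records[key])


store = MetadataStore()
model_id = store.put({
    "training_data": "loans-2019-03.csv",        # hypothetical dataset name
    "code_commit": "9ce88802f0759",              # placeholder id in the slides' style
    "preprocessing": "drop outliers > 3 sigma",  # hypothetical transform note
})

# The serving layer logs which model answered each request (hypothetical id).
served = {"request-42": model_id}

# Later, "Why didn't I get a loan?" becomes a lookup of exactly what was served.
lineage = store.get(served["request-42"])
print(model_id, lineage)
```

Because the key is derived from the content, a retrained model or a changed preprocessing step cannot silently overwrite an earlier record; it simply gets a new id.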
  • 64. What Does All This Stuff Solve For? 1. Does My Model Actually Work? 2. What Did My Customers See? 3. Is My Model Still Good?
  • 67. Is My Model Still Good? SRE/ML Engineers The Cloud There is a blue or orange DUCK inside this barn. What color is the duck?
  • 69. Is My Model Still Good? SRE/ML Engineers The Cloud Front End Model Server f7c5f9fe7b762 It’s a duck! BLUE There is a blue or orange DUCK inside this barn. What color is the duck?
  • 71. Is My Model Still Good? SRE/ML Engineers The Cloud Front End Model Server f7c5f9fe7b762 It’s a duck! BLUE 5 Blue Ducks 995 Yellow Ducks Accuracy = 99% False Positive = 1% ???????????????????
  • 73. Bayes’ Theorem: $P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}$
  • 74. Accuracy depends on the population distribution!
  • 75. Is My Model Still Good? SRE/ML Engineers The Cloud Front End Model Server f7c5f9fe7b762 It’s a duck! BLUE 995 Yellow Ducks 5 Blue Ducks WRONG 2/3 of the Time! Accuracy = 99% False Positive = 1% ???????????????????
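The duck slides can be checked with a few lines of arithmetic. This is a minimal sketch, assuming "Accuracy = 99%" means the model labels 99% of blue ducks as blue and "False Positive = 1%" means it labels 1% of yellow ducks as blue; the function name is illustrative.

```python
def prob_blue_given_predicted_blue(n_blue, n_yellow, true_positive_rate, false_positive_rate):
    """Bayes' theorem: P(blue | predicted blue) for a given population mix."""
    total = n_blue + n_yellow
    p_blue = n_blue / total
    p_yellow = n_yellow / total
    # P(predicted blue) = P(pred blue | blue) P(blue) + P(pred blue | yellow) P(yellow)
    p_predicted_blue = true_positive_rate * p_blue + false_positive_rate * p_yellow
    return true_positive_rate * p_blue / p_predicted_blue


skewed = prob_blue_given_predicted_blue(5, 995, 0.99, 0.01)
balanced = prob_blue_given_predicted_blue(500, 500, 0.99, 0.01)
print(f"5 blue / 995 yellow: a BLUE answer is right {skewed:.1%} of the time")
print(f"500 blue / 500 yellow: a BLUE answer is right {balanced:.1%} of the time")
```

Under these assumptions the skewed population gives roughly 33%, so a BLUE answer is wrong about two times in three, while the same model on a balanced population is right 99% of the time.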
  • 78. Is My Model Still Good? SRE/ML Engineers The Cloud Front End Model Server f7c5f9fe7b762 It’s a duck! BLUE 995 Yellow Ducks 5 Blue Ducks Model Server d4093cc84b267
  • 80. Is My Model Still Good? SRE/ML Engineers The Cloud Front End Model Server 995 Yellow Ducks 5 Blue Ducks d4093cc84b267
  • 81. Is My Model Still Good? SRE/ML Engineers The Cloud Front End Model Server 500 Yellow Ducks 500 Blue Ducks d4093cc84b267
  • 82. Is My Model Still Good? • Models != Code – they can go stale... QUICKLY. • IMPORTANT: o Watch your model & data for drift from training o Regularly (if not continuously) retrain, even before performance begins to fail o Rollbacks across multiple versions are not uncommon! • Without an e2e MLOps pipeline, many of the above are O(really really hard)!
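"Watch your model & data for drift from training" can be sketched by comparing the label mix seen in training with the mix seen live. This is a minimal sketch, not a method named in the talk: it uses total variation distance as one possible drift measure, and the function names and the `DRIFT_THRESHOLD` value are hypothetical. The counts are the two duck populations from slides 80 and 81.

```python
def class_shares(counts):
    """Convert raw label counts into proportions."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def total_variation_distance(train_counts, live_counts):
    """Half the summed absolute difference between two label distributions.

    0.0 means identical distributions; 1.0 means no overlap at all.
    """
    train = class_shares(train_counts)
    live = class_shares(live_counts)
    labels = set(train) | set(live)
    return 0.5 * sum(abs(train.get(label, 0.0) - live.get(label, 0.0)) for label in labels)


DRIFT_THRESHOLD = 0.1  # hypothetical; tune per application

training = {"yellow": 995, "blue": 5}
live = {"yellow": 500, "blue": 500}

drift = total_variation_distance(training, live)
needs_retraining = drift > DRIFT_THRESHOLD
print(f"drift = {drift:.3f}, retrain: {needs_retraining}")
```

A check like this runs on inputs or labels alone, so it can flag a shifted population before any drop in measured accuracy shows up.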
  • 83. What Does All This Stuff Solve For? 1. Does My Model Actually Work? 2. What Did My Customers See? 3. Is My Model Still Good?
  • 85. MLOps Gives* You… • Software best practices for building machine learning solutions • Repeatable workflow for training a model and rolling it out to production • An immutable record of what’s actually running • Lineage of model creation including data sources • Acceleration from code to customer benefits * Requires some human and software work
  • 86. What’s Next for MLOps • Simplify monitoring and retraining • Extend MLOps to data, including prep and profiling • Enterprise features o Test cases o Auditing o Security o Resource management (bin packing / resource optimization) o Network isolation • Metadata and API standards Or, better yet, you tell us!
  • 87. It’s a whole new world • Data science will touch EVERY industry. • We can’t ask everyone to get a PhD in statistics, though. • How do WE help everyone take advantage of this transformation?
  • 88. me: David Aronchick (david.aronchick@microsoft.com) twitter: @aronchick github: • https://github.com/aronchick/kubeflow-and-mlops • https://aka.ms/mlops THANK YOU!