MACHINE LEARNING MODEL DEPLOYMENT
From Strategy to Implementation
2 © Cloudera, Inc. All rights reserved.
ABOUT ME
• Head of Cloudera’s Fast Forward Labs ML research and consulting
team
• Built and scaled numerous production ML systems and teams
spanning government, B2B and consumer organizations
• Tech blogger. Musician. Twitter: @justinJDN
Justin Norman
Director DS & Research Svcs
ABOUT ME
• Cloudera Strategic Solutions Architect focused on Data Science
and Machine Learning
• Developed and deployed models across diverse verticals such
as Finance, Healthcare, etc.
• Frequent speaker at big data conferences, including O’Reilly
Strata
Sagar Kewalramani
Solutions Architect, Professional
Services
ML IS EVERYWHERE
Google didn’t set out to make a traffic tool.
Apple isn’t in the facial recognition business.
• Google predicts commute times.
• Apple predicts facial matches.
• Dozens of other ML-powered models in your phone today.
ML IS AT THE HEART OF TRANSFORMATION
AI
MACHINE
LEARNING
DATA SCIENCE
ANALYTICS
"BIG DATA"
Probabilistic: What could happen?
Deterministic: What happened?
WHAT IS PRODUCTION ML?
Data Engineering
Business Inputs
Data Science
Production Machine Learning
Packaging*
Pipeline Hardening (Data Engineering)
Model Hardening
Deploy
Monitoring
MODEL SECURITY
MODEL GOVERNANCE
DATA CATALOG
MODEL CATALOG
FEATURE CATALOG
WHICH TEAM ROLES ARE INVOLVED?
DATA ENGINEERING
DATA SCIENCE
PRODUCTION ML
DATA PREP
PIPELINES
DATA MODELING
DATA TRANSFORMATION
DATA INGEST
JOB MONITORING
TRAINING
DATA DISCOVERY
JOB TUNING
EXPERIMENTATION
PROTOTYPING
MODEL DEPLOYMENT
MODEL MONITORING
DATA MONITORING
WHAT ARE THE KEY SKILLS?
Big Data
Platform
ML/AI
Frameworks
Container
Infrastructure
Orchestration
WHAT IS A MODEL ANYWAY?
A model can take many forms, but at its core it is an algorithm designed to make predictions from input data
Upstream Systems → Model (Batch or Stream) → {key, value}: Prediction, Metadata → Business Systems, Monitoring
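The idea of a model as "data in, {key, value} out" can be sketched in a few lines. This is a minimal illustration, not anything from the talk: the commute-time rule, the field names (distance_km, rush_hour, model_version, scored_at) and the version label are all hypothetical stand-ins for a real trained model and its real output schema.

```python
import time

MODEL_VERSION = "demo-0.1"  # hypothetical version label, for illustration only


def predict_commute_minutes(distance_km, rush_hour):
    """Toy stand-in for a trained model: a hand-written rule, not a fitted one."""
    base = 2.0 * distance_km  # assumption: 2 minutes per km
    return base * (1.5 if rush_hour else 1.0)


def score(record):
    """Return the prediction plus metadata as a flat key/value dict."""
    prediction = predict_commute_minutes(record["distance_km"], record["rush_hour"])
    return {
        "key": record["id"],
        "prediction": prediction,
        "model_version": MODEL_VERSION,  # metadata for monitoring
        "scored_at": time.time(),        # metadata for monitoring
    }


def score_batch(records):
    """Batch mode: score a whole collection at once."""
    return [score(r) for r in records]


def score_stream(records):
    """Stream mode: score records one at a time as they arrive."""
    for r in records:
        yield score(r)
```

The same score function serves both paths, so business systems and monitoring receive an identical output shape whether the upstream feed is a batch or a stream.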
HIDDEN TECHNICAL DEBT IN ML SYSTEMS
Google Paper
Source: https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
Only a small fraction of real-world ML systems is composed of the ML code, as shown by the small black box in the middle. The
required surrounding infrastructure is vast and complex.
SAMPLE DATA SCIENCE / ML WORKFLOW
From Data Exploration to Action
CHALLENGES
Tools, Platforms, Data
?
CHALLENGES
Recipes, not Cakes
Recode
Deployment Expectations
• Support A/B testing
• Support experiments
• Support measuring and evaluating model performance
• Deployment should be fast and adaptive to business needs
SUMMARY OF CHALLENGES
• Access
For sensitive data, secure clusters are
difficult to access. No shared security
• Flexibility
IT typically doesn’t want random
packages installed on a secure cluster.
• Tools
Popular open source tools don’t easily
connect to these environments, or
always support Hadoop data formats.
Nothing supports full workflow
• Scale
Laptops rarely have capacity for
medium, let alone big data. This
leads to a lot of sampling.
• Parallelism
Popular frameworks don’t easily
parallelize on a cluster. Typically
code has to get rewritten for
production.
• Security
Data being pulled into laptops
• Developer Experience
Notebooks, while awesome, don’t
easily support virtual environment
and dependency management,
especially for teams.
• Collaboration
No easy way to share code between
teams
• Deployment
Notebooks are also challenging to
“put into production.”
MACHINE LEARNING AT UBER, NETFLIX, AND FACEBOOK
Industrialized AI requires new supporting tools and platforms
Facebook
FBLearner
Uber
Michelangelo
Netflix
Recommendation
Platform
ML AT SCALE REQUIRES A UNIFIED DATA STRATEGY
Streaming
Ingest
Batch Ingest
Machine
Learning Tools
BI Tools and
SQL Editors
Data Products
DATA, METADATA, SECURITY, GOVERNANCE, WORKLOAD MANAGEMENT
MACHINE
LEARNING
DATA
ENGINEERING
DATA
WAREHOUSE
OPERATIONAL
DATABASE
YOU’VE GOT OPTIONS…
Model Dev, Training, Deployment & Monitoring
MODEL DEVELOPMENT
EVERYONE HAS AN OPINION
• Should enable collaboration and code reuse
(git integration)
• Should support open-source frameworks and
libraries
• Must handle dependencies and isolate the dev
environment for an individual session
• Can scale compute resources up/down when
needed
• Doesn’t require you to move data to use it!
TRAINING & EXPERIMENTS
A/B TESTING & MULTIVARIATE TESTING FOR THE MODEL
Is the best trained model indeed the best model, or does a different model
perform better on new, unseen data?
MODEL
VARIATION A
MODEL
VARIATION B
INCOMING
TRAFFIC
Data scientists need ...
• A framework to identify the best performers
among a competing set of models
• To evaluate models which can maximize
business KPIs
• Track specified model metrics, performance,
and model artifacts
• Inspect and compare deployed models
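Splitting incoming traffic between two model variations and comparing them on fresh data might look like the sketch below. It is an illustration under stated assumptions, not the framework the talk has in mind: the hash-based routing, the ABTracker class, and the choice of mean absolute error as the comparison metric are all hypothetical. In practice the metric would be whichever business KPI the team is optimizing.

```python
import hashlib
from collections import defaultdict


def assign_variant(request_id, share_a=0.5):
    """Deterministically route a request to variant 'A' or 'B'.

    Hashing the request id keeps the assignment stable across retries.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # value in [0, 1)
    return "A" if bucket < share_a else "B"


class ABTracker:
    """Collects per-variant absolute errors once the true outcome is known."""

    def __init__(self):
        self.abs_errors = defaultdict(list)

    def record(self, variant, predicted, actual):
        self.abs_errors[variant].append(abs(predicted - actual))

    def mean_abs_error(self, variant):
        errors = self.abs_errors.get(variant, [])
        return sum(errors) / len(errors) if errors else None

    def best_variant(self):
        """Return the variant with the lowest mean absolute error so far."""
        scored = {v: self.mean_abs_error(v) for v, e in self.abs_errors.items() if e}
        return min(scored, key=scored.get) if scored else None
```

A real comparison would also need enough traffic per variant and a significance test before declaring a winner; the sketch only shows the routing and bookkeeping.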
EXPERIMENT MANAGEMENT
Versioned, reproducible model training & evaluation runs
Data scientists need to ...
• Create a snapshot of model code, dependencies,
and configuration necessary to train the model
• Build and execute the training run in an isolated
container
• Track specified model metrics, performance,
and model artifacts
• Inspect, compare, or deploy prior models
Many options exist, of varying maturity, and they don’t all
play well with other ecosystem tools
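The bookkeeping part of experiment management (snapshot the configuration, record metrics, compare prior runs) can be sketched with the standard library alone. Everything here is a hypothetical illustration: run_experiment, compare_runs, toy_train and the JSON-file layout are invented names, the "training" is a toy function, and the sketch deliberately leaves out code snapshots and container isolation, which a real tool would handle.

```python
import hashlib
import json
import tempfile
import time
from pathlib import Path


def run_experiment(train_fn, config, runs_dir):
    """Train with the given config and persist config + metrics as one run record."""
    config_json = json.dumps(config, sort_keys=True)
    # The run id is derived from the config, so the same config maps to the same id.
    run_id = hashlib.sha256(config_json.encode("utf-8")).hexdigest()[:12]
    metrics = train_fn(config)
    record = {"run_id": run_id, "config": config, "metrics": metrics,
              "finished_at": time.time()}
    (Path(runs_dir) / f"{run_id}.json").write_text(json.dumps(record, indent=2))
    return record


def compare_runs(runs_dir, metric):
    """Load all recorded runs, sorted so the lowest value of `metric` comes first."""
    records = [json.loads(p.read_text()) for p in Path(runs_dir).glob("*.json")]
    return sorted(records, key=lambda r: r["metrics"][metric])


def toy_train(config):
    """Toy stand-in for training: pretends 0.1 is the ideal learning rate."""
    return {"rmse": abs(config["learning_rate"] - 0.1)}


def demo():
    """Run three toy experiments in a temporary directory and return the best one."""
    with tempfile.TemporaryDirectory() as runs_dir:
        for lr in (0.5, 0.1, 0.01):
            run_experiment(toy_train, {"learning_rate": lr}, runs_dir)
        return compare_runs(runs_dir, "rmse")[0]
```

Calling demo() returns the run record whose configuration produced the lowest toy RMSE.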
Sacred
Proprietary
Open-Source
MODEL DEPLOYMENT
MODEL DEPLOYMENT PATTERNS
Knowing how business metrics will be improved helps guide deployment options
• Managers use data to make better decisions: Batch Scoring, Hosted
• Centrally automate internal decisions: Real Time Scoring, Hosted
• Centrally automate customer-facing decisions: Real Time Scoring, Data Flow + Custom Monitoring
• Automate decisions at the edge: Real Time Scoring, Device Embedded
MODEL DEPLOYMENT APPROACH : TECHNOLOGICAL VS COST BENEFITS
DIFFERENT MODEL DEPLOYMENT FORMATS
NATIVE JAVA/C++ MODEL
• Faster
• Limitation of Available Algo/DS Libraries
HYBRID APPROACH PMML:
• Compatibility across multiple tools
• Non Agile
• Not flexible in terms of deployment
PYTHON STACK
• PMML files are big
• Unit testing is tricky
API POWERED MODEL:
• Agile
• Scalable
• Can be used by both backend & frontend
• Faster
[Chart: the four approaches (API powered model, hybrid approach PMML, rebuild the whole stack to Python, native Java / C++ models) plotted by cost ($) against technological benefits]
MONITORING
MONITORING STATS
SCHEDULE & MONITOR
Production ML needs...
● A monitoring mechanism that is model-agnostic
● Instrumentation of both the data flow in and the model performance metrics out
● To collect performance metrics (e.g., accuracy, RMSE, mean absolute error (MAE))
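The metrics named above have standard definitions, and a model-agnostic monitor only needs to wrap the prediction call. The sketch below is one possible shape, not the mechanism the talk describes: MonitoredModel and its in-memory lists are hypothetical, and a production setup would ship these records to a metrics store instead.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def accuracy(actual, predicted):
    """Share of predictions that match the true label exactly."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


class MonitoredModel:
    """Wraps any prediction callable and logs the data in and the predictions out."""

    def __init__(self, predict_fn):
        self.predict_fn = predict_fn
        self.inputs = []
        self.outputs = []

    def predict(self, x):
        y = self.predict_fn(x)
        self.inputs.append(x)   # instrument the data flowing in
        self.outputs.append(y)  # instrument the predictions flowing out
        return y

    def report(self, actuals):
        """Compare logged predictions with true outcomes once they are known."""
        return {
            "n": len(self.outputs),
            "rmse": rmse(actuals, self.outputs),
            "mae": mae(actuals, self.outputs),
        }
```

Because the wrapper only depends on a callable, the same monitor works for any framework's model.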
CLOUDERA ML APPROACH
Modern enterprise platform, tools and expert guidance to add SPEED and SCALE
Agile platform to build,
train, and deploy many
scalable ML applications
Enterprise data science
tools to accelerate
team productivity
Expert guidance,
services & training to
fast track value & scale
ACCELERATING THREE STAGES OF MACHINE LEARNING
Enterprise AI platform supporting model development, training, and deployment
DEVELOP
Explore data
Develop models
Share results
TRAIN
Optimize parameters
Track experiments
Compare performance
DEPLOY
Manage models
Deploy models
Monitor performance
ACCELERATING MACHINE LEARNING
Lego Block for ML: Like a containerized edge node
Runtime
Engine: Kernels (R/Python/Scala), Common Libraries
FS Mounts: CDH - Parcel Dir, RPM - Hadoop Config Files
Project Dir: Code, Files, Libraries, Dependencies
Point in time: Git snapshot
SESSIONS
• Interactive session for exploration and development
EXPERIMENTS
• Initiate and track
• Like a lab notebook
• Export artifacts to project
MODELS
• Wrap with REST endpoint
• Online scoring
• JSON in, JSON out
JOBS
• Scheduled
• Run a particular piece of code end-to-end
• New snapshots retain history
DEMO
SELF-SERVICE
CLOUDERA DATA SCIENCE WORKBENCH
CLOUDERA DATA SCIENCE WORKBENCH
Bringing the data scientists TO the data in a way that they want to work
For data scientists
• Experiment faster
Use R, Python, or Scala with
on-demand compute and
secure CDH/HDP data access
• Work together
Share reproducible research
with your whole team
• Deploy with confidence
Get to production repeatably
and without recoding
For IT professionals
• Bring data science to the data
Give your data science team
more freedom while reducing
the risk and cost of silos
• Secure by default
Leverage common security
and governance across
workloads
• Run anywhere
On-premises or in the cloud
CDSW MODELS
Machine learning models as one-click microservices (REST APIs)
1. Choose file, e.g. score.py
2. Choose function, e.g. forecast
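import pickle

# Load the trained model once, when the script is first loaded.
with open('model.pk', 'rb') as f:
    model = pickle.load(f)

def forecast(data):
    # Score the incoming data with the loaded model.
    return model.predict(data)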
3. Choose resources
4. Deploy!
Running model containers also have access to CDH
for data lookups.
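The "JSON in, JSON out" contract around a function like forecast can be sketched as follows. This is not how CDSW itself wraps the function, which the slides do not show: handle_request, the "rows" and "predictions" field names, and the MeanModel stand-in (used here so the example runs without a pickled model file) are all assumptions for illustration.

```python
import json


class MeanModel:
    """Stand-in for the unpickled model: predicts the mean of each input row."""

    def predict(self, rows):
        return [sum(row) / len(row) for row in rows]


model = MeanModel()


def forecast(data):
    """Same shape as the deployed function: data in, predictions out."""
    return model.predict(data)


def handle_request(body):
    """Parse a JSON request body, score it, and return a JSON response body."""
    try:
        payload = json.loads(body)
        rows = payload["rows"]
    except (ValueError, KeyError, TypeError) as exc:
        # Malformed JSON or a missing "rows" field: report it instead of crashing.
        return json.dumps({"error": f"bad request: {exc}"})
    return json.dumps({"predictions": forecast(rows)})
```

Keeping parsing and validation outside forecast means the scoring function stays identical between a notebook session and the deployed endpoint.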
CDSW EXPERIMENTS
Versioned model training runs for evaluation and reproducibility
Data scientists can ...
• Create a snapshot of model code, dependencies,
and configuration necessary to train the model
• Build and execute the training run in an isolated
container
• Track specified model metrics, performance,
and model artifacts
• Inspect, compare, or deploy prior models
MODEL MANAGEMENT
View, test, monitor, and update models by team or project
CDSW JOBS TO ORCHESTRATE BATCH SCORING
Schedule reports & scoring to run on a periodic basis
Scheduling is easy and powerful
● Execute arbitrary scripts
● Schedule on a recurring basis
● Create dependencies on other jobs for complex pipelines
● Allow output to be sent via email to recipients
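Dependencies between jobs amount to running them in topological order. The sketch below shows that idea with Python's standard graphlib module (Python 3.9+); it is not the CDSW scheduler, and run_pipeline plus the ingest/score/report job names in the usage note are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(jobs, depends_on):
    """Run each job after all the jobs it depends on.

    jobs:       dict mapping job name -> zero-argument callable
    depends_on: dict mapping job name -> list of job names that must run first
    Raises graphlib.CycleError if the dependencies form a cycle.
    """
    graph = {name: depends_on.get(name, ()) for name in jobs}
    order = list(TopologicalSorter(graph).static_order())
    results = {}
    for name in order:
        results[name] = jobs[name]()
    return order, results
```

For example, with jobs "ingest", "score" and "report", where "score" depends on "ingest" and "report" depends on "score", run_pipeline executes them in exactly that order.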
SUMMARY OF FEATURES
End-to-End Workflow Support
• Development
• Train
• Deployment
Collaboration
• Teams
• Sharing
• Good coding practices (Git)
Security and Governance
• Transparent
• Leverages underlying frameworks
• No data movement
• Reproducibility
Openness and Self-service
• Any framework
• Isolated for individual effectiveness
• Simplified dependency management
THANK YOU

More Related Content

What's hot

MLOps Using MLflow
MLOps Using MLflowMLOps Using MLflow
MLOps Using MLflowDatabricks
 
“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOps“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOpsRui Quintino
 
Google Cloud Machine Learning
 Google Cloud Machine Learning  Google Cloud Machine Learning
Google Cloud Machine Learning India Quotient
 
ML-Ops: From Proof-of-Concept to Production Application
ML-Ops: From Proof-of-Concept to Production ApplicationML-Ops: From Proof-of-Concept to Production Application
ML-Ops: From Proof-of-Concept to Production ApplicationHunter Carlisle
 
Ml ops intro session
Ml ops   intro sessionMl ops   intro session
Ml ops intro sessionAvinash Patil
 
MLOps - The Assembly Line of ML
MLOps - The Assembly Line of MLMLOps - The Assembly Line of ML
MLOps - The Assembly Line of MLJordan Birdsell
 
From Data Science to MLOps
From Data Science to MLOpsFrom Data Science to MLOps
From Data Science to MLOpsCarl W. Handlin
 
Managing the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflowManaging the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflowDatabricks
 
The A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsThe A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsDataPhoenix
 
Simplifying Model Management with MLflow
Simplifying Model Management with MLflowSimplifying Model Management with MLflow
Simplifying Model Management with MLflowDatabricks
 
Deploying ML models in the enterprise
Deploying ML models in the enterpriseDeploying ML models in the enterprise
Deploying ML models in the enterprisedoppenhe
 
MLops workshop AWS
MLops workshop AWSMLops workshop AWS
MLops workshop AWSGili Nachum
 
Apply MLOps at Scale by H&M
Apply MLOps at Scale by H&MApply MLOps at Scale by H&M
Apply MLOps at Scale by H&MDatabricks
 
MLOps with Azure DevOps
MLOps with Azure DevOpsMLOps with Azure DevOps
MLOps with Azure DevOpsMarco Parenzan
 
Databricks Overview for MLOps
Databricks Overview for MLOpsDatabricks Overview for MLOps
Databricks Overview for MLOpsDatabricks
 
MLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionMLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionProvectus
 
Ml ops past_present_future
Ml ops past_present_futureMl ops past_present_future
Ml ops past_present_futureNisha Talagala
 
Managing the Machine Learning Lifecycle with MLOps
Managing the Machine Learning Lifecycle with MLOpsManaging the Machine Learning Lifecycle with MLOps
Managing the Machine Learning Lifecycle with MLOpsFatih Baltacı
 
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...James Serra
 

What's hot (20)

MLOps Using MLflow
MLOps Using MLflowMLOps Using MLflow
MLOps Using MLflow
 
“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOps“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOps
 
Google Cloud Machine Learning
 Google Cloud Machine Learning  Google Cloud Machine Learning
Google Cloud Machine Learning
 
ML-Ops: From Proof-of-Concept to Production Application
ML-Ops: From Proof-of-Concept to Production ApplicationML-Ops: From Proof-of-Concept to Production Application
ML-Ops: From Proof-of-Concept to Production Application
 
Ml ops intro session
Ml ops   intro sessionMl ops   intro session
Ml ops intro session
 
MLOps - The Assembly Line of ML
MLOps - The Assembly Line of MLMLOps - The Assembly Line of ML
MLOps - The Assembly Line of ML
 
Machine Learning Operations & Azure
Machine Learning Operations & AzureMachine Learning Operations & Azure
Machine Learning Operations & Azure
 
From Data Science to MLOps
From Data Science to MLOpsFrom Data Science to MLOps
From Data Science to MLOps
 
Managing the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflowManaging the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflow
 
The A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsThe A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOps
 
Simplifying Model Management with MLflow
Simplifying Model Management with MLflowSimplifying Model Management with MLflow
Simplifying Model Management with MLflow
 
Deploying ML models in the enterprise
Deploying ML models in the enterpriseDeploying ML models in the enterprise
Deploying ML models in the enterprise
 
MLops workshop AWS
MLops workshop AWSMLops workshop AWS
MLops workshop AWS
 
Apply MLOps at Scale by H&M
Apply MLOps at Scale by H&MApply MLOps at Scale by H&M
Apply MLOps at Scale by H&M
 
MLOps with Azure DevOps
MLOps with Azure DevOpsMLOps with Azure DevOps
MLOps with Azure DevOps
 
Databricks Overview for MLOps
Databricks Overview for MLOpsDatabricks Overview for MLOps
Databricks Overview for MLOps
 
MLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionMLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in Production
 
Ml ops past_present_future
Ml ops past_present_futureMl ops past_present_future
Ml ops past_present_future
 
Managing the Machine Learning Lifecycle with MLOps
Managing the Machine Learning Lifecycle with MLOpsManaging the Machine Learning Lifecycle with MLOps
Managing the Machine Learning Lifecycle with MLOps
 
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...
 

Similar to Machine Learning Model Deployment: Strategy to Implementation

The Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine LearningThe Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine LearningCloudera, Inc.
 
Machine Learning Models: From Research to Production 6.13.18
Machine Learning Models: From Research to Production 6.13.18Machine Learning Models: From Research to Production 6.13.18
Machine Learning Models: From Research to Production 6.13.18Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Cloudera, Inc.
 
Data Science in the Enterprise
Data Science in the EnterpriseData Science in the Enterprise
Data Science in the EnterpriseThe Hive
 
Cloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera, Inc.
 
The 5 Biggest Data Myths in Telco: Exposed
The 5 Biggest Data Myths in Telco: ExposedThe 5 Biggest Data Myths in Telco: Exposed
The 5 Biggest Data Myths in Telco: ExposedCloudera, Inc.
 
Part 3: Models in Production: A Look From Beginning to End
Part 3: Models in Production: A Look From Beginning to EndPart 3: Models in Production: A Look From Beginning to End
Part 3: Models in Production: A Look From Beginning to EndCloudera, Inc.
 
Big Data LDN 2018: A LOOK INSIDE APPLIED MACHINE LEARNING
Big Data LDN 2018: A LOOK INSIDE APPLIED MACHINE LEARNINGBig Data LDN 2018: A LOOK INSIDE APPLIED MACHINE LEARNING
Big Data LDN 2018: A LOOK INSIDE APPLIED MACHINE LEARNINGMatt Stubbs
 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.
 
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadCloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadDataWorks Summit
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
 
Data Science in Enterprise
Data Science in EnterpriseData Science in Enterprise
Data Science in EnterpriseJosh Yeh
 
Edge to AI: Analytics from Edge to Cloud with Efficient Movement of Machine ...
Edge to AI:  Analytics from Edge to Cloud with Efficient Movement of Machine ...Edge to AI:  Analytics from Edge to Cloud with Efficient Movement of Machine ...
Edge to AI: Analytics from Edge to Cloud with Efficient Movement of Machine ...Timothy Spann
 
The Edge to AI Deep Dive Barcelona Meetup March 2019
The Edge to AI Deep Dive Barcelona Meetup March 2019The Edge to AI Deep Dive Barcelona Meetup March 2019
The Edge to AI Deep Dive Barcelona Meetup March 2019Timothy Spann
 
Enterprise Metadata Integration, Cloudera
Enterprise Metadata Integration, ClouderaEnterprise Metadata Integration, Cloudera
Enterprise Metadata Integration, ClouderaNeo4j
 
Part 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchPart 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchCloudera, Inc.
 
Dagster - DataOps and MLOps for Machine Learning Engineers.pdf
Dagster - DataOps and MLOps for Machine Learning Engineers.pdfDagster - DataOps and MLOps for Machine Learning Engineers.pdf
Dagster - DataOps and MLOps for Machine Learning Engineers.pdfHong Ong
 

Similar to Machine Learning Model Deployment: Strategy to Implementation (20)

The Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine LearningThe Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine Learning
 
Machine Learning Models: From Research to Production 6.13.18
Machine Learning Models: From Research to Production 6.13.18Machine Learning Models: From Research to Production 6.13.18
Machine Learning Models: From Research to Production 6.13.18
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Data Science in the Enterprise
Data Science in the EnterpriseData Science in the Enterprise
Data Science in the Enterprise
 
Cloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made Easy
 
The 5 Biggest Data Myths in Telco: Exposed
The 5 Biggest Data Myths in Telco: ExposedThe 5 Biggest Data Myths in Telco: Exposed
The 5 Biggest Data Myths in Telco: Exposed
 
Part 3: Models in Production: A Look From Beginning to End
Part 3: Models in Production: A Look From Beginning to EndPart 3: Models in Production: A Look From Beginning to End
Part 3: Models in Production: A Look From Beginning to End
 
Cloudera SDX
Cloudera SDXCloudera SDX
Cloudera SDX
 
Big Data LDN 2018: A LOOK INSIDE APPLIED MACHINE LEARNING
Big Data LDN 2018: A LOOK INSIDE APPLIED MACHINE LEARNINGBig Data LDN 2018: A LOOK INSIDE APPLIED MACHINE LEARNING
Big Data LDN 2018: A LOOK INSIDE APPLIED MACHINE LEARNING
 
Federated Learning
Federated LearningFederated Learning
Federated Learning
 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 

 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadCloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 
Data Science in Enterprise
Data Science in EnterpriseData Science in Enterprise
Data Science in Enterprise
 
Edge to AI: Analytics from Edge to Cloud with Efficient Movement of Machine ...
Edge to AI:  Analytics from Edge to Cloud with Efficient Movement of Machine ...Edge to AI:  Analytics from Edge to Cloud with Efficient Movement of Machine ...
Edge to AI: Analytics from Edge to Cloud with Efficient Movement of Machine ...
 
The Edge to AI Deep Dive Barcelona Meetup March 2019
The Edge to AI Deep Dive Barcelona Meetup March 2019The Edge to AI Deep Dive Barcelona Meetup March 2019
The Edge to AI Deep Dive Barcelona Meetup March 2019
 
Enterprise Metadata Integration, Cloudera
Enterprise Metadata Integration, ClouderaEnterprise Metadata Integration, Cloudera
Enterprise Metadata Integration, Cloudera
 
Part 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchPart 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science Workbench
 
Dagster - DataOps and MLOps for Machine Learning Engineers.pdf
Dagster - DataOps and MLOps for Machine Learning Engineers.pdfDagster - DataOps and MLOps for Machine Learning Engineers.pdf
Dagster - DataOps and MLOps for Machine Learning Engineers.pdf
 

More from DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureDataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Machine Learning Model Deployment: Strategy to Implementation

  • 1. MACHINE LEARNING MODEL DEPLOYMENT From Strategy to Implementation
  • 2. © Cloudera, Inc. All rights reserved. ABOUT ME: Justin Norman, Director DS & Research Svcs • Head of Cloudera’s Fast Forward Labs ML research and consulting team • Built and scaled numerous production ML systems and teams spanning government, B2B and consumer organizations • Tech blogger. Musician. Twitter: @justinJDN
  • 3. ABOUT ME: Sagar Kewalramani, Solutions Architect, Professional Services • Cloudera Strategic Solutions Architect focused on Data Science and Machine Learning • Developed and deployed models across diverse verticals such as finance and healthcare • Frequent speaker at big data conferences, including O’Reilly Strata
  • 4. ML IS EVERYWHERE Google didn’t set out to make a traffic tool. Apple isn’t in the facial recognition business. • Google predicts commute times. • Apple predicts facial matches. • Dozens of other ML-powered models in your phone today.
  • 5. ML IS AT THE HEART OF TRANSFORMATION Diagram labels: AI, MACHINE LEARNING, DATA SCIENCE, ANALYTICS, "BIG DATA"; Probabilistic (What could happen?) vs. Deterministic (What happened?)
  • 6. WHAT IS PRODUCTION ML? Data Engineering Business Inputs Data Science Production Machine Learning Packaging* Pipeline Hardening (Data Engineering) Model Hardening Deploy Monitoring MODEL SECURITY MODEL GOVERNANCE DATA CATALOG MODEL CATALOG FEATURE CATALOG
  • 7. WHICH TEAM ROLES ARE INVOLVED? DATA ENGINEERING DATA SCIENCE PRODUCTION ML DATA PREP PIPELINES DATA MODELING DATA TRANSFORMATION DATA INGEST JOB MONITORING TRAINING DATA DISCOVERY JOB TUNING EXPERIMENTATION PROTOTYPING MODEL DEPLOYMENT MODEL MONITORING DATA MONITORING
  • 8. WHAT ARE THE KEY SKILLS? Big Data Platform ML/AI Frameworks Container Infrastructure Orchestration
  • 9. WHAT IS A MODEL ANYWAY? An algorithm, taking many forms, designed to make predictions based on data input. Diagram labels: Upstream Systems; {key, value}; Batch or Stream; Model; Prediction; Metadata; Monitoring; Business Systems
  • 10. HIDDEN TECHNICAL DEBT IN ML SYSTEMS Google Paper Source: https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf Only a small fraction of real-world ML systems is composed of the ML code, as shown by the small black box in the middle. The required surrounding infrastructure is vast and complex.
  • 11. SAMPLE DATA SCIENCE / ML WORKFLOW From Data Exploration to Action
  • 12. CHALLENGES Tools, Platforms, Data?
  • 13. CHALLENGES Recipes, not Cakes Recode Deployment Expectations • Support A/B testing • Support experiments • Support measuring and evaluating model performance • Deployment should be fast and adaptive to business needs
  • 14. SUMMARY OF CHALLENGES • Access: For sensitive data, secure clusters are difficult to access. No shared security. • Flexibility: IT typically doesn’t want random packages installed on a secure cluster. • Tools: Popular open source tools don’t easily connect to these environments and don’t always support Hadoop data formats. Nothing supports the full workflow. • Scale: Laptops rarely have capacity for medium, let alone big, data. This leads to a lot of sampling. • Parallelism: Popular frameworks don’t easily parallelize on a cluster. Typically code has to be rewritten for production. • Security: Data is pulled onto laptops. • Developer Experience: Notebooks, while awesome, don’t easily support virtual environments and dependency management, especially for teams. • Collaboration: No easy way to share code between teams. • Deployment: Notebooks are also challenging to “put into production.”
  • 15. MACHINE LEARNING AT UBER, NETFLIX, AND FACEBOOK Industrialized AI requires new supporting tools and platforms: Facebook FBLearner, Uber Michelangelo, Netflix Recommendation Platform
  • 16. ML AT SCALE REQUIRES A UNIFIED DATA STRATEGY Streaming Ingest Batch Ingest Machine Learning Tools BI Tools and SQL Editors Data Products DATA, METADATA, SECURITY, GOVERNANCE, WORKLOAD MANAGEMENT MACHINE LEARNING DATA ENGINEERING DATA WAREHOUSE OPERATIONAL DATABASE
  • 17. YOU’VE GOT OPTIONS… Model Dev, Training, Deployment & Monitoring
  • 18. MODEL DEVELOPMENT
  • 19. EVERYONE HAS AN OPINION • Should enable collaboration and code reuse (Git integration) • Should support open-source frameworks and libraries • Must handle dependencies and isolate the dev environment for an individual session • Can scale compute resources up/down when needed • Doesn’t require you to move data to use it!
  • 20. TRAINING & EXPERIMENTS
  • 21. A/B TESTING & MULTIVARIATE TESTING FOR THE MODEL Is the best trained model indeed the best model, or does a different model perform better on new, unseen data? Diagram: INCOMING TRAFFIC split between MODEL VARIATION A and MODEL VARIATION B. Data scientists need ... • A framework to identify the best performers among a competing set of models • To evaluate which models can maximize business KPIs • To track specified model metrics, performance, and model artifacts • To inspect and compare deployed models
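The traffic split on slide 21 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the framework the speakers used: `assign_variation`, `OutcomeTracker`, and the 50/50 default share are hypothetical names and values chosen for the example. Hashing the request ID keeps routing deterministic, so the same caller always reaches the same model variation.

```python
import hashlib
from collections import defaultdict


def assign_variation(request_id: str, share_a: float = 0.5) -> str:
    """Deterministically route a request to model variation 'A' or 'B'.

    share_a is the fraction of traffic sent to variation A (hypothetical knob).
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "A" if bucket < share_a else "B"


class OutcomeTracker:
    """Counts trials and successes per variation so a business KPI can be compared."""

    def __init__(self):
        self.trials = defaultdict(int)
        self.successes = defaultdict(int)

    def record(self, variation: str, success: bool) -> None:
        self.trials[variation] += 1
        self.successes[variation] += int(success)

    def rate(self, variation: str) -> float:
        """Success rate for a variation; 0.0 when it has no traffic yet."""
        n = self.trials[variation]
        return self.successes[variation] / n if n else 0.0
```

A real comparison would also need a significance test before declaring a winner; the tracker above only collects the raw rates.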
  • 22. EXPERIMENT MANAGEMENT Versioned, reproducible model training & evaluation runs Data scientists need to ... • Create a snapshot of model code, dependencies, and configuration necessary to train the model • Build and execute the training run in an isolated container • Track specified model metrics, performance, and model artifacts • Inspect, compare, or deploy prior models Many options exist, of varying maturity, and they don’t all play well with other ecosystem tools. Labels: Sacred, Proprietary, Open-Source
  • 23. MODEL DEPLOYMENT
  • 24. MODEL DEPLOYMENT PATTERNS Knowing how business metrics will be improved helps guide deployment options • Managers use data to make better decisions • Centrally automate internal decisions • Centrally automate customer-facing decisions • Automate decisions at the edge Batch Scoring, Hosted Real Time Scoring, Hosted Real Time Scoring, Data Flow + Custom Monitoring Real Time Scoring, Device Embedded
  • 25. MODEL DEPLOYMENT APPROACH: TECHNOLOGICAL VS COST BENEFITS DIFFERENT MODEL DEPLOYMENT FORMATS NATIVE JAVA/C++ MODEL • Faster • Limitation of available algo/DS libraries HYBRID APPROACH PMML: • Compatibility across multiple tools • Non-agile • Not flexible in terms of deployment PYTHON STACK • PMML files are big • Unit testing is tricky API POWERED MODEL: • Agile • Scalable • Can be used by both backend & frontend • Faster Chart (COST $ vs. TECHNOLOGICAL BENEFITS): API POWERED MODEL, HYBRID APPROACH PMML, REBUILD THE WHOLE STACK TO PYTHON, NATIVE JAVA / C++ MODELS
  • 26. MONITORING
  • 27. MONITORING STATS SCHEDULE & MONITOR Production ML needs... ● A monitoring mechanism that is model-agnostic ● Instrumentation of both the data flow in and the model performance metrics out ● To collect performance metrics (e.g., accuracy, RMSE, mean absolute error (MAE))
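The performance metrics named on slide 27 can be computed from logged predictions and observed outcomes alone, which is what makes such monitoring model-agnostic. The sketch below is a plain-Python illustration; the function names and the `_check` helper are assumptions for this example, not part of any Cloudera API.

```python
import math


def _check(y_true, y_pred):
    """Guard against empty or mismatched inputs before computing a metric."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")


def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the observed labels."""
    _check(y_true, y_pred)
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    _check(y_true, y_pred)
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    _check(y_true, y_pred)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```

Because the functions only see pairs of observed and predicted values, the same monitoring job could score a Python, PMML, or native model without changes.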
  • 28. CLOUDERA ML APPROACH Modern enterprise platform, tools and expert guidance to add SPEED and SCALE • Agile platform to build, train, and deploy many scalable ML applications • Enterprise data science tools to accelerate team productivity • Expert guidance, services & training to fast track value & scale
  • 29. ACCELERATING THREE STAGES OF MACHINE LEARNING Enterprise AI platform supporting model development, training, and deployment • DEVELOP: Explore data, Develop models, Share results • TRAIN: Optimize parameters, Track experiments, Compare performance • DEPLOY: Manage models, Deploy models, Monitor performance
  • 30. ACCELERATING MACHINE LEARNING Lego Block for ML: Like a containerized edge node • SESSIONS: Interactive session for exploration and development • EXPERIMENTS: Initiate and track; like a lab notebook; export artifacts to project • MODELS: Wrap with REST endpoint; online scoring; JSON in, JSON out • JOBS: Scheduled; run a particular code end-to-end; new snapshots retain history • Runtime Engine: Kernels (R/Python/Scala), Common Libraries • FS Mounts: CDH - Parcel Dir, RPM - Hadoop Config Files • Project Dir: Code Files, Libraries, Dependencies • Point in time Git snapshot
  • 31. DEMO
  • 32. SELF-SERVICE CLOUDERA DATA SCIENCE WORKBENCH
  • 33. CLOUDERA DATA SCIENCE WORKBENCH Bringing the data scientists TO the data in a way that they want to work For data scientists • Experiment faster: Use R, Python, or Scala with on-demand compute and secure CDH/HDP data access • Work together: Share reproducible research with your whole team • Deploy with confidence: Get to production repeatably and without recoding For IT professionals • Bring data science to the data: Give your data science team more freedom while reducing the risk and cost of silos • Secure by default: Leverage common security and governance across workloads • Run anywhere: On-premises or in the cloud
  • 34. CDSW MODELS Machine learning models as one-click microservices (REST APIs) 1. Choose file, e.g. score.py 2. Choose function, e.g. forecast f = open('model.pk', 'rb') model = pickle.load(f) def forecast(data): return model.predict(data) 3. Choose resources 4. Deploy! Running model containers also have access to CDH for data lookups.
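The scoring file on slide 34 is flattened in the transcript and omits its import. A runnable sketch of the same pattern follows, under several assumptions: `LinearModel` is a hypothetical stand-in for a trained model, the pickled artifact holds only its coefficients (so the example does not depend on how the file is executed), the temporary path replaces the slide's `model.pk`, and the `{"x": [...]}` input shape is invented for illustration. The key idea from the slide is preserved: the model is loaded once at import time, and `forecast` is the function the service exposes.

```python
import os
import pickle
import tempfile


class LinearModel:
    """Hypothetical stand-in for a trained model; exposes predict() like the slide's model."""

    def __init__(self, slope, intercept):
        self.slope = slope
        self.intercept = intercept

    def predict(self, xs):
        return [self.slope * x + self.intercept for x in xs]


# Training side (normally a separate script): persist the fitted parameters.
model_path = os.path.join(tempfile.mkdtemp(), "model.pk")
with open(model_path, "wb") as f:
    pickle.dump({"slope": 2.0, "intercept": 1.0}, f)

# score.py side: load the artifact once, when the file is imported.
with open(model_path, "rb") as f:
    model = LinearModel(**pickle.load(f))


def forecast(data):
    """JSON-like dict in, JSON-serializable dict out."""
    return {"forecast": model.predict(data["x"])}
```

Loading at import time rather than inside `forecast` avoids re-reading the file on every request. Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.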
  • 35. CDSW EXPERIMENTS Versioned model training runs for evaluation and reproducibility Data scientists can ... • Create a snapshot of model code, dependencies, and configuration necessary to train the model • Build and execute the training run in an isolated container • Track specified model metrics, performance, and model artifacts • Inspect, compare, or deploy prior models
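The tracking and comparison steps on slide 35 can be illustrated with a small record type. This is a sketch of the general idea only, not how CDSW stores experiments: `ExperimentRun`, `run_id`, and `best_run` are hypothetical names. Deriving the ID from a canonical form of the configuration means two runs with identical settings are recognizably the same experiment.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class ExperimentRun:
    """One training run: the configuration snapshot plus the metrics it produced."""

    config: dict
    metrics: dict = field(default_factory=dict)

    @property
    def run_id(self) -> str:
        # Sorting keys makes the ID independent of dict insertion order.
        canonical = json.dumps(self.config, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def best_run(runs, metric, higher_is_better=True):
    """Return the run with the best value of `metric` among runs that logged it.

    Raises ValueError if no run logged the metric.
    """
    scored = [r for r in runs if metric in r.metrics]
    chooser = max if higher_is_better else min
    return chooser(scored, key=lambda r: r.metrics[metric])
```

A fuller snapshot would also pin the code revision and dependency versions, as the slide's first bullet describes; only the configuration is hashed here to keep the example short.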
  • 36. MODEL MANAGEMENT View, test, monitor, and update models by team or project
  • 37. CDSW JOBS TO ORCHESTRATE BATCH SCORING Schedule reports & scoring to run on a periodic basis Scheduling is easy and powerful ● Execute arbitrary scripts ● Schedule on a recurring basis ● Create dependencies on other jobs for complex pipelines ● Allow output to be sent via email to recipients
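The job dependencies mentioned on slide 37 amount to running jobs in topological order. The sketch below shows that idea with Python's standard `graphlib` module (Python 3.9+); it is not the CDSW Jobs API, and `run_pipeline` plus the job names in the usage note are hypothetical.

```python
from graphlib import TopologicalSorter


def run_pipeline(jobs, dependencies):
    """Run each job after the jobs it depends on.

    jobs: mapping of job name -> callable taking the dict of earlier results.
    dependencies: mapping of job name -> iterable of job names it waits for.
    Returns the execution order and each job's result.
    """
    graph = {name: set(dependencies.get(name, ())) for name in jobs}
    # static_order() yields every job after all of its predecessors and
    # raises graphlib.CycleError if the dependencies are circular.
    order = list(TopologicalSorter(graph).static_order())
    results = {}
    for name in order:
        results[name] = jobs[name](results)
    return order, results
```

For example, a batch-scoring pipeline with hypothetical jobs `ingest`, `score`, and `report`, where `score` waits for `ingest` and `report` waits for `score`, always runs in that order regardless of how the jobs are listed.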
  • 38. SUMMARY OF FEATURES • End-to-End Workflow Support: Development, Training, Deployment • Collaboration: Teams, Sharing, Good coding practices (Git) • Security and Governance: Transparent, Leverages underlying frameworks, No data movement, Reproducibility • Openness and Self-service: Any framework, Isolated for individual effectiveness, Simplified dependency management
  • 39. THANK YOU