Scale Machine Learning Deployment
Gang Tao
Data Science Project Life Cycle
Model Persistence
Python Sklearn / Spark / Tensorflow
▶ Python pickle-based code serialization
▶ sklearn.externals.joblib
▶ Spark provides an API to save a model/pipeline as a file
▶ Tensorflow provides tf.train.Saver, which persists the tensor graph
▶ It is pickle + metadata + checkpoint
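As a rough illustration of pickle-based persistence, the sketch below saves and reloads a toy model using only the standard library. ThresholdModel, save_model, and load_model are hypothetical names made up for this example; a real sklearn estimator would be handled the same way, but that is an assumption rather than something shown here.

```python
import pickle
import tempfile
from pathlib import Path


class ThresholdModel:
    """Hypothetical stand-in for a trained model (not a real sklearn estimator)."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, values):
        # Label each value 1 if it exceeds the learned threshold, else 0.
        return [1 if v > self.threshold else 0 for v in values]


def save_model(model, path):
    # pickle stores the object's state plus a reference to its class.
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    # The class definition must be importable when loading.
    with open(path, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.pkl"
    save_model(ThresholdModel(threshold=0.5), model_path)
    restored = load_model(model_path)

predictions = restored.predict([0.2, 0.7, 0.9])
```

Because the file only references the class by name, the loading side needs the same code available, which is one reason this approach ties a model to its Python environment.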
Issues and Limitations
▶ Models from different tools are not compatible
▶ Code serialization depends on the Python version
▶ Code serialization raises potential security concerns
▶ For a tf model, the tensor names are required (need to check whether they are in the metadata)
▶ A tf model depends on the custom code that defines its custom operations
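The security concern can be made concrete with a small sketch. It assumes a tampered model file whose pickle payload calls a function at load time (here the harmless built-in sorted), and it uses the allow-list pattern of overriding Unpickler.find_class. TamperedModel, ALLOWED, and restricted_loads are illustrative names, not part of any tool discussed in this deck.

```python
import io
import pickle


class TamperedModel:
    """Stand-in for a malicious model file: loading it calls a function."""

    def __reduce__(self):
        # pickle will call sorted([3, 1, 2]) at load time. An attacker
        # could name any importable callable here instead.
        return (sorted, ([3, 1, 2],))


# Illustrative allow-list; a real service would list its own model classes.
ALLOWED = {("builtins", "list"), ("builtins", "dict")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Refuse every global that is not explicitly allowed.
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    return RestrictedUnpickler(io.BytesIO(data)).load()


blob = pickle.dumps(TamperedModel())

# Plain pickle runs the embedded call and hands back its result.
loaded = pickle.loads(blob)

# The restricted loader rejects the same bytes.
try:
    restricted_loads(blob)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

An allow-list reduces the risk but does not make untrusted pickles safe in general, so loading only files from a trusted source remains the safer default.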
A simple view of model deployment
ML Deployment Challenges
▶ Enable a wide range of ML modeling tools: Python, R, Tensorflow, Spark
▶ Scale up and down
▶ Performance and latency optimization
▶ Model access and API
▶ Audit and versioning
▶ CI/CD
▶ Metrics and monitoring
▶ Optimization, A/B tests
Seldon
▶ Seldon, a London company, focuses on providing control over machine learning based on open source software
▶ Seldon Core is an open source platform for deploying machine learning models on Kubernetes
• Python/Spark/H2O/R model support
• REST and gRPC API
• Deploys an inference graph of models, routers, combiners, and transformers as microservices
• Leverages K8s to provide scale, security, monitoring, etc.
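To make the inference-graph idea concrete, here is a toy in-process sketch of a transformer feeding two models whose outputs are merged by a combiner. It is not Seldon Core's API: in Seldon each node would run as its own microservice, and every function name below is hypothetical.

```python
from statistics import mean


def scale_transformer(features):
    # Transformer node: preprocess input before it reaches the models.
    return [x / 10.0 for x in features]


def model_a(features):
    # Hypothetical model: score is the sum of the features.
    return sum(features)


def model_b(features):
    # Hypothetical model: score is the largest feature.
    return max(features)


def average_combiner(scores):
    # Combiner node: merge the child models' outputs into one answer.
    return mean(scores)


def run_graph(features):
    transformed = scale_transformer(features)
    scores = [model(transformed) for model in (model_a, model_b)]
    return average_combiner(scores)


result = run_graph([10, 20, 30])
```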
Summary
Pros
▶ Seamless K8s integration
▶ Graph definition to support A/B tests and ensembling
Cons
▶ No Scala support for Spark
▶ Needs a custom image for PySpark
▶ No customization support for liveness/readiness checks due to the CRD
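The A/B test support mentioned above boils down to a router node that splits traffic between model versions. The sketch below is a minimal in-process version under that assumption; ABRouter and its parameters are made up for illustration and do not reflect how Seldon defines routers in its graph specification.

```python
import random
from collections import Counter


class ABRouter:
    """Toy router: sends each request to one of two models by traffic share."""

    def __init__(self, model_a, model_b, share_a, seed=None):
        self.model_a = model_a
        self.model_b = model_b
        self.share_a = share_a
        self.rng = random.Random(seed)

    def route(self, features):
        # Pick the branch first so the choice can be logged with the result.
        branch = "a" if self.rng.random() < self.share_a else "b"
        model = self.model_a if branch == "a" else self.model_b
        return branch, model(features)


# Send roughly 80% of requests to model A and the rest to model B.
router = ABRouter(model_a=sum, model_b=max, share_a=0.8, seed=42)
branches = Counter(router.route([1, 2, 3])[0] for _ in range(1000))
```

Returning the chosen branch alongside the prediction makes it possible to attribute downstream metrics to each variant.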
Clipper
▶ Clipper (clipper.ai) is a system developed by the UC Berkeley RISE Lab.
▶ Clipper is a prediction serving system that sits between user-facing applications and a wide range of commonly used machine learning models and frameworks.
Summary
Pros
▶ Easy-to-use interactive model deployment
▶ Supports Docker and K8s
▶ Query latency objective support
▶ Model version management
• Update and rollback
Cons
▶ Cloudpickle version issues
▶ Python only
▶ Few examples and documents
▶ Not friendly to AWS
• use_internal_ip does not work well
• Need to manually create a repo for the model
• Failed to pull image from ECR
▶ Cluster creation is not stable
▶ Tensorflow models failed to pickle
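Model version management with update and rollback can be pictured as a registry that tracks which version is active. The following is a minimal in-memory sketch under that assumption; ModelRegistry and its methods are hypothetical and are not Clipper's client API.

```python
class ModelRegistry:
    """Toy in-memory registry; names and behaviour are illustrative only."""

    def __init__(self):
        self._versions = {}   # version label -> callable model
        self._history = []    # activation order, newest last

    def deploy(self, version, model):
        # Register a new version and make it the active one.
        self._versions[version] = model
        self._history.append(version)

    def rollback(self):
        # Drop the active version and reactivate the previous one.
        if len(self._history) < 2:
            raise RuntimeError("no earlier version to roll back to")
        self._history.pop()
        return self._history[-1]

    @property
    def active_version(self):
        return self._history[-1]

    def predict(self, features):
        return self._versions[self.active_version](features)


registry = ModelRegistry()
registry.deploy("v1", lambda xs: sum(xs))
registry.deploy("v2", lambda xs: sum(xs) * 2)
after_update = registry.predict([1, 2, 3])
registry.rollback()
after_rollback = registry.predict([1, 2, 3])
```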
MLFlow
▶ MLflow is an open source platform for managing the end-to-end machine learning lifecycle.
▶ MLflow is developed by Databricks
Summary
Pros
▶ Flexible
▶ Easy to use with Sklearn
▶ Cloud integration supporting SageMaker and Azure
Cons
▶ No K8s integration
▶ Spark/Tensorflow support is based on Python
▶ Projects would be better managed by containers
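The comparison table in this deck lists MLflow's model persistence as "Directory + Metadata". The sketch below shows that general idea with the standard library only. The file names (model.pkl, metadata.json) and the "flavor" field are assumptions chosen for illustration and are not MLflow's actual on-disk layout.

```python
import json
import pickle
import tempfile
from pathlib import Path


def save_model_dir(model_params, directory, flavor):
    # Write the payload and a small metadata file side by side.
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    (directory / "model.pkl").write_bytes(pickle.dumps(model_params))
    metadata = {"flavor": flavor, "payload": "model.pkl"}
    (directory / "metadata.json").write_text(json.dumps(metadata, indent=2))


def load_model_dir(directory):
    # Read the metadata first so the loader knows which payload to open.
    directory = Path(directory)
    metadata = json.loads((directory / "metadata.json").read_text())
    params = pickle.loads((directory / metadata["payload"]).read_bytes())
    return metadata, params


with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "my_model"
    save_model_dir({"weights": [0.5, -1.0], "bias": 0.1}, model_dir, "toy-linear")
    metadata, params = load_model_dir(model_dir)
```

Keeping the metadata separate from the payload lets a serving tool decide how to load a model without unpickling anything first.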
MLeap
▶ MLeap allows data scientists and engineers to deploy machine learning pipelines from Spark and Scikit-learn to a portable format and execution engine.
• A JSON-based serialization
• A runtime execution engine
• Benchmarks
▶ http://mleap-docs.combust.ml/core-concepts/transformers/support.html
MLeap Serialization
Summary
Pros
▶ Portable model between Spark and Sklearn
▶ Human-readable model
▶ Easy model serving
Cons
▶ Support matrix is incomplete
▶ Extensibility
• Need to write code for each estimator/transformer
▶ Tensorflow support needs a custom-built tf-java binding and is still experimental
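The trade-off between a human-readable format and per-transformer runtime code can be seen in a small sketch. It assumes a made-up JSON schema with an "op" field per step; this is not MLeap's bundle format, and the executor names are hypothetical.

```python
import json

# Hypothetical JSON description of a two-step pipeline.
PIPELINE_JSON = json.dumps({
    "steps": [
        {"op": "standard_scaler", "mean": [1.0, 2.0], "std": [2.0, 4.0]},
        {"op": "linear_regression", "coefficients": [3.0, 5.0], "intercept": 1.0},
    ]
})


def run_standard_scaler(step, row):
    # Subtract the stored mean and divide by the stored standard deviation.
    return [(x - m) / s for x, m, s in zip(row, step["mean"], step["std"])]


def run_linear_regression(step, row):
    # Dot product of coefficients and features, plus the intercept.
    return sum(c * x for c, x in zip(step["coefficients"], row)) + step["intercept"]


# One executor per supported op: the runtime needs code for every transformer.
RUNTIME = {
    "standard_scaler": run_standard_scaler,
    "linear_regression": run_linear_regression,
}


def execute(pipeline_json, row):
    value = row
    for step in json.loads(pipeline_json)["steps"]:
        if step["op"] not in RUNTIME:
            raise NotImplementedError(f"unsupported op: {step['op']}")
        value = RUNTIME[step["op"]](step, value)
    return value


prediction = execute(PIPELINE_JSON, [3.0, 10.0])
```

No pickle is involved, so the model file carries only data, but any op missing from the runtime table cannot be served, which mirrors the incomplete support matrix noted above.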
Wrap up
▶ Seldon integrates tightly with K8s to support scalable model serving, and its graph function is powerful.
▶ Clipper provides good interaction, but the code is not stable enough.
▶ MLflow's model serving is simple, with fewer functions.
▶ MLeap aims to provide interoperability between different tools, which is very nice, but there is still a long way to go to support all the features.
• PMML is not covered
▶ Some other tools are not covered
• MXNet Model Server
• Oracle GraphPipe
Tool | Model Persistence | ML Tools | K8s Integration | Version | License | Implementation
Seldon Core | S2i + Pickle | Tensorflow, Sklearn, Keras, R, H2O, Nodejs, PMML | Yes | 0.3.2 | Apache | Docker + K8s CRD
Clipper | Pickle | Python, PySpark, PyTorch, Tensorflow, MXNet, Custom Container | Yes | 0.3.0 | Apache | CPP / Python
MLFlow | Directory + Metadata | Python, H2O, Keras, MLeap, PyTorch, Sklearn, Spark, Tensorflow, R | No | Alpha | Apache | Python
MLeap | | Spark, Sklearn, Tensorflow | No | 0.12.0 | Apache | Scala/Java
Other findings
▶ Enabling Spark is not easy
• Version mismatches (pyspark version, Java version)
• Need to build the Spark image with glibc support
• Error: "Java gateway process exited before sending its port number"
• Accessing Spark from K8s is not easy
▶ Some K8s pods stay pending with Unknown status
• kubectl delete pod {} --grace-period=0 --force
▶ Building your own ML image from a Python base image is not easy; using continuumio/miniconda may save you some time
▶ Use batch commands to clean up Docker images
• docker images | grep "something_to_search" | awk '{print $1 ":" $2}' |xargs docker rmi -f
• docker system prune
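For readers who prefer to preview what the grep/awk pipeline above would select before deleting anything, the same filtering can be sketched in Python. The function below only parses text and never calls Docker; select_image_refs and the sample listing are made up for illustration.

```python
def select_image_refs(docker_images_output, needle):
    """Return repo:tag for every listed image whose line contains `needle`."""
    refs = []
    for line in docker_images_output.splitlines()[1:]:  # skip the header row
        if needle in line:
            repository, tag = line.split()[:2]
            refs.append(f"{repository}:{tag}")
    return refs


# Made-up listing in the column layout printed by `docker images`.
SAMPLE = """REPOSITORY          TAG       IMAGE ID       CREATED        SIZE
demo/model-a        0.1       aaaaaaaaaaaa   2 days ago     1.2GB
demo/model-b        latest    bbbbbbbbbbbb   3 days ago     900MB
other/base          3.6       cccccccccccc   5 weeks ago    450MB
"""

refs = select_image_refs(SAMPLE, "demo/")
```

Like grep, the match is a plain substring test on the whole line, so a needle that also appears in an image ID or tag would select that image too.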

Scale machine learning deployment

  • 2. Data Science Project Life Cycle
  • 4. Model persistence: Python sklearn / Spark / TensorFlow
      ▶ Python: pickle-based code serialization
      ▶ sklearn.externals.joblib
      ▶ Spark provides an API to save a model/pipeline as a file
      ▶ TensorFlow provides tf.train.Saver, which persists the tensor graph
      ▶ It is pickle + metadata + checkpoint
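
A minimal sketch of what pickle-based code serialization looks like in practice, using only the standard library. `ThresholdModel` is a hypothetical stand-in for a trained estimator, not a class from sklearn, Spark, or TensorFlow.

```python
import pickle


class ThresholdModel:
    """Toy stand-in for a trained estimator (hypothetical, for illustration)."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, values):
        # Label each value 1 if it exceeds the learned threshold, else 0.
        return [1 if v > self.threshold else 0 for v in values]


model = ThresholdModel(threshold=0.5)

# Serialize the whole Python object to bytes, as pickle-based persistence does.
payload = pickle.dumps(model)

# Loading only works if the ThresholdModel class is importable in the loader.
restored = pickle.loads(payload)
predictions = restored.predict([0.2, 0.7, 0.5])
```

The payload stores a reference to the class, not its code, so the serving process must ship the same class definition as the training process.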
  • 6. Issues and Limitations
      ▶ Models from different tools are not compatible
      ▶ Code serialization depends on the Python version
      ▶ Code serialization raises potential security concerns
      ▶ For a tf model, the tensor names are required (need to check whether they are in the metadata)
      ▶ A tf model depends on the custom code that defines its custom operations
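
The security concern with code serialization can be illustrated with a short sketch. A pickle payload can instruct the loader to call a function, and a restricted `Unpickler` (the pattern described in the Python `pickle` documentation) can refuse to do so. `Payload`, `NoGlobalsUnpickler`, and `safe_loads` are hypothetical names, and the embedded call is deliberately harmless.

```python
import io
import pickle


class Payload:
    """Hypothetical object whose pickle asks the loader to call a function."""

    def __reduce__(self):
        # pickle stores "call sorted([3, 1, 2])"; a hostile payload could name
        # any importable callable here instead of a harmless one.
        return (sorted, ([3, 1, 2],))


data = pickle.dumps(Payload())

# Loading runs the embedded call: the result is a list, not a Payload object.
loaded = pickle.loads(data)


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global (class or function)."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")


def safe_loads(raw):
    # Only plain containers and primitives survive this loader.
    return NoGlobalsUnpickler(io.BytesIO(raw)).load()


plain = safe_loads(pickle.dumps({"weights": [0.1, 0.2]}))

try:
    safe_loads(data)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

A loader this strict also rejects ordinary pickled model objects, which is part of why pickle files from untrusted sources are usually not loaded at all.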
  • 7. A simple view of model deployment
  • 8. ML Deployment Challenges
      ▶ Enable a wide range of ML modeling tools: Python, R, TensorFlow, Spark
      ▶ Scale up and down
      ▶ Performance and latency optimization
      ▶ Model access, API
      ▶ Audit and versioning
      ▶ CI/CD
      ▶ Metrics and monitoring
      ▶ Optimization, A/B tests
  • 10. Seldon
      ▶ Seldon, a London company, focuses on providing control over machine learning based on open source software
      ▶ Seldon Core is an open source platform for deploying machine learning models on Kubernetes
          • Python/Spark/H2O/R model support
          • REST and gRPC API
          • Deploys an inference graph of Models/Routers/Combiners/Transformers as microservices
          • Leverages K8s to provide scale, security, monitoring, etc.
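
A minimal sketch of the kind of Python class that Seldon Core's Python wrapper turns into a model microservice. The `predict(X, features_names)` method shape is assumed from Seldon's Python wrapper convention for this release line; the class name, the threshold, and the scoring rule are hypothetical, and plain lists stand in for the arrays a real deployment would receive.

```python
class MeanClassifier:
    """Hypothetical model class in the shape Seldon's Python wrapper expects."""

    def __init__(self):
        # A real class would load a persisted model here; this value is made up.
        self.threshold = 1.0

    def predict(self, X, features_names):
        # X is a batch of rows; return one [p_negative, p_positive] row each.
        out = []
        for row in X:
            positive = 1.0 if sum(row) / len(row) > self.threshold else 0.0
            out.append([1.0 - positive, positive])
        return out


scores = MeanClassifier().predict([[0.5, 0.5], [2.0, 3.0]], ["f0", "f1"])
```

The REST and gRPC endpoints are generated around such a class, so the class itself carries no serving code.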
  • 15. Summary (Seldon)
      Pros
      ▶ Seamless K8s integration
      ▶ Graph definition supports A/B tests and ensembling
      Cons
      ▶ No Scala support for Spark
      ▶ Needs a custom image for PySpark
      ▶ No customization support for liveness/readiness checks, due to the CRD
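
The A/B test support in the inference graph can be sketched as a router node that picks which child model serves each request. The `route(features, feature_names)` method returning a child index is assumed from Seldon's router convention; `ABRouter`, `share_a`, and `seed` are hypothetical names introduced for illustration.

```python
import random


class ABRouter:
    """Hypothetical A/B router: sends a fixed share of traffic to child 0."""

    def __init__(self, share_a=0.5, seed=None):
        if not 0.0 <= share_a <= 1.0:
            raise ValueError("share_a must be between 0 and 1")
        self.share_a = share_a
        self._rng = random.Random(seed)

    def route(self, features, feature_names):
        # Return the index of the child graph node that should serve the request.
        return 0 if self._rng.random() < self.share_a else 1


router = ABRouter(share_a=0.3, seed=7)
choices = [router.route([[1.0, 2.0]], ["f0", "f1"]) for _ in range(10000)]
share_to_a = choices.count(0) / len(choices)
```

Routing randomly per request keeps the router stateless; a sticky assignment per user would need a key taken from the request instead.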
  • 17. Clipper
      ▶ Clipper (clipper.ai) is a system developed by the UC Berkeley RISE Lab
      ▶ Clipper is a prediction serving system that sits between user-facing applications and a wide range of commonly used machine learning models and frameworks
  • 20. Summary (Clipper)
      Pros
      ▶ Easy-to-use interactive model deployment
      ▶ Supports Docker and K8s
      ▶ Query latency objective support
      ▶ Model version management
          • Update and rollback
      Cons
      ▶ Cloudpickle version issue
      ▶ Python only
      ▶ Few examples/documents
      ▶ Not friendly to AWS
          • use_internal_ip does not work well
          • Need to manually create a repo for the model
          • Failed to pull image from ECR
      ▶ Cluster creation is not stable
      ▶ TensorFlow failed to pickle
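
The update-and-rollback idea behind model version management can be sketched with a small in-memory registry. This is not Clipper's API; `ModelRegistry` and its methods are hypothetical and only show the bookkeeping a serving layer needs.

```python
class ModelRegistry:
    """Hypothetical in-memory registry tracking which model version is live."""

    def __init__(self):
        self._versions = {}  # version label -> predict callable
        self._history = []   # activation order, most recent last

    def deploy(self, version, predict_fn):
        # Register a new version and make it the active one (an "update").
        self._versions[version] = predict_fn
        self._history.append(version)

    def rollback(self):
        # Drop the active version and reactivate the one deployed before it.
        if len(self._history) < 2:
            raise RuntimeError("no earlier version to roll back to")
        self._history.pop()
        return self._history[-1]

    @property
    def active(self):
        return self._history[-1] if self._history else None

    def predict(self, inputs):
        if self.active is None:
            raise RuntimeError("no model deployed")
        return self._versions[self.active](inputs)


registry = ModelRegistry()
registry.deploy("v1", lambda xs: [x * 2 for x in xs])
registry.deploy("v2", lambda xs: [x * 3 for x in xs])
after_update = registry.predict([1, 2])
rolled_back_to = registry.rollback()
after_rollback = registry.predict([1, 2])
```

Keeping old versions registered after an update is what makes the rollback a pointer change rather than a redeploy.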
  • 22. MLflow
      ▶ MLflow is an open source platform for managing the end-to-end machine learning lifecycle
      ▶ MLflow is developed by Databricks
  • 25. Summary (MLflow)
      Pros
      ▶ Flexible
      ▶ Easy to use with sklearn
      ▶ Cloud integration supports SageMaker and Azure
      Cons
      ▶ No K8s integration
      ▶ Spark/TensorFlow support is based on Python
      ▶ Projects are better managed by containers
  • 26. MLeap
  • 27. MLeap
      ▶ MLeap allows data scientists and engineers to deploy machine learning pipelines from Spark and Scikit-learn to a portable format and execution engine
          • JSON-based serialization
          • A runtime execution engine
          • Benchmarks
      ▶ http://mleap-docs.combust.ml/core-concepts/transformers/support.html
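
The combination of a portable JSON format and a separate runtime engine can be sketched in a few lines. This is a toy format invented for illustration, not MLeap's bundle layout; `export_pipeline`, `run_pipeline`, and the `OPS` table are hypothetical names.

```python
import json


def export_pipeline(mean, scale, weights, bias):
    """Describe a scaler + linear model pipeline as human-readable JSON."""
    return json.dumps({
        "format": "toy-bundle",
        "stages": [
            {"op": "standard_scaler", "mean": mean, "scale": scale},
            {"op": "linear_regression", "weights": weights, "bias": bias},
        ],
    })


def _scale(stage, row):
    return [(x - m) / s for x, m, s in zip(row, stage["mean"], stage["scale"])]


def _linear(stage, row):
    return [sum(w * x for w, x in zip(stage["weights"], row)) + stage["bias"]]


# The runtime needs one handler per supported transformer or estimator.
OPS = {"standard_scaler": _scale, "linear_regression": _linear}


def run_pipeline(bundle_json, row):
    """Execute the stages in order without importing the training library."""
    bundle = json.loads(bundle_json)
    for stage in bundle["stages"]:
        row = OPS[stage["op"]](stage, row)
    return row


bundle = export_pipeline(mean=[1.0, 2.0], scale=[2.0, 4.0],
                         weights=[3.0, -1.0], bias=0.5)
prediction = run_pipeline(bundle, [3.0, 6.0])
```

Because the bundle holds parameters rather than pickled code, any runtime that implements the listed ops can serve it, and any op missing from the table cannot be served.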
  • 31. Summary (MLeap)
      Pros
      ▶ Portable model between Spark and sklearn
      ▶ Human-readable model
      ▶ Easy model serving
      Cons
      ▶ Support matrix is incomplete
      ▶ Extensibility
          • Need to write code for each estimator/transformer
      ▶ Supporting TensorFlow requires a custom build of the tf-java binding and is still experimental
  • 33. Wrap up
      ▶ Seldon integrates tightly with K8s to support the scalability of model serving, and its graph function is powerful
      ▶ Clipper provides good interaction, but the code is not stable enough
      ▶ MLflow's model serving is simple, with fewer functions
      ▶ MLeap aims to provide interoperation between different tools, which is very nice, but there is still a long way to go to support all the features
          • PMML is not covered
      ▶ Some other tools are not touched
          • MXNet Model Server
          • Oracle GraphPipe
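  • 34.
      ▶ Seldon Core
          • Model persistence: S2i + Pickle
          • ML tools: TensorFlow, sklearn, Keras, R, H2O, Node.js, PMML
          • K8s integration: Yes
          • Version: 0.3.2
          • License: Apache
          • Implementation: Docker + K8s CRD
      ▶ Clipper
          • Model persistence: Pickle
          • ML tools: Python, PySpark, PyTorch, TensorFlow, MXNet, custom container
          • K8s integration: Yes
          • Version: 0.3.0
          • License: Apache
          • Implementation: CPP / Python
      ▶ MLflow
          • Model persistence: Directory + Metadata
          • ML tools: Python, H2O, Keras, MLeap, PyTorch, sklearn, Spark, TensorFlow, R
          • K8s integration: No
          • Version: Alpha
          • License: Apache
          • Implementation: Python
      ▶ MLeap
          • Model persistence: not stated
          • ML tools: Spark, sklearn, TensorFlow
          • K8s integration: No
          • Version: 0.12.0
          • License: Apache
          • Implementation: Scala/Java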
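
The "Directory + Metadata" style of persistence listed for MLflow can be sketched as a folder holding the serialized artifact plus a metadata file that records what is needed to load it. The file names and fields below are invented for illustration and differ from MLflow's own layout. Recording the Python version is one way to surface the version dependency of pickle at load time.

```python
import json
import pickle
import platform
import tempfile
from pathlib import Path


def save_model(model, directory, flavor="python_pickle"):
    """Write model.pkl plus metadata.json (hypothetical layout) to a directory."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    (directory / "model.pkl").write_bytes(pickle.dumps(model))
    meta = {
        "flavor": flavor,
        "artifact": "model.pkl",
        "python_version": platform.python_version(),
    }
    (directory / "metadata.json").write_text(json.dumps(meta, indent=2))
    return meta


def load_model(directory):
    """Check the recorded major.minor Python version, then unpickle."""
    directory = Path(directory)
    meta = json.loads((directory / "metadata.json").read_text())
    saved = meta["python_version"].split(".")[:2]
    current = platform.python_version().split(".")[:2]
    if saved != current:
        raise RuntimeError(f"model saved with Python {meta['python_version']}")
    return pickle.loads((directory / meta["artifact"]).read_bytes())


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "demo"
    save_model({"weights": [0.1, 0.2], "bias": 0.3}, target)
    restored = load_model(target)
    files = sorted(p.name for p in target.iterdir())
```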
  • 36. Some other findings
      ▶ Enabling Spark is not easy
          • Versions: pyspark version, Java version
          • Build the Spark image with glibc support
          • Error: "Java gateway process exited before sending its port number"
          • Accessing Spark from K8s is not easy
      ▶ Some K8s pods are stuck pending with Unknown status
          • kubectl delete pod {} --grace-period=0 --force
      ▶ Building your own ML image from python is not easy; using continuumio/miniconda may save you some time
      ▶ Use batch commands to clean up Docker images
          • docker images | grep "something_to_search" | awk '{print $1 ":" $2}' | xargs docker rmi -f
          • docker system prune
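
The image cleanup pipeline above can also be sketched in Python, which makes the grep and awk steps explicit and lets the selection be checked before anything is deleted. `image_refs` and `rmi_command` are hypothetical helpers, the listing is made-up sample text in the column layout of `docker images`, and the sketch only builds the command without running it.

```python
def image_refs(listing, needle):
    """Mimic `grep needle | awk '{print $1 ":" $2}'` on `docker images` text."""
    refs = []
    for line in listing.splitlines():
        fields = line.split()
        # Like grep, match the needle anywhere in the line.
        if needle in line and len(fields) >= 2:
            refs.append(f"{fields[0]}:{fields[1]}")
    return refs


def rmi_command(refs):
    """Build the argument list for `docker rmi -f`, or None if nothing matched."""
    return ["docker", "rmi", "-f", *refs] if refs else None


# Made-up sample output, for illustration only.
SAMPLE = """\
REPOSITORY          TAG       IMAGE ID       CREATED        SIZE
demo/model-a        0.1       aaaaaaaaaaaa   2 days ago     1.2GB
demo/model-b        latest    bbbbbbbbbbbb   3 days ago     900MB
python              3.6       cccccccccccc   4 weeks ago    920MB
"""

refs = image_refs(SAMPLE, "demo/")
command = rmi_command(refs)
```

Passing `command` to `subprocess.run` would perform the deletion; printing it first is a cheap dry run.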