Py data scikit-production

The document discusses deploying machine learning models in production environments. It outlines several challenges with current approaches such as models being opaque objects and a focus on training rather than prediction. It then proposes six requirements for an architecture to handle live traffic directly from trained models: 1) easy integration, 2) high performance, 3) fault tolerance, 4) scalability, 5) maintainability, and 6) extensibility. Finally, it introduces Dato Predictive Services as a platform that meets these requirements by deploying models as low-latency REST services that can elastically scale and includes monitoring and model management capabilities.

Deploying scikit-learn Models in Production
Rajat Arya (@rajatarya)
Product Manager, Dato Inc.

Dato provides a platform for building intelligent apps
Data Engineering
• Fast & scalable
• Rich data type support
• Visualization
Data Intelligence
• App-oriented ML
• Supporting utils
• Extensibility
Deployment
• Batch & always-on
• RESTful interface
• Elastic & robust
Build, deploy, & manage your intelligent apps with Dato.

How Everyone Starts with ML
[Diagram: DATA feeding an ML Algorithm]
• Running experiments
• Plots are the results
• Not clear how to get this deployed

Deployment?
[Diagram labels: DATA, ML Algorithm, Data Scientist; Custom Model, App, API; Data Engineers, Data Architects, DevOps, App Developers]
• Write a spec for another team to implement in a ‘production’ language
• Translating the code takes 6-12 months
• A stale or irrelevant model gets implemented
• Two teams maintain two systems

Current Challenges
• Machine learning models are opaque objects
• Export formats like PMML don’t support many models
• Focus on training, not prediction

Starting from the Beginning
GOAL: Handle live production traffic directly served from the trained machine learning model
What are the requirements if we wanted to build a similar architecture for ML Models?

One: Easy to Integrate
• REST APIs for both querying and management
• Client libraries in other languages (no Python lock-in)
[Diagram: App, API]
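
As a rough illustration of the querying half of this requirement, the sketch below exposes a model behind a small JSON-over-HTTP endpoint using only the Python standard library. It is not the Dato API: the `/predict` route, the `features` field, and the `predict` stand-in are all hypothetical names chosen for the example, and a management API is not shown.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a trained model; returns the mean of the inputs."""
    return sum(features) / len(features)


def handle_predict(raw_body):
    """Turn a JSON request body into a JSON response body (both as bytes)."""
    payload = json.loads(raw_body)
    return json.dumps({"prediction": predict(payload["features"])}).encode()


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":  # hypothetical route name
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = handle_predict(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=8000):
    """Build (but do not start) a server bound to localhost."""
    return HTTPServer(("127.0.0.1", port), PredictHandler)
```

Calling `make_server().serve_forever()` would start the service. Because the contract is plain JSON over HTTP, a client in any language can POST `{"features": [1, 2, 3]}` to it, which is the point of avoiding Python lock-in.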

Two: High Performance
• Use a load balancer to distribute the request load
• Integrated distributed cache, so a repeated query is computed only once
[Diagram: App, load balancer (LB), three API nodes each with a cache, Engine]
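
The caching idea can be sketched in a few lines. This is a simplified, single-process assumption: the slide describes a distributed cache, whereas the hypothetical `CachedPredictor` below keeps results in a local dictionary. It only shows the principle that an identical query reaches the model once.

```python
import json


class CachedPredictor:
    """Wrap a predict function so identical queries hit the model only once."""

    def __init__(self, predict_fn):
        self._predict = predict_fn
        self._cache = {}
        self.model_calls = 0  # how many times the underlying model ran

    def query(self, features):
        # Canonical JSON makes the key independent of dict ordering.
        key = json.dumps(features, sort_keys=True)
        if key not in self._cache:
            self.model_calls += 1
            self._cache[key] = self._predict(features)
        return self._cache[key]
```

In a real deployment the dictionary would typically be replaced by a shared store, so that every API node behind the load balancer benefits from results computed by the others.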

Three: Fault Tolerant
• Model runs on many machines
• System stays operational during a node failure
[Diagram: App, LB, three nodes each with API, cache, GLC model, and Engine]
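
One way to picture staying operational during a node failure is client-side or load-balancer-side failover: try each replica in turn and return the first answer. The sketch below assumes replicas are plain callables and uses a hypothetical `NodeDown` exception to signal an unreachable node; it is not how any particular product implements this.

```python
class NodeDown(Exception):
    """Hypothetical error raised when a replica cannot be reached."""


def query_with_failover(replicas, features):
    """Return the first successful prediction, skipping failed replicas."""
    last_error = None
    for replica in replicas:
        try:
            return replica(features)
        except NodeDown as err:
            last_error = err  # remember the failure and try the next node
    raise RuntimeError("all replicas failed") from last_error
```

Because every node hosts the same model, any healthy replica can serve the request, so a single failure costs one retry rather than an outage.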

Four: Scalable
• Elastically scale the nodes in the cluster up and down
• Easy to configure; the cache automatically updates with cluster changes
[Diagram: App, LB, four nodes each with API, cache, and Engine; two of them show a GLC model]
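
The slides do not say how the cache adapts when nodes join or leave. One common technique for this, shown here purely as an assumption, is consistent hashing: cache keys are mapped onto a ring of nodes so that adding a node moves only the keys that the new node takes over. The `HashRing` class and the node names are hypothetical.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring mapping cache keys to cluster nodes."""

    def __init__(self, nodes=(), vnodes=100):
        self._vnodes = vnodes  # virtual points per node, for smoother balance
        self._ring = []        # sorted list of (position, node)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def node_for(self, key):
        if not self._ring:
            raise LookupError("no nodes in the ring")
        # First ring position at or after the key's hash, wrapping around.
        idx = bisect.bisect(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]
```

With this layout, scaling from two nodes to three leaves every key either where it was or on the new node, so most cached results stay valid through a cluster change.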

Five: Maintainable
• Zero downtime during model deployment
• Metrics & logs
• Model management
[Diagram: App, LB, three nodes each with API, cache, GLC model, and Engine]
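
Zero-downtime deployment can be approximated by swapping the active model atomically while requests keep flowing. The sketch below is a minimal in-process version under that assumption; `ModelRegistry` and the version strings are hypothetical, and models are plain callables.

```python
import threading


class ModelRegistry:
    """Hold the live model and allow it to be replaced without stopping."""

    def __init__(self, model, version):
        self._lock = threading.Lock()
        self._current = (version, model)

    def deploy(self, model, version):
        # Replace version and model together so readers never see a mix.
        with self._lock:
            self._current = (version, model)

    def predict(self, features):
        with self._lock:
            version, model = self._current
        # Run the model outside the lock so a deploy never waits on inference.
        return {"version": version, "prediction": model(features)}
```

Returning the version with each prediction is a small nod to model management: it lets logs and clients tell which model produced a given answer.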

Six: Extensible
• Arbitrary Python
• Use any set of Python packages
• Model ensembling
[Diagram: App, LB, three nodes each with API, cache, GLC model, and Python]
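
Model ensembling is a natural fit for "arbitrary Python": the deployed unit is just a function that calls several models and combines their outputs. The sketch below assumes each model is a callable returning a number and averages the results; `make_ensemble` is a hypothetical helper, not part of any library.

```python
from statistics import mean


def make_ensemble(models, combine=mean):
    """Build one predict function out of several model callables."""

    def ensemble(features):
        # Query every model, then reduce the predictions to a single value.
        return combine([model(features) for model in models])

    return ensemble
```

Passing a different `combine` function (for example `max`, or a weighted vote) changes the ensembling strategy without touching the serving code.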

Requirements Recap
1. Easy to Integrate
2. High Performance
3. Fault Tolerant
4. Scalable
5. Maintainable
6. Extensible
[Diagram: App, LB, three nodes each with API, cache, GLC model, and Python]

Do-It-Yourself
• Web Service layer:
- Tornado, Flask, Keen, Django, etc.
• Caching layer:
- Redis, Cassandra, Memcached, DynamoDB, Berkeley DB, MySQL, etc.
• Logs:
- Logback, Logstash, Splunk, Loggly
• Metrics:
- AWS CloudWatch, Mixpanel, Librato, etc.
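
To give a feel for the glue code a do-it-yourself stack involves, the sketch below wraps a predict function with logging and a simple latency metric using only the standard library. It stands in for the dedicated log and metrics services listed above; the `instrumented` helper and the `predict_service` logger name are hypothetical.

```python
import logging
import time

logger = logging.getLogger("predict_service")  # hypothetical logger name


def instrumented(predict_fn, latencies):
    """Wrap predict_fn so each call is logged and its latency recorded."""

    def wrapper(features):
        start = time.perf_counter()
        result = predict_fn(features)
        elapsed = time.perf_counter() - start
        latencies.append(elapsed)  # a metrics backend would receive this value
        logger.info("prediction served in %.6f s", elapsed)
        return result

    return wrapper
```

Each layer in the list needs similar wiring, plus deployment and monitoring of the component itself, which is the maintenance cost the slide is pointing at.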

… or use Dato Predictive Services
We set out with this goal, and used these requirements
… and now I'd like to show it to you.

DEMO: Deploying a scikit-learn model using Dato Predictive Services

Models as Services
• Deploy models as low-latency REST services
• Elastically scale up or out with one command
• Monitoring & Model Management
• Deploy existing Python models
• Run on AWS EC2 or Hadoop YARN
[Diagram labels: Dato Predictive Services; Predictive Engine; REST Client Direct; Model Mgmt]

Editor's Notes

I got started with ML by taking a class: feed data to an ML algorithm, then generate a plot. Of course this isn’t how actual applications are written, but it is often where customers start when they approach taking ML to production.
