5/15/2017
Continuous Delivery Principles
for Machine Learning
Rajesh Muppalla
rajesh@indix.com
@codingnirvana
About Me
• Co-Founder, Indix
• From Chennai
◊ 200 miles to the east (and north) of Bangalore
◊ S Ramanujan (19th Century Mathematician), Sundar Pichai
◊ Three Seasons - Hot, Hotter, Hottest
• Previously
◊ ScaleByTheBay 2016 - Data Pipelines Panelist
◊ Microservices, Lambda Architecture
• Ex-Thoughtworks
◊ Tech Lead - Go-CD - an open source CI/CD Tool
About Indix
Six Business Critical Indexes
People
Documents Businesses
Places Products
Connected
Devices
Enabling businesses to build
location-aware software.
~3.6 million websites use Google maps
Enabling businesses to build
product-aware software.
Indix catalogs over 2.1 billion product offers
Indix – the “Google Maps” of Products
Data Pipeline @ Indix
[Diagram: pipeline overview; stages listed per pipeline]
• Crawling Pipeline: Brand & Retailer Websites, Seed, Crawl, Crawl Data, Parse
• Feeds Pipeline: Brand & Retailer Feeds, Feed Data, Transform, Clean, Connect
• Data Pipeline (ML): Dedupe, Classify, Extract Attributes, Standardize, Match, Aggregate
• Indexing Pipeline (Real Time): Index, Analyze, Derive, Join
• Other labels in the diagram: Indix Product Catalog, Customizable Feeds, Search & Analytics Index, API (Bulk & Synchronous)
Product Data
Transformation
Service
E-Tailers & Marketplaces
Original Catalog
            Title          Brand   Color   Size
Product 1   Running Shoes  Adidas  Blk     9
Product 2   Yoga Pants     -       Black   32
Product 3   Jacket         TNF     White   -

Enriched Catalog
            Title          Brand           Color   Size   Material
Product 1   Running Shoes  Adidas          Black   9 US   Leather
Product 2   Yoga Pants     Lululemon       Black   32"    Polyester
Product 3   Jacket         The North Face  White   -      Leather
Ad Display & Exchange Platforms
• Advertisers - Standardize, Enrich and Augment Product Information for
better relevance
• Retailers - Enrich, Match and Normalize their catalog for better targeting of
native Ads
• Publishers - Classify and tag publisher site content
Data Scale @ Indix
• 2.1 Billion Product URLs
• 8 TB HTML Data Crawled Daily
• 1B Unique Products
• 7000 Categories
• 120 B Price Points
• 3000 Sites
3/31/16
• Auto Parsers to detect and extract Product content from Web pages, using Machine Vision algorithms
• Predictive Scheduler for deciding re-crawl frequency using various signals like Seasonality, Product Type, Store
• Multi-label classifier - Categorizing products into a hierarchical taxonomy using text information
• Inferring Product vs Listing vs Other Pages using either just URL patterns or Page Content
• Adaptive Crawlers that modify the crawl rate based on dynamic characteristics like Site traffic, Number of products, Robots.txt settings
• Deep learning - Categorizing products using Product images
• Predicting which products are an exact match or similar products
• NER based Attribute extraction algorithm that mines text like Title, Descriptions, Specifications to build structured Key:Value Attributes
• Fusion/Enrichment - An algorithm that uses the data to learn and build a golden product record from disparate sources
• Product Rank - An algorithm that uses multiple signals like product popularity, price, data quality, store popularity, brand popularity to build a dynamic relevance/rank score
• Recommendation Engines that suggest Tags where Product information can be found on a web page
• Deep learning - Extracting visual product attributes using Product images
• NLG algorithms to generate product descriptions
• Product GPS - Universal Product Identifier using machine learning algorithms and allowing Search & Discovery
ML @ Indix
ML @ Indix - Classification
ML @ Indix - Attribute Extraction
Machine Learning Workflow
Define Business
Objective
Explore &
Transform
Pull and Acquire
Data
Develop Model
Evaluate Model
Meets
Business
Needs?
Build Production
System
Deploy
Measure Metrics
Yes!
Not Yet!
Human in
the Loop
Machine Learning Workflow
Machine Learning Sandwich?*
* - https://techcrunch.com/2017/08/08/the-evolution-of-machine-learning/
Explore &
Transform
Pull and Acquire
Data
Deploy
Build Production
System
Develop Model
Model Evaluation &
Validation
The MEAT is not in the middle
Experts agree with us
D. Sculley, et al. Hidden technical debt in machine learning systems. In Neural Information Processing Systems (NIPS). 2015
Only a small fraction of real-world ML systems is composed of the ML code, as shown by the
small black box in the middle. The required surrounding infrastructure is vast and complex.
Different Skillsets
Explore &
Transform
Pull and Acquire
Data
Deploy
Build Production
System
Develop Model
Model Evaluation &
Validation
Data Pipelines
App
Model
Separate Talk
Explore &
Transform
Pull and Acquire
Data
My Talk
Explore &
Transform
Pull and Acquire
Data
Deploy
Build Production
System
Develop Model
Model Evaluation &
Validation
Focus of
this talk
Pain Points
● A key employee in the team had to abruptly go on leave
○ Unable to reproduce the performance of an existing production model
■ Training data missing or not known
■ Pre-processing scripts missing
■ Hyperparameters not known
● It takes 3 months to productionize a model
■ A lot of glue code
■ Custom code developed every time
■ Frequent model updates take a long time
● Heterogeneous Systems
■ E.g., sharing code and models between Python and the JVM
Reality
● Confidence in Test Set != Confidence in Production
■ Confidence in model performance on a sample set is not good enough
Déjà vu?
Continuous Delivery is a software engineering
approach that aims at building, testing and
releasing software faster and more frequently.
A straightforward and repeatable process is
important for continuous delivery.
What is Continuous Delivery?
Principles from CD in ML
Principle #1
Continuous Delivery for Software: Automation via CI + CD pipelines
Continuous Delivery for Machine Learning: Automation of ML Training, Evaluation and Offline Prediction Pipelines
Training Pipelines
● Training pipelines are modelled like a build pipeline
● Customized Go-CD, an open source CI & CD tool
● Created plugins to help us with our ML workflows
Training Pipeline (3 Flavors)
[Diagram: Training Data, Pre-process Data (Spark Job), Build Model, Evaluate Model (Python)]
Build Model flavors: Python Script OR Spark Job OR Zeppelin Notebook
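A minimal sketch, in Python, of what "modelled like a build pipeline" can mean: stages run in order, and a failing stage stops the run. The stage names mirror the slide. The `Stage` and `run_pipeline` helpers, the toy data, and the `min_accuracy` threshold are assumptions made for illustration; they are not Go-CD APIs or the speaker's code.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Stage:
    """One pipeline stage: a name plus a function that transforms shared context."""
    name: str
    run: Callable[[Dict[str, Any]], Dict[str, Any]]


def run_pipeline(stages: List[Stage], context: Dict[str, Any]) -> Dict[str, Any]:
    """Run stages in order; an exception in any stage aborts the pipeline."""
    for stage in stages:
        context = stage.run(context)
        context.setdefault("completed", []).append(stage.name)
    return context


def preprocess(ctx):
    # Stand-in for the Spark pre-processing job: trim and lowercase titles.
    ctx["rows"] = [(title.strip().lower(), label) for title, label in ctx["raw"]]
    return ctx


def build_model(ctx):
    # Stand-in for model building: remember the label seen for each token.
    model = {}
    for title, label in ctx["rows"]:
        for token in title.split():
            model[token] = label
    ctx["model"] = model
    return ctx


def evaluate_model(ctx):
    # Accuracy on the toy rows; fail the stage below a hypothetical threshold.
    def predict(title):
        for token in title.split():
            if token in ctx["model"]:
                return ctx["model"][token]
        return None

    hits = sum(predict(title) == label for title, label in ctx["rows"])
    ctx["accuracy"] = hits / len(ctx["rows"])
    if ctx["accuracy"] < ctx.get("min_accuracy", 0.5):
        raise RuntimeError("evaluation failed; pipeline stops here")
    return ctx


PIPELINE = [
    Stage("pre-process-data", preprocess),
    Stage("build-model", build_model),
    Stage("evaluate-model", evaluate_model),
]

result = run_pipeline(
    PIPELINE,
    {"raw": [(" Running Shoes ", "footwear"), ("Yoga Pants", "apparel")]},
)
```

Swapping the "build-model" stage for another implementation corresponds to the three flavors on the slide; the other stages stay unchanged.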
Go-CD - Demo
Principle #2
Continuous Delivery for Software: Source Code and Artifact Repository for Reproducibility
Continuous Delivery for Machine Learning: Source Code, Data and Model Repository for Reproducibility
Model Repository
● Similar to an artifact repository like Maven, Ivy
○ Directory Structure, Versioning, Publishing of models
● Has clients to publish models for the most commonly used frameworks
○ scikit-learn, Spark MLlib, Keras
● For a model,
○ Data
■ Stored in S3
■ In different formats
● Spark MLlib - Parquet, scikit-learn - Pickle, Keras - HDF5
○ Metadata
■ Training/Validation/Test Datasets
■ Hyper-parameters used
■ Evaluation Metrics
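The repository idea above can be sketched as follows, assuming a local directory stands in for the S3 bucket. The `publish_model` and `load_model` functions, the `<name>/<version>/` layout, and every value in `metadata` are hypothetical placeholders, not the Indix client or results from the talk.

```python
import json
import pickle
import tempfile
from pathlib import Path


def publish_model(repo_root, name, model, metadata):
    """Store a model and its metadata under <repo_root>/<name>/<version>/.

    Versions are consecutive integers; the layout is an assumed one.
    """
    model_dir = Path(repo_root) / name
    model_dir.mkdir(parents=True, exist_ok=True)
    existing = [int(p.name) for p in model_dir.iterdir() if p.name.isdigit()]
    version = max(existing, default=0) + 1
    version_dir = model_dir / str(version)
    version_dir.mkdir()
    # Pickle is used only because the slide lists it for scikit-learn models.
    (version_dir / "model.pkl").write_bytes(pickle.dumps(model))
    (version_dir / "metadata.json").write_text(json.dumps(metadata, indent=2))
    return version


def load_model(repo_root, name, version):
    """Return (model, metadata) for a published version."""
    version_dir = Path(repo_root) / name / str(version)
    model = pickle.loads((version_dir / "model.pkl").read_bytes())
    metadata = json.loads((version_dir / "metadata.json").read_text())
    return model, metadata


repo = tempfile.mkdtemp()  # local directory standing in for an S3 bucket
metadata = {
    # Placeholder values for illustration only.
    "datasets": {"train": "train.parquet", "validation": "val.parquet", "test": "test.parquet"},
    "hyperparameters": {"max_depth": 8},
    "metrics": {"f1": 0.5},
}
v1 = publish_model(repo, "category-classifier", {"weights": [0.1, 0.2]}, metadata)
v2 = publish_model(repo, "category-classifier", {"weights": [0.3, 0.4]}, metadata)
model, meta = load_model(repo, "category-classifier", v1)
```

Keeping datasets, hyper-parameters and metrics next to each versioned model is what makes an older model reproducible when its author is unavailable.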
Model Repository
[Diagram: Training Pipeline with a Publish Model stage]
Training Data, Pre-process Data (Spark Job), Build Model (Python), Evaluate Model (Python), Publish Model
Model Promotion
● Tagging the “latest good” version that needs to be deployed
● Not all models need/can be promoted
○ Experimental models
○ Models that fail the test set or performance/latency metrics
● Easy rollback - tag the “last good” version as the latest
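Promotion and rollback as described above can be sketched with a small tag file. This is an illustration under stated assumptions: the `PromotionTags` class and the JSON history file are hypothetical, and the real system may tag versions differently.

```python
import json
import tempfile
from pathlib import Path


class PromotionTags:
    """Keeps the promotion history for one model in a small JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        if not self.path.exists():
            self.path.write_text(json.dumps({"history": []}))

    def _history(self):
        return json.loads(self.path.read_text())["history"]

    def _save(self, history):
        self.path.write_text(json.dumps({"history": history}))

    def promote(self, version):
        """Tag `version` as the latest good version."""
        history = self._history()
        history.append(version)
        self._save(history)

    def latest_good(self):
        """Return the currently promoted version, or None if there is none."""
        history = self._history()
        return history[-1] if history else None

    def rollback(self):
        """Drop the current tag so the previous good version becomes latest."""
        history = self._history()
        if len(history) < 2:
            raise ValueError("no earlier promoted version to roll back to")
        history.pop()
        self._save(history)
        return history[-1]


tags = PromotionTags(Path(tempfile.mkdtemp()) / "promotions.json")
tags.promote(3)
tags.promote(5)  # version 4 was experimental and never promoted
current = tags.latest_good()
restored = tags.rollback()
```

Because deployment reads only the tag, rolling back is a metadata change and needs no retraining or republishing.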
[Diagram: Training Pipeline with a manual promotion stage]
Pre-process Data (Spark Job), Build Model (Python), Evaluate Model (Python), Publish Model, Promote Model (Manual Stage)
Principle #3
Continuous Delivery for Software: Containers for µServices
Continuous Delivery for Machine Learning: Model Containers for Model Prediction µServices
Model Container
● Hosts a single model to be used for predictions
● Exposes an API for prediction and is “dockerized”
● Containers can be replicated to handle scale
● Two µServices
○ Scala
■ Handles pre-processing
○ Python
■ Loads the model and exposes predict on it
■ Can also predict in batches for better throughput
■ Handles ensembles of models
○ Scala µservice delegates the predict and predict_batch functions to the
Python µservice
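The split between the two services can be sketched in a single Python process. Both classes below are hypothetical stand-ins: `FrontService` plays the role of the Scala µservice and `ModelService` the Python one, and the majority-vote ensemble and toy models are assumptions, since the slides do not say how ensembles are combined.

```python
from collections import Counter
from typing import Callable, List, Sequence


class ModelService:
    """Stand-in for the Python µservice: holds models and exposes predict."""

    def __init__(self, models: Sequence[Callable[[str], str]]):
        self.models = list(models)

    def predict(self, text: str) -> str:
        # Ensemble by majority vote; on a tie the answer seen first wins.
        votes = Counter(model(text) for model in self.models)
        return votes.most_common(1)[0][0]

    def predict_batch(self, texts: List[str]) -> List[str]:
        return [self.predict(text) for text in texts]


class FrontService:
    """Stand-in for the Scala µservice: pre-processes, then delegates."""

    def __init__(self, backend: ModelService):
        self.backend = backend

    def _preprocess(self, text: str) -> str:
        # Lowercase and collapse whitespace before the model sees the input.
        return " ".join(text.lower().split())

    def predict(self, text: str) -> str:
        return self.backend.predict(self._preprocess(text))

    def predict_batch(self, texts: List[str]) -> List[str]:
        return self.backend.predict_batch([self._preprocess(t) for t in texts])


# Three toy "models" keyed on substrings; purely illustrative.
def by_shoes(text):
    return "footwear" if "shoes" in text else "apparel"


def by_running(text):
    return "footwear" if "running" in text else "apparel"


def always_apparel(text):
    return "apparel"


service = FrontService(ModelService([by_shoes, by_running, always_apparel]))
labels = service.predict_batch(["  Running   SHOES", "Yoga Pants"])
```

In the deployed setup the delegation would cross a process boundary between the two containers' services; here it is a plain method call to keep the sketch runnable.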
Model Container
[Diagram: Docker Host running two services]
Scala µService: predict(input), predict_batch(inputs), _preprocess(input)
Python µService: predict(input), predict_batch(inputs); hosts one or more Model objects
Training Pipeline stages: Publish Model, Promote Model, Create Docker Image (Docker), Push to Docker Registry (Docker)
Model Deployment
● Two Modes - Offline (Batch) and/or Online
● Offline Mode
○ Package model containers into an AMI (Amazon Machine Image)
○ Start the container as part of your Spark/Hadoop clusters on the
Executors/Task Trackers
○ Within a job, call the local Scala service to get a prediction for each record
● Online Mode
○ Deploy the model containers into a Mesos + Marathon or a Kubernetes
cluster
○ (Auto) Scaling is managed by the cluster
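For the offline mode, a hedged sketch of how one partition of records might be sent to the local service. It groups records into batches, a variation on the per-record call in the slide that uses the batch endpoint. `predict_partition`, `batched`, and `fake_local_service` are hypothetical names; in a Spark job a generator like this could be handed to `RDD.mapPartitions`.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Tuple


def batched(records: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield lists of at most `size` records."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def predict_partition(
    records: Iterable[str],
    predict_batch: Callable[[List[str]], List[int]],
    batch_size: int = 2,
) -> Iterator[Tuple[str, int]]:
    """Label every record of one partition through a local prediction call."""
    for batch in batched(records, batch_size):
        for record, label in zip(batch, predict_batch(batch)):
            yield record, label


calls = []


def fake_local_service(batch):
    # Hypothetical stand-in for the model container running on the same host.
    calls.append(len(batch))
    return [len(text) for text in batch]


output = list(predict_partition(["a", "bb", "ccc"], fake_local_service))
```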
Principle #4
Continuous Delivery for Software: A/B Testing Using Canary Releases
Continuous Delivery for Machine Learning: A/B Testing Using Request Shadowing
Model A/B Testing
● We don’t use Multi-Armed Bandit (MAB) testing
○ Reason - Payout is not easily measurable, unlike CTR (for example)
● Instead we use the Request Shadowing pattern
○ Send input to both the old and the new model, but serve output only from the old one
○ Find deltas and do spot checking
● For Offline, we only do deltas + spot checking
● We have built an in-house data turking tool for spot checking
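The request shadowing pattern above can be sketched as a thin router. This is a minimal illustration, not the Indix implementation: `ShadowRouter`, the in-memory `deltas` list, and the two toy models are assumptions.

```python
from typing import Any, Callable, Dict, List


class ShadowRouter:
    """Serve the old model's output while recording where the new one differs."""

    def __init__(self, old: Callable[[Any], Any], new: Callable[[Any], Any]):
        self.old = old
        self.new = new
        self.deltas: List[Dict[str, Any]] = []

    def predict(self, request: Any) -> Any:
        served = self.old(request)
        try:
            shadow = self.new(request)
        except Exception as error:  # a failing shadow must never affect serving
            shadow = f"error: {error}"
        if shadow != served:
            # Deltas are the candidates for manual spot checking.
            self.deltas.append({"input": request, "old": served, "new": shadow})
        return served


def old_model(title):
    return "apparel"


def new_model(title):
    return "footwear" if "shoes" in title else "apparel"


router = ShadowRouter(old_model, new_model)
responses = [router.predict(title) for title in ["running shoes", "yoga pants"]]
```

Only the inputs where the two models disagree end up in `deltas`, which keeps the spot-checking workload proportional to the change in behavior rather than to traffic.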
Spot Checking Example 1
Spot Checking Example 2
Future Work
● A lot more to be done
○ Support deep learning based models as a first-class solution
○ Model Repository visualization
○ Add more plugins in Go-CD to better support ML workflows natively
● Open Source
○ Model Serving Repository + Clients (WIP)
Indix & Open Source
● oss.indix.com
Thank You
Questions

More Related Content

What's hot

Modern ETL Pipelines with Change Data Capture
Modern ETL Pipelines with Change Data CaptureModern ETL Pipelines with Change Data Capture
Modern ETL Pipelines with Change Data CaptureDatabricks
 
Extracting Insights from Data at Twitter
Extracting Insights from Data at TwitterExtracting Insights from Data at Twitter
Extracting Insights from Data at TwitterPrasad Wagle
 
Data pipelines from zero to solid
Data pipelines from zero to solidData pipelines from zero to solid
Data pipelines from zero to solidLars Albertsson
 
A primer on building real time data-driven products
A primer on building real time data-driven productsA primer on building real time data-driven products
A primer on building real time data-driven productsLars Albertsson
 
Superworkflow of Graph Neural Networks with K8S and Fugue
Superworkflow of Graph Neural Networks with K8S and FugueSuperworkflow of Graph Neural Networks with K8S and Fugue
Superworkflow of Graph Neural Networks with K8S and FugueDatabricks
 
Introduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas WeiseIntroduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas WeiseBig Data Spain
 
Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...
Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...
Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...Khai Tran
 
Realtime streaming architecture in INFINARIO
Realtime streaming architecture in INFINARIORealtime streaming architecture in INFINARIO
Realtime streaming architecture in INFINARIOJozo Kovac
 
Data Policies for the Kafka-API with WebAssembly | Alexander Gallego, Vectorized
Data Policies for the Kafka-API with WebAssembly | Alexander Gallego, VectorizedData Policies for the Kafka-API with WebAssembly | Alexander Gallego, Vectorized
Data Policies for the Kafka-API with WebAssembly | Alexander Gallego, VectorizedHostedbyConfluent
 
When Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu MaWhen Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu MaDatabricks
 
Data Infrastructure for a World of Music
Data Infrastructure for a World of MusicData Infrastructure for a World of Music
Data Infrastructure for a World of MusicLars Albertsson
 
Challenges in Building a Data Pipeline
Challenges in Building a Data PipelineChallenges in Building a Data Pipeline
Challenges in Building a Data PipelineManish Kumar
 
Traveloka's journey to no ops streaming analytics
Traveloka's journey to no ops streaming analyticsTraveloka's journey to no ops streaming analytics
Traveloka's journey to no ops streaming analyticsRendy Bambang Junior
 
Lambda architecture for real time big data
Lambda architecture for real time big dataLambda architecture for real time big data
Lambda architecture for real time big dataTrieu Nguyen
 
Spline 0.3 User Guide
Spline 0.3 User GuideSpline 0.3 User Guide
Spline 0.3 User GuideVaclav Kosar
 
Cloud Lambda Architecture Patterns
Cloud Lambda Architecture PatternsCloud Lambda Architecture Patterns
Cloud Lambda Architecture PatternsAsis Mohanty
 
Spark summit 2017- Transforming B2B sales with Spark powered sales intelligence
Spark summit 2017- Transforming B2B sales with Spark powered sales intelligenceSpark summit 2017- Transforming B2B sales with Spark powered sales intelligence
Spark summit 2017- Transforming B2B sales with Spark powered sales intelligenceWei Di
 
How to Rebuild an End-to-End ML Pipeline with Databricks and Upwork with Than...
How to Rebuild an End-to-End ML Pipeline with Databricks and Upwork with Than...How to Rebuild an End-to-End ML Pipeline with Databricks and Upwork with Than...
How to Rebuild an End-to-End ML Pipeline with Databricks and Upwork with Than...Databricks
 
It's Time To Stop Using Lambda Architecture | Yaroslav Tkachenko, Shopify
It's Time To Stop Using Lambda Architecture | Yaroslav Tkachenko, ShopifyIt's Time To Stop Using Lambda Architecture | Yaroslav Tkachenko, Shopify
It's Time To Stop Using Lambda Architecture | Yaroslav Tkachenko, ShopifyHostedbyConfluent
 

What's hot (20)

Modern ETL Pipelines with Change Data Capture
Modern ETL Pipelines with Change Data CaptureModern ETL Pipelines with Change Data Capture
Modern ETL Pipelines with Change Data Capture
 
Extracting Insights from Data at Twitter
Extracting Insights from Data at TwitterExtracting Insights from Data at Twitter
Extracting Insights from Data at Twitter
 
Data pipelines from zero to solid
Data pipelines from zero to solidData pipelines from zero to solid
Data pipelines from zero to solid
 
A primer on building real time data-driven products
A primer on building real time data-driven productsA primer on building real time data-driven products
A primer on building real time data-driven products
 
Superworkflow of Graph Neural Networks with K8S and Fugue
Superworkflow of Graph Neural Networks with K8S and FugueSuperworkflow of Graph Neural Networks with K8S and Fugue
Superworkflow of Graph Neural Networks with K8S and Fugue
 
Introduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas WeiseIntroduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas Weise
 
Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...
Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...
Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...
 
Realtime streaming architecture in INFINARIO
Realtime streaming architecture in INFINARIORealtime streaming architecture in INFINARIO
Realtime streaming architecture in INFINARIO
 
Data Policies for the Kafka-API with WebAssembly | Alexander Gallego, Vectorized
Data Policies for the Kafka-API with WebAssembly | Alexander Gallego, VectorizedData Policies for the Kafka-API with WebAssembly | Alexander Gallego, Vectorized
Data Policies for the Kafka-API with WebAssembly | Alexander Gallego, Vectorized
 
When Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu MaWhen Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu Ma
 
Data Infrastructure for a World of Music
Data Infrastructure for a World of MusicData Infrastructure for a World of Music
Data Infrastructure for a World of Music
 
Challenges in Building a Data Pipeline
Challenges in Building a Data PipelineChallenges in Building a Data Pipeline
Challenges in Building a Data Pipeline
 
Traveloka's journey to no ops streaming analytics
Traveloka's journey to no ops streaming analyticsTraveloka's journey to no ops streaming analytics
Traveloka's journey to no ops streaming analytics
 
Lambda architecture for real time big data
Lambda architecture for real time big dataLambda architecture for real time big data
Lambda architecture for real time big data
 
Quark Virtualization Engine for Analytics
Quark Virtualization Engine for Analytics Quark Virtualization Engine for Analytics
Quark Virtualization Engine for Analytics
 
Spline 0.3 User Guide
Spline 0.3 User GuideSpline 0.3 User Guide
Spline 0.3 User Guide
 
Cloud Lambda Architecture Patterns
Cloud Lambda Architecture PatternsCloud Lambda Architecture Patterns
Cloud Lambda Architecture Patterns
 
Spark summit 2017- Transforming B2B sales with Spark powered sales intelligence
Spark summit 2017- Transforming B2B sales with Spark powered sales intelligenceSpark summit 2017- Transforming B2B sales with Spark powered sales intelligence
Spark summit 2017- Transforming B2B sales with Spark powered sales intelligence
 
How to Rebuild an End-to-End ML Pipeline with Databricks and Upwork with Than...
How to Rebuild an End-to-End ML Pipeline with Databricks and Upwork with Than...How to Rebuild an End-to-End ML Pipeline with Databricks and Upwork with Than...
How to Rebuild an End-to-End ML Pipeline with Databricks and Upwork with Than...
 
It's Time To Stop Using Lambda Architecture | Yaroslav Tkachenko, Shopify
It's Time To Stop Using Lambda Architecture | Yaroslav Tkachenko, ShopifyIt's Time To Stop Using Lambda Architecture | Yaroslav Tkachenko, Shopify
It's Time To Stop Using Lambda Architecture | Yaroslav Tkachenko, Shopify
 

Viewers also liked

Indix at Fifth Elephant 2015
Indix at Fifth Elephant 2015Indix at Fifth Elephant 2015
Indix at Fifth Elephant 2015Anu Hastings
 
AWS Cloud Kata 2013 | Singapore - Achieving Profitability on AWS
AWS Cloud Kata 2013 | Singapore - Achieving Profitability on AWSAWS Cloud Kata 2013 | Singapore - Achieving Profitability on AWS
AWS Cloud Kata 2013 | Singapore - Achieving Profitability on AWSAmazon Web Services
 
Optimizing SlideShare for Twitter
Optimizing SlideShare for TwitterOptimizing SlideShare for Twitter
Optimizing SlideShare for TwitterKevin Baldacci
 
Elegant Solutions For Everyday Python Problems - PyCon Canada 2017
Elegant Solutions For Everyday Python Problems - PyCon Canada 2017Elegant Solutions For Everyday Python Problems - PyCon Canada 2017
Elegant Solutions For Everyday Python Problems - PyCon Canada 2017Nina Zakharenko
 
Parquet Strata/Hadoop World, New York 2013
Parquet Strata/Hadoop World, New York 2013Parquet Strata/Hadoop World, New York 2013
Parquet Strata/Hadoop World, New York 2013Julien Le Dem
 
Mining Functional Patterns
Mining Functional PatternsMining Functional Patterns
Mining Functional PatternsDebasish Ghosh
 
Democratization of Data @Indix
Democratization of Data @IndixDemocratization of Data @Indix
Democratization of Data @IndixManoj Mahalingam
 
Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0Cloudera, Inc.
 
AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017
AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017
AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017Carol Smith
 

Viewers also liked (9)

Indix at Fifth Elephant 2015
Indix at Fifth Elephant 2015Indix at Fifth Elephant 2015
Indix at Fifth Elephant 2015
 
AWS Cloud Kata 2013 | Singapore - Achieving Profitability on AWS
AWS Cloud Kata 2013 | Singapore - Achieving Profitability on AWSAWS Cloud Kata 2013 | Singapore - Achieving Profitability on AWS
AWS Cloud Kata 2013 | Singapore - Achieving Profitability on AWS
 
Optimizing SlideShare for Twitter
Optimizing SlideShare for TwitterOptimizing SlideShare for Twitter
Optimizing SlideShare for Twitter
 
Elegant Solutions For Everyday Python Problems - PyCon Canada 2017
Elegant Solutions For Everyday Python Problems - PyCon Canada 2017Elegant Solutions For Everyday Python Problems - PyCon Canada 2017
Elegant Solutions For Everyday Python Problems - PyCon Canada 2017
 
Parquet Strata/Hadoop World, New York 2013
Parquet Strata/Hadoop World, New York 2013Parquet Strata/Hadoop World, New York 2013
Parquet Strata/Hadoop World, New York 2013
 
Mining Functional Patterns
Mining Functional PatternsMining Functional Patterns
Mining Functional Patterns
 
Democratization of Data @Indix
Democratization of Data @IndixDemocratization of Data @Indix
Democratization of Data @Indix
 
Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0
 
AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017
AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017
AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017
 

Similar to Continuous Delivery Principles for Machine Learning Models

Machine Learning Models in Production
Machine Learning Models in ProductionMachine Learning Models in Production
Machine Learning Models in ProductionDataWorks Summit
 
Machine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabsMachine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabszekeLabs Technologies
 
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsApache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsAnyscale
 
Deployment Design Patterns - Deploying Machine Learning and Deep Learning Mod...
Deployment Design Patterns - Deploying Machine Learning and Deep Learning Mod...Deployment Design Patterns - Deploying Machine Learning and Deep Learning Mod...
Deployment Design Patterns - Deploying Machine Learning and Deep Learning Mod...All Things Open
 
AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)
AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)
AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)dtz001
 
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....Databricks
 
Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...DataWorks Summit
 
World Artificial Intelligence Conference Shanghai 2018
World Artificial Intelligence Conference Shanghai 2018World Artificial Intelligence Conference Shanghai 2018
World Artificial Intelligence Conference Shanghai 2018Adam Gibson
 
Deploying Data Science Engines to Production
Deploying Data Science Engines to ProductionDeploying Data Science Engines to Production
Deploying Data Science Engines to ProductionMostafa Majidpour
 
MLOps Using MLflow
MLOps Using MLflowMLOps Using MLflow
MLOps Using MLflowDatabricks
 
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...PAPIs.io
 
Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...
Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...
Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...Sotrender
 
201908 Overview of Automated ML
201908 Overview of Automated ML201908 Overview of Automated ML
201908 Overview of Automated MLMark Tabladillo
 
DevOps for Machine Learning overview en-us
DevOps for Machine Learning overview en-usDevOps for Machine Learning overview en-us
DevOps for Machine Learning overview en-useltonrodriguez11
 
Integrating Azure Machine Learning and Predictive Analytics with SharePoint O...
Integrating Azure Machine Learning and Predictive Analytics with SharePoint O...Integrating Azure Machine Learning and Predictive Analytics with SharePoint O...
Integrating Azure Machine Learning and Predictive Analytics with SharePoint O...Bhakthi Liyanage
 
A survey on Machine Learning In Production (July 2018)
A survey on Machine Learning In Production (July 2018)A survey on Machine Learning In Production (July 2018)
A survey on Machine Learning In Production (July 2018)Arnab Biswas
 
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...Robert Grossman
 
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)Neotys_Partner
 

Similar to Continuous Delivery Principles for Machine Learning Models (20)

Machine Learning Models in Production
Machine Learning Models in ProductionMachine Learning Models in Production
Machine Learning Models in Production
 
Machine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabsMachine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabs
 
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsApache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
 
Deployment Design Patterns - Deploying Machine Learning and Deep Learning Mod...
Deployment Design Patterns - Deploying Machine Learning and Deep Learning Mod...Deployment Design Patterns - Deploying Machine Learning and Deep Learning Mod...
Deployment Design Patterns - Deploying Machine Learning and Deep Learning Mod...
 
AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)
AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)
AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)
 
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
 
Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...
 
World Artificial Intelligence Conference Shanghai 2018
World Artificial Intelligence Conference Shanghai 2018World Artificial Intelligence Conference Shanghai 2018
World Artificial Intelligence Conference Shanghai 2018
 
Deploying Data Science Engines to Production
Deploying Data Science Engines to ProductionDeploying Data Science Engines to Production
Deploying Data Science Engines to Production
 
MLOps Using MLflow
MLOps Using MLflowMLOps Using MLflow
MLOps Using MLflow
 
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
 
Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...
Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...
Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...
 
201908 Overview of Automated ML
201908 Overview of Automated ML201908 Overview of Automated ML
201908 Overview of Automated ML
 
DevOps for Machine Learning overview en-us
DevOps for Machine Learning overview en-usDevOps for Machine Learning overview en-us
DevOps for Machine Learning overview en-us
 
Integrating Azure Machine Learning and Predictive Analytics with SharePoint O...
Integrating Azure Machine Learning and Predictive Analytics with SharePoint O...Integrating Azure Machine Learning and Predictive Analytics with SharePoint O...
Integrating Azure Machine Learning and Predictive Analytics with SharePoint O...
 
Aakanksha_Agnani_j2016
Aakanksha_Agnani_j2016Aakanksha_Agnani_j2016
Aakanksha_Agnani_j2016
 
A survey on Machine Learning In Production (July 2018)
A survey on Machine Learning In Production (July 2018)A survey on Machine Learning In Production (July 2018)
A survey on Machine Learning In Production (July 2018)
 
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...
 
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)
 
DevOps for DataScience
DevOps for DataScienceDevOps for DataScience
DevOps for DataScience
 

Continuous Delivery Principles for Machine Learning Models

  • 1. 5/15/2017 Continuous Delivery Principles for Machine Learning Rajesh Muppalla rajesh@indix.com @codingnirvana
  • 2. About Me • Co-Founder, Indix • From Chennai ◊ 200 miles to the east (and north) of Bangalore ◊ S Ramanujan (19th Century Mathematician), Sundar Pichai ◊ Three Seasons - Hot, Hotter, Hottest • Previously ◊ ScaleByTheBay 2016 - Data Pipelines Panelist ◊ Microservices, Lambda Architecture • Ex-Thoughtworks ◊ Tech Lead - Go-CD - an open source CI/CD Tool
  • 4. Six Business Critical Indexes People Documents Businesses Places Products Connected Devices
  • 5. Enabling businesses to build location-aware software. ~3.6 million websites use Google maps Enabling businesses to build product-aware software. Indix catalogs over 2.1 billion product offers Indix – the “Google Maps” of Products
  • 6. Data Pipeline @ Indix Crawling Pipeline Data Pipeline ML Aggregate Match Standardize Extract Attributes Classify Dedupe Parse Crawl Data Crawl Seed Brand & Retailer Websites Feeds Pipeline Transform Clean Connect Feed Data Brand & Retailer Feeds Indix Product Catalog Customizable Feeds Search & Analytics Index Indexing Pipeline Real Time Index Analyze Derive Join API (Bulk & Synchronous) Product Data Transformation Service
  • 7. E-Tailers & Marketplaces Original Catalog Title Brand Color Size Product 1 Running Shoes Adidas Blk 9 Product 2 Yoga Pants Black 32 Product 3 Jacket TNF White Enriched Catalog Title Brand Color Size Material Product 1 Running Shoes Adidas Black 9 US Leather Product 2 Yoga Pants Lululemon Black 32" Polyester Product 3 Jacket The North Face White Leather
  • 8. Ad Display & Exchange Platforms • Advertisers - Standardize, Enrich and Augment Product Information for better relevance • Retailers - Enrich, Match and Normalize their catalog for better targeting of native Ads • Publishers - Classify and tag publisher site content
  • 9. Data Scale @ Indix 2.1 Billion Product URLs 8 TB HTML Data Crawled Daily 1B Unique Products 7000 Categories 120 B Price Points 3000 Sites
  • 10. ML @ Indix (3/31/16) ● Auto Parsers to detect and extract Product content from Web pages, using Machine Vision algorithms ● Predictive Scheduler for deciding re-crawl frequency using various signals like Seasonality, Product Type, Store ● Multi-label classifier - Categorizing products into a hierarchical taxonomy using text information ● Inferring Product vs Listing vs Other Pages using either just URL patterns or Page Content ● Adaptive Crawlers that modify the crawl rate based on dynamic characteristics like Site traffic, Number of products, Robots.txt settings ● Deep learning - Categorizing products using Product images ● Predicting which products are an exact match or similar products ● NER based Attribute extraction algorithm that mines text like Title, Descriptions, Specifications to build structured Key:Value Attributes ● Fusion/Enrichment - An algorithm that uses the data to learn and build a golden product record using disparate sources ● Product Rank - An algorithm that uses multiple signals like product popularity, price, data quality, store popularity, brand popularity to build a dynamic relevance/rank score ● Recommendation Engines that suggest Tags where Product information can be found on a web page ● Deep learning - Extracting visual product attributes using Product images ● NLG algorithms to generate product descriptions ● Product GPS - Universal Product Identifier using machine learning algorithms and allowing Search & Discovery
  • 11. ML @ Indix - Classification
  • 12. ML @ Indix - Attribute Extraction
  • 14. Machine Learning Workflow Define Business Objective Explore & Transform Pull and Acquire Data Develop Model Evaluate Model Meets Business Needs? Build Production System Deploy Measure Metrics Yes! Not Yet! Human in the Loop
  • 15. Machine Learning Sandwich?* * - https://techcrunch.com/2017/08/08/the-evolution-of-machine-learning/ Explore & Transform Pull and Acquire Data Deploy Build Production System Develop Model Model Evaluation & Validation The MEAT is not in the middle
  • 16. Experts agree with us D. Sculley, et al. Hidden technical debt in machine learning systems. In Neural Information Processing Systems (NIPS). 2015 Only a small fraction of real-world ML systems is composed of the ML code, as shown by the small black box in the middle. The required surrounding infrastructure is vast and complex.
  • 17. Different Skillsets Explore & Transform Pull and Acquire Data Deploy Build Production System Develop Model Model Evaluation & Validation Data Pipelines App Model
  • 19. My Talk Explore & Transform Pull and Acquire Data Deploy Build Production System Develop Model Model Evaluation & Validation Focus of this talk
  • 20. Pain Points ● A key employee in the team had to abruptly go on leave ○ Unable to reproduce the performance of an existing production model ■ Training data missing or not known ■ Pre-processing scripts not available ■ Hyperparameters not known ● It takes 3 months to productionize a model ■ Lots of glue code ■ Custom code developed every time ■ Frequent updates to a model take a long time ● Heterogeneous Systems ■ E.g. sharing artifacts between Python and the JVM
  • 21. Reality ● Confidence in Test Set != Confidence in Production ■ Confidence of model performance on a sample set not good enough
  • 23. What is Continuous Delivery? Continuous Delivery is a software engineering approach that aims at building, testing and releasing software faster and more frequently. A straightforward and repeatable process is important for continuous delivery.
  • 25. Principle #1 Automation via CI + CD pipelines Automation of ML Training, Evaluation and Offline Prediction Pipelines Continuous Delivery for Software Continuous Delivery for Machine Learning
  • 26. Training Pipelines ● Training pipelines are modelled like a build pipeline ● Customized Go-CD, an open source CI & CD tool ● Created plugins to help us with our ML workflows Pre-process Data (Spark Job) Build Model (Python Script) Evaluate Model (Python) Training Pipeline (3 Flavors) Build Model (Spark Job) OR Build Model (Zeppelin Notebook) OR Training Data
  • 28. Principle #2 Source Code and Artifact Repository for Reproducibility Source Code, Data and Model Repository for Reproducibility Continuous Delivery for Software Continuous Delivery for Machine Learning
  • 29. Model Repository ● Similar to an artifact repository like Maven, Ivy ○ Directory Structure, Versioning, Publishing of models ● Has clients to publish models for the most commonly used frameworks ○ scikit-learn, Spark MLlib, Keras ● For a model, ○ Data ■ Stored in S3 ■ In different formats ● Parquet (Spark MLlib), Pickle (scikit-learn), HDF5 (Keras) ○ Metadata ■ Training/Validation/Test Datasets ■ Hyper-parameters used ■ Evaluation Metrics
  • 30. Model Repository Pre-process Data (Spark Job) Build Model (Python) Evaluate Model (Python) Publish Model Training Pipeline Training Data
  • 31. Training Data Model Promotion ● Tagging the “latest good” version that needs to be deployed ● Not all models need/can be promoted ○ Experimental models ○ Models that fail the test set or performance/latency metrics ● Easy rollback - tag the “last good” version as the latest Pre-process Data (Spark Job) Build Model (Python) Evaluate Model (Python) Publish Model Promote Model Manual Stage Training Pipeline
  • 32. Principle #3 Containers for µServices Model Containers for Model Prediction µServices Continuous Delivery for Software Continuous Delivery for Machine Learning
  • 33. Model Container ● Hosts a single model to be used for predictions ● Exposes an API for prediction and is “dockerized” ● Containers can be replicated to handle scale ● Two µServices ○ Scala ■ Handles pre-processing ○ Python ■ Loads the model and exposes predict on the model ■ Can also predict in batches for better throughput ■ Handles ensembles of models ○ Scala µService delegates the predict and predict_batch functions to the Python µService
  • 34. Model Container Docker Host Scala µService predict(input) predict_batch(inputs) _preprocess(input) Python µService Model Model Model predict(input) predict_batch(inputs) Create Docker Image (Docker) Push to Docker Registry (Docker) Publish Model Promote Model Training Pipeline
  • 35. Model Deployment ● Two Modes - Offline (Batch) and/or Online ● Offline Mode ○ Package model containers into an AMI (Amazon Machine Image) ○ Start the container as part of your Spark/Hadoop clusters on the Executors/Task Trackers ○ Within a job call the local Scala Service for prediction for each record ● Online Mode ○ Deploy the model containers into a Mesos + Marathon or a Kubernetes cluster ○ (Auto) Scaling is managed by the cluster
  • 36. Principle #4 A/B Testing Using Canary Releases A/B Testing Using Request Shadowing Continuous Delivery for Software Continuous Delivery for Machine Learning
  • 37. Model A/B Testing ● We don’t use Multi Armed Bandit Testing (MAB) ○ Reason - Payout is not easily measurable, unlike CTR (for example) ● Instead we use the Request Shadowing pattern ○ Send input to both the old and new models, but serve output only from the old one ○ Find deltas and do spot checking ● For Offline, we only do deltas + spot checking ● We have built an in-house data turking tool for spot checking
  • 40. Future Work ● Lot more to be done ○ Support deep learning based models as a first class solution ○ Model Repository visualization ○ Add more plugins in Go-CD to better support ML workflows natively ● Open Source ○ Model Serving Repository + Clients (WIP)
  • 41. Indix & Open Source ● oss.indix.com
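
The publish, promote and rollback flow described on slides 29 to 31 could be sketched roughly as below. This is a minimal in-memory illustration, not Indix's actual repository: the class and method names, the `s3://example-bucket/...` URIs and the `f1` threshold gate are all assumptions made for the example (the slides describe promotion as a manual stage and store model data in S3).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ModelVersion:
    version: int
    artifact_uri: str                  # hypothetical location of the serialized model
    hyperparameters: Dict[str, float]  # metadata recorded for reproducibility
    metrics: Dict[str, float]          # evaluation metrics from the training pipeline


@dataclass
class ModelRepository:
    """In-memory stand-in for a model repository (illustrative only)."""
    versions: Dict[str, List[ModelVersion]] = field(default_factory=dict)
    promoted: Dict[str, List[int]] = field(default_factory=dict)

    def publish(self, name: str, artifact_uri: str,
                hyperparameters: Dict[str, float],
                metrics: Dict[str, float]) -> ModelVersion:
        # Every published model gets the next version number for its name.
        history = self.versions.setdefault(name, [])
        model = ModelVersion(len(history) + 1, artifact_uri, hyperparameters, metrics)
        history.append(model)
        return model

    def promote(self, name: str, version: int, min_f1: float) -> bool:
        # Assumed gate: only tag versions whose f1 metric meets the threshold.
        model = self.versions[name][version - 1]
        if model.metrics.get("f1", 0.0) < min_f1:
            return False
        self.promoted.setdefault(name, []).append(version)
        return True

    def latest_good(self, name: str) -> Optional[int]:
        # The most recently promoted version is the one to deploy.
        history = self.promoted.get(name, [])
        return history[-1] if history else None

    def rollback(self, name: str) -> Optional[int]:
        # Drop the current tag so the previous promoted version becomes latest.
        history = self.promoted.get(name, [])
        if len(history) < 2:
            return None
        history.pop()
        return history[-1]
```

Keeping the promotion history as a list makes rollback a matter of re-tagging the previous good version, which mirrors the "easy rollback" point on slide 31 without rebuilding or re-publishing anything.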
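
The request shadowing pattern from slide 37 can be illustrated with a small wrapper like the one below. It is a sketch under assumptions: `ShadowingPredictor` and its attributes are hypothetical names, both models are plain Python callables, and the shadow call runs inline, whereas a production service would more likely call the candidate asynchronously so it cannot add latency.

```python
from typing import Any, Callable, List, Tuple

Predictor = Callable[[Any], Any]


class ShadowingPredictor:
    """Serves the old model's output while recording where the new model differs."""

    def __init__(self, old_model: Predictor, new_model: Predictor) -> None:
        self.old_model = old_model
        self.new_model = new_model
        # (input, served output, shadow output) triples kept for spot checking
        self.deltas: List[Tuple[Any, Any, Any]] = []

    def predict(self, item: Any) -> Any:
        served = self.old_model(item)
        try:
            shadow = self.new_model(item)
        except Exception as exc:
            # A failing candidate must never break the response to the caller.
            shadow = f"error: {exc!r}"
        if shadow != served:
            self.deltas.append((item, served, shadow))
        return served  # only the old model's output is ever returned
```

The recorded `deltas` are what a reviewer (or a data turking tool like the one the slide mentions) would sample from when spot checking the new model.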