MODEL DEPLOYMENT
Challenges and Best Practices
WHO AM I?
Srivatsan Srinivasan
Chief Data Scientist at Cognizant
https://www.linkedin.com/in/srivatsan-srinivasan-b8131b/
https://www.youtube.com/c/AIEngineeringLife
AIEngineering
Looking for a deep dive on model deployment with a hands-on scenario?
Check out my free hands-on courses on the AIEngineering YouTube channel
Complexity and Challenges with model deployment
Model Deployment Options
Common Mistakes to Avoid
How do most of us see Model Deployment?
Model Training, Model, Model Deployment
Model formats: Pickle, PMML, Booster, Protobuf, MOJO
Serving: Flask, gRPC
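The simplified view on this slide can be sketched in a few lines: serialize the trained artifact, load it in the serving process, and score a request. This is only an illustration using the standard library; the parameter dictionary, `predict_proba` and `handle_request` are hypothetical names, and a real setup would put the handler behind Flask or gRPC.

```python
import json
import math
import pickle

# Hypothetical "trained" parameters: a one-feature logistic regression.
trained_params = {"weight": -3.0, "bias": 1.5}

# Model training side: serialize the artifact (Pickle is one of the formats listed).
model_bytes = pickle.dumps(trained_params)

# Model deployment side: load the artifact once when the service starts.
params = pickle.loads(model_bytes)


def predict_proba(abandon_rate: float) -> float:
    """Score one customer with the loaded parameters."""
    z = params["weight"] * abandon_rate + params["bias"]
    return 1.0 / (1.0 + math.exp(-z))


def handle_request(body: str) -> str:
    """What a Flask or gRPC handler would do with a JSON request body."""
    payload = json.loads(body)
    score = predict_proba(payload["abandon_rate"])
    return json.dumps({"probability_to_checkout": score})


print(handle_request('{"abandon_rate": 0.5}'))
```

Everything the rest of the deck covers (data pipelines, testing, scaling, monitoring) is what this sketch leaves out.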
If model deployment was so easy..
"Launching pilots is deceptively easy but deploying them into production is notoriously challenging" – Gartner CIO Survey results
"Organizations have been developing many machine learning models, but only 47% of those models makes it into production" – Information Week
Top 5 reasons AI fails
• Data Integration and Strategy
• Leadership Knowledge and Commitment
• Model Deployment
• Skill Availability
• Business, Data Scientist and IT working in Silos
WHAT DOES REAL-WORLD DEPLOYMENT LOOK LIKE?
Business Case – Reduce Cart Abandonment
US-based online retailer called "Amazing LLC"
Current cart abandonment rate of 80%
Industry average cart abandonment rate is below 71%
Already tried the usual options: a better and faster checkout process, support for all payment types, increased transparency, etc.
Hypothesis: cart abandonment can be reduced by 5% via targeted offers based on customer history
Source: https://www.barilliance.com/cart-abandonment-rate-statistics/
Objective: Convert abandoned carts into recovered ones. Increase revenue and enhance customer experience.
Simple Model Training Architecture
Data sources: Customer, Transaction, Campaign, Call Center/Chat
Enterprise Data Store (Daily/EOD)
Pipeline stages: Data Collection, Data Analysis, Data Preparation, Feature Engineering, Model Training, Model Evaluation, Hyperparameter Selection/Tuning, Saved Model
Features (RAW +):
Cust_1_7_days_abandon_rate
Cust_7_14_days_abandon_rate
Cust_LTV
Cust_Retail_abandon_rate
Cust_home_abandon_rate
Cust_other_abandon_rate
Cust_avg_page_view_time_abandon (7,14)
Cust_1_hour_avg_page_view_time
Cust_3_hour_avg_page_view_time
Cust_current_day_login_count
Output: Probability_to_checkout
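Features such as Cust_1_7_days_abandon_rate are windowed aggregates over a customer's cart history. A minimal sketch of how one might be computed, assuming a hypothetical event shape of (days_ago, abandoned) and assuming the windows mean days 1 to 7 and days 8 to 14; the slides do not define either.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical event shape: (days_ago, abandoned). Not defined in the slides.
CartEvent = Tuple[int, bool]


def abandon_rate(events: Iterable[CartEvent], min_days: int, max_days: int) -> float:
    """Share of carts abandoned among carts created min_days..max_days ago."""
    in_window = [abandoned for days_ago, abandoned in events
                 if min_days <= days_ago <= max_days]
    if not in_window:
        return 0.0  # assumption: customers with no carts in the window get 0.0
    return sum(in_window) / len(in_window)


def build_features(events: Iterable[CartEvent]) -> Dict[str, float]:
    """Compute the two windowed abandon-rate features for one customer."""
    events = list(events)
    return {
        "Cust_1_7_days_abandon_rate": abandon_rate(events, 1, 7),
        "Cust_7_14_days_abandon_rate": abandon_rate(events, 8, 14),
    }


history = [(1, True), (3, False), (6, True), (9, True), (12, False), (13, False)]
features = build_features(history)
print(features)
```

The default of 0.0 for an empty window is a modelling choice that has to be identical at training and serving time.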
NFRs (non-functional requirements) – Reduce Cart Abandonment
100 million customers
1 million website visits per day (2X during holidays)
Peak daily volume of 500 transactions per second (1000 tps during holidays)
Time to score each customer for targeted offer display < 20 ms on web, plus batch scoring for targeting via email
99.95% availability. Near zero downtime
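It helps to turn these NFRs into concrete numbers before choosing an architecture. The arithmetic below uses only the figures from this slide; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

visits_per_day = 1_000_000
peak_tps = 500
holiday_peak_tps = 1000
availability = 0.9995

# Average arrival rate if visits were spread evenly over the day.
avg_visits_per_second = visits_per_day / SECONDS_PER_DAY

# Downtime budget implied by 99.95% availability.
downtime_hours_per_year = (1 - availability) * 365 * 24
downtime_minutes_per_30_days = (1 - availability) * 30 * 24 * 60

print(f"average visits per second: {avg_visits_per_second:.2f}")            # about 11.57
print(f"downtime budget per year: {downtime_hours_per_year:.2f} hours")     # about 4.38
print(f"downtime budget per 30 days: {downtime_minutes_per_30_days:.1f} minutes")  # about 21.6
```

The gap between roughly 12 visits per second on average and a 500 to 1000 tps peak is why the system has to be sized for peaks, and a budget of about 22 minutes of downtime per month leaves little room for deployments that take the service offline.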
Simple Model Deployment Architecture
Features (RAW +):
Cust_1_7_days_abandon_rate
Cust_7_14_days_abandon_rate
Cust_LTV
Cust_Retail_abandon_rate
Cust_home_abandon_rate
Cust_other_abandon_rate
Cust_avg_page_view_time_abandon (7,14)
Cust_1_hour_avg_page_view_time
Cust_3_hour_avg_page_view_time
Cust_current_day_login_count
Offer_accept_flag
Cart_checkedout_with_offer
Average_time_spent_on_cart_post_offer
Items_removed_added_post_checkout
Data flow: Current Session, In memory/NoSQL, Real Time (Kafka), ETL
Model Deployment in Real World
What we thought? Model Training, Model, Model Deployment
Reality… Data Engineering, Software Engineering
Simple Model Deployment Architecture
Components: Data Collection, Data Preparation, Feature Engineering, Saved Model, Serving Interface (REST API)
Data stores: In memory/NoSQL, Enterprise Data Store (Offline/EOD)
"In many deployments, data scientists and software engineers have different code paths"
Simple Model Deployment Architecture (Functionality Testing)
Components: Data Collection, Data Preparation, Feature Engineering, Saved Model, Serving Interface (REST API)
Data stores: In memory/NoSQL, Enterprise Data Store
Functionality Testing
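One way functionality testing can catch the "different code path" problem is a parity check: feed the same raw records through the training-side and serving-side feature code and compare the results. The sketch below is an assumption about how such a check could look; the field names and both functions are hypothetical, and the serving path contains a deliberate unit mismatch (percent instead of fraction).

```python
from typing import Dict, List

Raw = Dict[str, float]


def training_features(raw: Raw) -> List[float]:
    """Data scientist's path: abandon rate as a fraction between 0 and 1."""
    carts = raw["carts_created_7d"]
    rate = raw["carts_abandoned_7d"] / carts if carts else 0.0
    return [rate, float(raw["login_count_today"])]


def serving_features(raw: Raw) -> List[float]:
    """Re-implemented serving path: same feature, but expressed as a percentage."""
    carts = raw["carts_created_7d"]
    rate = 100.0 * raw["carts_abandoned_7d"] / carts if carts else 0.0
    return [rate, float(raw["login_count_today"])]


def find_mismatches(samples: List[Raw], tolerance: float = 1e-9) -> List[int]:
    """Return indexes of samples where the two code paths disagree."""
    mismatched = []
    for index, raw in enumerate(samples):
        pairs = zip(training_features(raw), serving_features(raw))
        if any(abs(a - b) > tolerance for a, b in pairs):
            mismatched.append(index)
    return mismatched


samples = [
    {"carts_created_7d": 4, "carts_abandoned_7d": 2, "login_count_today": 3},
    {"carts_created_7d": 0, "carts_abandoned_7d": 0, "login_count_today": 1},
]
print(find_mismatches(samples))
```

Sharing one feature module between training and serving removes this class of bug entirely; the parity check is the fallback when the two paths must stay separate.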
Model Deployment in Real World
What we thought? Model Training, Model, Model Deployment
Reality… Data Engineering, Software Engineering, Functionality Testing
Simple Model Deployment Architecture (Performance Testing)
Components: Data Collection, Data Preparation, Feature Engineering, Saved Model, Serving Interface (REST API)
Data stores: In memory/NoSQL, Enterprise Data Store (Offline/EOD)
Load levels: 1 user, 10 users, 100 users, 1000 users
Response time target: 20 ms
NFR
< 20 ms response time
1000 tps volume
Highly Available
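A performance test steps through increasing load levels and compares observed latency against the NFR. The sketch below assumes a hypothetical `score` function that stands in for the REST call (here it just sleeps for 2 ms) and uses threads to simulate concurrent users; a real test would use a dedicated load-testing tool against the deployed endpoint.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Dict

NFR_LATENCY_MS = 20.0


def score(customer_id: int) -> float:
    """Hypothetical stand-in for the REST scoring call."""
    time.sleep(0.002)
    return (customer_id % 100) / 100.0


def timed_call(customer_id: int) -> float:
    """Return the latency of one scoring call in milliseconds."""
    start = time.perf_counter()
    score(customer_id)
    return (time.perf_counter() - start) * 1000.0


def run_load_level(concurrent_users: int, requests_per_user: int = 5) -> Dict[str, float]:
    """Fire requests from N simulated users and summarize latency."""
    total = concurrent_users * requests_per_user
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        latencies = sorted(pool.map(timed_call, range(total)))
    p95_index = min(len(latencies) - 1, int(0.95 * len(latencies)))
    return {
        "users": concurrent_users,
        "p50_ms": statistics.median(latencies),
        "p95_ms": latencies[p95_index],
    }


for users in (1, 10, 100):
    result = run_load_level(users)
    print(result, "meets NFR:", result["p95_ms"] < NFR_LATENCY_MS)
```

Reporting a high percentile rather than the mean matters here, because a 20 ms target that holds on average can still be missed by a noticeable share of requests under load.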
Typical Resource Bottlenecks
Shared Resources – (CPU, Memory, Bandwidth, etc.)
Garbage Collection
Network bottleneck
Memory Management
Shared Resources – (Network switches and Shared file system)
CPU starvation
Daemons and Background Activity
Model Deployment in Real World
What we thought? Model Training, Model, Model Deployment
Reality… Data Engineering, Software Engineering, Functionality Testing, Load/Performance Testing
Model Deployment - Scaling
Flask Application
Components: Data Collection, Data Preparation, Feature Engineering, Saved Model, Serving Interface (REST API)
Data store: In memory/NoSQL
Model Deployment - Scaling
VM/Containerize: Flask Application, Gunicorn, Nginx
Model Deployment – Scaling (Virtual Machine)
Load balancer
Nginx (2 instances)
Gunicorn + Flask Application (4 instances)
Model Deployment - Scaling
Scale up to 1000 tps
Scale down to 500 tps
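A rough way to estimate how many Gunicorn workers the 500 and 1000 tps levels need is Little's law (requests in flight = arrival rate × time per request). The sketch assumes each worker handles one request at a time, a 20 ms service time taken from the NFR, and two made-up planning values: 60% target utilization and 8 workers per VM.

```python
import math


def workers_needed(tps: int, service_time_ms: float, target_utilization: float = 0.6) -> int:
    """Workers required so that average utilization stays at the target."""
    busy_workers = tps * service_time_ms / 1000.0  # Little's law: L = lambda * W
    return math.ceil(busy_workers / target_utilization)


def vms_needed(workers: int, workers_per_vm: int = 8) -> int:
    """VMs required, assuming a fixed number of workers per VM (hypothetical value)."""
    return math.ceil(workers / workers_per_vm)


for tps in (500, 1000):
    workers = workers_needed(tps, service_time_ms=20)
    print(f"{tps} tps -> {workers} workers on {vms_needed(workers)} VMs")
# 500 tps -> 17 workers on 3 VMs
# 1000 tps -> 34 workers on 5 VMs
```

These numbers are only a starting point for the load test; the measured service time under contention is what should drive the scale-up and scale-down thresholds.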
Model Deployment in Real World
What we thought? Model Training, Model, Model Deployment
Reality… Data Engineering, Software Engineering, Functionality Testing, Load/Performance Testing, Infrastructure
Infrastructure
Bare Metal Infra vs Kubernetes vs Cloud
CPU vs GPU
Number of Servers required to Scale to peak volume
Cores and Memory per server
Disaster Recovery/HA – Active/Active or Active/Passive
Remember: "Infrastructure cost can be a key factor in applications with low latency and high availability needs"
Model Deployment in Real World
What we thought? Model Training, Model, Model Deployment
Reality… Data Engineering, Software Engineering, Functionality Testing, Load/Performance Testing, Infrastructure, CI/CD Pipeline
Be Prepared…
Model Monitoring
• Model performance can deteriorate at any time, no matter how good the model's training performance is
• In some cases it can happen as soon as you start testing with real-world data
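One common way to detect this kind of deterioration early is to compare the distribution of live scores (or of an input feature) with the training distribution. The sketch below uses the Population Stability Index, which is a technique chosen here for illustration and not named in the slides; the bucket edges and sample values are made up, and the 0.25 alert level is a widely used rule of thumb, not a fixed standard.

```python
import math
from typing import List, Sequence


def bucket_shares(values: Sequence[float], edges: Sequence[float]) -> List[float]:
    """Share of values per bucket; edges split the range into len(edges) + 1 buckets."""
    counts = [0] * (len(edges) + 1)
    for value in values:
        counts[sum(1 for edge in edges if value >= edge)] += 1
    # A small floor avoids log(0) for empty buckets.
    return [max(count / len(values), 1e-6) for count in counts]


def population_stability_index(expected: Sequence[float],
                               actual: Sequence[float],
                               edges: Sequence[float]) -> float:
    """PSI = sum over buckets of (actual - expected) * ln(actual / expected)."""
    expected_shares = bucket_shares(expected, edges)
    actual_shares = bucket_shares(actual, edges)
    return sum((a - e) * math.log(a / e)
               for a, e in zip(actual_shares, expected_shares))


edges = [0.25, 0.5, 0.75]
training_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
live_scores = [0.05, 0.1, 0.1, 0.15, 0.2, 0.2, 0.3, 0.3, 0.6, 0.8]

psi = population_stability_index(training_scores, live_scores, edges)
print(f"PSI = {psi:.3f}, drift alert: {psi > 0.25}")
```

A drift signal like this does not need ground-truth labels, so it can fire long before the business KPI (recovered carts) shows the problem.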
Model Monitoring
Simple Model Deployment Architecture
Components: Data Collection, Data Preparation, Feature Engineering, Saved Model, Serving Interface (REST API)
Data stores: In memory/NoSQL, Enterprise Data Store (Offline/EOD)
Model Monitoring: Real Time, Dashboard (Business and Model KPI)
Model Deployment in Real World
What we thought? Model Training, Model, Model Deployment
Reality… Data Engineering, Software Engineering, Functionality Testing, Load/Performance Testing, Infrastructure, CI/CD Pipeline, Model Monitoring
Model Deployment – Zero downtime deployment
Model Version 1
Image Source: https://www.ianlewis.org/en/bluegreen-deployments-kubernetes
Model Deployment – Zero downtime deployment
Model Version 1 Model Version 2
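The zero-downtime pattern shown here (blue/green, per the image source) keeps Model Version 1 serving while Version 2 is loaded and checked next to it, then switches traffic in one step. The sketch below is an in-process illustration only; `BlueGreenRouter` and its methods are hypothetical names, and in practice the switch usually happens at the load balancer or Kubernetes service level.

```python
from typing import Callable, Dict

Model = Callable[[float], float]


class BlueGreenRouter:
    """Keeps several model versions loaded and routes all traffic to one of them."""

    def __init__(self, live_name: str, live_model: Model) -> None:
        self._models: Dict[str, Model] = {live_name: live_model}
        self._live = live_name

    def deploy_idle(self, name: str, model: Model, smoke_input: float) -> None:
        """Load a new version next to the live one and smoke-test it first."""
        result = model(smoke_input)
        if not 0.0 <= result <= 1.0:
            raise ValueError(f"{name} failed smoke test with output {result}")
        self._models[name] = model

    def switch(self, name: str) -> str:
        """Route traffic to `name`; the previous version stays loaded for rollback."""
        if name not in self._models:
            raise KeyError(name)
        previous, self._live = self._live, name
        return previous

    def predict(self, x: float) -> float:
        return self._models[self._live](x)


router = BlueGreenRouter("v1", lambda x: 0.4)
router.deploy_idle("v2", lambda x: 0.7, smoke_input=0.5)
before = router.predict(0.5)      # still served by v1
previous = router.switch("v2")    # cut over without stopping the service
after = router.predict(0.5)       # now served by v2
print(before, previous, after)
```

Because Version 1 stays loaded, rolling back is just another call to `switch`.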
Model Deployment – Champion Challenger Deployment
Champion: 90%
Challenger: 10%
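A 90/10 split is often implemented by hashing a stable key so that each customer always sees the same model. This is a minimal sketch of that idea; the function name, the use of the customer ID as the key and the SHA-256 hash are assumptions, not details from the slides.

```python
import hashlib


def assign_variant(customer_id: str, challenger_share: float = 0.10) -> str:
    """Deterministically map a customer to 'champion' or 'challenger'."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform value in [0, 1)
    return "challenger" if bucket < challenger_share else "champion"


counts = {"champion": 0, "challenger": 0}
for i in range(10_000):
    counts[assign_variant(f"customer-{i}")] += 1

print(counts)  # roughly 9,000 champion and 1,000 challenger
```

Deterministic assignment keeps the offer experience consistent across sessions and makes the two groups' KPIs directly comparable.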
Model Deployment in Real World
What we thought? Model Training, Model, Model Deployment
Reality… Data Engineering, Software Engineering, Functionality Testing, Load/Performance Testing, Infrastructure, CI/CD Pipeline, Model Monitoring, MLOps
Simple Model Deployment Architecture
Application/Infrastructure Monitoring
MLOps (Model Monitoring, Management)
Hidden Technical Debt
DEPLOYMENT PATTERNS
Deployment Patterns
Batch, Near Real Time, Real Time, Edge
Simple to Hard
BEST PRACTICES
Best Practice
• Start understanding model deployment and integration while framing the business problem
• Involve the right people from the business, machine learning, data engineering and software engineering teams in the framing session
• Keep model pipelines as simple as possible for as long as possible
• Balance performance and simplicity where possible
• Plan for infrastructure at a very early stage of the project
• Invest in building cross-project capabilities:
  • Model monitoring
  • CI/CD pipeline setup
  • Model retraining automation
  • Feature store
THANK YOU
