CD4ML and the challenges of testing and quality in ML systems
TensorFlow London Meetup, May 2020
Danilo Sato
@dtsato
©ThoughtWorks 2020 - @dtsato
TensorFlow London Meetup - May 28, 2020
7000+ technologists with 43 offices in 14 countries
We help clients become Modern Digital Businesses
DELIVER VALUE, MOVE FAST, THINK BIG
#1 in Agile and Continuous Delivery
100+ books written
Techniques
Continuous delivery for machine learning (CD4ML)
TRIAL
7
https://www.thoughtworks.com/radar
PRODUCTIONIZING ML IS HARD
CD4ML isn’t a technology or a tool; it is a practice and a set of principles. Quality is built into software and improvement is always possible.
But machine learning systems have unique challenges: unlike deterministic software, the behavior of data-driven intelligent systems is difficult, or even impossible, to fully understand. This poses a huge challenge when it comes to deploying machine learning systems in accordance with CD principles.
Production systems should be:
● Reproducible
● Testable
● Auditable
● Continuously Improving
Machine Learning is:
● Non-deterministic
● Hard to test
● Hard to explain
● Hard to improve
HOW DO WE APPLY DECADES OF SOFTWARE DELIVERY EXPERIENCE TO
INTELLIGENT SYSTEMS?
MANY SOURCES OF CHANGE
Data + Model + Code
● Data: Schema, Sampling over Time, Volume
● Model: Algorithms, More Training, Experiments
● Code: Business Needs, Bug Fixes, Configuration
“Continuous Delivery is the ability to get changes of
all types — including new features, configuration
changes, bug fixes and experiments — into
production, or into the hands of users, safely and
quickly in a sustainable way.”
- Jez Humble & Dave Farley
PRINCIPLES OF CONTINUOUS DELIVERY
→ Create a Repeatable, Reliable Process for Releasing Software
→ Automate Almost Everything
→ Build Quality In
→ Work in Small Batches
→ Keep Everything in Source Control
→ Done Means “Released”
→ Improve Continuously
TECHNICAL COMPONENTS OF CD4ML
DOING CD4ML IS STILL A HARD PROBLEM
Implementation requires lots of tools, technologies, and architecture decisions to fully automate the end-to-end process. This presentation will focus on the testing and quality aspects of CD4ML.
● DISCOVERABLE AND ACCESSIBLE DATA
● REPRODUCIBLE MODEL TRAINING
● EXPERIMENTS TRACKING
● ELASTIC INFRASTRUCTURE
● VERSION CONTROL & ARTIFACTS REPOS
● MODEL SERVING
● MODEL DEPLOYMENT
● TESTING & QUALITY
● MONITORING & OBSERVABILITY
● CD ORCHESTRATION
https://martinfowler.com/articles/cd4ml.html
“CLASSIC” SOFTWARE TEST PYRAMID
[Figure: test pyramid with, from top to bottom, UI Tests, Service Tests, Unit Tests; axes labelled Speed and Cost]
https://martinfowler.com/bliki/TestPyramid.html
AS SOFTWARE BECAME MORE COMPLEX
https://martinfowler.com/articles/microservice-testing
TESTING IN PRODUCTION
https://sookocheff.com/post/architecture/testing-in-production/
Data + Model + Code: ??
TESTS FOR DATA
Diagram layers: Data Pipeline; Data/Feature Validation; Unit Tests (Transformations, Engineered Features)
- Adherence to schemas
- Features can be used
- Schema versioning and compatibility
- Integration tests against (small) sample input
- Adherence to privacy controls
- On-demand quality checks
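As a rough illustration of the data checks listed on this slide, the sketch below validates records against a schema and unit-tests one engineered feature. It is not taken from the talk: the `SCHEMA` columns, `validate_row`, and `age_bucket` are hypothetical names invented for the example, and a real pipeline would more likely use a dedicated validation library.

```python
# Hypothetical schema for illustration: column -> (expected type, nullable)
SCHEMA = {
    "user_id": (int, False),
    "age": (int, True),
    "country": (str, False),
}


def validate_row(row, schema=SCHEMA):
    """Return a list of human-readable schema violations for one record."""
    errors = []
    for column, (expected_type, nullable) in schema.items():
        if column not in row:
            errors.append(f"missing column: {column}")
            continue
        value = row[column]
        if value is None:
            if not nullable:
                errors.append(f"null in non-nullable column: {column}")
        elif not isinstance(value, expected_type):
            errors.append(f"wrong type for {column}: {type(value).__name__}")
    # Columns outside the schema are flagged too; this is one simple way to
    # stop fields that should not reach training (e.g. personal data).
    for column in row:
        if column not in schema:
            errors.append(f"unexpected column: {column}")
    return errors


def age_bucket(age):
    """Engineered feature (hypothetical): map an age to a coarse bucket."""
    if age is None:
        return "unknown"
    if age < 18:
        return "minor"
    if age < 65:
        return "adult"
    return "senior"


def test_schema_adherence():
    assert validate_row({"user_id": 1, "age": None, "country": "GB"}) == []
    leaked = {"user_id": 1, "age": 30, "country": "GB", "email": "a@b.c"}
    assert validate_row(leaked) == ["unexpected column: email"]


def test_age_bucket_boundaries():
    assert age_bucket(17) == "minor"
    assert age_bucket(18) == "adult"
    assert age_bucket(65) == "senior"
    assert age_bucket(None) == "unknown"
```

The two `test_*` functions follow the naming convention that pytest collects; running them against a small sample input is one way to cover the "integration tests against (small) sample input" item cheaply.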
TESTS FOR MODEL
Diagram layers: Unit Tests (Model Specification); Model Quality; ML Training Pipeline
- Compare against a simple model
- Numerical stability (behaviour when NaN or infinite values appear)
- Training is reproducible (watch out for sources of non-determinism, e.g. RNG seeds, initialization order)
- Integration test
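The model checks above can be sketched with a toy example. The code below is only an illustration under stated assumptions: the synthetic data, the one-feature linear model, and all function names are invented here, and the standard library stands in for whatever training framework is actually used. It shows a reproducibility test (same seed, same model), a comparison against a simple mean-predictor baseline, and a check for non-finite inputs.

```python
import math
import random


def make_data(seed=0, n=200):
    """Synthetic data for y = 3x + 1 plus a little noise (illustrative only)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [3.0 * x + 1.0 + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def train(xs, ys, seed, epochs=200, lr=0.1):
    """Fit y = w*x + b by gradient descent from a seeded random start."""
    rng = random.Random(seed)  # local RNG: no hidden global state
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in zip(xs, ys)) * 2 / n
        grad_b = sum((w * x + b - y) for x, y in zip(xs, ys)) * 2 / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b


def predict(model, x):
    """Reject NaN or infinite inputs explicitly instead of propagating them."""
    if not math.isfinite(x):
        raise ValueError("non-finite input")
    w, b = model
    return w * x + b


def mse(predictions, targets):
    """Mean squared error between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def test_training_is_reproducible():
    xs, ys = make_data(seed=0)
    assert train(xs, ys, seed=42) == train(xs, ys, seed=42)


def test_beats_simple_baseline():
    xs, ys = make_data(seed=0)
    model = train(xs, ys, seed=42)
    mean_y = sum(ys) / len(ys)  # baseline: always predict the mean
    model_error = mse([predict(model, x) for x in xs], ys)
    assert model_error < mse([mean_y] * len(ys), ys)


def test_non_finite_inputs_are_rejected():
    model = train(*make_data(seed=0), seed=42)
    for bad in (float("nan"), float("inf")):
        try:
            predict(model, bad)
        except ValueError:
            continue
        raise AssertionError("expected ValueError for non-finite input")
```

Passing the seed in explicitly, rather than relying on a global RNG, is a design choice that makes the source of randomness visible in the test itself.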
TESTING WHERE THEY OVERLAP
Data + Model + Code
Diagram layers added where data, model, and code overlap: Model Performance; Contract Tests; Model Bias and Fairness
- Model evaluation against different validation datasets
- Thresholds for model metrics and execution performance
- Different data slices
- Feature generation is the same for training and serving
- Model contract is adhered to in production
- When the model is exported, test that it still works
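One way to combine the metric-threshold and data-slice checks listed for the overlap tests is sketched below. The record layout, the `MIN_ACCURACY` value, and the function names are assumptions made for this example rather than anything prescribed by the talk. The sample data is built so that overall accuracy passes the threshold while one slice does not, which is the situation slice-level tests are meant to catch.

```python
from collections import defaultdict

MIN_ACCURACY = 0.8  # hypothetical threshold, to be agreed per project


def accuracy_by_slice(records, slice_key):
    """Compute accuracy separately for each value of `slice_key`."""
    hits, totals = defaultdict(int), defaultdict(int)
    for record in records:
        group = record[slice_key]
        totals[group] += 1
        hits[group] += int(record["label"] == record["prediction"])
    return {group: hits[group] / totals[group] for group in totals}


def failing_slices(records, slice_key, threshold=MIN_ACCURACY):
    """Return the slices whose accuracy falls below the threshold."""
    scores = accuracy_by_slice(records, slice_key)
    return sorted(group for group, score in scores.items() if score < threshold)


# Illustrative evaluation output: 10 of 12 predictions are correct overall,
# but the smaller "BR" slice is right only half of the time.
records = (
    [{"country": "GB", "label": 1, "prediction": 1}] * 9
    + [{"country": "GB", "label": 1, "prediction": 0}]
    + [{"country": "BR", "label": 1, "prediction": 1}]
    + [{"country": "BR", "label": 0, "prediction": 1}]
)

overall = sum(r["label"] == r["prediction"] for r in records) / len(records)
```

In a pipeline, a non-empty result from `failing_slices` could fail the build even when `overall` clears the threshold; the same pattern can be applied to slices relevant to bias and fairness.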
Diagram layers added on top: End-to-End Tests; Production Monitoring; Exploratory Tests
- Model degradation
- Training/serving skew
- Operational metrics (latency, throughput, resource usage)
- Real impact! (KPIs)
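As a sketch of how training/serving skew might be monitored, the example below compares the distribution of one feature at training time with what is seen in production, using the Population Stability Index (PSI). PSI is a technique swapped in here for illustration and is not named in the talk; the bin edges, the `ALERT_THRESHOLD` value, and the function names are likewise assumptions.

```python
import math

ALERT_THRESHOLD = 0.25  # assumed alerting level; tune for your own data


def histogram(values, edges):
    """Fraction of values per bin; bin i covers [edges[i], edges[i + 1]).

    Values outside the edges are counted in the first or last bin.
    """
    counts = [0] * (len(edges) - 1)
    for value in values:
        idx = 0
        while idx < len(counts) - 1 and value >= edges[idx + 1]:
            idx += 1
        counts[idx] += 1
    return [count / len(values) for count in counts]


def psi(expected, actual, edges, eps=1e-6):
    """Population Stability Index between two samples of one feature.

    `eps` avoids log(0) and division by zero for empty bins.
    """
    e = histogram(expected, edges)
    a = histogram(actual, edges)
    return sum(
        (ai - ei) * math.log((ai + eps) / (ei + eps)) for ei, ai in zip(e, a)
    )


edges = [0.0, 0.25, 0.5, 0.75, 1.0]
training = [i / 100 for i in range(100)]        # spread over all four bins
serving = [0.5 + i / 200 for i in range(100)]   # shifted into the top two bins

no_drift = psi(training, training, edges)
drift = psi(training, serving, edges)
```

Here `no_drift` is zero because the two samples are identical, while `drift` is well above the assumed threshold; a monitoring job could raise an alert in the second case and trigger a closer look at the model's real-world metrics.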
“Inspection does not improve the
quality, nor guarantee quality.
Inspection is too late. The quality,
good or bad, is already in the
product.”
- W. Edwards Deming
QUESTIONS?
WORKSHOPS, PRESENTATIONS & ARTICLES
Workshops:
https://github.com/ThoughtWorksInc/cd4ml-workshop
https://github.com/ThoughtWorksInc/CD4ML-Scenarios
Articles:
https://martinfowler.com/articles/cd4ml.html
https://www.thoughtworks.com/insights/articles/intelligent-enterprise-series-cd4ml
Paper:
“The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction”, Breck et al. (Google)
THANK YOU!
Danilo Sato (dsato@thoughtworks.com)
@dtsato
©ThoughtWorks 2020 - @dtsato
TensorFlow London Meetup - May 28, 2020

More Related Content

What's hot

Endüstriyel Yapay Zeka ve Otonom Sistemler
Endüstriyel Yapay Zeka ve Otonom SistemlerEndüstriyel Yapay Zeka ve Otonom Sistemler
Endüstriyel Yapay Zeka ve Otonom SistemlerCihan Özhan
 
Scrum ve Redmine ile yazılım projesi yönetimi
Scrum ve Redmine ile yazılım projesi yönetimiScrum ve Redmine ile yazılım projesi yönetimi
Scrum ve Redmine ile yazılım projesi yönetimiGokhan Boranalp
 
Yetenek Yönetimi Zirvesi "Onboarding Programları"
Yetenek Yönetimi Zirvesi "Onboarding Programları"Yetenek Yönetimi Zirvesi "Onboarding Programları"
Yetenek Yönetimi Zirvesi "Onboarding Programları"Elif Duru Gönen, ACC
 
DEFINICION DE CALIDAD Y CALIDAD DE SOFTWARE
DEFINICION DE CALIDAD Y CALIDAD DE SOFTWAREDEFINICION DE CALIDAD Y CALIDAD DE SOFTWARE
DEFINICION DE CALIDAD Y CALIDAD DE SOFTWARELidizz Garcia Alvarado
 
ITIL 4 Verses ITIL v3
ITIL 4 Verses ITIL v3ITIL 4 Verses ITIL v3
ITIL 4 Verses ITIL v3Mamdouh Sakr
 
Introduction to ITIL 4 and IT service management
Introduction to ITIL 4 and IT service managementIntroduction to ITIL 4 and IT service management
Introduction to ITIL 4 and IT service managementChristian F. Nissen
 
MLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionMLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionProvectus
 
Machine Learning using Kubeflow and Kubernetes
Machine Learning using Kubeflow and KubernetesMachine Learning using Kubeflow and Kubernetes
Machine Learning using Kubeflow and KubernetesArun Gupta
 
MLaaS - Presenting & Scaling Machine Learning Models as Microservices
MLaaS - Presenting & Scaling Machine Learning Models as MicroservicesMLaaS - Presenting & Scaling Machine Learning Models as Microservices
MLaaS - Presenting & Scaling Machine Learning Models as MicroservicesCihan Özhan
 
Windows virtual desktop l100 presentation
Windows virtual desktop l100 presentationWindows virtual desktop l100 presentation
Windows virtual desktop l100 presentationkiefter
 
MLOps with Azure DevOps
MLOps with Azure DevOpsMLOps with Azure DevOps
MLOps with Azure DevOpsMarco Parenzan
 
MLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at ScaleMLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at ScaleDatabricks
 
Cuadro comparativo entre moprosoft y cmmi
Cuadro comparativo entre moprosoft y cmmi Cuadro comparativo entre moprosoft y cmmi
Cuadro comparativo entre moprosoft y cmmi Darthuz Kilates
 
Introduction to Mulesoft
Introduction to MulesoftIntroduction to Mulesoft
Introduction to Mulesoftvenkata20k
 
Mulesoft Anypoint platform introduction
Mulesoft Anypoint platform introductionMulesoft Anypoint platform introduction
Mulesoft Anypoint platform introductiongijish
 
MLOps with serverless architectures (October 2018)
MLOps with serverless architectures (October 2018)MLOps with serverless architectures (October 2018)
MLOps with serverless architectures (October 2018)Julien SIMON
 

What's hot (20)

Endüstriyel Yapay Zeka ve Otonom Sistemler
Endüstriyel Yapay Zeka ve Otonom SistemlerEndüstriyel Yapay Zeka ve Otonom Sistemler
Endüstriyel Yapay Zeka ve Otonom Sistemler
 
Scrum ve Redmine ile yazılım projesi yönetimi
Scrum ve Redmine ile yazılım projesi yönetimiScrum ve Redmine ile yazılım projesi yönetimi
Scrum ve Redmine ile yazılım projesi yönetimi
 
Yetenek Yönetimi Zirvesi "Onboarding Programları"
Yetenek Yönetimi Zirvesi "Onboarding Programları"Yetenek Yönetimi Zirvesi "Onboarding Programları"
Yetenek Yönetimi Zirvesi "Onboarding Programları"
 
Tarea
TareaTarea
Tarea
 
DEFINICION DE CALIDAD Y CALIDAD DE SOFTWARE
DEFINICION DE CALIDAD Y CALIDAD DE SOFTWAREDEFINICION DE CALIDAD Y CALIDAD DE SOFTWARE
DEFINICION DE CALIDAD Y CALIDAD DE SOFTWARE
 
ITIL 4 Verses ITIL v3
ITIL 4 Verses ITIL v3ITIL 4 Verses ITIL v3
ITIL 4 Verses ITIL v3
 
Introduction to ITIL 4 and IT service management
Introduction to ITIL 4 and IT service managementIntroduction to ITIL 4 and IT service management
Introduction to ITIL 4 and IT service management
 
MLOps with Kubeflow
MLOps with Kubeflow MLOps with Kubeflow
MLOps with Kubeflow
 
MLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionMLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in Production
 
Machine Learning using Kubeflow and Kubernetes
Machine Learning using Kubeflow and KubernetesMachine Learning using Kubeflow and Kubernetes
Machine Learning using Kubeflow and Kubernetes
 
AXELOS - ITIL® Foundation
AXELOS - ITIL® FoundationAXELOS - ITIL® Foundation
AXELOS - ITIL® Foundation
 
MLaaS - Presenting & Scaling Machine Learning Models as Microservices
MLaaS - Presenting & Scaling Machine Learning Models as MicroservicesMLaaS - Presenting & Scaling Machine Learning Models as Microservices
MLaaS - Presenting & Scaling Machine Learning Models as Microservices
 
Itil v4-mindmap
Itil v4-mindmapItil v4-mindmap
Itil v4-mindmap
 
Windows virtual desktop l100 presentation
Windows virtual desktop l100 presentationWindows virtual desktop l100 presentation
Windows virtual desktop l100 presentation
 
MLOps with Azure DevOps
MLOps with Azure DevOpsMLOps with Azure DevOps
MLOps with Azure DevOps
 
MLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at ScaleMLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at Scale
 
Cuadro comparativo entre moprosoft y cmmi
Cuadro comparativo entre moprosoft y cmmi Cuadro comparativo entre moprosoft y cmmi
Cuadro comparativo entre moprosoft y cmmi
 
Introduction to Mulesoft
Introduction to MulesoftIntroduction to Mulesoft
Introduction to Mulesoft
 
Mulesoft Anypoint platform introduction
Mulesoft Anypoint platform introductionMulesoft Anypoint platform introduction
Mulesoft Anypoint platform introduction
 
MLOps with serverless architectures (October 2018)
MLOps with serverless architectures (October 2018)MLOps with serverless architectures (October 2018)
MLOps with serverless architectures (October 2018)
 

Similar to CD4ML and the challenges of testing and quality in ML systems

Continuous Intelligence: Keeping your AI Application in Production
Continuous Intelligence: Keeping your AI Application in ProductionContinuous Intelligence: Keeping your AI Application in Production
Continuous Intelligence: Keeping your AI Application in ProductionDr. Arif Wider
 
The A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsThe A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsDataPhoenix
 
Continuous Intelligence: Moving Machine Learning into Production Reliably
Continuous Intelligence: Moving Machine Learning into Production ReliablyContinuous Intelligence: Moving Machine Learning into Production Reliably
Continuous Intelligence: Moving Machine Learning into Production ReliablyDr. Arif Wider
 
Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?Itai Yaffe
 
Performance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environmentsPerformance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environmentsMartin Gutenbrunner
 
DevOpsGuys FutureDecoded 2016 - is DevOps the Answer
DevOpsGuys FutureDecoded 2016 - is DevOps the AnswerDevOpsGuys FutureDecoded 2016 - is DevOps the Answer
DevOpsGuys FutureDecoded 2016 - is DevOps the AnswerDevOpsGroup
 
Rsqrd AI: From R&D to ROI of AI
Rsqrd AI: From R&D to ROI of AIRsqrd AI: From R&D to ROI of AI
Rsqrd AI: From R&D to ROI of AISanjana Chowdhury
 
Continuous Delivery for Machine Learning
Continuous Delivery for Machine LearningContinuous Delivery for Machine Learning
Continuous Delivery for Machine LearningThoughtworks
 
CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019
CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019
CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019Christoph Windheuser
 
Data Science Meets DevOps: GitOps with OpenShift (1).pdf
Data Science Meets DevOps: GitOps with OpenShift (1).pdfData Science Meets DevOps: GitOps with OpenShift (1).pdf
Data Science Meets DevOps: GitOps with OpenShift (1).pdfHemaVeeradhi1
 
Comcast Labs Connect - PHLAI Conference Philadelphia 2018
Comcast Labs Connect - PHLAI Conference Philadelphia 2018 Comcast Labs Connect - PHLAI Conference Philadelphia 2018
Comcast Labs Connect - PHLAI Conference Philadelphia 2018 Open Data Group
 
Our research lines on Model-Driven Engineering and Software Engineering
Our research lines on Model-Driven Engineering and Software EngineeringOur research lines on Model-Driven Engineering and Software Engineering
Our research lines on Model-Driven Engineering and Software EngineeringJordi Cabot
 
Model Drift Monitoring using Tensorflow Model Analysis
Model Drift Monitoring using Tensorflow Model AnalysisModel Drift Monitoring using Tensorflow Model Analysis
Model Drift Monitoring using Tensorflow Model AnalysisVivek Raja P S
 
Continuous delivery practices and real experiences
Continuous delivery   practices and real experiencesContinuous delivery   practices and real experiences
Continuous delivery practices and real experiencesEduardo Ferro Aldama
 
Understand your data dependencies – Key enabler to efficient modernisation
 Understand your data dependencies – Key enabler to efficient modernisation  Understand your data dependencies – Key enabler to efficient modernisation
Understand your data dependencies – Key enabler to efficient modernisation Profinit
 
Introduction of TMAP to representatives of ISTQB boards in the GA week in Mar...
Introduction of TMAP to representatives of ISTQB boards in the GA week in Mar...Introduction of TMAP to representatives of ISTQB boards in the GA week in Mar...
Introduction of TMAP to representatives of ISTQB boards in the GA week in Mar...Rik Marselis
 
Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...Lionel Briand
 
Week 3 data journey and data storage
Week 3   data journey and data storageWeek 3   data journey and data storage
Week 3 data journey and data storageAjay Taneja
 

Similar to CD4ML and the challenges of testing and quality in ML systems (20)

Continuous Intelligence: Keeping your AI Application in Production
Continuous Intelligence: Keeping your AI Application in ProductionContinuous Intelligence: Keeping your AI Application in Production
Continuous Intelligence: Keeping your AI Application in Production
 
The A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsThe A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOps
 
IBM Think Milano
IBM Think MilanoIBM Think Milano
IBM Think Milano
 
Continuous Intelligence: Moving Machine Learning into Production Reliably
Continuous Intelligence: Moving Machine Learning into Production ReliablyContinuous Intelligence: Moving Machine Learning into Production Reliably
Continuous Intelligence: Moving Machine Learning into Production Reliably
 
Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?
 
Performance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environmentsPerformance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environments
 
Eliminate 7 Mudas
Eliminate 7 MudasEliminate 7 Mudas
Eliminate 7 Mudas
 
DevOpsGuys FutureDecoded 2016 - is DevOps the Answer
DevOpsGuys FutureDecoded 2016 - is DevOps the AnswerDevOpsGuys FutureDecoded 2016 - is DevOps the Answer
DevOpsGuys FutureDecoded 2016 - is DevOps the Answer
 
Rsqrd AI: From R&D to ROI of AI
Rsqrd AI: From R&D to ROI of AIRsqrd AI: From R&D to ROI of AI
Rsqrd AI: From R&D to ROI of AI
 
Continuous Delivery for Machine Learning
Continuous Delivery for Machine LearningContinuous Delivery for Machine Learning
Continuous Delivery for Machine Learning
 
CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019
CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019
CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019
 
Data Science Meets DevOps: GitOps with OpenShift (1).pdf
Data Science Meets DevOps: GitOps with OpenShift (1).pdfData Science Meets DevOps: GitOps with OpenShift (1).pdf
Data Science Meets DevOps: GitOps with OpenShift (1).pdf
 
Comcast Labs Connect - PHLAI Conference Philadelphia 2018
Comcast Labs Connect - PHLAI Conference Philadelphia 2018 Comcast Labs Connect - PHLAI Conference Philadelphia 2018
Comcast Labs Connect - PHLAI Conference Philadelphia 2018
 
Our research lines on Model-Driven Engineering and Software Engineering
Our research lines on Model-Driven Engineering and Software EngineeringOur research lines on Model-Driven Engineering and Software Engineering
Our research lines on Model-Driven Engineering and Software Engineering
 
Model Drift Monitoring using Tensorflow Model Analysis
Model Drift Monitoring using Tensorflow Model AnalysisModel Drift Monitoring using Tensorflow Model Analysis
Model Drift Monitoring using Tensorflow Model Analysis
 
Continuous delivery practices and real experiences
Continuous delivery   practices and real experiencesContinuous delivery   practices and real experiences
Continuous delivery practices and real experiences
 
Understand your data dependencies – Key enabler to efficient modernisation
 Understand your data dependencies – Key enabler to efficient modernisation  Understand your data dependencies – Key enabler to efficient modernisation
Understand your data dependencies – Key enabler to efficient modernisation
 
Introduction of TMAP to representatives of ISTQB boards in the GA week in Mar...
Introduction of TMAP to representatives of ISTQB boards in the GA week in Mar...Introduction of TMAP to representatives of ISTQB boards in the GA week in Mar...
Introduction of TMAP to representatives of ISTQB boards in the GA week in Mar...
 
Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...
 
Week 3 data journey and data storage
Week 3   data journey and data storageWeek 3   data journey and data storage
Week 3 data journey and data storage
 

More from Seldon

TensorFlow London: Cutting edge generative models
TensorFlow London: Cutting edge generative modelsTensorFlow London: Cutting edge generative models
TensorFlow London: Cutting edge generative modelsSeldon
 
Tensorflow London: Tensorflow and Graph Recommender Networks by Yaz Santissi
Tensorflow London: Tensorflow and Graph Recommender Networks by Yaz SantissiTensorflow London: Tensorflow and Graph Recommender Networks by Yaz Santissi
Tensorflow London: Tensorflow and Graph Recommender Networks by Yaz SantissiSeldon
 
TensorFlow London: Progressive Growing of GANs for increased stability, quali...
TensorFlow London: Progressive Growing of GANs for increased stability, quali...TensorFlow London: Progressive Growing of GANs for increased stability, quali...
TensorFlow London: Progressive Growing of GANs for increased stability, quali...Seldon
 
TensorFlow London 18: Dr Daniel Martinho-Corbishley, From science to startups...
TensorFlow London 18: Dr Daniel Martinho-Corbishley, From science to startups...TensorFlow London 18: Dr Daniel Martinho-Corbishley, From science to startups...
TensorFlow London 18: Dr Daniel Martinho-Corbishley, From science to startups...Seldon
 
TensorFlow London 18: Dr Alastair Moore, Towards the use of Graphical Models ...
TensorFlow London 18: Dr Alastair Moore, Towards the use of Graphical Models ...TensorFlow London 18: Dr Alastair Moore, Towards the use of Graphical Models ...
TensorFlow London 18: Dr Alastair Moore, Towards the use of Graphical Models ...Seldon
 
Seldon: Deploying Models at Scale
Seldon: Deploying Models at ScaleSeldon: Deploying Models at Scale
Seldon: Deploying Models at ScaleSeldon
 
TensorFlow London 17: How NASA Frontier Development Lab scientists use AI to ...
TensorFlow London 17: How NASA Frontier Development Lab scientists use AI to ...TensorFlow London 17: How NASA Frontier Development Lab scientists use AI to ...
TensorFlow London 17: How NASA Frontier Development Lab scientists use AI to ...Seldon
 
TensorFlow London 17: Practical Reinforcement Learning with OpenAI
TensorFlow London 17: Practical Reinforcement Learning with OpenAITensorFlow London 17: Practical Reinforcement Learning with OpenAI
TensorFlow London 17: Practical Reinforcement Learning with OpenAISeldon
 
TensorFlow 16: Multimodal Sentiment Analysis with TensorFlow
TensorFlow 16: Multimodal Sentiment Analysis with TensorFlow TensorFlow 16: Multimodal Sentiment Analysis with TensorFlow
TensorFlow 16: Multimodal Sentiment Analysis with TensorFlow Seldon
 
TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform Seldon
 
Ai in financial services
Ai in financial servicesAi in financial services
Ai in financial servicesSeldon
 
TensorFlow London 15: Find bugs in the herd with debuggable TensorFlow code
TensorFlow London 15: Find bugs in the herd with debuggable TensorFlow code TensorFlow London 15: Find bugs in the herd with debuggable TensorFlow code
TensorFlow London 15: Find bugs in the herd with debuggable TensorFlow code Seldon
 
TensorFlow London 14: Ben Hall 'Machine Learning Workloads with Kubernetes an...
TensorFlow London 14: Ben Hall 'Machine Learning Workloads with Kubernetes an...TensorFlow London 14: Ben Hall 'Machine Learning Workloads with Kubernetes an...
TensorFlow London 14: Ben Hall 'Machine Learning Workloads with Kubernetes an...Seldon
 
Tensorflow London 13: Barbara Fusinska 'Hassle Free, Scalable, Machine Learni...
Tensorflow London 13: Barbara Fusinska 'Hassle Free, Scalable, Machine Learni...Tensorflow London 13: Barbara Fusinska 'Hassle Free, Scalable, Machine Learni...
Tensorflow London 13: Barbara Fusinska 'Hassle Free, Scalable, Machine Learni...Seldon
 
Tensorflow London 13: Zbigniew Wojna 'Deep Learning for Big Scale 2D Imagery'
Tensorflow London 13: Zbigniew Wojna 'Deep Learning for Big Scale 2D Imagery'Tensorflow London 13: Zbigniew Wojna 'Deep Learning for Big Scale 2D Imagery'
Tensorflow London 13: Zbigniew Wojna 'Deep Learning for Big Scale 2D Imagery'Seldon
 
TensorFlow London 11: Pierre Harvey Richemond 'Trends and Developments in Rei...
TensorFlow London 11: Pierre Harvey Richemond 'Trends and Developments in Rei...TensorFlow London 11: Pierre Harvey Richemond 'Trends and Developments in Rei...
TensorFlow London 11: Pierre Harvey Richemond 'Trends and Developments in Rei...Seldon
 
TensorFlow London 11: Gema Parreno 'Use Cases of TensorFlow'
TensorFlow London 11: Gema Parreno 'Use Cases of TensorFlow'TensorFlow London 11: Gema Parreno 'Use Cases of TensorFlow'
TensorFlow London 11: Gema Parreno 'Use Cases of TensorFlow'Seldon
 
Tensorflow London 12: Marcel Horstmann and Laurent Decamp 'Using TensorFlow t...
Tensorflow London 12: Marcel Horstmann and Laurent Decamp 'Using TensorFlow t...Tensorflow London 12: Marcel Horstmann and Laurent Decamp 'Using TensorFlow t...
Tensorflow London 12: Marcel Horstmann and Laurent Decamp 'Using TensorFlow t...Seldon
 
TensorFlow London 12: Oliver Gindele 'Recommender systems in Tensorflow'
TensorFlow London 12: Oliver Gindele 'Recommender systems in Tensorflow'TensorFlow London 12: Oliver Gindele 'Recommender systems in Tensorflow'
TensorFlow London 12: Oliver Gindele 'Recommender systems in Tensorflow'Seldon
 
TensorFlow London 13.09.17 Ilya Dmitrichenko
TensorFlow London 13.09.17 Ilya DmitrichenkoTensorFlow London 13.09.17 Ilya Dmitrichenko
TensorFlow London 13.09.17 Ilya DmitrichenkoSeldon
 

More from Seldon (20)

TensorFlow London: Cutting edge generative models
TensorFlow London: Cutting edge generative modelsTensorFlow London: Cutting edge generative models
TensorFlow London: Cutting edge generative models
 
Tensorflow London: Tensorflow and Graph Recommender Networks by Yaz Santissi
Tensorflow London: Tensorflow and Graph Recommender Networks by Yaz SantissiTensorflow London: Tensorflow and Graph Recommender Networks by Yaz Santissi
Tensorflow London: Tensorflow and Graph Recommender Networks by Yaz Santissi
 
TensorFlow London: Progressive Growing of GANs for increased stability, quali...
TensorFlow London: Progressive Growing of GANs for increased stability, quali...TensorFlow London: Progressive Growing of GANs for increased stability, quali...
TensorFlow London: Progressive Growing of GANs for increased stability, quali...
 
TensorFlow London 18: Dr Daniel Martinho-Corbishley, From science to startups...
TensorFlow London 18: Dr Daniel Martinho-Corbishley, From science to startups...TensorFlow London 18: Dr Daniel Martinho-Corbishley, From science to startups...
TensorFlow London 18: Dr Daniel Martinho-Corbishley, From science to startups...
 
TensorFlow London 18: Dr Alastair Moore, Towards the use of Graphical Models ...
TensorFlow London 18: Dr Alastair Moore, Towards the use of Graphical Models ...TensorFlow London 18: Dr Alastair Moore, Towards the use of Graphical Models ...
TensorFlow London 18: Dr Alastair Moore, Towards the use of Graphical Models ...
 
Seldon: Deploying Models at Scale
Seldon: Deploying Models at ScaleSeldon: Deploying Models at Scale
Seldon: Deploying Models at Scale
 
TensorFlow London 17: How NASA Frontier Development Lab scientists use AI to ...
TensorFlow London 17: How NASA Frontier Development Lab scientists use AI to ...TensorFlow London 17: How NASA Frontier Development Lab scientists use AI to ...
TensorFlow London 17: How NASA Frontier Development Lab scientists use AI to ...
 
CD4ML and the challenges of testing and quality in ML systems

  • 1. CD4ML and the challenges of testing and quality in ML systems. TensorFlow London Meetup, May 2020. Danilo Sato, @dtsato. ©ThoughtWorks 2020 - @dtsato TensorFlow London Meetup - May 28, 2020
  • 2. 7000+ technologists with 43 offices in 14 countries. We help clients become Modern Digital Businesses. DELIVER VALUE, MOVE FAST, THINK BIG
  • 3. #1 in Agile and Continuous Delivery. 100+ books written.
  • 4.
  • 5. Techniques Continuous delivery for machine learning (CD4ML) TRIAL 7 https://www.thoughtworks.com/radar
  • 6. PRODUCTIONIZING ML IS HARD
      CD4ML isn’t a technology or a tool; it is a practice and a set of principles. Quality is built into software and improvement is always possible. But machine learning systems have unique challenges; unlike deterministic software, it is difficult—or impossible—to understand the behavior of data-driven intelligent systems. This poses a huge challenge when it comes to deploying machine learning systems in accordance with CD principles.
      Production systems should be:
      ● Reproducible
      ● Testable
      ● Auditable
      ● Continuously Improving
      HOW DO WE APPLY DECADES OF SOFTWARE DELIVERY EXPERIENCE TO INTELLIGENT SYSTEMS?
  • 7. PRODUCTIONIZING ML IS HARD
      CD4ML isn’t a technology or a tool; it is a practice and a set of principles. Quality is built into software and improvement is always possible. But machine learning systems have unique challenges; unlike deterministic software, it is difficult—or impossible—to understand the behavior of data-driven intelligent systems. This poses a huge challenge when it comes to deploying machine learning systems in accordance with CD principles.
      Production systems should be:
      ● Reproducible
      ● Testable
      ● Auditable
      ● Continuously Improving
      Machine Learning is:
      ● Non-deterministic
      ● Hard to test
      ● Hard to explain
      ● Hard to improve
      HOW DO WE APPLY DECADES OF SOFTWARE DELIVERY EXPERIENCE TO INTELLIGENT SYSTEMS?
  • 8. MANY SOURCES OF CHANGE
      Data + Model + Code
      ● Data: Schema, Sampling over Time, Volume
      ● Model: Algorithms, More Training, Experiments
      ● Code: Business Needs, Bug Fixes, Configuration
  • 9. “Continuous Delivery is the ability to get changes of all types — including new features, configuration changes, bug fixes and experiments — into production, or into the hands of users, safely and quickly in a sustainable way.” - Jez Humble & Dave Farley
  • 10. PRINCIPLES OF CONTINUOUS DELIVERY
      → Create a Repeatable, Reliable Process for Releasing Software
      → Automate Almost Everything
      → Build Quality In
      → Work in Small Batches
      → Keep Everything in Source Control
      → Done Means “Released”
      → Improve Continuously
  • 11. TECHNICAL COMPONENTS OF CD4ML
      DOING CD4ML IS STILL A HARD PROBLEM. Implementation requires lots of tools, technologies, and architecture decisions to fully automate the end-to-end process. This presentation will focus on the testing and quality aspects of CD4ML.
      ● DISCOVERABLE AND ACCESSIBLE DATA
      ● REPRODUCIBLE MODEL TRAINING
      ● EXPERIMENTS TRACKING
      ● ELASTIC INFRASTRUCTURE
      ● VERSION CONTROL & ARTIFACTS REPOS
      ● MODEL SERVING
      ● MODEL DEPLOYMENT
      ● TESTING & QUALITY
      ● MONITORING & OBSERVABILITY
      ● CD ORCHESTRATION
      https://martinfowler.com/articles/cd4ml.html
  • 12. “CLASSIC” SOFTWARE TEST PYRAMID
      Layers: UI Tests, Service Tests, Unit Tests. Axes: Speed, Cost.
      https://martinfowler.com/bliki/TestPyramid.html
  • 13. AS SOFTWARE BECAME MORE COMPLEX
      https://martinfowler.com/articles/microservice-testing
  • 15. Data + Model + Code ??
  • 16. TESTS FOR DATA
      Layers: Data Pipeline; Data/Feature Validation; Unit Tests (Transformations, Engineered Features)
      - Adherence to schemas
      - Features can be used
      - Schema versioning and compatibility
      - Integration tests against (small) sample input
      - Adherence to privacy controls
      - On-demand quality checks
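The "adherence to schemas" check on the slide above could be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the tooling used in the talk; the `SCHEMA` columns and the `schema_violations` helper are hypothetical names invented for the example.

```python
# Hypothetical schema (an assumption for illustration):
# column name -> (expected Python type, whether None is allowed)
SCHEMA = {
    "user_id": (int, False),
    "age": (int, True),
    "country": (str, False),
}


def schema_violations(rows, schema=SCHEMA):
    """Return human-readable violations; an empty list means the batch conforms."""
    problems = []
    for i, row in enumerate(rows):
        missing = set(schema) - set(row)
        extra = set(row) - set(schema)
        if missing:
            problems.append(f"row {i}: missing columns {sorted(missing)}")
        if extra:
            problems.append(f"row {i}: unexpected columns {sorted(extra)}")
        for name, (expected_type, nullable) in schema.items():
            if name not in row:
                continue  # already reported as missing
            value = row[name]
            if value is None:
                if not nullable:
                    problems.append(f"row {i}: {name} must not be null")
            elif not isinstance(value, expected_type):
                problems.append(
                    f"row {i}: {name} expected {expected_type.__name__}, "
                    f"got {type(value).__name__}"
                )
    return problems


good_batch = [{"user_id": 1, "age": None, "country": "BR"}]
bad_batch = [{"user_id": "1", "country": None}]
```

A check like this can run as a unit test against a small sample input and again as an on-demand quality check inside the data pipeline, so a schema change fails fast instead of surfacing as a degraded model.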
  • 17. TESTS FOR MODEL
      Layers: ML Training Pipeline; Model Quality; Unit Tests (Model Specification)
      - Compare against a simple model
      - Numerical stability (behaviour when NaN or infinite values appear)
      - Training is reproducible (watch out for sources of non-determinism, e.g. RNG seeds, initialization order)
      - Integration test
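Three of the model checks listed above (reproducible training, comparison against a simple model, and behaviour on NaN or infinite inputs) might look something like the sketch below. It uses a toy one-feature linear model trained with SGD in pure Python as a stand-in for a real TensorFlow model; `train_linear`, `predict`, and `mse` are hypothetical helpers, and the data is made up.

```python
import math
import random


def train_linear(xs, ys, seed, epochs=200, lr=0.05):
    """Fit y ~ w*x + b with SGD; all randomness flows from one seeded RNG."""
    rng = random.Random(seed)
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # shuffling order is also a source of non-determinism
        for i in indices:
            err = (w * xs[i] + b) - ys[i]
            w -= lr * err * xs[i]
            b -= lr * err
    return w, b


def predict(params, x):
    """Reject NaN/inf inputs explicitly rather than returning a garbage score."""
    if not math.isfinite(x):
        raise ValueError("non-finite input")
    w, b = params
    return w * x + b


def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # toy data generated from y = 2x + 1

# Reproducibility: the same seed must give identical parameters.
params = train_linear(xs, ys, seed=42)
same_params = train_linear(xs, ys, seed=42)

# Simple-model comparison: always predict the mean of the labels.
baseline_mse = mse([sum(ys) / len(ys)] * len(ys), ys)
model_mse = mse([predict(params, x) for x in xs], ys)
```

In a real pipeline the seed would also need to reach the framework's own random number generators, and some sources of non-determinism (for example certain GPU kernels) may need a tolerance-based comparison instead of exact equality.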
  • 18. Data + Model + Code
  • 19. TESTING WHERE THEY OVERLAP
      Layers: Model Performance; Contract Tests; Model Bias and Fairness; Data Pipeline; Data/Feature Validation; Unit Tests (Transformations, Engineered Features); Unit Tests (Model Specification); Model Quality; ML Training Pipeline; UI Tests; Service Tests; Unit Tests
      - Model evaluation against different validation datasets
      - Thresholds for model metrics and execution performance
      - Different data slices
      - Feature generation is the same for training/serving
      - Model contract is adhered to in production
      - When model is exported, test it still works
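The "thresholds for model metrics" and "different data slices" items above can be combined into one test: compute the metric per slice and fail the build if any slice falls below the threshold. The sketch below is an assumption-laden illustration; the example records, the `country` slice key, and the helper names are invented for this purpose.

```python
from collections import defaultdict


def accuracy_by_slice(examples, slice_key):
    """examples: dicts holding 'label', 'prediction' and the slicing attribute."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for ex in examples:
        s = ex[slice_key]
        totals[s] += 1
        hits[s] += int(ex["label"] == ex["prediction"])
    return {s: hits[s] / totals[s] for s in totals}


def failing_slices(examples, slice_key, min_accuracy):
    """Names of slices whose accuracy is below the agreed threshold."""
    per_slice = accuracy_by_slice(examples, slice_key)
    return sorted(s for s, acc in per_slice.items() if acc < min_accuracy)


# Made-up evaluation records: UK has 3 of 4 correct, BR has 1 of 2 correct.
examples = [
    {"country": "UK", "label": 1, "prediction": 1},
    {"country": "UK", "label": 0, "prediction": 0},
    {"country": "UK", "label": 1, "prediction": 1},
    {"country": "UK", "label": 0, "prediction": 1},
    {"country": "BR", "label": 1, "prediction": 1},
    {"country": "BR", "label": 0, "prediction": 1},
]
```

Here the overall accuracy (4 of 6) clears a 0.6 threshold, yet `failing_slices(examples, "country", 0.6)` still flags the BR slice, which is the kind of gap an aggregate metric hides and that bias and fairness checks aim to surface.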
  • 20. Layers: End-to-End Tests; Production Monitoring; Exploratory Tests; Model Performance; Contract Tests; Model Bias and Fairness; Data Pipeline; Data/Feature Validation; Unit Tests (Transformations, Engineered Features); Unit Tests (Model Specification); Model Quality; ML Training Pipeline; UI Tests; Service Tests; Unit Tests
      - Model degradation
      - Training/serving skew
      - Operational metrics (latency, throughput, resource usage)
      - Real impact! (KPIs)
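One very simple way to watch for training/serving skew in production monitoring is to compare a feature's serving-time mean with its training-time distribution. The sketch below measures the shift in training standard deviations; this particular statistic and the alert threshold of 3 are assumptions chosen for illustration, not something prescribed by the talk, and real systems often use richer distribution comparisons.

```python
import statistics


def mean_shift(training_values, serving_values):
    """Shift of the serving mean, expressed in training standard deviations."""
    mu = statistics.fmean(training_values)
    sigma = statistics.pstdev(training_values)
    serving_mu = statistics.fmean(serving_values)
    if sigma == 0:
        # A constant training feature: any change at all counts as unbounded drift.
        return 0.0 if serving_mu == mu else float("inf")
    return abs(serving_mu - mu) / sigma


ALERT_THRESHOLD = 3.0  # hypothetical alerting threshold

training_ages = [10, 12, 14, 16, 18]
serving_ages_ok = [11, 13, 14, 15, 17]
serving_ages_drifted = [20, 22, 24, 26, 28]
```

A monitoring job could compute this per feature on a schedule and raise an alert when the value exceeds the threshold, alongside the operational metrics and KPIs listed on the slide.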
  • 21. “Inspection does not improve the quality, nor guarantee quality. Inspection is too late. The quality, good or bad, is already in the product.” - W. Edwards Deming
  • 22. QUESTIONS?
  • 24. THANK YOU! Danilo Sato (dsato@thoughtworks.com) @dtsato