Deterministic Machine Learning with MLflow and mlf-core
19.11.2020
Lukas Heumos
About me
● Bioinformatics MSc from the University of Tübingen
● Research Software Engineer at the Quantitative Biology Center Tübingen
● Expert for reproducible research
About the Quantitative Biology Center (QBiC)
● Bioinformatics core facility at the University of Tübingen
○ Data management and data analysis
● Strong contributor to reproducible research
● Job opening for a Scientific Data Steward
Why do we even care?
● 400 machine learning papers were evaluated [1]
○ Only 24% were reproducible
● Determinism matters for auditing, experimentation, debugging, regression, and science
[1] Gundersen, Odd Erik & Kjensmo, Sigbjørn. (2018). State of the Art: Reproducibility in Artificial Intelligence.
Primary reasons for non-reproducible machine learning [1]
● Data and code not shared
● Documentation insufficient
○ Hyperparameters, metrics, ...
○ Hardware used
● Irreproducible environment
● Usage of GPUs
○ Non-deterministic operations
[1] Gundersen, Odd Erik & Kjensmo, Sigbjørn. (2018). State of the Art: Reproducibility in Artificial Intelligence.
The elephant in the deterministic machine learning room
● Sum-reduce algorithm
○ Based on CUDA atomic add operations
○ GPUs operate highly in parallel
○ Summing up requires synchronization
On highly parallel floating point addition
● The order of thread synchronization leads to different floating point errors
● Floating point summation is not associative
● Many applications of the algorithm amplify the differences
● Most machine learning libraries are based on atomic operations
● Plenty more reasons for non-deterministic behavior
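The effect of summation order can be reproduced on a CPU without any GPU. The following is a minimal sketch in plain Python; the three values are chosen for illustration only, and the helper `ordered_sum` is a hypothetical name, not part of any library.

```python
from itertools import permutations

def ordered_sum(values):
    """Sum strictly left to right, i.e. in one fixed reduction order."""
    total = 0.0
    for v in values:
        total += v
    return total

# 1.0 is below the rounding resolution of 1e16, so it is lost whenever it
# is added while the running total still has magnitude 1e16.
values = [1e16, 1.0, -1e16]

# Every permutation stands in for one possible thread arrival order.
distinct_sums = {ordered_sum(p) for p in permutations(values)}
print(sorted(distinct_sums))  # [0.0, 1.0]
```

The same three numbers give two different totals depending only on the order of addition, which is what happens when atomic adds on a GPU complete in a different order from run to run.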
Recent developments
Deterministic algorithms working as expected?
Options for all algorithms available?
Effect on the run time?
● (Optional) deterministic algorithms are now offered
○ Implemented without atomic operations
since v0.4.0 (2017)
since v2.1.0 (2020)
since v1.1.0 (2020)
Evaluating determinism - the setup
● Containerized projects
○ PyTorch, TensorFlow: MNIST
○ XGBoost: Covertype
● Three settings each for CPU, single GPU and multiple GPUs
○ No random seeds
○ All possible random seeds set
○ Deterministic algorithms and all random seeds set
● 5 runs per setting
System              Hardware
1 - Laptop          Intel i5-7300HQ and NVIDIA 1050M
2 - deNBI K80       Intel 12 core and 2 NVIDIA Tesla K80s
3/4 - deNBI V100    Intel 24 core and 2 NVIDIA Tesla V100s
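Deciding whether repeated runs of one setting are deterministic comes down to a bit-exact comparison of their outputs. The sketch below is one possible way to do that, not the evaluation code used for this talk; the function names and the loss values are hypothetical.

```python
import hashlib
import struct

def fingerprint(metrics):
    """Hash a sequence of floats bit for bit (no rounding, no tolerance)."""
    digest = hashlib.sha256()
    for value in metrics:
        # "<d" packs the exact 64-bit IEEE 754 representation.
        digest.update(struct.pack("<d", value))
    return digest.hexdigest()

def runs_are_deterministic(runs):
    """True if every run produced exactly the same metric values."""
    return len({fingerprint(run) for run in runs}) == 1

# Hypothetical per-epoch losses for two settings.
identical_runs = [[0.5, 0.25, 0.125] for _ in range(5)]
drifting_runs = [[0.5, 0.25, 0.125], [0.5, 0.25, 0.125 + 1e-12]]

print(runs_are_deterministic(identical_runs))  # True
print(runs_are_deterministic(drifting_runs))   # False
```

Comparing hashes rather than rounded numbers is deliberate: a tolerance would hide exactly the small floating point differences this evaluation is looking for.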
GPU with just seeds is non-deterministic
Hardware: Intel i5-7300HQ, NVIDIA 1050M
Can appear deterministic if cuDNN benchmark was lucky
Hardware: 12 core, 2x K80s
Same hardware - same results
Hardware: 24 core, 2x V100s
Marginal effect on runtime
Hardware: 12 core, 2x K80s
Primary takeaways
● Deterministic algorithms work
○ Badly tested
○ Need to be forced
● Not every algorithm has a deterministic option
○ Difficult to get complete lists
○ Even harder to keep track
● Determinism is hardware architecture dependent
● Negligible effect on the runtime
○ Duncan Riach (NVIDIA): ~ 6% [1]
[1] https://github.com/NVIDIA/framework-determinism
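"Need to be forced" means deterministic kernels are opt-in. A minimal sketch of what forcing can look like is shown here, restricted to the standard library so it runs anywhere. `force_determinism` is a hypothetical helper; `TF_DETERMINISTIC_OPS` is read by TensorFlow 2.1 and later, and `CUBLAS_WORKSPACE_CONFIG` is the setting the PyTorch documentation asks for on CUDA 10.2 or newer. Library-specific seeding calls are left out and would be needed in a real project.

```python
import os
import random

def force_determinism(seed=0):
    """Set seeds and environment flags before any ML library is imported."""
    # Seed Python's own RNG. NumPy, PyTorch and TensorFlow each need their
    # own seeding call, omitted here to keep the sketch stdlib-only.
    random.seed(seed)
    # Asks TensorFlow (2.1+) to select deterministic GPU kernels.
    os.environ["TF_DETERMINISTIC_OPS"] = "1"
    # Fixes the cuBLAS workspace so its results are reproducible.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
    return seed

force_determinism(42)
first = random.random()
force_determinism(42)
second = random.random()
print(first == second)  # True
```

The environment variables only take effect if they are set before the framework initializes, so a call like this belongs at the very top of the entry point.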
Requirements for Deterministic Machine Learning
Figure: Run 1, Run 2, Run 3
Complex requirements demand an intuitive software solution
Enabling deterministic machine learning with mlf-core [1][2]
[1] https://mlf-core.com
[2] https://pypi.org/project/mlf-core/
Inspired by
Overview
● Templates
● Continuous Integration
● Documentation
● Community
● Linting
● Sync
Available project templates
Experiments and Tracking
Model and Report Artifacts
mlf-core lint ensures that code stays deterministic
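To illustrate the idea of linting for determinism, here is a toy check that scans a training script for settings it expects to find. This is not how mlf-core lint is implemented; the rule names, the patterns, and `lint_source` are all hypothetical and only show the general shape of such a check.

```python
import re

# Hypothetical rules: patterns a deterministic PyTorch script is expected
# to contain. Illustrative only, not mlf-core's actual checks.
REQUIRED_PATTERNS = {
    "seed set": re.compile(r"\bmanual_seed\("),
    "cuDNN benchmark disabled": re.compile(r"cudnn\.benchmark\s*=\s*False"),
}

def lint_source(source):
    """Return the names of required settings missing from the source text."""
    return [name for name, pattern in REQUIRED_PATTERNS.items()
            if not pattern.search(source)]

good = "torch.manual_seed(0)\ntorch.backends.cudnn.benchmark = False\n"
bad = "model.fit(data)\n"
print(lint_source(good))  # []
print(lint_source(bad))   # ['seed set', 'cuDNN benchmark disabled']
```

A text-based check like this is easy to fool; it is meant as a reminder in continuous integration rather than a proof of determinism.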
Enabling deterministic machine learning
Figure: Run 1, Run 2, Run 3
mlf-core together with Nextflow enables end-to-end deterministic machine learning
Further work
● mlf-core
○ New templates
■ Machine learning libraries
■ Python packages
○ Improving existing templates
■ Cloud configurations
■ Adding hyperparameter optimization
■ Restructuring templates
● Add mlf-core projects for popular architectures
○ Serve as reference implementations, which can be verified
Join mlf-core
● pip install mlf-core
● https://discord.gg/Mv8sAcq
● github.com/mlf-core
● mlf-core.com
Acknowledgements
● Sven Nahnsen
● Philipp Hennig
● Gisela Gabernet
● Duncan Riach
● nf-core
● deNBI
This work was supported by the BMBF-funded de.NBI Cloud within the German Network
for Bioinformatics Infrastructure (de.NBI)(031A537B, 031A533A, 031A538A, 031A533B,
031A535A, 031A537C, 031A534A, 031A532B).
Further reasons for non-determinism
● NVIDIA cuDNN benchmark
○ Disable benchmarking
● Bias additions, max-pooling, batch normalization
○ Usually based on atomic add
○ Circumvent with deterministic algorithms
● Many non-obvious functions
○ index_add
○ gate_gradients
○ …
○ Usually no solution available
● GPU batch distribution
○ Library specific
● CUDA version
○ Libraries must be compiled against the correct version
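For the cuDNN benchmark item, PyTorch exposes two flags under `torch.backends.cudnn`. The sketch below sets them and is guarded so that it also runs where PyTorch is not installed; `disable_cudnn_benchmark` is a hypothetical helper name, and other frameworks use different switches.

```python
def disable_cudnn_benchmark():
    """Turn off cuDNN autotuning in PyTorch; returns False if PyTorch is absent."""
    try:
        import torch
    except ImportError:
        return False
    # Benchmark mode times several convolution algorithms and keeps the
    # fastest, so the chosen algorithm can differ between runs.
    torch.backends.cudnn.benchmark = False
    # Restrict cuDNN to its deterministic convolution algorithms.
    torch.backends.cudnn.deterministic = True
    return True

applied = disable_cudnn_benchmark()
```

These flags cover cuDNN only. Operations without a deterministic implementation, such as those listed above, are not affected by them.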
Deterministic sum_reduce
● One of the easier algorithms to replace atomic add in
● Multiply transpose of a column vector with a column of ones
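    import tensorflow as tf

    def reduce_sum_det(x):
        v = tf.reshape(x, [1, -1])  # flatten to a 1 x n row vector
        # (1 x n) times (n x 1 of ones) sums all elements via matmul
        return tf.reshape(tf.matmul(v, tf.ones_like(v), transpose_b=True), [])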
● GPUs are good at matrix multiplication
PyTorch and TensorFlow architecture, XGBoost parameters
● PyTorch & TensorFlow
○ 2x conv, 2x dropout, 1x 2D max pooling, 2x fc, 1x log softmax
○ Rectified linear activation functions
○ Adam optimizer
● XGBoost
○ hist & gpu_hist algorithms
○ subsample, colsample_bytree, colsample_bylevel = 0.5
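The XGBoost settings above map directly onto a parameter dictionary. This is a sketch of how they might be written down, not the configuration file used in the experiments; `xgboost_params` is a hypothetical helper, and the seed value is an assumption since the slides do not state one.

```python
def xgboost_params(use_gpu, seed=0):
    """Build a parameter dict matching the settings listed above."""
    return {
        # CPU runs use "hist"; GPU runs use "gpu_hist".
        "tree_method": "gpu_hist" if use_gpu else "hist",
        # Row and column subsampling draw random numbers, so they only
        # repeat across runs when the seed is fixed.
        "subsample": 0.5,
        "colsample_bytree": 0.5,
        "colsample_bylevel": 0.5,
        "seed": seed,  # assumption: the seed value is not given in the slides
    }

cpu_params = xgboost_params(use_gpu=False)
gpu_params = xgboost_params(use_gpu=True)
```

A dictionary like this would be passed as the `params` argument of `xgboost.train`. Newer XGBoost releases select the GPU through a separate `device` parameter instead of `gpu_hist`.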
Datasets: MNIST, Covertype
● MNIST
○ 60,000 training and 10,000 test images
○ Classify 10 handwritten digits
● Covertype
○ 581,012 instances
○ Classify tree cover type
TensorFlow results
XGBoost results
mlf-core sync
