Workshop: Machine Learning at scale with GCP
using ML Engine & Python Dataflow
25/01/2018
Robbe Sneyders & Juta Staes
About ML6
We are a team of data scientists, machine learning experts, software engineers and mathematicians.
Our mission is to provide tailor-made systems to help your organization get smart actionable insights from large data volumes.
+ Specialized Machine Learning partner of Google Cloud
Robbe Sneyders, ML engineer @ ML6
Juta Staes, ML engineer @ ML6
Outline
1. GCP tools overview
2. Workshop part 1: Dataflow
3. Workshop part 2: ML Engine
Machine Learning Pipeline
1. Collect data
2. Organize data
3. Create model
4. Train model with organized data
5. Deploy trained model
(iterate)
Mapping to GCP Products
Pipeline steps: Collect data, Organize data, Create model, Train model with organized data, Deploy trained model
Products on the diagram: Cloud Storage, Cloud Dataflow, TensorFlow, Cloud Machine Learning Engine (shown twice)
Google Cloud Products
[Figure: product overview grouped into Compute, Storage, Data & Analytics and Machine Learning; Cloud Functions is labeled separately]
Google Cloud Platform: Open Cloud Philosophy
● Powerful open source frameworks that run everywhere
● Fully managed compute and storage services to run them more easily
● Free trial for 1 year with $300 in credits
Managed service and its open source framework:
○ Cloud Machine Learning Engine: TensorFlow
○ Cloud Dataflow: Apache Beam
Outline
1. GCP tools overview
2. Workshop part 1: Dataflow
3. Workshop part 2: ML Engine
Workshop: overview
● Build ML model to classify flower images
● Part 1 (Cloud Dataflow): Transform images from Cloud Storage into TFRecords and split into train, validation and test sets
● Part 2 (TensorFlow, Cloud Machine Learning Engine): Build ML model, deploy it and use it to make predictions
Google Cloud Storage (GCS)
● Object Storage Service
● Your data lives here:
○ Raw input data
○ Cleaned examples for TF models
○ Serialized TensorFlow models
● Single interface/API, multiple offerings
Storage class   Access frequency
Multi-Regional  Frequent, cross-regional
Regional        Frequent, single-region
Nearline        Less than once per month
Coldline        Less than once per year
Apache Beam running on Cloud Dataflow
● Open source, unified model for defining both batch and streaming data-parallel processing pipelines.
● Using one of the open source Beam SDKs, you build a program that defines the pipeline.
● The pipeline is then executed by one of Beam's supported distributed processing back-ends, which include Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow.
[Figure: Beam architecture. Pipeline construction with the Beam Java, Beam Python and other language SDKs; Fn runners; execution on Apache Flink, Apache Spark and Cloud Dataflow]
Source: https://beam.apache.org
Apache Beam key concepts
● Pipelines: a data processing job made of a series of computations including input, processing, and output
● PCollections: bounded (or unbounded) datasets which represent the input, intermediate and output data in pipelines
● PTransforms: a data processing step in a pipeline that takes one or more PCollections as input and produces one or more PCollections as output
● I/O Sources and Sinks: APIs for reading and writing data, which are the roots and endpoints of the pipeline
Source: https://beam.apache.org
Apache Beam running on Cloud Dataflow
● Fully-managed data processing service to run Apache Beam pipelines:
○ Automated and optimized work partitioning which can dynamically rebalance lagging work
○ Horizontal dynamic autoscaling of worker resources
Input data: Flowers sample
Hosted publicly by Google
● CSV file on Google Cloud Storage
○ One line per sample
○ Format: image uri, label
● Text file on Google Cloud Storage
○ All labels
Collect & organize data with Cloud Dataflow
Flowers sample steps
● ReadData:
○ Read metadata from one CSV file
○ Output one string per line
● Split:
○ Transform each string to a tuple
■ (uri, label)
● ReadDictionary:
○ Read labels from text file
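The Split and ReadDictionary steps above can be sketched in plain Python, without the Beam SDK, to show what each step does to one element. This is a minimal illustration only: the function names `split_line` and `read_dictionary`, the bucket name and the sample contents are assumptions, not taken from the workshop code.

```python
import csv


def split_line(line):
    """Turn one CSV line 'uri,label' into a (uri, label) tuple."""
    uri, label = next(csv.reader([line]))
    return uri.strip(), label.strip()


def read_dictionary(text):
    """Return the list of labels, one per non-empty line of the text file."""
    return [ln.strip() for ln in text.splitlines() if ln.strip()]


# Hypothetical sample content; the real files live on Google Cloud Storage.
csv_text = (
    "gs://example-bucket/img1.jpg,daisy\n"
    "gs://example-bucket/img2.jpg,tulips\n"
)
dict_text = "daisy\ndandelion\nroses\nsunflowers\ntulips\n"

# ReadData emits one string per line; Split maps each string to a tuple.
rows = [split_line(line) for line in csv_text.splitlines()]
labels = read_dictionary(dict_text)
```

In a Beam pipeline each of these functions would typically be applied per element by a transform, so they are written to be free of shared state.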
One hot encoding
Label       Daisy  Dandelion  Roses  Sunflowers  Tulips
Daisy       1      0          0      0           0
Dandelion   0      1          0      0           0
Roses       0      0          1      0           0
Sunflowers  0      0          0      1           0
Tulips      0      0          0      0           1
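The table above can be reproduced with a few lines of plain Python. This is a sketch, assuming the label list is ordered as in the table; the function name `one_hot` is illustrative and not taken from the workshop code.

```python
def one_hot(label, labels):
    """Return a list with a 1 at the position of `label` and 0 elsewhere."""
    index = labels.index(label)  # raises ValueError for an unknown label
    return [1 if i == index else 0 for i in range(len(labels))]


# Label order assumed to match the table.
LABELS = ["daisy", "dandelion", "roses", "sunflowers", "tulips"]

encoded = one_hot("roses", LABELS)  # [0, 0, 1, 0, 0]
```

The label list plays the role of the side input: every element is encoded against the same full list of labels.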
Collect & organize data with Cloud Dataflow
Flowers sample steps
● OneHotEncoding:
○ Main input: (uri, label)
○ Side input: labels
○ Output: (uri, one hot encoding)
● ReadImage:
○ Read image from uri and convert to pixels
● BuildExamples:
○ Build a dictionary for each sample to store as TFRecord
Collect & organize data with Cloud Dataflow
Flowers sample steps
● Partition:
○ Partition data into train, validation and test sets
● WriteExamples:
○ Write TFRecords to Google Cloud Storage
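One way to implement the Partition step is a function that maps each example to a partition index, which is the shape of function that Beam's Partition transform expects (element and number of partitions in, index out). The sketch below is plain Python and makes two assumptions that do not come from the slides: an 80/10/10 split, and hashing the image uri so that the assignment is reproducible across runs.

```python
import hashlib

TRAIN, VALIDATION, TEST = 0, 1, 2


def partition_fn(example, num_partitions=3, train_pct=80, validation_pct=10):
    """Return the partition index for one (uri, one_hot_encoding) example.

    The split percentages are illustrative assumptions. num_partitions is
    accepted for compatibility with a partition-style callback but is
    fixed at 3 here.
    """
    uri = example[0]
    digest = hashlib.sha256(uri.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable value in 0..99 per uri
    if bucket < train_pct:
        return TRAIN
    if bucket < train_pct + validation_pct:
        return VALIDATION
    return TEST


# Hypothetical examples to show the function in use.
examples = [("gs://example-bucket/img%d.jpg" % i, [1, 0, 0, 0, 0])
            for i in range(1000)]
splits = [[], [], []]
for example in examples:
    splits[partition_fn(example)].append(example)
```

Hashing instead of drawing a random number keeps each image in the same set when the pipeline is rerun, which avoids leaking training images into the test set between runs.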
AstroHack apache_beam code example
[Code screenshot with numbered callouts 1 to 5; the code itself is not captured in the text]
Starting from boilerplate code
https://github.com/Fematich/mlengine-boilerplate
Start coding!
goo.gl/qxG9Ln
Outline
1. GCP tools overview
2. Workshop part 1: Dataflow
3. Workshop part 2: ML Engine
TensorFlow
● Open-source library for machine learning
● Single API for multiple platforms/devices: CPU(s), GPU(s), TPU(s), mobile phones...
● Two-step approach:
○ Construct your model as a computational graph
○ Train your model by pushing data through the graph
● Big community with lots of state-of-the-art (SotA) model implementations
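The two-step approach can be illustrated with a toy graph in plain Python. This is not the TensorFlow API; the classes `Placeholder`, `Constant` and `Op` are hypothetical stand-ins, and the sketch only shows deferred evaluation (build first, feed data later), not gradient-based training.

```python
class Placeholder:
    """Graph input whose value is supplied when the graph is run."""

    def __init__(self, name):
        self.name = name

    def run(self, feed):
        return feed[self.name]


class Constant:
    """Graph node holding a fixed value."""

    def __init__(self, value):
        self.value = value

    def run(self, feed):
        return self.value


class Op:
    """Graph node that applies a function to the outputs of its inputs."""

    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs

    def run(self, feed):
        return self.fn(*(node.run(feed) for node in self.inputs))


# Step 1: construct the graph for y = w * x + b. Nothing is computed yet.
x = Placeholder("x")
w = Constant(2.0)
b = Constant(1.0)
y = Op(lambda a, c: a + c, Op(lambda a, c: a * c, w, x), b)

# Step 2: push data through the graph.
outputs = [y.run({"x": value}) for value in (0.0, 1.0, 3.0)]  # [1.0, 3.0, 7.0]
```

Separating construction from execution is what lets the same graph be placed on different devices or run on a cluster without changing the model definition.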
ML Engine Training
● TensorFlow training as a service
● Data needs to be available online (for example in Cloud Storage)
● No fancy interface (only logging + TensorBoard)
● Same code can run locally to test on small datasets
● Nice features:
○ Easy setup of (GPU) clusters for distributed TensorFlow models
○ Automatic parallel hyperparameter tuning with Hypertune
ML Engine Predictions
● Deploy trained model:
○ model (container)
○ version (actual code)
● Predictions:
○ batch
○ online
● Autoscaling
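For online predictions, the request body is a JSON object with an `instances` list, and binary data such as image bytes is sent base64-encoded inside a `{"b64": ...}` object. The sketch below only builds that body with the standard library; it does not call the service. The input key `image_bytes` and the helper name `build_request` are assumptions, since the actual key depends on the inputs the deployed model version expects.

```python
import base64
import json


def build_request(image_bytes_list, input_key="image_bytes"):
    """Build a JSON body for an online prediction request.

    input_key is a hypothetical name; use the input name of your own model.
    """
    instances = [
        {input_key: {"b64": base64.b64encode(raw).decode("ascii")}}
        for raw in image_bytes_list
    ]
    return json.dumps({"instances": instances})


# Placeholder bytes standing in for the contents of a JPEG file.
body = build_request([b"\xff\xd8fake-jpeg"])
```

Batch prediction takes the same kind of instances, typically read from files rather than sent in a single request.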
Start coding!
goo.gl/qxG9Ln
Copyright © 2018 ML6. All rights reserved. ML6 Confidential Information
Apache Beam: beam.apache.org
TensorFlow: tensorflow.org
ML6: ml6.eu
Developer codelabs: codelabs.developers.google.com
