Build Deep Learning Pipelines on Apache Spark for Ads Optimization
Big Data Consultant & Senior Data Scientist
Craig Chao
chaocraig@gmail.com
Slideshare: Craig Chao
Agenda
•  Prolog
•  Data Become a Weapon of New Colonialism
•  Why Not TensorFlow but Deep Learning on Apache Spark?
•  Data Engineer * Data Science
•  ML Pipelines on Apache Spark
•  ML & DL for Ads Optimization
•  Deep Learning on Apache Spark
•  Conclusion
Prolog
•  Data Become a Weapon of New Colonialism
•  Why Not TensorFlow but Deep Learning on Apache Spark?
•  Data Engineer * Data Science
Data Become a Weapon of New Colonialism
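SF Express and Cainiao cut off each other's data interfaces
Who owns the user data from Tencent apps running on Huawei phones?
iFlytek, hailed by MIT in the US as "China's smartest company"
The "snack-stealing gadget" of face recognition
A Judge Just Ordered LinkedIn to Allow Scraping (08/2017)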
Data Become a Weapon of New Colonialism
Src: https://twitter.com/jason_kint/ 	
Src: https://www.iab.com/insights/iab-internet-advertising-revenue-report-conducted-by-pricewaterhousecoopers-pwc-2/
Why Not TensorFlow but Deep Learning on Apache Spark?
Data Developer/Engineer vs. Data Scientist
Src: https://www.stitchdata.com/resources/reports/the-state-of-data-engineering/
Src: https://www.oreilly.com/ideas/2016-data-science-salary-survey-results
5 ~ 10 : 1
ML Pipelines on Apache Spark
Src: https://dzone.com/articles/distingish-pop-music-from-heavy-metal-using-apache6
ML Pipelines on Apache Spark
•  DataFrame
   •  An ML dataset holding a variety of data types
•  Transformer
   •  An algorithm that transforms one DataFrame into another DataFrame
•  Estimator
   •  An algorithm that is fit on a DataFrame to produce a Transformer
•  Pipeline
   •  Chains multiple Transformers and Estimators together to specify an ML workflow
•  Parameter
   •  Parameters belong to specific instances of Estimators and Transformers
   •  Any parameters in the ParamMap override parameters previously specified via setter methods
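The concepts above can be sketched in plain Python. This is not Spark's API: the class names (`Tokenizer`, `VocabularyIndexer`, `Pipeline`), the list-of-dicts stand-in for a DataFrame, and the `params` dictionary standing in for a ParamMap are all assumptions made for illustration. The sketch only shows the contract: a Transformer maps rows to rows, an Estimator is fit to produce a Transformer, a Pipeline chains both, and values passed at fit time override values set on the instance.

```python
class Tokenizer:
    """Transformer: maps rows to new rows without learning anything."""

    def transform(self, rows):
        return [dict(row, words=row["text"].lower().split()) for row in rows]


class VocabularyModel:
    """Transformer produced by fitting VocabularyIndexer."""

    def __init__(self, index):
        self.index = index

    def transform(self, rows):
        return [
            dict(row, ids=[self.index[w] for w in row["words"] if w in self.index])
            for row in rows
        ]


class VocabularyIndexer:
    """Estimator: fit() learns a vocabulary and returns a Transformer."""

    def __init__(self, min_count=1):
        self.min_count = min_count  # value set "via setter"

    def fit(self, rows, params=None):
        # A value passed at fit time overrides the instance value,
        # mirroring how a ParamMap overrides setter values.
        min_count = (params or {}).get("min_count", self.min_count)
        counts = {}
        for row in rows:
            for word in row["words"]:
                counts[word] = counts.get(word, 0) + 1
        vocab = sorted(w for w, c in counts.items() if c >= min_count)
        return VocabularyModel({w: i for i, w in enumerate(vocab)})


class PipelineModel:
    """Fitted pipeline: every stage is now a Transformer."""

    def __init__(self, stages):
        self.stages = stages

    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


class Pipeline:
    """Chains Transformers and Estimators into one workflow."""

    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows, params=None):
        fitted = []
        for stage in self.stages:
            if hasattr(stage, "fit"):          # Estimator: fit it first
                stage = stage.fit(rows, params)
            rows = stage.transform(rows)       # feed output to the next stage
            fitted.append(stage)
        return PipelineModel(fitted)


LYRICS = [{"text": "Pop pop rock"}, {"text": "Metal rock"}]
PIPELINE = Pipeline([Tokenizer(), VocabularyIndexer(min_count=1)])
DEFAULT_IDS = [r["ids"] for r in PIPELINE.fit(LYRICS).transform(LYRICS)]
OVERRIDE_IDS = [
    r["ids"] for r in PIPELINE.fit(LYRICS, params={"min_count": 2}).transform(LYRICS)
]
```

With `min_count=1` the vocabulary keeps all three words; passing `{"min_count": 2}` at fit time drops "metal" without touching the `VocabularyIndexer` instance.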
ML Pipelines on Apache Spark
Pipeline stages: raw unknown lyrics → Cleanser → StopWordsRemover → Stemmer → Word2Vec → LogisticRegression → Pop or Heavy Metal?
ML Pipelines on Apache Spark
•  Advantages
   •  Model selection (a.k.a. hyperparameter tuning) via cross-validation and train-validation split
   •  Pipeline/Model save/reload
https://github.com/tmatyashovsky/spark-ml-samples
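Model selection via cross-validation can be illustrated with a small plain-Python sketch. It does not use Spark (Spark ML provides `CrossValidator` and `TrainValidationSplit` for this purpose); the one-parameter ridge model, the toy data, and the grid values are assumptions chosen only to keep the example self-contained. Each candidate regularization value is scored by its average validation error over k folds, and the lowest-error candidate wins.

```python
def k_fold_indices(n, k):
    """Yield (train, validation) index lists for k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, val
        start += size


def fit_ridge(xs, ys, reg):
    """Fit y ~ w * x with an L2 penalty `reg` (closed-form solution)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + reg)


def mse(w, xs, ys):
    """Mean squared error of the model y = w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cross_validate(xs, ys, grid, k=3):
    """Return the best hyperparameter and the mean validation error of each."""
    scores = {}
    for reg in grid:
        fold_errors = []
        for train, val in k_fold_indices(len(xs), k):
            w = fit_ridge([xs[i] for i in train], [ys[i] for i in train], reg)
            fold_errors.append(mse(w, [xs[i] for i in val], [ys[i] for i in val]))
        scores[reg] = sum(fold_errors) / k
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical toy data, roughly y = 2x.
XS = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
YS = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
GRID = [0.0, 1.0, 100.0]
BEST_REG, SCORES = cross_validate(XS, YS, GRID)
```

A train-validation split is the same idea with a single fold: cheaper, but the score depends on one particular split.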
ML & DL for Ads Optimization
         Rose   Navy   Olive
Alice      0     +4      0
Bob        0      0     +2
Carol     -1      0     -2
Dave      +3      0      0
Figure labels: (Alice), (Blue), (Navy), (Periwinkle)
ML & DL for Ads Optimization
•  Optimizing X and Y simultaneously is non-convex and hard
•  If either X or Y is fixed, the problem becomes a system of linear equations: convex and easy
•  Initialize Y with random values
•  Solve for X
•  Fix X, solve for Y
•  Repeat ("Alternating")
Figure: the matrix is factored as X · Yᵀ
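The alternating steps above can be sketched in plain Python. This is a toy illustration, not Spark's ALS implementation: treating 0 as "not rated", the rank `k=2`, the regularization `lam=0.1`, and the small Gauss-Jordan solver are all assumptions made to keep the example self-contained. The ratings are taken from the toy user/colour matrix shown earlier in the deck.

```python
import random

# Rows: Alice, Bob, Carol, Dave. Columns: Rose, Navy, Olive.
# Assumption: 0 means "not rated" and is skipped when fitting.
R = [
    [0, 4, 0],
    [0, 0, 2],
    [-1, 0, -2],
    [3, 0, 0],
]


def solve(a, b):
    """Solve a small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def update(fixed, ratings, k, lam):
    """With one factor matrix fixed, solve a regularized least-squares
    problem for each row of `ratings` (a linear system per row)."""
    out = []
    for row in ratings:
        obs = [(j, r) for j, r in enumerate(row) if r != 0]
        a = [[lam * (p == q) + sum(fixed[j][p] * fixed[j][q] for j, _ in obs)
              for q in range(k)] for p in range(k)]
        b = [sum(r * fixed[j][p] for j, r in obs) for p in range(k)]
        out.append(solve(a, b))
    return out


def loss(x, y, ratings, lam):
    """Squared error on observed entries plus L2 regularization."""
    err = sum((r - sum(a * b for a, b in zip(x[u], y[i]))) ** 2
              for u, row in enumerate(ratings)
              for i, r in enumerate(row) if r != 0)
    return err + lam * sum(v * v for vec in x + y for v in vec)


def als(ratings, k=2, lam=0.1, iters=10, seed=0):
    rng = random.Random(seed)
    # Initialize Y with random values.
    y = [[rng.uniform(-1, 1) for _ in range(k)] for _ in range(len(ratings[0]))]
    transposed = [list(col) for col in zip(*ratings)]
    history = []
    for _ in range(iters):
        x = update(y, ratings, k, lam)       # fix Y, solve for X
        y = update(x, transposed, k, lam)    # fix X, solve for Y
        history.append(loss(x, y, ratings, lam))
    return x, y, history


X, Y, HISTORY = als(R)
```

Because each half-step solves its subproblem exactly, the regularized loss recorded in `HISTORY` never increases from one iteration to the next, which is what makes the alternating scheme practical despite the joint problem being non-convex.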
ML & DL for Ads Optimization
$$A_{m \times n} = S_{m \times k} \cdot \Sigma_{k \times k} \cdot T'_{k \times n}$$
Singular Value Decomposition (SVD)
Context-aware Matrix Factorization
ML & DL for Ads Optimization
DeepWalk (2014)
A Multi-View Deep Learning (2015)
ML & DL for Ads Optimization
Wide & Deep Learning Models (Google, 2016)
Deep Candidate Generation Model (YouTube, 2016)
Session-based Recommendation with RNN (2016)
Deep Learning on Apache Spark
Framework | Spark | MMLSpark | DL4J | SystemML | BigDL
Vendor | Databricks | Microsoft | DeepLearning4J | Apache | Intel

Framework | Reference
TensorFlowOnSpark | https://github.com/yahoo/TensorFlowOnSpark
DeepDist | http://deepdist.com/
OpenDL | https://github.com/guoding83128/OpenDL
CaffeOnSpark | https://github.com/yahoo/CaffeOnSpark
TensorFrames | https://github.com/databricks/tensorframes
Dist-keras | https://github.com/cerndb/dist-keras
Source: Craig Chao, DataConf 2017, Taipei
Deep Learning on Apache Spark
Apache SystemML
•  Apache Top-Level Project
•  Declarative large-scale machine learning
•  OS: Linux, macOS, Windows
•  Written in: Java
•  Open-sourced by IBM in 2015
A machine learning platform optimal for big data
Deep Learning on Apache Spark
Apache SystemML
https://github.com/dusenberrymw/systemml-nn/blob/master/nn/examples/mnist_lenet.dml 	
Built-in NN modules
Deep Learning on Apache Spark:
MS MMLSpark
•  Seamless integration of Spark Machine Learning pipelines with Microsoft Cognitive Toolkit (CNTK) and OpenCV
•  CNTK Model Gallery
   •  https://www.microsoft.com/en-us/cognitive-toolkit/features/model-gallery/
   •  Including GAN, Reinforcement Learning, ResNet152…
Deep Learning on Apache Spark:
MS MMLSpark
It implicitly converts the data into the format expected by the algorithm: it tokenizes and hashes strings, one-hot encodes categorical variables, assembles the features into a vector, and so on.
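To make those implicit conversion steps concrete, here is a plain-Python sketch of what such featurization amounts to. It is not MMLSpark's API: the function names, the row fields (`text`, `device`, `age`), the bucket count, and the use of CRC32 as the hash are all assumptions for illustration.

```python
import zlib


def hash_text(text, num_buckets=8):
    """Tokenize a string and count tokens per hash bucket (hashing trick)."""
    vec = [0.0] * num_buckets
    for token in text.lower().split():
        # CRC32 is used because it is deterministic across runs.
        vec[zlib.crc32(token.encode("utf-8")) % num_buckets] += 1.0
    return vec


def one_hot(value, categories):
    """Encode a categorical value as a 0/1 indicator vector."""
    return [1.0 if value == c else 0.0 for c in categories]


def featurize(row, categories, num_buckets=8):
    """Assemble hashed text, one-hot category and numeric field into one vector."""
    return (
        hash_text(row["text"], num_buckets)
        + one_hot(row["device"], categories)
        + [float(row["age"])]
    )


# Hypothetical ad-request row.
ROW = {"text": "Cheap flights to Taipei", "device": "android", "age": 31}
FEATURES = featurize(ROW, categories=["android", "ios", "web"])
```

The resulting vector has a fixed length (8 hash buckets, 3 category slots, 1 numeric value) regardless of the input text, which is what lets a downstream learner consume rows of mixed types.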
Deep Learning on Apache Spark:
MS MMLSpark
ML Pipeline to evaluate CNTK model.	
Windows Azure Storage Blob
Deep Learning on Apache Spark:
Databricks
•  Founded by the creators of Apache Spark; CEO Ali Ghodsi is an adjunct professor at UC Berkeley
•  Total funding: $100M+
•  Imports models from TF, MXNet, Keras, PyTorch, Caffe, CNTK, Theano, JCuda
Deep Learning on Apache Spark:
Databricks
Build an NN model from scratch: easy on a driver-only cluster, complicated on distributed nodes.
Deep Learning on Apache Spark:
DL4J
•  DeepLearning4J is a Java-based toolkit for building, training and deploying neural networks
•  An open-source, distributed deep-learning project in Java and Scala spearheaded by the people at Skymind
•  ND4J is the Java scientific computing engine powering its matrix manipulations; ND4S is its Scala wrapper
•  Includes RL and model import from Keras (Theano, TensorFlow, Caffe and CNTK)
Machine learning models are served in production with Skymind's model server.
Secure, Scalable, Stable, Debuggable, Certified
Deep Learning on Apache Spark:
DL4J
Src: Anatolii(2017)
Deep Learning on Apache Spark
BigDL
•  A distributed deep learning library for Apache Spark released by Intel®
•  Can load pre-trained Caffe or Torch models
•  Uses Intel MKL (Intel® Math Kernel Library) and multi-threaded programming in each Spark task
Deep Learning on Apache Spark
BigDL
Build a NN model from scratch
Deep Learning on Apache Spark
Framework | BigDL | DL4J | Databricks | MMLSpark | SystemML
Vendor | Intel | DeepLearning4J | Databricks | Microsoft | Apache
Pre-trained models | Caffe/Torch/TensorFlow | Keras, TensorFlow, Caffe and Theano | TF, MXNet, Keras, PyTorch, Caffe, CNTK, Theano, JCuda | CNTK Gallery/Keras | DML/Caffe2DML
Train a NN from scratch | Y | Y | Y | N | Y / DML
Notebook | Python/Scala | Scala / Reactive | Python/Scala/R/SQL | Python/Scala | Python/Scala
Free | Y | N / if model server | N | Y | Y
Usability | High | High | High | Middle | Low
Docker | Y | Y / Spark Notebook | N | Y | Y
Cloud | Y / (AWS, Azure, Cloudera…) | N | Y / AWS | Azure | N
Source: Craig Chao, DataConf 2017
Conclusions
•  Data Wars
•  Unified Data Platform
•  Data engineers/developers are key roles
•  Reusable/Portable ML Pipelines
•  DL has deep layers of hidden factors
•  DL models for Ads/RecSys
•  Code-level intro to DL solutions on Apache Spark
Add a Slide Title - 3
chaocraig@gmail.com	
Slideshare: Craig Chao

More Related Content

What's hot

Benchmark Tests and How-Tos of Convolutional Neural Network on HorovodRunner ...
Benchmark Tests and How-Tos of Convolutional Neural Network on HorovodRunner ...Benchmark Tests and How-Tos of Convolutional Neural Network on HorovodRunner ...
Benchmark Tests and How-Tos of Convolutional Neural Network on HorovodRunner ...Databricks
 
Scaling Machine Learning To Billions Of Parameters
Scaling Machine Learning To Billions Of ParametersScaling Machine Learning To Billions Of Parameters
Scaling Machine Learning To Billions Of ParametersJen Aman
 
Scaling Up AI Research to Production with PyTorch and MLFlow
Scaling Up AI Research to Production with PyTorch and MLFlowScaling Up AI Research to Production with PyTorch and MLFlow
Scaling Up AI Research to Production with PyTorch and MLFlowDatabricks
 
Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...
Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...
Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...Databricks
 
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...Databricks
 
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...Spark Summit
 
Deep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best PracticesDeep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best PracticesJen Aman
 
Extending Apache Spark SQL Data Source APIs with Join Push Down with Ioana De...
Extending Apache Spark SQL Data Source APIs with Join Push Down with Ioana De...Extending Apache Spark SQL Data Source APIs with Join Push Down with Ioana De...
Extending Apache Spark SQL Data Source APIs with Join Push Down with Ioana De...Databricks
 
Jump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on DatabricksJump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on DatabricksAnyscale
 
TensorFlowOnSpark Enhanced: Scala, Pipelines, and Beyond with Lee Yang and An...
TensorFlowOnSpark Enhanced: Scala, Pipelines, and Beyond with Lee Yang and An...TensorFlowOnSpark Enhanced: Scala, Pipelines, and Beyond with Lee Yang and An...
TensorFlowOnSpark Enhanced: Scala, Pipelines, and Beyond with Lee Yang and An...Databricks
 
Apache Spark 1.6 with Zeppelin - Transformations and Actions on RDDs
Apache Spark 1.6 with Zeppelin - Transformations and Actions on RDDsApache Spark 1.6 with Zeppelin - Transformations and Actions on RDDs
Apache Spark 1.6 with Zeppelin - Transformations and Actions on RDDsTimothy Spann
 
Microservices and Teraflops: Effortlessly Scaling Data Science with PyWren wi...
Microservices and Teraflops: Effortlessly Scaling Data Science with PyWren wi...Microservices and Teraflops: Effortlessly Scaling Data Science with PyWren wi...
Microservices and Teraflops: Effortlessly Scaling Data Science with PyWren wi...Databricks
 
Functional Programming and Big Data
Functional Programming and Big DataFunctional Programming and Big Data
Functional Programming and Big DataDataWorks Summit
 
Project Hydrogen: Unifying State-of-the-Art AI and Big Data in Apache Spark w...
Project Hydrogen: Unifying State-of-the-Art AI and Big Data in Apache Spark w...Project Hydrogen: Unifying State-of-the-Art AI and Big Data in Apache Spark w...
Project Hydrogen: Unifying State-of-the-Art AI and Big Data in Apache Spark w...Databricks
 
Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...
Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...
Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...Edureka!
 
When OLAP Meets Real-Time, What Happens in eBay?
When OLAP Meets Real-Time, What Happens in eBay?When OLAP Meets Real-Time, What Happens in eBay?
When OLAP Meets Real-Time, What Happens in eBay?DataWorks Summit
 
Introduction to Apache Spark
Introduction to Apache Spark Introduction to Apache Spark
Introduction to Apache Spark Hubert Fan Chiang
 
Using SparkR to Scale Data Science Applications in Production. Lessons from t...
Using SparkR to Scale Data Science Applications in Production. Lessons from t...Using SparkR to Scale Data Science Applications in Production. Lessons from t...
Using SparkR to Scale Data Science Applications in Production. Lessons from t...Spark Summit
 
What's New in Apache Spark 2.3 & Why Should You Care
What's New in Apache Spark 2.3 & Why Should You CareWhat's New in Apache Spark 2.3 & Why Should You Care
What's New in Apache Spark 2.3 & Why Should You CareDatabricks
 
An Insider’s Guide to Maximizing Spark SQL Performance
 An Insider’s Guide to Maximizing Spark SQL Performance An Insider’s Guide to Maximizing Spark SQL Performance
An Insider’s Guide to Maximizing Spark SQL PerformanceTakuya UESHIN
 

What's hot (20)

Benchmark Tests and How-Tos of Convolutional Neural Network on HorovodRunner ...
Benchmark Tests and How-Tos of Convolutional Neural Network on HorovodRunner ...Benchmark Tests and How-Tos of Convolutional Neural Network on HorovodRunner ...
Benchmark Tests and How-Tos of Convolutional Neural Network on HorovodRunner ...
 
Scaling Machine Learning To Billions Of Parameters
Scaling Machine Learning To Billions Of ParametersScaling Machine Learning To Billions Of Parameters
Scaling Machine Learning To Billions Of Parameters
 
Scaling Up AI Research to Production with PyTorch and MLFlow
Scaling Up AI Research to Production with PyTorch and MLFlowScaling Up AI Research to Production with PyTorch and MLFlow
Scaling Up AI Research to Production with PyTorch and MLFlow
 
Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...
Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...
Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest...
 
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
 
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
 
Deep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best PracticesDeep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best Practices
 
Extending Apache Spark SQL Data Source APIs with Join Push Down with Ioana De...
Extending Apache Spark SQL Data Source APIs with Join Push Down with Ioana De...Extending Apache Spark SQL Data Source APIs with Join Push Down with Ioana De...
Extending Apache Spark SQL Data Source APIs with Join Push Down with Ioana De...
 
Jump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on DatabricksJump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on Databricks
 
TensorFlowOnSpark Enhanced: Scala, Pipelines, and Beyond with Lee Yang and An...
TensorFlowOnSpark Enhanced: Scala, Pipelines, and Beyond with Lee Yang and An...TensorFlowOnSpark Enhanced: Scala, Pipelines, and Beyond with Lee Yang and An...
TensorFlowOnSpark Enhanced: Scala, Pipelines, and Beyond with Lee Yang and An...
 
Apache Spark 1.6 with Zeppelin - Transformations and Actions on RDDs
Apache Spark 1.6 with Zeppelin - Transformations and Actions on RDDsApache Spark 1.6 with Zeppelin - Transformations and Actions on RDDs
Apache Spark 1.6 with Zeppelin - Transformations and Actions on RDDs
 
Microservices and Teraflops: Effortlessly Scaling Data Science with PyWren wi...
Microservices and Teraflops: Effortlessly Scaling Data Science with PyWren wi...Microservices and Teraflops: Effortlessly Scaling Data Science with PyWren wi...
Microservices and Teraflops: Effortlessly Scaling Data Science with PyWren wi...
 
Functional Programming and Big Data
Functional Programming and Big DataFunctional Programming and Big Data
Functional Programming and Big Data
 
Project Hydrogen: Unifying State-of-the-Art AI and Big Data in Apache Spark w...
Project Hydrogen: Unifying State-of-the-Art AI and Big Data in Apache Spark w...Project Hydrogen: Unifying State-of-the-Art AI and Big Data in Apache Spark w...
Project Hydrogen: Unifying State-of-the-Art AI and Big Data in Apache Spark w...
 
Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...
Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...
Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...
 
When OLAP Meets Real-Time, What Happens in eBay?
When OLAP Meets Real-Time, What Happens in eBay?When OLAP Meets Real-Time, What Happens in eBay?
When OLAP Meets Real-Time, What Happens in eBay?
 
Introduction to Apache Spark
Introduction to Apache Spark Introduction to Apache Spark
Introduction to Apache Spark
 
Using SparkR to Scale Data Science Applications in Production. Lessons from t...
Using SparkR to Scale Data Science Applications in Production. Lessons from t...Using SparkR to Scale Data Science Applications in Production. Lessons from t...
Using SparkR to Scale Data Science Applications in Production. Lessons from t...
 
What's New in Apache Spark 2.3 & Why Should You Care
What's New in Apache Spark 2.3 & Why Should You CareWhat's New in Apache Spark 2.3 & Why Should You Care
What's New in Apache Spark 2.3 & Why Should You Care
 
An Insider’s Guide to Maximizing Spark SQL Performance
 An Insider’s Guide to Maximizing Spark SQL Performance An Insider’s Guide to Maximizing Spark SQL Performance
An Insider’s Guide to Maximizing Spark SQL Performance
 

Viewers also liked

Spark Dataframe - Mr. Jyotiska
Spark Dataframe - Mr. JyotiskaSpark Dataframe - Mr. Jyotiska
Spark Dataframe - Mr. JyotiskaSigmoid
 
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...MLconf
 
Deep Learning with Apache Spark and GPUs with Pierce Spitler
Deep Learning with Apache Spark and GPUs with Pierce SpitlerDeep Learning with Apache Spark and GPUs with Pierce Spitler
Deep Learning with Apache Spark and GPUs with Pierce SpitlerDatabricks
 
A Brief Intro to Scala
A Brief Intro to ScalaA Brief Intro to Scala
A Brief Intro to ScalaTim Underwood
 
Spark machine learning & deep learning
Spark machine learning & deep learningSpark machine learning & deep learning
Spark machine learning & deep learninghoondong kim
 
20170210 sapporotechbar7
20170210 sapporotechbar720170210 sapporotechbar7
20170210 sapporotechbar7Ryuji Tamagawa
 

Viewers also liked (6)

Spark Dataframe - Mr. Jyotiska
Spark Dataframe - Mr. JyotiskaSpark Dataframe - Mr. Jyotiska
Spark Dataframe - Mr. Jyotiska
 
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
 
Deep Learning with Apache Spark and GPUs with Pierce Spitler
Deep Learning with Apache Spark and GPUs with Pierce SpitlerDeep Learning with Apache Spark and GPUs with Pierce Spitler
Deep Learning with Apache Spark and GPUs with Pierce Spitler
 
A Brief Intro to Scala
A Brief Intro to ScalaA Brief Intro to Scala
A Brief Intro to Scala
 
Spark machine learning & deep learning
Spark machine learning & deep learningSpark machine learning & deep learning
Spark machine learning & deep learning
 
20170210 sapporotechbar7
20170210 sapporotechbar720170210 sapporotechbar7
20170210 sapporotechbar7
 

Similar to Build a deep learning pipeline on apache spark for ads optimization

Emiliano Martinez | Deep learning in Spark Slides | Codemotion Madrid 2018
Emiliano Martinez | Deep learning in Spark Slides | Codemotion Madrid 2018Emiliano Martinez | Deep learning in Spark Slides | Codemotion Madrid 2018
Emiliano Martinez | Deep learning in Spark Slides | Codemotion Madrid 2018Codemotion
 
What's New in Spark 2?
What's New in Spark 2?What's New in Spark 2?
What's New in Spark 2?Eyal Ben Ivri
 
Databricks with R: Deep Dive
Databricks with R: Deep DiveDatabricks with R: Deep Dive
Databricks with R: Deep DiveDatabricks
 
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...Edureka!
 
ETL to ML: Use Apache Spark as an end to end tool for Advanced Analytics
ETL to ML: Use Apache Spark as an end to end tool for Advanced AnalyticsETL to ML: Use Apache Spark as an end to end tool for Advanced Analytics
ETL to ML: Use Apache Spark as an end to end tool for Advanced AnalyticsMiklos Christine
 
Apache Spark 2.3 boosts advanced analytics and deep learning with Python
Apache Spark 2.3 boosts advanced analytics and deep learning with PythonApache Spark 2.3 boosts advanced analytics and deep learning with Python
Apache Spark 2.3 boosts advanced analytics and deep learning with PythonDataWorks Summit
 
Scaling Up Machine Learning Experimentation at Tubi 5x and Beyond
Scaling Up Machine Learning Experimentation at Tubi 5x and BeyondScaling Up Machine Learning Experimentation at Tubi 5x and Beyond
Scaling Up Machine Learning Experimentation at Tubi 5x and BeyondScyllaDB
 
Media_Entertainment_Veriticals
Media_Entertainment_VeriticalsMedia_Entertainment_Veriticals
Media_Entertainment_VeriticalsPeyman Mohajerian
 
Scalable Machine Learning with PySpark
Scalable Machine Learning with PySparkScalable Machine Learning with PySpark
Scalable Machine Learning with PySparkLadle Patel
 
Building Machine Learning models with Apache Spark and Amazon SageMaker | AWS...
Building Machine Learning models with Apache Spark and Amazon SageMaker | AWS...Building Machine Learning models with Apache Spark and Amazon SageMaker | AWS...
Building Machine Learning models with Apache Spark and Amazon SageMaker | AWS...Amazon Web Services
 
MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
  MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...  MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...Spark Summit
 
From Pandas to Koalas: Reducing Time-To-Insight for Virgin Hyperloop's Data
From Pandas to Koalas: Reducing Time-To-Insight for Virgin Hyperloop's DataFrom Pandas to Koalas: Reducing Time-To-Insight for Virgin Hyperloop's Data
From Pandas to Koalas: Reducing Time-To-Insight for Virgin Hyperloop's DataDatabricks
 
Pyspark vs Spark Let's Unravel the Bond!
Pyspark vs Spark Let's Unravel the Bond!Pyspark vs Spark Let's Unravel the Bond!
Pyspark vs Spark Let's Unravel the Bond!ankitbhandari32
 
Spark Summit East 2016 - MLeap Presentation
Spark Summit East 2016 -   MLeap PresentationSpark Summit East 2016 -   MLeap Presentation
Spark Summit East 2016 - MLeap PresentationMikhail Semeniuk
 
Spark Streaming and MLlib - Hyderabad Spark Group
Spark Streaming and MLlib - Hyderabad Spark GroupSpark Streaming and MLlib - Hyderabad Spark Group
Spark Streaming and MLlib - Hyderabad Spark GroupPhaneendra Chiruvella
 

Similar to Build a deep learning pipeline on apache spark for ads optimization (20)

Emiliano Martinez | Deep learning in Spark Slides | Codemotion Madrid 2018
Emiliano Martinez | Deep learning in Spark Slides | Codemotion Madrid 2018Emiliano Martinez | Deep learning in Spark Slides | Codemotion Madrid 2018
Emiliano Martinez | Deep learning in Spark Slides | Codemotion Madrid 2018
 
Started with-apache-spark
Started with-apache-sparkStarted with-apache-spark
Started with-apache-spark
 
MLeap: Release Spark ML Pipelines
MLeap: Release Spark ML PipelinesMLeap: Release Spark ML Pipelines
MLeap: Release Spark ML Pipelines
 
What's New in Spark 2?
What's New in Spark 2?What's New in Spark 2?
What's New in Spark 2?
 
Databricks with R: Deep Dive
Databricks with R: Deep DiveDatabricks with R: Deep Dive
Databricks with R: Deep Dive
 
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
 
ETL to ML: Use Apache Spark as an end to end tool for Advanced Analytics
ETL to ML: Use Apache Spark as an end to end tool for Advanced AnalyticsETL to ML: Use Apache Spark as an end to end tool for Advanced Analytics
ETL to ML: Use Apache Spark as an end to end tool for Advanced Analytics
 
Apache Spark 2.3 boosts advanced analytics and deep learning with Python
Apache Spark 2.3 boosts advanced analytics and deep learning with PythonApache Spark 2.3 boosts advanced analytics and deep learning with Python
Apache Spark 2.3 boosts advanced analytics and deep learning with Python
 
Scaling Up Machine Learning Experimentation at Tubi 5x and Beyond
Scaling Up Machine Learning Experimentation at Tubi 5x and BeyondScaling Up Machine Learning Experimentation at Tubi 5x and Beyond
Scaling Up Machine Learning Experimentation at Tubi 5x and Beyond
 
Media_Entertainment_Veriticals
Media_Entertainment_VeriticalsMedia_Entertainment_Veriticals
Media_Entertainment_Veriticals
 
Spark ML Pipeline serving
Spark ML Pipeline servingSpark ML Pipeline serving
Spark ML Pipeline serving
 
Scalable Machine Learning with PySpark
Scalable Machine Learning with PySparkScalable Machine Learning with PySpark
Scalable Machine Learning with PySpark
 
Spark m llib
Spark m llibSpark m llib
Spark m llib
 
Building Machine Learning models with Apache Spark and Amazon SageMaker | AWS...
Building Machine Learning models with Apache Spark and Amazon SageMaker | AWS...Building Machine Learning models with Apache Spark and Amazon SageMaker | AWS...
Building Machine Learning models with Apache Spark and Amazon SageMaker | AWS...
 
MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
  MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...  MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
 
ASPgems - kappa architecture
ASPgems - kappa architectureASPgems - kappa architecture
ASPgems - kappa architecture
 
From Pandas to Koalas: Reducing Time-To-Insight for Virgin Hyperloop's Data
From Pandas to Koalas: Reducing Time-To-Insight for Virgin Hyperloop's DataFrom Pandas to Koalas: Reducing Time-To-Insight for Virgin Hyperloop's Data
From Pandas to Koalas: Reducing Time-To-Insight for Virgin Hyperloop's Data
 
Pyspark vs Spark Let's Unravel the Bond!
Pyspark vs Spark Let's Unravel the Bond!Pyspark vs Spark Let's Unravel the Bond!
Pyspark vs Spark Let's Unravel the Bond!
 
Spark Summit East 2016 - MLeap Presentation
Spark Summit East 2016 -   MLeap PresentationSpark Summit East 2016 -   MLeap Presentation
Spark Summit East 2016 - MLeap Presentation
 
Spark Streaming and MLlib - Hyderabad Spark Group
Spark Streaming and MLlib - Hyderabad Spark GroupSpark Streaming and MLlib - Hyderabad Spark Group
Spark Streaming and MLlib - Hyderabad Spark Group
 

More from Craig Chao

人工智慧與物聯網的創新與服務模式
人工智慧與物聯網的創新與服務模式人工智慧與物聯網的創新與服務模式
人工智慧與物聯網的創新與服務模式Craig Chao
 
從新一波人工智慧與大數據浪潮看「不當行為」
從新一波人工智慧與大數據浪潮看「不當行為」從新一波人工智慧與大數據浪潮看「不當行為」
從新一波人工智慧與大數據浪潮看「不當行為」Craig Chao
 
Ai 管理人看人工智慧、發展與應用變革
Ai 管理人看人工智慧、發展與應用變革Ai 管理人看人工智慧、發展與應用變革
Ai 管理人看人工智慧、發展與應用變革Craig Chao
 
The sharing economy matchmaker-chinese-20170409
The sharing economy matchmaker-chinese-20170409The sharing economy matchmaker-chinese-20170409
The sharing economy matchmaker-chinese-20170409Craig Chao
 
Ai plus-ai intro 02-20170605
Ai plus-ai intro 02-20170605Ai plus-ai intro 02-20170605
Ai plus-ai intro 02-20170605Craig Chao
 
AI and its revolution
AI and its revolutionAI and its revolution
AI and its revolutionCraig Chao
 
從行動廣告大數據觀點談 Big data 20150916
從行動廣告大數據觀點談 Big data   20150916從行動廣告大數據觀點談 Big data   20150916
從行動廣告大數據觀點談 Big data 20150916Craig Chao
 
Key Failure Factors of Building a Data Science Team
Key Failure Factors of Building a Data Science TeamKey Failure Factors of Building a Data Science Team
Key Failure Factors of Building a Data Science TeamCraig Chao
 
Business Opportunities, Challenges, Strategies and Execution in Big Data Era ...
Business Opportunities, Challenges, Strategies and Execution in Big Data Era...Business Opportunities, Challenges, Strategies and Execution in Big Data Era...
Business Opportunities, Challenges, Strategies and Execution in Big Data Era ...Craig Chao
 
Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)
Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)
Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)Craig Chao
 
行動廣告與大數據資料分析策略與執行
行動廣告與大數據資料分析策略與執行行動廣告與大數據資料分析策略與執行
行動廣告與大數據資料分析策略與執行Craig Chao
 

More from Craig Chao (11)

人工智慧與物聯網的創新與服務模式
人工智慧與物聯網的創新與服務模式人工智慧與物聯網的創新與服務模式
人工智慧與物聯網的創新與服務模式
 
從新一波人工智慧與大數據浪潮看「不當行為」
從新一波人工智慧與大數據浪潮看「不當行為」從新一波人工智慧與大數據浪潮看「不當行為」
從新一波人工智慧與大數據浪潮看「不當行為」
 
Ai 管理人看人工智慧、發展與應用變革
Ai 管理人看人工智慧、發展與應用變革Ai 管理人看人工智慧、發展與應用變革
Ai 管理人看人工智慧、發展與應用變革
 
The sharing economy matchmaker-chinese-20170409
The sharing economy matchmaker-chinese-20170409The sharing economy matchmaker-chinese-20170409
The sharing economy matchmaker-chinese-20170409
 
Ai plus-ai intro 02-20170605
Ai plus-ai intro 02-20170605Ai plus-ai intro 02-20170605
Ai plus-ai intro 02-20170605
 
AI and its revolution
AI and its revolutionAI and its revolution
AI and its revolution
 
從行動廣告大數據觀點談 Big data 20150916
從行動廣告大數據觀點談 Big data   20150916從行動廣告大數據觀點談 Big data   20150916
從行動廣告大數據觀點談 Big data 20150916
 
Key Failure Factors of Building a Data Science Team
Key Failure Factors of Building a Data Science TeamKey Failure Factors of Building a Data Science Team
Key Failure Factors of Building a Data Science Team
 
Business Opportunities, Challenges, Strategies and Execution in Big Data Era ...
Business Opportunities, Challenges, Strategies and Execution in Big Data Era...Business Opportunities, Challenges, Strategies and Execution in Big Data Era...
Business Opportunities, Challenges, Strategies and Execution in Big Data Era ...
 
Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)
Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)
Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)
 
行動廣告與大數據資料分析策略與執行
行動廣告與大數據資料分析策略與執行行動廣告與大數據資料分析策略與執行
行動廣告與大數據資料分析策略與執行
 

Build a deep learning pipeline on apache spark for ads optimization

  • 1. Build Deep Learning Pipelines on Apache Spark for Ads Optimization Big Data Consultant & Senior Data Scientist Craig Chao chaocraig@gmail.com Slideshare: Craig Chao
  • 2. Agenda: Prolog; Data Become a Weapon of New Colonialism; Why Not Tensorflow but Deep Learning on Apache Spark?; Data Engineer * Data Science; ML Pipelines on Apache Spark; ML & DL for Ads Optimization; Deep Learning on Apache Spark; Conclusion
  • 3. Prolog: Data Become a Weapon of New Colonialism; Why Not Tensorflow but Deep Learning on Apache Spark?; Data Engineer * Data Science
  • 4. Data Become a Weapon of New Colonialism. SF Express and Cainiao cut off each other's data interfaces. Who owns the user data of Tencent apps running on Huawei phones? iFlytek, hailed by MIT in the US as "China's smartest company". The facial-recognition "food-stealing gadget". A Judge Just Ordered LinkedIn to Allow Scraping (08/2017)
  • 5. Data Become a Weapon of New Colonialism Src: https://twitter.com/jason_kint/ Src: https://www.iab.com/insights/iab-internet-advertising-revenue-report-conducted-by-pricewaterhousecoopers-pwc-2/
  • 6. Data Become a Weapon of New Colonialism
  • 7. Data Become a Weapon of New Colonialism
  • 8. Why Not Tensorflow but Deep Learning on Apache Spark?
  • 10. Data Developer/Engineer vs. Data Scientist Src: https://www.stitchdata.com/resources/reports/the-state-of-data-engineering/ https://www.oreilly.com/ideas/2016-data-science-salary-survey-results 5 ~ 10 : 1
  • 11. ML Pipelines on Apache Spark Src: https://dzone.com/articles/distingish-pop-music-from-heavy-metal-using-apache6
  • 12. ML Pipelines on Apache Spark. DataFrame: an ML dataset holding a variety of data types. Transformer: an algorithm that transforms one DataFrame into another DataFrame. Estimator: an algorithm that is fit on a DataFrame to produce a Transformer. Pipeline: chains multiple Transformers and Estimators together to specify an ML workflow. Parameter: parameters belong to specific instances of Estimators and Transformers; any parameters in a ParamMap override parameters previously specified via setter methods.
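The Transformer / Estimator / Pipeline contract on slide 12 can be illustrated without Spark. The following is a toy, pure-Python sketch and not the Spark API: the class names (`Params`, `Lowercase`, `MeanLengthThreshold`, `LengthFlag`), the `scale` parameter, and the list-of-dicts "DataFrame" are all assumptions made for illustration. It shows an Estimator whose `fit` returns a Transformer, a Pipeline that chains stages, and a per-instance param map that overrides a value set earlier through a setter.

```python
class Params:
    """Holds parameters set via a setter; a param map passed at call time wins."""

    def __init__(self, **params):
        self.params = dict(params)

    def set(self, name, value):
        self.params[name] = value
        return self

    def resolve(self, param_map=None):
        # param_map is keyed by stage instance, so parameters stay per-instance.
        merged = dict(self.params)
        merged.update((param_map or {}).get(self, {}))
        return merged


class Lowercase(Params):
    """Transformer: maps one dataset (a list of row dicts) to another."""

    def transform(self, rows, param_map=None):
        return [{**r, "text": r["text"].lower()} for r in rows]


class LengthFlag(Params):
    """Fitted model (a Transformer): flags rows longer than a threshold."""

    def transform(self, rows, param_map=None):
        t = self.resolve(param_map)["threshold"]
        return [{**r, "long": len(r["text"]) > t} for r in rows]


class MeanLengthThreshold(Params):
    """Estimator: fit() learns a threshold from data and returns a Transformer."""

    def fit(self, rows, param_map=None):
        scale = self.resolve(param_map)["scale"]
        mean_len = sum(len(r["text"]) for r in rows) / len(rows)
        return LengthFlag(threshold=mean_len * scale)


class PipelineModel:
    """Result of Pipeline.fit(): a chain made only of Transformers."""

    def __init__(self, stages):
        self.stages = stages

    def transform(self, rows, param_map=None):
        for stage in self.stages:
            rows = stage.transform(rows, param_map)
        return rows


class Pipeline:
    """Chains stages; fit() replaces every Estimator by its fitted Transformer."""

    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows, param_map=None):
        fitted = []
        for stage in self.stages:
            if hasattr(stage, "fit"):
                stage = stage.fit(rows, param_map)
            fitted.append(stage)
            rows = stage.transform(rows, param_map)
        return PipelineModel(fitted)


rows = [{"text": "Hello"}, {"text": "Hi"}, {"text": "Greetings, World"}]
est = MeanLengthThreshold().set("scale", 1.0)
pipeline = Pipeline([Lowercase(), est])

model = pipeline.fit(rows)                          # uses the setter value 1.0
strict = pipeline.fit(rows, {est: {"scale": 3.0}})  # param map overrides it

print([r["long"] for r in model.transform(rows)])
print([r["long"] for r in strict.transform(rows)])
```

Keying the param map by stage instance mirrors the slide's point that parameters belong to specific instances, so two stages of the same class can carry different values.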
  • 13. ML Pipelines on Apache Spark Src: https://dzone.com/articles/distingish-pop-music-from-heavy-metal-using-apache6
  • 14. ML Pipelines on Apache Spark: Raw unknown lyrics → After Cleanser → After StopWordsRemover → After Stemmer → After Word2Vec → After LogisticRegression → Pop or Heavy Metal?
  • 15. ML Pipelines on Apache Spark
  • 16. ML Pipelines on Apache Spark
  • 17. ML Pipelines on Apache Spark. Advantages: model selection (a.k.a. hyperparameter tuning) via cross-validation and train-validation split; Pipeline/Model save and reload. https://github.com/tmatyashovsky/spark-ml-samples
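Model selection via cross-validation, as listed on slide 17, can be sketched in a few lines of plain Python. This is an illustration of the idea only, not Spark's implementation: the 1-D ridge model, the helper names (`fit_ridge`, `cross_validate`, `select_model`), and the noise-free toy data are assumptions. Each candidate hyperparameter is scored by its mean validation error over k folds, and the best one is refit on all data.

```python
def fit_ridge(xs, ys, lam):
    """1-D ridge regression through the origin: w = sum(x*y) / (sum(x^2) + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def mse(w, xs, ys):
    """Mean squared error of the prediction w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def k_fold_indices(n, k):
    """Yield (train, valid) index lists; fold f holds the indices i with i % k == f."""
    for f in range(k):
        valid = [i for i in range(n) if i % k == f]
        train = [i for i in range(n) if i % k != f]
        yield train, valid


def cross_validate(xs, ys, lam, k=3):
    """Mean validation MSE of one hyperparameter value over k folds."""
    scores = []
    for train, valid in k_fold_indices(len(xs), k):
        w = fit_ridge([xs[i] for i in train], [ys[i] for i in train], lam)
        scores.append(mse(w, [xs[i] for i in valid], [ys[i] for i in valid]))
    return sum(scores) / k


def select_model(xs, ys, grid, k=3):
    """Pick the grid value with the lowest mean validation error, then refit."""
    best = min(grid, key=lambda lam: cross_validate(xs, ys, lam, k))
    return best, fit_ridge(xs, ys, best)


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0 * x for x in xs]  # noise-free data with true slope 2
best_lam, w = select_model(xs, ys, grid=[0.0, 1.0, 10.0])
print(best_lam, w)
```

A train-validation split is the same procedure with a single fold, which is cheaper but gives a noisier estimate.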
  • 18. ML Pipelines on Apache Spark https://github.com/tmatyashovsky/spark-ml-samples
  • 19. ML & DL for Ads Optimization
  • 20. ML & DL for Ads Optimization. Rating matrix (columns: Rose, Navy, Olive): Alice 0, +4, 0; Bob 0, 0, +2; Carol -1, 0, -2; Dave +3, 0, 0. (Alice) (Blue) (Navy) (Periwinkle)
  • 21. ML & DL for Ads Optimization. Optimizing X and Y simultaneously is non-convex and hard. If X or Y is fixed, the problem reduces to a system of linear equations, which is convex and easy. Initialize Y with random values; solve for X; fix X and solve for Y; repeat (“Alternating”). Factorization: X Y^T
  • 22. ML & DL for Ads Optimization. Singular Value Decomposition (SVD): A (m × n) = S (m × k) · Σ (k × k) · T′ (k × n). Context-aware Matrix Factorization
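The alternating scheme on slide 21 can be sketched as follows. This is a minimal rank-1 version with L2 regularization, written in plain Python rather than with Spark's ALS. The ratings reuse the non-zero entries of the matrix on slide 20, and treating the zeros as unobserved is an assumption, as are the function names and the values of `lam`, `iters`, and `seed`. With one side fixed, each entry of the other side has a closed-form ridge solution, so every half-step solves a convex problem exactly.

```python
import random


def objective(ratings, x, y, lam):
    """Squared error on observed entries plus an L2 penalty on both factors."""
    err = sum((r - x[i] * y[j]) ** 2 for (i, j), r in ratings.items())
    return err + lam * (sum(v * v for v in x) + sum(v * v for v in y))


def solve_side(ratings, fixed, n, lam, axis):
    """Closed-form ridge update of one factor while the other is held fixed."""
    num = [0.0] * n
    den = [lam] * n  # lam > 0 keeps every denominator positive
    for (i, j), r in ratings.items():
        own, other = (i, j) if axis == 0 else (j, i)
        num[own] += r * fixed[other]
        den[own] += fixed[other] ** 2
    return [a / b for a, b in zip(num, den)]


def als_rank1(ratings, n_users, n_items, lam=0.1, iters=20, seed=0):
    rng = random.Random(seed)
    y = [rng.uniform(-1, 1) for _ in range(n_items)]  # initialize Y randomly
    x = [0.0] * n_users
    history = []
    for _ in range(iters):
        x = solve_side(ratings, y, n_users, lam, axis=0)  # fix Y, solve for X
        y = solve_side(ratings, x, n_items, lam, axis=1)  # fix X, solve for Y
        history.append(objective(ratings, x, y, lam))
    return x, y, history


# (user, item) -> rating; users: Alice, Bob, Carol, Dave; items: Rose, Navy, Olive
ratings = {(0, 1): 4.0, (1, 2): 2.0, (2, 0): -1.0, (2, 2): -2.0, (3, 0): 3.0}
x, y, history = als_rank1(ratings, n_users=4, n_items=3)
print(history[0], history[-1])
```

Because each half-step minimizes the objective exactly over one factor, the recorded objective never increases from one sweep to the next. A rank-k version replaces the scalar division with a small k × k linear solve per user or item.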
  • 23. ML & DL for Ads Optimization
  • 24. ML & DL for Ads Optimization: DeepWalk (2014); A Multi-View Deep Learning (2015)
  • 25. ML & DL for Ads Optimization: Wide & Deep Learning Models (Google, 2016); Deep Candidate Generation Model (YouTube, 2016); Session-based Recommendation with RNN (2016)
  • 26. Deep Learning on Apache Spark
      Framework: Spark | MMLSpark | DL4J | SystemML | BigDL
      Vendor: Databricks | Microsoft | DeepLearning4J | Apache | Intel
      Other projects and references:
      TensorFlowOnSpark: https://github.com/yahoo/TensorFlowOnSpark
      DeepDist: http://deepdist.com/
      OpenDL: https://github.com/guoding83128/OpenDL
      CaffeOnSpark: https://github.com/yahoo/CaffeOnSpark
      TensorFrames: https://github.com/databricks/tensorframes
      Dist-keras: https://github.com/cerndb/dist-keras
      Source: Craig Chao, DataConf 2017, Taipei
  • 27. Deep Learning on Apache Spark: Apache SystemML. Apache Top-Level Project; declarative large-scale machine learning; OS: Linux, macOS, Windows; written in Java; open-sourced by IBM in 2015. A machine learning platform optimal for big data
  • 28. Deep Learning on Apache Spark: Apache SystemML. Built-in NN modules: https://github.com/dusenberrymw/systemml-nn/blob/master/nn/examples/mnist_lenet.dml
  • 29. Deep Learning on Apache Spark: Apache SystemML
  • 30. Deep Learning on Apache Spark: MS MMLSpark. Seamless integration of Spark Machine Learning pipelines with Microsoft Cognitive Toolkit (CNTK) and OpenCV. CNTK Model Gallery: https://www.microsoft.com/en-us/cognitive-toolkit/features/model-gallery/ (including GAN, Reinforcement Learning, ResNet152…)
  • 31. Deep Learning on Apache Spark: MS MMLSpark. It implicitly converts the data into the format expected by the algorithm: it tokenizes and hashes strings, one-hot encodes categorical variables, assembles the features into a vector, and so on.
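The implicit featurization described on slide 31 (tokenize and hash strings, one-hot encode categorical variables, assemble everything into one vector) can be sketched by hand. This is not MMLSpark code: the column names (`ad_text`, `device`, `bid`), the bucket count, and the helper names are hypothetical, chosen only to show what such a conversion does to a single row.

```python
import zlib


def hash_tokens(text, n_buckets=8):
    """Tokenize on whitespace and count tokens per hashed bucket (hashing trick)."""
    vec = [0.0] * n_buckets
    for token in text.lower().split():
        # crc32 is deterministic across runs, unlike Python's built-in str hash.
        vec[zlib.crc32(token.encode("utf-8")) % n_buckets] += 1.0
    return vec


def one_hot(value, categories):
    """One-hot encode a categorical value against a fixed, ordered category list."""
    return [1.0 if value == c else 0.0 for c in categories]


def assemble(row, categories, n_buckets=8):
    """Concatenate hashed text, one-hot category, and numeric columns."""
    return (hash_tokens(row["ad_text"], n_buckets)
            + one_hot(row["device"], categories)
            + [float(row["bid"])])


row = {"ad_text": "Buy one get one free", "device": "ios", "bid": 0.25}
features = assemble(row, categories=["android", "ios", "web"])
print(len(features), features)
```

Hashing keeps the text block at a fixed width regardless of vocabulary size, at the cost of occasional collisions between tokens that land in the same bucket.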
  • 32. Deep Learning on Apache Spark: MS MMLSpark ML Pipeline to evaluate CNTK model. Windows Azure Storage Blob
  • 33. Deep Learning on Apache Spark: Databricks. Founded by the creators of Apache Spark; CEO Ali Ghodsi is an adjunct professor at UC Berkeley. Total funding is $100M+. Imports models from TF, MXNet, Keras, PyTorch, Caffe, CNTK, Theano, JCuda
  • 34. Deep Learning on Apache Spark: Databricks
  • 35. Deep Learning on Apache Spark: Databricks. Build a NN model from scratch: easy on a driver-only cluster, complicated on distributed nodes.
  • 36. Deep Learning on Apache Spark: DL4J. DeepLearning4J is a Java-based toolkit for building, training and deploying neural networks. An open-source, distributed deep-learning project in Java and Scala spearheaded by the people at Skymind. ND4J is the Java scientific computing engine powering its matrix manipulations; ND4S is its Scala wrapper. Includes RL and model import from Keras (Theano, Tensorflow, Caffe and CNTK). Machine learning models are served in production with Skymind's model server. Secure, Scalable, Stable, Debuggable, Certified
  • 37. Deep Learning on Apache Spark: DL4J Src: Anatolii(2017)
  • 38. Deep Learning on Apache Spark: BigDL. A distributed deep learning library for Apache Spark released by Intel®. Can load pre-trained Caffe or Torch models. Uses Intel MKL (Intel® Math Kernel Library) and multi-threaded programming in each Spark task
  • 39. Deep Learning on Apache Spark BigDL Build a NN model from scratch
  • 40. Deep Learning on Apache Spark: comparison (columns: BigDL | DL4J | Databricks | MMLSpark | SystemML)
      Vendor: Intel | DeepLearning4J | Databricks | Microsoft | Apache
      Pre-trained models: Caffe/Torch/Tensorflow | Keras, TensorFlow, Caffe and Theano | TF, MXNet, Keras, PyTorch, Caffe, CNTK, Theano, JCuda | CNTK Gallery/Keras | DML/Caffe2DML
      Train a NN from scratch: Y | Y | Y | N | Y / DML
      Notebook: Python/Scala | Scala / Reactive | Python/Scala/R/SQL | Python/Scala | Python/Scala
      Free: Y | N / if model server | N | Y | Y
      Usability: High | High | High | Middle | Low
      Docker: Y | Y / Spark Notebook | N | Y | Y
      Cloud: Y / (AWS, Azure, Cloudera…) | N | Y / AWS | Azure | N
      Source: Craig Chao, DataConf 2017
  • 41. Conclusions: Data Wars; Unified Data Platform; Data Engineers/Developers are key roles; Reusable/Portable ML Pipelines; DL has deep layers of hidden factors; DL models for Ads/RecSys; Code-level intro to DL solutions on Apache Spark
  • 42. chaocraig@gmail.com Slideshare: Craig Chao