YongGang Hu, Chao Xue, IBM
Hyper-Parameter Selection
and Adaptive Model Tuning
for Deep Neural Networks
From Hyper-parameter/Network Search to On-line Tuning
[Figure: deep learning applications – image recognition, object detection, translation, and others – all reduce to minimizing a loss over the model and its hyper-parameters.]
Outline
• Hyper-parameter selection & neural network search
– Bayesian optimization
– Hyperband
– Reinforcement learning
• High performance AutoML
– Transfer AutoML - Using historical hyper-parameter configurations for different
tasks
– Joint optimization with AutoML and finetune
• Training visualization and Interactive tuning
• Real world experience and Implementation
3
From Hyper-parameter/Network Search to On-line Tuning
• Hyper-parameter & network search
  – Criterion and model parameters – example: number of hidden units, number of layers, filter kernel size, max-pool size, weight decay
  – Optimizing procedure parameters – example: learning rate, batch size, momentum, learning rate scheme
• Monitor: real-time monitor for the running application – check the training process via the learning curve, weight/gradient/activation histograms and norms, and the worst cases of training samples; detect overflow, underfitting, overfitting, divergence, and convergence
• Advice: real-time adviser for the running application
• Deep learning applications to optimize: image recognition, object detection, translation, others
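As a concrete illustration of the real-time monitor/adviser idea above, here is a minimal Python sketch of learning-curve checks for overflow, divergence, convergence, and overfitting. The thresholds, window size, and advice strings are illustrative assumptions, not the rules used by the product.

import math

def diagnose(train_loss, val_loss, window=5, tol=1e-3):
    """Return advisory messages for the most recent training iterations."""
    issues = []
    if len(train_loss) < window:
        return issues
    recent = train_loss[-window:]

    # Overflow: the loss became NaN/Inf (e.g. exploding gradients).
    if any(math.isnan(x) or math.isinf(x) for x in recent):
        issues.append("overflow: consider gradient clipping or a lower learning rate")
    # Divergence: the loss keeps rising across the whole window.
    elif all(b > a for a, b in zip(recent, recent[1:])):
        issues.append("divergence: stop early and reduce the learning rate")
    # Convergence: the loss barely changes any more -> early-stop candidate.
    elif max(recent) - min(recent) < tol:
        issues.append("converged: training can be stopped early")

    # Overfitting: validation loss rising while the training loss still falls.
    if len(val_loss) >= window:
        v = val_loss[-window:]
        if v[-1] > v[0] and recent[-1] < recent[0]:
            issues.append("overfitting: add regularization or more data augmentation")
    return issues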
Standard AutoML – from Random Search and Bayesian Optimization to Reinforcement Learning
5
• Adaptive random search (Hyperband)
• Bayesian optimization
• Reinforcement learning
Both the neural network and the hyper-parameters are selected automatically.
[Figure: reinforcement-learning view of the search – states s1…sN, actions u1…uM, and their Q-values.]
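To make the "adaptive random search (Hyperband)" idea concrete, here is a minimal successive-halving sketch, the inner loop of Hyperband [4]. sample_config and train_and_eval are assumed user-supplied callables (draw a random configuration; train it for a given budget and return a validation loss); the default n_configs, budget, and eta values are illustrative.

import random

def successive_halving(sample_config, train_and_eval,
                       n_configs=27, min_budget=100, eta=3):
    configs = [sample_config() for _ in range(n_configs)]
    budget = min_budget
    while len(configs) > 1:
        # Evaluate every surviving configuration with the current budget.
        scored = [(train_and_eval(cfg, budget), cfg) for cfg in configs]
        scored.sort(key=lambda t: t[0])                 # lower loss is better
        # Keep the best 1/eta fraction and give survivors eta times more budget.
        configs = [cfg for _, cfg in scored[:max(1, len(scored) // eta)]]
        budget *= eta
    return configs[0]

# Example with a toy search space (my_train is a hypothetical training entry point):
# best = successive_halving(
#     sample_config=lambda: {"lr": 10 ** random.uniform(-4, -1),
#                            "batch_size": random.choice([32, 64, 128])},
#     train_and_eval=lambda cfg, budget: my_train(cfg, iters=budget))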
Results of AdaRandom and Bayesian Optimization for Object detection
6
• The raw data (no data augmentation)
[Chart: mAP (average, best case, worst case) for the default, AdaRandom (Hyperband), and Bayesian optimization configurations.]
Test environment: Machine IBM P8; GPU Tesla K80; GPU memory 10 GB; CPU ppc64le, 160 cores, 3.5 GHz; memory 55 GB; OS Ubuntu 14.04; Caffe version py-faster-rcnn; dataset user-defined; model VGG16.
§ The average mAP for the three hyper-parameter (HPP) combinations at 4000 iterations is (0.49, 0.52, 0.55); that is, using the HPPs recommended by AdaRandom and Bayesian optimization gains 6% and 12% on average over the default setting. The AdaRandom result has more variance across tries (train-test dataset splits), while the Bayesian models are more stable and also have the better average performance.
§ The pictures below show accuracy over iterations 0–4000, for different tries, under the two HPP configurations. We can see that: 1) training can be early-stopped at about 1300 iterations; 2) performance differs significantly across tries, because in some tries the training split captures the invariances of the test split and in others it does not – data augmentation is needed for stable performance; 3) different HPP combinations (models) may have different sensitivity to the tries.
Evaluations with Acoustic Applications
7
• We implemented the AdaRandom (adaptive random search) and Reinforce (reinforcement-learning search) methods to generate deep learning neural networks automatically.
• We are trying the new methods in different areas; here is an acoustic example. "Default" is the best scheme found by manual tuning.
Neural networks (C: convolution, P: pooling, FC: fully connected):
  Default, manual (9 layers): C(3,32)+C(3,32)+P(3,3)+C(3,64)+C(3,64)+P(3,3)+FC(64)+FC(32)+FC(6)
  AdaRandom generated (6 layers): C(3,128)+C(4,64)+C(3,32)+P(3,3)+FC(16)+FC(6)
  Reinforce generated (5 layers): C(3,32)+P(2,2)+P(3,2)+C(5,32)+FC(6)
• The best results for the three networks are (0.703, 0.673, 0.695) (smaller is better); that is, the AdaRandom- and Reinforce-recommended models gain 4.3% and 1.1% in the best-case comparison. The average results are (0.817, 0.776, 0.763), so the DL Insight-recommended models improve the average-case performance by about 5.0% and 6.6%. From the standard-deviation view, the recommended models are also clearly more stable.
• The CDF (cumulative distribution function) curves compare the three models more intuitively (further left is better). For example, with the Reinforce-recommended model the error rate is below 0.75 with more than 60% probability (frequency), versus only about 30% for the default.
Lower complexity, better accuracy, more stable.
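The compact C/P/FC notation above is easy to turn into a concrete layer specification. Below is a small sketch (not the generator used in the deck) that parses it into a framework-agnostic layer list, reading C(kernel, filters), P(kernel, stride), and FC(units).

import re

def parse_arch(spec):
    """Parse e.g. "C(3,32)+P(2,2)+FC(6)" into a list of layer descriptions."""
    layers = []
    for token in spec.split("+"):
        m = re.fullmatch(r"([A-Z]+)\(([\d,]+)\)", token.strip())
        if not m:
            continue
        kind, args = m.group(1), [int(x) for x in m.group(2).split(",")]
        if kind == "C":
            layers.append({"type": "conv", "kernel": args[0], "filters": args[1]})
        elif kind == "P":
            layers.append({"type": "pool", "kernel": args[0], "stride": args[1]})
        elif kind == "FC":
            layers.append({"type": "fc", "units": args[0]})
    return layers

print(parse_arch("C(3,32)+P(2,2)+P(3,2)+C(5,32)+FC(6)"))   # the Reinforce-generated net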
8
Transfer AutoML Architecture
[Figure: traditional fine-tune vs. standalone AutoML vs. collaborative AutoML. A dataset is mapped into a virtual dataset group (e.g. dog, car, unknown), and a model is decomposed into parameters (weights and bias), the neural network, and hyper-parameters.]
[Figure: Transfer AutoML workflow. Clients upload datasets to the server; white-box and black-box analysis against a knowledge base drives benchmark model selection and virtual dataset grouping; the AutoML process then performs model selection and joint optimization with transfer learning, using RL, Hyperband, and Bayesian search parallelized on Spark.]
Challenges for AutoML with Transfer Learning
• Training on a small user dataset leads to convergence problems → transfer learning is needed.
• When using transfer learning, a pretrained model must be chosen. In computer vision we usually take ImageNet as the base dataset to obtain the initial weights of the pretrained model, but that model does not fit many specific user datasets.
• One workaround is to let the user classify the dataset into predefined categories and train a separate pretrained model per category. This improves transfer-learning performance, but it requires the user's intervention with their datasets.
• Using AutoML with transfer learning can improve transfer-learning performance without user intervention. But given the properties of transfer learning, there are two challenges for AutoML:
  – Because the initial weights are reused, transfer learning limits the search space of AutoML; how to apply AutoML on top of a pretrained model is one question.
  – We cannot afford to run AutoML to build one model for every user dataset; how to reuse the searched model for transfer learning is another question.
10
Joint optimization: AutoML with fine-tune
Search space:
• Solver hyper-parameters:
    lr_policy: LR_POLICY
    stepsize: STEPSIZE
    gamma: GAMMA
    momentum: MOMENTUM
    solver_mode: GPU
    max_iter: MAX_ITER
    test_iter: TEST_ITER
    test_interval: TEST_INTERVAL
    base_lr: BASE_LR
    weight_decay: WEIGHT_DECAY
    solver_type: SGD
• Net template (placeholder tokens define the architecture part of the search space):
    layer {
      param {
        lr_mult: LR_MULT_C_W_0
        decay_mult: DECAY_MULT_C_W_0
      }
      param {
        lr_mult: LR_MULT_C_B_0
        decay_mult: DECAY_MULT_C_B_0
      }
      convolution_param {
        num_output: NUM_OUTPUT_0
        pad: PAD_0
        kernel_size: KERNEL_SIZE_C_0
        group: GROUP_0
        weight_filler {
          type: TYPE_C_W_0
          std: STD_C_W_0
        }
        bias_filler {
          type: TYPE_C_B_0
          std: STD_C_B_0
        }
      }
    }
    layer {
      name: "conv0_relu"
      type: TYPE_C_AF_0
      bottom: "conv0"
      top: "conv0"
    }
    layer {
      name: "pool1"
      type: "Pooling"
      bottom: "conv0"
      top: "pool1"
      pooling_param {
        pool: AVE
        kernel_size: KERNEL_SIZE_P_0
        stride: STRIDE_P_0
      }
    }
    layer {
      name: "last_fc"
      type: "InnerProduct"
      bottom: "pool1"
      top: "last_fc"
      param {
        lr_mult: LR_MULT_FC_W_0
        decay_mult: DECAY_MULT_FC_W_0
      }
      param {
        lr_mult: LR_MULT_FC_B_0
        decay_mult: DECAY_MULT_FC_B_0
      }
      inner_product_param {
        num_output: OUTPUT_NUMS
        weight_filler {
          type: TYPE_FC_W_0
          std: STD_FC_W_0
        }
        bias_filler {
          type: TYPE_FC_B_0
          std: STD_FC_B_0
        }
      }
    }
The joint search space covers: 1) the neural network at the last stage (example: c-p-c-p-fc); and 2) the hyper-parameters listed above. A sampling sketch follows below.
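To illustrate how such a template-based joint search space can be used, here is a minimal sketch (an assumption, not the actual implementation) that samples one configuration and substitutes it into the solver/net templates above. The placeholder names match the templates; the value ranges, the template filename, and the choice of which placeholders to sample are illustrative.

import random

SEARCH_SPACE = {
    "BASE_LR":         lambda: 10 ** random.uniform(-5, -2),
    "WEIGHT_DECAY":    lambda: 10 ** random.uniform(-6, -3),
    "MOMENTUM":        lambda: round(random.uniform(0.8, 0.99), 3),
    "LR_POLICY":       lambda: random.choice(['"step"', '"fixed"']),
    "NUM_OUTPUT_0":    lambda: random.choice([32, 64, 128]),
    "KERNEL_SIZE_C_0": lambda: random.choice([3, 5, 7]),
    "LR_MULT_FC_W_0":  lambda: random.choice([1, 10]),  # learn the new FC layer faster during fine-tune
}

def render(template_text, config):
    # Replace every placeholder token with its sampled value.
    for name, value in config.items():
        template_text = template_text.replace(name, str(value))
    return template_text

config = {name: draw() for name, draw in SEARCH_SPACE.items()}
# solver_txt = render(open("solver.prototxt.template").read(), config)   # hypothetical template file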
Results and Analysis
[Chart: accuracy on five transfer tasks – Food (30 epochs), TV show (10 epochs), Car (50 epochs), Scene (100 epochs), Action (20 epochs) – comparing GoogleCAM+Finetune(ImageNet), GoogleCAM+Finetune(ImageNet + same-category dataset), and AutoML+Finetune(ImageNet).]
Advantages of AutoML with finetune:
(+) Best accuracy
(+) Most stable
(+) No need for separate pretrained models per predefined dataset category; no user intervention.
14
Interactive Tuning of Hyper Parameters – DL Insight
• Expert optimization advice for hyper-parameter selection and tuning
• Traffic-light alerting for required parameter optimization, with early-stop advice and more
• CPU, GPU, and memory utilization info, communication overhead, and more
15
Auto-detection method for the training process (CIFAR-10)
[Learning-curve screenshots: the traditional method reports underfitting; our method detects that the loss is still going down.]
16
Auto-detection method for the training process (MNIST)
[Learning-curve screenshots: the traditional method reports underfitting; our method judges that the training is in good shape.]
17
Interactive Tuning – Example: 1st run of Caffe (CIFAR-10)
We will run Caffe on Spark three times to show DL Insight's capabilities for deep learning.
Interactive Tuning – Example: 1st run of Caffe
18
DL Insight suggests that the user stop the training early because an overflow was detected, and advises setting gradient clipping to 0.1.
Interactive Tuning – Example: 2nd run of Caffe
19
On the second run of Caffe we accept DL Insight's suggestion and set gradient clipping to 0.1. What happens then?
Interactive Tuning – Example: 2nd run of Caffe
20
DL Insight again suggests that the user stop the training early.
Interactive Tuning – Example: 3rd run of Caffe
21
On the third run of Caffe we accept DL Insight's suggestion and add more layers such as ReLU. What happens then?
Interactive Tuning – Example: 3rd run of Caffe
22
No more suggestions. You can do other things or have a rest while the training runs; whenever you feel worried about it, just have a look at DL Insight. :)
Interactive Tuning – Example: "Applause" datasets
Run: applause01
  Hyperparameters in suggestions: base_lr = 0.001, weight_decay = 0.0005, momentum = 0.9, early stop = yes (divergence)
  Parameters: batch size / RPN batch size = 128/256, split ratio = 0.7 (94/39), max_iters = 2000, display = 10
  AP (appl) = 0.3409, mAP = 0.3409, mAP at early-stop iteration (iter = 370) = 0.221060
Good judgement
Learning Curve Trend Judgement – "Applause" (continued)
Run: applause-aug01-01
  Hyperparameters in suggestions: base_lr = 0.001, weight_decay = 0.0005, momentum = 0.9, early stop = yes (divergence)
  Parameters: batch size / RPN batch size = 128/256, split ratio = 0.7 (98/42), max_iters = 2000, display = 10
  AP (appl) = 0.4848, mAP = 0.4848, mAP at early-stop iteration (iter = 350) = 0.313152
Good judgement
Learning Curve Trend Judgement – "Applause" (continued)
Run: applause-aug03-01
  Hyperparameters in suggestions: base_lr = 0.001, weight_decay = 0.0005, momentum = 0.9, early stop = —
  Parameters: batch size / RPN batch size = 128/256, split ratio = 0.7 (164/69), max_iters = 2000, display = 10
  AP (appl) = 0.7847, mAP = 0.7847, mAP at early-stop iteration (iter = 1510) = 0.784068
Good judgement
Hyper-Parameter Search Implementation
26
Search hyper-parameter space:
– Learning rate
– Decay rate
– Batch size
– Optimizer:
  • GradientDescent
  • Adadelta
  • …
– Momentum (for some optimizers)
– LSTM hidden unit size
Search types: Random, Bayesian, and TPE based.
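As a hedged sketch of the Random and TPE search types listed above, the open-source hyperopt library can express this search space directly (a Gaussian-process Bayesian optimizer would come from a different library). train_lstm below is a stand-in objective; in practice it would train one configuration and return its validation loss.

import math
from hyperopt import fmin, tpe, hp, Trials

space = {
    "learning_rate": hp.loguniform("learning_rate", math.log(1e-5), math.log(1e-1)),
    "decay_rate":    hp.loguniform("decay_rate", math.log(1e-6), math.log(1e-2)),
    "batch_size":    hp.choice("batch_size", [32, 64, 128, 256]),
    "optimizer":     hp.choice("optimizer", ["GradientDescent", "Adadelta"]),
    "momentum":      hp.uniform("momentum", 0.5, 0.99),
    "lstm_hidden":   hp.choice("lstm_hidden", [128, 256, 512]),
}

def train_lstm(cfg):
    # Stand-in objective: really this would train the model with cfg and
    # return the validation loss parsed from the training log.
    return (cfg["learning_rate"] - 1e-3) ** 2 + (0.9 - cfg["momentum"]) ** 2

trials = Trials()
best = fmin(fn=train_lstm, space=space,
            algo=tpe.suggest,        # use hyperopt.rand.suggest for random search
            max_evals=50, trials=trials)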
Hyper-Parameter Search Implementation
27
Spark search jobs are generated dynamically and executed in parallel on a multitenant Spark cluster (IBM Spectrum Conductor), using Random, Bayesian, and TPE (Tree-based Parzen Estimator) search.
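The sketch below shows one way such dynamically generated search jobs could be fanned out as parallel Spark tasks; it is an illustration under assumptions, not the IBM Spectrum Conductor implementation, and the toy sample_config/evaluate functions stand in for real training jobs.

import random
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("hpo-search").getOrCreate()
sc = spark.sparkContext

def sample_config():
    # Illustrative search space.
    return {"lr": 10 ** random.uniform(-4, -1), "batch_size": random.choice([64, 128])}

def evaluate(config):
    # In the real system this would launch training for `config` on the executor
    # and parse the loss from the job log; a toy score keeps the sketch runnable.
    loss = (config["lr"] - 0.01) ** 2
    return loss, config

configs = [sample_config() for _ in range(32)]
results = sc.parallelize(configs, numSlices=len(configs)).map(evaluate).collect()
best_loss, best_config = min(results, key=lambda r: r[0])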
Hyper-Parameter Search Implementation
ModelTuningMgrl
startModelAutoTuning()
stopModelAutoTuning()
deleteModelAutoTuning()
TuningTask
(Thread)
run()
TuningJobCtr
runTuningJobs()
SparkTuningJobCtrI
TuningFrameworkMgr frameworkMgr
HPTAlgorithm
TuneInputParam
inputP
TuningJobCtr jobctl
search()
BayesianAlg
RandomAlg
TPEAlg
PythonAlg
UserPlugInAl
g
TuningFrameworkMgr
initialize()
getLossValueFromLog()
prepareJob()
CoSFrameWorkMgrl
TfFrameworkMgrl
PytorchFrameworkMg
rl
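A rough Python rendering of the class structure above (the product code is not shown in the deck, so the method signatures and bodies here are assumptions): a pluggable search-algorithm interface plus a framework-manager interface that the tuning controller drives.

from abc import ABC, abstractmethod

class HPTAlgorithm(ABC):
    """Search-algorithm plug-in: Bayesian, Random, TPE, Python, or user-supplied."""
    @abstractmethod
    def search(self, input_params, job_controller):
        """Propose configurations and submit tuning jobs through the controller."""

class TuningFrameworkMgr(ABC):
    """Adapter for one DL framework (Caffe-on-Spark, TensorFlow, PyTorch, ...)."""
    @abstractmethod
    def initialize(self, tuning_task): ...
    @abstractmethod
    def prepare_job(self, config): ...
    @abstractmethod
    def get_loss_value_from_log(self, job_log): ...

class RandomAlg(HPTAlgorithm):
    def search(self, input_params, job_controller):
        # Draw random configurations and hand them to the (Spark) job controller.
        configs = [input_params.sample() for _ in range(input_params.max_evals)]
        return job_controller.run_tuning_jobs(configs)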
Enterprise Class Deep Learning Solution
IBM Spectrum Conductor Deep Learning Impact, IBM PowerAI, IBM Storage
29
[Stack: Deep Learning Impact (monitoring & reporting, workload management/scheduling, resource management & orchestration, native services management, services & support) over the frameworks in IBM PowerAI (TensorFlow, Caffe), running on Red Hat Linux / x86 and IBM Power Systems with IBM Storage.]
Reference
• [1] David Schaffer, Darrell Whitley and Larry J. Eshelman. Combinations of genetic algorithms and neural networks: A survey of the state of the art. International Workshop on Combinations of Genetic Algorithms and Neural Networks, 1992.
• [2] J. Snoek, H. Larochelle and R. P. Adams. Practical Bayesian optimization of machine learning algorithms. In Advances in Neural Information Processing Systems (NIPS), 2012.
• [3] James Bergstra and Yoshua Bengio. Random search for hyper-parameter optimization. Journal of Machine Learning Research, 2012.
• [4] Lisha Li, Kevin Jamieson, Giulia DeSalvo, et al. Hyperband: Bandit-based configuration evaluation for hyperparameter optimization. ICLR, 2017.
• [5] James Bergstra et al. Algorithms for hyper-parameter optimization. In Advances in Neural Information Processing Systems (NIPS), 2011.
• [6] Bowen Baker, Otkrist Gupta, Nikhil Naik and Ramesh Raskar. Designing neural network architectures using reinforcement learning. ICLR, 2017.
Notices and disclaimers
•© 2018 International Business Machines Corporation. No part of this
document may be reproduced or transmitted in any form without
written permission from IBM.
•U.S. Government Users Restricted Rights — use, duplication or
disclosure restricted by GSA ADP Schedule Contract with IBM.
•Information in these presentations (including information relating to
products that have not yet been announced by IBM) has been
reviewed for accuracy as of the date of initial publication and could
include unintentional technical or typographical errors. IBM shall have
no responsibility to update this information. This document is
distributed “as is” without any warranty, either express or
implied. In no event, shall IBM be liable for any damage arising
from the use of this information, including but not limited to, loss
of data, business interruption, loss of profit or loss of
opportunity. IBM products and services are warranted per the terms
and conditions of the agreements under which they are provided.
•IBM products are manufactured from new parts or new and used
parts.
In some cases, a product may not be new and may have been
previously installed. Regardless, our warranty terms apply.
•Any statements regarding IBM's future direction, intent or
product plans are subject to change or withdrawal without
notice.
•Performance data contained herein was generally obtained in a
controlled, isolated environment. Customer examples are presented
as illustrations of how those customers have used IBM products and
the results they may have achieved. Actual performance, cost, savings or other results in other
operating environments may vary.
•References in this document to IBM products, programs, or services
do not imply that IBM intends to make such products, programs or
services available in all countries in which IBM operates or does
business.
•Workshops, sessions and associated materials may have been
prepared by independent session speakers, and do not necessarily
reflect the views of IBM. All materials and discussions are provided for
informational purposes only, and are neither intended to, nor shall
constitute legal or other guidance or advice to any individual
participant or their specific situation.
•It is the customer’s responsibility to insure its own compliance
with legal requirements and to obtain advice of competent legal
counsel as to the identification and interpretation of any relevant laws
and regulatory requirements that may affect the customer’s business
and any actions the customer may need to take to comply with such
laws. IBM does not provide legal advice or represent or warrant that
its services or products will ensure that the customer follows any law.
31
Notices and disclaimers
continued
•Information concerning non-IBM products was obtained from the
suppliers of those products, their published announcements or other
publicly available sources. IBM has not tested those products in
connection with this publication and cannot confirm the accuracy of performance,
compatibility or any other claims related to non-IBM
products. Questions on the capabilities of non-IBM products should be
addressed to the suppliers of those products. IBM does not warrant
the quality of any third-party products, or the ability of any such third-
party products to interoperate with IBM’s products. IBM expressly
disclaims all warranties, expressed or implied, including but not
limited to, the implied warranties of merchantability and fitness
for a purpose.
•The provision of the information contained herein is not intended to,
and does not, grant any right or license under any IBM patents,
copyrights, trademarks or other intellectual property right.
•IBM, the IBM logo, ibm.com and [names of other referenced IBM
products and services used in the presentation] are trademarks of
International Business Machines Corporation, registered in many
jurisdictions worldwide. Other product and service names might
be trademarks of IBM or other companies. A current list of IBM
trademarks is available on the Web at "Copyright and trademark
information" at: www.ibm.com/legal/copytrade.shtml.
32
Date, Time, Location & Duration Session title and Speaker
Tue, June 5 | 11 AM
2010-2012, 30 mins
Productionizing Spark ML Pipelines with the Portable Format for Analytics
Nick Pentreath (IBM)
Tue, June 5 | 2 PM
2018, 30 mins
Making PySpark Amazing—From Faster UDFs to Dependency Management and Graphing!
Holden Karau (Google) Bryan Cutler (IBM)
Tue, June 5 | 2 PM
Nook by 2001, 30 mins
Making Data and AI Accessible for All
Armand Ruiz Gabernet (IBM)
Tue, June 5 | 2:40 PM
2002-2004, 30 mins
Cognitive Database: An Apache Spark-Based AI-Enabled Relational Database System
Rajesh Bordawekar (IBM T.J. Watson Research Center)
Tue, June 5 | 3:20 PM
3016-3022, 30 mins
Dynamic Priorities for Apache Spark Application’s Resource Allocations
Michael Feiman (IBM Spectrum Computing) Shinnosuke Okada (IBM Canada Ltd.)
Tue, June 5 | 3:20 PM
2001-2005, 30 mins
Model Parallelism in Spark ML Cross-Validation
Nick Pentreath (IBM) Bryan Cutler (IBM)
Tue, June 5 | 3:20 PM
2007, 30 mins
Serverless Machine Learning on Modern Hardware Using Apache Spark
Patrick Stuedi (IBM)
Tue, June 5 | 5:40 PM
2002-2004, 30 mins
Create a Loyal Customer Base by Knowing Their Personality Using AI-Based Personality Recommendation Engine;
Sourav Mazumder (IBM Analytics) Aradhna Tiwari (University of South Florida)
Tue, June 5 | 5:40 PM
2007, 30 mins
Transparent GPU Exploitation on Apache Spark
Dr. Kazuaki Ishizaki (IBM) Madhusudanan Kandasamy (IBM)
Tue, June 5 | 5:40 PM
2009-2011, 30 mins
Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for Deep Neural Networks
Yonggang Hu (IBM) Chao Xue (IBM)
IBM Sessions at Spark+AI Summit 2018 (Tuesday, June 5)
Date, Time, Location & Duration Session title and Speaker
Wed, June 6 | 12:50 PM Birds of a Feather: Apache Arrow in Spark and More
Bryan Cutler (IBM) Li Jin (Two Sigma Investments, LP)
Wed, June 6 | 2 PM
2002-2004, 30 mins
Deep Learning for Recommender Systems
Nick Pentreath (IBM)
Wed, June 6 | 3:20 PM
2018, 30 mins
Bringing an AI Ecosystem to the Domain Expert and Enterprise AI Developer
Frederick Reiss (IBM) Vijay Bommireddipalli (IBM Center for Open-Source Data & AI Technologies)
IBM Sessions at Spark+AI Summit 2018 (Wednesday, June 6)
Meet us at IBM booth in the Expo area.
More Related Content

What's hot

Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016
Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016
Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016MLconf
 
Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"
Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"
Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"Fwdays
 
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016MLconf
 
Building A Machine Learning Platform At Quora (1)
Building A Machine Learning Platform At Quora (1)Building A Machine Learning Platform At Quora (1)
Building A Machine Learning Platform At Quora (1)Nikhil Garg
 
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016MLconf
 
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016MLconf
 
Balancing Automation and Explanation in Machine Learning
Balancing Automation and Explanation in Machine LearningBalancing Automation and Explanation in Machine Learning
Balancing Automation and Explanation in Machine LearningDatabricks
 
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016MLconf
 
Managing and Versioning Machine Learning Models in Python
Managing and Versioning Machine Learning Models in PythonManaging and Versioning Machine Learning Models in Python
Managing and Versioning Machine Learning Models in PythonSimon Frid
 
Challenges on Distributed Machine Learning
Challenges on Distributed Machine LearningChallenges on Distributed Machine Learning
Challenges on Distributed Machine Learningjie cao
 
MLeap: Productionize Data Science Workflows Using Spark
MLeap: Productionize Data Science Workflows Using SparkMLeap: Productionize Data Science Workflows Using Spark
MLeap: Productionize Data Science Workflows Using SparkJen Aman
 
Machine learning pipeline with spark ml
Machine learning pipeline with spark mlMachine learning pipeline with spark ml
Machine learning pipeline with spark mldatamantra
 
Braden Hancock "Programmatically creating and managing training data with Sno...
Braden Hancock "Programmatically creating and managing training data with Sno...Braden Hancock "Programmatically creating and managing training data with Sno...
Braden Hancock "Programmatically creating and managing training data with Sno...Fwdays
 
Ryan Curtin, Principal Research Scientist, Symantec at MLconf ATL 2016
Ryan Curtin, Principal Research Scientist, Symantec at MLconf ATL 2016Ryan Curtin, Principal Research Scientist, Symantec at MLconf ATL 2016
Ryan Curtin, Principal Research Scientist, Symantec at MLconf ATL 2016MLconf
 
Kaz Sato, Evangelist, Google at MLconf ATL 2016
Kaz Sato, Evangelist, Google at MLconf ATL 2016Kaz Sato, Evangelist, Google at MLconf ATL 2016
Kaz Sato, Evangelist, Google at MLconf ATL 2016MLconf
 
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ... MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...Databricks
 
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...Databricks
 
MALT: Distributed Data-Parallelism for Existing ML Applications (Distributed ...
MALT: Distributed Data-Parallelism for Existing ML Applications (Distributed ...MALT: Distributed Data-Parallelism for Existing ML Applications (Distributed ...
MALT: Distributed Data-Parallelism for Existing ML Applications (Distributed ...asimkadav
 
MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
  MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...  MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...Spark Summit
 
Concept Drift: Monitoring Model Quality In Streaming ML Applications
Concept Drift: Monitoring Model Quality In Streaming ML ApplicationsConcept Drift: Monitoring Model Quality In Streaming ML Applications
Concept Drift: Monitoring Model Quality In Streaming ML ApplicationsLightbend
 

What's hot (20)

Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016
Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016
Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016
 
Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"
Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"
Oleksii Moskalenko "Continuous Delivery of ML Pipelines to Production"
 
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
 
Building A Machine Learning Platform At Quora (1)
Building A Machine Learning Platform At Quora (1)Building A Machine Learning Platform At Quora (1)
Building A Machine Learning Platform At Quora (1)
 
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
 
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016
 
Balancing Automation and Explanation in Machine Learning
Balancing Automation and Explanation in Machine LearningBalancing Automation and Explanation in Machine Learning
Balancing Automation and Explanation in Machine Learning
 
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016
 
Managing and Versioning Machine Learning Models in Python
Managing and Versioning Machine Learning Models in PythonManaging and Versioning Machine Learning Models in Python
Managing and Versioning Machine Learning Models in Python
 
Challenges on Distributed Machine Learning
Challenges on Distributed Machine LearningChallenges on Distributed Machine Learning
Challenges on Distributed Machine Learning
 
MLeap: Productionize Data Science Workflows Using Spark
MLeap: Productionize Data Science Workflows Using SparkMLeap: Productionize Data Science Workflows Using Spark
MLeap: Productionize Data Science Workflows Using Spark
 
Machine learning pipeline with spark ml
Machine learning pipeline with spark mlMachine learning pipeline with spark ml
Machine learning pipeline with spark ml
 
Braden Hancock "Programmatically creating and managing training data with Sno...
Braden Hancock "Programmatically creating and managing training data with Sno...Braden Hancock "Programmatically creating and managing training data with Sno...
Braden Hancock "Programmatically creating and managing training data with Sno...
 
Ryan Curtin, Principal Research Scientist, Symantec at MLconf ATL 2016
Ryan Curtin, Principal Research Scientist, Symantec at MLconf ATL 2016Ryan Curtin, Principal Research Scientist, Symantec at MLconf ATL 2016
Ryan Curtin, Principal Research Scientist, Symantec at MLconf ATL 2016
 
Kaz Sato, Evangelist, Google at MLconf ATL 2016
Kaz Sato, Evangelist, Google at MLconf ATL 2016Kaz Sato, Evangelist, Google at MLconf ATL 2016
Kaz Sato, Evangelist, Google at MLconf ATL 2016
 
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ... MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...
 
MALT: Distributed Data-Parallelism for Existing ML Applications (Distributed ...
MALT: Distributed Data-Parallelism for Existing ML Applications (Distributed ...MALT: Distributed Data-Parallelism for Existing ML Applications (Distributed ...
MALT: Distributed Data-Parallelism for Existing ML Applications (Distributed ...
 
MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
  MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...  MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
 
Concept Drift: Monitoring Model Quality In Streaming ML Applications
Concept Drift: Monitoring Model Quality In Streaming ML ApplicationsConcept Drift: Monitoring Model Quality In Streaming ML Applications
Concept Drift: Monitoring Model Quality In Streaming ML Applications
 

Similar to Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for Deep Neural Networks with Yonggang Hu and Chao Xue

Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for De...
Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for De...Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for De...
Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for De...Databricks
 
LLMs for the “GPU-Poor” - Franck Nijimbere.pdf
LLMs for the “GPU-Poor” - Franck Nijimbere.pdfLLMs for the “GPU-Poor” - Franck Nijimbere.pdf
LLMs for the “GPU-Poor” - Franck Nijimbere.pdfGDG Bujumbura
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Universitat Politècnica de Catalunya
 
StackNet Meta-Modelling framework
StackNet Meta-Modelling frameworkStackNet Meta-Modelling framework
StackNet Meta-Modelling frameworkSri Ambati
 
Tensors Are All You Need: Faster Inference with Hummingbird
Tensors Are All You Need: Faster Inference with HummingbirdTensors Are All You Need: Faster Inference with Hummingbird
Tensors Are All You Need: Faster Inference with HummingbirdDatabricks
 
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana CloudUsing SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana CloudSigOpt
 
B4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningB4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningHoa Le
 
Everything you need to know about AutoML
Everything you need to know about AutoMLEverything you need to know about AutoML
Everything you need to know about AutoMLArpitha Gurumurthy
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsScott Clark
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsSigOpt
 
The Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkThe Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkIvo Andreev
 
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving SystemsPRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving SystemsNECST Lab @ Politecnico di Milano
 
AI optimizing HPC simulations (presentation from 6th EULAG Workshop)
AI optimizing HPC simulations (presentation from  6th EULAG Workshop)AI optimizing HPC simulations (presentation from  6th EULAG Workshop)
AI optimizing HPC simulations (presentation from 6th EULAG Workshop)byteLAKE
 
House Price Estimation as a Function Fitting Problem with using ANN Approach
House Price Estimation as a Function Fitting Problem with using ANN ApproachHouse Price Estimation as a Function Fitting Problem with using ANN Approach
House Price Estimation as a Function Fitting Problem with using ANN ApproachYusuf Uzun
 
Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...
Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...
Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...inside-BigData.com
 
Scalable gradientbasedtuningcontinuousregularizationhyperparameters ppt
Scalable gradientbasedtuningcontinuousregularizationhyperparameters pptScalable gradientbasedtuningcontinuousregularizationhyperparameters ppt
Scalable gradientbasedtuningcontinuousregularizationhyperparameters pptRuochun Tzeng
 
Computer Vision for Beginners
Computer Vision for BeginnersComputer Vision for Beginners
Computer Vision for BeginnersSanghamitra Deb
 
Building Continuous Learning Systems
Building Continuous Learning SystemsBuilding Continuous Learning Systems
Building Continuous Learning SystemsAnuj Gupta
 
Toronto meetup 20190917
Toronto meetup 20190917Toronto meetup 20190917
Toronto meetup 20190917Bill Liu
 

Similar to Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for Deep Neural Networks with Yonggang Hu and Chao Xue (20)

Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for De...
Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for De...Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for De...
Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for De...
 
LLMs for the “GPU-Poor” - Franck Nijimbere.pdf
LLMs for the “GPU-Poor” - Franck Nijimbere.pdfLLMs for the “GPU-Poor” - Franck Nijimbere.pdf
LLMs for the “GPU-Poor” - Franck Nijimbere.pdf
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
 
StackNet Meta-Modelling framework
StackNet Meta-Modelling frameworkStackNet Meta-Modelling framework
StackNet Meta-Modelling framework
 
Tensors Are All You Need: Faster Inference with Hummingbird
Tensors Are All You Need: Faster Inference with HummingbirdTensors Are All You Need: Faster Inference with Hummingbird
Tensors Are All You Need: Faster Inference with Hummingbird
 
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana CloudUsing SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
 
B4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningB4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearning
 
Everything you need to know about AutoML
Everything you need to know about AutoMLEverything you need to know about AutoML
Everything you need to know about AutoML
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning Models
 
Using Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning ModelsUsing Bayesian Optimization to Tune Machine Learning Models
Using Bayesian Optimization to Tune Machine Learning Models
 
The Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkThe Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it Work
 
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving SystemsPRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
 
AI optimizing HPC simulations (presentation from 6th EULAG Workshop)
AI optimizing HPC simulations (presentation from  6th EULAG Workshop)AI optimizing HPC simulations (presentation from  6th EULAG Workshop)
AI optimizing HPC simulations (presentation from 6th EULAG Workshop)
 
House Price Estimation as a Function Fitting Problem with using ANN Approach
House Price Estimation as a Function Fitting Problem with using ANN ApproachHouse Price Estimation as a Function Fitting Problem with using ANN Approach
House Price Estimation as a Function Fitting Problem with using ANN Approach
 
Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...
Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...
Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...
 
Scalable gradientbasedtuningcontinuousregularizationhyperparameters ppt
Scalable gradientbasedtuningcontinuousregularizationhyperparameters pptScalable gradientbasedtuningcontinuousregularizationhyperparameters ppt
Scalable gradientbasedtuningcontinuousregularizationhyperparameters ppt
 
C3 w1
C3 w1C3 w1
C3 w1
 
Computer Vision for Beginners
Computer Vision for BeginnersComputer Vision for Beginners
Computer Vision for Beginners
 
Building Continuous Learning Systems
Building Continuous Learning SystemsBuilding Continuous Learning Systems
Building Continuous Learning Systems
 
Toronto meetup 20190917
Toronto meetup 20190917Toronto meetup 20190917
Toronto meetup 20190917
 

More from Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixDatabricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationDatabricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchDatabricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesDatabricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsDatabricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkDatabricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesDatabricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkDatabricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Recently uploaded

Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityAggregage
 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructuresonikadigital1
 
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Guido X Jansen
 
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxDwiAyuSitiHartinah
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...PrithaVashisht1
 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introductionsanjaymuralee1
 
SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024Becky Burwell
 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Vladislav Solodkiy
 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationGiorgio Carbone
 
YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.JasonViviers2
 
MEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .pptMEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .pptaigil2
 
AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)Data & Analytics Magazin
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?sonikadigital1
 
Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxVenkatasubramani13
 
CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionajayrajaganeshkayala
 
5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best PracticesDataArchiva
 
The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerPavel Šabatka
 

Recently uploaded (17)

Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructure
 
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
 
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...
 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introduction
 
SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024
 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023
 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - Presentation
 
YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.
 
MEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .pptMEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .ppt
 
AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?
 
Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptx
 
CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual intervention
 
5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices
 
The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayer
 

Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for Deep Neural Networks with Yonggang Hu and Chao Xue

  • 1. YongGang Hu, Chao Xue, IBM Hyper-Parameter Selection and Adaptive Model Tuning for Deep Neural Networks #AssignedHashtagGoesHere
  • 3. Outline • Hyper-parameter selections & Neural network search – Bayesian optimization – Hyperband – Reinforcement learning • High performance AutoML – Transfer AutoML - Using historical hyper-parameter configurations for different tasks – Joint optimization with AutoML and finetune • Training visualization and Interactive tuning • Real world experience and Implementation 3
  • 4. From Hyper parameter/Network Search to On-line Tuning Monitor Advice Hyper-parameter & Network Search criterion and model parameters optimizing procedure parameters Real-time monitor for running application Real-time adviser for running application overflow underfitting overfitting divergence convergence checkthetraining process learning curve weight/gradient/a ctivation histogram and norm worst cases of training samples example: learning rate, batch size, momentum, learning rate scheme example: number of hidden units, number of layers, filter kernel size, max-pool size, weight decay Image recognition DeepLearningApplications Object detection Translation Others Optimize
  • 5. Standard AutoML--from Random search, Bayesian Optimization to Reinforcement Learning 5 Adaptive Random Search (Hyperband) Reinforceme nt Learning Bayesian Optimization Neural Network auto selected Hyperparameters auto selected Qs1 sN u1 uM
  • 6. Results of AdaRandom and Bayesian Optimization for Object detection 6 • The raw data (no data augment) 0 0.2 0.4 0.6 0.8 avg Best case Worst case mAP default AdaRandom(HyperBand) Bayesian Optimization Machine IBM P8 GPU Tesla K80 GPU MEM 10G CPU ppc64le CPU cores 160 MEM 55G Frequency 3.5 GHz OS Ubuntu 14.04 caffe version Py-faster- rcnn Dataset User-defined Model vgg16 § The average results for the three (HPP) hyperparameters combinations at 4000 iterations are (0.49, 0.52, 0.55), that is, using AdaRandom and Bayesian optimization recommended HPP, you can gain 6% and 12% in the average results comparing to default setting. AdaRandom method has more variance among different tries(train-test dataset split). The Bayesian models are more stable while having better average performance. § The below pictures show accuracy during the 0-4000 iterations, with different tries, under the two HPP configurations. We can see that: 1) It can be early stopped at about 1300 iterations. 2) the performance with different tries differ significantly, it caused by in some tries, training dataset has the invariance property according to test dataset, but some doesn’t have. It need to augment data to gain the stable performance. 3) different HPP combinations(models) may have different sensitivity to the different tries.
  • 7. Neural Network: C: Convolution, P: pooling, FC: Full connection. Default manual: (9 layers) C(3,32)+C(3,32)+P(3,3)+C(3,64)+C(3,64)+P (3,3)+FC(64)+FC(32)+FC(6) AdaRandom generated: (6 layers) C(3,128)+C(4,64)+C(3,32)+P(3,3)+FC(16)+ FC(6) Reinforce generated: (5 layers) C(3,32)+P(2,2)+P(3,2)+C(5,32)+FC(6) Evaluations with Acoustic Applications • The best results for the three networks are (0.703,0.673, 0.695) (the smaller the better), that is, using AdaRandom and Reinforce recommended models, you can gain 4.3% and 1.1% in the best results comparisons. The average result of the three networks is (0.817,0.776, 0.763), that is, the DL Insight recommended modes can increase about 5.0% and 6.6% in the average case performance. And from the standard deviation view, the recommended models are clearly more stable. • The CDF (cumulative distribution function) curve is more intuitive to illustrate the comparison of the three models(the more left the better). For example, using reinforce recommended model, ER has more than 60% probability (frequency) less than 0.75, while the default only has the 30%. 7 • We implement the AdaRandom (adaptive random search scheme) and Reinforce (reinforcement learning search scheme) methods to generate deep learning neural network automatically. • We are trying the new methods in different areas. Here is the example for acoustic. Default is the best scheme by manual tuning. Better accuracy More stableLower complexity
  • 8. [Diagram] Comparison of tuning approaches: traditional fine-tune (a dataset plus model parameters, i.e. weights and biases), standalone AutoML (additionally selecting the neural network architecture and hyper-parameters), and collaborative/transfer AutoML (virtual dataset groups such as dog, car, unknown, so that architectures and hyper-parameters can be reused across datasets).
  • 9. [Architecture diagram] AutoML process: clients upload datasets to the server; the server performs white-box and black-box analysis, benchmark model selection against a knowledge base, grouping into virtual dataset groups, and joint optimization with transfer learning (RL, Hyperband, Bayesian), parallelized on Spark.
  • 10. Challenges for AutoML with Transfer Learning
  • Training on a small user dataset leads to convergence problems → transfer learning is needed.
  • With transfer learning, a pretrained model has to be chosen. In computer vision we usually take ImageNet as the base dataset and use its weights to initialize the pretrained model, but this does not fit many specific user datasets.
  • One workaround is to have the user classify the dataset into predefined categories and train a separate pretrained model per category. This improves transfer-learning performance but requires the user to intervene with their datasets.
  • AutoML combined with transfer learning can improve performance without user intervention, but transfer learning raises two challenges for AutoML (see the sketch after this slide):
  – Because the initial weights are reused, transfer learning limits AutoML's search space; how to apply AutoML on top of a pretrained model is an open question.
  – Building a separate AutoML model for every user dataset is too expensive; how to reuse models across datasets for transfer learning is an open question.
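To make the "reuse the initial weights, search only what remains" idea concrete, here is a minimal sketch of transfer learning with a frozen pretrained backbone and a freshly initialized head, which is the part an AutoML search would then tune. PyTorch/torchvision is used purely for illustration (the deck's experiments use Caffe), and the class count and optimizer settings are assumptions.

    import torch
    import torch.nn as nn
    from torchvision import models

    # Load ImageNet-pretrained weights and freeze them: transfer learning
    # reuses these initial weights, so only the new head is trained/searched.
    backbone = models.resnet18(pretrained=True)
    for p in backbone.parameters():
        p.requires_grad = False

    num_classes = 6                       # assumed size of the user's label set
    backbone.fc = nn.Linear(backbone.fc.in_features, num_classes)  # trainable head

    # Only the head's parameters (and their hyper-parameters) are left to search.
    optimizer = torch.optim.SGD(backbone.fc.parameters(), lr=1e-3,
                                momentum=0.9, weight_decay=5e-4)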
  • 11. Joint optimization: AutoML with fine-tuning
  • 12. Search space, covering 1) the neural network at the last stage (example: c, p, c, p, fc) and 2) the hyper-parameters below, expressed as placeholders in the Caffe solver and network prototxt templates:
    lr_policy: LR_POLICY
    stepsize: STEPSIZE
    gamma: GAMMA
    momentum: MOMENTUM
    solver_mode: GPU
    max_iter: MAX_ITER
    test_iter: TEST_ITER
    test_interval: TEST_INTERVAL
    base_lr: BASE_LR
    weight_decay: WEIGHT_DECAY
    solver_type: SGD

    layer {
      param { lr_mult: LR_MULT_C_W_0 decay_mult: DECAY_MULT_C_W_0 }
      param { lr_mult: LR_MULT_C_B_0 decay_mult: DECAY_MULT_C_B_0 }
      convolution_param {
        num_output: NUM_OUTPUT_0
        pad: PAD_0
        kernel_size: KERNEL_SIZE_C_0
        group: GROUP_0
        weight_filler { type: TYPE_C_W_0 std: STD_C_W_0 }
        bias_filler { type: TYPE_C_B_0 std: STD_C_B_0 }
      }
    }
    layer { name: "conv0_relu" type: TYPE_C_AF_0 bottom: "conv0" top: "conv0" }
    layer {
      name: "pool1" type: "Pooling" bottom: "conv0" top: "pool1"
      pooling_param { pool: AVE kernel_size: KERNEL_SIZE_P_0 stride: STRIDE_P_0 }
    }
    layer {
      name: "last_fc" type: "InnerProduct" bottom: "pool1" top: "last_fc"
      param { lr_mult: LR_MULT_FC_W_0 decay_mult: DECAY_MULT_FC_W_0 }
      param { lr_mult: LR_MULT_FC_B_0 decay_mult: DECAY_MULT_FC_B_0 }
      inner_product_param {
        num_output: OUTPUT_NUMS
        weight_filler { type: TYPE_FC_W_0 std: STD_FC_W_0 }
        bias_filler { type: TYPE_FC_B_0 std: STD_FC_B_0 }
      }
    }
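One way such placeholder templates can be turned into concrete trial configurations is simple string substitution over sampled values. The sketch below is a minimal illustration that covers only a few solver fields; the sampling ranges are assumptions for the example, not the ranges used in the talk.

    import random
    from string import Template

    # Caffe solver template with placeholders like those on the slide.
    SOLVER_TMPL = Template("""base_lr: $BASE_LR
    momentum: $MOMENTUM
    weight_decay: $WEIGHT_DECAY
    lr_policy: "$LR_POLICY"
    gamma: $GAMMA
    stepsize: $STEPSIZE
    max_iter: $MAX_ITER
    solver_mode: GPU
    """)

    def sample_solver():
        # Ranges below are illustrative assumptions.
        cfg = {
            "BASE_LR": 10 ** random.uniform(-4, -1),
            "MOMENTUM": random.uniform(0.8, 0.99),
            "WEIGHT_DECAY": 10 ** random.uniform(-5, -3),
            "LR_POLICY": random.choice(["step", "fixed"]),
            "GAMMA": random.uniform(0.1, 0.9),
            "STEPSIZE": random.choice([1000, 2000, 4000]),
            "MAX_ITER": 4000,
        }
        return SOLVER_TMPL.substitute({k: str(v) for k, v in cfg.items()}), cfg

    solver_prototxt, cfg = sample_solver()
    print(solver_prototxt)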
  • 13. Results and Analysis. [Bar chart: accuracy (0 to 1.2 axis) on the Food (30 epochs), TV show (10 epochs), Car (50 epochs), Scene (100 epochs), and Action (20 epochs) datasets, comparing GoogleCAM+Finetune(ImageNet), GoogleCAM+Finetune(ImageNet + same-category dataset), and AutoML+Finetune(ImageNet).] Advantages of AutoML with fine-tune: (+) best accuracy, (+) most stable, (+) no need for separate pretrained models per predefined dataset category, so no user intervention.
  • 14. Interactive Tuning of Hyper-Parameters with DL Insight: expert optimization advice for hyper-parameter selection and tuning; traffic-light alerting for parameters that need optimization, with early-stop advice and more; CPU, GPU and memory utilization info, communication overhead, and so on.
  • 15. Auto-detection method for the training process (CIFAR-10). [Learning-curve plot] The traditional rule flags the run as underfitting, while our method detects that the loss is still going down.
  • 16. Auto-detection method for the training process (MNIST). [Learning-curve plot] The traditional rule again flags underfitting, while our method judges the run to be fine ("good game"). A sketch of such a learning-curve check follows below.
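As an illustration of how an automatic check like this could work, the sketch below classifies a run from the recent slope of its validation-loss curve. The window size and thresholds are assumptions for the example, not the rules DL Insight actually uses.

    import numpy as np

    def training_state(val_loss, window=20, flat_tol=1e-3):
        # Classify a training run from its validation-loss curve.
        loss = np.asarray(val_loss, dtype=float)
        if not np.all(np.isfinite(loss)):
            return "diverged (NaN/inf in loss)"
        if len(loss) < window:
            return "too early to tell"
        recent = loss[-window:]
        slope = np.polyfit(np.arange(window), recent, 1)[0]   # linear trend
        if slope < -flat_tol:
            return "still improving: keep training"
        if slope > flat_tol:
            return "loss rising: possible divergence or overfitting"
        return "plateaued: consider early stop, or a larger model if underfitting"

    # Example: a loss curve that is still decreasing.
    print(training_state(np.exp(-0.01 * np.arange(200))))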
  • 17. Interactive Tuning – Example – 1st run of Caffe (CIFAR-10). We will run Caffe on Spark three times to show DL Insight's capabilities for deep learning.
  • 18. Interactive Tuning – Example – 1st run of Caffe. DL Insight suggests stopping the training early because an overflow was detected, and advises the user to set clip gradient to 0.1 (a sketch of this check follows after this slide).
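For intuition, here is a minimal sketch of the two ingredients behind that advice: detecting an overflowing loss, and clipping gradients so their global L2 norm stays at 0.1 (in Caffe this kind of advice maps to the solver's clip_gradients setting). The overflow threshold below is an assumption for illustration.

    import numpy as np

    def check_overflow(loss):
        # Flag the kind of overflow DL Insight warns about: NaN/inf or an
        # exploding loss value (the 1e4 threshold is an assumption).
        return (not np.isfinite(loss)) or abs(loss) > 1e4

    def clip_gradients(grads, clip_norm=0.1):
        # Scale all gradients so their global L2 norm is at most clip_norm,
        # mirroring the "clip gradient to 0.1" advice from the slide.
        total_norm = np.sqrt(sum(float(np.sum(g ** 2)) for g in grads))
        if total_norm > clip_norm:
            scale = clip_norm / (total_norm + 1e-12)
            grads = [g * scale for g in grads]
        return grads

    grads = [np.random.randn(3, 3), np.random.randn(10)]
    print(check_overflow(float("nan")), [g.shape for g in clip_gradients(grads)])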
  • 19. Interactive Tuning – Example – 2nd run of Caffe. On the second run of Caffe we accept DL Insight's suggestion and set clip gradient to 0.1. What happens then?
  • 20. Interactive Tuning – Example – 2nd run of Caffe. DL Insight suggests that the user stop the training.
  • 21. Interactive Tuning – Example – 3rd run of Caffe. On the third run of Caffe we accept DL Insight's suggestion and add more layers such as ReLU. What happens then?
  • 22. Interactive Tuning – Example – 3rd run of Caffe. No more suggestions. You can do other things or take a rest while the model trains; whenever you want to check on it, just have a look at DL Insight. ☺
  • 23. Interactive Tuning – Example: "Applause" datasets. Run applause01, hyper-parameters in suggestions: base_lr 0.001, weight_decay 0.0005, momentum 0.9, early stop: yes (divergence), batch size / RPN batch size: 128/256, split ratio 0.7 (94/39), max_iters 2000, display 10. Results: AP for appl = 0.3409, mAP = 0.3409; mAP at early-stop iteration: iter = 370, mAP = 0.221060. Good judgement.
  • 24. Learning Curve Trend Judgement – "Applause" (continued). Run applause-aug01-01, hyper-parameters in suggestions: base_lr 0.001, weight_decay 0.0005, momentum 0.9, early stop: yes (divergence), batch size / RPN batch size: 128/256, split ratio 0.7 (98/42), max_iters 2000, display 10. Results: AP for appl = 0.4848, mAP = 0.4848; mAP at early-stop iteration: iter = 350, mAP = 0.313152. Good judgement.
  • 25. Learning Curve Trend Judgement – "Applause" (continued). Run applause-aug03-01, hyper-parameters in suggestions: base_lr 0.001, weight_decay 0.0005, momentum 0.9, early stop: none, batch size / RPN batch size: 128/256, split ratio 0.7 (164/69), max_iters 2000, display 10. Results: AP for appl = 0.7847, mAP = 0.7847; at iter = 1510, mAP = 0.784068. Good judgement.
  • 26. Hyper-Parameter Search Implementation. Searched hyper-parameter space:
  – Learning rate
  – Decay rate
  – Batch size
  – Optimizer: GradientDescent, Adadelta, …
  – Momentum (for some optimizers)
  – LSTM hidden unit size
  Search types: Random, Bayesian, and TPE based (a sketch of a TPE search over this space follows below).
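As a hedged illustration of what a TPE-based search over that space could look like, here is a sketch using the open-source hyperopt library, one common TPE implementation; the slide does not say which implementation the product uses. The value ranges and the toy train_and_eval stub are assumptions.

    import random
    from hyperopt import fmin, tpe, hp, Trials

    space = {
        "learning_rate": hp.loguniform("learning_rate", -9, -2),   # ~1e-4 to ~0.14
        "decay_rate": hp.loguniform("decay_rate", -12, -5),
        "batch_size": hp.choice("batch_size", [32, 64, 128, 256]),
        "optimizer": hp.choice("optimizer", ["GradientDescent", "Adadelta"]),
        "momentum": hp.uniform("momentum", 0.5, 0.99),
        "lstm_hidden": hp.choice("lstm_hidden", [64, 128, 256]),
    }

    def train_and_eval(cfg):
        # Stand-in for the real training job: should return a validation loss.
        return random.random()

    trials = Trials()
    best = fmin(fn=train_and_eval, space=space, algo=tpe.suggest,
                max_evals=50, trials=trials)
    print("best configuration found:", best)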
  • 27. Hyper-Parameter Search Implementation. Spark search jobs are generated dynamically and executed in parallel on a multitenant Spark cluster managed by IBM Spectrum Conductor; supported search types are Random, Bayesian, and TPE (Tree-structured Parzen Estimator). A sketch of fanning trials out over Spark follows below.
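The following is a minimal PySpark sketch of the "generate trials dynamically, run them in parallel" idea; it is not IBM Spectrum Conductor's actual scheduler, and train_and_eval is a toy stand-in for a real training job.

    import random
    from pyspark import SparkContext

    sc = SparkContext(appName="hpo-random-search")

    def sample_config():
        # Random-search sampler; ranges are illustrative assumptions.
        return {"base_lr": 10 ** random.uniform(-4, -1),
                "weight_decay": 10 ** random.uniform(-5, -3),
                "batch_size": random.choice([64, 128, 256])}

    def train_and_eval(cfg):
        # Stand-in for launching a real training job with cfg; returns a loss.
        return random.random()

    configs = [sample_config() for _ in range(16)]

    # Fan the trials out across the cluster, one task per configuration.
    results = (sc.parallelize(configs, numSlices=len(configs))
                 .map(lambda cfg: (cfg, train_and_eval(cfg)))
                 .collect())
    best_cfg, best_loss = min(results, key=lambda r: r[1])
    print("best:", best_cfg, best_loss)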
  • 28. Hyper-Parameter Search Implementation. [Class diagram] ModelTuningMgrl (startModelAutoTuning(), stopModelAutoTuning(), deleteModelAutoTuning()) drives a TuningTask thread (run()), which uses a TuningJobCtr (runTuningJobs(), implemented by SparkTuningJobCtrI), a TuningFrameworkMgr, and an HPTAlgorithm (holding a TuneInputParam and a TuningJobCtr, with search()), specialized as BayesianAlg, RandomAlg, TPEAlg, PythonAlg, and UserPlugInAlg; TuningFrameworkMgr (initialize(), getLossValueFromLog(), prepareJob()) is implemented by CoSFrameWorkMgrl, TfFrameworkMgrl, and PytorchFrameworkMgrl.
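Purely as a hypothetical Python rendering of the plug-in structure named in that diagram (not the product's actual API), an algorithm plug-in might look like the sketch below, where runTuningJobs is assumed to submit configurations and return their losses, and input_params is assumed to be a simple dict.

    import random
    from abc import ABC, abstractmethod

    class HPTAlgorithm(ABC):
        # Hypothetical sketch of the HPTAlgorithm role from the class diagram.
        def __init__(self, input_params, job_ctl):
            self.input_params = input_params   # plays the TuneInputParam role
            self.job_ctl = job_ctl             # plays the TuningJobCtr role

        @abstractmethod
        def search(self):
            # Propose configurations and submit them via the job controller.
            ...

    class RandomAlg(HPTAlgorithm):
        def search(self):
            n = self.input_params.get("max_trials", 8)
            cfgs = [{"base_lr": 10 ** random.uniform(-4, -1)} for _ in range(n)]
            return self.job_ctl.runTuningJobs(cfgs)   # assumed to return losses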
  • 29. Enterprise-Class Deep Learning Solution: IBM Spectrum Conductor Deep Learning Impact, IBM PowerAI, IBM Storage. [Stack diagram] Deep Learning Impact with TensorFlow and Caffe, on top of monitoring and reporting, workload management/scheduling, resource management and orchestration, native services management, and services and support; running on Red Hat Linux, x86 and IBM Power Systems with IBM Storage and IBM PowerAI.
  • 30. References
  • [1] David Schaffer, Darrell Whitley, and Larry J. Eshelman. Combinations of genetic algorithms and neural networks: A survey of the state of the art. International Workshop on Combinations of Genetic Algorithms and Neural Networks, 1992.
  • [2] J. Snoek, H. Larochelle, and R. P. Adams. Practical Bayesian optimization of machine learning algorithms. Advances in Neural Information Processing Systems (NIPS), 2012.
  • [3] James Bergstra and Yoshua Bengio. Random search for hyper-parameter optimization. Journal of Machine Learning Research, 2012.
  • [4] Lisha Li, Kevin Jamieson, Giulia DeSalvo, et al. Hyperband: Bandit-based configuration evaluation for hyperparameter optimization. ICLR, 2017.
  • [5] James Bergstra et al. Algorithms for hyper-parameter optimization. Advances in Neural Information Processing Systems (NIPS), 2011.
  • [6] Bowen Baker, Otkrist Gupta, Nikhil Naik, and Ramesh Raskar. Designing neural network architectures using reinforcement learning. ICLR, 2017.
  • 31. Notices and disclaimers Think 2018 / 5613A.pdf / March 22, 2018 / © 2018 IBM Corporation •© 2018 International Business Machines Corporation. No part of this document may be reproduced or transmitted in any form without written permission from IBM. •U.S. Government Users Restricted Rights — use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM. •Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. This document is distributed “as is” without any warranty, either express or implied. In no event, shall IBM be liable for any damage arising from the use of this information, including but not limited to, loss of data, business interruption, loss of profit or loss of opportunity. IBM products and services are warranted per the terms and conditions of the agreements under which they are provided. •IBM products are manufactured from new parts or new and used parts. In some cases, a product may not be new and may have been previously installed. Regardless, our warranty terms apply.” •Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice. •Performance data contained herein was generally obtained in a controlled, isolated environments. Customer examples are presented as illustrations of how those •customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary. •References in this document to IBM products, programs, or services does not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business. •Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or their specific situation. •It is the customer’s responsibility to insure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer follows any law. 31
  • 32. Notices and disclaimers continued Think 2018 / 5613A.pdf / March 22, 2018 / © 2018 IBM Corporation •Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products about this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third- party products to interoperate with IBM’s products. IBM expressly disclaims all warranties, expressed or implied, including but not limited to, the implied warranties of merchantability and fitness for a purpose. •The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual property right. •IBM, the IBM logo, ibm.com and [names of other referenced IBM products and services used in the presentation] are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml. 32
  • 33. IBM Sessions at Spark+AI Summit 2018 (Tuesday, June 5); date, time, location and duration; session title; speakers:
  – Tue, June 5 | 11 AM | 2010-2012, 30 mins: Productionizing Spark ML Pipelines with the Portable Format for Analytics; Nick Pentreath (IBM)
  – Tue, June 5 | 2 PM | 2018, 30 mins: Making PySpark Amazing—From Faster UDFs to Dependency Management and Graphing!; Holden Karau (Google), Bryan Cutler (IBM)
  – Tue, June 5 | 2 PM | Nook by 2001, 30 mins: Making Data and AI Accessible for All; Armand Ruiz Gabernet (IBM)
  – Tue, June 5 | 2:40 PM | 2002-2004, 30 mins: Cognitive Database: An Apache Spark-Based AI-Enabled Relational Database System; Rajesh Bordawekar (IBM T.J. Watson Research Center)
  – Tue, June 5 | 3:20 PM | 3016-3022, 30 mins: Dynamic Priorities for Apache Spark Application's Resource Allocations; Michael Feiman (IBM Spectrum Computing), Shinnosuke Okada (IBM Canada Ltd.)
  – Tue, June 5 | 3:20 PM | 2001-2005, 30 mins: Model Parallelism in Spark ML Cross-Validation; Nick Pentreath (IBM), Bryan Cutler (IBM)
  – Tue, June 5 | 3:20 PM | 2007, 30 mins: Serverless Machine Learning on Modern Hardware Using Apache Spark; Patrick Stuedi (IBM)
  – Tue, June 5 | 5:40 PM | 2002-2004, 30 mins: Create a Loyal Customer Base by Knowing Their Personality Using AI-Based Personality Recommendation Engine; Sourav Mazumder (IBM Analytics), Aradhna Tiwari (University of South Florida)
  – Tue, June 5 | 5:40 PM | 2007, 30 mins: Transparent GPU Exploitation on Apache Spark; Dr. Kazuaki Ishizaki (IBM), Madhusudanan Kandasamy (IBM)
  – Tue, June 5 | 5:40 PM | 2009-2011, 30 mins: Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for Deep Neural Networks; Yonggang Hu (IBM), Chao Xue (IBM)
  • 34. IBM Sessions at Spark+AI Summit 2018 (Wednesday, June 6):
  – Wed, June 6 | 12:50 PM: Birds of a Feather: Apache Arrow in Spark and More; Bryan Cutler (IBM), Li Jin (Two Sigma Investments, LP)
  – Wed, June 6 | 2 PM | 2002-2004, 30 mins: Deep Learning for Recommender Systems; Nick Pentreath (IBM)
  – Wed, June 6 | 3:20 PM | 2018, 30 mins: Bringing an AI Ecosystem to the Domain Expert and Enterprise AI Developer; Frederick Reiss (IBM), Vijay Bommireddipalli (IBM Center for Open-Source Data & AI Technologies)
  Meet us at the IBM booth in the Expo area.