Understanding Deep Learning for
Big Data
Le Song
http://www.cc.gatech.edu/~lsong/
College of Computing
Georgia Institute of Technology
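AlexNet: deep convolutional neural networks
[Figure: AlexNet architecture. Input image 224 × 224 × 3; convolution feature maps 55 × 55 × 96, 27 × 27 × 256, 13 × 13 × 384, 13 × 13 × 384, 13 × 13 × 256 with filter sizes 11 × 11, 5 × 5, 3 × 3, 3 × 3, 3 × 3; fully connected layers 4096, 4096; output 1000.]
Convolution layers: 3.7 million parameters
Fully connected layers: 58.6 million parameters
Rectified linear unit: $h(u) = \max\{0, u\}$
$\Pr(y \mid x) \propto \exp\big(W_8 h(W_7 h(W_6 h(W_5 h(W_4 h(W_3 h(W_2 h(W_1 x)))))))\big)$
Image $x$ → Label $y$: cat / bike / …?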
a benchmark image classification problem
~ 1.3 million examples, ~ 1 thousand classes
Training is end-to-end
Minimize negative log-likelihood over $m$ data points $\{(x_i, y_i)\}_{i=1}^{m}$:
$$\min_{f \in \mathcal{F}} R(W_1, \ldots, W_8) := -\frac{1}{m} \sum_{i=1}^{m} \log \Pr(y_i \mid x_i)$$
(Stochastic) gradient descent:
$$W_8^{t+1} = W_8^{t} - \eta \frac{\partial R}{\partial W_8}, \quad \ldots, \quad W_1^{t+1} = W_1^{t} - \eta \frac{\partial R}{\partial W_1}$$
$\Pr(y \mid x) \propto \exp\big(W_8 h(W_7 h(W_6 h(W_5 h(W_4 h(W_3 h(W_2 h(W_1 x)))))))\big)$
AlexNet achieves ~40% top-1 error
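The update rule above can be sketched on a much smaller model. The following is a minimal illustration, assuming a single linear layer with a softmax output instead of the eight-layer network; the toy data, the learning rate, and all function names are hypothetical and not taken from the talk.

```python
import math
import random

def softmax(scores):
    """Turn raw class scores into probabilities Pr(y | x)."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def nll(W, data):
    """Average negative log-likelihood R(W) over the data set."""
    loss = 0.0
    for x, y in data:
        scores = [sum(w * xi for w, xi in zip(row, x)) for row in W]
        loss -= math.log(softmax(scores)[y])
    return loss / len(data)

def sgd_step(W, x, y, eta):
    """One stochastic update W <- W - eta * dR/dW on a single example."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    probs = softmax(scores)
    for k, row in enumerate(W):
        # Gradient of -log Pr(y | x) w.r.t. row k is (p_k - 1[k == y]) * x.
        err = probs[k] - (1.0 if k == y else 0.0)
        for j, xj in enumerate(x):
            row[j] -= eta * err * xj

# Hypothetical toy data: 2-D points plus a constant bias feature, 2 classes.
rng = random.Random(0)
data = []
for _ in range(200):
    label = rng.randrange(2)
    center = -1.0 if label == 0 else 1.0
    data.append(([rng.gauss(center, 0.5), rng.gauss(center, 0.5), 1.0], label))

W = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
loss_before = nll(W, data)
for epoch in range(5):
    rng.shuffle(data)
    for x, y in data:
        sgd_step(W, x, y, eta=0.1)
loss_after = nll(W, data)
```

With all weights at zero the model predicts both classes with equal probability, so `loss_before` equals log 2; the stochastic updates then lower the loss. In the full network the same rule is applied to every $W_1, \ldots, W_8$, with gradients obtained by backpropagation.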
Traditional image features not learned end-to-end
Handcrafted feature extractor (e.g. SIFT)
Divide image into patches
Combine features
Learn classifier
Deep learning not fully understood
[Figure: AlexNet architecture. Input image $x$ of size 224 × 224 × 3; convolution feature maps 55 × 55 × 96, 27 × 27 × 256, 13 × 13 × 384, 13 × 13 × 384, 13 × 13 × 256; fully connected layers 4096, 4096; output 1000.]
Convolution layers (3.7 million parameters) crucial?
Fully connected layers (58.6 million parameters) crucial?
Train end-to-end important?
Rectified linear unit: $h(u) = \max\{0, u\}$
$\Pr(y \mid x) \propto \exp\big(W_8 h(W_7 h(W_6 h(W_5 h(W_4 h(W_3 h(W_2 h(W_1 x)))))))\big)$
Experiments
1. Fully connected layers crucial?
2. Convolution layers crucial?
3. Learn parameters end-to-end crucial?
Kernel methods: alternative nonlinear model
Combination of random basis functions $k(w, x)$:
$$f(x) = \sum_{i=1}^{T} \alpha_i \, k(w_i, x)$$
Example with $T = 7$ and $k(w_i, x) = \exp(-\|w_i - x\|^2)$:
$$f(x) = \sum_{i=1}^{7} \alpha_i \exp(-\|w_i - x\|^2)$$
[Figure: basis functions $k(w_i, x)$ centered at $w_1, \ldots, w_7$ with weights $\alpha_1, \ldots, \alpha_7$, plotted against $x$.]
[Dai et al. NIPS 14]
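A small sketch of the model above may help: the centers $w_i$ are drawn at random and kept fixed, and only the weights $\alpha_i$ are fitted by stochastic gradient descent on a squared loss. The 1-D target, the sampling range, and the step size are assumptions made for illustration; the method of [Dai et al. NIPS 14] additionally samples the random features on the fly, which this sketch does not attempt.

```python
import math
import random

def gaussian_basis(w, x):
    """k(w, x) = exp(-||w - x||^2) for 1-D inputs."""
    return math.exp(-(w - x) ** 2)

def predict(alphas, centers, x):
    """f(x) = sum_i alpha_i * k(w_i, x)."""
    return sum(a * gaussian_basis(w, x) for a, w in zip(alphas, centers))

rng = random.Random(0)
T = 7                                                  # number of basis functions
centers = [rng.uniform(-3.0, 3.0) for _ in range(T)]   # random w_i, kept fixed
alphas = [0.0] * T                                     # only alpha_i is learned

# Hypothetical regression target used only for illustration.
train = [(x / 10.0, math.sin(x / 10.0)) for x in range(-30, 31)]

def mse(alphas):
    """Mean squared error of the current model on the toy data."""
    return sum((predict(alphas, centers, x) - y) ** 2 for x, y in train) / len(train)

error_before = mse(alphas)
eta = 0.05
for epoch in range(200):
    rng.shuffle(train)
    for x, y in train:
        residual = predict(alphas, centers, x) - y
        for i, w in enumerate(centers):
            # d/d alpha_i of 0.5 * residual^2 is residual * k(w_i, x)
            alphas[i] -= eta * residual * gaussian_basis(w, x)
error_after = mse(alphas)
```

Because the basis functions are fixed, the fit is linear in $\alpha$ and the only way to gain accuracy is to add more basis functions.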
Replace fully connected by kernel methods
I. Jointly trained neural nets (AlexNet): all layers learned
$\Pr(y \mid x) \propto \exp\big(W_8 h_7(W_7 h_6(\ldots h_1(W_1 x)))\big)$
II. Fixed neural nets: figure labels "Learn" and "Fix"
III. Scalable kernel methods [Dai et al. NIPS 14]: figure labels "Learn" and "Fix"
Learn classifiers from a benchmark subset of
~ 1.3 million examples, ~ 1 thousand classes
Kernel machine learns faster
ImageNet 1.3M original images, and 1000 classes
Random cropping and mirroring images in streaming fashion
[Figure: test top-1 error (%) versus number of training samples ($10^5$ to $10^8$) for jointly-trained neural net, fixed neural net, and doubly SGD. Annotated error values: 47.8, 44.5, 42.6. Training 1 week using GPU. Random guessing: 99.9% error.]
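The random cropping and mirroring in streaming fashion mentioned above can be sketched as a generator that produces a fresh augmented sample on every draw, which is presumably how the number of training samples can exceed the 1.3M original images. The image representation (a list of pixel rows), the crop size, and the function names are assumptions for illustration.

```python
import random

def random_crop_and_mirror(image, crop_h, crop_w, rng):
    """Return a random crop of `image` (a list of rows), mirrored with prob. 0.5."""
    height, width = len(image), len(image[0])
    top = rng.randrange(height - crop_h + 1)
    left = rng.randrange(width - crop_w + 1)
    crop = [row[left:left + crop_w] for row in image[top:top + crop_h]]
    if rng.random() < 0.5:
        crop = [row[::-1] for row in crop]   # horizontal mirror
    return crop

def augmented_stream(images, crop_h, crop_w, seed=0):
    """Yield an endless stream of augmented samples without storing them."""
    rng = random.Random(seed)
    while True:
        image = rng.choice(images)
        yield random_crop_and_mirror(image, crop_h, crop_w, rng)

# Hypothetical 4 x 4 single-channel "image" standing in for a real photo.
toy_image = [[r * 4 + c for c in range(4)] for r in range(4)]
stream = augmented_stream([toy_image], crop_h=3, crop_w=3)
samples = [next(stream) for _ in range(5)]
```

Each sample is consumed once by the learner and then discarded, so memory use does not grow with the number of augmented samples.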
Similar results with MNIST8M
Classification with handwritten digits
8M images, 10 classes
LeNet5
Similar results with CIFAR10
Classification with internet images
60K images, 10 classes
Experiments
1. Fully connected layers crucial? No
2. Convolution layers crucial?
3. Learn parameters end-to-end crucial?
Kernel methods directly on inputs?
[Figure: bar charts comparing "Fixed convolution" and "Without convolution" on MNIST (2 convolution layers, axis 0 to 1.2), CIFAR10 (2 convolution layers, axis 0 to 40), and ImageNet (5 convolution layers, axis 0 to 100).]
Kernel methods + random convolutions?
[Figure: bar charts comparing "Fixed convolution", "Without convolution", and "Random convolution" on MNIST (2 convolution layers, axis 0 to 1.2) and CIFAR10 (2 convolution layers, axis 0 to 40).]
# random conv ≫ # fixed conv
Structured composition useful
Not just fully connected layers and plain composition:
$$f(x) = h_n(h_{n-1}(\ldots h_1(x)))$$
Structured composition of nonlinear functions:
$$f(x) = h_n\big(h_{n-1}(\ldots (h_1(x_{\text{patch}_1}), h_1(x_{\text{patch}_2}), \ldots, h_1(x_{\text{patch}_m})))\big)$$
The same function $h_1$ is applied to every patch.
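The structured composition above can be made concrete with a tiny 1-D example: one shared function $h_1$ is applied to every patch of the input, and a second layer combines the results. The patch size, the weights, and the two-layer depth are hypothetical choices for illustration only.

```python
def relu(u):
    """Rectified linear unit h(u) = max{0, u}."""
    return max(0.0, u)

def h1(patch, weights):
    """The shared first-layer function: ReLU of a weighted sum over one patch."""
    return relu(sum(w * p for w, p in zip(weights, patch)))

def patches(x, size):
    """All contiguous patches of length `size` from the 1-D input x."""
    return [x[i:i + size] for i in range(len(x) - size + 1)]

def structured_f(x, shared_weights, top_weights):
    """f(x) = h2(h1(x_patch1), ..., h1(x_patchm)) with one shared h1."""
    hidden = [h1(p, shared_weights) for p in patches(x, len(shared_weights))]
    return relu(sum(w * h for w, h in zip(top_weights, hidden)))

# Hypothetical toy input and weights, chosen only to make the example concrete.
x = [1.0, 2.0, 0.0, -1.0, 3.0]
shared_weights = [1.0, -1.0, 0.5]   # 3 parameters reused on every patch
top_weights = [1.0, 1.0, 1.0]
output = structured_f(x, shared_weights, top_weights)
```

Here the first layer has 3 parameters regardless of the input length, whereas a fully connected first layer mapping 5 inputs to 3 hidden units would need 15.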
Experiments
1. Fully connected layers crucial? No
2. Convolution layers crucial? Yes
3. Learn parameters end-to-end crucial?
Lots of random features used
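AlexNet: 58M parameters, error 42.6% (fully connected layers 4096, 4096, 1000 on 13 × 13 × 256 convolution features)
Scalable kernel method: 131M parameters, error 44.5% (131K random features, 1000 outputs, on fixed 13 × 13 × 256 convolution features)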
131M parameters needed?
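AlexNet: 58M parameters, error 42.6% (fully connected layers 4096, 4096, 1000 on 13 × 13 × 256 convolution features)
Scalable kernel method: 32M parameters, error 50.0% (32K random features, 1000 outputs, on fixed 13 × 13 × 256 convolution features)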
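The parameter counts on these slides can be checked with a few lines of arithmetic. This is a rough sketch under two assumptions that the slides do not state: the 13 × 13 × 256 feature map is max-pooled to 6 × 6 × 256 before the first fully connected layer (as in the standard AlexNet), and "131K" and "32K" are read as 131,000 and 32,000 features. Biases are ignored.

```python
# Fully connected part of AlexNet. Assumption: the 13 x 13 x 256 feature map is
# max-pooled to 6 x 6 x 256 = 9216 inputs before the first fully connected layer.
fc_layer_sizes = [6 * 6 * 256, 4096, 4096, 1000]
fc_weights = sum(n_in * n_out
                 for n_in, n_out in zip(fc_layer_sizes, fc_layer_sizes[1:]))

# Kernel machine on top of fixed convolution features: one weight per
# (random feature, class) pair. Feature counts are rounded readings of the slides.
num_classes = 1000
kernel_weights_131k = 131_000 * num_classes
kernel_weights_32k = 32_000 * num_classes

print(f"fully connected: {fc_weights / 1e6:.1f}M")
print(f"131K features:   {kernel_weights_131k / 1e6:.0f}M")
print(f"32K features:    {kernel_weights_32k / 1e6:.0f}M")
```

Under these assumptions the totals agree with the 58M, 131M, and 32M figures quoted on the slides.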
Basis function adaptation crucial
Integrated squared approximation error by $T$ basis functions [Barron '93]:
Error of adapting basis functions $\le \frac{1}{T}$
Error of fixed basis functions $\ge \frac{1}{T^{2/d}}$
Fixed basis functions: $f(x) = \sum_{i=1}^{7} \alpha_i \, k(x_i, x)$
[Figure: seven basis functions $k(x_i, x)$ at $x_1, \ldots, x_7$ with weights $\alpha_1, \ldots, \alpha_7$.]
Adapted basis functions: $f(x) = \sum_{i=1}^{2} \alpha_i \, k_{\theta_i}(x_i, x)$
[Figure: two basis functions $k_{\theta_i}(x_i, x)$ at $x_1, x_2$ with weights $\alpha_1, \alpha_2$.]
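To get a feel for the gap between the two rates, the sketch below evaluates $1/T$ and $1/T^{2/d}$ for a few values of $T$ and $d$. The chosen values are hypothetical and the constants in the bounds are ignored, so only the scaling is meaningful.

```python
def adaptive_bound(T):
    """Upper bound rate 1 / T for T adapted basis functions."""
    return 1.0 / T

def fixed_bound(T, d):
    """Lower bound rate 1 / T^(2/d) for T fixed basis functions in d dimensions."""
    return 1.0 / T ** (2.0 / d)

# Hypothetical settings for illustration; constants in the bounds are ignored.
rows = [(T, d, adaptive_bound(T), fixed_bound(T, d))
        for d in (2, 10, 100)
        for T in (100, 10_000)]
for T, d, adaptive, fixed in rows:
    print(f"T={T:>6} d={d:>3}  adaptive<=~{adaptive:.4f}  fixed>=~{fixed:.4f}")
```

For $d = 2$ the two rates coincide, but as $d$ grows the fixed-basis rate barely improves with $T$, which is the curse of dimensionality that basis adaptation avoids.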
Learning random features helps a lot
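AlexNet: 58M parameters, error 42.6% (fully connected layers 4096, 4096, 1000 on 13 × 13 × 256 convolution features)
Scalable kernel method, learned with basis adaptation: 32M parameters, error 43.7% (32K basis functions, 1000 outputs, on fixed 13 × 13 × 256 convolution features)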
Learning convolution together helps more
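AlexNet: 58M parameters, error 42.6% (fully connected layers 4096, 4096, 1000 on 13 × 13 × 256 convolution features)
Scalable kernel method, learned with basis adaptation and jointly learned convolution layers: 32M parameters, error 41.9% (32K basis functions, 1000 outputs)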
Lesson learned:
Exploit Structure & Train End-to-End
Deep learning over (time-varying) graph
Co-evolutionary features
[Figure: users Christine, Alice, David, and Jacob interacting with items over time.]
Item embedding $f_i(t)$
User embedding $f_u(t)$
User-item interactions evolve over time
Co-evolutionary embedding
[Figure: users Christine, Alice, David, and Jacob and their item interactions.]
Interaction event: $(u_n, i_n, t_n, q_n)$
Initialize item embedding from item raw profile features $f_{i_n}^{0}$:
$$f_{i_n}(t_0) = h\big(V_0 \cdot f_{i_n}^{0}\big)$$
Initialize user embedding from user raw profile features $f_{u_n}^{0}$:
$$f_{u_n}(t_0) = h\big(W_0 \cdot f_{u_n}^{0}\big)$$
Update U2I (User → Item), with terms for evolution, co-evolution, context, and drift:
$$f_{i_n}(t_n) = h\big(V_1 \cdot f_{i_n}(t_n^-) + V_2 \cdot f_{u_n}(t_n^-) + V_3 \cdot q_n + V_4 \cdot (t_n - t_{n-1})\big)$$
Update I2U (Item → User), with terms for evolution, co-evolution, context, and drift:
$$f_{u_n}(t_n) = h\big(W_1 \cdot f_{u_n}(t_n^-) + W_2 \cdot f_{i_n}(t_n^-) + W_3 \cdot q_n + W_4 \cdot (t_n - t_{n-1})\big)$$
[Dai et al. Recsys16]
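The two update rules above can be traced on a single event with 2-D embeddings. This is a minimal sketch, not the authors' implementation: the parameter values, the names `Alice` and `item_a`, and the use of one shared previous event time are assumptions made for illustration, and $h$ is taken to be the rectified linear unit.

```python
def relu(u):
    """Rectified linear unit, used here as the nonlinearity h."""
    return max(0.0, u)

def matvec(M, v):
    """Matrix-vector product for small list-of-lists matrices."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def update(A1, A2, A3, a4, own, other, context, dt):
    """h(A1 * own + A2 * other + A3 * context + a4 * dt), applied element-wise."""
    terms = zip(matvec(A1, own), matvec(A2, other), matvec(A3, context), a4)
    return [relu(e + c + q + drift * dt) for e, c, q, drift in terms]

# Hypothetical 2-D embeddings and parameters, for illustration only.
identity = [[1.0, 0.0], [0.0, 1.0]]
half = [[0.5, 0.0], [0.0, 0.5]]
V = dict(A1=identity, A2=half, A3=half, a4=[0.1, 0.1])   # item-side V1..V4
W = dict(A1=identity, A2=half, A3=half, a4=[0.1, 0.1])   # user-side W1..W4

f_user = {"Alice": [1.0, 0.0]}
f_item = {"item_a": [0.0, 1.0]}
last_time = 0.0   # simplification: one shared previous event time

# One event (u_n, i_n, t_n, q_n); both updates read the embeddings just before t_n.
user, item, t_n, q_n = "Alice", "item_a", 2.0, [0.2, 0.2]
user_before, item_before = f_user[user], f_item[item]
dt = t_n - last_time
f_item[item] = update(**V, own=item_before, other=user_before, context=q_n, dt=dt)
f_user[user] = update(**W, own=user_before, other=item_before, context=q_n, dt=dt)
```

Both embeddings are computed from the values held just before $t_n$, so the item update does not see the already-updated user embedding and vice versa.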
Deep learning with time-varying computation graph
[Figure: interaction events on a time axis $t_0, t_1, t_2, t_3$, grouped into mini-batch 1.]
Computation graph of RNN determined by
1. The bipartite interaction graph
2. The temporal ordering of events
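One way to see how these two ingredients determine the computation graph is to sort the events by time and record, for each event, the earlier events whose outputs it needs. The event list and the function name below are hypothetical, and the sketch assumes user and item names do not collide.

```python
# Hypothetical interaction events (user, item, time); names are illustrative.
events = [
    ("Jacob", "item_b", 3.0),
    ("Alice", "item_a", 1.0),
    ("David", "item_a", 2.0),
    ("Alice", "item_b", 4.0),
]

def build_dependencies(events):
    """For each event, find the earlier events whose embeddings it must read."""
    ordered = sorted(events, key=lambda e: e[2])   # temporal ordering
    last_event_of = {}                             # node name -> index in `ordered`
    deps = []
    for n, (user, item, _) in enumerate(ordered):
        # The update at event n reads the latest user and item embeddings,
        # i.e. the outputs of the previous events touching either node.
        parents = {last_event_of[node]
                   for node in (user, item) if node in last_event_of}
        deps.append(sorted(parents))
        last_event_of[user] = n
        last_event_of[item] = n
    return ordered, deps

ordered, deps = build_dependencies(events)
```

The resulting dependency lists change whenever the interaction data change, so the unrolled network differs from one mini-batch to the next.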
Much improved prediction on Reddit dataset
1,000 users, 1,403 groups, ~10K interactions
[Figure: two panels, next item prediction and return time prediction.]
MAR: mean absolute rank difference
MAE: mean absolute error (hours)
Predicting efficiency of solar panel materials
Predict Power Conversion Efficiency (PCE, 0 to 12%) of organic solar panel materials
Dataset: Harvard Clean Energy Project
Data point #: 2.3 million
Type: Molecule
Atom type: 6
Avg node #: 28
Avg edge #: 33
Structure2Vec
[Figure: a graph $\chi$ with nodes $X_1, \ldots, X_6$. Each node $j$ carries an embedding $\mu_j^{(t)}$, starting from $\mu_j^{(0)}$ and updated from its neighbors in iteration 1 through iteration $T$.]
Aggregate: $\mu_a(W, \chi) = \mu_1^{(T)} + \mu_2^{(T)} + \cdots$
Label $y$: classification/regression with parameter $V$
[Dai et al. ICML 16]
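The iterate-then-aggregate scheme in the figure can be sketched with scalar embeddings. This is a simplified illustration, not the formulation of [Dai et al. ICML 16]: the update form (ReLU of the node feature plus the weighted sum of neighbor embeddings), the scalar parameters `w_self`, `w_neigh`, and `v`, and the toy graph are all assumptions.

```python
def relu(u):
    """Rectified linear unit, used here as the nonlinearity."""
    return max(0.0, u)

def structure2vec(node_features, edges, w_self, w_neigh, v, num_iterations):
    """Embed a graph with scalar node embeddings and predict a scalar label."""
    neighbors = {i: [] for i in node_features}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    mu = {i: 0.0 for i in node_features}           # mu_i^(0)
    for _ in range(num_iterations):
        # Each node combines its own feature with its neighbors' embeddings
        # from the previous iteration (synchronous update).
        mu = {i: relu(w_self * node_features[i]
                      + w_neigh * sum(mu[j] for j in neighbors[i]))
              for i in node_features}
    graph_embedding = sum(mu.values())             # aggregate mu_a(W, chi)
    return v * graph_embedding                     # label y with parameter V

# Hypothetical 3-node path graph 1 - 2 - 3 with scalar node features.
features = {1: 1.0, 2: 2.0, 3: 1.0}
edges = [(1, 2), (2, 3)]
prediction = structure2vec(features, edges, w_self=1.0, w_neigh=0.5, v=0.1,
                           num_iterations=2)
```

The same few parameters are reused at every node and every iteration, so the parameter count does not depend on the size of the molecule.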
Improved prediction with small model
Structure2vec gets ~4% relative error with 10,000 times smaller model!
10% data for testing

Method           Test MAE   Test RMSE   # parameters
Mean predictor   1.986      2.406       1
WL level-3       0.143      0.204       1.6 million
WL level-6       0.096      0.137       1378 million
structure2vec    0.085      0.117       0.1 million
Take Home Message:
Deep fully connected layers not the key
Exploit structure (CNN, Coevolution,
Structure2vec)
Train end-to-end

Le Song, Assistant Professor, College of Computing, Georgia Institute of Technology, at MLconf ATL 2016
ๆทฑๅบฆๅญธ็ฟ’ๅœจAOI็š„ๆ‡‰็”จ
ย 
Batch normalization presentation
Batch normalization presentationBatch normalization presentation
Batch normalization presentation
ย 
Backpropagation - Elisa Sayrol - UPC Barcelona 2018
Backpropagation - Elisa Sayrol - UPC Barcelona 2018Backpropagation - Elisa Sayrol - UPC Barcelona 2018
Backpropagation - Elisa Sayrol - UPC Barcelona 2018
ย 
2020 12-2-detr
2020 12-2-detr2020 12-2-detr
2020 12-2-detr
ย 
COCOA: Communication-Efficient Coordinate Ascent
COCOA: Communication-Efficient Coordinate AscentCOCOA: Communication-Efficient Coordinate Ascent
COCOA: Communication-Efficient Coordinate Ascent
ย 
๋”ฅ๋Ÿฌ๋‹ ์ค‘๊ธ‰ - AlexNet๊ณผ VggNet (Basic of DCNN : AlexNet and VggNet)
๋”ฅ๋Ÿฌ๋‹ ์ค‘๊ธ‰ - AlexNet๊ณผ VggNet (Basic of DCNN : AlexNet and VggNet)๋”ฅ๋Ÿฌ๋‹ ์ค‘๊ธ‰ - AlexNet๊ณผ VggNet (Basic of DCNN : AlexNet and VggNet)
๋”ฅ๋Ÿฌ๋‹ ์ค‘๊ธ‰ - AlexNet๊ณผ VggNet (Basic of DCNN : AlexNet and VggNet)
ย 

More from MLconf

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...MLconf
ย 
Ted Willke - The Brainโ€™s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brainโ€™s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brainโ€™s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brainโ€™s Guide to Dealing with Context in Language UnderstandingMLconf
ย 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...MLconf
ย 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushMLconf
ย 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceMLconf
ย 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...MLconf
ย 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimerโ€™s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimerโ€™s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimerโ€™s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimerโ€™s Disea...MLconf
ย 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMLconf
ย 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionMLconf
ย 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLMLconf
ย 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksMLconf
ย 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...MLconf
ย 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldMLconf
ย 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...MLconf
ย 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...MLconf
ย 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...MLconf
ย 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeMLconf
ย 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...MLconf
ย 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareMLconf
ย 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesMLconf
ย 

More from MLconf (20)

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
ย 
Ted Willke - The Brainโ€™s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brainโ€™s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brainโ€™s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brainโ€™s Guide to Dealing with Context in Language Understanding
ย 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
ย 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
ย 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious Experience
ย 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
ย 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimerโ€™s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimerโ€™s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimerโ€™s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimerโ€™s Disea...
ย 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the Cheap
ย 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data Collection
ย 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of ML
ย 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
ย 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
ย 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
ย 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
ย 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
ย 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
ย 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to code
ย 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
ย 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better Software
ย 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
ย 

Recently uploaded

Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
ย 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1DianaGray10
ย 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
ย 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-pyJamie (Taka) Wang
ย 
201610817 - edge part1
201610817 - edge part1201610817 - edge part1
201610817 - edge part1Jamie (Taka) Wang
ย 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureEric D. Schabell
ย 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
ย 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarPrecisely
ย 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
ย 
Meet the new FSP 3000 M-Flex800โ„ข
Meet the new FSP 3000 M-Flex800โ„ขMeet the new FSP 3000 M-Flex800โ„ข
Meet the new FSP 3000 M-Flex800โ„ขAdtran
ย 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
ย 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfDaniel Santiago Silva Capera
ย 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding TeamAdam Moalla
ย 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
ย 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPathCommunity
ย 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAshyamraj55
ย 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
ย 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
ย 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
ย 

Recently uploaded (20)

Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
ย 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
ย 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
ย 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-py
ย 
201610817 - edge part1
201610817 - edge part1201610817 - edge part1
201610817 - edge part1
ย 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability Adventure
ย 
20150722 - AGV
20150722 - AGV20150722 - AGV
20150722 - AGV
ย 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
ย 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity Webinar
ย 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
ย 
Meet the new FSP 3000 M-Flex800โ„ข
Meet the new FSP 3000 M-Flex800โ„ขMeet the new FSP 3000 M-Flex800โ„ข
Meet the new FSP 3000 M-Flex800โ„ข
ย 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
ย 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
ย 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
ย 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
ย 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
ย 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
ย 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
ย 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
ย 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
ย 

Le Song, Assistant Professor, College of Computing, Georgia Institute of Technology at MLconf ATL 2016

  • 1. Understanding Deep Learning for Big Data. Le Song (http://www.cc.gatech.edu/~lsong/), College of Computing, Georgia Institute of Technology.
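  • 2. AlexNet: deep convolutional neural networks. [Figure: AlexNet architecture. A 224×224×3 input image passes through five convolution layers (filters 11×11, 5×5, 3×3, 3×3, 3×3; feature maps 55×55×96, 27×27×256, 13×13×384, 13×13×384, 13×13×256), then fully connected layers of 4096, 4096, and 1000 units.] Convolution layers: 3.7 million parameters; fully connected layers: 58.6 million parameters. Rectified linear unit: $h(u) = \max\{0, u\}$. Model: $\Pr(y \mid x) \propto \exp\big(W_8 h(W_7 h(W_6 h(W_5 h(W_4 h(W_3 h(W_2 h(W_1 x)))))))\big)$, mapping image $x$ to label $y$ (cat / bike / …?).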
  • 3. ImageNet: a benchmark image classification problem with ~1.3 million examples and ~1 thousand classes.
  • 4. Training is end-to-end. Minimize the negative log-likelihood over $m$ data points $\{(x_i, y_i)\}_{i=1}^{m}$: $\min_{f \in \mathcal{F}} R(W_1, \ldots, W_8) := -\frac{1}{m} \sum_{i=1}^{m} \log \Pr(y_i \mid x_i)$, with $\Pr(y \mid x) \propto \exp\big(W_8 h(W_7 h(W_6 h(W_5 h(W_4 h(W_3 h(W_2 h(W_1 x)))))))\big)$. (Stochastic) gradient descent: $W_8^{t+1} = W_8^{t} - \eta \, \frac{\partial R}{\partial W_8}$, …, $W_1^{t+1} = W_1^{t} - \eta \, \frac{\partial R}{\partial W_1}$. AlexNet achieves ~40% top-1 error.
  • 5. Traditional image features are not learned end-to-end. Pipeline components: handcrafted feature extractor (e.g. SIFT); divide image into patches; combine features; learn classifier.
  • 6. Deep learning not fully understood. [Figure: the AlexNet architecture of slide 2 (convolution layers with 3.7 million parameters, fully connected layers with 58.6 million parameters), with rectified linear unit $h(u) = \max\{0, u\}$ and $\Pr(y \mid x) \propto \exp\big(W_8 h(W_7 h(W_6 h(W_5 h(W_4 h(W_3 h(W_2 h(W_1 x)))))))\big)$ for image $x$.] Open questions: Fully connected layers crucial? Convolution layers crucial? Train end-to-end important?
  • 7. Experiments: (1) Fully connected layers crucial? (2) Convolution layers crucial? (3) Learn parameters end-to-end crucial?
  • 8. Kernel methods: alternative nonlinear model. Combination of random basis functions $k(w, x)$: $f(x) = \sum_{i=1}^{T} \alpha_i \, k(w_i, x)$, with $k(w_i, x) = \exp(-\|w_i - x\|^2)$. [Figure: example with $T = 7$, $f(x) = \sum_{i=1}^{7} \alpha_i \exp(-\|w_i - x\|^2)$, showing basis functions centered at $w_1, \ldots, w_7$ with weights $\alpha_1, \ldots, \alpha_7$.] [Dai et al. NIPS 14]
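  • 9. Replace fully connected layers by kernel methods. Three settings: (I) jointly trained neural nets (AlexNet), $\Pr(y \mid x) \propto \exp\big(W_8 h_7(W_7 h_6(\ldots h_1(W_1 x)))\big)$, labeled Learn; (II) fixed neural nets, with one part labeled Learn and one part labeled Fix; (III) scalable kernel methods [Dai et al. NIPS 14], with one part labeled Learn and one part labeled Fix.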
  • 10. Learn classifiers from a benchmark subset of ImageNet: ~1.3 million examples, ~1 thousand classes.
  • 11. Kernel machine learns faster. ImageNet: 1.3M original images and 1000 classes. Random cropping and mirroring of images in streaming fashion. Training: 1 week using GPU. [Plot: test top-1 error (%) versus number of training samples ($10^5$ to $10^8$) for the jointly-trained neural net, the fixed neural net, and doubly SGD; labeled error values 47.8, 44.5, and 42.6; random guessing gives 99.9% error.]
  • 12. Similar results with MNIST8M. Classification with handwritten digits: 8M images, 10 classes. LeNet5.
  • 13. Similar results with CIFAR10. Classification with internet images: 60K images, 10 classes.
  • 14. Experiments: (1) Fully connected layers crucial? No. (2) Convolution layers crucial? (3) Learn parameters end-to-end crucial?
  • 15. Kernel methods directly on inputs? [Bar charts comparing fixed convolution against no convolution on MNIST (2 convolution layers), CIFAR10 (2 convolution layers), and ImageNet (5 convolution layers).]
  • 16. Kernel methods + random convolutions? [Bar charts comparing fixed convolution, no convolution, and random convolution on MNIST (2 convolution layers) and CIFAR10 (2 convolution layers).] # random conv ≫ # fixed conv.
  • 17. Structured composition useful. Not just fully connected layers and plain composition, $f(x) = h_n(h_{n-1}(\ldots h_1(x)))$, but structured composition of nonlinear functions, $f(x) = h_n\big(h_{n-1}\big(\ldots \big(h_1(x_{\text{patch}_1}), h_1(x_{\text{patch}_2}), \ldots, h_1(x_{\text{patch}_m})\big)\big)\big)$, where the same function $h_1$ is applied to every patch.
  • 18. Experiments: (1) Fully connected layers crucial? No. (2) Convolution layers crucial? Yes. (3) Learn parameters end-to-end crucial?
  • 19. Lots of random features used. AlexNet: 58M parameters (256×13×13 convolution features, then fully connected layers of 4096, 4096, and 1000 units), error 42.6%. Scalable kernel method: 131M parameters (fixed 256×13×13 convolution features, 131K random features, 1000 outputs), error 44.5%.
  • 20. 131M parameters needed? AlexNet: 58M parameters, error 42.6%. Scalable kernel method with 32K random features on fixed 256×13×13 convolution features: 32M parameters, error 50.0%.
  • 21. Basis function adaptation crucial. Integrated squared approximation error with $T$ basis functions [Barron '93]: error of adapting basis functions $\le \frac{1}{T}$; error of fixed basis functions $\ge \frac{1}{T^{2/d}}$. [Figure: a fixed basis needs seven functions, $f(x) = \sum_{i=1}^{7} \alpha_i \, k(x_i, x)$ with centers $x_1, \ldots, x_7$ and weights $\alpha_1, \ldots, \alpha_7$; an adapted basis needs two, $f(x) = \sum_{i=1}^{2} \alpha_i \, k_{\theta_i}(x_i, x)$ with centers $x_1, x_2$ and weights $\alpha_1, \alpha_2$.]
  • 22. Learning random features helps a lot. AlexNet: 58M parameters, error 42.6%. Scalable kernel method with 32K features, learned with basis adaptation on fixed 256×13×13 convolution features: 32M parameters, error 43.7%.
  • 23. Learning convolution together helps more. AlexNet: 58M parameters, error 42.6%. Scalable kernel method with 32K features, learned with basis adaptation and with the convolution layers jointly learned: 32M parameters, error 41.9%.
  • 24. Lesson learned: Exploit Structure & Train End-to-End. Deep learning over (time-varying) graph.
  • 25–30. Co-evolutionary features. Users: Christine, Alice, David, Jacob. Item embedding $f_i(t)$; user embedding $f_u(t)$. User-item interactions evolve over time …
  • 31. Co-evolutionary embedding. Users: Christine, Alice, David, Jacob. Each event is a tuple $(u_n, i_n, t_n, q_n)$. Initialize item embedding from item raw profile features: $f_{i_n}(t_0) = h(V_0 \cdot f_{i_n}^{0})$. Initialize user embedding from user raw profile features: $f_{u_n}(t_0) = h(W_0 \cdot f_{u_n}^{0})$. Update U2I (user → item): $f_{i_n}(t_n) = h\big(V_1 \cdot f_{i_n}(t_n^-) + V_2 \cdot f_{u_n}(t_n^-) + V_3 \cdot q_n + V_4 \cdot (t_n - t_{n-1})\big)$. Update I2U (item → user): $f_{u_n}(t_n) = h\big(W_1 \cdot f_{u_n}(t_n^-) + W_2 \cdot f_{i_n}(t_n^-) + W_3 \cdot q_n + W_4 \cdot (t_n - t_{n-1})\big)$. In each update the four terms are, in order: evolution, co-evolution, context, and drift. [Dai et al. Recsys16]
  • 32. Deep learning with time-varying computation graph. [Figure: events on a time axis at $t_0, t_1, t_2, t_3$, grouped into mini-batch 1.] Computation graph of RNN determined by: (1) the bipartite interaction graph; (2) the temporal ordering of events.
  • 33. Much improved prediction on Reddit dataset. Tasks: next item prediction and return time prediction. 1,000 users, 1,403 groups, ~10K interactions. MAR: mean absolute rank difference. MAE: mean absolute error (hours).
  • 34. Predicting efficiency of solar panel materials. Task: predict the Power Conversion Efficiency (PCE, 0–12%) of organic solar panel materials. Dataset: Harvard Clean Energy Project; data points: 2.3 million; type: molecule; atom types: 6; avg. nodes: 28; avg. edges: 33.
  • 35. Structure2Vec. [Figure: graph $\chi$ with nodes $X_1, \ldots, X_6$; node embeddings start at $\mu_i^{(0)}$, are updated to $\mu_i^{(1)}$ in iteration 1, and reach $\mu_i^{(T)}$ in iteration $T$.] Aggregate: $\mu_1^{(T)} + \mu_2^{(T)} + \cdots = \mu_a(W, \chi)$. Label $y$: classification/regression with parameter $V$. [Dai et al. ICML 16]
  • 36. Improved prediction with small model. Structure2vec gets ~4% relative error with a 10,000 times smaller model! 10% of the data used for testing. Results (test MAE / test RMSE / # parameters): mean predictor 1.986 / 2.406 / 1; WL level-3 0.143 / 0.204 / 1.6 m; WL level-6 0.096 / 0.137 / 1378 m; structure2vec 0.085 / 0.117 / 0.1 m.
  • 37. Take Home Message: deep fully connected layers are not the key; exploit structure (CNN, co-evolution, structure2vec); train end-to-end.
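
The co-evolutionary updates on slides 31 and 32 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the dimensions, the random parameter values, and the helper names `init_embeddings` and `run_events` are hypothetical, and $t_{n-1}$ is taken to be the time of the previous event in the stream. The nonlinearity $h$ is the rectified linear unit defined earlier in the deck.

```python
import numpy as np

def h(u):
    """Rectified linear unit, h(u) = max{0, u}."""
    return np.maximum(0.0, u)

rng = np.random.default_rng(0)
d, p, c = 8, 5, 3  # embedding, raw-profile and context sizes (hypothetical)

# Hypothetical random parameters; in the slides they are learned end-to-end.
W0, V0 = rng.normal(size=(d, p)), rng.normal(size=(d, p))
W1, W2, V1, V2 = (rng.normal(size=(d, d)) * 0.1 for _ in range(4))
W3, V3 = rng.normal(size=(d, c)) * 0.1, rng.normal(size=(d, c)) * 0.1
W4, V4 = rng.normal(size=d) * 0.1, rng.normal(size=d) * 0.1

def init_embeddings(user_profiles, item_profiles):
    """f_u(t0) = h(W0 . f_u^0) and f_i(t0) = h(V0 . f_i^0)."""
    f_u = {u: h(W0 @ x) for u, x in user_profiles.items()}
    f_i = {i: h(V0 @ x) for i, x in item_profiles.items()}
    return f_u, f_i

def run_events(events, f_u, f_i, t0=0.0):
    """Apply the U2I and I2U updates to events (u, i, t, q) in temporal order.

    The bipartite interaction graph decides which embeddings meet;
    the time ordering decides in which order the updates are chained.
    """
    last_t = t0  # assumption: t_{n-1} is the previous event in the stream
    for u, i, t, q in sorted(events, key=lambda e: e[2]):
        dt = t - last_t
        fu_prev, fi_prev = f_u[u], f_i[i]  # embeddings just before t_n
        # Terms in order: evolution, co-evolution, context, drift.
        f_i[i] = h(V1 @ fi_prev + V2 @ fu_prev + V3 @ q + V4 * dt)
        f_u[u] = h(W1 @ fu_prev + W2 @ fi_prev + W3 @ q + W4 * dt)
        last_t = t
    return f_u, f_i
```

Both updates read the embeddings from just before $t_n$, so the item update does not leak into the user update of the same event. Users and items that do not take part in an event keep their embeddings unchanged.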

Editor's Notes

  1. Why the performance rather than interpret the results
  2. The task: classification (maybe one slide)
  3. Have one slide for the neural networks.
  4. The task: classification (maybe one slide)
  5. The actual classification number. Not improving, finish it. Make the meaning of convergence clearer: for the same number of samples, fewer errors; for the same error, fewer samples. Emphasize what is meant by scalable (compare to alternative methods).
  6. Take features from the last pooling layer. LeNet-5 [LeCun’12]
  7. H(x) the same line!!! Too busy!!! Remove the top. Smaller figure. Fewer gs.
  8. Need theory cited. Lower bound.
  9. Here we tried a large dataset, where the task is to predict the power conversion efficiency of molecules. Accurate prediction is essential for screening new forms of energy and materials. The dataset we used consists of 2.3 million samples from the Harvard Clean Energy Project. The figure here shows the PCE range is from 0 to 11
  10. Now is the time to put them together. We start with zero embeddings and then perform one step of the fixed-point equation update. For example, to update mu_2, we use its neighbors' embeddings and its input features. Similarly, we can get updates for all other posterior marginal embeddings. As in traditional graphical model inference, we need to iterate the fixed-point update several times. Intuitively, this allows each embedding to capture more and more neighborhood information. In the last step, we merge the marginal embeddings to get a vector representation of the entire structure. This model can be trained in an end-to-end fashion. Also, the parameters in the embedding iteration layers are shared, which makes it similar to a recurrent neural network. We can simply extend it by using an LSTM to formulate the fixed-point equation.
  11. Here is the result we reported. We compared with the Weisfeiler-Lehman kernel with different degrees. Since the kernel matrix does not scale to a dataset of this size, we manually created a high-dimensional explicit feature map for it. Because of its high dimensionality, we can work with at most degree 6. We get about 4% relative error in prediction. To get a comparable result, the Weisfeiler-Lehman kernel requires 1.3 billion parameters. We get better results with only 0.1m parameters, a model 10k times smaller than the alternatives.
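
The embedding iteration described in note 10 can be sketched as follows. This is a minimal illustration, not the published model: the specific update rule (a ReLU of a linear map of the node features plus a linear map of the summed neighbor embeddings), the parameter names `W1`, `W2`, `V`, and the helper names are assumptions, since the notes only say that each update uses neighborhood embeddings and input features, that parameters are shared across iterations, and that the final embeddings are merged into one vector.

```python
import numpy as np

def relu(u):
    return np.maximum(0.0, u)

def structure2vec(X, adj, W1, W2, T=3):
    """Embed one graph.

    X:   (n_nodes, p) node input features
    adj: (n_nodes, n_nodes) 0/1 adjacency matrix
    W1:  (d, p), W2: (d, d) parameters shared by all T iterations
    """
    n = X.shape[0]
    mu = np.zeros((n, W1.shape[0]))        # start from zero embeddings
    for _ in range(T):                     # iterate the fixed-point update
        neighbor_sum = adj @ mu            # sum of each node's neighbor embeddings
        mu = relu(X @ W1.T + neighbor_sum @ W2.T)  # assumed update form
    return mu.sum(axis=0)                  # merge node embeddings into one vector

def predict(X, adj, W1, W2, V, T=3):
    """Linear regression readout with parameter V on the graph embedding."""
    return float(V @ structure2vec(X, adj, W1, W2, T))

# Toy example: a 4-node path graph with random features and parameters.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 6))
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
W1, W2, V = rng.normal(size=(5, 6)), rng.normal(size=(5, 5)) * 0.1, rng.normal(size=5)
y_hat = predict(X, adj, W1, W2, V)
```

Because every node is updated with the same rule and the readout is a plain sum, the resulting vector does not depend on how the nodes are numbered. Reusing `W1` and `W2` in every iteration is what makes the unrolled computation resemble a recurrent network.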