Quoc Le, Stanford & Google - Tera Scale Deep Learning





1. Tera-scale deep learning. Quoc V. Le, Stanford University and Google. Joint work with Kai Chen, Greg Corrado, Jeff Dean, Matthieu Devin, Rajat Monga, Andrew Ng, Marc'Aurelio Ranzato, Paul Tucker, Ke Yang.

2. Machine Learning successes: face recognition, OCR, autonomous car, email classification, recommendation systems, web page ranking.

3. The role of Feature Extraction in Pattern Recognition. Feature extraction (mostly hand-crafted features) feeds the classifier.

4. Hand-Crafted Features. Computer vision: SIFT/HOG, SURF, … Speech recognition: MFCC, spectrogram, ZCR, …

5. New feature-designing paradigm: Unsupervised Feature Learning / Deep Learning. Shows promise for small datasets. Expensive and typically applied to small problems.

6. The Trend of BigData

7. Brain Simulation. An autoencoder watching 10 million YouTube video frames. Train on 2000 machines (16000 cores) for 1 week. 1.15 billion parameters:
-  100x larger than previously reported
-  Small compared to visual cortex
[Figure: three stacked Autoencoder layers above an Image input]
Le, et al., Building high-level features using large-scale unsupervised learning. ICML 2012
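As a reminder of what an autoencoder optimizes, here is a minimal sketch, assuming a single linear hidden layer, squared reconstruction error, and plain SGD. It is a toy and not the 1.15-billion-parameter model from the slide; the function names, the 2-D data set, and the hyperparameters are all illustrative assumptions.

```python
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def reconstruction_loss(data, W_enc, W_dec):
    """Mean of 0.5 * ||decode(encode(x)) - x||^2 over the data set."""
    total = 0.0
    for x in data:
        r = matvec(W_dec, matvec(W_enc, x))
        total += 0.5 * sum((ri - xi) ** 2 for ri, xi in zip(r, x))
    return total / len(data)


def train_autoencoder(data, hidden=1, lr=0.05, epochs=200, seed=0):
    """Train a linear autoencoder x -> h = W_enc x -> r = W_dec h with SGD."""
    rng = random.Random(seed)
    d = len(data[0])
    W_enc = [[rng.uniform(0.05, 0.15) for _ in range(d)] for _ in range(hidden)]
    W_dec = [[rng.uniform(0.05, 0.15) for _ in range(hidden)] for _ in range(d)]
    initial = reconstruction_loss(data, W_enc, W_dec)
    for _ in range(epochs):
        for x in data:
            h = matvec(W_enc, x)                      # encode
            r = matvec(W_dec, h)                      # decode
            e = [ri - xi for ri, xi in zip(r, x)]     # reconstruction error
            # Backpropagate through the decoder before updating it.
            grad_h = [sum(W_dec[i][j] * e[i] for i in range(d))
                      for j in range(hidden)]
            for i in range(d):
                for j in range(hidden):
                    W_dec[i][j] -= lr * e[i] * h[j]
            for j in range(hidden):
                for k in range(d):
                    W_enc[j][k] -= lr * grad_h[j] * x[k]
    final = reconstruction_loss(data, W_enc, W_dec)
    return initial, final, W_enc, W_dec


if __name__ == "__main__":
    # 2-D points on a line, so one hidden unit can reconstruct them.
    points = [(t / 4, t / 2) for t in range(-4, 5)]
    before, after, _, _ = train_autoencoder(points)
    print(f"loss before: {before:.4f}, after: {after:.2e}")
```

No labels appear anywhere in the loop: the only training signal is how well the input is reconstructed, which is the sense in which the slide's training is unsupervised.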

8. Key results: face detector, human body detector, cat detector. Totally unsupervised! ~85% correct in classifying face vs. no face. Le, et al., Building high-level features using large-scale unsupervised learning. ICML 2012

9. ImageNet classification.
Random guess: 0.005%
State-of-the-art (Weston, Bengio '11): 9.5%
Feature learning from raw pixels: 15.8%
ImageNet 2009 (10k categories): best published result 17% (Sanchez & Perronnin '11); our method: 20%.
Using only 1000 categories, our method > 50%.
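The percentages on this slide are easier to compare with a little arithmetic. The sketch below assumes that "random guess" means picking uniformly among equally likely categories; the helper names are illustrative and not from the talk.

```python
def chance_accuracy_percent(num_classes):
    """Accuracy (in %) of uniform random guessing over num_classes."""
    return 100.0 / num_classes


def classes_for_chance(percent):
    """Number of classes implied by a given uniform-chance accuracy (in %)."""
    return round(100.0 / percent)


def relative_gain_percent(baseline, ours):
    """Relative improvement of `ours` over `baseline`, in percent."""
    return 100.0 * (ours - baseline) / baseline


if __name__ == "__main__":
    print(chance_accuracy_percent(10_000))   # chance level for 10k categories
    print(classes_for_chance(0.005))         # categories implied by 0.005%
    print(relative_gain_percent(9.5, 15.8))  # feature learning vs. state of the art
    print(relative_gain_percent(17.0, 20.0)) # ImageNet 2009 comparison
```

Under that uniform-guess assumption, a 0.005% chance level corresponds to roughly 20,000 categories, so the first three numbers appear to describe a larger label set than the 10k-category ImageNet 2009 result quoted below them. The slide itself does not name that data set.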

10. Scaling up Deep Learning

                               Prior art       Our work
# Examples                     100,000         10,000,000
# Dimensions                   1,000           10,000
# Parameters                   10,000,000      1,000,000,000
Data set size                  Gbytes          Tbytes
Learned features from images   Edge filters    High-level features (face, cat detectors)

11. Summary of Scaling up
-  Local connectivity (Model Parallelism)
-  Asynchronous SGDs (Clever optimization / Data parallelism)
-  RPCs
-  Prefetching
-  Single
-  Removing slow machines
-  Lots of optimization

12. Locally connected networks. [Figure: a Features layer above an Image layer, partitioned across Machine #1, Machine #2, Machine #3, Machine #4]
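Why local connectivity makes model parallelism natural can be sketched as follows. The example assumes a 1-D "image", non-overlapping receptive fields, and unshared weights, and it simulates the four machines as slices inside one process; all names and sizes are illustrative assumptions, not the system described in the talk.

```python
def locally_connected(pixels, weights, field):
    """One feature per non-overlapping receptive field; weights are not shared."""
    return [
        sum(w * p for w, p in zip(weights[i], pixels[i * field:(i + 1) * field]))
        for i in range(len(weights))
    ]


def split_evenly(items, parts):
    """Cut a list into `parts` contiguous, equally sized slices."""
    size = len(items) // parts
    return [items[i * size:(i + 1) * size] for i in range(parts)]


def model_parallel_forward(pixels, weights, field, machines):
    """Each simulated machine holds only its own pixels and its own weights."""
    features = []
    for shard_pixels, shard_weights in zip(split_evenly(pixels, machines),
                                           split_evenly(weights, machines)):
        features.extend(locally_connected(shard_pixels, shard_weights, field))
    return features


# 16 pixels, 8 features with a receptive field of 2 pixels each.
image = [float(i) for i in range(16)]
weights = [[i + 1.0, -(i + 1.0)] for i in range(8)]

single = locally_connected(image, weights, field=2)
sharded = model_parallel_forward(image, weights, field=2, machines=4)

if __name__ == "__main__":
    print(single == sharded)
```

Because every feature reads only a small patch of the image, each machine can own a region and its weights without seeing the rest. With overlapping receptive fields, machines would additionally have to exchange values near region boundaries, which this sketch leaves out.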

13. Asynchronous Parallel SGDs (Alex Smola's talk). [Figure: parameter server]
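The slide only names the technique, so the following is a minimal sketch of asynchronous SGD around a parameter server, assuming Python threads in one process stand in for machines and a one-parameter least-squares problem stands in for the network. The class and function names, the data, and the learning rate are hypothetical.

```python
import threading


class ParameterServer:
    """Holds the shared parameter; workers pull a copy and push gradients."""

    def __init__(self, w=0.0):
        self._w = w
        self._lock = threading.Lock()

    def pull(self):
        with self._lock:
            return self._w

    def push(self, grad, lr):
        with self._lock:
            self._w -= lr * grad


def worker(server, shard, steps, lr):
    """Run SGD on one data shard without waiting for the other workers."""
    for step in range(steps):
        x, y = shard[step % len(shard)]
        w = server.pull()           # may already be stale when we push
        grad = (w * x - y) * x      # gradient of 0.5 * (w*x - y)^2
        server.push(grad, lr)


def train_async(num_workers=4, steps=500, lr=0.05):
    # Noise-free data from y = 3x with x in (0, 1]; the true parameter is 3.
    data = [(k / 40.0, 3.0 * k / 40.0) for k in range(1, 41)]
    shards = [data[i::num_workers] for i in range(num_workers)]  # data parallelism
    server = ParameterServer()
    threads = [threading.Thread(target=worker, args=(server, shard, steps, lr))
               for shard in shards]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server.pull()


if __name__ == "__main__":
    print(train_async())
```

There is no barrier between workers: each pulls, computes on its own shard, and pushes whenever it is ready, so some gradients are computed from slightly outdated parameters. With a small learning rate the toy problem still converges close to 3.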

14. Conclusions
•  Scale deep learning 100x larger using distributed training on 1000 machines
•  Brain simulation -> Cat neuron
•  State-of-the-art performances on
   –  Object recognition (ImageNet)
   –  Action recognition
   –  Cancer image classification
•  Other applications
   –  Speech recognition
   –  Machine translation
[Figures: ImageNet bar chart (random guess 0.005%, best published result 9.5%, our method 15.8%); model parallelism; data parallelism with parameter server; cat neuron; face neuron]

15. References
•  Q.V. Le, M.A. Ranzato, R. Monga, M. Devin, G. Corrado, K. Chen, J. Dean, A.Y. Ng. Building high-level features using large-scale unsupervised learning. ICML, 2012.
•  Q.V. Le, J. Ngiam, Z. Chen, D. Chia, P. Koh, A.Y. Ng. Tiled Convolutional Neural Networks. NIPS, 2010.
•  Q.V. Le, W.Y. Zou, S.Y. Yeung, A.Y. Ng. Learning hierarchical spatio-temporal features for action recognition with independent subspace analysis. CVPR, 2011.
•  Q.V. Le, J. Ngiam, A. Coates, A. Lahiri, B. Prochnow, A.Y. Ng. On optimization methods for deep learning. ICML, 2011.
•  Q.V. Le, A. Karpenko, J. Ngiam, A.Y. Ng. ICA with Reconstruction Cost for Efficient Overcomplete Feature Learning. NIPS, 2011.
•  Q.V. Le, J. Han, J. Gray, P. Spellman, A. Borowsky, B. Parvin. Learning Invariant Features for Tumor Signatures. ISBI, 2012.
•  I.J. Goodfellow, Q.V. Le, A.M. Saxe, H. Lee, A.Y. Ng. Measuring invariances in deep networks. NIPS, 2009.
http://ai.stanford.edu/~quocle
