Tera-scale deep learning
Quoc V. Le
Stanford University and Google

Joint work with Kai Chen, Greg Corrado, Jeff Dean, Matthieu Devin, Rajat Monga, Andrew Ng, Marc'Aurelio Ranzato, Paul Tucker, Ke Yang
  
Machine Learning successes
- Face recognition
- OCR
- Autonomous car
- Email classification
- Recommendation systems
- Web page ranking
  
The role of Feature Extraction in Pattern Recognition
[Diagram: Feature extraction (mostly hand-crafted features) below a Classifier]
  
Hand-Crafted Features
Computer vision: SIFT/HOG, SURF, …
Speech recognition: MFCC, Spectrogram, ZCR, …
  
New feature-designing paradigm
Unsupervised Feature Learning / Deep Learning
- Shows promise on small datasets
- Expensive and typically applied to small problems
  
The Trend of Big Data
  
Brain Simulation
[Diagram: three stacked autoencoders on top of an input image]
- Watching 10 million YouTube video frames
- Train on 2000 machines (16000 cores) for 1 week
- 1.15 billion parameters
    - 100x larger than previously reported
    - Small compared to visual cortex
Le, et al., Building high-level features using large-scale unsupervised learning. ICML 2012
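
To make the autoencoder building block concrete, here is a minimal sketch in plain Python. It assumes a single linear layer with tied encoder/decoder weights trained by SGD on toy data; the model in the cited paper is far larger and uses a different architecture, so every name and hyperparameter below (`train_toy_autoencoder`, the learning rate, the toy data) is illustrative only.

```python
import random

def encode(W, x):
    """Hidden code h = W x for a k-by-d weight matrix W."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def decode(W, h):
    """Reconstruction x_hat = W^T h (decoder weights tied to the encoder)."""
    d = len(W[0])
    return [sum(W[i][j] * h[i] for i in range(len(W))) for j in range(d)]

def loss(W, x):
    """Squared reconstruction error 0.5 * ||x_hat - x||^2."""
    x_hat = decode(W, encode(W, x))
    return 0.5 * sum((a - b) ** 2 for a, b in zip(x_hat, x))

def sgd_step(W, x, lr):
    """One in-place SGD step on the reconstruction error of one example."""
    h = encode(W, x)
    x_hat = decode(W, h)
    r = [a - b for a, b in zip(x_hat, x)]  # residual x_hat - x
    Wr = encode(W, r)                      # gradient flowing back into h
    for i in range(len(W)):
        for j in range(len(x)):
            # decoder term h_i * r_j plus encoder term (W r)_i * x_j
            W[i][j] -= lr * (h[i] * r[j] + Wr[i] * x[j])

def train_toy_autoencoder(steps=4000, lr=0.05, seed=0):
    """Train on toy data and return (mean loss before, mean loss after)."""
    rng = random.Random(seed)
    # Toy inputs lie on a 1-D subspace of R^4, so a 2-unit code suffices.
    scales = [rng.uniform(-1, 1) for _ in range(20)]
    data = [[t * 0.5 for _ in range(4)] for t in scales]
    W = [[rng.uniform(-0.1, 0.1) for _ in range(4)] for _ in range(2)]
    before = sum(loss(W, x) for x in data) / len(data)
    for _ in range(steps):
        sgd_step(W, rng.choice(data), lr)
    after = sum(loss(W, x) for x in data) / len(data)
    return before, after
```

Calling `train_toy_autoencoder()` returns the mean reconstruction error before and after training; no labels are used at any point, which is the sense in which the learning is unsupervised.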
  
Key results
- Face detector
- Human body detector
- Cat detector
Totally unsupervised!
~85% correct in classifying face vs no face
Le, et al., Building high-level features using large-scale unsupervised learning. ICML 2012
  
ImageNet classification
Random guess: 0.005%
State-of-the-art (Weston, Bengio '11): 9.5%
Feature learning from raw pixels: 15.8%

ImageNet 2009 (10k categories): best published result 17% (Sanchez & Perronnin '11); our method 20%
Using only 1000 categories: our method > 50%
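
As a rough sanity check on these figures, the sketch below computes chance accuracy for a uniform random guess over K categories and the relative gain of one accuracy over another. The helper names are hypothetical, and the slide does not state the category count behind the 0.005% figure; under the uniform-guess assumption it corresponds to about 20,000 categories.

```python
def chance_accuracy_percent(num_categories):
    """Accuracy (in percent) of a uniform random guess over K categories."""
    return 100.0 / num_categories

def relative_improvement_percent(new_acc, old_acc):
    """Relative gain of new_acc over old_acc, in percent."""
    return 100.0 * (new_acc - old_acc) / old_acc

# Uniform guessing over 20,000 categories gives 0.005%.
chance_20k = chance_accuracy_percent(20000)

# 15.8% vs 9.5%, and 20% vs 17%, expressed as relative gains.
gain_full = relative_improvement_percent(15.8, 9.5)
gain_10k = relative_improvement_percent(20.0, 17.0)
```

With the slide's numbers, `gain_full` is about 66% and `gain_10k` is about 18%.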
  
Scaling up Deep Learning
                    Prior art                  Our work
# Examples          100,000                    10,000,000
# Dimensions        1,000                      10,000
# Parameters        10,000,000                 1,000,000,000
Data set size       Gbytes                     Tbytes
Learned features    Edge filters from images   High-level features (face, cat detectors)
  
Summary of Scaling up
- Local connectivity (Model parallelism)
- Asynchronous SGDs (Clever optimization / Data parallelism)
- RPCs
- Prefetching
- Single
- Removing slow machines
- Lots of optimization
  
Locally connected networks
[Diagram: Image at the bottom, Features above it, partitioned across Machine #1, Machine #2, Machine #3, Machine #4]
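
The idea that local connectivity enables model parallelism can be sketched as follows. This is a simplified illustration, assuming a 1-D input and non-overlapping receptive fields so that each machine needs only its own slice of the input and its own weights; the function names are hypothetical, and a real system with overlapping fields would also need to exchange boundary values between machines.

```python
def local_layer(x, weights, field):
    """Locally connected layer: unit i sees only x[i*field:(i+1)*field]."""
    return [
        sum(w * v for w, v in zip(weights[i], x[i * field:(i + 1) * field]))
        for i in range(len(weights))
    ]

def shard_ranges(n_units, n_machines):
    """Split unit indices into contiguous, near-equal ranges, one per machine."""
    base, extra = divmod(n_units, n_machines)
    ranges, start = [], 0
    for m in range(n_machines):
        stop = start + base + (1 if m < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges

def run_shard(x_slice, weight_slice, field):
    """What one machine computes: it holds only its own weights and inputs."""
    return local_layer(x_slice, weight_slice, field)

def model_parallel_forward(x, weights, field, n_machines):
    """Simulate the machines one after another and concatenate their features."""
    outputs = []
    for start, stop in shard_ranges(len(weights), n_machines):
        x_slice = x[start * field:stop * field]
        outputs.extend(run_shard(x_slice, weights[start:stop], field))
    return outputs
```

Because no unit reads input outside its machine's slice, the sharded result matches the single-machine result exactly, and no machine ever has to hold the full weight set.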
  
Asynchronous Parallel SGDs (Alex Smola's talk)
[Diagram: Parameter server]
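
A minimal sketch of asynchronous SGD with a parameter server, using Python threads in place of machines. It assumes a toy linear-regression problem (y = 3x + 1) and a lock-protected in-process server; the class and function names are hypothetical and nothing here reflects the actual system described in the talk.

```python
import random
import threading

class ParameterServer:
    """Holds the shared parameters; workers pull copies and push gradients."""

    def __init__(self, params):
        self._params = list(params)
        self._lock = threading.Lock()

    def pull(self):
        with self._lock:
            return list(self._params)

    def push(self, grad, lr):
        with self._lock:
            for i, g in enumerate(grad):
                self._params[i] -= lr * g

def worker(server, shard, steps, lr, seed):
    """One data-parallel replica: it never waits for the other workers."""
    rng = random.Random(seed)
    for _ in range(steps):
        w, b = server.pull()       # may be stale by the time we push
        x, y = rng.choice(shard)
        err = w * x + b - y        # gradient of 0.5 * err**2
        server.push([err * x, err], lr)

def train_async(n_workers=4, steps=2000, lr=0.05):
    """Fit y = w*x + b on toy data; returns the learned (w, b)."""
    rng = random.Random(0)
    xs = [rng.uniform(-1, 1) for _ in range(200)]
    data = [(x, 3.0 * x + 1.0) for x in xs]
    shards = [data[i::n_workers] for i in range(n_workers)]
    server = ParameterServer([0.0, 0.0])
    threads = [
        threading.Thread(target=worker, args=(server, shards[i], steps, lr, i))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return tuple(server.pull())
```

Each worker computes its gradient from parameters that other workers may already have changed, so updates are applied to slightly stale values. On this noise-free toy problem the result still approaches w = 3, b = 1.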
  
Conclusions
• Scale deep learning 100x larger using distributed training on 1000 machines
• Brain simulation -> Cat neuron
• State-of-the-art performance on
    – Object recognition (ImageNet)
    – Action recognition
    – Cancer image classification
• Other applications
    – Speech recognition
    – Machine translation
[Figure panels: ImageNet results (random guess 0.005%, best published result 9.5%, our method 15.8%); model parallelism; data parallelism with a parameter server; cat neuron; face neuron]
  
References
• Q.V. Le, M.A. Ranzato, R. Monga, M. Devin, G. Corrado, K. Chen, J. Dean, A.Y. Ng. Building high-level features using large-scale unsupervised learning. ICML, 2012.
• Q.V. Le, J. Ngiam, Z. Chen, D. Chia, P. Koh, A.Y. Ng. Tiled Convolutional Neural Networks. NIPS, 2010.
• Q.V. Le, W.Y. Zou, S.Y. Yeung, A.Y. Ng. Learning hierarchical spatio-temporal features for action recognition with independent subspace analysis. CVPR, 2011.
• Q.V. Le, J. Ngiam, A. Coates, A. Lahiri, B. Prochnow, A.Y. Ng. On optimization methods for deep learning. ICML, 2011.
• Q.V. Le, A. Karpenko, J. Ngiam, A.Y. Ng. ICA with Reconstruction Cost for Efficient Overcomplete Feature Learning. NIPS, 2011.
• Q.V. Le, J. Han, J. Gray, P. Spellman, A. Borowsky, B. Parvin. Learning Invariant Features for Tumor Signatures. ISBI, 2012.
• I.J. Goodfellow, Q.V. Le, A.M. Saxe, H. Lee, A.Y. Ng. Measuring invariances in deep networks. NIPS, 2009.

http://ai.stanford.edu/~quocle
  
