1. Building Intelligent Systems with
Large Scale Deep Learning
Jeff Dean
Google Brain team
g.co/brain
Presenting the work of many people at Google
2. Google Brain Team Mission:
Make Machines Intelligent.
Improve People’s Lives.
3. How do we do this?
● Conduct long-term research (>200 papers, see g.co/brain & g.co/brain/papers)
○ Unsupervised learning of cats, Inception, word2vec, seq2seq, DeepDream, image
captioning, neural translation, Magenta, ML for robotics control, healthcare, …
● Build and open-source systems like TensorFlow (see tensorflow.org and
https://github.com/tensorflow/tensorflow)
● Collaborate with others at Google and Alphabet to get our work into
the hands of billions of people (e.g., RankBrain for Google Search, GMail
Smart Reply, Google Photos, Google speech recognition, Google Translate, Waymo, …)
● Train new researchers through internships and the Google Brain
Residency program
4. Main Research Areas
● General Machine Learning Algorithms and Techniques
● Computer Systems for Machine Learning
● Natural Language Understanding
● Perception
● Healthcare
● Robotics
● Music and Art Generation
10. Growth of Deep Learning at Google
[Chart: number of directories containing model description files at Google, growing rapidly over time — and many more uses beyond those shown]
11. Experiment Turnaround Time and Research Productivity
● Minutes, Hours:
○ Interactive research! Instant gratification!
● 1-4 days
○ Tolerable
○ Interactivity replaced by running many experiments in parallel
● 1-4 weeks
○ High value experiments only
○ Progress stalls
● >1 month
○ Don’t even try
12. Google Confidential + Proprietary (permission granted to share within NIST)
Build the right tools
13. Open, standard software for
general machine learning
Great for Deep Learning in
particular
First released Nov 2015
Apache 2.0 license
http://tensorflow.org/
and
https://github.com/tensorflow/tensorflow
14. TensorFlow Goals
Establish common platform for expressing machine learning ideas and systems
Make this platform the best in the world for both research and production use
Open source it so that it becomes a platform for everyone, not just Google
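TensorFlow's core idea is expressing a computation as a dataflow graph that is built first and executed later. As a rough illustration of that deferred-execution model, here is a minimal toy graph evaluator in pure Python — this is not the TensorFlow API; every name below is invented for the sketch:

```python
# Toy dataflow-graph sketch illustrating deferred execution
# (not the TensorFlow API; all names here are made up).

class Node:
    def __init__(self, op, inputs=(), value=None):
        self.op, self.inputs, self.value = op, inputs, value

def constant(v):
    return Node("const", value=v)

def add(a, b):
    return Node("add", (a, b))

def mul(a, b):
    return Node("mul", (a, b))

def run(node):
    """Evaluate a graph node by recursively evaluating its inputs."""
    if node.op == "const":
        return node.value
    args = [run(i) for i in node.inputs]
    if node.op == "add":
        return args[0] + args[1]
    if node.op == "mul":
        return args[0] * args[1]
    raise ValueError(node.op)

# Build the graph first (no computation happens yet)...
x = constant(3.0)
y = constant(4.0)
z = add(mul(x, x), y)   # represents x*x + y

# ...then execute it, analogous to running a session.
print(run(z))  # 13.0
```

Separating graph construction from execution is what lets a system like TensorFlow optimize, distribute, and retarget the same program to different hardware.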
22. ML is done in many places
TensorFlow GitHub stars by GitHub user profiles w/ public locations
Source: http://jrvis.com/red-dwarf/?user=tensorflow&repo=tensorflow
23. TensorFlow: A Vibrant Open-Source Community
● Rapid development, many outside contributors
○ 475+ non-Google contributors to TensorFlow 1.0
○ 15,000+ commits in 15 months
○ Many community created tutorials, models, translations, and projects
■ ~7,000 GitHub repositories with ‘TensorFlow’ in the title
● Direct engagement between community and TensorFlow team
○ 5000+ Stack Overflow questions answered
○ 80+ community-submitted GitHub issues responded to weekly
● Growing use in ML classes: Toronto, Berkeley, Stanford, ...
25. Reuse same model for completely
different problems
Same basic model structure
trained on different data,
useful in completely different contexts
Example: given image → predict interesting pixels
28. We have tons of vision problems
Image search, StreetView, Satellite Imagery,
Translation, Robotics, Self-driving Cars,
www.google.com/sunroof
29.
Computers can now see
Large implications for healthcare
32. Performance on par with or slightly better than the median of 8 U.S.
board-certified ophthalmologists (F-score of 0.95 vs. 0.91).
http://research.googleblog.com/2016/11/deep-learning-for-detection-of-diabetic.html
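The F-score quoted above is the harmonic mean of precision and recall. A quick illustration of the formula — the counts below are invented for the example, not taken from the retinopathy study:

```python
# F-score = 2 * precision * recall / (precision + recall),
# computed from true positives (tp), false positives (fp),
# and false negatives (fn).

def f_score(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts, purely for illustration:
print(round(f_score(tp=90, fp=5, fn=10), 3))  # 0.923
```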
33.
Computers can now see
Large implications for robotics
34. Combining Vision with Robotics
“Deep Learning for Robots: Learning
from Large-Scale Interaction”, Google
Research Blog, March, 2016
“Learning Hand-Eye Coordination for
Robotic Grasping with Deep Learning
and Large-Scale Data Collection”,
Sergey Levine, Peter Pastor, Alex
Krizhevsky, & Deirdre Quillen,
arXiv, arxiv.org/abs/1603.02199
37.
Scientific Applications of ML
38. Predicting Properties of Molecules
Toxic? Bind with a given protein? Quantum properties?
[Figure: a Message Passing Neural Net applied to a molecule (aspirin)]
● Chemical space is too big, so chemists often rely on virtual screening.
● Machine Learning can help search this large space.
● Molecules are graphs: nodes = atoms, edges = bonds (plus other features)
● Message Passing Neural Nets unify and extend many neural net models
that are invariant to graph symmetries
● State of the art results predicting output of expensive quantum chemistry
calculations, but ~300,000 times faster
https://research.googleblog.com/2017/04/predicting-properties-of-molecules-with.html and
https://arxiv.org/abs/1702.05532 and https://arxiv.org/abs/1704.01212 (latter to appear in ICML 2017)
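To make the graph view of molecules concrete, here is a single message-passing round on a toy 3-atom chain, sketched in plain Python. This is a heavy simplification: real MPNNs use learned neural message and update functions and edge features, and the numbers below are purely illustrative:

```python
# One round of message passing on a tiny molecular graph (toy sketch;
# real MPNNs use learned message/update networks and edge features).

# A 3-atom chain: each atom carries a 2-dim feature vector.
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
edges = [(0, 1), (1, 2)]   # bonds, treated as undirected

def neighbors(n):
    return [b for a, b in edges if a == n] + [a for a, b in edges if b == n]

def message_pass(features):
    """Each node's new state = old state + sum of its neighbors' states."""
    new = {}
    for n, h in features.items():
        msg = [0.0, 0.0]
        for m in neighbors(n):
            msg = [msg[i] + features[m][i] for i in range(2)]
        new[n] = [h[i] + msg[i] for i in range(2)]
    return new

updated = message_pass(features)
print(updated[1])  # [2.0, 2.0]: the middle atom aggregated both neighbors
```

Because the same update is applied at every node, the computation is invariant to relabeling the atoms — the graph-symmetry property the bullet above refers to.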
45. [Image: human iPSC neurons, phase contrast — nuclei (blue), dendrites (green), and axons (red)]
46.
Scaling language understanding models
47. Sequence-to-Sequence Model
[Diagram: a deep LSTM encoder reads the input sequence (A B C ...) and a decoder emits the target sequence (X Y Z ...)]
[Sutskever & Vinyals & Le, NIPS 2014]
48. Sequence-to-Sequence Model: Machine Translation
[Diagram: the model reads the input sentence "Quelle est votre taille? <EOS>" and emits the first target word: "How"]
[Sutskever & Vinyals & Le, NIPS 2014]
49. Sequence-to-Sequence Model: Machine Translation
[Diagram: conditioned on "How", the model emits the next target word: "tall"]
50. Sequence-to-Sequence Model: Machine Translation
[Diagram: conditioned on "How tall", the model emits "are"]
51. Sequence-to-Sequence Model: Machine Translation
[Diagram: conditioned on "How tall are", the model emits "you?", completing the translation]
52. Sequence-to-Sequence Model: Machine Translation
At inference time: beam search to choose the most probable
over possible output sequences
[Diagram: beam search over completions of the input "Quelle est votre taille? <EOS>"]
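The beam search mentioned above keeps only the k highest-scoring partial output sequences at each decoding step, rather than all possible continuations. A self-contained sketch with an invented toy distribution standing in for the trained decoder (the probabilities and vocabulary below are made up for illustration):

```python
# Beam search over a toy per-step token distribution (illustrative only;
# a real decoder would score tokens with the trained sequence model).
import math

# Made-up next-token probabilities for each decoding step:
STEPS = [
    {"How": 0.7, "Quelle": 0.3},
    {"tall": 0.6, "big": 0.4},
    {"<EOS>": 0.9, "are": 0.1},
]

def next_token_probs(prefix):
    """Stand-in for the model: distribution depends only on prefix length."""
    return STEPS[len(prefix)]

def beam_search(beam_width=2, max_len=3):
    beams = [([], 0.0)]  # (tokens, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens and tokens[-1] == "<EOS>":
                candidates.append((tokens, score))  # finished hypothesis
                continue
            for tok, p in next_token_probs(tokens).items():
                candidates.append((tokens + [tok], score + math.log(p)))
        # Keep only the beam_width highest-scoring hypotheses.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0][0]

print(beam_search())  # ['How', 'tall', '<EOS>']
```

With beam_width=1 this degrades to greedy decoding; wider beams trade computation for a better approximation of the most probable output sequence.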
61. BACKTRANSLATION FROM JAPANESE (en->ja->en)
Google Neural Machine Translation (new system):
Kilimanjaro is a mountain of 19,710 feet covered with snow, which is
said to be the highest mountain in Africa. The summit of the west is
called "Ngaje Ngai," God's house in Masai language. There is a dried
and frozen carcass of a leopard near the summit of the west. No one
can explain what the leopard was seeking at that altitude.
Phrase-Based Machine Translation (old system):
Kilimanjaro is 19,710 feet of the mountain covered with snow, and
it is said that the highest mountain in Africa. Top of the west,
"Ngaje Ngai" in the Maasai language, has been referred to as the
house of God. The top close to the west, there is a dry, frozen
carcass of a leopard. Whether the leopard had what the demand at
that altitude, there is no that nobody explained.
62.
Automated machine learning
(“learning to learn”)
64. Current:
Solution = ML expertise + data + computation
Can we turn this into:
Solution = data + 100X computation
???
65. Early encouraging signs
Trying multiple different approaches:
(1) RL-based architecture search
(2) Model architecture evolution
(3) Learn how to optimize
66. Idea: model-generating model trained via RL
(1) Generate ten models
(2) Train them for a few hours
(3) Use loss of the generated models as reinforcement
learning signal
Appeared in ICLR 2017
arxiv.org/abs/1611.01578
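The three-step loop above can be caricatured in a few lines: a controller samples candidate architectures, a (here, simulated) training run scores each one, and the loss becomes a reward that shifts the controller's preferences. Everything below is a deliberately tiny stand-in — the "architecture" is just a depth choice, and proxy_loss fakes the expensive training step:

```python
# Toy caricature of RL-based architecture search: sample an architecture,
# "train" it (faked by a proxy loss), and feed the loss back as reward.
import math
import random

random.seed(0)
CHOICES = [1, 2, 3, 4]               # candidate network depths
prefs = {c: 0.0 for c in CHOICES}    # controller preference (logit) per choice

def sample():
    weights = [math.exp(prefs[c]) for c in CHOICES]
    return random.choices(CHOICES, weights=weights)[0]

def proxy_loss(depth):
    """Stand-in for 'train for a few hours': pretend depth 3 is best."""
    return abs(depth - 3) + 0.1

for _ in range(200):
    depth = sample()                     # (1) generate a model
    reward = -proxy_loss(depth)          # (2)-(3) its loss is the RL signal
    # Bandit-style update against a fixed baseline of -1.0:
    prefs[depth] += 0.1 * (reward - (-1.0))

print(max(prefs, key=prefs.get))  # the controller converges on depth 3
```

The real system trains a recurrent controller with policy gradients over full architecture descriptions; the point here is only the shape of the feedback loop.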
68. Penn Tree Bank Language Modeling Task
[Figure: a "normal" LSTM cell vs. the cell discovered by architecture search]
69. Learn2Learn: Learn the Optimization Update Rule
Neural Optimizer Search using Reinforcement Learning,
Irwan Bello, Barret Zoph, Vijay Vasudevan, and Quoc Le. To appear in ICML 2017
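One family of update rules reported from this line of work scales the gradient step up when the current gradient agrees in sign with its running average, and down when they disagree. Below is a sketch of such a rule applied to a 1-D quadratic; treat the exact form (alpha raised to the product of signs) as an illustrative simplification, not the paper's definitive rule:

```python
# Sketch of a "learned" optimizer update rule on f(x) = x^2:
# step size is scaled by alpha^(sign(g) * sign(m)), where m is a
# running average of gradients (illustrative simplification).
import math

x, m = 5.0, 0.0
lr, alpha, beta = 0.1, 2.0, 0.9

def grad(x):
    return 2.0 * x   # gradient of f(x) = x^2

for _ in range(100):
    g = grad(x)
    m = beta * m + (1 - beta) * g          # running average of gradients
    # Agreeing signs -> multiply step by alpha; disagreeing -> divide.
    scale = alpha ** (math.copysign(1, g) * math.copysign(1, m))
    x -= lr * scale * g

print(abs(x) < 1e-3)  # True: converged near the minimum at x = 0
```

The appeal of such rules is that sign agreement is a cheap proxy for confidence in the descent direction.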
71.
More computational power needed
Deep learning is transforming how we
design computers
74. Tensor Processing Unit v2
Google-designed device for neural net training and inference
● 180 teraflops of computation, 64 GB of memory
Revealed in May at Google I/O
76. Will be available through Google Cloud
Cloud TPU - virtual machine w/ 180 TFLOPS TPUv2 device attached
Programmed via TensorFlow
Same program will run with only minor
modifications on CPUs, GPUs, & TPUs
77. Making 1000 Cloud TPUs available for free to top researchers who are
committed to open machine learning research
We’re excited to see what researchers will do with much more computation!
g.co/tpusignup
78. Machine Learning in Google Cloud
Custom ML models: Machine Learning Engine, TensorFlow
Pre-trained ML models: Vision API, Translation API, Natural Language API, Speech API, Jobs API, Video Intelligence API
79.
Machine Learning for
Higher Performance Machine Learning
Models
80. Device Placement with Reinforcement Learning
Placement model (trained via RL) gets the graph + set of devices as input,
and outputs a device placement for each graph node
Measured time per step gives the RL reward signal
Device Placement Optimization with Reinforcement Learning,
Azalia Mirhoseini, Hieu Pham, Quoc Le, Mohammad Norouzi, Samy Bengio, Benoit Steiner, Yuefeng Zhou,
Naveen Kumar, Rasmus Larsen, and Jeff Dean, to appear in ICML 2017, arxiv.org/abs/1706.04972
+19.7% faster vs. expert human placement for InceptionV3
+19.3% faster vs. expert human placement for NMT model
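To see what the placement model is optimizing, here is a brute-force toy version of the same objective: assign each op in a small graph to one of two devices to minimize a made-up step-time model. The real system replaces the exhaustive search with a learned RL policy, and the ops, costs, and communication penalty below are invented for illustration:

```python
# Toy stand-in for learned device placement: exhaustively search
# assignments of 4 graph ops to 2 devices, minimizing a fake step-time
# model (the real system trains an RL policy instead of brute force).
from itertools import product

OPS = ["embed", "lstm", "attention", "softmax"]           # hypothetical ops
COST = {"embed": 2.0, "lstm": 4.0, "attention": 3.0, "softmax": 1.0}
EDGES = [("embed", "lstm"), ("lstm", "attention"), ("attention", "softmax")]
COMM = 0.5   # penalty per edge that crosses devices

def step_time(placement):
    """Devices run in parallel: max per-device compute + communication."""
    per_device = {0: 0.0, 1: 0.0}
    for op, dev in placement.items():
        per_device[dev] += COST[op]
    comm = sum(COMM for a, b in EDGES if placement[a] != placement[b])
    return max(per_device.values()) + comm

best = min(
    (dict(zip(OPS, devs)) for devs in product([0, 1], repeat=len(OPS))),
    key=step_time,
)
print(step_time(best))  # 6.5: balanced load beats the 10.0 single-device plan
```

Brute force is fine for 4 ops but explodes for real graphs with thousands of nodes, which is exactly why a learned placement policy is attractive.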
83. Example queries of the future
Which of these eye
images shows
symptoms of diabetic
retinopathy?
Please fetch me a cup
of tea from the kitchen
Describe this video
in Spanish
Find me documents related to
reinforcement learning for
robotics and summarize them
in German
84. Conclusions
Deep neural networks are making significant strides in
speech, vision, language, search, robotics, healthcare, …
If you’re not considering how to use deep neural nets to solve
your problems, you almost certainly should be