Urs Köster Presenting at RE-Work DL Summit in Boston

Intel Nervana

Deep Learning at Scale

Proprietary and confidential. Do not distribute.
About Nervana
• A platform for machine intelligence
• Enables deep learning at scale
• Optimized from algorithms to silicon

The Nervana Platform - a full-stack solution
[Diagram labels: neon deep learning framework; Nervana Cloud; Solutions; Images, Text, Tabular, Speech, Time series, Video]

neon: Nervana Python deep learning library
• User-friendly, extensible, fast
• Support for many deep learning models
• Interface to Nervana Cloud
• Multiple backends
  • Nervana Engine
  • GPU (optimized assembler kernels)
  • CPU cluster
Open source (Apache 2.0) on github.com/nervanaSystems/neon

Nervana Cloud
• Web interface
• Command line

Deep learning as a core technology
• ‘Google Brain’ model: DL at the center of Photos, Maps, Voice Search, Self-driving car, Ad Targeting, Machine Translation
• Nervana Platform: DL at the center of Image Classification, Object Localization, Video Indexing, Speech Recognition, Natural Language

Video recognition with 3D convolution
[Bar chart: Training Speed in epochs / hour (axis 0 to 1), neon vs. caffe]
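A 3D convolution slides a kernel over time as well as height and width, so a video clip is treated as a volume rather than a stack of independent frames. The sketch below is a plain-Python, single-channel, stride-1, no-padding version (strictly a cross-correlation, as is conventional in deep learning frameworks) written for illustration only. It is not neon's implementation, and the function name `conv3d_valid` is hypothetical.

```python
def conv3d_valid(volume, kernel):
    """Single-channel 3D cross-correlation with stride 1 and no padding.

    volume: nested lists indexed [t][y][x] (frames, rows, columns).
    kernel: nested lists indexed [t][y][x], no larger than volume on any axis.
    """
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                # Sum of elementwise products over the kt x kh x kw window.
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# A 3-frame, 3x3 clip of ones and a 2x2x2 kernel of ones (toy data).
clip = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
kernel = [[[1.0] * 2 for _ in range(2)] for _ in range(2)]
result = conv3d_valid(clip, kernel)
print(len(result), len(result[0]), len(result[0][0]))  # output shape: 2 2 2
print(result[0][0][0])  # each output sums 8 ones: 8.0
```

The output shrinks by (kernel size - 1) along every axis, including time, which is what distinguishes this from applying a 2D convolution to each frame separately.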

Object Localization / Segmentation
• CamVid Dataset: SegNet model
• KITTI Dataset: Fast R-CNN model

Model                            neon (ms)   caffe (ms)   Speedup
Fast R-CNN (batch size=4)        360         670          1.8x
SegNet (batch size=4)            267         1455         5.4x
SegNet (4 GPUs, batch size=16)   348         --           *5.9x
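The Speedup column appears to be the ratio of caffe time to neon time per iteration. A minimal Python sketch, using only the single-GPU numbers in the table, checks this. The function names and the truncation to one decimal place are assumptions made for illustration, not something the slide states. The multi-GPU row has no caffe timing, so its starred figure is not recomputed here.

```python
import math

# Per-iteration timings in milliseconds, copied from the table above.
timings_ms = {
    "Fast R-CNN (batch size=4)": {"neon": 360, "caffe": 670},
    "SegNet (batch size=4)": {"neon": 267, "caffe": 1455},
}


def speedup(neon_ms, caffe_ms):
    """Ratio of caffe time to neon time (assumed definition of 'Speedup')."""
    return caffe_ms / neon_ms


def truncate_1dp(x):
    """Truncate (not round) to one decimal place; an assumption, see lead-in."""
    return math.floor(x * 10) / 10


for name, t in timings_ms.items():
    ratio = speedup(t["neon"], t["caffe"])
    print(f"{name}: ratio {ratio:.2f}, truncated {truncate_1dp(ratio)}x")
```

Under these assumptions the two single-GPU rows reproduce the 1.8x and 5.4x figures shown on the slide.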

Image Classification (Residual Network)

Speech to text

ImageNet ILSVRC Challenge
[Chart: Top-5 error rate (axis 0% to 30%) by year, 2010 to 2015; labels: Deep learning, human performance, AlexNet, Clarifai, GoogLeNet, ResNet]
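For reference, top-5 error counts a prediction as correct when the true label is among the model's five highest-scoring classes. A minimal sketch in plain Python follows. The function name `top5_error` and the toy scores are illustrative assumptions, not ILSVRC data.

```python
def top5_error(scores, labels, k=5):
    """Fraction of examples whose true label is NOT among the k highest scores.

    scores: list of per-class score lists, one per example.
    labels: list of true class indices, one per example.
    """
    misses = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this example.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Toy example with 6 classes (made-up scores, for illustration only).
scores = [
    [0.1, 0.2, 0.3, 0.15, 0.05, 0.2],  # class 4 has the lowest score
    [0.5, 0.1, 0.1, 0.1, 0.1, 0.1],    # class 0 has the highest score
]
labels = [4, 0]  # first example misses the top 5, second is a hit
print(top5_error(scores, labels))
```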

Speeding up Deep Learning
• Same model, better performance:
  • Hardware improvements
  • Algorithmic improvements
[Bar chart: AlexNet ms / iteration (axis 0 to 600, with a "15,000 ..." annotation) for CPU, GTX580, TitanX, neon]
[Chart: Soumith's AlexNet Benchmark, ms (axis 0 to 500) at 4/2015, 8/2015, 3/2016, neon vs. cuDNN]
[Chart: Soumith's GoogLeNet Benchmark, ms (axis 0 to 500) at 4/2015, 8/2015, 3/2016, neon vs. cuDNN]

Dennard scaling has ended
[Chart: trends in Transistors, Clock speed, Power, Perf / clock]
[Chart: learning speed vs. number of processors]
• Industry standard: communication overhead = performance ceiling
• Nervana: better communication fabric, near-linear scaling
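The contrast between a performance ceiling and near-linear scaling can be illustrated with a toy model in which each added processor contributes a fixed communication cost per step. Everything here is an assumption made for illustration: the functional form, the name `learning_speed`, and the overhead values are hypothetical and are not taken from the talk or from measurements.

```python
def learning_speed(n, comm_overhead):
    """Relative learning speed with n processors under a toy cost model.

    One unit of compute is split evenly across n processors, and every extra
    processor adds `comm_overhead` units of communication time per step.
    """
    step_time = 1.0 / n + comm_overhead * (n - 1)
    return 1.0 / step_time


for n in (1, 2, 4, 8, 16, 32):
    high = learning_speed(n, comm_overhead=0.01)    # hypothetical slow interconnect
    low = learning_speed(n, comm_overhead=0.0001)   # hypothetical fast interconnect
    print(f"{n:2d} processors: {high:5.2f}x vs {low:5.2f}x")
```

With the larger overhead, speed peaks at a handful of processors and then falls. With the smaller one, it stays close to the processor count across this range.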

Nervana Engine (coming in 2017)
• Unprecedented computing power
• 10x speedup over current GPUs
• More memory on-chip
• High-Bandwidth Memory off-chip
• Six bi-directional high-bandwidth links for 3D torus interconnect
• 8 chips in a box, seamlessly scale to multiple chassis
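In a 3D torus, each node connects to its two neighbors along each of three axes, with wraparound at the edges, which is why six links per chip suffice. The sketch below computes those neighbors. The 4 x 4 x 4 grid size and the function name `torus_neighbors` are hypothetical, since the slide does not give the actual topology dimensions.

```python
def torus_neighbors(node, dims):
    """Return the six neighbors of `node` in a 3D torus of shape `dims`.

    node: (x, y, z) coordinates; dims: (X, Y, Z) grid size.
    Coordinates wrap around, so edge nodes also have six links.
    """
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            coords = list(node)
            coords[axis] = (coords[axis] + step) % dims[axis]
            neighbors.append(tuple(coords))
    return neighbors


# Corner node of a hypothetical 4 x 4 x 4 torus: wraparound links reach index 3.
print(torus_neighbors((0, 0, 0), (4, 4, 4)))
```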

Summary
• Deep learning is a new computational paradigm
• Learning and Inference on data
• neon with state-of-the-art GPU kernels
• Nervana Cloud with multi-GPU training
• Watch for Nervana Engine deep learning processor

1. Deep Learning at Scale, May 2016. Urs Köster, PhD, Nervana. Making machines smarter.

