© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Aran Khanna, AI Engineer, AWS Deep Learning
Deep Learning at the Edge With Apache MXNet
Amazon AI
GRT Intern
What Do These Have in Common?
Deep Neural Networks
Inputs Outputs
…At The Edge
Inputs Outputs
Deep Neural Networks At The Edge
Overview
Motivating Problems in DL at the Edge
Why Apache MXNet
From the Metal To the Models With MXNet
DL at the Edge with AWS
Why The Edge, When We have the Cloud?
Latency
Connectivity
Cost
Privacy/Security
VS.
Motivating Examples
• Real Time Filtering (Neural Style Transfer)
• Industrial IoT (Out of Distribution/Anomaly Detection)
• Robotics (Object Detection and Recognition)
• Autonomous Driving Systems
Amazon AI: Artificial Intelligence In The Hands Of Every Developer
• Services: Rekognition (Vision), Polly (Speech), Lex (Chat)
• Platforms: Amazon ML, Spark & EMR, Kinesis, Batch, ECS
• Engines: MXNet, TensorFlow, Caffe, Theano, PyTorch, CNTK
• Infrastructure: GPU, CPU, IoT, Mobile
Overview
Motivating Problems in DL at the Edge
Why Apache MXNet
From the Metal To the Models With MXNet
DL at the Edge with AWS
Deep Learning Frameworks
Apache MXNet | Differentiators
• Flexible: Mixed Programming API
• Portable: Runs Everywhere
• Performance: Near Linear Scaling
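Apache MXNet | Flexible Programming
IMPERATIVE: NDARRAY API
>>> import mxnet as mx
>>> a = mx.nd.zeros((100, 50))
>>> b = mx.nd.ones((100, 50))
>>> c = a + b
>>> c += 1
>>> print(c)
DECLARATIVE: SYMBOLIC EXECUTOR
>>> import mxnet as mx
>>> net = mx.symbol.Variable('data')
>>> net = mx.symbol.FullyConnected(data=net, num_hidden=128)
>>> net = mx.symbol.SoftmaxOutput(data=net, name='softmax')
>>> texec = mx.module.Module(net)
>>> # Bind input shapes and initialize parameters before running the graph
>>> texec.bind(data_shapes=[('data', (100, 50))],
...            label_shapes=[('softmax_label', (100,))])
>>> texec.init_params()
>>> # c is the (100, 50) NDArray from the imperative example; the labels are placeholders
>>> batch = mx.io.DataBatch(data=[c], label=[mx.nd.zeros((100,))])
>>> texec.forward(batch, is_train=True)
>>> texec.backward()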
Apache MXNet | Efficient Scaling
[Figure: scaling efficiency versus number of GPUs (1 to 256) for Inception v3, ResNet, and AlexNet, compared with ideal linear scaling; annotated 88% efficiency]
Apache MXNet | On Mobile Devices
https://mxnet.incubator.apache.org/how_to/smart_device.html
mxnet.incubator.apache.org/get_started/install.html
Apache MXNet | On IoT Devices
Apache MXNet | Community
• Most Open: Accepted into the Apache Incubator
• Best On AWS: Optimized for deep learning on AWS
Apache MXNet | Community
• Diverse Community: Apple, Tesla, Microsoft, NYU, MIT, Stanford, lots of others
• Outpacing Torch, Theano, CNTK in contributors
• Amazon: 35% of contributions**
[Figure: contributions per contributor, axis from 0 to 40,000, with Amazon contributions highlighted. Contributors shown: Yutian Li (Stanford), Nan Zhu (MSFT), Liang Depeng (Sun Yat-sen U.), Xingjian Shi (HKUST), Tianjun Xiao (Tesla), Chiyuan Zhang (MIT), Yao Wang (AWS), Jian Guo (TuSimple), Yizhi Liu (Mediav), Sandeep K. (AWS), Sergey Kolychev (Whitehat), Eric Xie (AWS), Tianqi Chen (UW), Mu Li (AWS), Bing Su (Apple)]
*As of 3/30/17
**Amazon @35% of Contributions
Apache MXNet | Apple CoreML
pip install mxnet-to-coreml
Apache MXNet | Easy to Get Started
http://gluon.mxnet.io/
Overview
Motivating Problems in DL at the Edge
Why Apache MXNet
From the Metal To the Models With MXNet
DL at the Edge with AWS
What Are the Challenges at the Edge?
The Metal: Heterogeneity
In the Cloud
• X86_64
• CUDA GPU
At the Edge
• X86_64, X86_32, ARM, AArch64, Android, iOS
• OpenCL GPU, CUDA GPU, Metal GPU
• NEON DSP, Hexagon DSP
• Custom Accelerators, FPGA
The Metal: Performance Gap
Low End:
Raspberry Pi 3
- 32 Bit ARMv7
- ARM NEON
- 1GB RAM
High End:
NVIDIA Jetson
- ARM AArch64
- 128 CUDA Cores
- 8GB RAM
The Metal: The Problem
How Can We Adapt Our Models?
The Models: Where is Our Cost?
• Convolutions are expensive
• Models are generally over-parameterized
Cheaper Convolutions: Winograd
Convolution in Time Domain = Pointwise Multiplication in Frequency Domain
Under the Hood in MXNet with integrations in NNPACK, CUDA etc.
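To make the idea of trading multiplications for cheap transforms concrete, here is a minimal plain-Python sketch of the 1D Winograd F(2,3) minimal filtering algorithm. It computes two outputs of a 3-tap filter with 4 multiplications instead of 6. The function names and sample values are hypothetical, and this is only an illustration of the principle, not how MXNet, NNPACK, or CUDA libraries implement it internally.

```python
def direct_conv_f23(d, g):
    """Direct 3-tap filter over 4 inputs: 2 outputs, 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Winograd F(2,3): the same 2 outputs using 4 elementwise multiplications."""
    # Filter transform (can be precomputed once per filter).
    u = [g[0], (g[0] + g[1] + g[2]) / 2.0, (g[0] - g[1] + g[2]) / 2.0, g[2]]
    # Input transform (additions and subtractions only).
    v = [d[0] - d[2], d[1] + d[2], d[2] - d[1], d[1] - d[3]]
    # Elementwise product in the transformed domain: the 4 multiplications.
    m = [ui * vi for ui, vi in zip(u, v)]
    # Output transform (additions and subtractions only).
    return [m[0] + m[1] + m[2], m[1] - m[2] - m[3]]


d = [1.0, 2.0, 3.0, 4.0]   # hypothetical input tile
g = [0.5, -1.0, 2.0]       # hypothetical filter taps
direct = direct_conv_f23(d, g)
fast = winograd_f23(d, g)
```

Both functions return the same values for the same tile; the saving grows when the scheme is nested for 2D tiles, at the cost of extra additions and some numerical error.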
Cheaper Convolutions: Separable Convolutions
Good for devices that can’t run lots of multiplications in parallel
Convolve separately over each depth channel of input
followed by 1x1 convolutions to merge channels
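A rough count of multiplications shows why the depthwise-plus-1x1 split is cheaper. The sketch below compares a standard convolution with a depthwise separable one, assuming stride 1 and "same" padding; the helper names and the layer size are hypothetical and chosen only for illustration.

```python
def conv_mults(h, w, c_in, c_out, k):
    """Multiplications for a standard k x k convolution (stride 1, same padding)."""
    return h * w * c_in * c_out * k * k


def separable_conv_mults(h, w, c_in, c_out, k):
    """Multiplications for a depthwise k x k pass plus a 1x1 pointwise pass."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise


# Hypothetical layer: 56x56 feature map, 64 -> 128 channels, 3x3 kernel.
standard = conv_mults(56, 56, 64, 128, 3)
separable = separable_conv_mults(56, 56, 64, 128, 3)
ratio = separable / standard   # equals 1/c_out + 1/k**2
```

For this layer the separable version needs roughly 12% of the multiplications of the standard convolution.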
Depth Separable Convolutions in MXNet
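>>> import mxnet as mx
>>> # Example hyperparameters (assumed values); depthwise when num_group equals the input channels
>>> num_filter, num_group = 32, 32
>>> kernel, stride, pad = (3, 3), (1, 1), (1, 1)
>>> x = mx.sym.Variable('x')
>>> w = mx.sym.Variable('w')
>>> b = mx.sym.Variable('b')
>>> # Explicit version: slice input, weight, and bias per group, convolve, then concatenate
>>> xslice = mx.sym.SliceChannel(data=x, num_outputs=num_group, axis=1)
>>> wslice = mx.sym.SliceChannel(data=w, num_outputs=num_group, axis=0)
>>> bslice = mx.sym.SliceChannel(data=b, num_outputs=num_group, axis=0)
>>> y_sep = mx.sym.Concat(*[mx.sym.Convolution(data=xslice[i],
...     weight=wslice[i], bias=bslice[i], num_filter=num_filter//num_group,
...     kernel=kernel, stride=stride, pad=pad) for i in range(num_group)])
>>> # Equivalent built-in version: grouped convolution via num_group
>>> y = mx.sym.Convolution(data=x, weight=w, bias=b, num_filter=num_filter,
...     num_group=num_group, kernel=kernel, stride=stride, pad=pad)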
Fewer Parameters: Quantization
Good for devices with hardware to accelerate low precision operations
Map activations into lower bit-width buckets and multiply with quantized weights
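The bucket mapping can be sketched in a few lines of plain Python. This is a simplified affine uint8 scheme with hypothetical helper names; MXNet's own operators may differ in rounding and range handling, so treat it as an illustration of the idea only.

```python
def quantize_uint8(values, lo, hi):
    """Map floats in [lo, hi] to integer buckets 0..255 (values are clipped)."""
    scale = 255.0 / (hi - lo)
    return [int(round((min(max(v, lo), hi) - lo) * scale)) for v in values]


def dequantize_uint8(codes, lo, hi):
    """Map uint8 buckets back to approximate floats in [lo, hi]."""
    step = (hi - lo) / 255.0
    return [lo + c * step for c in codes]


x = [0.1392, 0.5928, 0.6027, 0.8579]
q = quantize_uint8(x, 0.0, 1.0)
x_hat = dequantize_uint8(q, 0.0, 1.0)
max_error = max(abs(a - b) for a, b in zip(x, x_hat))
```

The round-trip error is bounded by half a bucket width, which is the precision given up in exchange for 8-bit storage and arithmetic.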
Quantization in MXNet
>>> import mxnet as mx
>>> # Range of the float data to be quantized
>>> min0 = mx.nd.array([0.0])
>>> max0 = mx.nd.array([1.0])
>>> a = mx.nd.array([[0.1392, 0.5928], [0.6027, 0.8579]])
>>> # Quantize float32 values to uint8, then map them back to float32
>>> quantized_a, min1, max1 = mx.nd.contrib.quantize(a, min0, max0,
...     out_type='uint8')
>>> dequantized_a = mx.nd.contrib.dequantize(quantized_a, min1, max1,
...     out_type='float32')
Fewer Parameters: Weight Pruning
Prune unused weights during training
Good at high sparsity for devices with fast sparse multiplication
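The core step of pruning can be sketched as magnitude thresholding: zero out the smallest weights until a target sparsity is reached. The plain-Python example below shows only that single step with hypothetical weights and a hypothetical helper name; in pruning during training, a step like this would be reapplied as the weights are updated.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


# Hypothetical weights for illustration.
w = [0.02, -0.9, 0.15, -0.01, 0.4, 0.03, -0.07, 1.2, 0.005, -0.3]
w_pruned = prune_by_magnitude(w, 0.8)
achieved_sparsity = w_pruned.count(0.0) / len(w_pruned)
```

At 80% sparsity only the two largest-magnitude weights survive, which is where sparse storage and sparse multiplication start to pay off.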
Weight Pruning in MXNet
>>> # Assume we have defined a model and training data set, and that an
>>> # optimizer has been registered under the name 'sparsesgd'
>>> model.fit(train,
...           eval_data=val,
...           eval_metric='acc',
...           num_epoch=10,
...           optimizer='sparsesgd',
...           optimizer_params={'learning_rate': 0.1,
...                             'wd': 0.004,
...                             'momentum': 0.9,
...                             'pruning_switch_epoch': 5,
...                             'weight_sparsity': 0.8,
...                             'bias_sparsity': 0.0,
...                             })
Weight Pruning in MXNet
Fewer Parameters: Efficient Architectures
SqueezeNet: AlexNet Accuracy with 50x Fewer Parameters
Good for devices with low RAM that can’t hold all weights for larger models concurrently in memory
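One way to see where the parameter savings come from is to count weights in a SqueezeNet-style Fire module (a 1x1 squeeze layer feeding parallel 1x1 and 3x3 expand layers) against a single 3x3 convolution with the same number of outputs. The helper names and layer sizes below are illustrative assumptions, not taken from the slides.

```python
def conv_params(c_in, c_out, k):
    """Weights plus biases for a k x k convolution."""
    return c_in * c_out * k * k + c_out


def fire_params(c_in, squeeze, expand1x1, expand3x3):
    """Parameters in a Fire module: 1x1 squeeze, then parallel 1x1 and 3x3 expands."""
    return (conv_params(c_in, squeeze, 1)
            + conv_params(squeeze, expand1x1, 1)
            + conv_params(squeeze, expand3x3, 3))


# Illustrative sizes: 96 input channels, 16 squeeze filters, 64 + 64 expand filters.
fire = fire_params(96, 16, 64, 64)
plain = conv_params(96, 128, 3)   # a single 3x3 conv with the same 128 outputs
```

With these sizes the Fire module uses roughly a ninth of the parameters of the plain 3x3 layer, because the squeeze layer cuts the number of channels seen by the expensive 3x3 filters.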
Efficient Architectures in MXNet
https://mxnet.incubator.apache.org/model_zoo/
Fewer Parameters: Tensor Decompositions
CVPR paper at arxiv.org/abs/1706.00439
Code at https://github.com/tensorly/tensorly
Table of Model Optimization Techniques
Metric                             Winograd      Separable     Quantization  Tensor        Sparsity      Weight
                                   Convolutions  Convolutions                Contractions  Exploitation  Sharing
CPU Acceleration                   +             ++            =             ++            +             +
GPU Acceleration                   +             +             +             +             =             +
Model Size                         =             =             -             -             -             -
Model Accuracy                     =             -             -             -             -             -
Specialized Hardware Acceleration  +             +             ++            +             +             +
Edge Model Optimization Benefits The Cloud
• Models with fewer parameters often generalize better
• Tricks from the edge can be applied in the cloud
• Pre-processing with edge models decreases compute load in the cloud
Overview
Motivating Problems in DL at the Edge
Why Apache MXNet
From the Metal To the Models With MXNet
DL at the Edge with AWS
The Challenge For Artificial Intelligence: SCALE
• Data: PBs of existing data, new data created on AWS, aggressive migration
• Training: tons of GPUs, elastic capacity, pre-built images
• Prediction: tons of GPUs and CPUs, serverless, at the edge, on IoT devices
AWS Tools for Deep Learning
• P2 instances: up to 40k CUDA cores
• Deep Learning AMI: pre-configured for deep learning
• CFN template: launch a deep learning cluster
AWS Deep Learning AMI: One-Click Deep Learning
• Kepler, Volta & Skylake
• Apache MXNet
• Python 2/3 Notebooks & Examples
• (and others)
https://aws.amazon.com/amazon-ai/amis/
AWS IoT and AWS Greengrass
Manage and Monitor Models on The Fly
[Diagram labels: AWS; Captured Data; Upload Tagged Data; Escalate to AI Service; Escalate to Custom Model on P2; Deploy and Manage Model]
Local Learning Loop
[Diagram labels: Poorly Classified Data; Updated Model; Fine Tune Model With Accurate Classification]
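One possible reading of the escalation step in this loop is a simple confidence check on the edge model's output. The sketch below is an assumption about how such routing could look: the function name, the threshold value, and the sample probabilities are hypothetical, and no AWS IoT or Greengrass API is used.

```python
def route_prediction(probabilities, threshold=0.6):
    """Return (label_index, action) for one edge prediction.

    Predictions whose top probability is below `threshold` are treated as
    poorly classified and flagged for escalation to a cloud model.
    """
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    action = "accept" if probabilities[best] >= threshold else "escalate"
    return best, action


edge_outputs = [
    [0.05, 0.90, 0.05],   # confident: keep the local answer
    [0.40, 0.35, 0.25],   # uncertain: send upstream, keep for fine-tuning
]
decisions = [route_prediction(p) for p in edge_outputs]
```

Escalated samples, once classified accurately upstream, are the natural candidates for fine-tuning the model that is then redeployed to the device.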
Getting Started with MXNet at the Edge + AWS IoT
http://amzn.to/2h6kPvY
Running AI In Production on AWS Today
We’re Hiring!
Thank You!
Aran Khanna – arankhan@amazon.com
GRT Intern

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 

Aran Khanna, Software Engineer, Amazon Web Services at MLconf ATL 2017

  • 1. Deep Learning at the Edge With Apache MXNet. Aran Khanna, AI Engineer, AWS Deep Learning, Amazon AI. GRT Intern. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  • 8. What Do These Have in Common?
  • 11. Deep Neural Networks At The Edge
  • 12. Overview: Motivating Problems in DL at the Edge; Why Apache MXNet; From the Metal To the Models With MXNet; DL at the Edge with AWS
  • 13. Why The Edge, When We Have the Cloud?
  • 14. Why The Edge, When We Have the Cloud? Latency
  • 15. Why The Edge, When We Have the Cloud? Latency, Connectivity
  • 16. Why The Edge, When We Have the Cloud? Latency, Connectivity, Cost
  • 17. Why The Edge, When We Have the Cloud? Latency, Connectivity, Cost, Privacy/Security
  • 18. Motivating Examples: Real Time Filtering (Neural Style Transfer)
  • 19. Motivating Examples: Industrial IoT (Out of Distribution/Anomaly Detection)
  • 20. Motivating Examples: Robotics (Object Detection and Recognition)
  • 22. Amazon AI: Artificial Intelligence In The Hands Of Every Developer. Services: Vision (Rekognition), Speech (Polly), Chat (Lex). Platforms: Amazon ML, Spark & EMR, Kinesis, Batch, ECS. Engines: MXNet, TensorFlow, Caffe, Theano, PyTorch, CNTK. Infrastructure: GPU, CPU, IoT, Mobile.
  • 23. Amazon AI: Artificial Intelligence In The Hands Of Every Developer. Engines: MXNet, TensorFlow, Caffe, Theano, PyTorch, CNTK. Infrastructure: GPU, CPU, IoT, Mobile.
  • 24. Overview: Motivating Problems in DL at the Edge; Why Apache MXNet; From the Metal To the Models With MXNet; DL at the Edge with AWS
  • 26. Apache MXNet | Differentiators: Flexible (Mixed Programming API), Portable (Runs Everywhere), Performance (Near Linear Scaling)
  • 27. Apache MXNet | Differentiators: Flexible (Mixed Programming API), Portable (Runs Everywhere), Performance (Near Linear Scaling)
  • 28. Apache MXNet | Flexible Programming
      IMPERATIVE NDARRAY API
        >>> import mxnet as mx
        >>> a = mx.nd.zeros((100, 50))
        >>> b = mx.nd.ones((100, 50))
        >>> c = a + b
        >>> c += 1
        >>> print(c)
      DECLARATIVE SYMBOLIC EXECUTOR
        >>> import mxnet as mx
        >>> net = mx.symbol.Variable('data')
        >>> net = mx.symbol.FullyConnected(data=net, num_hidden=128)
        >>> net = mx.symbol.SoftmaxOutput(data=net)
        >>> texec = mx.module.Module(net)
        >>> texec.forward(data=c)
        >>> texec.backward()
  • 29. Apache MXNet | Differentiators: Flexible (Mixed Programming API), Portable (Runs Everywhere), Performance (Near Linear Scaling)
  • 30. Apache MXNet | Efficient Scaling. [Plot: scaling of Inception v3, ResNet, and AlexNet compared with ideal; x-axis: No. of GPUs, 1 to 256; 88% efficiency]
  • 31. Apache MXNet | Differentiators: Flexible (Mixed Programming API), Portable (Runs Everywhere), Performance (Near Linear Scaling)
  • 32. Apache MXNet | On Mobile Devices: https://mxnet.incubator.apache.org/how_to/smart_device.html
  • 34. Apache MXNet | Community. Most Open: Accepted into the Apache Incubator. Best On AWS: Optimized for deep learning on AWS.
  • 35. Apache MXNet | Community. Diverse Community: Apple, Tesla, Microsoft, NYU, MIT, Stanford, lots of others. Outpacing: Torch, Theano, CNTK. Amazon at 35% of contributions (as of 3/30/17). Contributors: Yutian Li (Stanford), Nan Zhu (MSFT), Liang Depeng (Sun Yat-sen U.), Xingjian Shi (HKUST), Tianjun Xiao (Tesla), Chiyuan Zhang (MIT), Yao Wang (AWS), Jian Guo (TuSimple), Yizhi Liu (Mediav), Sandeep K. (AWS), Sergey Kolychev (Whitehat), Eric Xie (AWS), Tianqi Chen (UW), Mu Li (AWS), Bing Su (Apple).
  • 36. Apache MXNet | Apple CoreML: pip install mxnet-to-coreml
  • 37. Apache MXNet | Easy to Get Started: http://gluon.mxnet.io/
  • 38. Overview: Motivating Problems in DL at the Edge; Why Apache MXNet; From the Metal To the Models With MXNet; DL at the Edge with AWS
  • 39. What Are the Challenges at the Edge?
  • 40. The Metal: Heterogeneity. In the Cloud: x86_64; CUDA GPU.
  • 41. The Metal: Heterogeneity. In the Cloud: x86_64; CUDA GPU. At the Edge: x86_64, x86_32, ARM, AArch64, Android, iOS; OpenCL GPU, CUDA GPU, Metal GPU; NEON DSP, Hexagon DSP; Custom Accelerators, FPGA.
  • 42. The Metal: Performance Gap. Low End: Raspberry Pi 3 (32-bit ARMv7, ARM NEON, 1GB RAM). High End: NVIDIA Jetson (ARM AArch64, 128 CUDA Cores, 8GB RAM).
  • 43. The Metal: The Problem
  • 44. How Can We Adapt Our Models?
  • 45. The Models: Where is Our Cost? Convolutions are expensive
  • 46. The Models: Where is Our Cost? Models are generally over parameterized
  • 47. Cheaper Convolutions: Winograd. Convolution in Time Domain = Pointwise Multiplication in Frequency Domain. Under the hood in MXNet, with integrations in NNPACK, CUDA etc.
  • 48. Cheaper Convolutions: Separable Convolutions. Good for devices that can’t run lots of multiplications in parallel. Convolve separately over each depth channel of the input, followed by 1x1 convolutions to merge channels.
  • 49. Depth Separable Convolutions in MXNet
        >>> import mxnet as mx
        >>> # num_group, num_filter, kernel, stride and pad are assumed to be defined beforehand
        >>> x = mx.sym.Variable('x')
        >>> w = mx.sym.Variable('w')
        >>> b = mx.sym.Variable('b')
        >>> # Explicit form: slice input, weights and bias into groups, convolve each group, concatenate
        >>> xslice = mx.sym.SliceChannel(data=x, num_outputs=num_group, axis=1)
        >>> wslice = mx.sym.SliceChannel(data=w, num_outputs=num_group, axis=0)
        >>> bslice = mx.sym.SliceChannel(data=b, num_outputs=num_group, axis=0)
        >>> y_sep = mx.sym.Concat(*[mx.sym.Convolution(data=xslice[i], weight=wslice[i], bias=bslice[i],
        ...                                            num_filter=num_filter//num_group, kernel=kernel,
        ...                                            stride=stride, pad=pad)
        ...                         for i in range(num_group)])
        >>> # Built-in form: a single grouped convolution via num_group
        >>> y = mx.sym.Convolution(data=x, weight=w, bias=b, num_filter=num_filter, num_group=num_group,
        ...                        kernel=kernel, stride=stride, pad=pad)
  • 50. Fewer Parameters: Quantization. Good for devices with hardware to accelerate low precision operations. Map activations into lower bit-width buckets and multiply with quantized weights.
  • 51. Quantization in MXNet
        >>> import mxnet as mx
        >>> min0 = mx.nd.array([0.0])
        >>> max0 = mx.nd.array([1.0])
        >>> a = mx.nd.array([[0.1392, 0.5928], [0.6027, 0.8579]])
        >>> quantized_a, min1, max1 = mx.nd.contrib.quantize(a, min0, max0, out_type='uint8')
        >>> dequantized_a = mx.nd.contrib.dequantize(quantized_a, min1, max1, out_type='float32')
  • 52. Fewer Parameters: Weight Pruning. Prune unused weights during training. Good at high sparsity for devices with fast sparse multiplication.
  • 53. Weight Pruning in MXNet
        >>> # Assume we have defined a model and training data set
        >>> model.fit(train,
        ...           eval_data=val,
        ...           eval_metric='acc',
        ...           num_epoch=10,
        ...           optimizer='sparsesgd',
        ...           optimizer_params={'learning_rate': 0.1,
        ...                             'wd': 0.004,
        ...                             'momentum': 0.9,
        ...                             'pruning_switch_epoch': 5,
        ...                             'weight_sparsity': 0.8,
        ...                             'bias_sparsity': 0.0})
  • 55. Fewer Parameters: Efficient Architectures. SqueezeNet: AlexNet accuracy with 50x fewer parameters. Good for devices with low RAM that can’t hold all weights for larger models concurrently in memory.
  • 56. Efficient Architectures in MXNet: https://mxnet.incubator.apache.org/model_zoo/
  • 57. Fewer Parameters: Tensor Decompositions. CVPR paper at arxiv.org/abs/1706.00439. Code at https://github.com/tensorly/tensorly
  • 58. Table of Model Optimization Techniques
        Technique               | CPU Accel. | GPU Accel. | Model Size | Model Accuracy | Specialized HW Accel.
        Winograd Convolutions   | +          | +          | =          | =              | +
        Separable Convolutions  | ++         | +          | =          | -              | +
        Quantization            | =          | +          | -          | -              | ++
        Tensor Contractions     | ++         | +          | -          | -              | +
        Sparsity Exploitation   | +          | =          | -          | -              | +
        Weight Sharing          | +          | +          | -          | -              | +
  • 59. Edge Model Optimization Benefits The Cloud: Models with fewer parameters often generalize better. Tricks from the edge can be applied in the cloud. Pre-processing with edge models decreases compute load in the cloud.
  • 60. Overview: Motivating Problems in DL at the Edge; Why Apache MXNet; From the Metal To the Models With MXNet; DL at the Edge with AWS
  • 61. The Challenge For Artificial Intelligence: SCALE. Data: PBs of existing data; new data created on AWS; aggressive migration. Training: tons of GPUs; elastic capacity; pre-built images. Prediction: tons of GPUs and CPUs; serverless; at the edge, on IoT devices.
  • 62. AWS Tools for Deep Learning. p2 instances: up to 40k CUDA cores. Deep Learning AMI: pre-configured for deep learning. CFN Template: launch a deep learning cluster.
  • 63. AWS Deep Learning AMI: One-Click Deep Learning. Kepler, Volta & Skylake; Apache MXNet; Python 2/3; Notebooks & Examples (and others)
  • 66. AWS IoT and AWS Greengrass
  • 67. Manage and Monitor Models on The Fly (AWS): Captured Data; Upload Tagged Data; Escalate to AI Service; Escalate to Custom Model on P2; Deploy and Manage Model
  • 68. Local Learning Loop: Poorly Classified Data; Updated Model; Fine Tune Model With Accurate Classification
  • 69. Getting Started with MXNet at the Edge + AWS IoT: http://amzn.to/2h6kPvY
  • 70. Running AI In Production on AWS Today
  • 72. Thank You! Aran Khanna – arankhan@amazon.com GRT Intern
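
Slides 50 and 51 describe quantization as mapping values into lower bit-width buckets. A minimal plain-Python sketch of that idea follows, assuming simple linear (min/max) quantization to 8-bit codes. The helper names `quantize_uint8` and `dequantize_uint8` and the rounding rule are assumptions made for illustration; they are not MXNet functions and may not match what `mx.nd.contrib.quantize` does internally.

```python
# Illustrative sketch only: linear (min/max) quantization to 8-bit codes.
# Helper names and rounding behaviour are assumptions, not MXNet's implementation.

def quantize_uint8(values, min_range, max_range):
    """Map floats in [min_range, max_range] onto integer codes 0..255."""
    scale = 255.0 / (max_range - min_range)
    codes = []
    for v in values:
        q = round((v - min_range) * scale)
        codes.append(min(255, max(0, q)))  # clamp inputs outside the range
    return codes


def dequantize_uint8(codes, min_range, max_range):
    """Map integer codes 0..255 back to approximate float values."""
    step = (max_range - min_range) / 255.0
    return [min_range + q * step for q in codes]


values = [0.1392, 0.5928, 0.6027, 0.8579]  # the numbers shown on slide 51
codes = quantize_uint8(values, 0.0, 1.0)
restored = dequantize_uint8(codes, 0.0, 1.0)
max_error = max(abs(v - r) for v, r in zip(values, restored))
print(codes, max_error)
```

Under this scheme each value is stored in one byte instead of four, and the round-trip error stays within half a quantization step, which is the trade-off the "Model Size" and "Model Accuracy" rows of the slide 58 table point at.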

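Slide 48 describes separable convolutions as a per-channel convolution followed by 1x1 convolutions that merge channels. The arithmetic below is a rough sketch of why that is cheaper, counting weights only and ignoring biases. The function names and the layer size (3x3 kernel, 64 input channels, 128 output channels) are hypothetical values chosen for illustration and do not come from the talk.

```python
# Illustrative arithmetic only: weight counts for a standard convolution versus
# a depthwise-separable one. Biases are ignored; the layer sizes are hypothetical.

def standard_conv_params(kernel, in_channels, out_channels):
    """Every output channel has its own kernel x kernel filter per input channel."""
    return kernel * kernel * in_channels * out_channels


def separable_conv_params(kernel, in_channels, out_channels):
    """One kernel x kernel filter per input channel, then a 1x1 channel-merging conv."""
    depthwise = kernel * kernel * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


standard = standard_conv_params(3, 64, 128)
separable = separable_conv_params(3, 64, 128)
ratio = standard / separable
print(standard, separable, ratio)
```

For this hypothetical layer the separable form needs roughly an eighth of the weights, and the number of multiplications per output position shrinks by the same factor.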
Editor's Notes

  1. Evenly distribute the future; just the same as the original AWS goal.
  2. For computational efficiency, neurons are arranged in layers.
  3. For computational efficiency, neurons are arranged in layers.
  4. For computational efficiency, neurons are arranged in layers.
  5. duolingo story
  6. duolingo story
  7. Hard to define the network: the definition of the Inception network has >1k lines of code in Caffe. Memory consumption is linear with the number of layers.
  8. duolingo story