SlideShare a Scribd company logo
1 of 20
Download to read offline
Accelerating algorithmic and
hardware advancements for
power efficient on-device AI
Qualcomm Technologies, Inc.
@qualcomm_techJuly 2018
2
By 2025, the data center sector could
be using 20% of all available electricity
in the world1
A cloud provider used the equivalent
energy consumption of ~366,000 US
households in 20142
Bitcoin mining in 2017 used the same
energy as did all of Ireland3
Computers
are consuming
an increasing
amount of energy
1. Andrae, Anders (2017) Total Consumer Power
Consumption Forecast; 2. The Verge (2014); 3.The Guardian (2017)
Given the economic potential
of AI, these numbers will
only be increasing
33
Will we have reached the capacity of the human brain?
Energy efficiency of a brain is 100x better than current hardware
Source: Welling
2025
Energyconsumption
1940 1950 1960 1970 1980 1990 2000 2010 2020 2030
1943: First NN (+/- N=10)
1988: NetTalk
(+/- N=20K)
2009: Hinton’s Deep
Belief Net (+/- N=10M)
2013: Google/Y!
(N=+/- 1B)
2025:
N = 100T = 1014
2017: Extremely large
neural networks (N =137B)
1012
1010
108
106
1014
104
102
100
Deep neural networks
are energy hungry
and growing fast
AI is being powered by the explosive
growth of deep neural networks
44
Value created by AI must exceed the cost to run the service
Broad economic viability requires energy efficient AI
Economic feasibility per transaction may
require cost as low as a micro-dollar
(1/10,000th of a cent)
• Personalized advertisements and recommendations
• Smart security monitoring based on image and sound recognition
• Efficiency improvements for smart cities and factories
5
The AI power and thermal ceiling
The challenge
of AI workloads
Very compute intensive
Complex concurrencies
Real-time
Always-on
Constrained
mobile environment
Must be thermally efficient
for sleek, ultra-light designs
Requires long battery
life for all-day use
Storage/memory
bandwidth limitations
66
Soon, AI algorithms will be
measured by the amount
of intelligence they provide
per watt hour.
77
Weight parameter
Activation node
Edges
Parts Objects
OutputPixels
Dog?
Cat?
Deep learning
The good
Convolutional
neural networks
(CNNs) have been
very successful
• Extract learnable features
with state-of-the art results
• Encode location invariance,
namely that the same object
may appear anywhere in
the image
• Share parameters, making
them “data efficient”
• Execute quickly on modern
hardware with parallel
processing
88
Weight parameter
Activation node
Edges
Parts Objects
OutputPixels
Dog?
Cat?
Deep learning
The bad and the ugly
CNNs use too much
memory, compute,
and energy (today)
• CNNs do not encode
additional symmetries, such
as rotation invariance
(object may appear in any
orientation1)
• CNNs do not reliably
quantify the confidence in a
prediction
• CNNs are easy to fool by
changing the input only
slightly, such as adversarial
examples
1. Cohen & Welling 2016
99
Edges
Parts Objects
OutputPixels
Dog?
Cat?
Bayesian deep-learning
addresses these challenges
Inspired by brain functionality, introducing noise
to neural networks is beneficial
Noise can be a
good thing for AI
Compression
and quantization
• Reduce complexity
of the neural network model
• Reduce bit-width of the
parameters and activations
• Save power and
improve efficiency
Introduce noise to weights Noise propagates to activations
1010
Edges
Parts Objects
OutputPixels
Dog
Cat
w1
w2 (Y)
w2
w1 Initial weight values Apply Bayesian
deep-learning to
shrink the model
Compression
and quantization
• Quantize weights:
Use lower precision (bit-width)
• Prune activations:
Reduce number of activation
nodes
1111
Edges
Parts Objects
OutputPixels
Introduce noise to parameters
Dog
Cat
w1
w2 (Y)
Prior P(W)—distribution before data (X)
w2
w1 Initial weight values Apply Bayesian
deep-learning to
shrink the model
Compression
and quantization
• Quantize weights:
Use lower precision (bit-width)
• Prune activations:
Reduce number of activation
nodes
1212
Edges
Parts Objects
OutputPixels
Introduce noise to parameters
Add data (X)
Dog
Cat
w1
w2 (Y)
Prior P(W)—distribution before data (X)
w2
Posterior P(W| X,Y)—
distribution after data
w1 Initial weight values Apply Bayesian
deep-learning to
shrink the model
Compression
and quantization
• Quantize weights:
Use lower precision (bit-width)
• Prune activations:
Reduce number of activation
nodes
1313
Edges
Parts Objects
OutputPixels
Introduce noise to parameters
Add data (X)
Dog
Cat
w1
w2
Prune activations
(similar method as
quantization)
X
Quantize weights
(or even remove)
(Y)
Prior P(W)—distribution before data (X)
w2
Posterior P(W| X,Y)—
distribution after data
w1 is pinned down
w2 is still highly uncertain
w1 Initial weight values Apply Bayesian
deep-learning to
shrink the model
Compression
and quantization
• Quantize weights:
Use lower precision (bit-width)
• Prune activations:
Reduce number of activation
nodes
14
Bayesian deep learning provides broad benefits
A powerful tool to address a variety of deep learning challenges
Compression
and quantization
Quantize parameters and
activations, prune model
components
Regularization
and generalization
Avoid overfitting data; choose
the simplest model to explain
observations (Occam’s razor)
Confidence
estimation
Generate the confidence
intervals of the predictions
Privacy/adversarial
robustness
Avoid storing personal
information in parameters, be
less sensitive to adversarial
attacks
1515
83%
84%
85%
86%
87%
88%
89%
90%
1 1.5 2 2.5 3 3.5
Top5Accuracy
Compression ratio
ResNet-18 on a large set of labeled images
Baseline Channel pruning SVD Bayesian pruning
Applying Bayesian deep learning to real use cases
Image classification
Zhang, Zou, He, & Sun 2015 (SVD);
He, Zhang, & Sun 2017 (channel pruning)
compression ratio while maintaining close to the same accuracy3X
16
What does the
future look like
for AI hardware?
Distributed computing
architectures
AI acceleration
research
1717
AI acceleration research
Balance AI acceleration capabilities across engines (CPU, GPU, DSP)
Focused on fundamentals to accelerate deep learning workloads at low power
Memory hierarchy
Optimize the memory hierarchy to reduce
the power consumption of data movement
while ensuring performance
Compute architecture
Optimize instruction type, parallelism,
and precision of the functional units
Utilization
Exploit sparsity to improve utilization and reduce power consumption
18
Algorithmic
advancements
Neural network algorithm design
optimized for hardware
Optimization for space and
runtime efficiency
Efficient
hardware
Efficient architecture design
Selecting the right compute
engine for the right task
Software
tools
Software-accelerated
run-time for deep learning
Neural processing SDK for
model compression and
optimization
The approach
for making
power efficient
on-device AI
Focusing on the joint
design of algorithms and
hardware to achieve
high performance
19
AI algorithms and hardware
need to be energy efficient
We are a leader in Bayesian
deep learning and its
applications to model
compression and quantization
We are doing fundamental
research on AI algorithms,
software, and hardware to
maximize power efficiency
Follow us on:
For more information, visit us at:
www.qualcomm.com & www.qualcomm.com/blog
Thank you
Nothing in these materials is an offer to sell any of the
components or devices referenced herein.
©2018 Qualcomm Technologies, Inc. and/or its affiliated
companies. All Rights Reserved.
Qualcomm is a trademark of Qualcomm Incorporated,
registered in the United States and other countries. Other
products and brand names may be trademarks or registered
trademarks of their respective owners.
References in this presentation to “Qualcomm” may mean Qualcomm
Incorporated, Qualcomm Technologies, Inc., and/or other subsidiaries
or business units within the Qualcomm corporate structure, as
applicable. Qualcomm Incorporated includes Qualcomm’s licensing
business, QTL, and the vast majority of its patent portfolio. Qualcomm
Technologies, Inc., a wholly-owned subsidiary of Qualcomm
Incorporated, operates, along with its subsidiaries, substantially all of
Qualcomm’s engineering, research and development functions, and
substantially all of its product and services businesses, including its
semiconductor business, QCT.

More Related Content

What's hot

Pushing the boundaries of AI research
Pushing the boundaries of AI researchPushing the boundaries of AI research
Pushing the boundaries of AI researchQualcomm Research
 
Enabling on-device learning at scale
Enabling on-device learning at scaleEnabling on-device learning at scale
Enabling on-device learning at scaleQualcomm Research
 
White Box Hardware Challenges in the 5G & IoT Hyperconnected Era
White Box Hardware Challenges in the 5G & IoT Hyperconnected EraWhite Box Hardware Challenges in the 5G & IoT Hyperconnected Era
White Box Hardware Challenges in the 5G & IoT Hyperconnected EraCharo Sanchez
 
“5G and AI Transforming the Next Generation of Robotics,” a Presentation from...
“5G and AI Transforming the Next Generation of Robotics,” a Presentation from...“5G and AI Transforming the Next Generation of Robotics,” a Presentation from...
“5G and AI Transforming the Next Generation of Robotics,” a Presentation from...Edge AI and Vision Alliance
 
On-device Motion Tracking for Immersive VR
On-device Motion Tracking for Immersive VROn-device Motion Tracking for Immersive VR
On-device Motion Tracking for Immersive VRQualcomm Research
 
5G positioning for the connected intelligent edge
5G positioning for the connected intelligent edge5G positioning for the connected intelligent edge
5G positioning for the connected intelligent edgeQualcomm Research
 
Tim Leland (Qualcomm): The Mobile Future of Extended Reality (XR)
Tim Leland (Qualcomm): The Mobile Future of Extended Reality (XR)Tim Leland (Qualcomm): The Mobile Future of Extended Reality (XR)
Tim Leland (Qualcomm): The Mobile Future of Extended Reality (XR)AugmentedWorldExpo
 
AI firsts: Leading from research to proof-of-concept
AI firsts: Leading from research to proof-of-conceptAI firsts: Leading from research to proof-of-concept
AI firsts: Leading from research to proof-of-conceptQualcomm Research
 
Forwarding Plane Opportunities: How to Accelerate Deployment
Forwarding Plane Opportunities: How to Accelerate DeploymentForwarding Plane Opportunities: How to Accelerate Deployment
Forwarding Plane Opportunities: How to Accelerate DeploymentCharo Sanchez
 
Role of localization and environment perception in autonomous driving
Role of localization and environment perception in autonomous drivingRole of localization and environment perception in autonomous driving
Role of localization and environment perception in autonomous drivingQualcomm Research
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networksinside-BigData.com
 
5G AI the Ingredients for Next Gen Wireless Innovation
5G AI the Ingredients for Next Gen Wireless Innovation5G AI the Ingredients for Next Gen Wireless Innovation
5G AI the Ingredients for Next Gen Wireless InnovationTakayuki Yamazaki
 
Artificial Intelligence in the Network
Artificial Intelligence in the Network Artificial Intelligence in the Network
Artificial Intelligence in the Network Michelle Holley
 
5G Multi-Access Edge Compute
5G Multi-Access Edge Compute5G Multi-Access Edge Compute
5G Multi-Access Edge ComputeMichelle Holley
 
"Image Sensor Formats and Interfaces for IoT Applications," a Presentation fr...
"Image Sensor Formats and Interfaces for IoT Applications," a Presentation fr..."Image Sensor Formats and Interfaces for IoT Applications," a Presentation fr...
"Image Sensor Formats and Interfaces for IoT Applications," a Presentation fr...Edge AI and Vision Alliance
 
“Visual AI at the Edge: From Surveillance Cameras to People Counters,” a Pres...
“Visual AI at the Edge: From Surveillance Cameras to People Counters,” a Pres...“Visual AI at the Edge: From Surveillance Cameras to People Counters,” a Pres...
“Visual AI at the Edge: From Surveillance Cameras to People Counters,” a Pres...Edge AI and Vision Alliance
 
latencyin fiber optic networks
latencyin fiber optic networkslatencyin fiber optic networks
latencyin fiber optic networksMapYourTech
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficientlyinside-BigData.com
 
Why SDN and ON.Lab are hot topics in networking
Why SDN and ON.Lab are hot topics in networkingWhy SDN and ON.Lab are hot topics in networking
Why SDN and ON.Lab are hot topics in networkingON.Lab
 

What's hot (20)

Pushing the boundaries of AI research
Pushing the boundaries of AI researchPushing the boundaries of AI research
Pushing the boundaries of AI research
 
Enabling on-device learning at scale
Enabling on-device learning at scaleEnabling on-device learning at scale
Enabling on-device learning at scale
 
White Box Hardware Challenges in the 5G & IoT Hyperconnected Era
White Box Hardware Challenges in the 5G & IoT Hyperconnected EraWhite Box Hardware Challenges in the 5G & IoT Hyperconnected Era
White Box Hardware Challenges in the 5G & IoT Hyperconnected Era
 
“5G and AI Transforming the Next Generation of Robotics,” a Presentation from...
“5G and AI Transforming the Next Generation of Robotics,” a Presentation from...“5G and AI Transforming the Next Generation of Robotics,” a Presentation from...
“5G and AI Transforming the Next Generation of Robotics,” a Presentation from...
 
On-device Motion Tracking for Immersive VR
On-device Motion Tracking for Immersive VROn-device Motion Tracking for Immersive VR
On-device Motion Tracking for Immersive VR
 
5G positioning for the connected intelligent edge
5G positioning for the connected intelligent edge5G positioning for the connected intelligent edge
5G positioning for the connected intelligent edge
 
Tim Leland (Qualcomm): The Mobile Future of Extended Reality (XR)
Tim Leland (Qualcomm): The Mobile Future of Extended Reality (XR)Tim Leland (Qualcomm): The Mobile Future of Extended Reality (XR)
Tim Leland (Qualcomm): The Mobile Future of Extended Reality (XR)
 
AI firsts: Leading from research to proof-of-concept
AI firsts: Leading from research to proof-of-conceptAI firsts: Leading from research to proof-of-concept
AI firsts: Leading from research to proof-of-concept
 
Forwarding Plane Opportunities: How to Accelerate Deployment
Forwarding Plane Opportunities: How to Accelerate DeploymentForwarding Plane Opportunities: How to Accelerate Deployment
Forwarding Plane Opportunities: How to Accelerate Deployment
 
Role of localization and environment perception in autonomous driving
Role of localization and environment perception in autonomous drivingRole of localization and environment perception in autonomous driving
Role of localization and environment perception in autonomous driving
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networks
 
5G AI the Ingredients for Next Gen Wireless Innovation
5G AI the Ingredients for Next Gen Wireless Innovation5G AI the Ingredients for Next Gen Wireless Innovation
5G AI the Ingredients for Next Gen Wireless Innovation
 
FPGAs and Machine Learning
FPGAs and Machine LearningFPGAs and Machine Learning
FPGAs and Machine Learning
 
Artificial Intelligence in the Network
Artificial Intelligence in the Network Artificial Intelligence in the Network
Artificial Intelligence in the Network
 
5G Multi-Access Edge Compute
5G Multi-Access Edge Compute5G Multi-Access Edge Compute
5G Multi-Access Edge Compute
 
"Image Sensor Formats and Interfaces for IoT Applications," a Presentation fr...
"Image Sensor Formats and Interfaces for IoT Applications," a Presentation fr..."Image Sensor Formats and Interfaces for IoT Applications," a Presentation fr...
"Image Sensor Formats and Interfaces for IoT Applications," a Presentation fr...
 
“Visual AI at the Edge: From Surveillance Cameras to People Counters,” a Pres...
“Visual AI at the Edge: From Surveillance Cameras to People Counters,” a Pres...“Visual AI at the Edge: From Surveillance Cameras to People Counters,” a Pres...
“Visual AI at the Edge: From Surveillance Cameras to People Counters,” a Pres...
 
latencyin fiber optic networks
latencyin fiber optic networkslatencyin fiber optic networks
latencyin fiber optic networks
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
 
Why SDN and ON.Lab are hot topics in networking
Why SDN and ON.Lab are hot topics in networkingWhy SDN and ON.Lab are hot topics in networking
Why SDN and ON.Lab are hot topics in networking
 

Similar to Accelerating algorithmic and hardware advancements for power efficient on-device AI

Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...
Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...
Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...Codemotion
 
AIoT: Intelligence on Microcontroller
AIoT: Intelligence on MicrocontrollerAIoT: Intelligence on Microcontroller
AIoT: Intelligence on MicrocontrollerAndri Yadi
 
Graph Hardware Architecture - Enterprise graphs deserve great hardware!
Graph Hardware Architecture - Enterprise graphs deserve great hardware!Graph Hardware Architecture - Enterprise graphs deserve great hardware!
Graph Hardware Architecture - Enterprise graphs deserve great hardware!TigerGraph
 
AWS Summit Berlin 2013 - Big Data Analytics
AWS Summit Berlin 2013 - Big Data AnalyticsAWS Summit Berlin 2013 - Big Data Analytics
AWS Summit Berlin 2013 - Big Data AnalyticsAWS Germany
 
FIWARE Global Summit - Advanced ML/AI Techniques with FIWARE and Connected Io...
FIWARE Global Summit - Advanced ML/AI Techniques with FIWARE and Connected Io...FIWARE Global Summit - Advanced ML/AI Techniques with FIWARE and Connected Io...
FIWARE Global Summit - Advanced ML/AI Techniques with FIWARE and Connected Io...FIWARE
 
“High-fidelity Conversion of Floating-point Networks for Low-precision Infere...
“High-fidelity Conversion of Floating-point Networks for Low-precision Infere...“High-fidelity Conversion of Floating-point Networks for Low-precision Infere...
“High-fidelity Conversion of Floating-point Networks for Low-precision Infere...Edge AI and Vision Alliance
 
“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...
“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...
“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...Edge AI and Vision Alliance
 
"Enabling Ubiquitous Visual Intelligence Through Deep Learning," a Keynote Pr...
"Enabling Ubiquitous Visual Intelligence Through Deep Learning," a Keynote Pr..."Enabling Ubiquitous Visual Intelligence Through Deep Learning," a Keynote Pr...
"Enabling Ubiquitous Visual Intelligence Through Deep Learning," a Keynote Pr...Edge AI and Vision Alliance
 
Operationalizing SDN
Operationalizing SDNOperationalizing SDN
Operationalizing SDNADVA
 
AI for Manufacturing (Machine Vision, Edge AI, Federated Learning)
AI for Manufacturing (Machine Vision, Edge AI, Federated Learning)AI for Manufacturing (Machine Vision, Edge AI, Federated Learning)
AI for Manufacturing (Machine Vision, Edge AI, Federated Learning)byteLAKE
 
Implementing AI: Hardware Challenges
Implementing AI: Hardware ChallengesImplementing AI: Hardware Challenges
Implementing AI: Hardware ChallengesKTN
 
IBM Special Announcement session Intel #IDF2013 September 10, 2013
IBM Special Announcement session Intel #IDF2013 September 10, 2013IBM Special Announcement session Intel #IDF2013 September 10, 2013
IBM Special Announcement session Intel #IDF2013 September 10, 2013Cliff Kinard
 
The deep learning tour - Q1 2017
The deep learning tour - Q1 2017 The deep learning tour - Q1 2017
The deep learning tour - Q1 2017 Eran Shlomo
 
Aplications for machine learning in IoT
Aplications for machine learning in IoTAplications for machine learning in IoT
Aplications for machine learning in IoTYashesh Shroff
 
Meetup 18/10/2018 - Artificiële intelligentie en mobiliteit
Meetup 18/10/2018 - Artificiële intelligentie en mobiliteitMeetup 18/10/2018 - Artificiële intelligentie en mobiliteit
Meetup 18/10/2018 - Artificiële intelligentie en mobiliteitDigipolis Antwerpen
 
CloudCamp Chicago - Big Data & Cloud May 2015 - All Slides
CloudCamp Chicago - Big Data & Cloud May 2015 - All SlidesCloudCamp Chicago - Big Data & Cloud May 2015 - All Slides
CloudCamp Chicago - Big Data & Cloud May 2015 - All SlidesCloudCamp Chicago
 

Similar to Accelerating algorithmic and hardware advancements for power efficient on-device AI (20)

On-Device AI
On-Device AIOn-Device AI
On-Device AI
 
Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...
Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...
Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam...
 
AIoT: Intelligence on Microcontroller
AIoT: Intelligence on MicrocontrollerAIoT: Intelligence on Microcontroller
AIoT: Intelligence on Microcontroller
 
Graph Hardware Architecture - Enterprise graphs deserve great hardware!
Graph Hardware Architecture - Enterprise graphs deserve great hardware!Graph Hardware Architecture - Enterprise graphs deserve great hardware!
Graph Hardware Architecture - Enterprise graphs deserve great hardware!
 
AI and Deep Learning
AI and Deep Learning AI and Deep Learning
AI and Deep Learning
 
AWS Summit Berlin 2013 - Big Data Analytics
AWS Summit Berlin 2013 - Big Data AnalyticsAWS Summit Berlin 2013 - Big Data Analytics
AWS Summit Berlin 2013 - Big Data Analytics
 
FIWARE Global Summit - Advanced ML/AI Techniques with FIWARE and Connected Io...
FIWARE Global Summit - Advanced ML/AI Techniques with FIWARE and Connected Io...FIWARE Global Summit - Advanced ML/AI Techniques with FIWARE and Connected Io...
FIWARE Global Summit - Advanced ML/AI Techniques with FIWARE and Connected Io...
 
“High-fidelity Conversion of Floating-point Networks for Low-precision Infere...
“High-fidelity Conversion of Floating-point Networks for Low-precision Infere...“High-fidelity Conversion of Floating-point Networks for Low-precision Infere...
“High-fidelity Conversion of Floating-point Networks for Low-precision Infere...
 
Future of hpc
Future of hpcFuture of hpc
Future of hpc
 
“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...
“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...
“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...
 
"Enabling Ubiquitous Visual Intelligence Through Deep Learning," a Keynote Pr...
"Enabling Ubiquitous Visual Intelligence Through Deep Learning," a Keynote Pr..."Enabling Ubiquitous Visual Intelligence Through Deep Learning," a Keynote Pr...
"Enabling Ubiquitous Visual Intelligence Through Deep Learning," a Keynote Pr...
 
Operationalizing SDN
Operationalizing SDNOperationalizing SDN
Operationalizing SDN
 
AI for Manufacturing (Machine Vision, Edge AI, Federated Learning)
AI for Manufacturing (Machine Vision, Edge AI, Federated Learning)AI for Manufacturing (Machine Vision, Edge AI, Federated Learning)
AI for Manufacturing (Machine Vision, Edge AI, Federated Learning)
 
Implementing AI: Hardware Challenges
Implementing AI: Hardware ChallengesImplementing AI: Hardware Challenges
Implementing AI: Hardware Challenges
 
IBM Special Announcement session Intel #IDF2013 September 10, 2013
IBM Special Announcement session Intel #IDF2013 September 10, 2013IBM Special Announcement session Intel #IDF2013 September 10, 2013
IBM Special Announcement session Intel #IDF2013 September 10, 2013
 
The deep learning tour - Q1 2017
The deep learning tour - Q1 2017 The deep learning tour - Q1 2017
The deep learning tour - Q1 2017
 
LEGaTO: Use cases
LEGaTO: Use casesLEGaTO: Use cases
LEGaTO: Use cases
 
Aplications for machine learning in IoT
Aplications for machine learning in IoTAplications for machine learning in IoT
Aplications for machine learning in IoT
 
Meetup 18/10/2018 - Artificiële intelligentie en mobiliteit
Meetup 18/10/2018 - Artificiële intelligentie en mobiliteitMeetup 18/10/2018 - Artificiële intelligentie en mobiliteit
Meetup 18/10/2018 - Artificiële intelligentie en mobiliteit
 
CloudCamp Chicago - Big Data & Cloud May 2015 - All Slides
CloudCamp Chicago - Big Data & Cloud May 2015 - All SlidesCloudCamp Chicago - Big Data & Cloud May 2015 - All Slides
CloudCamp Chicago - Big Data & Cloud May 2015 - All Slides
 

More from Qualcomm Research

Generative AI at the edge.pdf
Generative AI at the edge.pdfGenerative AI at the edge.pdf
Generative AI at the edge.pdfQualcomm Research
 
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AIQualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AIQualcomm Research
 
Why and what you need to know about 6G in 2022
Why and what you need to know about 6G in 2022Why and what you need to know about 6G in 2022
Why and what you need to know about 6G in 2022Qualcomm Research
 
Understanding the world in 3D with AI.pdf
Understanding the world in 3D with AI.pdfUnderstanding the world in 3D with AI.pdf
Understanding the world in 3D with AI.pdfQualcomm Research
 
Enabling the metaverse with 5G- web.pdf
Enabling the metaverse with 5G- web.pdfEnabling the metaverse with 5G- web.pdf
Enabling the metaverse with 5G- web.pdfQualcomm Research
 
Presentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AIPresentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AIQualcomm Research
 
Bringing AI research to wireless communication and sensing
Bringing AI research to wireless communication and sensingBringing AI research to wireless communication and sensing
Bringing AI research to wireless communication and sensingQualcomm Research
 
How will sidelink bring a new level of 5G versatility.pdf
How will sidelink bring a new level of 5G versatility.pdfHow will sidelink bring a new level of 5G versatility.pdf
How will sidelink bring a new level of 5G versatility.pdfQualcomm Research
 
Scaling 5G to new frontiers with NR-Light (RedCap)
Scaling 5G to new frontiers with NR-Light (RedCap)Scaling 5G to new frontiers with NR-Light (RedCap)
Scaling 5G to new frontiers with NR-Light (RedCap)Qualcomm Research
 
Realizing mission-critical industrial automation with 5G
Realizing mission-critical industrial automation with 5GRealizing mission-critical industrial automation with 5G
Realizing mission-critical industrial automation with 5GQualcomm Research
 
3GPP Release 17: Completing the first phase of 5G evolution
3GPP Release 17: Completing the first phase of 5G evolution3GPP Release 17: Completing the first phase of 5G evolution
3GPP Release 17: Completing the first phase of 5G evolutionQualcomm Research
 
Setting off the 5G Advanced evolution with 3GPP Release 18
Setting off the 5G Advanced evolution with 3GPP Release 18Setting off the 5G Advanced evolution with 3GPP Release 18
Setting off the 5G Advanced evolution with 3GPP Release 18Qualcomm Research
 
The essential role of AI in the 5G future
The essential role of AI in the 5G futureThe essential role of AI in the 5G future
The essential role of AI in the 5G futureQualcomm Research
 
How AI research is enabling next-gen codecs
How AI research is enabling next-gen codecsHow AI research is enabling next-gen codecs
How AI research is enabling next-gen codecsQualcomm Research
 
How to build high performance 5G networks with vRAN and O-RAN
How to build high performance 5G networks with vRAN and O-RANHow to build high performance 5G networks with vRAN and O-RAN
How to build high performance 5G networks with vRAN and O-RANQualcomm Research
 
What's in the future of 5G millimeter wave?
What's in the future of 5G millimeter wave? What's in the future of 5G millimeter wave?
What's in the future of 5G millimeter wave? Qualcomm Research
 
Efficient video perception through AI
Efficient video perception through AIEfficient video perception through AI
Efficient video perception through AIQualcomm Research
 
Enabling the rise of the smartphone: Chronicling the developmental history at...
Enabling the rise of the smartphone: Chronicling the developmental history at...Enabling the rise of the smartphone: Chronicling the developmental history at...
Enabling the rise of the smartphone: Chronicling the developmental history at...Qualcomm Research
 

More from Qualcomm Research (20)

Generative AI at the edge.pdf
Generative AI at the edge.pdfGenerative AI at the edge.pdf
Generative AI at the edge.pdf
 
The future of AI is hybrid
The future of AI is hybridThe future of AI is hybrid
The future of AI is hybrid
 
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AIQualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI
 
Why and what you need to know about 6G in 2022
Why and what you need to know about 6G in 2022Why and what you need to know about 6G in 2022
Why and what you need to know about 6G in 2022
 
Understanding the world in 3D with AI.pdf
Understanding the world in 3D with AI.pdfUnderstanding the world in 3D with AI.pdf
Understanding the world in 3D with AI.pdf
 
Enabling the metaverse with 5G- web.pdf
Enabling the metaverse with 5G- web.pdfEnabling the metaverse with 5G- web.pdf
Enabling the metaverse with 5G- web.pdf
 
Presentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AIPresentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AI
 
Bringing AI research to wireless communication and sensing
Bringing AI research to wireless communication and sensingBringing AI research to wireless communication and sensing
Bringing AI research to wireless communication and sensing
 
How will sidelink bring a new level of 5G versatility.pdf
How will sidelink bring a new level of 5G versatility.pdfHow will sidelink bring a new level of 5G versatility.pdf
How will sidelink bring a new level of 5G versatility.pdf
 
Scaling 5G to new frontiers with NR-Light (RedCap)
Scaling 5G to new frontiers with NR-Light (RedCap)Scaling 5G to new frontiers with NR-Light (RedCap)
Scaling 5G to new frontiers with NR-Light (RedCap)
 
Realizing mission-critical industrial automation with 5G
Realizing mission-critical industrial automation with 5GRealizing mission-critical industrial automation with 5G
Realizing mission-critical industrial automation with 5G
 
3GPP Release 17: Completing the first phase of 5G evolution
3GPP Release 17: Completing the first phase of 5G evolution3GPP Release 17: Completing the first phase of 5G evolution
3GPP Release 17: Completing the first phase of 5G evolution
 
Setting off the 5G Advanced evolution with 3GPP Release 18
Setting off the 5G Advanced evolution with 3GPP Release 18Setting off the 5G Advanced evolution with 3GPP Release 18
Setting off the 5G Advanced evolution with 3GPP Release 18
 
The essential role of AI in the 5G future
The essential role of AI in the 5G futureThe essential role of AI in the 5G future
The essential role of AI in the 5G future
 
How AI research is enabling next-gen codecs
How AI research is enabling next-gen codecsHow AI research is enabling next-gen codecs
How AI research is enabling next-gen codecs
 
Pioneering 5G broadcast
Pioneering 5G broadcastPioneering 5G broadcast
Pioneering 5G broadcast
 
How to build high performance 5G networks with vRAN and O-RAN
How to build high performance 5G networks with vRAN and O-RANHow to build high performance 5G networks with vRAN and O-RAN
How to build high performance 5G networks with vRAN and O-RAN
 
What's in the future of 5G millimeter wave?
What's in the future of 5G millimeter wave? What's in the future of 5G millimeter wave?
What's in the future of 5G millimeter wave?
 
Efficient video perception through AI
Efficient video perception through AIEfficient video perception through AI
Efficient video perception through AI
 
Enabling the rise of the smartphone: Chronicling the developmental history at...
Enabling the rise of the smartphone: Chronicling the developmental history at...Enabling the rise of the smartphone: Chronicling the developmental history at...
Enabling the rise of the smartphone: Chronicling the developmental history at...
 

Recently uploaded

Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 

Recently uploaded (20)

Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 

Accelerating algorithmic and hardware advancements for power efficient on-device AI

  • 1. Accelerating algorithmic and hardware advancements for power efficient on-device AI Qualcomm Technologies, Inc. @qualcomm_techJuly 2018
  • 2. 2 By 2025, the data center sector could be using 20% of all available electricity in the world1 A cloud provider used the equivalent energy consumption of ~366,000 US households in 20142 Bitcoin mining in 2017 used the same energy as did all of Ireland3 Computers are consuming an increasing amount of energy 1. Andrae, Anders (2017) Total Consumer Power Consumption Forecast; 2. The Verge (2014); 3.The Guardian (2017) Given the economic potential of AI, these numbers will only be increasing
  • 3. 33 Will we have reached the capacity of the human brain? Energy efficiency of a brain is 100x better than current hardware Source: Welling 2025 Energyconsumption 1940 1950 1960 1970 1980 1990 2000 2010 2020 2030 1943: First NN (+/- N=10) 1988: NetTalk (+/- N=20K) 2009: Hinton’s Deep Belief Net (+/- N=10M) 2013: Google/Y! (N=+/- 1B) 2025: N = 100T = 1014 2017: Extremely large neural networks (N =137B) 1012 1010 108 106 1014 104 102 100 Deep neural networks are energy hungry and growing fast AI is being powered by the explosive growth of deep neural networks
  • 4. 44 Value created by AI must exceed the cost to run the service Broad economic viability requires energy efficient AI Economic feasibility per transaction may require cost as low as a micro-dollar (1/10,000th of a cent) • Personalized advertisements and recommendations • Smart security monitoring based on image and sound recognition • Efficiency improvements for smart cities and factories
  • 5. 5 The AI power and thermal ceiling The challenge of AI workloads Very compute intensive Complex concurrencies Real-time Always-on Constrained mobile environment Must be thermally efficient for sleek, ultra-light designs Requires long battery life for all-day use Storage/memory bandwidth limitations
  • 6. 66 Soon, AI algorithms will be measured by the amount of intelligence they provide per watt hour.
  • 7. 77 Weight parameter Activation node Edges Parts Objects OutputPixels Dog? Cat? Deep learning The good Convolutional neural networks (CNNs) have been very successful • Extract learnable features with state-of-the art results • Encode location invariance, namely that the same object may appear anywhere in the image • Share parameters, making them “data efficient” • Execute quickly on modern hardware with parallel processing
  • 8. 88 Weight parameter Activation node Edges Parts Objects OutputPixels Dog? Cat? Deep learning The bad and the ugly CNNs use too much memory, compute, and energy (today) • CNNs do not encode additional symmetries, such as rotation invariance (object may appear in any orientation1) • CNNs do not reliably quantify the confidence in a prediction • CNNs are easy to fool by changing the input only slightly, such as adversarial examples 1. Cohen & Welling 2016
  • 9. 99 Edges Parts Objects OutputPixels Dog? Cat? Bayesian deep-learning addresses these challenges Inspired by brain functionality, introducing noise to neural networks is beneficial Noise can be a good thing for AI Compression and quantization • Reduce complexity of the neural network model • Reduce bit-width of the parameters and activations • Save power and improve efficiency Introduce noise to weights Noise propagates to activations
  • 10. 1010 Edges Parts Objects OutputPixels Dog Cat w1 w2 (Y) w2 w1 Initial weight values Apply Bayesian deep-learning to shrink the model Compression and quantization • Quantize weights: Use lower precision (bit-width) • Prune activations: Reduce number of activation nodes
  • 11. 1111 Edges Parts Objects OutputPixels Introduce noise to parameters Dog Cat w1 w2 (Y) Prior P(W)—distribution before data (X) w2 w1 Initial weight values Apply Bayesian deep-learning to shrink the model Compression and quantization • Quantize weights: Use lower precision (bit-width) • Prune activations: Reduce number of activation nodes
  • 12. 1212 Edges Parts Objects OutputPixels Introduce noise to parameters Add data (X) Dog Cat w1 w2 (Y) Prior P(W)—distribution before data (X) w2 Posterior P(W| X,Y)— distribution after data w1 Initial weight values Apply Bayesian deep-learning to shrink the model Compression and quantization • Quantize weights: Use lower precision (bit-width) • Prune activations: Reduce number of activation nodes
  • 13. 1313 Edges Parts Objects OutputPixels Introduce noise to parameters Add data (X) Dog Cat w1 w2 Prune activations (similar method as quantization) X Quantize weights (or even remove) (Y) Prior P(W)—distribution before data (X) w2 Posterior P(W| X,Y)— distribution after data w1 is pinned down w2 is still highly uncertain w1 Initial weight values Apply Bayesian deep-learning to shrink the model Compression and quantization • Quantize weights: Use lower precision (bit-width) • Prune activations: Reduce number of activation nodes
  • 14. 14 Bayesian deep learning provides broad benefits A powerful tool to address a variety of deep learning challenges Compression and quantization Quantize parameters and activations, prune model components Regularization and generalization Avoid overfitting data; choose the simplest model to explain observations (Occam’s razor) Confidence estimation Generate the confidence intervals of the predictions Privacy/adversarial robustness Avoid storing personal information in parameters, be less sensitive to adversarial attacks
  • 15. 1515 83% 84% 85% 86% 87% 88% 89% 90% 1 1.5 2 2.5 3 3.5 Top5Accuracy Compression ratio ResNet-18 on a large set of labeled images Baseline Channel pruning SVD Bayesian pruning Applying Bayesian deep learning to real use cases Image classification Zhang, Zou, He, & Sun 2015 (SVD); He, Zhang, & Sun 2017 (channel pruning) compression ratio while maintaining close to the same accuracy3X
  • 16. 16 What does the future look like for AI hardware? Distributed computing architectures AI acceleration research
  • 17. 1717 AI acceleration research Balance AI acceleration capabilities across engines (CPU, GPU, DSP) Focused on fundamentals to accelerate deep learning workloads at low power Memory hierarchy Optimize the memory hierarchy to reduce the power consumption of data movement while ensuring performance Compute architecture Optimize instruction type, parallelism, and precision of the functional units Utilization Exploit sparsity to improve utilization and reduce power consumption
  • 18. 18 Algorithmic advancements Neural network algorithm design optimized for hardware Optimization for space and runtime efficiency Efficient hardware Efficient architecture design Selecting the right compute engine for the right task Software tools Software-accelerated run-time for deep learning Neural processing SDK for model compression and optimization The approach for making power efficient on-device AI Focusing on the joint design of algorithms and hardware to achieve high performance
  • 19. 19 AI algorithms and hardware need to be energy efficient We are a leader in Bayesian deep learning and its applications to model compression and quantization We are doing fundamental research on AI algorithms, software, and hardware to maximize power efficiency
  • 20. Follow us on: For more information, visit us at: www.qualcomm.com & www.qualcomm.com/blog Thank you Nothing in these materials is an offer to sell any of the components or devices referenced herein. ©2018 Qualcomm Technologies, Inc. and/or its affiliated companies. All Rights Reserved. Qualcomm is a trademark of Qualcomm Incorporated, registered in the United States and other countries. Other products and brand names may be trademarks or registered trademarks of their respective owners. References in this presentation to “Qualcomm” may mean Qualcomm Incorporated, Qualcomm Technologies, Inc., and/or other subsidiaries or business units within the Qualcomm corporate structure, as applicable. Qualcomm Incorporated includes Qualcomm’s licensing business, QTL, and the vast majority of its patent portfolio. Qualcomm Technologies, Inc., a wholly-owned subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of Qualcomm’s engineering, research and development functions, and substantially all of its product and services businesses, including its semiconductor business, QCT.