SlideShare a Scribd company logo
1 of 23
Download to read offline
© 2019 Xilinx
The Xilinx AI Engine: High
Performance with Future-proof
Architecture Adaptability
Vinod Kathail
Xilinx
May 2019
© 2019 Xilinx 2
Motivation for AI Engine
© 2019 Xilinx
Compute Intensity
Real Time Capability
Power Efficiency
Moore’s Law
Performance & Power Scaling
Traditional Single / Multi-core
Machine
Learning
ADAS / AD5G
Smart City
Smart
Factory
Data Center
Workloads
Motivation for AI Engine
Dynamic Markets Require Adaptive Compute Acceleration Platform (ACAP)
AI Everywhere
Applications
Technology
Scaling
Page 3
© 2019 Xilinx
Adaptable Engines
2X compute density
Programmable I/O
• Any interface or sensor
• Includes 4.2Gb/s MIPI
AI Engines
• AI Compute
• Diverse DSP workloads
DDR Memory
• 3200-DDR4, 3200-LPDDR4
• 2X bandwidth/pin
Protocol Engines
• Integrated 600G cores
• 4X encrypted bandwidth
PCIe & CCIX
• 2X PCIe & DMA bandwidth
• Cache-coherent interface
to accelerators
Transceivers
• Broad range, 25G →112G
• 58G in mainstream devices
Scalar Engines
• Platform Control
• Edge Compute
Versal ACAP Architecture Overview
>> 4
Network-on-Chip
• Guaranteed Bandwidth
• Enables SW Programmability
© 2019 Xilinx
AI
CORE
MEMORY
AI
CORE
MEMORY
AI
CORE
MEMORY
AI
CORE
MEMORY
Introducing the AI Engine
Signal ProcessingArtificial
Intelligence
CNN,
LSTM, MLP
Computer Vision
• 1GHz+ Multi-precision Vector Processor
• High bandwidth extensible memory
• Up to 400 AI Engines per device
• 8X Compute Density
• 40% Lower Power
SW Programmable
Adaptable. Intelligent.
Deterministic
Efficient
Page 5
© 2019 Xilinx
C/C++
C/C++
Software Programmable: Any Developer
Page 6
Compile
Design
4G/5G/Radar
Library
AI
Library
Vision
Library
AI Engine Compiler
Programming
Abstraction Levels
1
2
3Run
Domain Specific
Architecture
Data Flow
w/ Xilinx libraries
Kernel Program
Data Flow w/ user
defined libraries
Page 6
Frameworks
© 2019 Xilinx
AI Engine Application Performance & Power Efficiency
Page 7
Image Classification
(GoogleNet, <1ms)
Massive MIMO Radio
(DUC, DDC, CFR, DPD)
AI Inference
Compute
5G Wireless
Bandwidth
Power
Consumption
Xilinx UltraScale+
Xilinx Versal w/ AI Engine
20x
40%
Less
Power
5x
Xilinx 16nm UltraScale+
Xilinx 7nm Versal w/ AI Engine
© 2019 Xilinx 8
AI Engine Architecture
© 2019 Xilinx
AI Engine: Tile-Based Architecture
Page 9
Interconnect
ISA-based
Vector Processor
Local
Memory
AI Vector
Extensions
5G Vector
Extensions
ISA-based
Vector Processor
Software Programmable
(e.g., C/C++)
Data
Mover
Data Mover
Non-neighbor data communication
Integrated synchronization primitives
Non-Blocking Interconnect
high GB/s bandwidth per tile
Local Memory
Multi-bank implementation
Shared across neighbor cores
Cascade Interface
Partial results to next core
PL
PS I/O
© 2019 Xilinx
AI Engine: Array Architecture
Page 10
Memory
AI
Core
Memory
AI
Core
Memory
AI
Core
Memory
AI
Core
Memory
AI
Core
Memory
AI
Core
Memory
AI
Core
Memory
AI
Core
Memory
AI
Core
Modular and scalable architecture
• More tiles = more compute
• Up to 400 per device
• Versal AI Core VC1902 device
Distributed memory hierarchy
Maximize memory bandwidth
Array of AI Engines
• Increase in compute, memory and
communication bandwidth
Deterministic Performance & Low Latency
PL
PS I/O
© 2019 Xilinx
Page 11
AI Engine: Processor Core
Local, Shareable Memory
• 32KB Local, 128KB Addressable
32-bit Scalar RISC Processor
Up to 128 MACs / Clock Cycle per Core (INT 8)
Highly
Parallel
Memory Interface
Scalar Unit
Scalar
Register
File
Scalar ALU
Non-linear
Functions
Vector
Register
File
Fixed-Point
Vector Unit
Floating-Point
Vector Unit
Vector Unit Vector Processor
512-bit SIMD Datapath
Instruction Fetch
& Decode Unit
AGU AGU AGU
Load Unit A Load Unit B Store Unit
7+ operations / clock cycle
• 2 Vector Loads / 1 Mult / 1 Store
• 2 Scalar Ops / Stream Access
Instruction Parallelism: VLIW Data Parallelism: SIMD
Multiple vector lanes
• Vector Datapath
• 8 / 16 / 32-bit & SPFP operands
Stream
Interface
© 2019 Xilinx
Data Movement Architecture
Page 12
Dataflow
Graph
Mem
Mem
AI
Core
AI
Core
AI
Core
Dataflow
Pipeline
AI
Core
Memory
B0
B1
Memory
B2
B3
Mem
AI
Core
AI
Core
Mem
AI
Core
Streaming
Multicast
AI
Core
AI
Core
AI
Core
AI
Core
AI
Core
Memory
AI
Core
Memory
Non-
Neighbor
AI
Core
AI
Core
Cascade
Streaming
Memory Communication Streaming Communication
Memory Interface
Stream Interface
Cascade Interface
Mem Mem
AI
Core
AI
Core
AI
Core
© 2019 Xilinx
AI Engine Integration in Versal
Page 13
˃ TB/s of Interface Bandwidth
AI Engine to Programmable Logic
AI Engine to NOC
˃ Leveraging NOC connectivity
Processing System manages Config
/ Debug / Trace
AI Engine to DRAM without PL
PL
PS I/O
© 2019 Xilinx
AI Engine: Multi-Core Compute with Dedicated Memory
Page 14
core
L0
core
L0
core
L0
Block 0
L1
core
L0
core
L0
core
L0
Block 1
L1
L2
DRAM
D0
D0
D0
D0
Fixed, shared
Interconnect
• Blocking limits
compute
• Timing not
deterministic
Data
Replicated
• Robs bandwidth
• Reduces capacity
Traditional Multi-core
(cache-based architecture)
MEM
AI
Core
MEM
AI
Core
MEM
AI
Core
MEM
AI
Core
MEM
AI
Core
MEM
AI
Core
AI
Core
MEM
AI
Core
MEM
AI
Core
MEM
AI Engine Array
(intelligent engine)
Dedicated
Interconnect
• Non-blocking
• Deterministic
Local, Distributed
Memory
• No cache misses
• Higher bandwidth
• Less capacity required
© 2019 Xilinx
AI Engine Delivers High Compute Efficiency
Page 15
95%
80%
98%
ML Convolutions FFT DPD
Vector Processor Efficiency
Peak Kernel Theoretical Performance
Block-based
Matrix Multiplication
(32×64) × (64×32)
1024-pt
FFT/iFFT
Volterra-based
forward-path DPD
˃ Adaptable, non-blocking interconnect
Flexible data movement architecture
Avoids interconnect “bottlenecks”
˃ Adaptable memory hierarchy
Local, distributed, shareable = extreme
bandwidth
No cache misses or data replication
Extend to PL memory (BRAM, URAM)
˃ Transfer data while AI Engine Computes
Compute
Comm
Overlap Compute and Communication
Compute Compute
Comm Comm
© 2019 Xilinx 16
AI Engine Programming and
Applications
© 2019 Xilinx
Versal ACAP Development Tools
Page 17
Frameworks
AI and Data
Scientists
Unified Software
Development Environment
Software Application
Developers
Vivado Design Suite
Hardware
Developers
USERTOOLS SUPPORTED FRAMEWORKS
© 2019 Xilinx
Software Development Environment
Page 18
˃ Unified development environment
Full chip programming
˃ SW programmable for whole application
Heterogeneous SW acceleration
˃ Full system simulation, debug & profiling
Software development experience
Application
(e.g. C/C++)
Performance
Constraints
Application
e.g. C/C++
Processing
Sub-system
Programmable
Logic
AI
Engines
System
Simulation Hardware
System Debug & Profiling
Unified SW Development Environment
IntelligentAdaptableScalar
© 2019 Xilinx
AI Engine Programming: Dataflow Model
Page 19
a b c
d
e
User defines dataflow logic
User describes dataflow graph
using C/C++ APIs
1
2
3
a b c
d
ee
Compiler transparently manages placement
& interconnect
to e
Memory
b
Memory
a
Memory
Vector
Core
Memory
Vector
Core
Memory
Vector
Core
Memory
Vector
Core
MemoryMemory
Memory
c
Physical Mapping to AI Engines
Vector
Core d
PL
© 2019 Xilinx
Accelerating AI Inference
Page 20
2
1
3
User works in framework of choice
• Develop & train custom network
• User provides trained model
Xilinx DNN Compiler implements network
• Targets AI inference implemented on FPGA
and Versal
• Optimizations: Quantize, merge layers, prune
• Compile to AI Engines
Scalable across hardware targets
• Start with Alveo boards with FPGAs today
Deep Learning Frameworks
Xilinx DNN Compiler
New Versal based
Acceleration Cards
Xilinx AI Inference
Domain Specific Architecture
Alveo
U200/U250/U280
© 2019 Xilinx
AI Engine Delivers Real-time Inference Leadership
Page 21
Sources:
GPU: Nvidia T4 TensorRT 5, Published March 2019
(INT8, Batch=4, 1.5ms Latency)
Versal Card, Projected (INT8, Batch=8, 1.5ms Latency) 0
2,000
4,000
6,000
8,000
10,000
12,000
VersalGPU
1x
Resnet50 Inference Performance
3.5x
Images/Sec
4.5x With
Xilinx Pruning
© 2019 Xilinx
AI Engine: Accelerating AI Inference & Signal Processing
Page 22
Software Programmable Deterministic Efficient
• Frameworks & C/C++
• SW Compile, Debug &
Deploy
• Max throughput w/ low latency
• Real-time inference leadership
• Up to 8X compute density
• At ~40% lower power
Signal
ProcessingAI
Inference
10x 5x
© 2019 Xilinx
Additional Resources
23
For more information on ACAP
and Versal, please visit:
www.xilinx.com/versal
Please visit EVS Booth #610
Face recognition using Xilinx FPGA

More Related Content

What's hot

OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...
OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...
OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...NETWAYS
 
Accelerating Data Science With GPUs
Accelerating Data Science With GPUsAccelerating Data Science With GPUs
Accelerating Data Science With GPUsiguazio
 
Deep learning: Hardware Landscape
Deep learning: Hardware LandscapeDeep learning: Hardware Landscape
Deep learning: Hardware LandscapeGrigory Sapunov
 
DPDK IPSec performance benchmark ~ Georgii Tkachuk
DPDK IPSec performance benchmark ~ Georgii TkachukDPDK IPSec performance benchmark ~ Georgii Tkachuk
DPDK IPSec performance benchmark ~ Georgii TkachukIntel
 
Parallel computing with Gpu
Parallel computing with GpuParallel computing with Gpu
Parallel computing with GpuRohit Khatana
 
Developing a Smart Campus
Developing a Smart CampusDeveloping a Smart Campus
Developing a Smart CampusJames Clay
 
Hardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningHardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningCastLabKAIST
 
Introduction to Apache NiFi dws19 DWS - DC 2019
Introduction to Apache NiFi   dws19 DWS - DC 2019Introduction to Apache NiFi   dws19 DWS - DC 2019
Introduction to Apache NiFi dws19 DWS - DC 2019Timothy Spann
 
Generali connection platform_full
Generali connection platform_fullGenerali connection platform_full
Generali connection platform_fullconfluent
 
Simplifying AI for Communications, Radar, and Wireless Systems
Simplifying AI for Communications, Radar, and Wireless SystemsSimplifying AI for Communications, Radar, and Wireless Systems
Simplifying AI for Communications, Radar, and Wireless SystemsJohn Ferguson
 
Starship, Building Intelligent Delivery Robots
Starship, Building Intelligent Delivery RobotsStarship, Building Intelligent Delivery Robots
Starship, Building Intelligent Delivery RobotsAndré Karpištšenko
 
HDMI Troubleshooting & System Design
HDMI Troubleshooting & System DesignHDMI Troubleshooting & System Design
HDMI Troubleshooting & System DesignMark Stockfisch
 
Enabling on-device learning at scale
Enabling on-device learning at scaleEnabling on-device learning at scale
Enabling on-device learning at scaleQualcomm Research
 
InfiniBand Essentials Every HPC Expert Must Know
InfiniBand Essentials Every HPC Expert Must KnowInfiniBand Essentials Every HPC Expert Must Know
InfiniBand Essentials Every HPC Expert Must KnowMellanox Technologies
 
Qualcomm SnapDragon 800 Mobile Device
Qualcomm SnapDragon 800 Mobile DeviceQualcomm SnapDragon 800 Mobile Device
Qualcomm SnapDragon 800 Mobile DeviceJJ Wu
 
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...Kai Wähner
 
Kafka in Space | Kafka Summit London 2022
 Kafka in Space | Kafka Summit London 2022 Kafka in Space | Kafka Summit London 2022
Kafka in Space | Kafka Summit London 2022HostedbyConfluent
 
Intelligence at scale through AI model efficiency
Intelligence at scale through AI model efficiencyIntelligence at scale through AI model efficiency
Intelligence at scale through AI model efficiencyQualcomm Research
 

What's hot (20)

OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...
OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...
OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...
 
Deep learning with FPGA
Deep learning with FPGADeep learning with FPGA
Deep learning with FPGA
 
Accelerating Data Science With GPUs
Accelerating Data Science With GPUsAccelerating Data Science With GPUs
Accelerating Data Science With GPUs
 
Deep learning: Hardware Landscape
Deep learning: Hardware LandscapeDeep learning: Hardware Landscape
Deep learning: Hardware Landscape
 
DPDK IPSec performance benchmark ~ Georgii Tkachuk
DPDK IPSec performance benchmark ~ Georgii TkachukDPDK IPSec performance benchmark ~ Georgii Tkachuk
DPDK IPSec performance benchmark ~ Georgii Tkachuk
 
Parallel computing with Gpu
Parallel computing with GpuParallel computing with Gpu
Parallel computing with Gpu
 
Developing a Smart Campus
Developing a Smart CampusDeveloping a Smart Campus
Developing a Smart Campus
 
Andes open cl for RISC-V
Andes open cl for RISC-VAndes open cl for RISC-V
Andes open cl for RISC-V
 
Hardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningHardware Acceleration for Machine Learning
Hardware Acceleration for Machine Learning
 
Introduction to Apache NiFi dws19 DWS - DC 2019
Introduction to Apache NiFi   dws19 DWS - DC 2019Introduction to Apache NiFi   dws19 DWS - DC 2019
Introduction to Apache NiFi dws19 DWS - DC 2019
 
Generali connection platform_full
Generali connection platform_fullGenerali connection platform_full
Generali connection platform_full
 
Simplifying AI for Communications, Radar, and Wireless Systems
Simplifying AI for Communications, Radar, and Wireless SystemsSimplifying AI for Communications, Radar, and Wireless Systems
Simplifying AI for Communications, Radar, and Wireless Systems
 
Starship, Building Intelligent Delivery Robots
Starship, Building Intelligent Delivery RobotsStarship, Building Intelligent Delivery Robots
Starship, Building Intelligent Delivery Robots
 
HDMI Troubleshooting & System Design
HDMI Troubleshooting & System DesignHDMI Troubleshooting & System Design
HDMI Troubleshooting & System Design
 
Enabling on-device learning at scale
Enabling on-device learning at scaleEnabling on-device learning at scale
Enabling on-device learning at scale
 
InfiniBand Essentials Every HPC Expert Must Know
InfiniBand Essentials Every HPC Expert Must KnowInfiniBand Essentials Every HPC Expert Must Know
InfiniBand Essentials Every HPC Expert Must Know
 
Qualcomm SnapDragon 800 Mobile Device
Qualcomm SnapDragon 800 Mobile DeviceQualcomm SnapDragon 800 Mobile Device
Qualcomm SnapDragon 800 Mobile Device
 
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
 
Kafka in Space | Kafka Summit London 2022
 Kafka in Space | Kafka Summit London 2022 Kafka in Space | Kafka Summit London 2022
Kafka in Space | Kafka Summit London 2022
 
Intelligence at scale through AI model efficiency
Intelligence at scale through AI model efficiencyIntelligence at scale through AI model efficiency
Intelligence at scale through AI model efficiency
 

Similar to The Xilinx AI Engine: High Performance with Future-proof Architecture Adaptability

Accelerating Innovation from Edge to Cloud
Accelerating Innovation from Edge to CloudAccelerating Innovation from Edge to Cloud
Accelerating Innovation from Edge to CloudRebekah Rodriguez
 
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...Linaro
 
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super AffordableSupermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super AffordableRebekah Rodriguez
 
News from re:Invent 2019
News from re:Invent 2019News from re:Invent 2019
News from re:Invent 2019Vladimir Simek
 
Arm DynamIQ: Intelligent Solutions Using Cluster Based Multiprocessing
Arm DynamIQ: Intelligent Solutions Using Cluster Based MultiprocessingArm DynamIQ: Intelligent Solutions Using Cluster Based Multiprocessing
Arm DynamIQ: Intelligent Solutions Using Cluster Based MultiprocessingArm
 
“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...
“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...
“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...Edge AI and Vision Alliance
 
“Vitis and Vitis AI: Application Acceleration from Cloud to Edge,” a Presenta...
“Vitis and Vitis AI: Application Acceleration from Cloud to Edge,” a Presenta...“Vitis and Vitis AI: Application Acceleration from Cloud to Edge,” a Presenta...
“Vitis and Vitis AI: Application Acceleration from Cloud to Edge,” a Presenta...Edge AI and Vision Alliance
 
Xilinx Data Center Strategy and CCIX
Xilinx Data Center Strategy and CCIXXilinx Data Center Strategy and CCIX
Xilinx Data Center Strategy and CCIXYoshihiro Horie
 
Accelerating Edge Computing Adoption
Accelerating Edge Computing Adoption Accelerating Edge Computing Adoption
Accelerating Edge Computing Adoption Michelle Holley
 
VEDLIoT at FPL'23_Accelerators for Heterogenous Computing in AIoT
VEDLIoT at FPL'23_Accelerators for Heterogenous Computing in AIoTVEDLIoT at FPL'23_Accelerators for Heterogenous Computing in AIoT
VEDLIoT at FPL'23_Accelerators for Heterogenous Computing in AIoTVEDLIoT Project
 
Xilinx Edge Compute using Power 9 /OpenPOWER systems
Xilinx Edge Compute using Power 9 /OpenPOWER systemsXilinx Edge Compute using Power 9 /OpenPOWER systems
Xilinx Edge Compute using Power 9 /OpenPOWER systemsGanesan Narayanasamy
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Lablup Inc.
 
RISC-V & SoC Architectural Exploration for AI and ML Accelerators
RISC-V & SoC Architectural Exploration for AI and ML AcceleratorsRISC-V & SoC Architectural Exploration for AI and ML Accelerators
RISC-V & SoC Architectural Exploration for AI and ML AcceleratorsRISC-V International
 
Supermicro’s Universal GPU: Modular, Standards Based and Built for the Future
Supermicro’s Universal GPU: Modular, Standards Based and Built for the FutureSupermicro’s Universal GPU: Modular, Standards Based and Built for the Future
Supermicro’s Universal GPU: Modular, Standards Based and Built for the FutureRebekah Rodriguez
 
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...Filipe Miranda
 
Are you ready to be edgy? Bringing applications to the edge of the network
Are you ready to be edgy? Bringing applications to the edge of the networkAre you ready to be edgy? Bringing applications to the edge of the network
Are you ready to be edgy? Bringing applications to the edge of the networkMegan O'Keefe
 
Harnessing the virtual realm for successful real world artificial intelligence
Harnessing the virtual realm for successful real world artificial intelligenceHarnessing the virtual realm for successful real world artificial intelligence
Harnessing the virtual realm for successful real world artificial intelligenceAlison B. Lowndes
 
Cisco connect montreal 2018 compute v final
Cisco connect montreal 2018   compute v finalCisco connect montreal 2018   compute v final
Cisco connect montreal 2018 compute v finalCisco Canada
 

Similar to The Xilinx AI Engine: High Performance with Future-proof Architecture Adaptability (20)

Accelerating Innovation from Edge to Cloud
Accelerating Innovation from Edge to CloudAccelerating Innovation from Edge to Cloud
Accelerating Innovation from Edge to Cloud
 
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
 
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super AffordableSupermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
 
News from re:Invent 2019
News from re:Invent 2019News from re:Invent 2019
News from re:Invent 2019
 
Arm DynamIQ: Intelligent Solutions Using Cluster Based Multiprocessing
Arm DynamIQ: Intelligent Solutions Using Cluster Based MultiprocessingArm DynamIQ: Intelligent Solutions Using Cluster Based Multiprocessing
Arm DynamIQ: Intelligent Solutions Using Cluster Based Multiprocessing
 
SYCL 2020 Specification
SYCL 2020 SpecificationSYCL 2020 Specification
SYCL 2020 Specification
 
“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...
“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...
“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...
 
“Vitis and Vitis AI: Application Acceleration from Cloud to Edge,” a Presenta...
“Vitis and Vitis AI: Application Acceleration from Cloud to Edge,” a Presenta...“Vitis and Vitis AI: Application Acceleration from Cloud to Edge,” a Presenta...
“Vitis and Vitis AI: Application Acceleration from Cloud to Edge,” a Presenta...
 
Xilinx Data Center Strategy and CCIX
Xilinx Data Center Strategy and CCIXXilinx Data Center Strategy and CCIX
Xilinx Data Center Strategy and CCIX
 
Accelerating Edge Computing Adoption
Accelerating Edge Computing Adoption Accelerating Edge Computing Adoption
Accelerating Edge Computing Adoption
 
VEDLIoT at FPL'23_Accelerators for Heterogenous Computing in AIoT
VEDLIoT at FPL'23_Accelerators for Heterogenous Computing in AIoTVEDLIoT at FPL'23_Accelerators for Heterogenous Computing in AIoT
VEDLIoT at FPL'23_Accelerators for Heterogenous Computing in AIoT
 
Xilinx Edge Compute using Power 9 /OpenPOWER systems
Xilinx Edge Compute using Power 9 /OpenPOWER systemsXilinx Edge Compute using Power 9 /OpenPOWER systems
Xilinx Edge Compute using Power 9 /OpenPOWER systems
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
 
Summit workshop thompto
Summit workshop thomptoSummit workshop thompto
Summit workshop thompto
 
RISC-V & SoC Architectural Exploration for AI and ML Accelerators
RISC-V & SoC Architectural Exploration for AI and ML AcceleratorsRISC-V & SoC Architectural Exploration for AI and ML Accelerators
RISC-V & SoC Architectural Exploration for AI and ML Accelerators
 
Supermicro’s Universal GPU: Modular, Standards Based and Built for the Future
Supermicro’s Universal GPU: Modular, Standards Based and Built for the FutureSupermicro’s Universal GPU: Modular, Standards Based and Built for the Future
Supermicro’s Universal GPU: Modular, Standards Based and Built for the Future
 
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
 
Are you ready to be edgy? Bringing applications to the edge of the network
Are you ready to be edgy? Bringing applications to the edge of the networkAre you ready to be edgy? Bringing applications to the edge of the network
Are you ready to be edgy? Bringing applications to the edge of the network
 
Harnessing the virtual realm for successful real world artificial intelligence
Harnessing the virtual realm for successful real world artificial intelligenceHarnessing the virtual realm for successful real world artificial intelligence
Harnessing the virtual realm for successful real world artificial intelligence
 
Cisco connect montreal 2018 compute v final
Cisco connect montreal 2018   compute v finalCisco connect montreal 2018   compute v final
Cisco connect montreal 2018 compute v final
 

More from Edge AI and Vision Alliance

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...Edge AI and Vision Alliance
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...Edge AI and Vision Alliance
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...Edge AI and Vision Alliance
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...Edge AI and Vision Alliance
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...Edge AI and Vision Alliance
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...Edge AI and Vision Alliance
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...Edge AI and Vision Alliance
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsightsEdge AI and Vision Alliance
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...Edge AI and Vision Alliance
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...Edge AI and Vision Alliance
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...Edge AI and Vision Alliance
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...Edge AI and Vision Alliance
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...Edge AI and Vision Alliance
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...Edge AI and Vision Alliance
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...Edge AI and Vision Alliance
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from SamsaraEdge AI and Vision Alliance
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...Edge AI and Vision Alliance
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...Edge AI and Vision Alliance
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...Edge AI and Vision Alliance
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...Edge AI and Vision Alliance
 

More from Edge AI and Vision Alliance (20)

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
 

Recently uploaded

Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 

Recently uploaded (20)

Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 

The Xilinx AI Engine: High Performance with Future-proof Architecture Adaptability

  • 1. © 2019 Xilinx The Xilinx AI Engine: High Performance with Future-proof Architecture Adaptability Vinod Kathail Xilinx May 2019
  • 2. © 2019 Xilinx 2 Motivation for AI Engine
  • 3. © 2019 Xilinx Compute Intensity Real Time Capability Power Efficiency Moore’s Law Performance & Power Scaling Traditional Single / Multi-core Machine Learning ADAS / AD5G Smart City Smart Factory Data Center Workloads Motivation for AI Engine Dynamic Markets Require Adaptive Compute Acceleration Platform (ACAP) AI Everywhere Applications Technology Scaling Page 3
  • 4. © 2019 Xilinx Adaptable Engines 2X compute density Programmable I/O • Any interface or sensor • Includes 4.2Gb/s MIPI AI Engines • AI Compute • Diverse DSP workloads DDR Memory • 3200-DDR4, 3200-LPDDR4 • 2X bandwidth/pin Protocol Engines • Integrated 600G cores • 4X encrypted bandwidth PCIe & CCIX • 2X PCIe & DMA bandwidth • Cache-coherent interface to accelerators Transceivers • Broad range, 25G →112G • 58G in mainstream devices Scalar Engines • Platform Control • Edge Compute Versal ACAP Architecture Overview >> 4 Network-on-Chip • Guaranteed Bandwidth • Enables SW Programmability
  • 5. © 2019 Xilinx AI CORE MEMORY AI CORE MEMORY AI CORE MEMORY AI CORE MEMORY Introducing the AI Engine Signal ProcessingArtificial Intelligence CNN, LSTM, MLP Computer Vision • 1GHz+ Multi-precision Vector Processor • High bandwidth extensible memory • Up to 400 AI Engines per device • 8X Compute Density • 40% Lower Power SW Programmable Adaptable. Intelligent. Deterministic Efficient Page 5
  • 6. © 2019 Xilinx C/C++ C/C++ Software Programmable: Any Developer Page 6 Compile Design 4G/5G/Radar Library AI Library Vision Library AI Engine Compiler Programming Abstraction Levels 1 2 3Run Domain Specific Architecture Data Flow w/ Xilinx libraries Kernel Program Data Flow w/ user defined libraries Page 6 Frameworks
  • 7. © 2019 Xilinx AI Engine Application Performance & Power Efficiency Page 7 Image Classification (GoogleNet, <1ms) Massive MIMO Radio (DUC, DDC, CFR, DPD) AI Inference Compute 5G Wireless Bandwidth Power Consumption Xilinx UltraScale+ Xilinx Versal w/ AI Engine 20x 40% Less Power 5x Xilinx 16nm UltraScale+ Xilinx 7nm Versal w/ AI Engine
  • 8. © 2019 Xilinx 8 AI Engine Architecture
  • 9. © 2019 Xilinx AI Engine: Tile-Based Architecture Page 9 Interconnect ISA-based Vector Processor Local Memory AI Vector Extensions 5G Vector Extensions ISA-based Vector Processor Software Programmable (e.g., C/C++) Data Mover Data Mover Non-neighbor data communication Integrated synchronization primitives Non-Blocking Interconnect high GB/s bandwidth per tile Local Memory Multi-bank implementation Shared across neighbor cores Cascade Interface Partial results to next core PL PS I/O
  • 10. © 2019 Xilinx AI Engine: Array Architecture Page 10 Memory AI Core Memory AI Core Memory AI Core Memory AI Core Memory AI Core Memory AI Core Memory AI Core Memory AI Core Memory AI Core Modular and scalable architecture • More tiles = more compute • Up to 400 per device • Versal AI Core VC1902 device Distributed memory hierarchy Maximize memory bandwidth Array of AI Engines • Increase in compute, memory and communication bandwidth Deterministic Performance & Low Latency PL PS I/O
  • 11. © 2019 Xilinx Page 11 AI Engine: Processor Core Local, Shareable Memory • 32KB Local, 128KB Addressable 32-bit Scalar RISC Processor Up to 128 MACs / Clock Cycle per Core (INT 8) Highly Parallel Memory Interface Scalar Unit Scalar Register File Scalar ALU Non-linear Functions Vector Register File Fixed-Point Vector Unit Floating-Point Vector Unit Vector Unit Vector Processor 512-bit SIMD Datapath Instruction Fetch & Decode Unit AGU AGU AGU Load Unit A Load Unit B Store Unit 7+ operations / clock cycle • 2 Vector Loads / 1 Mult / 1 Store • 2 Scalar Ops / Stream Access Instruction Parallelism: VLIW Data Parallelism: SIMD Multiple vector lanes • Vector Datapath • 8 / 16 / 32-bit & SPFP operands Stream Interface
  • 12. © 2019 Xilinx Data Movement Architecture Page 12 Dataflow Graph Mem Mem AI Core AI Core AI Core Dataflow Pipeline AI Core Memory B0 B1 Memory B2 B3 Mem AI Core AI Core Mem AI Core Streaming Multicast AI Core AI Core AI Core AI Core AI Core Memory AI Core Memory Non- Neighbor AI Core AI Core Cascade Streaming Memory Communication Streaming Communication Memory Interface Stream Interface Cascade Interface Mem Mem AI Core AI Core AI Core
  • 13. © 2019 Xilinx AI Engine Integration in Versal Page 13 ˃ TB/s of Interface Bandwidth AI Engine to Programmable Logic AI Engine to NOC ˃ Leveraging NOC connectivity Processing System manages Config / Debug / Trace AI Engine to DRAM without PL PL PS I/O
  • 14. © 2019 Xilinx AI Engine: Multi-Core Compute with Dedicated Memory Page 14 core L0 core L0 core L0 Block 0 L1 core L0 core L0 core L0 Block 1 L1 L2 DRAM D0 D0 D0 D0 Fixed, shared Interconnect • Blocking limits compute • Timing not deterministic Data Replicated • Robs bandwidth • Reduces capacity Traditional Multi-core (cache-based architecture) MEM AI Core MEM AI Core MEM AI Core MEM AI Core MEM AI Core MEM AI Core AI Core MEM AI Core MEM AI Core MEM AI Engine Array (intelligent engine) Dedicated Interconnect • Non-blocking • Deterministic Local, Distributed Memory • No cache misses • Higher bandwidth • Less capacity required
  • 15. © 2019 Xilinx AI Engine Delivers High Compute Efficiency Page 15 95% 80% 98% ML Convolutions FFT DPD Vector Processor Efficiency Peak Kernel Theoretical Performance Block-based Matrix Multiplication (32×64) × (64×32) 1024-pt FFT/iFFT Volterra-based forward-path DPD ˃ Adaptable, non-blocking interconnect Flexible data movement architecture Avoids interconnect “bottlenecks” ˃ Adaptable memory hierarchy Local, distributed, shareable = extreme bandwidth No cache misses or data replication Extend to PL memory (BRAM, URAM) ˃ Transfer data while AI Engine Computes Compute Comm Overlap Compute and Communication Compute Compute Comm Comm
  • 16. © 2019 Xilinx 16 AI Engine Programming and Applications
  • 17. © 2019 Xilinx Versal ACAP Development Tools Page 17 Frameworks AI and Data Scientists Unified Software Development Environment Software Application Developers Vivado Design Suite Hardware Developers USERTOOLS SUPPORTED FRAMEWORKS
  • 18. © 2019 Xilinx Software Development Environment Page 18 ˃ Unified development environment Full chip programming ˃ SW programmable for whole application Heterogeneous SW acceleration ˃ Full system simulation, debug & profiling Software development experience Application (e.g. C/C++) Performance Constraints Application e.g. C/C++ Processing Sub-system Programmable Logic AI Engines System Simulation Hardware System Debug & Profiling Unified SW Development Environment IntelligentAdaptableScalar
  • 19. © 2019 Xilinx AI Engine Programming: Dataflow Model Page 19 a b c d e User defines dataflow logic User describes dataflow graph using C/C++ APIs 1 2 3 a b c d ee Compiler transparently manages placement & interconnect to e Memory b Memory a Memory Vector Core Memory Vector Core Memory Vector Core Memory Vector Core MemoryMemory Memory c Physical Mapping to AI Engines Vector Core d PL
  • 20. © 2019 Xilinx Accelerating AI Inference Page 20 2 1 3 User works in framework of choice • Develop & train custom network • User provides trained model Xilinx DNN Compiler implements network • Targets AI inference implemented on FPGA and Versal • Optimizations: Quantize, merge layers, prune • Compile to AI Engines Scalable across hardware targets • Start with Alveo boards with FPGAs today Deep Learning Frameworks Xilinx DNN Compiler New Versal based Acceleration Cards Xilinx AI Inference Domain Specific Architecture Alveo U200/U250/U280
  • 21. © 2019 Xilinx AI Engine Delivers Real-time Inference Leadership Page 21 Sources: GPU: Nvidia T4 TensorRT 5, Published March 2019 (INT8, Batch=4, 1.5ms Latency) Versal Card, Projected (INT8, Batch=8, 1.5ms Latency) 0 2,000 4,000 6,000 8,000 10,000 12,000 VersalGPU 1x Resnet50 Inference Performance 3.5x Images/Sec 4.5x With Xilinx Pruning
  • 22. © 2019 Xilinx AI Engine: Accelerating AI Inference & Signal Processing Page 22 Software Programmable Deterministic Efficient • Frameworks & C/C++ • SW Compile, Debug & Deploy • Max throughput w/ low latency • Real-time inference leadership • Up to 8X compute density • At ~40% lower power Signal ProcessingAI Inference 10x 5x
  • 23. © 2019 Xilinx Additional Resources 23 For more information on ACAP and Versal, please visit: www.xilinx.com/versal Please visit EVS Booth #610 Face recognition using Xilinx FPGA