For the full video of this presentation, please visit:
https://www.embedded-vision.com/platinum-members/fotonation/embedded-vision-training/videos/pages/may-2018-embedded-vision-summit-teig
For more information about embedded vision, please visit:
http://www.embedded-vision.com
Steve Teig, Chief Technology Officer at Xperi, presents the "Embedding Programmable DNNs in Low-Power SoCs" tutorial at the May 2018 Embedded Vision Summit.
This talk presents the latest generation of the Image Processing Unit (IPU) from FotoNation (a core business unit of Xperi): an embedded, AI-enabled image processing engine that can be customized and adapted to suit a wide range of imaging tasks. Thanks to its scalable nature, the IPU can be deployed in low-power applications such as IoT devices, and can also be scaled up to much more powerful configurations suitable for demanding automotive computer vision applications. And, in perhaps the most exciting development, the latest variants of the IPU feature FotoNation's programmable convolutional neural network engine (PCNN), which can implement CNN architectures created with state-of-the-art design tools such as TensorFlow and Caffe.
The PCNN hardware architecture, optimized for image analytics and combined with Xperi's state-of-the-art DBI™ interconnect technology, can also run multiple CNNs in parallel, meeting the most stringent real-time requirements. Together, the IPU and DBI™ enable advanced artificial intelligence solutions on mid-sized chips, opening the door to powerful AI-driven imaging solutions that you can carry in your pocket.
"Embedding Programmable DNNs in Low-Power SoCs," a Presentation from Xperi
1.
2.
3. Portfolio of Trusted Brands
• Semiconductor Intellectual Property Licensing
• Imaging and Computer Vision silicon IP cores and solutions
• Audio Technology Solutions
• Automotive Audio, Data, and Digital Radio Broadcast Solutions
• Semiconductor and Interconnect Packaging Technology & Solutions
Deployments across the portfolio: 3.4+ B devices, 70+ M cars, 1+ B devices, 100+ B devices, 2+ B devices
4.
5.
• Always-on inference: operates even while the device is "off" (e.g., ultra-low-power FD/FR as an enabler)
• Head-mounted displays for AR or MR (e.g., ultra-low-power iris recognition, eye gaze, scene understanding, etc.)
• Smart IoT: TVs to drones to microwave ovens to … (e.g., ultra-low-power people detection)
• Driver state sensing for autonomous driving (e.g., always-on driver assistant)
6. Enhance
• De-warping & stitching
• Stabilization with rolling-shutter correction
• HDR & LTM
Understand
• Face, people, object detection, segmentation & tracking
• Scene classification and recognition
Personalize
• Visible and NIR 3D FR
• Iris recognition
• Liveness detection & continuous recognition (hand jitter, facial, etc.)
Accelerate
• Computer vision accelerator
[SoC block diagram (legend: FN IP vs. 3rd-party blocks): sensor (MIPI), ISP, GPU, CPU, display/LTM, DDR controller, COMMS and flash interfaces, with the IPU comprising low-power face detection, HQ distortion correction, facial feature extraction, image registration, object detection & classification, PCNN clusters, and biometrics cores]
10. Pre-processing
• Multi-resolution stream generation
• Local tone mapping enhancement (significantly improves detection ratios)
• High-quality, low-latency distortion correction
Dedicated Cores
• Facial & people analytics (AI/ML)
• Stabilization (HQ resampling, analytics)
• Optional depth (AI/ML)
PCNNs
• PCNNs for reconfigurable functionality
• Concurrent support of multiple real-time networks
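The multi-resolution stream generation above can be pictured as an image pyramid: each stream halves the resolution of the previous one so detectors can run at several scales. A minimal sketch in Python/NumPy using simple 2x2 box-filter downsampling; the IPU's actual hardware resampler is not described in the slides:

```python
import numpy as np

def multires_streams(image: np.ndarray, levels: int = 3) -> list[np.ndarray]:
    """Build a multi-resolution pyramid by repeated 2x2 box-filter
    downsampling -- a software stand-in for a hardware stream generator."""
    streams = [image.astype(np.float32)]
    for _ in range(levels - 1):
        prev = streams[-1]
        # Crop to even dimensions so the image tiles exactly into 2x2 blocks.
        h, w = prev.shape[0] // 2 * 2, prev.shape[1] // 2 * 2
        cropped = prev[:h, :w]
        # Average each 2x2 block to halve resolution in both dimensions.
        down = cropped.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        streams.append(down)
    return streams

pyr = multires_streams(np.zeros((64, 48)), levels=3)
print([s.shape for s in pyr])  # [(64, 48), (32, 24), (16, 12)]
```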
11.
• IPU
  • RTL (image pre-processing, dedicated analytics and inference cores)
  • Tools for programming, training and debugging
• Testing Framework (a.k.a. ImageDB): a test framework supporting acquisition, marking, testing and reporting
• Data Sets (a.k.a. CV Infra): computer vision infrastructure supporting 2D and 3D image set acquisition, annotation, marking and training-set generation with ground truth
[Diagram: IPU cores & tools, testing framework, data sets; note: the CNN inference cores are a small part of the total]
15.
• Built-in on-the-fly pre-processing imaging engine (e.g., layer 0)
• Local memory for fast data access and reduced memory bandwidth
• Designed for very low latency and real-time network inference
• Supporting toolchain to implement customer-defined network architectures
• PCNN 1.2: 36 MAC/cycle, or PCNN 2.0: 512 MAC/cycle
• Support for compression, quantization, decryption
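To see what the MAC/cycle figures buy, a back-of-the-envelope frame-rate bound: sustained MACs per second divided by the MACs a network needs per frame. The 600 MHz clock and the 570M-MAC (MobileNet-class) network below are illustrative assumptions, not figures from the presentation:

```python
def fps_budget(macs_per_cycle: int, clock_hz: float, network_macs: float) -> float:
    """Upper-bound frame rate assuming the engine sustains its peak MAC rate
    every cycle (real utilization will be lower)."""
    return macs_per_cycle * clock_hz / network_macs

# Assumed 600 MHz clock; assumed network cost of ~570M MACs per frame.
for name, mpc in [("PCNN 1.2", 36), ("PCNN 2.0", 512)]:
    print(f"{name}: {fps_budget(mpc, 600e6, 570e6):.1f} fps upper bound")
```

Under these assumptions the 36 MAC/cycle variant caps out near 38 fps on such a network, while the 512 MAC/cycle variant has headroom for several concurrent real-time networks.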
16. PCNN CORE
[Block diagram: the PCNN engine sits behind a register block (REGS) with MAP RD, MAP WR and CODE RD masters on the system bus (AXI), an IRQ line, and an APB control bus; DDR, flash and the host CPU share the system side]
18. PCNN-CLUSTER
[Block diagram: a PCNN-C engine groups four PCNN cores behind an arbiter, with shared SRAM and an SRAM controller on a 1K-bit bus; the cluster core adds a RISC processor, mailbox, IRQ and CFG (AHB) interfaces, and connects over the system bus (AXI) to the host CPU, system memory (DDR, flash) and other PCNN-C clusters]
19.
• Networks sit in the device's permanent storage, and the device can be accessed and its storage contents read.
• Neural processor makers offer network transfer tools, so network representation patterns can be identified and localized in the storage contents.
• Once the network representation is known, the architecture and weight values can be obtained: network architecture & weights extraction.
• Networks can then be remapped and inferred on alternative architectures and passed off as one's own: network re-map and inference on alternative architectures.
20.
• The chip maker's software generates a chip secret Csec, stored on chip in fuses, and a public value Cpub; the NN provider's software generates a network secret Nsec and a public value Npub.
• Both sides derive the same stream-cipher key: Kstream = Npub * Csec = Cpub * Nsec.
• The NN provider encrypts the neural network with a stream cipher under Kstream; on chip, the PCNN DECIPHER block regenerates Kstream, decrypts the network, and passes it to PCNN INFERENCE.
• Only the encrypted network and the public values cross the SW-HW public interface; Csec never leaves the FotoNation HW IPU fuses, and Nsec never leaves the NN provider.
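The two key equations on this slide (Kstream = Npub * Csec = Cpub * Nsec) describe a commutative key agreement: each side combines its own secret with the other side's public value and arrives at the same Kstream. A toy sketch in Python, using Diffie-Hellman-style modular exponentiation as the commutative operation and a SHA-256-based XOR keystream; the actual cipher, group, and key sizes used by the PCNN are not specified in the slides:

```python
import hashlib
from itertools import count

# Toy parameters for illustration only; a real deployment uses vetted
# groups and ciphers, and the chip secret lives in on-chip fuses.
P = 0xFFFFFFFFFFFFFFC5  # prime (2**64 - 59); far too small for real security
G = 5

def keypair(secret: int) -> tuple[int, int]:
    """Return (secret, public) where public = G**secret mod P."""
    return secret, pow(G, secret, P)

def shared_key(my_secret: int, their_public: int) -> int:
    """Commutative combine: their_public**my_secret mod P."""
    return pow(their_public, my_secret, P)

def keystream(key: int, n: int) -> bytes:
    """Expand the shared key into n pseudorandom bytes via SHA-256 counter mode."""
    out = b""
    for i in count():
        if len(out) >= n:
            return out[:n]
        out += hashlib.sha256(key.to_bytes(16, "big") + i.to_bytes(4, "big")).digest()

def xor_cipher(data: bytes, key: int) -> bytes:
    """Stream cipher: XOR data with the keystream (same call encrypts/decrypts)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

n_sec, n_pub = keypair(0x1234)   # NN provider's secret/public pair
c_sec, c_pub = keypair(0x5678)   # chip's secret (fused) / public pair
assert shared_key(n_sec, c_pub) == shared_key(c_sec, n_pub)  # same Kstream

weights = b"network weights blob"
encrypted = xor_cipher(weights, shared_key(n_sec, c_pub))    # provider side
decrypted = xor_cipher(encrypted, shared_key(c_sec, n_pub))  # on-chip decipher
assert decrypted == weights
```

The point of the construction is that neither secret ever crosses the public interface, yet both sides reach the identical keystream.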
21.
22. ZiBond® (homogeneous bonding) and DBI® (hybrid bonding)
• Scalable to 1,000,000 interconnects per mm²
• Supported configurations: die-to-wafer (D2W), die-to-die (D2D), wafer-to-wafer (W2W)
• Capabilities: 3D design & architecture, materials characterization, simulation, wafer/die bonding & processing, reliability, failure analysis, technology development & optimization
23.
• Array of 2 µm pitch DBI® interconnects between silicon dies: up to 250,000 vertical interconnects per mm² enables a groundbreaking computing architecture
• Very short vertical interconnects offer ultra-high performance at very low power
[Block diagram: an SRAM die of 8 MB SRAM banks with controllers (8K- and 1K-bit buses) stacked via DBI® on a logic die carrying the host CPU, PCNN-C clusters, cache, AXI/AHB fabric (2x AXI 128), IRQ, DDR controller (LPDDR4 x32) and flash controller (NAND flash)]
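The density figures on this slide and the previous one follow directly from the pitch: a square grid at pitch p µm gives (1000/p)² pads per mm². A quick check:

```python
def interconnects_per_mm2(pitch_um: float) -> int:
    """Interconnect density of a square grid at the given pad pitch."""
    per_side = 1000.0 / pitch_um  # pads per mm along one axis
    return int(per_side * per_side)

print(interconnects_per_mm2(2.0))  # 250000: the 2 um DBI array above
print(interconnects_per_mm2(1.0))  # 1000000: the scaling target on slide 22
```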
30. DBI® stacking flow (industry-standard damascene process):
1. Logic wafer/die with pre-drilled TSVs and DBI® layer
2. SRAM wafer/die with DBI® layer
3. Face-to-face ZiBond® bonding of the two wafers/dies
4. Thinning of the logic wafer/die to expose the TSVs
5. Wafer-level packaging of the stacked wafers/dies
(Cross-section labels: DBI® layer, interconnect layers, active (IC) layer, TSVs, bulk silicon)
31. 3D Wafer Bonding with Copper Interconnects (DBI® with ZiBond®: the most advanced 3D interconnects)
• Supported configurations: wafer-to-wafer, chip-to-wafer, and chip-to-chip bonding
• ZiBond®: full-surface bonding (no underfill); DBI®: Cu-Cu interconnect joining (no solder)
• Sony's latest image sensor uses DBI®: a BSI image sensor (wafer 1) is bonded to a logic wafer (wafer 2) carrying local memory and signal processing, joined by µm-scale Cu interconnects over a full-surface oxide bond