SlideShare a Scribd company logo
1 of 46
Download to read offline
Northwestern
1
Computer Engineering Research
@Northwestern
•  Housed within the Electrical and Computer
Engineering Department, co-owned by the
Computer Science Department
– Faculty have dual appointments
– PhD programs in both departments have
Computer Engineering tracks
•  Design Automation and VLSI, Computer
Architecture, Internet of Things,
Embedded Systems, Big Data & AI
2
Computer Engineering Research
@Northwestern
•  Hardware assisted ML
•  Neuromorphic computing
•  Hardware security
•  Photonic circuits in computer architectures
•  Emphatic computing
•  Batteryless computing
•  Secure and verified software for automotive
application
•  Adaptive architecture – compiler in the loop
3
Research Interests
•  Electronic Design Automation (EDA)
•  Circuits and Systems
•  High Performance Computing (HPC)
Systems
4
Research Activities
•  Design Automation
–  High-Level Synthesis
•  Converting applications described with programming languages to hardware
•  Circuits and Systems
–  Thermal and Power Management of ICs
•  Temperature Sensors, 3D Integration, Power and Performance
Modeling using Machine Learning
5
Research Activities
•  High Performance
Computing Systems
– FPGA-based
accelerators for HPC
applications
– Thermal and Power
Management in large
scale systems
6
HW Assisted ML
•  Design Automation for real-time AI
•  Cyberinfrastructure for scientists who deploy HW
systems for ML
7
Physics	
Material	
Science	
Astro-
physics	
Particle	
Tracking	
Image	Reconstruction	
Real-time	control	of	
microscopy	
Accelerator	
Control	
Semantic	
compression	
Data	filtering,	compression,	reconstruction,	
feedback	controllers
Handling Big Data
•  Source of the computational challenge
– Particle acceleration experiment in high-energy
physics
•  Billions of “event” happen every ~25ns creating Pb/
sec data rates
•  “Interesting” events (~3% of all) need to be
recognized and filtered for further processing
–  Sky survey data collected from observatories
stream in at rates ~Gb/s (or more)
•  Limited energy (generators in South Pole) to move
data to datacenters
8
Handling Big Data
•  Source of the computational challenge
–  CPU-GPU systems perform streaming inference on
images captured by an electron microscope within
~300ms latency
–  Desired goal is to perform inference on images
captured at 10s of millions of frames per second
under 50ms latency
–  ~50ms latency could allow real-time control of the
EM during material synthesis
•  Think of making defects for a quantum material,
deposition of nano layers, etc.
9
Key areas of Overlap: System
Constraints
10
Key areas of Overlap: System
Constraints
11
Key areas of Overlap: System
Constraints
12
Projects – Complete/Underway
•  READS – Accelerator Real-time Edge AI
for Distributed Systems
•  Design of a reconfigurable autoencoder
algorithm for detector front-end ASICs
13
Projects – Preliminary
•  CryoAI - 22nm testchip
•  Adaptive ML accelerators
14
READS – Accelerator Real-time
Edge AI for Distributed Systems
•  Goal: integrate ML into accelerator operations
•  Challenges: Beam Loss
–  High Energy Physics (HEP) experiment use proton
beams
–  Particles get lost through interactions with the beam
vacuum pipe
–  Intensity/pace of beam extraction affects radiation in
the environment
–  Human operators tune parameters of hundreds to
thousands of devices inside the accelerator complex
–  If beam loss (“leak”) is at extreme levels, system
needs to shut-down – loss of access to facility
15
READS – Accelerator Real-time
Edge AI for Distributed Systems
16
•  Muon Complex
•  Protons are accelerators within
the Booster
•  Injected into the Recycler Ring
–  The proton beam is “prepared”
•  Proton beam is released into the
Delivery Ring
–  Multiple experiments stationed
around it
•  Proton beam is extracted to
match the exact intensity
required for an experiments -
spill
READS – Accelerator Real-time
Edge AI for Distributed Systems
17
•  Two main mechanisms for beam control/correction
–  Tune Quads – quadrupole magnets
–  RFKO Kickers - heaters
READS – Accelerator Real-time
Edge AI for Distributed Systems
•  System Design Goal: PID Controller gains
need to be optimized in real-time ~ms
•  The ML Processor receives inputs from
sensors
– beam position monitor (BPM)
– beam loss monitor (BLM)
18
READS – Accelerator Real-time
Edge AI for Distributed Systems
•  System Architecture:
19
READS – Accelerator Real-time
Edge AI for Distributed Systems
•  System Architecture:
– Edge device continuously optimizes the online
control agent
– Data streamed to a cloud system for large
scale training
20
READS – Accelerator Real-time
Edge AI for Distributed Systems
•  Algorithm-
Architecture Co-
design
–  A common
toolchain to
program FPGA
devices as well
as create
interfaces
21
READS – Accelerator Real-time
Edge AI for Distributed Systems
•  Algorithm-Architecture Co-design
–  Create modular neural network components
•  Regroup, recombine
–  Establish physics/science-aware methodology
•  Hardware-aware quantization and pruning techniques
•  Homogeneous versus heterogeneous quantization
•  Quantization-aware training
•  Control resource re-use trade-offs of high level
synthesis tools
•  Differentiate between the relative hardware cost of
storage, DSP, interconnect
22
READS – Accelerator Real-time
Edge AI for Distributed Systems
23
hls4ml
•  Aids Algorithm-Architecture Co-design
– Translation to neural network building blocks
•  Basic layers (conv, fully connected)
•  Activation functions
– Ongoing investigations to build a
comprehensive library
•  Expansion towards graph neural networks, custom
scientific kernels
•  Custom neurons (long-short-term memory, multi-
head attention, etc. )
24
hls4ml
•  Aids Algorithm-Architecture Co-design
– Ongoing work to automate compatibility with
several backend flows
•  Xilinx Vivado, Intel Quartus HLS, Mentor Graphics
Catapult
•  Integration of model development (training, model
compression) and hardware synthesis
•  Automation of insertion for optimization directives
and pragmas
25
hls4ml
•  Democratize design tools
•  Allow domain scientists to build edge
computing systems with minimal specialized
hardware design training
•  Open source
•  HLS raises level of abstraction
–  Faster simulation, faster evaluation of design
alternatives
–  Allow domain expert trade-off parallelism,
resource re-use, latency constraint, power budget
26
Reconfigurable autoencoder for detector
front-end ASICs
•  Edge computing for particle collider experiments
•  Data collected from a large number of photon
detectors are compressed to representative
information of the “shape”
–  Charge measurements from the detectors are
compressed to a radiation pattern
–  6 million detector channels sending data at 40MHz
–  The data from the original space is compressed to
lower dimensionality by the edge ASIC, transmitted,
then decoded on the receiving end
27
Reconfigurable autoencoder for detector
front-end ASICs
•  System constraints
– Low power
•  Will be part of a larger system with power budget
– Radiation tolerant
– Reprogrammable weights through accessible
registers to enable updates and customization
28
Reconfigurable autoencoder for detector
front-end ASICs
•  Autoencoder:
neural network
with single
convolutional
layer followed
by a dense
layer
29
Reconfigurable autoencoder for detector
front-end ASICs
30
Glossary	of	Deep	Learning:	Autoencoder,	by	Jaron	Collins
Reconfigurable autoencoder for detector
front-end ASICs
•  Dataflow
–  22-bit signals from 48 detector cells
31
Reconfigurable autoencoder for detector
front-end ASICs
•  Network properties
–  CNN layer: eight 3x3x3 kernel matrices resulting in 128
outputs
–  ReLu activation after CNN and after final dense layer
–  6-bits weights
–  Dense layer produces 16 10-bit outputs
–  Chip can be reconfigured to produce as low as total of
64 bits in output
32
Reconfigurable autoencoder for detector
front-end ASICs
33
Reconfigurable autoencoder for detector
front-end ASICs
34
Reconfigurable autoencoder for detector
front-end ASICs
35
•  Protection against Single Event Upsets
Reconfigurable autoencoder for detector
front-end ASICs
36
•  Protection against Single Event Upsets
Reconfigurable autoencoder for detector
front-end ASICs
•  Chip specs
–  6b weight and bias parameters
•  Total: 2,286 = 13,724 bits
–  Parameters loaded via I2C interface
–  Decoder component implemented off-detector on
FPGA
–  Chip latency – 25ns
–  7nJ per inference
–  280mW
–  Can withstand 200MRad ionizing radiation
–  2.5mm2
37
Projects – Preliminary
•  CryoAI - 22nm testchip
•  Adaptive ML accelerators
38
CryoAI - 22nm testchip
•  A system-on-chip (SoC)
•  Goal: test performance and leakage power trends of the
technology at cryogenic temperatures
•  An opportunity to showcase the hls4ml flow
–  Contribute a neural network accelerator module as part of the
SoC
–  Perform anomaly detection
•  Design considerations
–  Programmability -- Cannot reload all of the parameters
–  Optimal sequencing of quantization & pruning
–  Assignment of sub-regions of the network to voltage domains
39
Adaptive ML accelerators
•  Why adaptation?
– Energy constraints
– Input variability
– Translation to new platform/device
– Transfer Learning
40
Adaptive ML accelerators
•  Some directions
–  Emulate loss through drop out/connect
•  Techniques in ML literature equate these phenomena to
forcing weights to zero
•  Loss in (because of) hardware may look very different
–  Continuous diagnostics
•  Evaluation of network certainty
•  Concept of surprise
–  Detection of temporal dominance of classes
•  Reconfigure optimized version for dominant class
41
Adaptive ML accelerators
•  Some directions
–  Critical path based hardware-aware resource
management
•  Class-based CP: using contributions of a neuron to a
specific class
–  Mean Absolute Activation, first order Taylor Approximations
•  Generalized CP: relative participation of output channels at
the routing of the output from a layer
–  Distribute resources (e.g., total number of bits
allocated for weights) according to a criticality metric
42
Adaptive ML accelerators
•  Some directions
–  Characterize circuits (e.g., SRAM) to create models
•  Associate voltage drop/power outage with cell decay
•  Create characteristic bit masks for weights
– Neural network architecture search
•  Lessons learned from design space exploration in
high-level synthesis
•  Search for a new cell from basic building blocks
–  Input: a set of convolutions and pooling of varying size
–  Think of it as your module library
43
Future Directions
•  Fluid implementations
–  Quickly re-targetable from software to hardware
•  Scientific domains offer a vast space of
computational challenges
–  They need interpretable systems
–  Laws of nature apparent in the system’s output
•  Multiple paths need to converge
–  Photonic circuits – not much automated
–  FPGA
–  Neuromorphic – not much automated
–  Quantum - ??
44
Collaborators
•  Northwestern University
–  Han Liu (CS), Kristian Hahn (Physics)
–  Manuel Blanco Valentin, Rui Shi, Bincong Ye, Yingyi
Luo (now at Google), Sid Joshi (now at Intel)
•  Fermi National Laboratories
–  Nhan Tran, Farah Fahim, Christian Herwig, Kiyomi
Seiya, et al.
•  Columbia University
–  Giuseppe di Guglielmo
•  Lehigh University
–  Josh Agar
45
THANK YOU!
46

More Related Content

What's hot

COGNITIVE COMPUTING
COGNITIVE COMPUTINGCOGNITIVE COMPUTING
COGNITIVE COMPUTINGmitali singh
 
A seminar on Brain Chip Interface Abhishek Verma
A  seminar  on Brain Chip Interface Abhishek VermaA  seminar  on Brain Chip Interface Abhishek Verma
A seminar on Brain Chip Interface Abhishek VermaÂßhîshêk Vêrmã
 
Brain Computer Interface
Brain Computer InterfaceBrain Computer Interface
Brain Computer InterfaceSubham Kundu
 
The ECP Exascale Computing Project
The ECP Exascale Computing ProjectThe ECP Exascale Computing Project
The ECP Exascale Computing Projectinside-BigData.com
 
13 nicole neurons growing on silicon presentation
13 nicole neurons growing on silicon presentation 13 nicole neurons growing on silicon presentation
13 nicole neurons growing on silicon presentation Nicole Rivera
 
Blue brain project ppt
Blue brain project pptBlue brain project ppt
Blue brain project pptLishita Shah
 
Deep Networks with Neuromorphic VLSI devices
Deep Networks with Neuromorphic VLSI devicesDeep Networks with Neuromorphic VLSI devices
Deep Networks with Neuromorphic VLSI devicesGiacomo Indiveri
 
初探人工智慧
初探人工智慧初探人工智慧
初探人工智慧Paul Yang
 
AMD EPYC™ Microprocessor Architecture
AMD EPYC™ Microprocessor ArchitectureAMD EPYC™ Microprocessor Architecture
AMD EPYC™ Microprocessor ArchitectureAMD
 
Integer quantization for deep learning inference: principles and empirical ev...
Integer quantization for deep learning inference: principles and empirical ev...Integer quantization for deep learning inference: principles and empirical ev...
Integer quantization for deep learning inference: principles and empirical ev...jemin lee
 
BRAIN COMPUTER INTERFACE (BCI)
BRAIN COMPUTER INTERFACE (BCI)BRAIN COMPUTER INTERFACE (BCI)
BRAIN COMPUTER INTERFACE (BCI)UmmeKalsoom11
 
Cognitive computing ppt.
Cognitive computing ppt.Cognitive computing ppt.
Cognitive computing ppt.KRIPAPIOUS
 
Neurosynaptic chips
Neurosynaptic chipsNeurosynaptic chips
Neurosynaptic chipsJeffrey Funk
 

What's hot (20)

Cognitive computing
Cognitive computing Cognitive computing
Cognitive computing
 
Cognitive computing
Cognitive computingCognitive computing
Cognitive computing
 
BLUE BRAIN TECHNOLOGY
BLUE BRAIN TECHNOLOGYBLUE BRAIN TECHNOLOGY
BLUE BRAIN TECHNOLOGY
 
COGNITIVE COMPUTING
COGNITIVE COMPUTINGCOGNITIVE COMPUTING
COGNITIVE COMPUTING
 
A seminar on Brain Chip Interface Abhishek Verma
A  seminar  on Brain Chip Interface Abhishek VermaA  seminar  on Brain Chip Interface Abhishek Verma
A seminar on Brain Chip Interface Abhishek Verma
 
Brain Computer Interface
Brain Computer InterfaceBrain Computer Interface
Brain Computer Interface
 
Cognitive computing 2016
Cognitive computing 2016Cognitive computing 2016
Cognitive computing 2016
 
The ECP Exascale Computing Project
The ECP Exascale Computing ProjectThe ECP Exascale Computing Project
The ECP Exascale Computing Project
 
13 nicole neurons growing on silicon presentation
13 nicole neurons growing on silicon presentation 13 nicole neurons growing on silicon presentation
13 nicole neurons growing on silicon presentation
 
Blue brain project ppt
Blue brain project pptBlue brain project ppt
Blue brain project ppt
 
Deep Networks with Neuromorphic VLSI devices
Deep Networks with Neuromorphic VLSI devicesDeep Networks with Neuromorphic VLSI devices
Deep Networks with Neuromorphic VLSI devices
 
Blue brain
Blue brainBlue brain
Blue brain
 
初探人工智慧
初探人工智慧初探人工智慧
初探人工智慧
 
Ppt of nanocomputing
Ppt of nanocomputingPpt of nanocomputing
Ppt of nanocomputing
 
AMD EPYC™ Microprocessor Architecture
AMD EPYC™ Microprocessor ArchitectureAMD EPYC™ Microprocessor Architecture
AMD EPYC™ Microprocessor Architecture
 
Best Ever PPT Of Bluebrain
Best Ever PPT Of BluebrainBest Ever PPT Of Bluebrain
Best Ever PPT Of Bluebrain
 
Integer quantization for deep learning inference: principles and empirical ev...
Integer quantization for deep learning inference: principles and empirical ev...Integer quantization for deep learning inference: principles and empirical ev...
Integer quantization for deep learning inference: principles and empirical ev...
 
BRAIN COMPUTER INTERFACE (BCI)
BRAIN COMPUTER INTERFACE (BCI)BRAIN COMPUTER INTERFACE (BCI)
BRAIN COMPUTER INTERFACE (BCI)
 
Cognitive computing ppt.
Cognitive computing ppt.Cognitive computing ppt.
Cognitive computing ppt.
 
Neurosynaptic chips
Neurosynaptic chipsNeurosynaptic chips
Neurosynaptic chips
 

Similar to AI Hardware for Real-Time Machine Learning

Design Automation Approaches for Real-Time Edge Computing for Science Applic...
 Design Automation Approaches for Real-Time Edge Computing for Science Applic... Design Automation Approaches for Real-Time Edge Computing for Science Applic...
Design Automation Approaches for Real-Time Edge Computing for Science Applic...Facultad de Informática UCM
 
How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?Deepak Shankar
 
How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?Deepak Shankar
 
How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?Deepak Shankar
 
REAL-TIME SIMULATION TECHNOLOGIES FOR POWER SYSTEMS DESIGN, TESTING, AND ANAL...
REAL-TIME SIMULATION TECHNOLOGIES FOR POWER SYSTEMS DESIGN, TESTING, AND ANAL...REAL-TIME SIMULATION TECHNOLOGIES FOR POWER SYSTEMS DESIGN, TESTING, AND ANAL...
REAL-TIME SIMULATION TECHNOLOGIES FOR POWER SYSTEMS DESIGN, TESTING, AND ANAL...Jithin T
 
How HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceHow HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceinside-BigData.com
 
Integrated modeling and simulation framework for wireless sensor networks
Integrated modeling and simulation framework for wireless sensor networksIntegrated modeling and simulation framework for wireless sensor networks
Integrated modeling and simulation framework for wireless sensor networksDaniele Gianni
 
SISTec Microelectronics VLSI design
SISTec Microelectronics VLSI designSISTec Microelectronics VLSI design
SISTec Microelectronics VLSI designDr. Ravi Mishra
 
Exploration of Radars and Software Defined Radios using VisualSim
Exploration of  Radars and Software Defined Radios using VisualSimExploration of  Radars and Software Defined Radios using VisualSim
Exploration of Radars and Software Defined Radios using VisualSimDeepak Shankar
 
2017 Atlanta Regional User Seminar - Real-Time Microgrid Demos
2017 Atlanta Regional User Seminar - Real-Time Microgrid Demos2017 Atlanta Regional User Seminar - Real-Time Microgrid Demos
2017 Atlanta Regional User Seminar - Real-Time Microgrid DemosOPAL-RT TECHNOLOGIES
 
Digital Systems Design
Digital Systems DesignDigital Systems Design
Digital Systems DesignReza Sameni
 
Architectural tricks to maximize memory bandwidth
Architectural tricks to maximize memory bandwidthArchitectural tricks to maximize memory bandwidth
Architectural tricks to maximize memory bandwidthDeepak Shankar
 
Mirabilis_Design AMD Versal System-Level IP Library
Mirabilis_Design AMD Versal System-Level IP LibraryMirabilis_Design AMD Versal System-Level IP Library
Mirabilis_Design AMD Versal System-Level IP LibraryDeepak Shankar
 
A Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural NetworksA Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural Networksinside-BigData.com
 
Introduction to embedded System.pptx
Introduction to embedded System.pptxIntroduction to embedded System.pptx
Introduction to embedded System.pptxPratik Gohel
 
Summary Of Course Projects
Summary Of Course ProjectsSummary Of Course Projects
Summary Of Course Projectsawan2008
 
Chandan Kumar_3+_Years _EXP
Chandan Kumar_3+_Years _EXPChandan Kumar_3+_Years _EXP
Chandan Kumar_3+_Years _EXPChandan kumar
 

Similar to AI Hardware for Real-Time Machine Learning (20)

Design Automation Approaches for Real-Time Edge Computing for Science Applic...
 Design Automation Approaches for Real-Time Edge Computing for Science Applic... Design Automation Approaches for Real-Time Edge Computing for Science Applic...
Design Automation Approaches for Real-Time Edge Computing for Science Applic...
 
How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?
 
How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?
 
How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?
 
REAL-TIME SIMULATION TECHNOLOGIES FOR POWER SYSTEMS DESIGN, TESTING, AND ANAL...
REAL-TIME SIMULATION TECHNOLOGIES FOR POWER SYSTEMS DESIGN, TESTING, AND ANAL...REAL-TIME SIMULATION TECHNOLOGIES FOR POWER SYSTEMS DESIGN, TESTING, AND ANAL...
REAL-TIME SIMULATION TECHNOLOGIES FOR POWER SYSTEMS DESIGN, TESTING, AND ANAL...
 
How HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceHow HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental science
 
Integrated modeling and simulation framework for wireless sensor networks
Integrated modeling and simulation framework for wireless sensor networksIntegrated modeling and simulation framework for wireless sensor networks
Integrated modeling and simulation framework for wireless sensor networks
 
SISTec Microelectronics VLSI design
SISTec Microelectronics VLSI designSISTec Microelectronics VLSI design
SISTec Microelectronics VLSI design
 
CAOS: A CAD Framework for FPGA-Based Systems
CAOS: A CAD Framework for FPGA-Based SystemsCAOS: A CAD Framework for FPGA-Based Systems
CAOS: A CAD Framework for FPGA-Based Systems
 
Exploration of Radars and Software Defined Radios using VisualSim
Exploration of  Radars and Software Defined Radios using VisualSimExploration of  Radars and Software Defined Radios using VisualSim
Exploration of Radars and Software Defined Radios using VisualSim
 
2017 Atlanta Regional User Seminar - Real-Time Microgrid Demos
2017 Atlanta Regional User Seminar - Real-Time Microgrid Demos2017 Atlanta Regional User Seminar - Real-Time Microgrid Demos
2017 Atlanta Regional User Seminar - Real-Time Microgrid Demos
 
Digital Systems Design
Digital Systems DesignDigital Systems Design
Digital Systems Design
 
L26.B.FA20.ppt
L26.B.FA20.pptL26.B.FA20.ppt
L26.B.FA20.ppt
 
Node architecture
Node architectureNode architecture
Node architecture
 
Architectural tricks to maximize memory bandwidth
Architectural tricks to maximize memory bandwidthArchitectural tricks to maximize memory bandwidth
Architectural tricks to maximize memory bandwidth
 
Mirabilis_Design AMD Versal System-Level IP Library
Mirabilis_Design AMD Versal System-Level IP LibraryMirabilis_Design AMD Versal System-Level IP Library
Mirabilis_Design AMD Versal System-Level IP Library
 
A Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural NetworksA Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural Networks
 
Introduction to embedded System.pptx
Introduction to embedded System.pptxIntroduction to embedded System.pptx
Introduction to embedded System.pptx
 
Summary Of Course Projects
Summary Of Course ProjectsSummary Of Course Projects
Summary Of Course Projects
 
Chandan Kumar_3+_Years _EXP
Chandan Kumar_3+_Years _EXPChandan Kumar_3+_Years _EXP
Chandan Kumar_3+_Years _EXP
 

More from Facultad de Informática UCM

¿Por qué debemos seguir trabajando en álgebra lineal?
¿Por qué debemos seguir trabajando en álgebra lineal?¿Por qué debemos seguir trabajando en álgebra lineal?
¿Por qué debemos seguir trabajando en álgebra lineal?Facultad de Informática UCM
 
TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...
TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...
TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...Facultad de Informática UCM
 
DRAC: Designing RISC-V-based Accelerators for next generation Computers
DRAC: Designing RISC-V-based Accelerators for next generation ComputersDRAC: Designing RISC-V-based Accelerators for next generation Computers
DRAC: Designing RISC-V-based Accelerators for next generation ComputersFacultad de Informática UCM
 
Tendencias en el diseño de procesadores con arquitectura Arm
Tendencias en el diseño de procesadores con arquitectura ArmTendencias en el diseño de procesadores con arquitectura Arm
Tendencias en el diseño de procesadores con arquitectura ArmFacultad de Informática UCM
 
Introduction to Quantum Computing and Quantum Service Oriented Computing
Introduction to Quantum Computing and Quantum Service Oriented ComputingIntroduction to Quantum Computing and Quantum Service Oriented Computing
Introduction to Quantum Computing and Quantum Service Oriented ComputingFacultad de Informática UCM
 
Inteligencia Artificial en la atención sanitaria del futuro
Inteligencia Artificial en la atención sanitaria del futuroInteligencia Artificial en la atención sanitaria del futuro
Inteligencia Artificial en la atención sanitaria del futuroFacultad de Informática UCM
 
Estrategias de navegación para robótica móvil de campo: caso de estudio proye...
Estrategias de navegación para robótica móvil de campo: caso de estudio proye...Estrategias de navegación para robótica móvil de campo: caso de estudio proye...
Estrategias de navegación para robótica móvil de campo: caso de estudio proye...Facultad de Informática UCM
 
Fault-tolerance Quantum computation and Quantum Error Correction
Fault-tolerance Quantum computation and Quantum Error CorrectionFault-tolerance Quantum computation and Quantum Error Correction
Fault-tolerance Quantum computation and Quantum Error CorrectionFacultad de Informática UCM
 
Cómo construir un chatbot inteligente sin morir en el intento
Cómo construir un chatbot inteligente sin morir en el intentoCómo construir un chatbot inteligente sin morir en el intento
Cómo construir un chatbot inteligente sin morir en el intentoFacultad de Informática UCM
 
Automatic generation of hardware memory architectures for HPC
Automatic generation of hardware memory architectures for HPCAutomatic generation of hardware memory architectures for HPC
Automatic generation of hardware memory architectures for HPCFacultad de Informática UCM
 
Hardware/software security contracts: Principled foundations for building sec...
Hardware/software security contracts: Principled foundations for building sec...Hardware/software security contracts: Principled foundations for building sec...
Hardware/software security contracts: Principled foundations for building sec...Facultad de Informática UCM
 
Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...
Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...
Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...Facultad de Informática UCM
 
Redes neuronales y reinforcement learning. Aplicación en energía eólica.
Redes neuronales y reinforcement learning. Aplicación en energía eólica.Redes neuronales y reinforcement learning. Aplicación en energía eólica.
Redes neuronales y reinforcement learning. Aplicación en energía eólica.Facultad de Informática UCM
 
Challenges and Opportunities for AI and Data analytics in Offshore wind
Challenges and Opportunities for AI and Data analytics in Offshore windChallenges and Opportunities for AI and Data analytics in Offshore wind
Challenges and Opportunities for AI and Data analytics in Offshore windFacultad de Informática UCM
 
Evolution and Trends in Edge AI Systems and Architectures for the Internet of...
Evolution and Trends in Edge AI Systems and Architectures for the Internet of...Evolution and Trends in Edge AI Systems and Architectures for the Internet of...
Evolution and Trends in Edge AI Systems and Architectures for the Internet of...Facultad de Informática UCM
 

More from Facultad de Informática UCM (20)

¿Por qué debemos seguir trabajando en álgebra lineal?
¿Por qué debemos seguir trabajando en álgebra lineal?¿Por qué debemos seguir trabajando en álgebra lineal?
¿Por qué debemos seguir trabajando en álgebra lineal?
 
TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...
TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...
TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...
 
DRAC: Designing RISC-V-based Accelerators for next generation Computers
DRAC: Designing RISC-V-based Accelerators for next generation ComputersDRAC: Designing RISC-V-based Accelerators for next generation Computers
DRAC: Designing RISC-V-based Accelerators for next generation Computers
 
uElectronics ongoing activities at ESA
uElectronics ongoing activities at ESAuElectronics ongoing activities at ESA
uElectronics ongoing activities at ESA
 
Tendencias en el diseño de procesadores con arquitectura Arm
Tendencias en el diseño de procesadores con arquitectura ArmTendencias en el diseño de procesadores con arquitectura Arm
Tendencias en el diseño de procesadores con arquitectura Arm
 
Formalizing Mathematics in Lean
Formalizing Mathematics in LeanFormalizing Mathematics in Lean
Formalizing Mathematics in Lean
 
Introduction to Quantum Computing and Quantum Service Oriented Computing
Introduction to Quantum Computing and Quantum Service Oriented ComputingIntroduction to Quantum Computing and Quantum Service Oriented Computing
Introduction to Quantum Computing and Quantum Service Oriented Computing
 
Computer Design Concepts for Machine Learning
Computer Design Concepts for Machine LearningComputer Design Concepts for Machine Learning
Computer Design Concepts for Machine Learning
 
Inteligencia Artificial en la atención sanitaria del futuro
Inteligencia Artificial en la atención sanitaria del futuroInteligencia Artificial en la atención sanitaria del futuro
Inteligencia Artificial en la atención sanitaria del futuro
 
Estrategias de navegación para robótica móvil de campo: caso de estudio proye...
Estrategias de navegación para robótica móvil de campo: caso de estudio proye...Estrategias de navegación para robótica móvil de campo: caso de estudio proye...
Estrategias de navegación para robótica móvil de campo: caso de estudio proye...
 
Fault-tolerance Quantum computation and Quantum Error Correction
Fault-tolerance Quantum computation and Quantum Error CorrectionFault-tolerance Quantum computation and Quantum Error Correction
Fault-tolerance Quantum computation and Quantum Error Correction
 
Cómo construir un chatbot inteligente sin morir en el intento
Cómo construir un chatbot inteligente sin morir en el intentoCómo construir un chatbot inteligente sin morir en el intento
Cómo construir un chatbot inteligente sin morir en el intento
 
Automatic generation of hardware memory architectures for HPC
Automatic generation of hardware memory architectures for HPCAutomatic generation of hardware memory architectures for HPC
Automatic generation of hardware memory architectures for HPC
 
Type and proof structures for concurrency
Type and proof structures for concurrencyType and proof structures for concurrency
Type and proof structures for concurrency
 
Hardware/software security contracts: Principled foundations for building sec...
Hardware/software security contracts: Principled foundations for building sec...Hardware/software security contracts: Principled foundations for building sec...
Hardware/software security contracts: Principled foundations for building sec...
 
Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...
Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...
Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...
 
Do you trust your artificial intelligence system?
Do you trust your artificial intelligence system?Do you trust your artificial intelligence system?
Do you trust your artificial intelligence system?
 
Redes neuronales y reinforcement learning. Aplicación en energía eólica.
Redes neuronales y reinforcement learning. Aplicación en energía eólica.Redes neuronales y reinforcement learning. Aplicación en energía eólica.
Redes neuronales y reinforcement learning. Aplicación en energía eólica.
 
Challenges and Opportunities for AI and Data analytics in Offshore wind
Challenges and Opportunities for AI and Data analytics in Offshore windChallenges and Opportunities for AI and Data analytics in Offshore wind
Challenges and Opportunities for AI and Data analytics in Offshore wind
 
Evolution and Trends in Edge AI Systems and Architectures for the Internet of...
Evolution and Trends in Edge AI Systems and Architectures for the Internet of...Evolution and Trends in Edge AI Systems and Architectures for the Internet of...
Evolution and Trends in Edge AI Systems and Architectures for the Internet of...
 

Recently uploaded

HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduitsrknatarajan
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...RajaP95
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 

Recently uploaded (20)

HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduits
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 

AI Hardware for Real-Time Machine Learning

  • 2. Computer Engineering Research @Northwestern •  Housed within the Electrical and Computer Engineering Department, co-owned by the Computer Science Department – Faculty have dual appointments – PhD programs in both departments have Computer Engineering tracks •  Design Automation and VLSI, Computer Architecture, Internet of Things, Embedded Systems, Big Data & AI 2
  • 3. Computer Engineering Research @Northwestern •  Hardware assisted ML •  Neuromorphic computing •  Hardware security •  Photonic circuits in computer architectures •  Emphatic computing •  Batteryless computing •  Secure and verified software for automotive application •  Adaptive architecture – compiler in the loop 3
  • 4. Research Interests •  Electronic Design Automation (EDA) •  Circuits and Systems •  High Performance Computing (HPC) Systems 4
  • 5. Research Activities •  Design Automation –  High-Level Synthesis •  Converting applications described with programming languages to hardware •  Circuits and Systems –  Thermal and Power Management of ICs •  Temperature Sensors, 3D Integration, Power and Performance Modeling using Machine Learning 5
  • 6. Research Activities •  High Performance Computing Systems – FPGA-based accelerators for HPC applications – Thermal and Power Management in large scale systems 6
  • 7. HW Assisted ML •  Design Automation for real-time AI •  Cyberinfrastructure for scientists who deploy HW systems for ML 7 Physics Material Science Astro- physics Particle Tracking Image Reconstruction Real-time control of microscopy Accelerator Control Semantic compression Data filtering, compression, reconstruction, feedback controllers
  • 8. Handling Big Data •  Source of the computational challenge – Particle acceleration experiment in high-energy physics •  Billions of “event” happen every ~25ns creating Pb/ sec data rates •  “Interesting” events (~3% of all) need to be recognized and filtered for further processing –  Sky survey data collected from observatories stream in at rates ~Gb/s (or more) •  Limited energy (generators in South Pole) to move data to datacenters 8
  • 9. Handling Big Data •  Source of the computational challenge –  CPU-GPU systems perform streaming inference on images captured by an electron microscope within ~300ms latency –  Desired goal is to perform inference on images captured at 10s of millions of frames per second under 50ms latency –  ~50ms latency could allow real-time control of the EM during material synthesis •  Think of making defects for a quantum material, deposition of nano layers, etc. 9
  • 10. Key areas of Overlap: System Constraints 10
  • 11. Key areas of Overlap: System Constraints 11
  • 12. Key areas of Overlap: System Constraints 12
  • 13. Projects – Complete/Underway •  READS – Accelerator Real-time Edge AI for Distributed Systems •  Design of a reconfigurable autoencoder algorithm for detector front-end ASICs 13
  • 14. Projects – Preliminary •  CryoAI - 22nm testchip •  Adaptive ML accelerators 14
  • 15. READS – Accelerator Real-time Edge AI for Distributed Systems •  Goal: integrate ML into accelerator operations •  Challenges: Beam Loss –  High Energy Physics (HEP) experiment use proton beams –  Particles get lost through interactions with the beam vacuum pipe –  Intensity/pace of beam extraction affects radiation in the environment –  Human operators tune parameters of hundreds to thousands of devices inside the accelerator complex –  If beam loss (“leak”) is at extreme levels, system needs to shut-down – loss of access to facility 15
  • 16. READS – Accelerator Real-time Edge AI for Distributed Systems 16 •  Muon Complex •  Protons are accelerators within the Booster •  Injected into the Recycler Ring –  The proton beam is “prepared” •  Proton beam is released into the Delivery Ring –  Multiple experiments stationed around it •  Proton beam is extracted to match the exact intensity required for an experiments - spill
  • 17. READS – Accelerator Real-time Edge AI for Distributed Systems 17 •  Two main mechanisms for beam control/correction –  Tune Quads – quadrupole magnets –  RFKO Kickers - heaters
  • 18. READS – Accelerator Real-time Edge AI for Distributed Systems •  System Design Goal: PID Controller gains need to be optimized in real-time ~ms •  The ML Processor receives inputs from sensors – beam position monitor (BPM) – beam loss monitor (BLM) 18
  • 19. READS – Accelerator Real-time Edge AI for Distributed Systems •  System Architecture: 19
  • 20. READS – Accelerator Real-time Edge AI for Distributed Systems •  System Architecture: – Edge device continuously optimizes the online control agent – Data streamed to a cloud system for large scale training 20
  • 21. READS – Accelerator Real-time Edge AI for Distributed Systems •  Algorithm- Architecture Co- design –  A common toolchain to program FPGA devices as well as create interfaces 21
  • 22. READS – Accelerator Real-time Edge AI for Distributed Systems •  Algorithm-Architecture Co-design –  Create modular neural network components •  Regroup, recombine –  Establish physics/science-aware methodology •  Hardware-aware quantization and pruning techniques •  Homogeneous versus heterogeneous quantization •  Quantization-aware training •  Control resource re-use trade-offs of high level synthesis tools •  Differentiate between the relative hardware cost of storage, DSP, interconnect 22
  • 23. READS – Accelerator Real-time Edge AI for Distributed Systems 23
  • 24. hls4ml •  Aids Algorithm-Architecture Co-design – Translation to neural network building blocks •  Basic layers (conv, fully connected) •  Activation functions – Ongoing investigations to build a comprehensive library •  Expansion towards graph neural networks, custom scientific kernels •  Custom neurons (long-short-term memory, multi- head attention, etc. ) 24
  • 25. hls4ml •  Aids Algorithm-Architecture Co-design – Ongoing work to automate compatibility with several backend flows •  Xilinx Vivado, Intel Quartus HLS, Mentor Graphics Catapult •  Integration of model development (training, model compression) and hardware synthesis •  Automation of insertion for optimization directives and pragmas 25
  • 26. hls4ml •  Democratize design tools •  Allow domain scientists to build edge computing systems with minimal specialized hardware design training •  Open source •  HLS raises level of abstraction –  Faster simulation, faster evaluation of design alternatives –  Allow domain expert trade-off parallelism, resource re-use, latency constraint, power budget 26
  • 27. Reconfigurable autoencoder for detector front-end ASICs •  Edge computing for particle collider experiments •  Data collected from a large number of photon detectors are compressed to representative information of the “shape” –  Charge measurements from the detectors are compressed to a radiation pattern –  6 million detector channels sending data at 40MHz –  The data from the original space is compressed to lower dimensionality by the edge ASIC, transmitted, then decoded on the receiving end 27
  • 28. Reconfigurable autoencoder for detector front-end ASICs •  System constraints – Low power •  Will be part of a larger system with power budget – Radiation tolerant – Reprogrammable weights through accessible registers to enable updates and customization 28
  • 29. Reconfigurable autoencoder for detector front-end ASICs •  Autoencoder: neural network with single convolutional layer followed by a dense layer 29
  • 30. Reconfigurable autoencoder for detector front-end ASICs 30 Glossary of Deep Learning: Autoencoder, by Jaron Collins
  • 31. Reconfigurable autoencoder for detector front-end ASICs •  Dataflow –  22-bit signals from 48 detector cells 31
  • 32. Reconfigurable autoencoder for detector front-end ASICs •  Network properties –  CNN layer: eight 3x3x3 kernel matrices resulting in 128 outputs –  ReLu activation after CNN and after final dense layer –  6-bits weights –  Dense layer produces 16 10-bit outputs –  Chip can be reconfigured to produce as low as total of 64 bits in output 32
  • 33. Reconfigurable autoencoder for detector front-end ASICs 33
  • 34. Reconfigurable autoencoder for detector front-end ASICs 34
  • 35. Reconfigurable autoencoder for detector front-end ASICs 35 •  Protection against Single Event Upsets
  • 36. Reconfigurable autoencoder for detector front-end ASICs 36 •  Protection against Single Event Upsets
  • 37. Reconfigurable autoencoder for detector front-end ASICs •  Chip specs –  6b weight and bias parameters •  Total: 2,286 = 13,724 bits –  Parameters loaded via I2C interface –  Decoder component implemented off-detector on FPGA –  Chip latency – 25ns –  7nJ per inference –  280mW –  Can withstand 200MRad ionizing radiation –  2.5mm2 37
  • 38. Projects – Preliminary •  CryoAI - 22nm testchip •  Adaptive ML accelerators 38
  • 39. CryoAI - 22nm testchip •  A system-on-chip (SoC) •  Goal: test performance and leakage power trends of the technology at cryogenic temperatures •  An opportunity to showcase the hls4ml flow –  Contribute a neural network accelerator module as part of the SoC –  Perform anomaly detection •  Design considerations –  Programmability -- Cannot reload all of the parameters –  Optimal sequencing of quantization & pruning –  Assignment of sub-regions of the network to voltage domains 39
  • 40. Adaptive ML accelerators •  Why adaptation? – Energy constraints – Input variability – Translation to new platform/device – Transfer Learning 40
  • 41. Adaptive ML accelerators •  Some directions –  Emulate loss through drop out/connect •  Techniques in ML literature equate these phenomena to forcing weights to zero •  Loss in (because of) hardware may look very different –  Continuous diagnostics •  Evaluation of network certainty •  Concept of surprise –  Detection of temporal dominance of classes •  Reconfigure optimized version for dominant class 41
  • 42. Adaptive ML accelerators •  Some directions –  Critical path based hardware-aware resource management •  Class-based CP: using contributions of a neuron to a specific class –  Mean Absolute Activation, first order Taylor Approximations •  Generalized CP: relative participation of output channels at the routing of the output from a layer –  Distribute resources (e.g., total number of bits allocated for weights) according to a criticality metric 42
  • 43. Adaptive ML accelerators •  Some directions –  Characterize circuits (e.g., SRAM) to create models •  Associate voltage drop/power outage with cell decay •  Create characteristic bit masks for weights – Neural network architecture search •  Lessons learned from design space exploration in high-level synthesis •  Search for a new cell from basic building blocks –  Input: a set of convolutions and pooling of varying size –  Think of it as your module library 43
  • 44. Future Directions •  Fluid implementations –  Quickly re-targetable from software to hardware •  Scientific domains offer a vast space of computational challenges –  They need interpretable systems –  Laws of nature apparent in the system’s output •  Multiple paths need to converge –  Photonic circuits – not much automated –  FPGA –  Neuromorphic – not much automated –  Quantum - ?? 44
  • 45. Collaborators •  Northwestern University –  Han Liu (CS), Kristian Hahn (Physics) –  Manuel Blanco Valentin, Rui Shi, Bincong Ye, Yingyi Luo (now at Google), Sid Joshi (now at Intel) •  Fermi National Laboratories –  Nhan Tran, Farah Fahim, Christian Herwig, Kiyomi Seiya, et al. •  Columbia University –  Giuseppe di Guglielmo •  Lehigh University –  Josh Agar 45