"Distribution Statement "A" Approved for Public Release, Distribution Unlimited"
Real Time Machine Learning (RTML)
Andreas Olofsson
Program Manager
DARPA/MTO
Proposers Day
April 2, 2019
Background
Objective:
Exploit the physics of emerging devices, analog CMOS, and
non-Boolean computational models to achieve new levels of
performance and power for real-time sensor imaging systems.
Approach:
TA1: Image Application for Benchmarking: Recreate a
traditional image processing pipeline (IPP) using UPSIDE
Compute models showing no degradation in performance.
TA2: MS CMOS Demonstration: Mixed signal CMOS
implementation of the computational model and system test
bed showing 1x10^5x combined speed-power improvement for
analog CMOS.
TA3: Emerging Device Implementation: Image processing
demonstration combining next-generation devices with new
computation model. 1x10^7x (projected)
Goal: Demonstrate the capability and pathway toward embedded computing efficiency in ISR
applications w/ >1,000x processing speed and >10,000x improvement in power consumption
[Figure: UPSIDE processing flow. Labels: image pixels; example features (edges, 3x3 pixels); low-precision probabilistic computing algorithms; detected salient pixels; extracted library; mapped into emerging devices and analog CMOS (analog vector-matrix multiply; analog floating-gate pattern match; oscillators; memristors; graphene).]
Benchmarked using object classification and tracking applications
DARPA UPSIDE program (2012-2018)
Unconventional Processing of Signals for Intelligent Data Exploitation
Selected UPSIDE results
[Figure, panels (a) and (b): floating-gate array with inputs I1-I10 and outputs O1-O10; functional array and dummy arrays (dimension labels 16 μm and 18 μm); 2-layer MLP neural network with incoming image, input neurons, and output neurons.]
University of Michigan
• Mixed signal processing (50TOPS/W)
• Sparse image reconstruction in memristors
• Numerous publications (Nature, …)
UCSB
• First memristor based multilayer perceptron
• Flash based 55nm analog computing (>10TOPS/W)
• Numerous publications (Nature, …)
Key Takeaways
Analog computing beats digital on VMMs
Challenges:
• Comparing results (lack of data) → RTML
• Transition valley of death → RTML
• High cost of design → RTML
• Manufacturing latency too long → RTML
• Manufacturability and scalability → RTML
Building a proper baseline for path-finding AI HW research
ISSCC2019
• Extreme expense of HW development
means extremely sparse data
• How can we know if a new result is
good without a baseline?
• A compiler would let us “paint the
space” of possibilities
• Objective: Better science
Generating “right-sized” HW for SWaP constrained systems
A. Canziani, et al, “Analysis of deep neural networks”
• 10-100X network tradeoffs
• Additional micro-tradeoffs (bit-width, pruning, etc.)
• Having more accuracy than needed
wastes energy, latency, and power
• A compiler would enable generation
of right-sized HW
• Objective: Enable new applications
Optimizing hardware for ultra-low latency
• Current HW optimized for throughput
and programmability
• Extreme expense of HW development
means latency of ASICs is unexplored.
• Green-field: How low can we go?
• Objective: Enable new applications
Source: NVIDIA
Building bridges
Application Experts ML Experts Platform Experts
RTML
(New!)
Objective: Faster innovation
Source: NVIDIA, Getty Images, Wikipedia
TensorFlow PyTorch
Example of a low-latency application
Source: Qualcomm, 2017
DARPA RTML Program
DARPA RTML program details
The DARPA RTML program seeks to create no-human-in-the-loop
hardware generators and compilers to enable fully automated
creation of ML Application-Specific Integrated Circuits
(ASICs) from high level source code.
Phase 1: machine learning hardware compiler
• Develop a hardware generator that converts programs expressed in common ML frameworks (such as
TensorFlow and PyTorch) into standard Verilog code and hardware configurations
• Generate synthesizable Verilog that can be fed into layout generation tools, such as from DARPA IDEA
• Demonstrate a compiler that auto-generates a large catalog of scalable ML hardware instances
• Demonstrate generation of instances for a diversity of architectures
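As a rough illustration of the "ML description in, Verilog out" idea in the bullets above, the sketch below turns a plain Python description of one fully connected layer into a Verilog module. Everything in it is hypothetical: the `LayerSpec` type, the `emit_dense_layer` function, and the module and port names are assumptions made for illustration, not part of RTML, TensorFlow, PyTorch, or DARPA IDEA. A real generator would start from a framework graph and would also have to handle scheduling, memory, and quantization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerSpec:
    """Hypothetical description of one fully connected layer."""
    name: str
    inputs: int      # number of input activations
    outputs: int     # number of output neurons
    bit_width: int   # fixed-point width of each activation and weight


def emit_dense_layer(spec: LayerSpec) -> str:
    """Return Verilog text for a combinational dense layer (no bias, no activation)."""
    n, m, width = spec.inputs, spec.outputs, spec.bit_width
    # A product of two `width`-bit values needs 2*width bits; summing n of
    # them needs ceil(log2(n)) more bits to avoid overflow.
    acc = 2 * width + (n - 1).bit_length()
    lines = [
        f"module {spec.name} (",
        f"    input  wire [{n * width - 1}:0] x,",
        f"    input  wire [{n * m * width - 1}:0] weights,",
        f"    output wire [{m * acc - 1}:0] y",
        ");",
    ]
    for o in range(m):
        terms = []
        for i in range(n):
            xi = f"$signed(x[{i * width} +: {width}])"
            wi = f"$signed(weights[{(o * n + i) * width} +: {width}])"
            terms.append(f"{xi} * {wi}")
        # One multiply-accumulate expression per output neuron.
        lines.append(f"    assign y[{o * acc} +: {acc}] = " + " + ".join(terms) + ";")
    lines.append("endmodule")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(emit_dense_layer(LayerSpec("dense0", inputs=4, outputs=2, bit_width=8)))
```

Because the layer sizes and bit width are ordinary parameters, the same function can emit a whole family of instances, which is the sense in which a generator could produce "a large catalog" of hardware.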
"Distribution Statement "A" Approved for Public Release, Distribution Unlimited" 13
The RTML generator should support a diversity of ML architectures. Architectures of interest include:
a) conventional feed forward (convolutional) neural networks,
b) recurrent networks and their specialized versions,
c) neuroscience-inspired architectures, such as spike time-dependent neural nets including their stochastic
counterparts,
d) non-neural ML architectures inspired by psychophysics as well as statistical techniques,
e) classical supervised learning (e.g., regression and decision trees),
f) unsupervised learning (e.g., clustering) approaches,
g) semi-supervised learning methods,
h) generative adversarial learning techniques, and
i) other approaches such as transfer learning, reinforcement learning, manifold learning, and/or life-long
learning
RTML general purpose generator
Phase 1 RTML generator metrics
Metrics
• Type: Training and Inference
• Peak Performance: Scalable, configurable at generation, with support up to full reticle size at 14nm
• Inference Energy Efficiency (1): >10 TOPS/W
• Min Number of Architectures (2): 10
• Hardware Generation Automation: 100% (ML to Verilog)
• I/O Interface: Highly efficient chip-to-chip interface (such as from the DARPA CHIPS program)
• Design Input (source code): High level network description. Support for TensorFlow, PyTorch, Caffe2, CNTK, MXNet, ONNX
• Generator (Compiler Front-end) Output: Verilog
• Deliverable: Software, license (3), generator source code, flow scripts, documentation, GDSII for generated designs
Notes:
(1) Program is interested in real work accomplished per Watt, not arbitrary peak mathematical ops/W. As a general guidance we are specifying a 10 TOPS/W at 14nm as a minimum threshold with the understanding that efficiency numbers are tightly coupled to accuracy, data sets, and actual applications. Efficiency metric includes all SoC power including IO power needed to sustain peak throughput. Based upon normalized MAC for the proposed application.
(2) To demonstrate a general purpose ML compiler, teams are expected to complete GDSII implementation of multiple ML architectures
(3) Delivered with a minimum of government purpose rights; open source licenses are preferred.
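The energy efficiency metric can be made concrete with a small calculation. The sketch below computes efficiency as sustained operations per second divided by total SoC power, I/O power included, as the first footnote requires. Counting one multiply-accumulate (MAC) as two operations is a common convention but an assumption here (the slide only says "normalized MAC"), and the function names and the workload numbers in the usage lines are hypothetical.

```python
PHASE1_MIN_TOPS_PER_W = 10.0  # Phase 1 threshold from the metrics slide


def tops_per_watt(macs_per_second: float, core_power_w: float,
                  io_power_w: float, ops_per_mac: int = 2) -> float:
    """Sustained tera-operations per second per watt of total SoC power."""
    total_power_w = core_power_w + io_power_w  # I/O power counts toward the total
    if total_power_w <= 0:
        raise ValueError("total power must be positive")
    ops_per_second = macs_per_second * ops_per_mac
    return ops_per_second / total_power_w / 1e12


def meets_phase1_threshold(efficiency_tops_per_w: float) -> bool:
    """The slide states the target as strictly greater than 10 TOPS/W."""
    return efficiency_tops_per_w > PHASE1_MIN_TOPS_PER_W


if __name__ == "__main__":
    # Hypothetical design: 8e12 sustained MAC/s, 1.2 W core, 0.3 W I/O.
    eff = tops_per_watt(8e12, core_power_w=1.2, io_power_w=0.3)
    print(f"{eff:.2f} TOPS/W, meets threshold: {meets_phase1_threshold(eff)}")
```

Leaving the I/O term out of the denominator would raise the same hypothetical design from about 10.7 to about 13.3 TOPS/W, which is why the footnote insists on total SoC power.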
"Distribution Statement "A" Approved for Public Release, Distribution Unlimited" 15
An introduction to the IDEA silicon compiler (RTL/schematic to GDSII)
[Figure: IDEA unified layout generator for chip, package, and board, driven by data, models, and training. 24 hours, no human in the loop.]
• 2018: Program Kickoff (Jun)
• 2019: First Integration Exercise (Jan)
• 2019: Alpha code drop (Jun)
• 2020: A usable Silicon Compiler (50% PPA)
• 2022: A great Silicon Compiler (100% PPA)
An introduction to the CHIPS interface
• AIB (Advanced Interface Bus) is a PHY-level interface standard
for high bandwidth, low power die-to-die communication
• AIB is a clock-forwarded parallel data transfer like DDR DRAM
• High density with 2.5D interposer (e.g., CoWoS, EMIB) for
multi-chip packaging
• AIB is PHY level (OSI Layer 1)
• Can build protocols like AXI-4 on top of AIB
• AIB Performance:
• 1Tbps/mm shoreline
• ~0.1pJ/bit
• <5ns latency
• Open Source!
• Standard and reference implementation
• https://github.com/intel/aib-phy-hardware
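The AIB performance figures above translate directly into a die-edge and power budget for a given link. The back-of-the-envelope sketch below uses only the slide's numbers (1 Tbps per mm of shoreline, roughly 0.1 pJ/bit); the function name is hypothetical, and the 400 Gbps example is chosen because it matches the maximum data throughput in the Phase 2 guidelines. Protocol overhead and design margin are ignored.

```python
from typing import Tuple

AIB_BANDWIDTH_BPS_PER_MM = 1e12   # 1 Tbps per mm of shoreline (from the slide)
AIB_ENERGY_J_PER_BIT = 0.1e-12    # ~0.1 pJ/bit (from the slide)


def aib_budget(bandwidth_bps: float) -> Tuple[float, float]:
    """Return (shoreline in mm, interface power in W) for a target bandwidth."""
    shoreline_mm = bandwidth_bps / AIB_BANDWIDTH_BPS_PER_MM
    power_w = bandwidth_bps * AIB_ENERGY_J_PER_BIT
    return shoreline_mm, power_w


if __name__ == "__main__":
    shoreline, power = aib_budget(400e9)  # 400 Gbps link
    print(f"{shoreline:.2f} mm of shoreline, {power * 1e3:.0f} mW")
```

Under these assumptions a 400 Gbps link needs about 0.4 mm of die edge and about 40 mW of interface power.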
AIB Adopters:
-Boeing
-Intrinsix
-Synopsys
-Intel
-Lockheed Martin
-Sandia
-Jariet
-NCSU
-U. of Michigan
-Ayar Labs
[Figure: CHIPS platform. A 14nm Stratix 10 FPGA die with multiple AIB interfaces, an Ethernet tile (56G PAM/28G NRZ), and "Your Chip Here" slots (options Opt1, Opt2, Opt4, Opt5). "Your Chiplet" connects over AIB to "Our Chiplet"; example chiplet types: ADC/DAC, machine learning, memory, processors, adjacent IP, etc.]
• Design space exploration through circuit implementation of multiple ML architectures
• General purpose, tunable generator that can support optimization of ML hardware for specific
requirements
• Hardware demonstration of RTML for a particular application area
• Application areas:
• Future high bandwidth wireless communication systems, like the 60 GHz range of the 5G standard
• High bandwidth image processing in SWaP constrained systems
• DARPA will provide fabrication support through a number of separately funded multi-project or
dedicated wafer runs
Phase 2: real time machine learning systems
Phase 2 RTML metrics
Phase 2 Hardware Guidelines (Min and Max values, see note 1)
• Data Throughput: 400 Kbps (min), 400 Gbps (max)
• Latency: 100 µs (min), 100 s (max)
• Total Power (2): 200 µW (min), 200 W (max)
• Application-Specific Accuracy (3): 0.6 (min), 0.99 (max)
• Dataset: Proposer defined (4)
• I/O Interface: Highly efficient chip to chip interface (such as CHIPS)
• Design Input (source code): High level network description. Support for TensorFlow, PyTorch, Caffe2, CNTK, MXNet, and ONNX
• Design Output: GDSII ready for manufacturing
• Hardware Generation Automation: 100%
• Deliverables: Software, license (5), design source code, flow scripts, documentation, GDSII, chiplet hardware
Notes:
(1) Teams are expected to explore a wide trade space of power, latency, accuracy, and data throughput and show the ability to tune hardware over a large range of performance metrics. Max values are not expected to be achieved simultaneously.
(2) Power must include everything needed to operate, including power delivery, thermal management, external memory, and sensor interfaces.
(3) For example, ResNet152 has an accuracy of > 0.96 on the ImageNet database:
http://image-net.org/challenges/LSVRC/2015/results
(4) Proposals are expected to outline a clear plan for validating the quality of the compiler output, including details of the publicly available benchmarks and datasets from industry, government, and academia that will be used
(5) Delivered with a minimum of government purpose rights; open source licenses are preferred
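One way to work with these guidelines is to record each generated design as a point in the trade space and check it against the Min and Max columns. The sketch below does that and derives energy per bit from power and throughput. Reading the two columns as the bounds of the range of interest is an interpretation rather than something the slide states, and the `DesignPoint` type, the function names, and the example values are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# (min, max) guideline values taken from the Phase 2 metrics table.
PHASE2_GUIDELINES = {
    "throughput_bps": (400e3, 400e9),
    "latency_s": (100e-6, 100.0),
    "power_w": (200e-6, 200.0),
    "accuracy": (0.6, 0.99),
}


@dataclass(frozen=True)
class DesignPoint:
    """Hypothetical record of one generated hardware configuration."""
    throughput_bps: float
    latency_s: float
    power_w: float
    accuracy: float


def out_of_range(point: DesignPoint) -> List[str]:
    """Names of the metrics that fall outside the guideline range."""
    return [name for name, (lo, hi) in PHASE2_GUIDELINES.items()
            if not lo <= getattr(point, name) <= hi]


def energy_per_bit_j(point: DesignPoint) -> float:
    """Total power divided by data throughput, in joules per bit."""
    return point.power_w / point.throughput_bps


if __name__ == "__main__":
    p = DesignPoint(throughput_bps=1e9, latency_s=1e-3, power_w=0.5, accuracy=0.9)
    print(out_of_range(p), energy_per_bit_j(p))
```

Sweeping generator parameters and collecting such points is one simple way to show the tunability over "a large range of performance metrics" that the first note asks for.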
"Distribution Statement "A" Approved for Public Release, Distribution Unlimited" 19
RTML schedule
• 0 months (Fall 2019): Kickoff workshop
• 9 months (Mid 2020): Alpha release of RTML generator at joint NSF/DARPA workshop
• 18 months (Spring 2021): Release of V1.0 RTML generator and demonstration with an RTML compiler flow
• 27 months (End 2021): Release of V1.5 tunable hardware generator
• 36 months (Fall 2022): Hardware demonstration of a real time machine for a specific application
RTML seeks answers for the following research questions
• Can we build an application specific silicon compiler for RTML?
• What subset of current ML frameworks syntax/methods can be supported with a compiler?
• What needs to be added to current ML frameworks to support efficient translation?
• What hardware architectures are best suited for real time operation?
• What are the lower latency limits for various RTML tasks?
• What is the lowest SWaP feasible for various RTML tasks?
• What are the tradeoffs between energy efficiency, throughput, latency, area, accuracy?
• Investigatory research that does not result in deliverable hardware designs
• Circuits that cannot be produced in standard CMOS foundries (like 14nm)
• New Domain Specific Languages
• New approaches to physical layout (RTL to GDSII)
• Incremental efforts
RTML does NOT seek proposals for these areas
Joint NSF collaboration
• NSF: Single phase, exploratory research into circuit architectures and algorithms
• DARPA:
• Phase 1: Fully automated hardware generators “compilers” for state of the art machine learning algorithms
and networks, using existing programming frameworks (TensorFlow, etc.) as inputs
• Phase 2: Deliver novel machine learning architectures and circuit generators that enable real time machine
learning for autonomous machines
• Joint solicitation release and workshops at 9 and 18 mos into each phase
• DARPA teams pull in NSF work during Phase 1 to Phase 2 transition
[Timeline: DARPA Phase 1 (18 mos) followed by DARPA Phase 2 (18 mos), alongside NSF Phase 1 (36 mos). Milestones: Alpha Release; V1.0 Release and GDSII Delivery; V1.5 Release & Tapeout; Silicon Demo.]
NSF and DARPA team to explore
rapid development of energy efficient
hardware and real-time machine
learning architectures
Required:
• Collaboration with other program performers
• Active participation in joint DARPA-NSF workshops every 9 months
• Open interfaces
Strongly encouraged:
• Publishing code and results early and often
• Permissive (non-viral, non-proprietary) open source licensing
Collaboration and licensing
Funding of DARPA RTML Phase 2
• RTML includes a base Phase 1 and option Phase 2
• The proposed planning and costing by Phase (and by Task) provides DARPA with convenient times to
evaluate funding options and technical progress
• Progression into Phase 2 is not guaranteed; factors that may affect Phase 2 funding decisions include:
• Availability of funding
• Cost of proposals selected for funding
• Demonstrated performance relative to program goals
• Interaction with government evaluation teams
• Compatibility with potential national security needs
Important dates
• BAA Posting Date: March 15, 2019
• Proposers Day: April 2, 2019
• FAQ Submission Deadline: April 15, 2019 at 1:00 PM
o DARPA will post a consolidated Question and Answer (FAQ) document on a regular basis. To
access the posting go to: http://www.darpa.mil/work-with-us/opportunities.
• Proposal Due Date: May 1, 2019 at 1:00 PM
• Estimated period of performance start: October 2019
• Questions: HR001119S0037@darpa.mil
1. Overall Scientific and Technical Merit
o Demonstrate that the proposed technical approach is innovative, feasible, achievable, and complete
o A clear and feasible plan for release of high quality software is provided
o Task descriptions and associated technical elements provided are complete and in a logical sequence with all proposed research
clearly defined such that a final outcome that achieves the goal can be expected
2. Potential Contribution and Relevance to the DARPA Mission
o Note the updated wording, with an emphasis on contribution to U.S. national security and U.S. technological capabilities
3. Impact on Machine Learning Landscape
o The proposed research will successfully complete a fundamental exploration of the tradeoffs between system efficiency and
performance for a number of ML architectures
o The proposed research significantly advances the state of the art in machine learning hardware
4. Cost Realism
o Ensure proposed costs are realistic for the technical and management approach and accurately reflect the goals and objectives of the
solicitation
o Verify that proposed costs are sufficiently detailed, complete, and consistent with the Statement of Work
Evaluation criteria, in order of importance
Agenda
RTML Proposers Day
DARPA - 675 N Randolph Street, Arlington, VA 22203
Tuesday, April 2, 2019
• 8:00 AM to 9:00 AM (60 min): Registration and Poster Setup
• 9:00 AM to 9:15 AM (15 min): Welcome - Security, Logistics (Ron Baxter)
• 9:15 AM to 9:55 AM (40 min): DARPA RTML Program Overview (Andreas Olofsson)
• 9:55 AM to 10:15 AM (20 min): NSF RTML Collaboration Overview (Sankar Basu)
• 10:15 AM to 10:45 AM (30 min): Contracting Overview (Michael Blackstone)
• 10:45 AM to 11:00 AM (15 min): Break
• 11:00 AM to 11:45 AM (45 min): Question and Answer Session (Andreas Olofsson)
• 11:45 AM to 1:00 PM (75 min): Lunch (On Your Own)
• 1:00 PM to 3:00 PM (120 min): Poster and Networking Session (All)
• 3:00 PM: Conclude
www.darpa.mil
TEGV 2020 Bireysel bagiscilarimizmustafa sarac
 
How to make and manage a bee hotel?
How to make and manage a bee hotel?How to make and manage a bee hotel?
How to make and manage a bee hotel?mustafa sarac
 
Cahit arf makineler dusunebilir mi
Cahit arf makineler dusunebilir miCahit arf makineler dusunebilir mi
Cahit arf makineler dusunebilir mimustafa sarac
 
How did Software Got So Reliable Without Proof?
How did Software Got So Reliable Without Proof?How did Software Got So Reliable Without Proof?
How did Software Got So Reliable Without Proof?mustafa sarac
 
Staff Report on Algorithmic Trading in US Capital Markets
Staff Report on Algorithmic Trading in US Capital MarketsStaff Report on Algorithmic Trading in US Capital Markets
Staff Report on Algorithmic Trading in US Capital Marketsmustafa sarac
 
Yetiskinler icin okuma yazma egitimi
Yetiskinler icin okuma yazma egitimiYetiskinler icin okuma yazma egitimi
Yetiskinler icin okuma yazma egitimimustafa sarac
 
Consumer centric api design v0.4.0
Consumer centric api design v0.4.0Consumer centric api design v0.4.0
Consumer centric api design v0.4.0mustafa sarac
 
State of microservices 2020 by tsh
State of microservices 2020 by tshState of microservices 2020 by tsh
State of microservices 2020 by tshmustafa sarac
 
Uber pitch deck 2008
Uber pitch deck 2008Uber pitch deck 2008
Uber pitch deck 2008mustafa sarac
 
Wireless solar keyboard k760 quickstart guide
Wireless solar keyboard k760 quickstart guideWireless solar keyboard k760 quickstart guide
Wireless solar keyboard k760 quickstart guidemustafa sarac
 
State of Serverless Report 2020
State of Serverless Report 2020State of Serverless Report 2020
State of Serverless Report 2020mustafa sarac
 
Dont just roll the dice
Dont just roll the diceDont just roll the dice
Dont just roll the dicemustafa sarac
 
Handbook of covid 19 prevention and treatment
Handbook of covid 19 prevention and treatmentHandbook of covid 19 prevention and treatment
Handbook of covid 19 prevention and treatmentmustafa sarac
 

More from mustafa sarac (20)

Uluslararasilasma son
Uluslararasilasma sonUluslararasilasma son
Uluslararasilasma son
 
Latka december digital
Latka december digitalLatka december digital
Latka december digital
 
Axial RC SCX10 AE2 ESC user manual
Axial RC SCX10 AE2 ESC user manualAxial RC SCX10 AE2 ESC user manual
Axial RC SCX10 AE2 ESC user manual
 
Array programming with Numpy
Array programming with NumpyArray programming with Numpy
Array programming with Numpy
 
Math for programmers
Math for programmersMath for programmers
Math for programmers
 
The book of Why
The book of WhyThe book of Why
The book of Why
 
BM sgk meslek kodu
BM sgk meslek koduBM sgk meslek kodu
BM sgk meslek kodu
 
TEGV 2020 Bireysel bagiscilarimiz
TEGV 2020 Bireysel bagiscilarimizTEGV 2020 Bireysel bagiscilarimiz
TEGV 2020 Bireysel bagiscilarimiz
 
How to make and manage a bee hotel?
How to make and manage a bee hotel?How to make and manage a bee hotel?
How to make and manage a bee hotel?
 
Cahit arf makineler dusunebilir mi
Cahit arf makineler dusunebilir miCahit arf makineler dusunebilir mi
Cahit arf makineler dusunebilir mi
 
How did Software Got So Reliable Without Proof?
How did Software Got So Reliable Without Proof?How did Software Got So Reliable Without Proof?
How did Software Got So Reliable Without Proof?
 
Staff Report on Algorithmic Trading in US Capital Markets
Staff Report on Algorithmic Trading in US Capital MarketsStaff Report on Algorithmic Trading in US Capital Markets
Staff Report on Algorithmic Trading in US Capital Markets
 
Yetiskinler icin okuma yazma egitimi
Yetiskinler icin okuma yazma egitimiYetiskinler icin okuma yazma egitimi
Yetiskinler icin okuma yazma egitimi
 
Consumer centric api design v0.4.0
Consumer centric api design v0.4.0Consumer centric api design v0.4.0
Consumer centric api design v0.4.0
 
State of microservices 2020 by tsh
State of microservices 2020 by tshState of microservices 2020 by tsh
State of microservices 2020 by tsh
 
Uber pitch deck 2008
Uber pitch deck 2008Uber pitch deck 2008
Uber pitch deck 2008
 
Wireless solar keyboard k760 quickstart guide
Wireless solar keyboard k760 quickstart guideWireless solar keyboard k760 quickstart guide
Wireless solar keyboard k760 quickstart guide
 
State of Serverless Report 2020
State of Serverless Report 2020State of Serverless Report 2020
State of Serverless Report 2020
 
Dont just roll the dice
Dont just roll the diceDont just roll the dice
Dont just roll the dice
 
Handbook of covid 19 prevention and treatment
Handbook of covid 19 prevention and treatmentHandbook of covid 19 prevention and treatment
Handbook of covid 19 prevention and treatment
 

Recently uploaded

Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsSumit Kumar yadav
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfnehabiju2046
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptMAESTRELLAMesa2
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptxRajatChauhan518211
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |aasikanpl
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡anilsa9823
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...ssifa0344
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 

Recently uploaded (20)

Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdf
 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.ppt
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 

Real time machine learning proposers day v3

  • 1. "Distribution Statement "A" Approved for Public Release, Distribution Unlimited" Real Time Machine Learning (RTML) Andreas Olofsson Program Manager DARPA/MTO Proposers Day April 2, 2019
  • 2. Background
  • 3. DARPA UPSIDE program (2012-2018): Unconventional Processing of Signals for Intelligent Data Exploitation
      Objective: Exploit the physics of emerging devices, analog CMOS, and non-Boolean computational models to achieve new levels of performance and power for real-time sensor imaging systems.
      Approach:
      • TA1: Image Application for Benchmarking: Recreate a traditional image processing pipeline (IPP) using UPSIDE compute models showing no degradation in performance.
      • TA2: MS CMOS Demonstration: Mixed signal CMOS implementation of the computational model and system test bed showing 1x10^5x combined speed-power improvement for analog CMOS.
      • TA3: Emerging Device Implementation: Image processing demonstration combining next-generation devices with new computation model. 1x10^7x (projected)
      Goal: Demonstrate the capability and pathway toward embedded computing efficiency in ISR applications w/ >1,000x processing speed and >10,000x improvement in power consumption
      • Low-precision probabilistic computing algorithms
      • Mapped into emerging devices and analog CMOS
      • Benchmarked using object classification and tracking applications
      [Figure labels: image pixels (ex. edges, 3x3 pixels), detected salient pixels, extracted library; emerging devices: oscillators, memristors, graphene; analog CMOS: analog vector matrix multiply; analog, floating gate pattern match; chip layout: RAC, DPU, TC, 7 nodes, L1-L3, 0.4mm, 0.9mm]
  • 4. Selected UPSIDE results
      University of Michigan:
      • Mixed signal processing (50 TOPS/W)
      • Sparse image reconstruction in memristors
      • Numerous publications (Nature, …)
      UCSB:
      • First memristor based multilayer perceptron
      • Flash based 55nm analog computing (>10 TOPS/W)
      • Numerous publications (Nature, …)
      Key takeaways: Analog computing beats digital on VMMs
      Challenges:
      • Comparing results (lack of data) → RTML
      • Transition valley of death → RTML
      • High cost of design → RTML
      • Manufacturing latency too long → RTML
      • Manufacturability and scalability → RTML
      [Figure labels: (a) 2 layer MLP neural network with input neurons, output neurons, and incoming image; (b) floating-gate array with inputs I1-I10 and outputs O1-O10; functional array surrounded by dummy arrays, dimension labels 16 μm and 18 μm]
  • 5. Building a proper baseline for path-finding AI HW research
      • Extreme expense of HW development means extremely sparse data
      • How can we know if a new result is good without a baseline?
      • A compiler would let us “paint the space” of possibilities
      • Objective: Better science
      [Figure label: ISSCC 2019]
  • 6. Generating “right-sized” HW for SWaP constrained systems
      • 10-100X network tradeoffs
      • Additional micro-tradeoffs (bit-width, pruning, etc.)
      • Having more accuracy than needed wastes energy, latency, and power
      • A compiler would enable generation of right-sized HW
      • Objective: Enable new applications
      [Figure source: A. Canziani, et al, “Analysis of deep neural networks”]
  • 7. Optimizing hardware for ultra-low latency
      • Current HW optimized for throughput and programmability
      • Extreme expense of HW development means latency of ASICs is unexplored.
      • Green-field: How low can we go?
      • Objective: Enable new applications
      [Figure source: NVIDIA]
  • 8. Building bridges
      Objective: Faster innovation
      [Figure labels: Application Experts, ML Experts, Platform Experts, RTML (New!), TensorFlow, PyTorch. Source: NVIDIA, Getty Images, Wikipedia]
  • 9. Example of a low-latency application
      [Figure source: Qualcomm, 2017]
  • 10. DARPA RTML Program
  • 11. DARPA RTML program details
      The DARPA RTML program seeks to create no-human-in-the-loop hardware generators and compilers to enable fully automated creation of ML Application-Specific Integrated Circuits (ASICs) from high level source code.
  • 12. Phase 1: machine learning hardware compiler
      • Develop a hardware generator that converts programs expressed in common ML frameworks (such as TensorFlow, PyTorch) into standard Verilog code and hardware configurations
      • Generate synthesizable Verilog that can be fed into layout generation tools, such as those from DARPA IDEA
      • Demonstrate a compiler that auto-generates a large catalog of scalable ML hardware instances
      • Demonstrate generation of instances for a diversity of architectures
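To give a feel for what "ML to Verilog" means at the smallest possible scale, the sketch below turns a list of integer weights (standing in for one trained neuron) into a combinational Verilog dot-product module. The function name `emit_dot_product`, the port names, and the bit-width rule are assumptions made for this illustration; they are not part of any RTML deliverable or framework API, and a real generator would start from a TensorFlow or PyTorch graph rather than a Python list.

```python
from typing import Sequence


def emit_dot_product(name: str, weights: Sequence[int], in_bits: int = 8) -> str:
    """Return Verilog source for a combinational y = sum(w[i] * x[i]).

    Hypothetical helper for illustration only.
    """
    if not weights:
        raise ValueError("need at least one weight")
    n = len(weights)
    # Bits needed to hold the largest weight as a signed constant.
    w_bits = max(abs(w) for w in weights).bit_length() + 1
    # Product width plus growth from summing n terms (ceil(log2(n)) bits).
    out_bits = in_bits + w_bits + (n - 1).bit_length()
    ports = [f"    input  signed [{in_bits - 1}:0] x{i}," for i in range(n)]
    ports.append(f"    output signed [{out_bits - 1}:0] y")
    # Weights are baked in as constants, so each instance is fixed-function.
    terms = " + ".join(f"(({w}) * x{i})" for i, w in enumerate(weights))
    body = [f"module {name} (", *ports, ");", f"    assign y = {terms};", "endmodule"]
    return "\n".join(body) + "\n"


print(emit_dot_product("neuron0", [3, -2, 1]))
```

Because the weights are emitted as constants, every call produces a different hardware instance, which is the sense in which a generator can produce a "catalog" of designs from one program.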
  • 13. RTML general purpose generator
      The RTML generator should support a diversity of ML architectures. Architectures of interest include:
      a) conventional feed forward (convolutional) neural networks,
      b) recurrent networks and their specialized versions,
      c) neuroscience-inspired architectures, such as spike time-dependent neural nets including their stochastic counterparts,
      d) non-neural ML architectures inspired by psychophysics as well as statistical techniques,
      e) classical supervised learning (e.g., regression and decision trees),
      f) unsupervised learning (e.g., clustering) approaches,
      g) semi-supervised learning methods,
      h) generative adversarial learning techniques, and
      i) other approaches such as transfer learning, reinforcement learning, manifold learning, and/or life-long learning
  • 14. Phase 1 RTML generator metrics
      Metrics:
      • Type: Training and Inference
      • Peak Performance: Scalable, configurable at generation, with support up to full reticle size at 14nm
      • Inference Energy Efficiency [1]: >10 TOPS/W
      • Min Number of Architectures [2]: 10
      • Hardware Generation Automation: 100% (ML to Verilog)
      • I/O Interface: Highly efficient chip-to-chip interface (such as from the DARPA CHIPS program)
      • Design Input (source code): High level network description. Support for TensorFlow, PyTorch, Caffe2, CNTK, MXNet, ONNX
      • Generator (Compiler Front-end) Output: Verilog
      • Deliverable: Software, license [3], generator source code, flow scripts, documentation, GDSII for generated designs
      [1] Program is interested in real work accomplished per Watt, not arbitrary peak mathematical ops/W. As general guidance we are specifying 10 TOPS/W at 14nm as a minimum threshold with the understanding that efficiency numbers are tightly coupled to accuracy, data sets, and actual applications. Efficiency metric includes all SoC power including IO power needed to sustain peak throughput. Based upon normalized MAC for the proposed application.
      [2] To demonstrate a general purpose ML compiler, teams are expected to complete GDSII implementation of multiple ML architectures.
      [3] Delivered with a minimum of government purpose rights; open source licenses are preferred.
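The efficiency footnote on this slide asks for work per Watt measured against all SoC power, including I/O. A minimal sketch of that bookkeeping follows. It assumes the common convention that one MAC counts as two operations (the slide does not state a convention), and the throughput and power figures are hypothetical, chosen only to show how leaving out I/O power inflates the number.

```python
def tops_per_watt(macs_per_s: float, core_power_w: float,
                  io_power_w: float, ops_per_mac: int = 2) -> float:
    """Tera-operations per second per Watt, counting core and I/O power.

    ops_per_mac=2 (one multiply plus one add) is an assumed convention.
    """
    total_power_w = core_power_w + io_power_w
    if total_power_w <= 0:
        raise ValueError("total power must be positive")
    return macs_per_s * ops_per_mac / 1e12 / total_power_w


# Hypothetical design: 50e12 sustained MAC/s, 8 W core, 2 W I/O.
with_io = tops_per_watt(50e12, core_power_w=8.0, io_power_w=2.0)
core_only = tops_per_watt(50e12, core_power_w=8.0, io_power_w=0.0)
print(with_io, core_only)  # 10.0 12.5
```

In this made-up case the core-only figure looks 25% better than the figure the slide actually asks for.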
  • 15. An introduction to the IDEA silicon compiler (RTL/schematic to GDSII)
      [Figure: IDEA unified layout generator covering chip, package, and board; labels: Data, Models, Training; 24 hours, no human in the loop]
      Timeline:
      • 2018: Program kickoff (Jun)
      • 2019: First integration exercise (Jan)
      • 2019: Alpha code drop (Jun)
      • 2020: A usable silicon compiler, 50% PPA
      • 2022: A great silicon compiler, 100% PPA
  • 16. An introduction to the CHIPS interface
      • AIB (Advanced Interface Bus) is a PHY-level interface standard for high bandwidth, low power die-to-die communication
      • AIB is a clock-forwarded parallel data transfer like DDR DRAM
      • High density with 2.5D interposer (e.g., CoWoS, EMIB) for multi-chip packaging
      • AIB is PHY level (OSI Layer 1)
      • Can build protocols like AXI-4 on top of AIB
      • AIB performance:
          o 1 Tbps/mm shoreline
          o ~0.1 pJ/bit
          o <5 ns latency
      • Open source!
          o Standard and reference implementation
          o https://github.com/intel/aib-phy-hardware
      AIB adopters: Boeing, Intrinsix, Synopsys, Intel, Lockheed Martin, Sandia, Jariet, NCSU, U. of Michigan, Ayar Labs
      [Figure: CHIPS platform with a 14nm Stratix 10 FPGA die, AIB links, an Ethernet tile (56G PAM/28G NRZ), and "Your Chip Here" sites (Opt1, Opt2, Opt4, Opt5); "Your Chiplet" and "Our Chiplet" joined over AIB; example chiplet contents: ADC/DAC, machine learning, memory, processors, adjacent IP, etc.]
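The AIB performance figures on this slide can be turned into a quick link budget. The sketch below is back-of-the-envelope arithmetic only: it treats the quoted 1 Tbps/mm and ~0.1 pJ/bit as exact and ignores protocol overhead. The function name `aib_budget` is made up for this example.

```python
AIB_DENSITY_BPS_PER_MM = 1e12   # 1 Tbps per mm of die edge (from the slide)
AIB_ENERGY_J_PER_BIT = 0.1e-12  # ~0.1 pJ/bit (from the slide)


def aib_budget(bandwidth_bps: float) -> tuple:
    """Return (shoreline in mm, link power in W) for a target bandwidth."""
    shoreline_mm = bandwidth_bps / AIB_DENSITY_BPS_PER_MM
    power_w = bandwidth_bps * AIB_ENERGY_J_PER_BIT
    return shoreline_mm, power_w


# 400 Gbps is the maximum data throughput in the Phase 2 guidelines.
mm, watts = aib_budget(400e9)
print(f"{mm:.2f} mm of shoreline, {watts * 1e3:.0f} mW")
# prints: 0.40 mm of shoreline, 40 mW
```

Under these assumptions, even the highest Phase 2 data rate needs well under a millimeter of die edge and tens of milliwatts for the die-to-die link.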
  • 17. Phase 2: real time machine learning systems
      • Design space exploration through circuit implementation of multiple ML architectures
      • General purpose, tunable generator that can support optimization of ML hardware for specific requirements
      • Hardware demonstration of RTML for a particular application area
      • Application areas:
          o Future high bandwidth wireless communication systems, like the 60 GHz range of the 5G standard
          o High bandwidth image processing in SWaP constrained systems
      • DARPA will provide fabrication support through a number of separately funded multi-project or dedicated wafer runs
  • 18. Phase 2 RTML metrics
      Phase 2 Hardware Guidelines          Min [1]     Max [1]
      Data Throughput                      400 Kbps    400 Gbps
      Latency                              100 µs      100 s
      Total Power [2]                      200 µW      200 W
      Application-Specific Accuracy [3]    0.6         0.99
      • Dataset: Proposer defined [4]
      • I/O Interface: Highly efficient chip to chip interface (such as CHIPS)
      • Design Input (source code): High level network description. Support for TensorFlow, PyTorch, Caffe2, CNTK, MXNet, and ONNX
      • Design Output: GDSII ready for manufacturing
      • Hardware Generation Automation: 100%
      • Deliverables: Software, license [5], design source code, flow scripts, documentation, GDSII, chiplet hardware
      [1] Teams are expected to explore a wide trade space of power, latency, accuracy, and data throughput and show the ability to tune hardware over a large range of performance metrics. Max values are not expected to be achieved simultaneously.
      [2] Power must include everything needed to operate, including power delivery, thermal management, external memory, and sensor interfaces.
      [3] For example, ResNet152 has an accuracy of > 0.96 on the ImageNet database: http://image-net.org/challenges/LSVRC/2015/results
      [4] Proposals are expected to outline a clear plan for validating the quality of the compiler output, including details of the publicly available benchmarks and datasets from industry, government, and academia that will be used.
      [5] Delivered with a minimum of government purpose rights; open source licenses are preferred.
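One way to read the Phase 2 guideline table is as the span a tunable generator should be able to cover, rather than a single target. The sketch below records the four numeric spans from the slide and reports which metrics of a candidate design point fall outside them. The dictionary keys, the function name `out_of_span`, and the example design point are all hypothetical.

```python
# Guideline spans transcribed from the slide, converted to SI base units.
PHASE2_GUIDELINES = {
    "throughput_bps": (400e3, 400e9),  # 400 Kbps to 400 Gbps
    "latency_s": (100e-6, 100.0),      # 100 µs to 100 s
    "power_w": (200e-6, 200.0),        # 200 µW to 200 W
    "accuracy": (0.6, 0.99),
}


def out_of_span(design: dict) -> list:
    """Return the names of metrics that are missing or outside the spans."""
    bad = []
    for name, (lo, hi) in PHASE2_GUIDELINES.items():
        value = design.get(name)
        if value is None or not (lo <= value <= hi):
            bad.append(name)
    return bad


# Hypothetical design point, not a real RTML result.
example = {"throughput_bps": 10e9, "latency_s": 250e-6,
           "power_w": 1.5, "accuracy": 0.93}
print(out_of_span(example))  # []
```

Since the slide notes that the maximum values are not expected to be reached simultaneously, a check like this only says whether a point lies inside the explored trade space, not whether it is a good design.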
  • 19. RTML schedule
      • 0 months (Fall 2019): Kickoff workshop
      • 9 months (Mid 2020): Alpha release of RTML generator at joint NSF/DARPA workshop
      • 18 months (Spring 2021): Release of V1.0 RTML generator and demonstration with an RTML compiler flow
      • 27 months (End 2021): Release of V1.5 tunable hardware generator
      • 36 months (Fall 2022): Hardware demonstration of a real time machine for a specific application
  • 20. RTML seeks answers for the following research questions
      • Can we build an application specific silicon compiler for RTML?
      • What subset of current ML frameworks syntax/methods can be supported with a compiler?
      • What needs to be added to current ML frameworks to support efficient translation?
      • What hardware architectures are best suited for real time operation?
      • What are the lower latency limits for various RTML tasks?
      • What is the lowest SWaP feasible for various RTML tasks?
      • What are the tradeoffs between energy efficiency, throughput, latency, area, accuracy?
  • 21. RTML does NOT seek proposals for these areas
      • Investigatory research that does not result in deliverable hardware designs
      • Circuits that cannot be produced in standard CMOS foundries (like 14nm)
      • New Domain Specific Languages
      • New approaches to physical layout (RTL to GDSII)
      • Incremental efforts
  • 22. Joint NSF collaboration
      NSF and DARPA team to explore rapid development of energy efficient hardware and real-time machine learning architectures
      • NSF: Single phase, exploratory research into circuit architectures and algorithms
      • DARPA:
          o Phase 1: Fully automated hardware generators (“compilers”) for state of the art machine learning algorithms and networks, using existing programming frameworks (TensorFlow, etc.) as inputs
          o Phase 2: Deliver novel machine learning architectures and circuit generators that enable real time machine learning for autonomous machines
      • Joint solicitation release and workshops at 9 and 18 mos into each phase
      • DARPA teams pull in NSF work during Phase 1 to Phase 2 transition
      [Timeline figure: DARPA Phase 1 (18 mos), DARPA Phase 2 (18 mos), NSF Phase 1 (36 mos); milestones: Alpha Release, V1.0 Release and GDSII Delivery, V1.5 Release & Tapeout, Silicon Demo]
  • 23. Collaboration and licensing
      Required:
      • Collaboration with other program performers
      • Active participation in joint DARPA-NSF workshops every 9 months
      • Open interfaces
      Strongly encouraged:
      • Publishing code and results early and often
      • Permissive (non-viral, non-proprietary) open source licensing
  • 24. Funding of DARPA RTML Phase 2
      • RTML includes a base Phase 1 and an optional Phase 2
      • The proposed planning and costing by Phase (and by Task) provides DARPA with convenient times to evaluate funding options and technical progress
      • Progression into Phase 2 is not guaranteed; factors that may affect Phase 2 funding decisions:
          o Availability of funding
          o Cost of proposals selected for funding
          o Demonstrated performance relative to program goals
          o Interaction with government evaluation teams
          o Compatibility with potential national security needs
  • 25. Important dates
      • BAA Posting Date: March 15, 2019
      • Proposers Day: April 2, 2019
      • FAQ Submission Deadline: April 15, 2019 at 1:00 PM
          o DARPA will post a consolidated Question and Answer (FAQ) document on a regular basis. To access the posting go to: http://www.darpa.mil/work-with-us/opportunities.
      • Proposal Due Date: May 1, 2019 at 1:00 PM
      • Estimated period of performance start: October 2019
      • Questions: HR001119S0037@darpa.mil
  • 26. Evaluation criteria, in order of importance
      1. Overall Scientific and Technical Merit
          o Demonstrate that the proposed technical approach is innovative, feasible, achievable, and complete
          o A clear and feasible plan for release of high quality software is provided
          o Task descriptions and associated technical elements provided are complete and in a logical sequence, with all proposed research clearly defined such that a final outcome that achieves the goal can be expected
      2. Potential Contribution and Relevance to the DARPA Mission
          o Note the updated wording, with an emphasis on contribution to U.S. national security and U.S. technological capabilities
      3. Impact on Machine Learning Landscape
          o The proposed research will successfully complete a fundamental exploration of the tradeoffs between system efficiency and performance for a number of ML architectures
          o The proposed research significantly advances the state of the art in machine learning hardware
      4. Cost Realism
          o Ensure proposed costs are realistic for the technical and management approach and accurately reflect the goals and objectives of the solicitation
          o Verify that proposed costs are sufficiently detailed, complete, and consistent with the Statement of Work
  • 27. Agenda: RTML Proposers Day
      DARPA, 675 N Randolph Street, Arlington, VA 22203
      Tuesday, April 2, 2019
      Start      End        Time (min)  Session                           Speaker
      8:00 AM    9:00 AM    60          Registration and Poster Setup
      9:00 AM    9:15 AM    15          Welcome - Security, Logistics     Ron Baxter
      9:15 AM    9:55 AM    40          DARPA RTML Program Overview       Andreas Olofsson
      9:55 AM    10:15 AM   20          NSF RTML Collaboration Overview   Sankar Basu
      10:15 AM   10:45 AM   30          Contracting Overview              Michael Blackstone
      10:45 AM   11:00 AM   15          Break
      11:00 AM   11:45 AM   45          Question and Answer Session       Andreas Olofsson
      11:45 AM   1:00 PM    75          Lunch (On Your Own)
      1:00 PM    3:00 PM    120         Poster and Networking Session     All
      3:00 PM    3:00 PM    0           Conclude
  • 28. www.darpa.mil