1
DIPARTIMENTO DI ELETTRONICA,
INFORMAZIONE E BIOINGEGNERIA
AMPS
Automatic Mapping, Partitioning and Scheduling
for hardware acceleration on FPGAs
Mirko Salaris: mirko.salaris@mail.polimi.it
Marco Rabozzi: marco.rabozzi@polimi.it
May 17-31, 2019
NGCX at San Francisco
2
Steps for software acceleration on FPGA
• Software profiling
• Identification of candidate hardware functions
• Design Space Exploration of hardware function implementations
• Choice of the function to implement in hardware
3
Steps for software acceleration on FPGA
✓ Software profiling
✓ Identification of candidate hardware functions
✓ Design Space Exploration of hardware function implementations
✓ Choice of the function to implement in hardware
Already supported by CAOS [1]
[1] CAOS: CAD as an Adaptive OpenPlatform Service, http://caos.necst.it
4
Steps for software acceleration on FPGA
✓ Software profiling
✓ Identification of candidate hardware functions
✓ Design Space Exploration of hardware function implementations
✓ Choice of the function to implement in hardware
What about the acceleration of multiple functions?
6
Steps for software acceleration on FPGA
✓ Software profiling
✓ Identification of candidate hardware functions
✓ Design Space Exploration of hardware function implementations
• Selection of hardware function implementations
• Partitioning of hardware functions into one or more bitstreams
• Scheduling of the FPGA reconfigurations
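The three open steps can be read as three coupled decisions. The sketch below is only one possible way to represent them; the `Solution` class, its field names and the implementation names are assumptions made for illustration and do not come from AMPS or CAOS. It also assumes that every selected function is placed in at least one bitstream.

```python
from dataclasses import dataclass

@dataclass
class Solution:
    selection: dict   # function name -> name of the chosen implementation
    bitstreams: list  # each entry is the set of functions placed in one bitstream
    schedule: list    # indices into `bitstreams`, in the order they are loaded

def is_consistent(sol: Solution) -> bool:
    """Check that the three decisions agree with each other."""
    placed = set().union(*sol.bitstreams) if sol.bitstreams else set()
    # Assumption: each selected function appears in at least one bitstream.
    every_function_placed = placed == set(sol.selection)
    # Every scheduled reconfiguration must refer to an existing bitstream.
    schedule_in_range = all(0 <= i < len(sol.bitstreams) for i in sol.schedule)
    return every_function_placed and schedule_in_range

example = Solution(
    selection={"funA": "A.impl_1", "funB": "B.impl_2"},  # hypothetical names
    bitstreams=[{"funA"}, {"funB"}],
    schedule=[0, 1, 0],
)
print(is_consistent(example))
```

Keeping the three decisions in one object makes it explicit that changing one of them (for example, a larger implementation) can invalidate the others.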
7
AMPS – High Level Overview
8
AMPS – Profiling
Function   Self Time %   Total Time %
funA       98.71%        32.26%
funB       92.65%        12.83%
funC       89.26%        27.98%
funD       94.37%         9.41%
funE        2.73%        68.52%
…          …             …
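One way to use such a table is as a filter for acceleration candidates. The sketch below assumes that "Self Time %" is the share of a function's own inclusive time spent in its body and that "Total Time %" is its share of the whole run; the thresholds and the `select_candidates` helper are hypothetical and not part of AMPS.

```python
# Profiling rows from the slide: (function, self time %, total time %).
PROFILE = [
    ("funA", 98.71, 32.26),
    ("funB", 92.65, 12.83),
    ("funC", 89.26, 27.98),
    ("funD", 94.37, 9.41),
    ("funE", 2.73, 68.52),
]

def select_candidates(profile, min_self=50.0, min_total=5.0):
    """Keep functions whose self and total time both reach the assumed thresholds."""
    return [
        name
        for name, self_pct, total_pct in profile
        if self_pct >= min_self and total_pct >= min_total
    ]

candidates = select_candidates(PROFILE)
print(candidates)
```

With these assumed thresholds funE drops out: it accounts for a large share of the run, but almost all of that time is spent in the functions it calls.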
12
AMPS – Call Trace Analysis
The list of function calls, in order
funE, funB, funD, funA, funA, funA, funA, funA, funF, funG, funG, funG,
funE, funB, funD, funB, funD, funF, funG, funH, funC, funE, funA, funA, funA,
funF, funG, funG, funH, funC, funF, funG, funG, funE, funA, funA, funA,
funB, funD, funB, funD, funB, funF, funG, funF, funG, funH, funC, funE,
funA, funA, funA
NOT synthesizable in hardware: funE, funF, funG, funH
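Dropping the non-synthesizable functions from the trace leaves only the calls that matter for hardware scheduling. A minimal sketch of that filtering step, using the trace from the slide (the variable names are illustrative):

```python
# Full call trace from the slide, in call order.
TRACE = (
    "funE funB funD funA funA funA funA funA funF funG funG funG "
    "funE funB funD funB funD funF funG funH funC funE funA funA funA "
    "funF funG funG funH funC funF funG funG funE funA funA funA "
    "funB funD funB funD funB funF funG funF funG funH funC funE "
    "funA funA funA"
).split()

# Functions the slide marks as not synthesizable in hardware.
NOT_SYNTHESIZABLE = {"funE", "funF", "funG", "funH"}

# Keep only calls to functions that can be implemented in hardware.
hw_trace = [f for f in TRACE if f not in NOT_SYNTHESIZABLE]
print(" ".join(hw_trace))
```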
16
AMPS – Call Trace Analysis
The list of function calls, in order
funB, funD, funA, funA, funA, funA, funA, funB, funD, funB, funD, funC,
funA, funA, funA, funC, funA, funA, funA, funB, funD, funB, funD, funB,
funC, funA, funA, funA
funA is always called in blocks of multiple calls
funB and funD are always called in quick succession, in alternating order
Other patterns?
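Both observations can be checked mechanically on the filtered trace. The sketch below uses run-length encoding; it is an illustration of the idea, not the analysis AMPS actually performs, and the variable names are assumptions.

```python
from itertools import groupby

# Filtered call trace from the slide (synthesizable functions only).
HW_TRACE = (
    "funB funD funA funA funA funA funA funB funD funB funD funC "
    "funA funA funA funC funA funA funA funB funD funB funD funB "
    "funC funA funA funA"
).split()

# Run-length encoding: consecutive calls to the same function form one block.
runs = [(name, len(list(group))) for name, group in groupby(HW_TRACE)]

# Pattern 1: sizes of the funA blocks.
funA_blocks = [n for name, n in runs if name == "funA"]

# Pattern 2: funB and funD never repeat back to back ...
bd_never_repeat = all(n == 1 for name, n in runs if name in {"funB", "funD"})
# ... and every funD call directly follows a funB call.
d_follows_b = all(
    i > 0 and HW_TRACE[i - 1] == "funB"
    for i, name in enumerate(HW_TRACE)
    if name == "funD"
)

print(funA_blocks, bd_never_repeat, d_follows_b)
```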
17
AMPS – DSE
Function    Implementation  Performance               Resources
function_1  F1.impl_1       Execution time: 14.68 s   BRAM_18K: 1523 (35%)
                            Clock frequency: 200 MHz  FF: 1211 (0.05%)
                                                      LUT: 2211 (0.19%)
                                                      […]
            F1.impl_2       Execution time: 12.47 s   BRAM_18K: 3 (0.07%)
                            Clock frequency: 220 MHz  FF: 1274 (0.05%)
                                                      LUT: 1937 (0.16%)
                                                      […]
function_2  F2.impl_1       […]                       […]
Marco Siracusa, Marco Rabozzi, Lorenzo di Tucci, Marco Domenico Santambrogio, "Automated Design Space Exploration and Roofline Analysis for FPGA-based HLS Applications"
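The DSE output gives, for each function, a set of implementations that trade execution time against resources. As a minimal sketch, the code below picks the fastest implementation that fits a resource budget. The `Implementation` class, the helper functions and the budget values are assumptions for illustration; only the resources listed on the slide are modeled, and the elided entries are left out.

```python
from dataclasses import dataclass

@dataclass
class Implementation:
    name: str
    exec_time_s: float
    clock_mhz: int
    resources: dict  # resource name -> units used

# The two implementations of function_1 shown on the slide.
F1_IMPLS = [
    Implementation("F1.impl_1", 14.68, 200, {"BRAM_18K": 1523, "FF": 1211, "LUT": 2211}),
    Implementation("F1.impl_2", 12.47, 220, {"BRAM_18K": 3, "FF": 1274, "LUT": 1937}),
]

def fits(impl, budget):
    """True if every resource used by `impl` stays within the budget."""
    return all(used <= budget.get(res, 0) for res, used in impl.resources.items())

def fastest_fitting(impls, budget):
    """Return the fastest implementation that fits the budget, or None."""
    feasible = [impl for impl in impls if fits(impl, budget)]
    return min(feasible, key=lambda impl: impl.exec_time_s) if feasible else None

# Hypothetical budget left on the device for this function.
budget = {"BRAM_18K": 1000, "FF": 100_000, "LUT": 50_000}
print(fastest_fitting(F1_IMPLS, budget).name)
```

Choosing per function in isolation, as done here, ignores what the other functions in the same bitstream need, which is exactly the coupling the later slides question.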
19
Partitioning, Mapping and Scheduling
21
Partitioning, Mapping and Scheduling
Is this the best?
27
Partitioning, Mapping and Scheduling
Function   Self Time %   Total Time %
funA       98.71%        32.26%
funC       89.26%        27.98%
funA is always called in blocks of multiple calls
funC is called only a few times
funB and funD are always called in quick succession, in alternating order
Is this the best?
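One way to compare candidate partitionings is to replay the call trace and count how often the FPGA would have to be reconfigured. The sketch below is a deliberately simple cost model under stated assumptions: one bitstream is loaded at a time, each function sits in exactly one bitstream, every bitstream fits on the device, and execution times are ignored. The three partitionings are hypothetical examples, not the ones shown in the original figures.

```python
# Filtered call trace from the slides (synthesizable functions only).
HW_TRACE = (
    "funB funD funA funA funA funA funA funB funD funB funD funC "
    "funA funA funA funC funA funA funA funB funD funB funD funB "
    "funC funA funA funA"
).split()

def count_reconfigurations(trace, partitioning):
    """Count bitstream loads (including the first) needed to serve the trace."""
    # Map each function to the index of the bitstream that contains it.
    owner = {f: i for i, bitstream in enumerate(partitioning) for f in bitstream}
    loaded, count = None, 0
    for f in trace:
        if owner[f] != loaded:  # the required bitstream is not on the FPGA
            loaded = owner[f]
            count += 1
    return count

# Hypothetical partitionings of the four hardware functions.
one_per_function = [{"funA"}, {"funB"}, {"funC"}, {"funD"}]
pair_b_d = [{"funA"}, {"funB", "funD"}, {"funC"}]
two_pairs = [{"funA", "funC"}, {"funB", "funD"}]

for name, p in [("one per function", one_per_function),
                ("funB+funD together", pair_b_d),
                ("two pairs", two_pairs)]:
    print(name, count_reconfigurations(HW_TRACE, p))
```

Under this model, grouping functions that alternate in the trace removes many reconfigurations, but larger bitstreams may force smaller and slower implementations, so the count alone does not settle which solution is best.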
28
Conclusions
• The concurrent acceleration of multiple functions requires multiple steps
• There is no easy way to decouple these steps while still guaranteeing optimality
Future work:
• Validate the proposed flow on a set of real applications
• Integrate this flow into CAOS
29
Thank You
