FPGA Accelerated Computing Using
Amazon EC2 F1 Instances
David Pellerin
Head of WW Business Development, Infotech, AWS
Pieter van Rooyen
CEO and Founder, Edico Genome
Rami Mehio
VP of Engineering, Edico Genome
CMP308
November 30, 2017
AWS re:INVENT
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
WHY USE ACCELERATED COMPUTING?
PARALLELISM INCREASES THROUGHOUT…
CPU: high speed, highly flexible
GPU/FPGA: high throughput, high efficiency
GPUs and FPGAs can provide massive parallelism and higher efficiency than
CPUs for many categories of applications
NVIDIA Tesla
V100 GPU
P3: GPU-accelerated computing
§ Enabling a high degree of parallelism–each
GPU has thousands of cores
§ Consistent, well documented set of APIs
(CUDA, OpenACC, OpenCL)
§ Supported by a wide variety of ISVs and
open source frameworks
Xilinx
UltraScale+
FPGA
F1: FPGA-accelerated computing
§ Massively parallel–each FPGA includes millions
of parallel system logic cells
§ Flexible–no fixed instruction set, can
implement wide or narrow datapaths
§ Programmable using available, cloud-based
FPGA development tools
ACCELERATED COMPUTING ON AWS
PARALLEL PROCESSING IN GPU AND FPGA
A GPU is effective at processing the same instruction in
parallel, for example, calculating pixel values in parallel
for graphics shading, or running many parallel financial
computations. A GPU has a well-defined instruction-set,
and fixed word sizes.
An FPGA is effective at processing the same or
different instructions in parallel, for example, creating a
complex pipeline of parallel, multistage operations on a
video stream, or performing a sequence of dependent
calculations and data manipulations for genomics
processing. An FPGA does not have a predefined
instruction-set, or a fixed data width.
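As a loose software analogy for the two paragraphs above (a sketch only: the function names are hypothetical, and plain Python runs these steps sequentially rather than in parallel hardware), the GPU style applies one operation across many data items, while the FPGA style chains different stages that each item streams through:

```python
# Illustrative analogy only: real GPUs and FPGAs execute these patterns in
# parallel hardware; plain Python merely shows the shape of each computation.

def shade(pixel):
    """GPU-style work: the same operation applied to every data item."""
    return min(255, pixel * 2)

def data_parallel(pixels):
    # One operation, many independent data items.
    return [shade(p) for p in pixels]

def pipeline(samples):
    """FPGA-style work: different stages chained, items stream through."""
    decoded = (s - 1 for s in samples)          # stage 1: decode
    filtered = (d for d in decoded if d >= 0)   # stage 2: filter
    scaled = (f * 10 for f in filtered)         # stage 3: scale
    return list(scaled)

if __name__ == "__main__":
    print(data_parallel([10, 100, 200]))
    print(pipeline([0, 1, 2, 3]))
```

In hardware, every pipeline stage would be busy with a different item at the same time, which is what the generator chain only hints at.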
PARALLEL PROCESSING IN GPU AND FPGA
CPU
• Tens to hundreds of processing cores
• Pre-defined instruction set and datapath widths
• Optimized for general purpose computing
GPU
• Thousands of processing cores
• Pre-defined instruction set and datapath widths
• Highly effective at parallel execution
FPGA
• Millions of programmable digital logic cells
• No predefined instruction set or datapath widths
• Hardware-timed execution, massively parallel
[Figure: block diagrams built from Control, ALU, Cache, and DRAM blocks]
§ Make FPGAs available as standard AWS instances to a
large community of developers and to millions of
potential customers
§ Simplify the development process by providing cloud-
based FPGA and C/C++ software development flows
§ Allow developers to focus on algorithm design by
abstracting FPGA I/O using well-defined interfaces
§ Provide a Marketplace for FPGA applications, providing
more choice and easy access for all AWS customers
FPGA ACCELERATION USING F1: GOALS
Amazon
Machine
Image (AMI)
Amazon FPGA
Image (AFI)
EC2 F1
Instance
CPU
Application
on F1
DDR-4 Attached Memory
DDR-4 Attached Memory
PCIe
DDR
Controllers
Launch Instance
and Load AFI
An F1 instance
can have any
number of AFIs
An AFI can be
loaded into the
FPGA in seconds
FPGA ACCELERATION USING F1
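A minimal sketch of the "Launch Instance and Load AFI" step, assuming the FPGA management tools from the AWS FPGA SDK are installed on the instance. The helper name and the AGFI id are placeholders introduced for illustration, not part of the slides. The function only builds the command line for `fpga-load-local-image`; it does not run it.

```python
import shlex

def load_afi_command(slot, agfi_id):
    """Build the argv for loading an AFI into one FPGA slot.

    Hypothetical helper: it only assembles the command and never executes it.
    """
    if slot < 0:
        raise ValueError("slot must be non-negative")
    if not agfi_id.startswith("agfi-"):
        raise ValueError("expected a global AFI id of the form 'agfi-...'")
    return ["sudo", "fpga-load-local-image", "-S", str(slot), "-I", agfi_id]

if __name__ == "__main__":
    # Placeholder id; substitute the AGFI id of your own image.
    argv = load_afi_command(0, "agfi-0123456789abcdef0")
    print(shlex.join(argv))
```

On an f1.16xlarge the same call could be repeated for slots 0 through 7, one per FPGA.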
§ Up to eight Xilinx UltraScale+ 16nm VU9P FPGA devices in a single instance
§ The f1.16xlarge size provides:
§ Eight FPGAs, each with over two million customer-accessible FPGA
programmable logic cells and over 5000 programmable DSP blocks
§ Each of the eight FPGAs has four DDR-4 interfaces, with each interface
accessing a 16GiB, 72-bit wide, ECC-protected memory
Instance Size   FPGAs   DDR-4 (GiB)   vCPUs   Instance Memory (GiB)   NVMe Instance Storage (GB)   Network Bandwidth
f1.2xlarge      1       4 x 16        8       122                     1 x 470                      Up to 10 Gbps
f1.16xlarge     8       32 x 16       64      976                     4 x 940                      25 Gbps
F1 FPGA INSTANCE TYPES
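The DDR-4 column of the instance table can be checked with a few lines of arithmetic. This is a sketch using only the figures from the table; the dictionary layout and function name are illustrative choices.

```python
# Figures copied from the instance table: FPGAs, DDR-4 banks, GiB per bank.
INSTANCES = {
    "f1.2xlarge": {"fpgas": 1, "ddr_banks": 4, "gib_per_bank": 16},
    "f1.16xlarge": {"fpgas": 8, "ddr_banks": 32, "gib_per_bank": 16},
}

def fpga_memory_gib(name):
    """Return (total FPGA-attached DDR-4 in GiB, GiB per FPGA)."""
    spec = INSTANCES[name]
    total = spec["ddr_banks"] * spec["gib_per_bank"]
    per_fpga = total // spec["fpgas"]
    return total, per_fpga

if __name__ == "__main__":
    for name in INSTANCES:
        total, per_fpga = fpga_memory_gib(name)
        print(f"{name}: {total} GiB total, {per_fpga} GiB per FPGA")
```

Both sizes work out to 64 GiB per FPGA, which matches the four 16 GiB interfaces per device described above.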
CUSTOM LOGIC AND THE FPGA SHELL
AWS FPGA Shell
provides standard, pre-tested, and secure
I/O components, allowing FPGA developers
to focus on their differentiating value
The FPGA Shell removes the need to
develop I/O related FPGA hardware
Software Development Kit (SDK)
provides required software interfaces for
FPGA management and communication
Hardware Development Kit (HDK)
provides required FPGA Shell components
SDK
HDK
Software
application
FPGA
Custom Logic
Custom Logic
CREATE THE AMAZON FPGA IMAGE (AFI)
GENERATE AN ENCRYPTED AFI USING THE GENERATED DCP
OPENCL IS AVAILABLE FOR F1
§ Familiar development experience to accelerate C/C++
applications
§ 50+ F1 code examples available that span multiple
domains: security, image processing, and accelerated
algorithms
§ Already supported on the FPGA Developer AMI, no need to
upgrade/install
CREATE THE AMAZON FPGA IMAGE (AFI)
XILINX SDACCEL PROVIDES AN ALTERNATIVE, C/C++/OPENCL BASED DESIGN FLOW
Amazon EC2 FPGA deployment
via Marketplace
Amazon
Machine
Image (AMI)
Amazon FPGA Image (AFI)
AFI is secured, encrypted,
dynamically loaded into the FPGA—
can’t be copied or downloaded
Customers
AWS Marketplace
DELIVERING FPGA PARTNER SOLUTIONS
VIA AWS MARKETPLACE
§ Financial computing
§ Genomics sequencing
§ Image and video processing
§ Big data and machine learning
§ Test and measurement
§ Security, compression
§ Developer tools
§ …and more
F1 USE CASES AND PARTNERS
AWS MARKETPLACE
DISCOVER, PROCURE, DEPLOY, AND MANAGE SOFTWARE IN THE CLOUD
GETTING STARTED WITH F1
https://github.com/awslabs/aws-fpga-app-notes/tree/master/reInvent17_Developer_Workshop
§ Gain hands-on experience
with AWS F1
§ Learn how to develop
FPGA-accelerated
applications
§ Learn the OpenCL flow
and the Xilinx SDAccel
development environment
DRAGEN on AWS
Marketplace
Pieter van Rooyen, CEO and Founder
Rami Mehio, VP of Engineering
Edico Genome overview
50
Employees
World Record for
Fastest Genetic
Diagnosis
Founded in
Jan 2013
Located in
San Diego, CA
11 Issued
20 Pending
Patents
17 Petabytes
Processed by
Customers to Date
Lead Investors
Qualcomm
Dell EMC
Cloud
App
Major Tech
Partnerships
Genomic big data
By 2025, genomics could well represent the biggest of big
data fields
Source: Challenges For Genomics In The Age Of Big Data, July 2015,
Forbes
[Chart comparing data volumes: Twitter, YouTube, Astronomy, Genomics; scale marker: 1 Zettabyte]
Genomic data and Moore’s Law
[Chart: projected growth, 2016 to 2020]
Genomic Data: Doubles Every Seven Months
Moore’s Law: Doubles Every Two Years?
Alternative
technologies are
needed to
address big data
challenges
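To make the gap between the two doubling rates concrete, the following sketch compounds each rate over the four-year span shown on the chart axis (2016 to 2020). It assumes clean exponential growth at the stated rates, which is a simplification.

```python
def growth_factor(months, doubling_months):
    """Multiplicative growth after `months` at a fixed doubling period."""
    return 2 ** (months / doubling_months)

SPAN_MONTHS = 4 * 12  # 2016 to 2020, as on the chart axis

genomic = growth_factor(SPAN_MONTHS, 7)    # doubles every seven months
moore = growth_factor(SPAN_MONTHS, 24)     # doubles every two years

if __name__ == "__main__":
    print(f"Genomic data grows about {genomic:.0f}x")
    print(f"Moore's Law grows {moore:.0f}x")
```

Under these assumptions data volume grows by two orders of magnitude while transistor-driven compute grows fourfold, which is the argument for alternative technologies.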
Why DRAGEN?
DRAGEN Complete Suite
Available Today!
• Somatic V2: Tumor-Only and Tumor/Normal Analysis
• RNA: Transcriptome Analysis with Splice Junction Alignment
• Germline V2: Clinical Grade End-to-End BCL→VCF, Including Advanced PCR Error Correction
• GATK Best Practices: 100% GATK Concordance
• Population: Flexible Family Trio or Large Scale Joint Genotyping Cohort Analysis
• VLRD: Virtual Long Read Detection on
Available Soon
• CNV: Copy Number Variant Analysis for Somatic Exome
• Methylation: Methyl-Seq or BS-Seq
• RNA V2: Transcriptome Analysis with Splice Junction Alignment. Coming Soon: Differential Expression
Acceleration: How do we do it?
DRAGEN FPGA platform enables massive parallel processing resulting in revolutionary data analysis
capabilities
DRAGEN software/hardware stack
FPGA accelerator is the foundation and the key driver of revolutionary compute+storage platform applications
User Interface Layer
HAL
DMA Driver
IO Layer
Pipeline Layer
SW Stack
Arbiter
CROSSBAR
4x
DDR4
Ctrlr
Accelerator
Engine 2
Accelerator
Engine 4
Accelerator
Engine 1
Accelerator
Engine 3
4x16 GB
DDR4
Memory
PCIe 3.0 x8 Interface
N channel DMA
Application Host Memory
APPLICATION
App specific
Generic
DRAGEN architecture
and hardware port to F1
Specificity
Architecture key points
• SW HAL to insulate application code
from the platform
• Edico DMA SW driver and HW DMA
channel to be independent of FPGA
device vendor
• Separate HW infrastructure layer from
acceleration layer
• Integrate DRAGEN HW infrastructure
layer with F1 instance HDK
• Size acceleration clusters for VU9P
device
• Trade off cluster size against clock
speed
DRAGEN run time acceleration
over CPU-only solutions
                         Mapping/Aligning      MAP/A/Sort/Dedup/VC
                         Onsite    AWS         Onsite    AWS F1.2X   AWS F1.16X
30X Whole Human Genome   8 min     4 min       20 min    59 min      17 min
Exome                    1 min     30 sec      2 min     3 min       1.5 min

Acceleration over CPU Only Normalized by Number of Cores
            Current Times   Acceleration over CPU Only Solution   Projected Times   Acceleration over CPU Only Solution
F1 – 2X     59 min          32x                                   44 min            43x
F1 – 16X    17 min          26x                                   10-13 min         40x
Onsite      20 min          29x                                   14 min            40x
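One way to sanity-check the acceleration table is to multiply each run time by its speedup. If both columns refer to the same normalized CPU-only baseline, the current and projected products should roughly agree. This is a sketch: the slides do not spell out how the normalization by core count is done, so the product is a consistency check and not a measured CPU-only run time.

```python
def implied_baseline(minutes, speedup):
    """Normalized CPU-only minutes implied by a run time and its speedup."""
    return minutes * speedup

TABLE = [
    # (platform, current min, current speedup, projected min range, projected speedup)
    ("F1 - 2X", 59, 32, (44, 44), 43),
    ("F1 - 16X", 17, 26, (10, 13), 40),
    ("Onsite", 20, 29, (14, 14), 40),
]

if __name__ == "__main__":
    for name, cur_min, cur_x, (lo, hi), proj_x in TABLE:
        current = implied_baseline(cur_min, cur_x)
        projected = (implied_baseline(lo, proj_x), implied_baseline(hi, proj_x))
        print(name, current, projected)
```

For each row the current product falls within, or close to, the projected range, so the projected speedups appear to follow from the shorter projected run times alone.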
DRAGEN Germline Pipeline: Analysis
Time for Genomes
FASTQ BAM
VCF/gVCF
DRAGEN Complete Suite
Whole Genome, Exome & Panels
Version 2
DRAGEN Execution Time
FASTQ
on
S3
FASTQ
on
Instance
Disk
Input file
download
BAM/VCF
on
instance
Disk
BAM/VCF
on
S3
Output file upload
Hash
Table
S3
Hash
Table on
Instance
Disk
Reference
download
DRAGEN Genome Pipeline execution:
F1.2Xlarge
DRAGEN Complete Suite
Whole Genome, Exome & Panels
Version 2
10 min  20 min  60 min  15 min
FASTQ BAM
VCF/gVCF
DRAGEN Execution Time
FASTQ
on
S3
FASTQ
on
Instance
Disk
Input file
download
BAM/VCF
on
instance
Disk
BAM/VCF
on
S3
Output file upload
Hash
Table
S3
Hash
Table on
Instance
Disk
Reference
download
Input streaming: F1.2Xlarge
DRAGEN execution time
S3
streaming Output file
upload
Reference
download
10 Min
60 min 15 min
30 s
FASTQ BAM
VCF/gVCF
BAM/VCF
on
instance
Disk
BAM/VCF
on
S3
Hash
Table
S3
Hash
Table on
Instance
Disk
Reference
download
10 Min
60 min 15 min
Output file streaming to Amazon S3
FASTQ BAM
VCF/gVCF
DRAGEN execution time
Input S3
streaming
Output file
streaming
Reference
download
2 min
60 min  30 s  30 s
Optimized solution on F1.16Xlarge
FASTQ BAM
VCF/gVCF
DRAGEN
execution time
Input S3
streaming
Output file
streaming
Reference
download
1 min
17 min  30 s  30 s
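The effect of streaming across the preceding timeline slides can be summarized by adding up the stage durations. This sketch rests on two assumptions that the slides do not state outright: that each slide's numbers map to the stages in the order reference download, input transfer, DRAGEN execution, output transfer, and that the stages run one after another without overlap.

```python
# Stage durations in minutes, read off the timeline slides.
# Assumed order: reference download, input transfer, execution, output transfer.
SCENARIOS = {
    "f1.2xlarge, download and upload": [10, 20, 60, 15],
    "f1.2xlarge, input streaming": [10, 0.5, 60, 15],
    "f1.2xlarge, input and output streaming": [2, 0.5, 60, 0.5],
    "f1.16xlarge, optimized": [1, 0.5, 17, 0.5],
}

def end_to_end_minutes(stages):
    """Total wall-clock minutes, assuming strictly sequential stages."""
    return sum(stages)

if __name__ == "__main__":
    for name, stages in SCENARIOS.items():
        print(f"{name}: {end_to_end_minutes(stages)} min")
```

If any stages overlap in practice, the true totals would be lower than these sums.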
Accuracy
Product release roadmap
• Map/Align
• Sort/Dedup
• Variant Calling
Complete Suite
• Alt-Aware
Mapping
• Adv. Error
Detection
• Next
Generation
Accuracy
• Discrete VLRD
• Integrated VLRD
• Integrated FRD
• CNV
V1 V2 V3
Previous Available Today! Q1 2018
For Genomes and Exomes
Somatic V2 RNA Germline V2 GATK Best
Practices
Population VLRD
DRAGEN Germline V2 pipeline
gain in SNP detection performance
large gain in indel detection performance
Comparison against best-performing GATK-HC mode (BQSR)
DRAGEN Somatic V2 pipeline
[Charts comparing DRAGEN Somatic v. 2 with Mutect2]
DRAGEN Germline V3 pipeline
PrecisionFDA Challenge
PrecisionFDA Hidden Treasures: Warm Up Challenge, Oct. 2017
Best Overall Performance
https://precision.fda.gov/challenges/1/view/results
Scalability
DRAGEN Workflow on AWS
AWS Services Used:
• EC2 instances
• AWS Batch
• F1 instances
https://aws.amazon.com/blogs/compute/accelerating-precision-medicine-at-scale/
Network
architecture
Control
• Web VPC + Database VPC
• No customer data
Compute (region specific)
• Auto-scaled DRAGEN instances
• DRAGEN receives job description from control channel
• DRAGEN streams data from Amazon S3, performs computation, and uploads the results back to S3
• All DRAGEN <=> S3 communication is over HTTPS
• No inter-DRAGEN instance communication
Fastest Analysis of 1000 Whole Human Genomes
Guinness World Record: Analysis Overview
DRAGEN Germline Pipeline V2
1000x f1.2xlarge instances
Upload VCF files to
S3
Download FASTQs from
S3 to EBS
Average: 111 min
1,020 Genomes Analyzed
Summary
§ FPGA acceleration results in up to 43X improvement for genomics
applications
§ Streaming I/O using Amazon S3 greatly increases throughput
§ Parallelizing across multiple FPGAs using F1.16xlarge results in
another 4X+ acceleration
§ Per-second billing and Spot instances provide opportunities for
additional cost savings
§ Deployment to F1 FPGA instances via Marketplace makes accelerated
genomics widely available
Thank you!
CMP308

How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

FPGA Accelerated Genomics Using AWS F1 Instances

  • 1. FPGA Accelerated Computing Using Amazon EC2 F1 Instances. David Pellerin, Head of WW Business Development, Infotech, AWS; Pieter van Rooyen, CEO and Founder, Edico Genome; Rami Mehio, VP of Engineering, Edico Genome. CMP308, November 30, 2017. AWS re:INVENT
  • 2. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. WHY USE ACCELERATED COMPUTING? PARALLELISM INCREASES THROUGHOUT … CPU: high speed, highly flexible. GPU/FPGA: high throughput, high efficiency. GPUs and FPGAs can provide massive parallelism and higher efficiency than CPUs for many categories of applications.
  • 3. ACCELERATED COMPUTING ON AWS. P3: GPU-accelerated computing (NVIDIA Tesla V100 GPU) § Enabling a high degree of parallelism: each GPU has thousands of cores § Consistent, well-documented set of APIs (CUDA, OpenACC, OpenCL) § Supported by a wide variety of ISVs and open source frameworks. F1: FPGA-accelerated computing (Xilinx UltraScale+ FPGA) § Massively parallel: each FPGA includes millions of parallel system logic cells § Flexible: no fixed instruction set, can implement wide or narrow datapaths § Programmable using available, cloud-based FPGA development tools
  • 4. PARALLEL PROCESSING IN GPU AND FPGA. A GPU is effective at processing the same instruction in parallel, for example, calculating pixel values in parallel for graphics shading, or running many parallel financial computations. A GPU has a well-defined instruction set and fixed word sizes. An FPGA is effective at processing the same or different instructions in parallel, for example, creating a complex pipeline of parallel, multistage operations on a video stream, or performing a sequence of dependent calculations and data manipulations for genomics processing. An FPGA does not have a predefined instruction set or a fixed data width.
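The benefit of a multistage pipeline, as described on this slide, can be illustrated with a simple cycle-count model. This is a minimal sketch and not something taken from the talk: it assumes every stage takes exactly one clock cycle, and the function names and the 1,000-frame, 5-stage workload are hypothetical.

```python
def sequential_cycles(num_items: int, num_stages: int) -> int:
    """Cycles needed when each item finishes every stage before the next item starts."""
    return num_items * num_stages


def pipelined_cycles(num_items: int, num_stages: int) -> int:
    """Cycles needed when stages overlap: fill the pipeline once, then one item completes per cycle."""
    if num_items == 0:
        return 0
    return num_stages + (num_items - 1)


# Hypothetical workload: 1,000 video frames through a 5-stage pipeline.
frames, stages = 1000, 5
print(sequential_cycles(frames, stages))  # 5000
print(pipelined_cycles(frames, stages))   # 1004
```

Under these assumptions the pipelined version approaches one result per cycle once it is full, which is the kind of hardware-timed overlap an FPGA datapath is well suited to.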
  • 5. PARALLEL PROCESSING IN GPU AND FPGA. CPU: • Tens to hundreds of processing cores • Predefined instruction set and datapath widths • Optimized for general-purpose computing. GPU: • Thousands of processing cores • Predefined instruction set and datapath widths • Highly effective at parallel execution. FPGA: • Millions of programmable digital logic cells • No predefined instruction set or datapath widths • Hardware-timed execution, massively parallel. [Diagrams: blocks labeled Control, ALU, Cache, and DRAM]
  • 6. FPGA ACCELERATION USING F1: GOALS § Make FPGAs available as standard AWS instances to a large community of developers and to millions of potential customers § Simplify the development process by providing cloud-based FPGA and C/C++ software development flows § Allow developers to focus on algorithm design by abstracting FPGA I/O using well-defined interfaces § Provide a Marketplace for FPGA applications, providing more choice and easy access for all AWS customers
  • 7. FPGA ACCELERATION USING F1. Launch Instance and Load AFI. An F1 instance can have any number of AFIs. An AFI can be loaded into the FPGA in seconds. [Diagram labels: Amazon Machine Image (AMI); Amazon FPGA Image (AFI); EC2 F1 Instance; CPU; Application on F1; PCIe; DDR Controllers; DDR-4 Attached Memory]
  • 8. F1 FPGA INSTANCE TYPES § Up to eight Xilinx UltraScale+ 16nm VU9P FPGA devices in a single instance § The f1.16xlarge size provides: § Eight FPGAs, each with over two million customer-accessible FPGA programmable logic cells and over 5000 programmable DSP blocks § Each of the eight FPGAs has four DDR-4 interfaces, with each interface accessing a 16 GiB, 72-bit wide, ECC-protected memory
      Instance Size | FPGAs | DDR-4 (GiB) | vCPUs | Instance Memory (GiB) | NVMe Instance Storage (GB) | Network Bandwidth
      f1.2xlarge | 1 | 4 x 16 | 8 | 122 | 1 x 470 | Up to 10 Gbps
      f1.16xlarge | 8 | 32 x 16 | 64 | 976 | 4 x 940 | 25 Gbps
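The DDR-4 column of the table follows directly from the per-FPGA figures on this slide (four interfaces, 16 GiB each). The short sketch below just reproduces that arithmetic; the constant and function names are illustrative and not part of any AWS API.

```python
# Figures taken from the slide: four DDR-4 interfaces per FPGA, 16 GiB per interface.
GIB_PER_DDR4_INTERFACE = 16
DDR4_INTERFACES_PER_FPGA = 4
FPGAS_PER_INSTANCE = {"f1.2xlarge": 1, "f1.16xlarge": 8}


def fpga_memory_gib(instance_size: str) -> int:
    """Total FPGA-attached DDR-4 memory (GiB) for the given instance size."""
    fpgas = FPGAS_PER_INSTANCE[instance_size]
    return fpgas * DDR4_INTERFACES_PER_FPGA * GIB_PER_DDR4_INTERFACE


for size in FPGAS_PER_INSTANCE:
    print(size, fpga_memory_gib(size), "GiB")
```

This gives 64 GiB for f1.2xlarge (4 x 16) and 512 GiB for f1.16xlarge (32 x 16), matching the table.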
  • 10. CUSTOM LOGIC AND THE FPGA SHELL. AWS FPGA Shell provides standard, pre-tested, and secure I/O components, allowing FPGA developers to focus on their differentiating value. The FPGA Shell removes the need to develop I/O-related FPGA hardware. Software Development Kit (SDK) provides required software interfaces for FPGA management and communication. Hardware Development Kit (HDK) provides required FPGA Shell components. [Diagram labels: SDK; HDK; Software application; FPGA; Custom Logic]
  • 11. CREATE THE AMAZON FPGA IMAGE (AFI): GENERATE AN ENCRYPTED AFI USING THE GENERATED DCP (DESIGN CHECKPOINT)
  • 12. OPENCL IS AVAILABLE FOR F1 § Familiar development experience to accelerate C/C++ applications § 50+ F1 code examples available that span multiple domains: security, image processing, and accelerated algorithms § Already supported on the FPGA Developer AMI, no need to upgrade/install
  • 13. CREATE THE AMAZON FPGA IMAGE (AFI): XILINX SDACCEL PROVIDES AN ALTERNATIVE, C/C++/OPENCL-BASED DESIGN FLOW
  • 14. DELIVERING FPGA PARTNER SOLUTIONS VIA AWS MARKETPLACE. Amazon EC2 FPGA deployment via Marketplace. The AFI is secured, encrypted, and dynamically loaded into the FPGA; it can’t be copied or downloaded. [Diagram labels: Amazon Machine Image (AMI); Amazon FPGA Image (AFI); Customers; AWS Marketplace]
  • 15. F1 USE CASES AND PARTNERS § Financial computing § Genomics sequencing § Image and video processing § Big data and machine learning § Test and measurement § Security, compression § Developer tools § …and more
  • 16. AWS MARKETPLACE: DISCOVER, PROCURE, DEPLOY, AND MANAGE SOFTWARE IN THE CLOUD
  • 17. GETTING STARTED WITH F1 https://github.com/awslabs/aws-fpga-app-notes/tree/master/reInvent17_Developer_Workshop § Gain hands-on experience with AWS F1 § Learn how to develop FPGA-accelerated applications § Learn the OpenCL flow and the Xilinx SDAccel development environment
  • 18. DRAGEN on AWS Marketplace. Pieter van Rooyen, CEO and Founder; Rami Mehio, VP of Engineering
  • 19. Edico Genome overview: 50 employees; world record for fastest genetic diagnosis; founded in Jan 2013; located in San Diego, CA; patents: 11 issued, 20 pending; 17 petabytes processed by customers to date. [Slide labels: Lead Investors; Qualcomm; Dell EMC; Cloud App; Major Tech Partnerships]
  • 20. Genomic big data. By 2025, genomics could well represent the biggest of big data fields. Source: Challenges For Genomics In The Age Of Big Data, July 2015, Forbes. [Chart labels: Twitter; Genomics; YouTube; Astronomy; 1 Zettabyte]
  • 21. Genomic data and Moore’s Law. Genomic data doubles every seven months. Moore’s Law: doubles every two years? Alternative technologies are needed to address big data challenges. [Chart: time axis 2016 to 2020]
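To make the gap between the two doubling rates concrete, the sketch below compounds both over the four years shown on the slide's time axis. It is a back-of-the-envelope illustration only: it assumes clean exponential growth at the stated doubling periods, and the function name is hypothetical.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Multiplicative growth after `months`, given a fixed doubling period."""
    return 2.0 ** (months / doubling_period_months)


HORIZON_MONTHS = 48  # 2016 to 2020, the span of the slide's time axis

genomic = growth_factor(HORIZON_MONTHS, 7)   # genomic data: doubles every seven months
moore = growth_factor(HORIZON_MONTHS, 24)    # Moore's Law: doubles every two years

print(f"Genomic data: about {genomic:.0f}x, Moore's Law: {moore:.0f}x")
```

Under these assumptions, data volume grows by more than a hundredfold over the period while compute per Moore's Law grows fourfold, which is the mismatch the slide uses to motivate alternative technologies.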
  • 23. DRAGEN Complete Suite. Available Today! Germline V2: Clinical Grade End-to-End BCL→VCF Including Advanced PCR Error Correction. Somatic V2: Tumor-Only and Tumor/Normal Analysis. RNA: Transcriptome Analysis with Splice Junction Alignment. GATK Best Practices: 100% GATK Concordance. Population: Flexible Family Trio or Large Scale Joint Genotyping Cohort Analysis. VLRD: Virtual Long Read Detection. Available Soon: CNV: Copy Number Variant Analysis for Somatic Exome. Methylation: Methyl-Seq or BS-Seq. RNA V2: Transcriptome Analysis with Splice Junction Alignment; Coming Soon: Differential Expression
  • 24. Acceleration: How do we do it? The DRAGEN FPGA platform enables massively parallel processing, resulting in revolutionary data analysis capabilities
  • 25. DRAGEN software/hardware stack. The FPGA accelerator is the foundation and the key driver of revolutionary compute+storage platform applications. [Diagram labels: SW Stack: Application, User Interface Layer, Pipeline Layer, IO Layer, HAL, DMA Driver; Host Memory; PCIe 3.0 x8 Interface; N channel DMA; Arbiter; CROSSBAR; Accelerator Engines 1 to 4; 4x DDR4 Ctrlr; 4x16 GB DDR4 Memory; App specific; Generic]
  • 26. DRAGEN architecture and hardware port to F1. Specificity. Architecture key points: • SW HAL to insulate application code from the platform • Edico DMA SW driver and HW DMA channel to be independent of FPGA device vendor • Separate HW infrastructure layer from acceleration layer • Integrate DRAGEN HW infrastructure layer with F1 instance HDK • Size acceleration clusters for VU9P device • Trade off cluster size against clock speed
  • 27. DRAGEN run time acceleration over CPU-only solutions
      Pipeline | Platform | 30X Whole Human Genome | Exome
      Mapping/Aligning | Onsite | 8 min | 1 min
      Mapping/Aligning | AWS | 4 min | 30 sec
      Map/Align/Sort/Dedup/VC | Onsite | 20 min | 2 min
      Map/Align/Sort/Dedup/VC | AWS F1.2X | 59 min | 3 min
      Map/Align/Sort/Dedup/VC | AWS F1.16X | 17 min | 1.5 min
      Acceleration over CPU only, normalized by number of cores:
      Platform | Current Time | Acceleration over CPU-Only Solution | Projected Time | Projected Acceleration over CPU-Only Solution
      F1.2X | 59 min | 32x | 44 min | 43x
      F1.16X | 17 min | 26x | 10-13 min | 40x
      Onsite | 20 min | 29x | 14 min | 40x
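The wall-clock gain from moving a whole-genome run from F1.2X to F1.16X can be read off the figures on this slide. The sketch below only divides the quoted run times; it does not reproduce the core-normalized acceleration factors (32x, 26x, and so on), whose CPU baseline is not given in the slide. Variable names are illustrative.

```python
# Whole-genome run times in minutes, as quoted on the slide.
CURRENT_MIN = {"F1.2X": 59, "F1.16X": 17}
PROJECTED_F1_2X_MIN = 44
PROJECTED_F1_16X_RANGE_MIN = (10, 13)  # the slide gives a range of 10-13 min

# Ratio of F1.2X time to F1.16X time: how much sooner the larger instance finishes.
current_ratio = CURRENT_MIN["F1.2X"] / CURRENT_MIN["F1.16X"]
projected_ratio_low = PROJECTED_F1_2X_MIN / max(PROJECTED_F1_16X_RANGE_MIN)
projected_ratio_high = PROJECTED_F1_2X_MIN / min(PROJECTED_F1_16X_RANGE_MIN)

print(f"current: {current_ratio:.2f}x")
print(f"projected: {projected_ratio_low:.2f}x to {projected_ratio_high:.2f}x")
```

With the current times the ratio is about 3.5x, and with the projected times it ranges from about 3.4x to 4.4x, depending on where in the 10-13 minute range the F1.16X run lands.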
  • 28. DRAGEN Germline Pipeline: Analysis Time for Genomes. DRAGEN Complete Suite (Whole Genome, Exome & Panels), Version 2: FASTQ → BAM → VCF/gVCF. Stages: Input file download (FASTQ on S3 → FASTQ on instance disk); Reference download (hash table on S3 → hash table on instance disk); DRAGEN execution time; Output file upload (BAM/VCF on instance disk → BAM/VCF on S3)
  • 29. DRAGEN Genome Pipeline execution: F1.2Xlarge. DRAGEN Complete Suite (Whole Genome, Exome & Panels), Version 2: FASTQ → BAM → VCF/gVCF. Stages: Input file download (FASTQ on S3 → FASTQ on instance disk); Reference download (hash table on S3 → hash table on instance disk); DRAGEN execution time; Output file upload (BAM/VCF on instance disk → BAM/VCF on S3). Timings shown: 10 min, 20 min, 60 min, 15 min
  • 30. Input streaming: F1.2Xlarge. FASTQ → BAM → VCF/gVCF. Stages: S3 streaming (input); Reference download (hash table on S3 → hash table on instance disk); DRAGEN execution time; Output file upload (BAM/VCF on instance disk → BAM/VCF on S3). Timings shown: 10 min, 60 min, 15 min, 30 s
  • 31. Output file streaming to Amazon S3. FASTQ → BAM → VCF/gVCF. Stages: Input S3 streaming; Reference download; DRAGEN execution time; Output file streaming. Timings shown: 2 min, 60 min, 30 s, 30 s
  • 32. Optimized solution on F1.16Xlarge. FASTQ → BAM → VCF/gVCF. Stages: Input S3 streaming; Reference download; DRAGEN execution time; Output file streaming. Timings shown: 1 min, 17 min, 30 s, 30 s
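The effect of the streaming optimizations on slides 29 to 32 can be approximated by adding up the stage timings. This is a rough sketch under two loudly stated assumptions: the transcript does not say which timing belongs to which stage, so the assignment below is a guess from the slide layout, and the stages are assumed to run back to back with no overlap. The scenario and stage names are illustrative.

```python
# ASSUMPTION: the stage-to-timing assignment is inferred, not stated in the slides.
# ASSUMPTION: stages run sequentially, so totals are simple sums (in minutes).
SCENARIOS = {
    "F1.2Xlarge, download and upload": {
        "reference download": 10, "input download": 20,
        "DRAGEN execution": 60, "output upload": 15,
    },
    "F1.2Xlarge, input streaming": {
        "reference download": 10, "input streaming": 0.5,
        "DRAGEN execution": 60, "output upload": 15,
    },
    "F1.2Xlarge, input and output streaming": {
        "reference download": 2, "input streaming": 0.5,
        "DRAGEN execution": 60, "output streaming": 0.5,
    },
    "F1.16Xlarge, input and output streaming": {
        "reference download": 1, "input streaming": 0.5,
        "DRAGEN execution": 17, "output streaming": 0.5,
    },
}


def total_minutes(stages: dict) -> float:
    """End-to-end time if every stage runs one after another."""
    return sum(stages.values())


for name, stages in SCENARIOS.items():
    print(f"{name}: {total_minutes(stages)} min")
```

Under these assumptions the end-to-end time drops from 105 minutes to 85.5, then 63, and finally 19 minutes on F1.16Xlarge, so I/O goes from a large share of the run to a small fraction of it.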
  • 34. Product release roadmap. Versions: V1 (Previous); V2 (Available Today!); V3 (Q1 2018). Features listed: • Map/Align • Sort/Dedup • Variant Calling • Complete Suite • Alt-Aware Mapping • Adv. Error Detection • Next Generation Accuracy • Discrete VLRD • Integrated VLRD • Integrated FRD • CNV. For Genomes and Exomes: Somatic V2; RNA; Germline V2; GATK Best Practices; Population; VLRD
  • 35. DRAGEN Germline V2 pipeline: gain in SNP detection performance; large gain in indel detection performance. Comparison against best-performing GATK-HC mode (BQSR)
  • 36. DRAGEN Somatic V2 pipeline. [Four chart panels, each comparing DRAGEN Somatic v. 2 with Mutect2]
  • 37. DRAGEN Germline V3 pipeline
  • 38. PrecisionFDA Challenge. PrecisionFDA Hidden Treasures: Warm Up Challenge, Oct. 2017. Best Overall Performance. https://precision.fda.gov/challenges/1/view/results
  • 40. DRAGEN Workflow on AWS. AWS Services Used: • EC2 instances • AWS Batch • F1 instances https://aws.amazon.com/blogs/compute/accelerating-precision-medicine-at-scale/
  • 41. Network architecture. Control: • Web VPC + Database VPC • No customer data. Compute (region specific): • Auto-scaled DRAGEN instances • DRAGEN receives job description from control channel • DRAGEN streams data from Amazon S3, performs computation, and uploads the results back to S3 • All DRAGEN <=> S3 communication is over HTTPS • No inter-DRAGEN instance communication
  • 42. Fastest Analysis of 1000 Whole Human Genomes
  • 43. Guinness World Record: Analysis Overview. Download FASTQs from S3 to EBS; DRAGEN Germline Pipeline V2 on 1000x f1.2xlarge instances; upload VCF files to S3. Average: 111 min. 1,020 genomes analyzed
  • 44. Summary § FPGA acceleration results in up to 43X improvement for genomics applications § Streaming I/O using Amazon S3 greatly increases throughput § Parallelizing across multiple FPGAs using f1.16xlarge results in another 4X+ acceleration § Per-second billing and Spot instances provide opportunities for additional cost savings § Deployment to F1 FPGA instances via Marketplace makes accelerated genomics widely available
  • 45. Thank you! CMP308