SlideShare a Scribd company logo
1 of 37
BIG DATA IN GPUS 
John Tran | StampedeCon2014, May 29 2014, St Louis, MO
“If you were plowing a field, which 
would you rather use? Two strong 
oxen or 1024 chickens?” 
—Seymour Cray
Example CPU: Xeon E5-2687W 
! 2.27 B transistors 
! 8 cores, 16 threads @ 3.1 GHz 
! 0.35 SP TFLOPS 
! 0.17 DP TFLOPS 
! 256 GB DDR3 @1600 MHz 
! 51.2 GB/s 
! 150 W 
! 20 MB L3 cache 
! Single thread Perf 
! branch prediction 
! out of order execution
Example GPU: Tesla K40 
! 7.1 B transistors 
! 2880 cores, 30720 threads @ 
745 MHz 
! 4.29 SP TFLOPS 
! 1.43 DP TFLOPS 
! 12 GB GDDR5 @ 3GHz 
! 288 GB/s memory BW 
! 235 W 
! PCIE Gen3 x16 
! 12 GB/s
Math and memory peak throughput 
4.29 
TFLOPS Xeon E5-2687-W Tesla K40 
0.35 0.17 
1.43 
5 
4 
3 
2 
1 
0 
SP TFLOPS DP TFLOPS 
51.2 
288 
400 
300 
200 
100 
0 
Memory BW 
GB/s 
Xeon E5-2687W Tesla K40
The Chickens are Winning 
! Parallel computing is no longer “the future” 
! If you are not parallel, you are already behind 
! GPUs win in 
! Performance == $$ 
! Power == $$ 
! Cost == $$
Where did these GPUs come from?
OK, but what about computing?
All Computing is Parallel Computing
Parallel Computing 
CPU GPU
The Basic Idea – Accelerated Computing 
Application Code 
Compute-Intensive Functions 
Rest of Sequential 
CPU Code 
GPU CPU 
CUDA
Quick CUDA C example 
Standard C Code Parallel C Code 
void saxpy(int n, float a, 
float *x, float *y) 
{ 
for (int i = 0; i < n; ++i) 
y[i] = a*x[i] + y[i]; 
} 
int N = 1<<20; 
// Perform SAXPY on 1M elements 
saxpy(N, 2.0, x, y); 
__global__ 
void saxpy(int n, float a, 
float *x, float *y) 
{ 
int i = blockIdx.x*blockDim.x + threadIdx.x; 
if (i < n) y[i] = a*x[i] + y[i]; 
} 
int N = 1<<20; 
cudaMemcpy(x, d_x, N, cudaMemcpyHostToDevice); 
cudaMemcpy(y, d_y, N, cudaMemcpyHostToDevice); 
// Perform SAXPY on 1M elements 
saxpy<<<4096,256>>>(N, 2.0, x, y); 
cudaMemcpy(d_y, y, N, cudaMemcpyDeviceToHost); 
http://developer.nvidia.com/cuda-toolkit
How else can you program it? 
! Libraries 
! Thrust, BLAS, SPARSE, FFT, NPP, RAND 
! Directives 
! OpenACC 
! Languages 
! CUDA C, CUDA C++, thrust, python, fortran, C++ proposal, matlab, gpu.net 
! Learn 
! “get cuda,” Udacity, Coursera
How does this matter to Big Data?
! 90 M monthly active users 
! 17 M tracks tagged / day 
! 27 M tracks in DB 
“GPUs enable us to handle our tremendous processing needs at a 
substantial cost savings, delivering twice the performance per dollar 
compared to a CPU-based system.” 
-Jason Titus, CTO, Shazam
Deep Neural Networks for image 
classification
Google Datacenter Stanford AI Lab 
1000 CPU Servers 
600 kWatts 
$5,000,000 
3 GPU-Accelerated Servers 
3.6 kWatts 
$21,000 
Deep learning with COTS HPC systems, A Coates, B Huval, T Wang, D Wu, A Ng, B Catanzaro, NIPS 2013
Speech Recognition
The DataScope at JHU 
5PB of science data (in 2010) 
“The Data-Scope will allow us to mine out 
relationships among data that already exist 
but that we can’t yet handle and to sift 
discoveries from what seems like an 
overwhelming flow of information. 
New discoveries will definitely emerge this 
way. There are relationships and patterns 
that we just cannot fathom buried in that 
onslaught of data. Data-Scope will tease 
these out.” 
– Alex Szalay, JHU
HIV Capsid
Beating Heart Surgery 
Patient stands to lose 1 point of IQ every 
10 min with heart stopped 
Only ~2% of heart surgeons will operate 
on a beating heart 
GPU enables real-time motion 
compensation to virtually stop beating 
heart for surgeons: 
Courtesy Laboratoire d’Informatique de Robotique et de Microelectronique de Montpellier
NVBIO
Final Thoughts 
! Parallel computing is here 
! Re-think parallel or get left behind 
! Scale up before scaling out 
! Several orders of magnitude parallelism increase by using a GPU 
! Do you really need a cluster? 
! GPUs are the most efficient solution for parallel problems 
! Perf / $ 
! Perf / Watt
All Computing is Parallel Computing

More Related Content

What's hot

Multi-Tenant Storm Service on Hadoop Grid
Multi-Tenant Storm Service on Hadoop GridMulti-Tenant Storm Service on Hadoop Grid
Multi-Tenant Storm Service on Hadoop Grid
DataWorks Summit
 
Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...
Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...
Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...
DataWorks Summit
 
Effective testing for spark programs Strata NY 2015
Effective testing for spark programs   Strata NY 2015Effective testing for spark programs   Strata NY 2015
Effective testing for spark programs Strata NY 2015
Holden Karau
 

What's hot (20)

A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterpriseA Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
 
Large volume data analysis on the Typesafe Reactive Platform
Large volume data analysis on the Typesafe Reactive PlatformLarge volume data analysis on the Typesafe Reactive Platform
Large volume data analysis on the Typesafe Reactive Platform
 
Distributed real time stream processing- why and how
Distributed real time stream processing- why and howDistributed real time stream processing- why and how
Distributed real time stream processing- why and how
 
How We Used Cassandra/Solr to Build Real-Time Analytics Platform
How We Used Cassandra/Solr to Build Real-Time Analytics PlatformHow We Used Cassandra/Solr to Build Real-Time Analytics Platform
How We Used Cassandra/Solr to Build Real-Time Analytics Platform
 
[262] netflix 빅데이터 플랫폼
[262] netflix 빅데이터 플랫폼[262] netflix 빅데이터 플랫폼
[262] netflix 빅데이터 플랫폼
 
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUsScaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
 
Real-Time Analytics with Kafka, Cassandra and Storm
Real-Time Analytics with Kafka, Cassandra and StormReal-Time Analytics with Kafka, Cassandra and Storm
Real-Time Analytics with Kafka, Cassandra and Storm
 
Bringing the Semantic Web closer to reality: PostgreSQL as RDF Graph Database
Bringing the Semantic Web closer to reality: PostgreSQL as RDF Graph DatabaseBringing the Semantic Web closer to reality: PostgreSQL as RDF Graph Database
Bringing the Semantic Web closer to reality: PostgreSQL as RDF Graph Database
 
Scaling TensorFlow with Hops, Global AI Conference Santa Clara
Scaling TensorFlow with Hops, Global AI Conference Santa ClaraScaling TensorFlow with Hops, Global AI Conference Santa Clara
Scaling TensorFlow with Hops, Global AI Conference Santa Clara
 
Debugging & Tuning in Spark
Debugging & Tuning in SparkDebugging & Tuning in Spark
Debugging & Tuning in Spark
 
Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...
Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...
Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...
 
PySpark in practice slides
PySpark in practice slidesPySpark in practice slides
PySpark in practice slides
 
Valerii Vasylkov Erlang. measurements and benefits.
Valerii Vasylkov Erlang. measurements and benefits.Valerii Vasylkov Erlang. measurements and benefits.
Valerii Vasylkov Erlang. measurements and benefits.
 
Multi-Tenant Storm Service on Hadoop Grid
Multi-Tenant Storm Service on Hadoop GridMulti-Tenant Storm Service on Hadoop Grid
Multi-Tenant Storm Service on Hadoop Grid
 
The Hadoop Ecosystem
The Hadoop EcosystemThe Hadoop Ecosystem
The Hadoop Ecosystem
 
Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...
Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...
Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...
 
Monitoring Cassandra at Scale (Jason Cacciatore, Netflix) | C* Summit 2016
Monitoring Cassandra at Scale (Jason Cacciatore, Netflix) | C* Summit 2016Monitoring Cassandra at Scale (Jason Cacciatore, Netflix) | C* Summit 2016
Monitoring Cassandra at Scale (Jason Cacciatore, Netflix) | C* Summit 2016
 
Effective testing for spark programs Strata NY 2015
Effective testing for spark programs   Strata NY 2015Effective testing for spark programs   Strata NY 2015
Effective testing for spark programs Strata NY 2015
 
유연하고 확장성 있는 빅데이터 처리
유연하고 확장성 있는 빅데이터 처리유연하고 확장성 있는 빅데이터 처리
유연하고 확장성 있는 빅데이터 처리
 
Why your Spark job is failing
Why your Spark job is failingWhy your Spark job is failing
Why your Spark job is failing
 

Viewers also liked

PG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated AsyncrPG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated Asyncr
Kohei KaiGai
 

Viewers also liked (20)

Deep learning on spark
Deep learning on sparkDeep learning on spark
Deep learning on spark
 
SIGGRAPH 2012: GPU-Accelerated 2D and Web Rendering
SIGGRAPH 2012: GPU-Accelerated 2D and Web RenderingSIGGRAPH 2012: GPU-Accelerated 2D and Web Rendering
SIGGRAPH 2012: GPU-Accelerated 2D and Web Rendering
 
GTC 2012: GPU-Accelerated Path Rendering
GTC 2012: GPU-Accelerated Path RenderingGTC 2012: GPU-Accelerated Path Rendering
GTC 2012: GPU-Accelerated Path Rendering
 
GPU Ecosystem
GPU EcosystemGPU Ecosystem
GPU Ecosystem
 
Computational Techniques for the Statistical Analysis of Big Data in R
Computational Techniques for the Statistical Analysis of Big Data in RComputational Techniques for the Statistical Analysis of Big Data in R
Computational Techniques for the Statistical Analysis of Big Data in R
 
Accelerating Machine Learning Applications on Spark Using GPUs
Accelerating Machine Learning Applications on Spark Using GPUsAccelerating Machine Learning Applications on Spark Using GPUs
Accelerating Machine Learning Applications on Spark Using GPUs
 
PG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated AsyncrPG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated Asyncr
 
Enabling Graph Analytics at Scale: The Opportunity for GPU-Acceleration of D...
Enabling Graph Analytics at Scale:  The Opportunity for GPU-Acceleration of D...Enabling Graph Analytics at Scale:  The Opportunity for GPU-Acceleration of D...
Enabling Graph Analytics at Scale: The Opportunity for GPU-Acceleration of D...
 
Heterogeneous System Architecture Overview
Heterogeneous System Architecture OverviewHeterogeneous System Architecture Overview
Heterogeneous System Architecture Overview
 
PG-Strom - GPGPU meets PostgreSQL, PGcon2015
PG-Strom - GPGPU meets PostgreSQL, PGcon2015PG-Strom - GPGPU meets PostgreSQL, PGcon2015
PG-Strom - GPGPU meets PostgreSQL, PGcon2015
 
PyData Amsterdam - Name Matching at Scale
PyData Amsterdam - Name Matching at ScalePyData Amsterdam - Name Matching at Scale
PyData Amsterdam - Name Matching at Scale
 
Hadoop + GPU
Hadoop + GPUHadoop + GPU
Hadoop + GPU
 
Deep Learning on Hadoop
Deep Learning on HadoopDeep Learning on Hadoop
Deep Learning on Hadoop
 
From Machine Learning to Learning Machines: Creating an End-to-End Cognitive ...
From Machine Learning to Learning Machines: Creating an End-to-End Cognitive ...From Machine Learning to Learning Machines: Creating an End-to-End Cognitive ...
From Machine Learning to Learning Machines: Creating an End-to-End Cognitive ...
 
DeepLearning4J and Spark: Successes and Challenges - François Garillot
DeepLearning4J and Spark: Successes and Challenges - François GarillotDeepLearning4J and Spark: Successes and Challenges - François Garillot
DeepLearning4J and Spark: Successes and Challenges - François Garillot
 
How to Solve Real-Time Data Problems
How to Solve Real-Time Data ProblemsHow to Solve Real-Time Data Problems
How to Solve Real-Time Data Problems
 
Containerizing GPU Applications with Docker for Scaling to the Cloud
Containerizing GPU Applications with Docker for Scaling to the CloudContainerizing GPU Applications with Docker for Scaling to the Cloud
Containerizing GPU Applications with Docker for Scaling to the Cloud
 
Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...
Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...
Tallinn Estonia Advanced Java Meetup Spark + TensorFlow = TensorFrames Oct 24...
 
The Potential of GPU-driven High Performance Data Analytics in Spark
The Potential of GPU-driven High Performance Data Analytics in SparkThe Potential of GPU-driven High Performance Data Analytics in Spark
The Potential of GPU-driven High Performance Data Analytics in Spark
 
Spark Summit EU talk by Tim Hunter
Spark Summit EU talk by Tim HunterSpark Summit EU talk by Tim Hunter
Spark Summit EU talk by Tim Hunter
 

Similar to GPUs in Big Data - StampedeCon 2014

Csw2016 wheeler barksdale-gruskovnjak-execute_mypacket
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacketCsw2016 wheeler barksdale-gruskovnjak-execute_mypacket
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacket
CanSecWest
 
Petascale Analytics - The World of Big Data Requires Big Analytics
Petascale Analytics - The World of Big Data Requires Big AnalyticsPetascale Analytics - The World of Big Data Requires Big Analytics
Petascale Analytics - The World of Big Data Requires Big Analytics
Heiko Joerg Schick
 
General Purpose Computing using Graphics Hardware
General Purpose Computing using Graphics HardwareGeneral Purpose Computing using Graphics Hardware
General Purpose Computing using Graphics Hardware
Daniel Blezek
 
A Development of Log-based Game AI using Deep Learning
A Development of Log-based Game AI using Deep LearningA Development of Log-based Game AI using Deep Learning
A Development of Log-based Game AI using Deep Learning
Suntae Kim
 

Similar to GPUs in Big Data - StampedeCon 2014 (20)

[RakutenTechConf2013] [A-3] TSUBAME2.5 to 3.0 and Convergence with Extreme Bi...
[RakutenTechConf2013] [A-3] TSUBAME2.5 to 3.0 and Convergence with Extreme Bi...[RakutenTechConf2013] [A-3] TSUBAME2.5 to 3.0 and Convergence with Extreme Bi...
[RakutenTechConf2013] [A-3] TSUBAME2.5 to 3.0 and Convergence with Extreme Bi...
 
Making AI efficient
Making AI efficientMaking AI efficient
Making AI efficient
 
TensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache SparkTensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache Spark
 
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacket
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacketCsw2016 wheeler barksdale-gruskovnjak-execute_mypacket
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacket
 
Intro to Machine Learning for GPUs
Intro to Machine Learning for GPUsIntro to Machine Learning for GPUs
Intro to Machine Learning for GPUs
 
Embedded system Design introduction _ Karakola
Embedded system Design introduction _ KarakolaEmbedded system Design introduction _ Karakola
Embedded system Design introduction _ Karakola
 
Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the Continuum
 
Petascale Analytics - The World of Big Data Requires Big Analytics
Petascale Analytics - The World of Big Data Requires Big AnalyticsPetascale Analytics - The World of Big Data Requires Big Analytics
Petascale Analytics - The World of Big Data Requires Big Analytics
 
Porting and optimizing UniFrac for GPUs
Porting and optimizing UniFrac for GPUsPorting and optimizing UniFrac for GPUs
Porting and optimizing UniFrac for GPUs
 
MaPU-HPCA2016
MaPU-HPCA2016MaPU-HPCA2016
MaPU-HPCA2016
 
Nvidia in bioinformatics
Nvidia in bioinformaticsNvidia in bioinformatics
Nvidia in bioinformatics
 
Lrz kurs: big data analysis
Lrz kurs: big data analysisLrz kurs: big data analysis
Lrz kurs: big data analysis
 
dCUDA: Distributed GPU Computing with Hardware Overlap
 dCUDA: Distributed GPU Computing with Hardware Overlap dCUDA: Distributed GPU Computing with Hardware Overlap
dCUDA: Distributed GPU Computing with Hardware Overlap
 
POWER10 innovations for HPC
POWER10 innovations for HPCPOWER10 innovations for HPC
POWER10 innovations for HPC
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
General Purpose Computing using Graphics Hardware
General Purpose Computing using Graphics HardwareGeneral Purpose Computing using Graphics Hardware
General Purpose Computing using Graphics Hardware
 
A Platform for Accelerating Machine Learning Applications
 A Platform for Accelerating Machine Learning Applications A Platform for Accelerating Machine Learning Applications
A Platform for Accelerating Machine Learning Applications
 
Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC
Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYCTed Willke, Senior Principal Engineer, Intel Labs at MLconf NYC
Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC
 
A Development of Log-based Game AI using Deep Learning
A Development of Log-based Game AI using Deep LearningA Development of Log-based Game AI using Deep Learning
A Development of Log-based Game AI using Deep Learning
 

More from StampedeCon

Bringing the Whole Elephant Into View Can Cognitive Systems Bring Real Soluti...
Bringing the Whole Elephant Into View Can Cognitive Systems Bring Real Soluti...Bringing the Whole Elephant Into View Can Cognitive Systems Bring Real Soluti...
Bringing the Whole Elephant Into View Can Cognitive Systems Bring Real Soluti...
StampedeCon
 

More from StampedeCon (20)

Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...
Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...
Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...
 
The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017
The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017
The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017
 
Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017
Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017
Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017
 
Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...
Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...
Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...
 
How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017
How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017
How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017
 
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
 
Foundations of Machine Learning - StampedeCon AI Summit 2017
Foundations of Machine Learning - StampedeCon AI Summit 2017Foundations of Machine Learning - StampedeCon AI Summit 2017
Foundations of Machine Learning - StampedeCon AI Summit 2017
 
Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...
Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...
Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...
 
Bringing the Whole Elephant Into View Can Cognitive Systems Bring Real Soluti...
Bringing the Whole Elephant Into View Can Cognitive Systems Bring Real Soluti...Bringing the Whole Elephant Into View Can Cognitive Systems Bring Real Soluti...
Bringing the Whole Elephant Into View Can Cognitive Systems Bring Real Soluti...
 
Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017
Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017
Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017
 
AI in the Enterprise: Past, Present & Future - StampedeCon AI Summit 2017
AI in the Enterprise: Past,  Present &  Future - StampedeCon AI Summit 2017AI in the Enterprise: Past,  Present &  Future - StampedeCon AI Summit 2017
AI in the Enterprise: Past, Present & Future - StampedeCon AI Summit 2017
 
A Different Data Science Approach - StampedeCon AI Summit 2017
A Different Data Science Approach - StampedeCon AI Summit 2017A Different Data Science Approach - StampedeCon AI Summit 2017
A Different Data Science Approach - StampedeCon AI Summit 2017
 
Graph in Customer 360 - StampedeCon Big Data Conference 2017
Graph in Customer 360 - StampedeCon Big Data Conference 2017Graph in Customer 360 - StampedeCon Big Data Conference 2017
Graph in Customer 360 - StampedeCon Big Data Conference 2017
 
End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017
End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017
End-to-end Big Data Projects with Python - StampedeCon Big Data Conference 2017
 
Doing Big Data Using Amazon's Analogs - StampedeCon Big Data Conference 2017
Doing Big Data Using Amazon's Analogs - StampedeCon Big Data Conference 2017Doing Big Data Using Amazon's Analogs - StampedeCon Big Data Conference 2017
Doing Big Data Using Amazon's Analogs - StampedeCon Big Data Conference 2017
 
Enabling New Business Capabilities with Cloud-based Streaming Data Architectu...
Enabling New Business Capabilities with Cloud-based Streaming Data Architectu...Enabling New Business Capabilities with Cloud-based Streaming Data Architectu...
Enabling New Business Capabilities with Cloud-based Streaming Data Architectu...
 
Big Data Meets IoT: Lessons From the Cloud on Polling, Collecting, and Analyz...
Big Data Meets IoT: Lessons From the Cloud on Polling, Collecting, and Analyz...Big Data Meets IoT: Lessons From the Cloud on Polling, Collecting, and Analyz...
Big Data Meets IoT: Lessons From the Cloud on Polling, Collecting, and Analyz...
 
Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016
 
Creating a Data Driven Organization - StampedeCon 2016
Creating a Data Driven Organization - StampedeCon 2016Creating a Data Driven Organization - StampedeCon 2016
Creating a Data Driven Organization - StampedeCon 2016
 
Using The Internet of Things for Population Health Management - StampedeCon 2016
Using The Internet of Things for Population Health Management - StampedeCon 2016Using The Internet of Things for Population Health Management - StampedeCon 2016
Using The Internet of Things for Population Health Management - StampedeCon 2016
 

Recently uploaded

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Recently uploaded (20)

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 

GPUs in Big Data - StampedeCon 2014

  • 1. BIG DATA IN GPUS John Tran | StampedeCon2014, May 29 2014, St Louis, MO
  • 2. “If you were plowing a field, which would you rather use? Two strong oxen or 1024 chickens?” —Seymour Cray
  • 3. Example CPU: Xeon E5-2687W ! 2.27 B transistors ! 8 cores, 16 threads @ 3.1 GHz ! 0.35 SP TFLOPS ! 0.17 DP TFLOPS ! 256 GB DDR3 @1600 MHz ! 51.2 GB/s ! 150 W ! 20 MB L3 cache ! Single thread Perf ! branch prediction ! out of order execution
  • 4. Example GPU: Tesla K40 ! 7.1 B transistors ! 2880 cores, 30720 threads @ 745 MHz ! 4.29 SP TFLOPS ! 1.43 DP TFLOPS ! 12 GB GDDR5 @ 3GHz ! 288 GB/s memory BW ! 235 W ! PCIE Gen3 x16 ! 12 GB/s
  • 5. Math and memory peak throughput 4.29 TFLOPS Xeon E5-2687-W Tesla K40 0.35 0.17 1.43 5 4 3 2 1 0 SP TFLOPS DP TFLOPS 51.2 288 400 300 200 100 0 Memory BW GB/s Xeon E5-2687W Tesla K40
  • 6. The Chickens are Winning ! Parallel computing is no longer “the future” ! If you are not parallel, you are already behind ! GPUs win in ! Performance == $$ ! Power == $$ ! Cost == $$
  • 7. Where did these GPUs come from?
  • 8.
  • 9.
  • 10.
  • 11. OK, but what about computing?
  • 12. All Computing is Parallel Computing
  • 14. The Basic Idea – Accelerated Computing Application Code Compute-Intensive Functions Rest of Sequential CPU Code GPU CPU CUDA
  • 15. Quick CUDA C example Standard C Code Parallel C Code void saxpy(int n, float a, float *x, float *y) { for (int i = 0; i < n; ++i) y[i] = a*x[i] + y[i]; } int N = 1<<20; // Perform SAXPY on 1M elements saxpy(N, 2.0, x, y); __global__ void saxpy(int n, float a, float *x, float *y) { int i = blockIdx.x*blockDim.x + threadIdx.x; if (i < n) y[i] = a*x[i] + y[i]; } int N = 1<<20; cudaMemcpy(x, d_x, N, cudaMemcpyHostToDevice); cudaMemcpy(y, d_y, N, cudaMemcpyHostToDevice); // Perform SAXPY on 1M elements saxpy<<<4096,256>>>(N, 2.0, x, y); cudaMemcpy(d_y, y, N, cudaMemcpyDeviceToHost); http://developer.nvidia.com/cuda-toolkit
  • 16. How else can you program it? ! Libraries ! Thrust, BLAS, SPARSE, FFT, NPP, RAND ! Directives ! OpenACC ! Languages ! CUDA C, CUDA C++, thrust, python, fortran, C++ proposal, matlab, gpu.net ! Learn ! “get cuda,” Udacity, Coursera
  • 17. How does this matter to Big Data?
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23. ! 90 M monthly active users ! 17 M tracks tagged / day ! 27 M tracks in DB “GPUs enable us to handle our tremendous processing needs at a substantial cost savings, delivering twice the performance per dollar compared to a CPU-based system.” -Jason Titus, CTO, Shazam
  • 24.
  • 25. Deep Neural Networks for image classification
  • 26.
  • 27. Google Datacenter Stanford AI Lab 1000 CPU Servers 600 kWatts $5,000,000 3 GPU-Accelerated Servers 3.6 kWatts $21,000 Deep learning with COTS HPC systems, A Coates, B Huval, T Wang, D Wu, A Ng, B Catanzaro, NIPS 2013
  • 29.
  • 30.
  • 31. The DataScope at JHU 5PB of science data (in 2010) “The Data-Scope will allow us to mine out relationships among data that already exist but that we can’t yet handle and to sift discoveries from what seems like an overwhelming flow of information. New discoveries will definitely emerge this way. There are relationships and patterns that we just cannot fathom buried in that onslaught of data. Data-Scope will tease these out.” – Alex Szalay, JHU
  • 33. Beating Heart Surgery Patient stands to lose 1 point of IQ every 10 min with heart stopped Only ~2% of heart surgeons will operate on a beating heart GPU enables real-time motion compensation to virtually stop beating heart for surgeons: Courtesy Laboratoire d’Informatique de Robotique et de Microelectronique de Montpellier
  • 34.
  • 35. NVBIO
  • 36. Final Thoughts ! Parallel computing is here ! Re-think parallel or get left behind ! Scale up before scaling out ! Several orders of magnitude parallelism increase by using a GPU ! Do you really need a cluster? ! GPUs are the most efficient solution for parallel problems ! Perf / $ ! Perf / Watt
  • 37. All Computing is Parallel Computing