SlideShare a Scribd company logo
1 of 34
Download to read offline
GPU Computing for Data Science
John Joo
john.joo@dominodatalab.com
Data Science Evangelist @ Domino Data Lab
Outline
ā€¢ Why use GPUs?
ā€¢ Example applications in data science
ā€¢ Programming your GPU
Case Study:
Monte Carlo Simulations
ā€¢ Simulate behavior when randomness
is a key component
ā€¢ Average the results of many
simulations
ā€¢ Make predictions
Little Information in One ā€œNoisy Simulationā€
Price(t+1) = Price(t) e InterestRateā€¢dt + noise
Many ā€œNoisy Simulationsā€ āž” Actionable Information
Price(t+1) = Price(t) e InterestRateā€¢dt + noise
Monte Carlo Simulations Are Often Slow
ā€¢ Lots of simulation data is required to
create valid models
ā€¢ Generating lots of data takes time
ā€¢ CPU works sequentially
CPUs designed for sequential, complex tasks
Source: Mythbusters https://youtu.be/-P28LKWTzrI
GPUs designed for parallel, low level tasks
Source: Mythbusters https://youtu.be/-P28LKWTzrI
GPUs designed for parallel, low level tasks
Source: Mythbusters https://youtu.be/-P28LKWTzrI
Applications of GPU Computing in Data Science
ā€¢ Matrix Manipulation
ā€¢ Numerical Analysis
ā€¢ Sorting
ā€¢ FFT
ā€¢ String matching
ā€¢ Monte Carlo simulations
ā€¢ Machine learning
ā€¢ Search
Algorithms for GPU Acceleration
ā€¢ Inherently parallel
ā€¢ Matrix operations
ā€¢ High FLoat-point Operations Per Sec
(FLOPS)
GPUs Make Deep Learning Accessible
Google
Datacenter
Stanford AI Lab
# of machines 1,000 3
# of CPUs or
GPUs
2,000 CPUs 12 GPUs
Cores 16,000 18,432
Power used 600 kW 4 kW
Cost $5,000,000 $33,000
Adam Coates, Brody Huval,Tao Wang, David Wu, Bryan Catanzaro, Ng Andrew ; JMLR W&CP 28 (3) : 1337ā€“1345, 2013
CPU vs GPU Architecture:
Structured for Different Purposes
CPU
4-8 High Performance Cores
GPU
100s-1000s of bare bones cores
Both CPU and GPU are required
CPU GPU
Compute intensive
functions
Everything else
General Purpose GPU Computing (GPGPU)
Heterogeneous Computing
Getting Started: Hardware
ā€¢ Need a computer with GPU
ā€¢ GPU should not be operating your
display
Spin up a GPU/CPU computer with 1 click.
8 CPU cores, 15 GB RAM
1,536 GPU cores, 4GB RAM
Getting Started: Hardware
āœ”
Programming CPU
ā€¢ Sequential
ā€¢ Write code top to bottom
ā€¢ Can do complex tasks
ā€¢ Independent
Programming GPU
ā€¢ Parallel
ā€¢ Multi-threaded - race conditions
ā€¢ Low level tasks
ā€¢ Dependent on CPU
Getting Started: Software
Talking to your GPU
CUDA and OpenCL are GPU computing frameworks
Choosing How to Interface with GPU:
Simplicity vs Flexibility
Application
speciļ¬c
libraries
General
purpose GPU
libraries
Custom
CUDA/
OpenCL code
Flexibility
Simplicity
Low
Low
High
High
Application Speciļ¬c Libraries
Python
ā€¢ Theano - Symbolic math
ā€¢ TensorFlow - ML
ā€¢ Lasagne - NN
ā€¢ Pylearn2 - ML
ā€¢ mxnet - NN
ā€¢ ABSsysbio - Systems Bio
R
ā€¢ cudaBayesreg - fMRI
ā€¢ mxnet - NN
ā€¢ rpud -SVM
ā€¢ rgpu - bioinformatics
Tutorial on using Theano, Lasagne, and no-learn:
http://blog.dominodatalab.com/gpu-computing-and-deep-learning/
General Purpose GPU Libraries
ā€¢ Python and R wrappers for basic matrix
and linear algebra operations
ā€¢ scikit-cuda
ā€¢ cudamat
ā€¢ gputools
ā€¢ HiPLARM
ā€¢ Drop-in library
Drop-in Library
Credit: NVIDIA
Also works for Python!
http://scelementary.com/2015/04/09/nvidia-nvblas-in-numpy.html
Custom CUDA/OpenCL Code
1. Allocate memory on the GPU
2. Transfer data from CPU to GPU
3. Launch the kernel to operate on the CPU
cores
4. Transfer results back to CPU
Example of using Python and CUDA:
Monte Carlo Simulations
ā€¢ Using PyCuda to interface Python and
CUDA
ā€¢ Simulating 3 million paths, 100 time steps
each
Python Code for CPU
Python/PyCUDA Code for GPU
8 more lines of code
Python Code for CPU
Python/PyCUDA Code for CPU
1. Allocate memory on the GPU
Python Code for CPU
Python/PyCUDA Code for CPU
2. Transfer data from CPU to GPU
Python Code for CPU
Python/PyCUDA Code for CPU
3. Launch the kernel to operate on the CPU cores
Python Code for CPU
Python/PyCUDA Code for CPU
4. Transfer results back to CPU
Python Code for CPU
26 sec
Python/PyCUDA Code for CPU
8 more lines of code
1.5 sec
17x speed up
Some sample Jupyter notebooks
ā€¢ https://app.dominodatalab.com/johnjoo/gpu_examples
ā€¢ Monte Carlo example using PyCUDA
ā€¢ PyCUDA example compiling CUDA C for kernel
instructions
ā€¢ Scikit-cuda example of matrix multiplication
ā€¢ Calculating a distance matrix using rpud
More resources
ā€¢ NVIDIA
ā€¢ https://developer.nvidia.com/how-to-cuda-python
ā€¢ Berkeley GPU workshop
ā€¢ http://www.stat.berkeley.edu/scf/paciorek-
gpuWorkshop.html
ā€¢ Duke Statistics on GPU (Python)
ā€¢ http://people.duke.edu/~ccc14/sta-663/
CUDAPython.html
ā€¢ Andreas Klocknerā€™s webpage (Python)
ā€¢ http://mathema.tician.de/
ā€¢ Summary of GPU libraries
ā€¢ http://fastml.com/running-things-on-a-gpu/
More resources
ā€¢ Walk through of CUDA programming in R
ā€¢ http://blog.revolutionanalytics.com/2015/01/parallel-
programming-with-gpus-and-r.html
ā€¢ List of libraries for GPU computing in R
ā€¢ https://cran.r-project.org/web/views/
HighPerformanceComputing.html
ā€¢ Matrix computations in Machine Learning
ā€¢ http://numml.kyb.tuebingen.mpg.de/numl09/
talk_dhillon.pdf
Questions?
john.joo@dominodatalab.com
blog.dominodatalab.com
john.joo@dominodatalab.com
blog.dominodatalab.com

More Related Content

What's hot

The Future of Everything
The Future of EverythingThe Future of Everything
The Future of EverythingCharbel Zeaiter
Ā 
Infographics Key Data KPI presentation slides
Infographics Key Data KPI presentation slidesInfographics Key Data KPI presentation slides
Infographics Key Data KPI presentation slidesPeter Zvirinsky
Ā 
Q3 2022 DBX Investor Presentation.pdf
Q3 2022 DBX Investor Presentation.pdfQ3 2022 DBX Investor Presentation.pdf
Q3 2022 DBX Investor Presentation.pdfDropbox
Ā 
2023 Q2 Crypto Industry Report | CoinGecko
2023 Q2 Crypto Industry Report | CoinGecko2023 Q2 Crypto Industry Report | CoinGecko
2023 Q2 Crypto Industry Report | CoinGeckoCoinGecko
Ā 
Swift Paris - Dealing The Cards
Swift Paris - Dealing The CardsSwift Paris - Dealing The Cards
Swift Paris - Dealing The CardsZenly
Ā 
How to Determine the ROI of Anything
How to Determine the ROI of AnythingHow to Determine the ROI of Anything
How to Determine the ROI of AnythingGary Vaynerchuk
Ā 
The Hourglass Effect - A Decade of Displacement
The Hourglass Effect - A Decade of DisplacementThe Hourglass Effect - A Decade of Displacement
The Hourglass Effect - A Decade of DisplacementFrank Rotman
Ā 
Launching a Rocketship Off Someone Else's Back
Launching a Rocketship Off Someone Else's BackLaunching a Rocketship Off Someone Else's Back
Launching a Rocketship Off Someone Else's Backjoshelman
Ā 
EY: Why graph technology makes sense for fraud detection and customer 360 pro...
EY: Why graph technology makes sense for fraud detection and customer 360 pro...EY: Why graph technology makes sense for fraud detection and customer 360 pro...
EY: Why graph technology makes sense for fraud detection and customer 360 pro...Neo4j
Ā 
AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017
AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017
AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017Carol Smith
Ā 
Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4 Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4 TigerGraph
Ā 
Digital 2023 Australia (February 2023) v01
Digital 2023 Australia (February 2023) v01Digital 2023 Australia (February 2023) v01
Digital 2023 Australia (February 2023) v01DataReportal
Ā 
Intelligent Banking: AI cases in Retail and Commercial Banking
Intelligent Banking: AI cases in Retail and Commercial BankingIntelligent Banking: AI cases in Retail and Commercial Banking
Intelligent Banking: AI cases in Retail and Commercial BankingDmitry Petukhov
Ā 
Banish The Buzzwords
Banish The BuzzwordsBanish The Buzzwords
Banish The BuzzwordsLinkedIn
Ā 
Big data by Mithlesh sadh
Big data by Mithlesh sadhBig data by Mithlesh sadh
Big data by Mithlesh sadhMithlesh Sadh
Ā 
Deriving Intelligence from Customer Actions: Data Marketing 2015 presentation
Deriving Intelligence from Customer Actions: Data Marketing 2015 presentation  Deriving Intelligence from Customer Actions: Data Marketing 2015 presentation
Deriving Intelligence from Customer Actions: Data Marketing 2015 presentation Mathew Sweezey
Ā 
chatgpt dalle.pptx
chatgpt dalle.pptxchatgpt dalle.pptx
chatgpt dalle.pptxEllen Edmands
Ā 
[Infographic] How will Internet of Things (IoT) change the world as we know it?
[Infographic] How will Internet of Things (IoT) change the world as we know it?[Infographic] How will Internet of Things (IoT) change the world as we know it?
[Infographic] How will Internet of Things (IoT) change the world as we know it?InterQuest Group
Ā 
2017 holiday survey: An annual analysis of the peak shopping season
2017 holiday survey: An annual analysis of the peak shopping season2017 holiday survey: An annual analysis of the peak shopping season
2017 holiday survey: An annual analysis of the peak shopping seasonDeloitte United States
Ā 

What's hot (20)

ChatGPT SEO Guide 2023
ChatGPT SEO Guide 2023ChatGPT SEO Guide 2023
ChatGPT SEO Guide 2023
Ā 
The Future of Everything
The Future of EverythingThe Future of Everything
The Future of Everything
Ā 
Infographics Key Data KPI presentation slides
Infographics Key Data KPI presentation slidesInfographics Key Data KPI presentation slides
Infographics Key Data KPI presentation slides
Ā 
Q3 2022 DBX Investor Presentation.pdf
Q3 2022 DBX Investor Presentation.pdfQ3 2022 DBX Investor Presentation.pdf
Q3 2022 DBX Investor Presentation.pdf
Ā 
2023 Q2 Crypto Industry Report | CoinGecko
2023 Q2 Crypto Industry Report | CoinGecko2023 Q2 Crypto Industry Report | CoinGecko
2023 Q2 Crypto Industry Report | CoinGecko
Ā 
Swift Paris - Dealing The Cards
Swift Paris - Dealing The CardsSwift Paris - Dealing The Cards
Swift Paris - Dealing The Cards
Ā 
How to Determine the ROI of Anything
How to Determine the ROI of AnythingHow to Determine the ROI of Anything
How to Determine the ROI of Anything
Ā 
The Hourglass Effect - A Decade of Displacement
The Hourglass Effect - A Decade of DisplacementThe Hourglass Effect - A Decade of Displacement
The Hourglass Effect - A Decade of Displacement
Ā 
Launching a Rocketship Off Someone Else's Back
Launching a Rocketship Off Someone Else's BackLaunching a Rocketship Off Someone Else's Back
Launching a Rocketship Off Someone Else's Back
Ā 
EY: Why graph technology makes sense for fraud detection and customer 360 pro...
EY: Why graph technology makes sense for fraud detection and customer 360 pro...EY: Why graph technology makes sense for fraud detection and customer 360 pro...
EY: Why graph technology makes sense for fraud detection and customer 360 pro...
Ā 
AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017
AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017
AI and Machine Learning Demystified by Carol Smith at Midwest UX 2017
Ā 
Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4 Graph Gurus 15: Introducing TigerGraph 2.4
Graph Gurus 15: Introducing TigerGraph 2.4
Ā 
Digital 2023 Australia (February 2023) v01
Digital 2023 Australia (February 2023) v01Digital 2023 Australia (February 2023) v01
Digital 2023 Australia (February 2023) v01
Ā 
Intelligent Banking: AI cases in Retail and Commercial Banking
Intelligent Banking: AI cases in Retail and Commercial BankingIntelligent Banking: AI cases in Retail and Commercial Banking
Intelligent Banking: AI cases in Retail and Commercial Banking
Ā 
Banish The Buzzwords
Banish The BuzzwordsBanish The Buzzwords
Banish The Buzzwords
Ā 
Big data by Mithlesh sadh
Big data by Mithlesh sadhBig data by Mithlesh sadh
Big data by Mithlesh sadh
Ā 
Deriving Intelligence from Customer Actions: Data Marketing 2015 presentation
Deriving Intelligence from Customer Actions: Data Marketing 2015 presentation  Deriving Intelligence from Customer Actions: Data Marketing 2015 presentation
Deriving Intelligence from Customer Actions: Data Marketing 2015 presentation
Ā 
chatgpt dalle.pptx
chatgpt dalle.pptxchatgpt dalle.pptx
chatgpt dalle.pptx
Ā 
[Infographic] How will Internet of Things (IoT) change the world as we know it?
[Infographic] How will Internet of Things (IoT) change the world as we know it?[Infographic] How will Internet of Things (IoT) change the world as we know it?
[Infographic] How will Internet of Things (IoT) change the world as we know it?
Ā 
2017 holiday survey: An annual analysis of the peak shopping season
2017 holiday survey: An annual analysis of the peak shopping season2017 holiday survey: An annual analysis of the peak shopping season
2017 holiday survey: An annual analysis of the peak shopping season
Ā 

Viewers also liked

DAMA Webinar - Big and Little Data Quality
DAMA Webinar - Big and Little Data QualityDAMA Webinar - Big and Little Data Quality
DAMA Webinar - Big and Little Data QualityDATAVERSITY
Ā 
Booz Allen Field Guide to Data Science
Booz Allen Field Guide to Data Science Booz Allen Field Guide to Data Science
Booz Allen Field Guide to Data Science Booz Allen Hamilton
Ā 
Bridging the Gap Between Data Science & Engineer: Building High-Performance T...
Bridging the Gap Between Data Science & Engineer: Building High-Performance T...Bridging the Gap Between Data Science & Engineer: Building High-Performance T...
Bridging the Gap Between Data Science & Engineer: Building High-Performance T...ryanorban
Ā 
Analytics Trends 2016: The next evolution
Analytics Trends 2016: The next evolutionAnalytics Trends 2016: The next evolution
Analytics Trends 2016: The next evolutionDeloitte United States
Ā 
Empowering developers to deploy their own data stores
Empowering developers to deploy their own data storesEmpowering developers to deploy their own data stores
Empowering developers to deploy their own data storesTomas Doran
Ā 
Net Promoter Score Pitfalls to Avoid
Net Promoter Score Pitfalls to AvoidNet Promoter Score Pitfalls to Avoid
Net Promoter Score Pitfalls to AvoidAureus Analytics
Ā 
Pollen VC Building A Digital Lending Business
Pollen VC Building A Digital Lending BusinessPollen VC Building A Digital Lending Business
Pollen VC Building A Digital Lending BusinessPollen VC
Ā 
Ways of Seeing Data: Towards a Critical Literacy for Data Visualisations as R...
Ways of Seeing Data: Towards a Critical Literacy for Data Visualisations as R...Ways of Seeing Data: Towards a Critical Literacy for Data Visualisations as R...
Ways of Seeing Data: Towards a Critical Literacy for Data Visualisations as R...Jonathan Gray
Ā 
Data made out of functions
Data made out of functionsData made out of functions
Data made out of functionskenbot
Ā 
GAME ON! Integrating Games and Simulations in the Classroom
GAME ON! Integrating Games and Simulations in the Classroom GAME ON! Integrating Games and Simulations in the Classroom
GAME ON! Integrating Games and Simulations in the Classroom Brian Housand
Ā 
What to Upload to SlideShare
What to Upload to SlideShareWhat to Upload to SlideShare
What to Upload to SlideShareSlideShare
Ā 
Mobile-First SEO - The Marketers Edition #3XEDigital
Mobile-First SEO - The Marketers Edition #3XEDigitalMobile-First SEO - The Marketers Edition #3XEDigital
Mobile-First SEO - The Marketers Edition #3XEDigitalAleyda SolĆ­s
Ā 
Dear NSA, let me take care of your slides.
Dear NSA, let me take care of your slides.Dear NSA, let me take care of your slides.
Dear NSA, let me take care of your slides.Emiland
Ā 
IT in Healthcare
IT in HealthcareIT in Healthcare
IT in HealthcareNetApp
Ā 
African Americans: College Majors and Earnings
African Americans: College Majors and Earnings African Americans: College Majors and Earnings
African Americans: College Majors and Earnings CEW Georgetown
Ā 
SXSW 2016: The Need To Knows
SXSW 2016: The Need To KnowsSXSW 2016: The Need To Knows
SXSW 2016: The Need To KnowsOgilvy Consulting
Ā 
Creative Traction Methodology - For Early Stage Startups
Creative Traction Methodology - For Early Stage StartupsCreative Traction Methodology - For Early Stage Startups
Creative Traction Methodology - For Early Stage StartupsTommaso Di Bartolo
Ā 
Mobile Is Eating the World (2016)
Mobile Is Eating the World (2016)Mobile Is Eating the World (2016)
Mobile Is Eating the World (2016)a16z
Ā 
The Physical Interface
The Physical InterfaceThe Physical Interface
The Physical InterfaceJosh Clark
Ā 

Viewers also liked (19)

DAMA Webinar - Big and Little Data Quality
DAMA Webinar - Big and Little Data QualityDAMA Webinar - Big and Little Data Quality
DAMA Webinar - Big and Little Data Quality
Ā 
Booz Allen Field Guide to Data Science
Booz Allen Field Guide to Data Science Booz Allen Field Guide to Data Science
Booz Allen Field Guide to Data Science
Ā 
Bridging the Gap Between Data Science & Engineer: Building High-Performance T...
Bridging the Gap Between Data Science & Engineer: Building High-Performance T...Bridging the Gap Between Data Science & Engineer: Building High-Performance T...
Bridging the Gap Between Data Science & Engineer: Building High-Performance T...
Ā 
Analytics Trends 2016: The next evolution
Analytics Trends 2016: The next evolutionAnalytics Trends 2016: The next evolution
Analytics Trends 2016: The next evolution
Ā 
Empowering developers to deploy their own data stores
Empowering developers to deploy their own data storesEmpowering developers to deploy their own data stores
Empowering developers to deploy their own data stores
Ā 
Net Promoter Score Pitfalls to Avoid
Net Promoter Score Pitfalls to AvoidNet Promoter Score Pitfalls to Avoid
Net Promoter Score Pitfalls to Avoid
Ā 
Pollen VC Building A Digital Lending Business
Pollen VC Building A Digital Lending BusinessPollen VC Building A Digital Lending Business
Pollen VC Building A Digital Lending Business
Ā 
Ways of Seeing Data: Towards a Critical Literacy for Data Visualisations as R...
Ways of Seeing Data: Towards a Critical Literacy for Data Visualisations as R...Ways of Seeing Data: Towards a Critical Literacy for Data Visualisations as R...
Ways of Seeing Data: Towards a Critical Literacy for Data Visualisations as R...
Ā 
Data made out of functions
Data made out of functionsData made out of functions
Data made out of functions
Ā 
GAME ON! Integrating Games and Simulations in the Classroom
GAME ON! Integrating Games and Simulations in the Classroom GAME ON! Integrating Games and Simulations in the Classroom
GAME ON! Integrating Games and Simulations in the Classroom
Ā 
What to Upload to SlideShare
What to Upload to SlideShareWhat to Upload to SlideShare
What to Upload to SlideShare
Ā 
Mobile-First SEO - The Marketers Edition #3XEDigital
Mobile-First SEO - The Marketers Edition #3XEDigitalMobile-First SEO - The Marketers Edition #3XEDigital
Mobile-First SEO - The Marketers Edition #3XEDigital
Ā 
Dear NSA, let me take care of your slides.
Dear NSA, let me take care of your slides.Dear NSA, let me take care of your slides.
Dear NSA, let me take care of your slides.
Ā 
IT in Healthcare
IT in HealthcareIT in Healthcare
IT in Healthcare
Ā 
African Americans: College Majors and Earnings
African Americans: College Majors and Earnings African Americans: College Majors and Earnings
African Americans: College Majors and Earnings
Ā 
SXSW 2016: The Need To Knows
SXSW 2016: The Need To KnowsSXSW 2016: The Need To Knows
SXSW 2016: The Need To Knows
Ā 
Creative Traction Methodology - For Early Stage Startups
Creative Traction Methodology - For Early Stage StartupsCreative Traction Methodology - For Early Stage Startups
Creative Traction Methodology - For Early Stage Startups
Ā 
Mobile Is Eating the World (2016)
Mobile Is Eating the World (2016)Mobile Is Eating the World (2016)
Mobile Is Eating the World (2016)
Ā 
The Physical Interface
The Physical InterfaceThe Physical Interface
The Physical Interface
Ā 

Similar to GPU Computing for Data Science

"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese..."Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...Edge AI and Vision Alliance
Ā 
GPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And PythonGPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And PythonJen Aman
Ā 
The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computingbakers84
Ā 
PostgreSQL with OpenCL
PostgreSQL with OpenCLPostgreSQL with OpenCL
PostgreSQL with OpenCLMuhaza Liebenlito
Ā 
Pgopencl
PgopenclPgopencl
PgopenclTim Child
Ā 
Kernel Recipes 2016 - Speeding up development by setting up a kernel build farm
Kernel Recipes 2016 - Speeding up development by setting up a kernel build farmKernel Recipes 2016 - Speeding up development by setting up a kernel build farm
Kernel Recipes 2016 - Speeding up development by setting up a kernel build farmAnne Nicolas
Ā 
GPU and Deep learning best practices
GPU and Deep learning best practicesGPU and Deep learning best practices
GPU and Deep learning best practicesLior Sidi
Ā 
OpenCL & the Future of Desktop High Performance Computing in CAD
OpenCL & the Future of Desktop High Performance Computing in CADOpenCL & the Future of Desktop High Performance Computing in CAD
OpenCL & the Future of Desktop High Performance Computing in CADDesign World
Ā 
GPU enablement for data science on OpenShift | DevNation Tech Talk
GPU enablement for data science on OpenShift | DevNation Tech TalkGPU enablement for data science on OpenShift | DevNation Tech Talk
GPU enablement for data science on OpenShift | DevNation Tech TalkRed Hat Developers
Ā 
Debugging Numerical Simulations on Accelerated Architectures - TotalView fo...
 Debugging Numerical Simulations on Accelerated Architectures  - TotalView fo... Debugging Numerical Simulations on Accelerated Architectures  - TotalView fo...
Debugging Numerical Simulations on Accelerated Architectures - TotalView fo...Rogue Wave Software
Ā 
The GPGPU Continuum
The GPGPU ContinuumThe GPGPU Continuum
The GPGPU ContinuumOfer Rosenberg
Ā 
Stream Processing
Stream ProcessingStream Processing
Stream Processingarnamoy10
Ā 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDKKernel TLV
Ā 
NVidia CUDA for Bruteforce Attacks - DefCamp 2012
NVidia CUDA for Bruteforce Attacks - DefCamp 2012NVidia CUDA for Bruteforce Attacks - DefCamp 2012
NVidia CUDA for Bruteforce Attacks - DefCamp 2012DefCamp
Ā 
GPU databases - How to use them and what the future holds
GPU databases - How to use them and what the future holdsGPU databases - How to use them and what the future holds
GPU databases - How to use them and what the future holdsArnon Shimoni
Ā 
SCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER TutorialSCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER TutorialGanesan Narayanasamy
Ā 
OpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC SystemsOpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC SystemsHPCC Systems
Ā 

Similar to GPU Computing for Data Science (20)

"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese..."Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
Ā 
GPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And PythonGPU Computing With Apache Spark And Python
GPU Computing With Apache Spark And Python
Ā 
The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computing
Ā 
Current Trends in HPC
Current Trends in HPCCurrent Trends in HPC
Current Trends in HPC
Ā 
PostgreSQL with OpenCL
PostgreSQL with OpenCLPostgreSQL with OpenCL
PostgreSQL with OpenCL
Ā 
Pgopencl
PgopenclPgopencl
Pgopencl
Ā 
Kernel Recipes 2016 - Speeding up development by setting up a kernel build farm
Kernel Recipes 2016 - Speeding up development by setting up a kernel build farmKernel Recipes 2016 - Speeding up development by setting up a kernel build farm
Kernel Recipes 2016 - Speeding up development by setting up a kernel build farm
Ā 
GPU and Deep learning best practices
GPU and Deep learning best practicesGPU and Deep learning best practices
GPU and Deep learning best practices
Ā 
Programming Models for Heterogeneous Chips
Programming Models for  Heterogeneous ChipsProgramming Models for  Heterogeneous Chips
Programming Models for Heterogeneous Chips
Ā 
OpenCL & the Future of Desktop High Performance Computing in CAD
OpenCL & the Future of Desktop High Performance Computing in CADOpenCL & the Future of Desktop High Performance Computing in CAD
OpenCL & the Future of Desktop High Performance Computing in CAD
Ā 
GPU enablement for data science on OpenShift | DevNation Tech Talk
GPU enablement for data science on OpenShift | DevNation Tech TalkGPU enablement for data science on OpenShift | DevNation Tech Talk
GPU enablement for data science on OpenShift | DevNation Tech Talk
Ā 
Debugging Numerical Simulations on Accelerated Architectures - TotalView fo...
 Debugging Numerical Simulations on Accelerated Architectures  - TotalView fo... Debugging Numerical Simulations on Accelerated Architectures  - TotalView fo...
Debugging Numerical Simulations on Accelerated Architectures - TotalView fo...
Ā 
The GPGPU Continuum
The GPGPU ContinuumThe GPGPU Continuum
The GPGPU Continuum
Ā 
Stream Processing
Stream ProcessingStream Processing
Stream Processing
Ā 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDK
Ā 
NVidia CUDA for Bruteforce Attacks - DefCamp 2012
NVidia CUDA for Bruteforce Attacks - DefCamp 2012NVidia CUDA for Bruteforce Attacks - DefCamp 2012
NVidia CUDA for Bruteforce Attacks - DefCamp 2012
Ā 
GPU databases - How to use them and what the future holds
GPU databases - How to use them and what the future holdsGPU databases - How to use them and what the future holds
GPU databases - How to use them and what the future holds
Ā 
SCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER TutorialSCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
Ā 
OpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC SystemsOpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC Systems
Ā 
Gpgpu intro
Gpgpu introGpgpu intro
Gpgpu intro
Ā 

More from Domino Data Lab

What's in your workflow? Bringing data science workflows to business analysis...
What's in your workflow? Bringing data science workflows to business analysis...What's in your workflow? Bringing data science workflows to business analysis...
What's in your workflow? Bringing data science workflows to business analysis...Domino Data Lab
Ā 
The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...Domino Data Lab
Ā 
Racial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops dataRacial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops dataDomino Data Lab
Ā 
Data Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using itData Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using itDomino Data Lab
Ā 
Supporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentationSupporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentationDomino Data Lab
Ā 
Leveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive IndustryLeveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive IndustryDomino Data Lab
Ā 
Summertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile VirusSummertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile VirusDomino Data Lab
Ā 
Reproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with JupyterReproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with JupyterDomino Data Lab
Ā 
GeoViz: A Canvas for Data Science
GeoViz: A Canvas for Data ScienceGeoViz: A Canvas for Data Science
GeoViz: A Canvas for Data ScienceDomino Data Lab
Ā 
Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field Domino Data Lab
Ā 
Doing your first Kaggle (Python for Big Data sets)
Doing your first Kaggle (Python for Big Data sets)Doing your first Kaggle (Python for Big Data sets)
Doing your first Kaggle (Python for Big Data sets)Domino Data Lab
Ā 
Leveraged Analytics at Scale
Leveraged Analytics at ScaleLeveraged Analytics at Scale
Leveraged Analytics at ScaleDomino Data Lab
Ā 
How I Learned to Stop Worrying and Love Linked Data
How I Learned to Stop Worrying and Love Linked DataHow I Learned to Stop Worrying and Love Linked Data
How I Learned to Stop Worrying and Love Linked DataDomino Data Lab
Ā 
Software Engineering for Data Scientists
Software Engineering for Data ScientistsSoftware Engineering for Data Scientists
Software Engineering for Data ScientistsDomino Data Lab
Ā 
Making Big Data Smart
Making Big Data SmartMaking Big Data Smart
Making Big Data SmartDomino Data Lab
Ā 
Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...Domino Data Lab
Ā 
Building Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technologyBuilding Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technologyDomino Data Lab
Ā 
Leveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science ToolsLeveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science ToolsDomino Data Lab
Ā 
Domino and AWS: collaborative analytics and model governance at financial ser...
Domino and AWS: collaborative analytics and model governance at financial ser...Domino and AWS: collaborative analytics and model governance at financial ser...
Domino and AWS: collaborative analytics and model governance at financial ser...Domino Data Lab
Ā 
The Role and Importance of Curiosity in Data Science
The Role and Importance of Curiosity in Data ScienceThe Role and Importance of Curiosity in Data Science
The Role and Importance of Curiosity in Data ScienceDomino Data Lab
Ā 

More from Domino Data Lab (20)

What's in your workflow? Bringing data science workflows to business analysis...
What's in your workflow? Bringing data science workflows to business analysis...What's in your workflow? Bringing data science workflows to business analysis...
What's in your workflow? Bringing data science workflows to business analysis...
Ā 
The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...
Ā 
Racial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops dataRacial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops data
Ā 
Data Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using itData Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using it
Ā 
Supporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentationSupporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentation
Ā 
Leveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive IndustryLeveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive Industry
Ā 
Summertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile VirusSummertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile Virus
Ā 
Reproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with JupyterReproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with Jupyter
Ā 
GeoViz: A Canvas for Data Science
GeoViz: A Canvas for Data ScienceGeoViz: A Canvas for Data Science
GeoViz: A Canvas for Data Science
Ā 
Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field
Ā 
Doing your first Kaggle (Python for Big Data sets)
Doing your first Kaggle (Python for Big Data sets)Doing your first Kaggle (Python for Big Data sets)
Doing your first Kaggle (Python for Big Data sets)
Ā 
Leveraged Analytics at Scale
Leveraged Analytics at ScaleLeveraged Analytics at Scale
Leveraged Analytics at Scale
Ā 
How I Learned to Stop Worrying and Love Linked Data
How I Learned to Stop Worrying and Love Linked DataHow I Learned to Stop Worrying and Love Linked Data
How I Learned to Stop Worrying and Love Linked Data
Ā 
Software Engineering for Data Scientists
Software Engineering for Data ScientistsSoftware Engineering for Data Scientists
Software Engineering for Data Scientists
Ā 
Making Big Data Smart
Making Big Data SmartMaking Big Data Smart
Making Big Data Smart
Ā 
Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...
Ā 
Building Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technologyBuilding Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technology
Ā 
Leveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science ToolsLeveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science Tools
Ā 
Domino and AWS: collaborative analytics and model governance at financial ser...
Domino and AWS: collaborative analytics and model governance at financial ser...Domino and AWS: collaborative analytics and model governance at financial ser...
Domino and AWS: collaborative analytics and model governance at financial ser...
Ā 
The Role and Importance of Curiosity in Data Science
The Role and Importance of Curiosity in Data ScienceThe Role and Importance of Curiosity in Data Science
The Role and Importance of Curiosity in Data Science
Ā 

Recently uploaded

TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxDwiAyuSitiHartinah
Ā 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Vladislav Solodkiy
Ā 
AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)Data & Analytics Magazin
Ā 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructuresonikadigital1
Ā 
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityAggregage
Ā 
The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerPavel Å abatka
Ā 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?sonikadigital1
Ā 
YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.JasonViviers2
Ā 
MEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .pptMEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .pptaigil2
Ā 
SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024Becky Burwell
Ā 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introductionsanjaymuralee1
Ā 
Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxVenkatasubramani13
Ā 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationGiorgio Carbone
Ā 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...PrithaVashisht1
Ā 
CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionajayrajaganeshkayala
Ā 
5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best PracticesDataArchiva
Ā 
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Guido X Jansen
Ā 

Recently uploaded (17)

TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
Ā 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023
Ā 
AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)
Ā 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructure
Ā 
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Ā 
The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayer
Ā 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?
Ā 
YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.
Ā 
MEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .pptMEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .ppt
Ā 
SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024
Ā 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introduction
Ā 
Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptx
Ā 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - Presentation
Ā 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...
Ā 
CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual intervention
Ā 
5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices
Ā 
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Ā 

GPU Computing for Data Science

  • 1. GPU Computing for Data Science John Joo john.joo@dominodatalab.com Data Science Evangelist @ Domino Data Lab
  • 2. Outline ā€¢ Why use GPUs? ā€¢ Example applications in data science ā€¢ Programming your GPU
  • 3. Case Study: Monte Carlo Simulations ā€¢ Simulate behavior when randomness is a key component ā€¢ Average the results of many simulations ā€¢ Make predictions
  • 4. Little Information in One ā€œNoisy Simulationā€ Price(t+1) = Price(t) e InterestRateā€¢dt + noise
  • 5. Many ā€œNoisy Simulationsā€ āž” Actionable Information Price(t+1) = Price(t) e InterestRateā€¢dt + noise
  • 6. Monte Carlo Simulations Are Often Slow ā€¢ Lots of simulation data is required to create valid models ā€¢ Generating lots of data takes time ā€¢ CPU works sequentially
  • 7. CPUs designed for sequential, complex tasks Source: Mythbusters https://youtu.be/-P28LKWTzrI
  • 8. GPUs designed for parallel, low level tasks Source: Mythbusters https://youtu.be/-P28LKWTzrI
  • 9. GPUs designed for parallel, low level tasks Source: Mythbusters https://youtu.be/-P28LKWTzrI
  • 10. Applications of GPU Computing in Data Science ā€¢ Matrix Manipulation ā€¢ Numerical Analysis ā€¢ Sorting ā€¢ FFT ā€¢ String matching ā€¢ Monte Carlo simulations ā€¢ Machine learning ā€¢ Search Algorithms for GPU Acceleration ā€¢ Inherently parallel ā€¢ Matrix operations ā€¢ High FLoat-point Operations Per Sec (FLOPS)
  • 11. GPUs Make Deep Learning Accessible Google Datacenter Stanford AI Lab # of machines 1,000 3 # of CPUs or GPUs 2,000 CPUs 12 GPUs Cores 16,000 18,432 Power used 600 kW 4 kW Cost $5,000,000 $33,000 Adam Coates, Brody Huval,Tao Wang, David Wu, Bryan Catanzaro, Ng Andrew ; JMLR W&CP 28 (3) : 1337ā€“1345, 2013
  • 12. CPU vs GPU Architecture: Structured for Different Purposes CPU 4-8 High Performance Cores GPU 100s-1000s of bare bones cores
  • 13. Both CPU and GPU are required CPU GPU Compute intensive functions Everything else General Purpose GPU Computing (GPGPU) Heterogeneous Computing
  • 14. Getting Started: Hardware ā€¢ Need a computer with GPU ā€¢ GPU should not be operating your display Spin up a GPU/CPU computer with 1 click. 8 CPU cores, 15 GB RAM 1,536 GPU cores, 4GB RAM
  • 16. Programming CPU ā€¢ Sequential ā€¢ Write code top to bottom ā€¢ Can do complex tasks ā€¢ Independent Programming GPU ā€¢ Parallel ā€¢ Multi-threaded - race conditions ā€¢ Low level tasks ā€¢ Dependent on CPU Getting Started: Software
  • 17. Talking to your GPU CUDA and OpenCL are GPU computing frameworks
  • 18. Choosing How to Interface with GPU: Simplicity vs Flexibility Application speciļ¬c libraries General purpose GPU libraries Custom CUDA/ OpenCL code Flexibility Simplicity Low Low High High
  • 19. Application Speciļ¬c Libraries Python ā€¢ Theano - Symbolic math ā€¢ TensorFlow - ML ā€¢ Lasagne - NN ā€¢ Pylearn2 - ML ā€¢ mxnet - NN ā€¢ ABSsysbio - Systems Bio R ā€¢ cudaBayesreg - fMRI ā€¢ mxnet - NN ā€¢ rpud -SVM ā€¢ rgpu - bioinformatics Tutorial on using Theano, Lasagne, and no-learn: http://blog.dominodatalab.com/gpu-computing-and-deep-learning/
  • 20. General Purpose GPU Libraries ā€¢ Python and R wrappers for basic matrix and linear algebra operations ā€¢ scikit-cuda ā€¢ cudamat ā€¢ gputools ā€¢ HiPLARM ā€¢ Drop-in library
  • 21. Drop-in Library Credit: NVIDIA Also works for Python! http://scelementary.com/2015/04/09/nvidia-nvblas-in-numpy.html
  • 22. Custom CUDA/OpenCL Code 1. Allocate memory on the GPU 2. Transfer data from CPU to GPU 3. Launch the kernel to operate on the CPU cores 4. Transfer results back to CPU
  • 23. Example of using Python and CUDA: Monte Carlo Simulations ā€¢ Using PyCuda to interface Python and CUDA ā€¢ Simulating 3 million paths, 100 time steps each
  • 24. Python Code for CPU Python/PyCUDA Code for GPU 8 more lines of code
  • 25. Python Code for CPU Python/PyCUDA Code for CPU 1. Allocate memory on the GPU
  • 26. Python Code for CPU Python/PyCUDA Code for CPU 2. Transfer data from CPU to GPU
  • 27. Python Code for CPU Python/PyCUDA Code for CPU 3. Launch the kernel to operate on the CPU cores
  • 28. Python Code for CPU Python/PyCUDA Code for CPU 4. Transfer results back to CPU
  • 29. Python Code for CPU 26 sec Python/PyCUDA Code for CPU 8 more lines of code 1.5 sec 17x speed up
  • 30. Some sample Jupyter notebooks ā€¢ https://app.dominodatalab.com/johnjoo/gpu_examples ā€¢ Monte Carlo example using PyCUDA ā€¢ PyCUDA example compiling CUDA C for kernel instructions ā€¢ Scikit-cuda example of matrix multiplication ā€¢ Calculating a distance matrix using rpud
  • 31. More resources ā€¢ NVIDIA ā€¢ https://developer.nvidia.com/how-to-cuda-python ā€¢ Berkeley GPU workshop ā€¢ http://www.stat.berkeley.edu/scf/paciorek- gpuWorkshop.html ā€¢ Duke Statistics on GPU (Python) ā€¢ http://people.duke.edu/~ccc14/sta-663/ CUDAPython.html ā€¢ Andreas Klocknerā€™s webpage (Python) ā€¢ http://mathema.tician.de/ ā€¢ Summary of GPU libraries ā€¢ http://fastml.com/running-things-on-a-gpu/
  • 32. More resources ā€¢ Walk through of CUDA programming in R ā€¢ http://blog.revolutionanalytics.com/2015/01/parallel- programming-with-gpus-and-r.html ā€¢ List of libraries for GPU computing in R ā€¢ https://cran.r-project.org/web/views/ HighPerformanceComputing.html ā€¢ Matrix computations in Machine Learning ā€¢ http://numml.kyb.tuebingen.mpg.de/numl09/ talk_dhillon.pdf