The document discusses using GPUs on Google Cloud Platform to accelerate compute-intensive workloads. It describes how GPUs can provide significant performance gains for machine learning, high performance computing, and visualization workloads. It gives examples of customers such as Schlumberger, which uses GPUs on GCP for oil exploration, and Shazam, which uses them for music fingerprinting. The document also highlights the flexibility, scalability, and cost benefits of using GPUs on Google Cloud Platform.
3. Compute Workloads Powered by Accelerators
● Machine Learning Training
● Machine Learning Inference
● High Performance Computing
● Visualization
Proprietary + Confidential
4. Computing with GPUs
● Machine Learning Training and Inference - TensorFlow
● Frame Rendering and Image Composition - V-Ray by Chaos Group
● Physical Simulation and Analysis (CFD, FEM, Structural Mechanics)
● Real-time Visual Analytics and SQL Database - MapD
● FFT-based 3D Protein Docking - MEGADOCK
● Faster-than-real-time 4K Video Transcoding - Colorfront Transkoder
● Open-source Video Transcoding - FFmpeg, libav
● Open-source Sequence Mapping/Alignment - BarraCUDA
● Subsurface Analysis for the Oil & Gas Industry - Reverse Time Migration
● Risk Management and Derivatives Pricing - Computational Finance
Workloads that require compute-intensive processing of massive amounts of data can benefit from the parallel architecture of the GPU.
5. What is GPU Accelerated Computing?
Application code is split between the two processors: the compute-intensive portions of the application run on the GPU, which is optimized for parallel throughput, while the rest of the sequential code runs on the CPU, which is optimized for sequential execution.
11. Google Cloud Advantages over Colocation Hosting
● One of the best infrastructure offerings, with system-level integration
● Faster provisioning with scaled nodes, storage & networking
● Lower capital expenditure with on-demand pricing
● Security built in, with data access & user controls
12. Google Cloud Leadership Across All Workloads
Hardware: Intel Skylake CPU, NVIDIA P100 GPU
Workloads: Machine Learning Training, Machine Learning Inference, High Performance Computing, Visualization*
14. Google Cloud End-to-End AI Platform
● Enterprise Applications - industry use cases with in-loop inferencing for trained models
● Cloud AI Products - pre-trained ML APIs and tools for building custom ML models
● ML Framework - industry-standard and widely adopted
● Infrastructure - CPU, GPU, and TPU: one of the best processor lineups for ML/DL
Workloads served: Machine Learning Training & Inference, HPC, Visualization*
*coming soon
16. New! GPUs on Google Cloud
Google Cloud has the newest NVIDIA GPUs, usable in GCE, Cloud ML, and GKE (alpha).
● NVIDIA Tesla P100 - beta launch
● NVIDIA Tesla K80 - general availability
17. Currently available accelerators

                          NVIDIA Tesla K80            NVIDIA Tesla P100
Workloads                 Compute                     Compute and visualization
Number of GPUs/instance   1, 2, 4, or 8               1, 2, or 4
Local SSD                 Up to 3 TB                  Up to 3 TB
Max vCPU/RAM              64 / 416 GB                 64 / 208 GB
Available in zones        us-east1, us-west1,         us-east1, us-west1,
                          europe-west1, asia-east1    us-central1, europe-west1,
                                                      asia-east1
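Zone availability changes over time, so it is worth checking programmatically rather than relying on a static table. A minimal sketch with the gcloud CLI (the zone name in the filter is illustrative):

```shell
# List every GPU accelerator type and the zones where it is offered.
gcloud compute accelerator-types list

# Narrow the listing to a single zone of interest.
gcloud compute accelerator-types list --filter="zone:us-east1-d"
```

The output reflects current availability for your project, which may differ from the K80/P100 zones listed on this slide.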
19. Cloud GPUs - State of the Art Technology & Pricing
● Introducing sustained use discounts on Google Cloud for both NVIDIA K80 and P100 GPUs
● The NVIDIA Tesla P100 is the state of the art in GPU technology. Based on the Pascal GPU architecture, it increases overall throughput with fewer instances while saving money
● P100 GPUs can accelerate workloads by up to 10x compared to the K80¹

¹ Source: The 10x performance boost compares 1 P100 GPU versus 1 K80 GPU (½ of a K80 board) for machine learning inference workloads that benefit from the P100's FP16 precision. Performance will vary by workload. Download this datasheet for more information.
20. New! NVIDIA Tesla P100 GPU on Google Cloud
Tesla P100 performance (vs. Tesla K80):
● ML Training: 4X (4X P100 vs. 4X K80; GoogLeNet, TensorFlow)
● ML Inference: 10X (1X P100 vs. ½X K80; ResNet-50, TensorRT)
● HPC: varies by application (~3X faster for audio (Shazam); ~3X for life science (VASP))
Source: NVIDIA
21. Advantages of Using GPUs in Google Cloud
GPUs in the cloud optimize time and cost:
● Speed up complex compute jobs: the breadth of GPU capability accelerates compute-intensive jobs such as building ML & DL models
● One of the best interactive graphics experiences with remote workstations
● No capital investment
● Custom machine types: configure an instance with exactly the number of CPUs, GPUs, memory, and local SSD that you need for your workloads
● Lower TCO with per-minute pricing: choose the GPU that best suits your needs and pay only for what you use
22. Cloud GPU - Key Features
● Bare metal performance: GPUs are offered in passthrough mode to provide bare metal performance.
● Flexible GPU counts per instance: attach up to 4 NVIDIA P100 or 8 NVIDIA K80 GPUs to your instance to get the power that you need for your applications.
● Attach GPUs to any machine type: mix and match different GCP compute resources, such as vCPUs, memory, local SSD, GPUs, and persistent disk, to suit the needs of your workloads.
● Per-minute billing: get the same per-minute billing and sustained-use discounts for GPUs that you do for the rest of Google Cloud Platform's resources. Pay only for what you need!
● GPU application frameworks: whether your applications require CUDA or OpenCL, Compute Engine provides the hardware that you need to accelerate your workloads.
● Benefits of Google Cloud: run GPU workloads on Google Cloud Platform with access to industry-leading storage, networking, and data analytics technologies.
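The mix-and-match flexibility described in these features can be sketched with a single gcloud invocation; the instance name, zone, and resource sizes below are illustrative, not prescriptive:

```shell
# Create a VM with a custom vCPU/memory shape, one attached K80 GPU,
# and a local SSD (names and sizes are illustrative).
gcloud compute instances create my-gpu-vm \
    --zone us-east1-d \
    --custom-cpu 24 \
    --custom-memory 96GB \
    --accelerator type=nvidia-tesla-k80,count=1 \
    --local-ssd interface=nvme \
    --maintenance-policy TERMINATE \
    --restart-on-failure
```

Note that GPU instances must use --maintenance-policy TERMINATE, since GPU-attached VMs cannot live-migrate during host maintenance.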
23. Up to 8 NVIDIA Tesla GPUs per Virtual Machine
On any VM shape with at least 1 vCPU, you can attach 1, 2, 4, or 8 GPUs (NVIDIA Tesla K80 or NVIDIA Tesla P100) along with up to 3 TB of Local SSD.
24. Provision a GPU instance using the console
https://console.cloud.google.com/
29. Provisioning a GPU instance with the gcloud CLI

gcloud beta compute instances create gpu-instance-1 \
    --machine-type n1-standard-16 \
    --zone asia-east1-a \
    --accelerator type=nvidia-tesla-k80,count=2 \
    --image-family ubuntu-1604-lts \
    --image-project ubuntu-os-cloud \
    --maintenance-policy TERMINATE \
    --restart-on-failure \
    --metadata startup-script='#!/bin/bash
      echo "Checking for CUDA and installing."
      # Check for CUDA and try to install.
      if ! dpkg-query -W cuda; then
        curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
        dpkg -i ./cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
        apt-get update
        apt-get install cuda -y
      fi'
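Once the instance created above is up and the startup script has finished, the driver install can be checked over SSH; nvidia-smi is installed as part of the CUDA package, and the instance name and zone match the provisioning example:

```shell
# SSH into the new instance and confirm the attached GPUs are visible.
gcloud compute ssh gpu-instance-1 --zone asia-east1-a \
    --command "nvidia-smi"
```

A successful install lists the two attached Tesla K80 GPUs along with driver and utilization details; an error here usually means the startup script is still running or failed.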
30. Dramatic Value for ML Training
[Chart: number of instances, annual cost, and annual savings for 4X K80 vs. 8X P100 configurations]
● Exponential performance gains leading to cost savings: the P100 uses only a quarter of the training time
● ResNet50 training for 60 epochs on a 1.3M-image dataset: 19 hours on 8X P100 vs. 74 hours on 4X K80
Workload:
● Data scientists iterating to refine AI models, e.g. models for speech recognition, recommendation engines, etc.
● Scenario: 5 data scientists, each training 10 networks per week using ResNet50 on a 1M-image dataset
Source: NVIDIA
31. Dramatic Value for ML Inference
[Chart: throughput (images/sec), number of instances, annual cost, and annual savings for K80 (½X GPU) vs. P100 (1X GPU) cloud instances]
● Exponential performance gains leading to cost savings: 13X more throughput with P100 (ResNet-50)
● Achieve the responsiveness required for a great user experience
Workload:
● Running inference networks to serve customers, e.g. serving recommendations, responding to speech queries, etc.
● Scenario: 10 dataset models, with 30K images inferenced per second using ResNet50
Source: NVIDIA
32. GPU Use Case #1: NVIDIA GPU-Accelerated HPC on GCP
Schlumberger collects hundreds of terabytes of acoustic and seismic data daily using sensors that can look tens of thousands of feet deep.
Examples:
● NVIDIA GPUs deliver an order of magnitude faster performance, dramatically reducing time to insight
● GCP provides the flexibility to spin up instances with high GPU-to-CPU ratios
● GPUs in the cloud help process data faster and save money
33. Data volumes in the oilfield

Type of Oilfield Data                               Volume
Seismic data - onshore / land oilfields             6-10 TB/day x 6-12 month duration x 10s of projects/yr
Seismic data - offshore / marine oilfields          5-10 PB/project x 10s of projects/yr
Well data - drilling, measurements, testing, ...    100s of GB/day x 1000s of wells/yr
Sub-surface model data                              50-100 GB/project x 1000s of projects/yr
Production data - pumps, pipelines, networks, ...   5 MB/day/well x 10,000s of wells/yr
34. Seismic exploration in a nutshell
● Acquire data: indirect measurements about rocks that illuminate the subsurface
● Build subsurface understanding: images, interpretations, and geological models
● Large datasets recorded and generated
  ○ One job's input: 20-100 TB
● Compute- and I/O-intensive, concurrent jobs
  ○ One small job uses hundreds of nodes for several days (100K core-hours)
  ○ A large job uses thousands of nodes for 1-2 weeks (3-5M core-hours)
35. On the Google Cloud Platform
● GPUs: provide great performance per dollar; nodes with up to 8 GPUs minimize inter-node communication during processing
● Computation: instances dynamically created using Deployment Manager; VMs sized per application for the best combination of I/O bandwidth and CPU power
● File systems: persistent disks for scratch, Gluster on GCE, and GCS
● Monitoring and control: monitoring done with internal tools and Stackdriver; control messages sent using Pub/Sub
● Cloud Interconnect: high-speed link for data transfers before jobs
● Security: secure access, secure data, secure transmission
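The Pub/Sub control-message pattern mentioned above can be sketched with the gcloud CLI; the topic, subscription, and message names are illustrative, not from the Schlumberger deployment:

```shell
# Create a topic for job-control messages and publish a command to it.
gcloud pubsub topics create job-control
gcloud pubsub topics publish job-control --message "start-migration-job"

# Worker nodes pull control messages from a subscription on that topic.
gcloud pubsub subscriptions create job-control-sub --topic job-control
gcloud pubsub subscriptions pull job-control-sub --auto-ack
```

Decoupling control messages from the compute fleet this way lets workers be created and destroyed dynamically without losing commands.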
36. GPU Use Case #2: ML at Scale
Examples:
● The Shazam app creates a digital fingerprint of the audio and, within seconds, matches it against Shazam's database of millions of tracks
● Machines spin up and down quickly, so scaling up and down addresses demand spikes
● Indexing went from daily to hourly
● Dynamic cluster configurations allow the cloud infrastructure to be reconfigured depending on the time of day
● 20 million customer requests per day

"On Google Cloud, we can recompile the index and reimage the GPU instance in well under an hour, so the index files are always up-to-date."