OPEN RESEARCH, INC.
QIMING HE*
NASA-GODDARD SPACE FLIGHT CENTER (GSFC)
SHUJIA ZHOU, BEN KOBLER, DAN DUFFY, TOM MCGLYNN
Case Study for Running HPC
Applications in Public Clouds
Motivation: HPC-in-the-cloud
HPC: MPI-based parallel computing programs on
distributed-memory systems.
Public clouds:
Virtualized computing resources
Pay-as-you-go pricing model.
Technically feasible?
Deploy and execute any-size app in any cloud?
Satisfactory performance vs. in-house HPC systems?
“HPC-friendly” clouds?
“HPC-friendly” application type?
Economically feasible?
Cloud server costs vs. local costs of personnel, facilities, energy, …
Previous Work
Ed Walker, “Benchmarking Amazon EC2 for high-performance scientific computing”, 2008.
NPB Benchmark on EC2 vs. NCSA
5-20% performance gap for NPB-OMP.
10%-1000% performance gap for NPB-MPI.
Jeff Napper et al., “Can cloud computing reach the top500?”, 2009.
LINPACK/HPL Benchmark on EC2
1) Does not scale as problem size and cluster size increase.
2) Not economical in terms of FLOPS/$.
Why EC2 only?
Our Approach
Benchmarks & Apps: NPB, LINPACK/HPL, full-size application
Public Clouds: EC2, GoGrid, IBM
Conclusions & Insights?
Benchmarks and Applications
NPB (from NASA)
LINPACK/HPL (used by TOP500)
CSFV (NASA weather prediction app based on the Cubed-Sphere Finite-Volume Core)
Benchmark | Name derived from | Description
BT | Block Tridiagonal | Solve a synthetic system of nonlinear PDEs using a block tridiagonal algorithm
CG | Conjugate Gradient | Estimate the largest eigenvalue of a sparse symmetric positive-definite matrix using inverse iteration, with the conjugate gradient method as a subroutine for solving systems of linear equations
EP | Embarrassingly Parallel | Generate independent Gaussian random variates using the Marsaglia polar method
FT | Fast Fourier Transform | Solve a three-dimensional partial differential equation (PDE) using the fast Fourier transform (FFT)
IS | Integer Sort | Sort small integers using bucket sort
LU | Lower-Upper symmetric Gauss-Seidel | Solve a synthetic system of nonlinear PDEs using the LU-SSOR algorithm
MG | MultiGrid | Approximate the solution to a three-dimensional discrete Poisson equation using the V-cycle multigrid method
SP | Scalar Pentadiagonal | Solve a synthetic system of nonlinear PDEs using a scalar pentadiagonal algorithm
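The EP row names the Marsaglia polar method. A minimal Python sketch of that method is given here for orientation only; it is not the NPB reference code, which uses its own random-number generator and tallies, and the function names `marsaglia_polar` and `gaussian_batch` are illustrative assumptions.

```python
import math
import random


def marsaglia_polar(rng):
    """Return one pair of independent standard normal variates."""
    while True:
        # Draw a point uniformly in the square [-1, 1) x [-1, 1).
        u = 2.0 * rng.random() - 1.0
        v = 2.0 * rng.random() - 1.0
        s = u * u + v * v
        # Accept only points strictly inside the unit circle (and not the origin).
        if 0.0 < s < 1.0:
            factor = math.sqrt(-2.0 * math.log(s) / s)
            return u * factor, v * factor


def gaussian_batch(seed, pairs):
    """Generate 2 * pairs variates from a seeded generator (hypothetical helper)."""
    rng = random.Random(seed)
    out = []
    for _ in range(pairs):
        out.extend(marsaglia_polar(rng))
    return out
```

Each batch depends only on its own seed, so batches can be produced on separate nodes with no communication until a final reduction; that independence is what makes this workload "embarrassingly parallel".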
Public Clouds Hardware Specifications
EC2 (Xeon E5345 @ 2.33GHz)
GoGrid (Xeon E5459 @ 3GHz)
IBM (Nehalem X5570 @ 2.93GHz)
Now 8 cores
NPB-OMP
NPB OpenMP Benchmark (No MPI over network)
To evaluate virtualization overhead per node
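One common way to express per-node overhead is the relative slowdown of the cloud run against a non-virtualized baseline. The sketch below assumes that definition, which the slides do not spell out; the function name `slowdown_percent` is hypothetical.

```python
def slowdown_percent(cloud_seconds, baseline_seconds):
    """Relative slowdown of the cloud run versus the baseline, in percent."""
    if baseline_seconds <= 0:
        raise ValueError("baseline_seconds must be positive")
    return 100.0 * (cloud_seconds - baseline_seconds) / baseline_seconds
```

For example, a kernel taking 120 s on a cloud node against 100 s on the baseline gives a 20% gap under this definition (illustrative timings, not measurements from the study).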
Cloud Networking
Latency and Throughput (MPI Messages)
IBM (100Mbps): consistently slow
EC2 (1000Mbps?): inconsistently fast, multi-hop IP packets
GoGrid (1000Mbps): consistently fast
But all three are orders of magnitude slower than NCSA (with
InfiniBand)
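To see why link speed dominates for large MPI messages, a first-order cost model (fixed latency plus size divided by bandwidth) can be sketched as below. This is a simplification under stated assumptions: the 100 and 1000 Mbps figures come from the slide, while the function name and the zero-latency setting are placeholders, not measurements.

```python
def transfer_time_s(message_bytes, latency_s, bandwidth_mbps):
    """Estimate one-way message time as latency + size / bandwidth."""
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    bits = message_bytes * 8
    return latency_s + bits / (bandwidth_mbps * 1_000_000)


# A 1 MB message with latency ignored (placeholder value of 0.0 s).
slow = transfer_time_s(1_000_000, 0.0, 100)    # 100 Mbps link
fast = transfer_time_s(1_000_000, 0.0, 1000)   # 1000 Mbps link
```

Under this model the 100 Mbps link needs about ten times longer than the 1000 Mbps link for the same large message. For small messages the latency term dominates instead, so multi-hop routing can hurt even on a fast link.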
NPB-MPI
Re-run Ed Walker’s test cases in three clouds
Better network, better performance (in line with earlier researchers’ findings)
GoGrid could do better if it had 8-core servers.
HPL
Re-run Jeff Napper’s test cases in three clouds
(from Napper’s paper)
CSFV
Full-size app (with 110K lines of Fortran/MPI code).
Compile- and run-time dependencies, e.g., NETCDF.
Over-subscription helps (a little).
Best-case scenario: underperforms the NASA system by only 20%
Other Observations
Programming paradigm matters
MPI, OpenMP, MPI+OpenMP
Embarrassingly Parallel (EP) apps work best.
EP represents a wide spectrum of apps.
Commercial vs. Open source software in the cloud
No performance difference observed.
Cloud economics
FLOPS/$, FLOPS/watt,…
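As a rough illustration of the FLOPS/$ metric, the sketch below divides sustained GFLOPS by the hourly rental cost of the cluster. This particular definition, the function name and the numbers in the usage note are assumptions for illustration, not figures from the study.

```python
def gflops_per_dollar(sustained_gflops, nodes, price_per_node_hour):
    """Sustained GFLOPS divided by the cluster's cost per hour."""
    hourly_cost = nodes * price_per_node_hour
    if hourly_cost <= 0:
        raise ValueError("hourly cost must be positive")
    return sustained_gflops / hourly_cost
```

With hypothetical values of 80 sustained GFLOPS on 4 nodes at $0.50 per node-hour, the cluster costs $2.00 per hour and the metric is 40 GFLOPS/$. If sustained performance grows more slowly than node count, this ratio falls as the cluster scales.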
Future Work
Benchmarks with larger size
Re-do GoGrid with 8-core server
More public clouds
Terremark’s vCloud Express (VMware-based)
Penguin Computing’s HPC-as-a-Service™
IBM upgrades to (at least) 1GbE network soon?
Private Clouds
NASA Nebula
DOE Magellan
Q&A