TransAtlantic networking
using Cloud links
Igor Sfiligoi
UC San Diego – San Diego Supercomputer Center
The problem
• ESNet TransAtlantic
capacity expected
to not keep up with
WLCG needs
• 400 Gbps now
• Not expected to
exceed 600 Gbps
anytime soon
What can the WLCG community do?
• Drastically reduce network traffic
• Find alternatives
Can Cloud networking help?
There is plenty of Network capacity in the Clouds
• Presented during CHEP’19
• Measured TransAtlantic network bandwidth of
• Google (GCP): 1 Tbps (1060 Gbps)
• Amazon (AWS): 450 Gbps
• Microsoft (Azure): 190 Gbps
• And this was without any special arrangements
with the Cloud providers
• And without trying to hit the limit
• AWS comparable to ESNet today
• GCP already higher than projected ESNet capacity
And I just scratched the surface
• PR numbers are 100x as large!
https://www.theregister.co.uk/2020/02/18/orange_telxius_google_transatlantic_cable/
Of course, we need to get data to/from on-prem
• Around 100 Gbps pretty easy to reach
• Demonstrated for all three Cloud providers
• For the purpose of this talk, I will assume
we can get as high as we need with modest effort (on same continent)
Fetching data from California (PRP):
• AWS West: 100 Gbps
• AWS Central: 90 Gbps
• GCP West: 100 Gbps
• GCP Central: 100 Gbps
• Azure West: 120 Gbps
• Azure S. Central: 120 Gbps
Cost the only real constraint
• Cloud networking is not cheap
• Cloud providers are in business to make money
• More details on the next few slides
• ESNet still absolutely needed for base-load
• But can we afford the Cloud prices
for occasional bursts when needed?
• Just like we do with Computing?
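The base-load versus burst idea can be illustrated with a small sketch. The 400 Gbps figure is the ESNet capacity quoted earlier; the 450 Gbps demand and the function name `split_demand` are assumptions made up for this illustration, not measurements.

```python
from typing import Tuple


def split_demand(demand_gbps: float, base_capacity_gbps: float) -> Tuple[float, float]:
    """Split a bandwidth demand into (base-load, burst) parts.

    The base-load part stays on the dedicated link; anything above
    its capacity is the part that would have to go through the Cloud.
    """
    base = min(demand_gbps, base_capacity_gbps)
    burst = max(0.0, demand_gbps - base_capacity_gbps)
    return base, burst


# 400 Gbps: ESNet TransAtlantic capacity from the slides.
# 450 Gbps: illustrative demand, an assumption for this example.
base, burst = split_demand(450.0, 400.0)
print(f"base-load: {base:.0f} Gbps, Cloud burst: {burst:.0f} Gbps")
```

Only the burst part would be billed at Cloud prices, which is why the cost question is about occasional peaks rather than total traffic.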
The Cloud networking cost
Measuring about 1TB of TransAtlantic traffic
• A simple transfer of about 1 TB of data from the US to the EU
• Using HTTP and a (couple of) squid(s) to force routing
• Resulting bill, on-prem to on-prem:
• AWS: $146 for ~1.3TB
• GCP: $141 for ~1.2TB
• Azure: $183 for ~1.1TB
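To compare the three bills, it helps to normalize them to a price per TB. The sketch below only divides the billed amounts by the approximate volumes listed above; since the volumes are rounded, the resulting rates are rough.

```python
# Billed amount (USD) and approximate transferred volume (TB),
# taken from the bills listed above.
bills = {
    "AWS": (146.0, 1.3),
    "GCP": (141.0, 1.2),
    "Azure": (183.0, 1.1),
}


def usd_per_tb(cost_usd: float, volume_tb: float) -> float:
    """Effective price per TB for one bill."""
    return cost_usd / volume_tb


rates = {name: usd_per_tb(cost, tb) for name, (cost, tb) in bills.items()}
for provider, rate in rates.items():
    print(f"{provider}: about ${rate:.0f} per TB")
```

All three land roughly between $110 and $170 per TB for this small, top-tier transfer.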
Tiered pricing
• My test was the worst-case scenario – Top tier
• Larger transfers will be charged at a lower rate
(No separate TransAtlantic charge)
• Really big transfers expected to happen
under specially negotiated prices
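A minimal sketch of how tiered pricing lowers the effective rate for larger transfers. The tier sizes and prices in `EXAMPLE_TIERS` are hypothetical values chosen for illustration; they are not any provider's actual price list.

```python
# HYPOTHETICAL tier table: (tier size in TB, price in USD per TB).
# These numbers are assumptions for illustration only.
EXAMPLE_TIERS = [
    (10.0, 100.0),          # first 10 TB at the top (most expensive) rate
    (40.0, 80.0),           # next 40 TB
    (float("inf"), 50.0),   # everything beyond that
]


def tiered_cost(volume_tb, tiers=EXAMPLE_TIERS):
    """Total cost when each slice of the volume is billed at its tier's rate."""
    remaining = volume_tb
    total = 0.0
    for size_tb, price_per_tb in tiers:
        if remaining <= 0:
            break
        billed = min(remaining, size_tb)
        total += billed * price_per_tb
        remaining -= billed
    return total


for volume in (1.0, 200.0):
    cost = tiered_cost(volume)
    print(f"{volume:.0f} TB: ${cost:.0f} total, ${cost / volume:.2f} per TB")
```

With these made-up tiers, a 1 TB transfer pays the full top-tier rate, while a 200 TB transfer averages well below it.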
Egress-to-on-prem price can be managed
• All Cloud providers have an option to peer at a lower price
• I am aware of an Internet2 setup with AWS (AWS Direct Connect)
Estimating large transfer cost: 200 TB/workday
• Let’s assume FNAL to CERN (or the other way around)
• 200 TB in 6h would average approx. 100 Gbps
• Using list pricing, this should cost approximately:
• AWS: $19k
• GCP: $16k
• Azure: $22k
• Assuming we can get AWS Direct Connect to CERN (and that I understand the pricing correctly):
• AWS: $7k
• I do not yet understand the peering pricing for the other providers
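The average rate for a given volume and time window can be checked with a short calculation. This sketch assumes decimal units (1 TB = 10^12 bytes) and a perfectly steady transfer. Under those assumptions 200 TB in 6 hours works out to about 74 Gbps, so the 100 Gbps quoted above may be best read as a round, order-of-magnitude figure.

```python
def average_gbps(volume_tb: float, hours: float) -> float:
    """Average throughput in Gbps for a steady transfer.

    Assumes decimal units: 1 TB = 1e12 bytes, 1 Gbps = 1e9 bits/s.
    """
    bits = volume_tb * 1e12 * 8
    seconds = hours * 3600
    return bits / seconds / 1e9


print(f"200 TB in 6 h: {average_gbps(200, 6):.0f} Gbps")
```

The same function can be applied to other volumes and time windows by changing the two arguments.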
Estimating large transfer cost: 4 PB/day
• Let’s assume FNAL to CERN (or the other way around)
• 4 PB in 24h would average approx. 450 Gbps
• Assuming we do not get a large discount, at list price:
• AWS: $280k
• GCP: $320k
• Azure: $400k
• Assuming we can get AWS Direct Connect to CERN (and that I understand the pricing correctly):
• AWS: $120k
• I do not yet understand the peering pricing for the other providers
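Dividing the estimates from the two scenarios by their volumes gives the implied price per TB, which makes the effect of volume easier to see. The sketch uses only the figures quoted in the two estimates and assumes 4 PB = 4000 TB; the label "AWS via peering" is shorthand introduced here for the reduced AWS estimate.

```python
# Estimated cost (USD) per scenario, keyed by transfer volume in TB.
# Figures are the estimates quoted above; 4 PB is taken as 4000 TB.
estimates = {
    "AWS": {200: 19_000, 4000: 280_000},
    "GCP": {200: 16_000, 4000: 320_000},
    "Azure": {200: 22_000, 4000: 400_000},
    "AWS via peering": {200: 7_000, 4000: 120_000},
}

# Implied price per TB for each provider and scenario.
implied = {
    provider: {tb: cost / tb for tb, cost in by_volume.items()}
    for provider, by_volume in estimates.items()
}

for provider, by_volume in implied.items():
    line = ", ".join(f"{tb} TB: ${rate:.0f}/TB" for tb, rate in by_volume.items())
    print(f"{provider}: {line}")
```

Since the inputs are rounded estimates, the per-TB values are indicative only.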
Summary
Cloud TransAtlantic networking is like Cloud compute
• Cloud providers have plenty of high-speed
TransAtlantic networking capacity
• But it comes at a cost
• Story very similar to what you see in Cloud computing
• Plenty of capacity, but at a cost
• I believe the same approach should be taken
• Use “on-prem” resources when possible, use Cloud for bursts
• For networking, the “on-prem” resource is the ESNet TransAtlantic link
Acknowledgments
• This work has been partially sponsored by NSF grants
OAC-1826967, OAC-1541349, OAC-1841530,
OAC-1836650, MPS-1148698, OAC-1941481,
OPP-1600823 and OAC-190444.
• I kindly thank Amazon, Microsoft and Google for providing
Cloud credits that covered most of the incurred Cloud costs

ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 

Recently uploaded (20)

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

TransAtlantic Networking using Cloud links

  • 1. TransAtlantic networking using Cloud links Igor Sfiligoi UC San Diego – San Diego Supercomputer Center
  • 2. The problem • ESNet TransAtlantic capacity expected to not keep up with WLCG needs • 400 Gbps now • Not expected to exceed 600 Gbps anytime soon
  • 3. What can WLCG community do? •Drastically reduce network traffic •Find alternatives
  • 6. There is plenty of Network capacity in the Clouds • Presented during CHEP’19 • Measured TransAtlantic network bandwidth of • Google (GCP): 1 Tbps (1060 Gbps) • Amazon (AWS): 450 Gbps • Microsoft (Azure): 190 Gbps • And this was without any special arrangements with the Cloud providers • And without trying to hit the limit • AWS comparable to ESNet today • GCP already higher than projected ESNet capacity
  • 7. And I just scratched the surface • PR numbers are 100x as large! https://www.theregister.co.uk/2020/02/18/orange_telxius_google_transatlantic_cable/
  • 8. Of course, we need to get data to/from on-prem • Around 100 Gbps pretty easy to reach • Demonstrated for all three Cloud providers • For the purpose of this talk, I will assume we can get as high as we need with modest effort (on same continent) • Fetching data from California (PRP): AWS West 100 Gbps, AWS Central 90 Gbps, GCP West 100 Gbps, GCP Central 100 Gbps, Azure West 120 Gbps, Azure S. Central 120 Gbps
  • 9. Cost the only real constraint • Cloud networking is not cheap • They are in the business for money • More details on the next few slides • ESNet still absolutely needed for base-load • But can we afford the Cloud prices for occasional bursts when needed? • Just like we do with Computing?
  • 11. Measuring about 1TB of TransAtlantic traffic • Pretty simple transfer of about 1 TB data from the US To the EU • Using HTTP and a (couple of) squid(s) to force routing • Resulting bill, on-prem to on-prem: • AWS: $146 for ~1.3TB • GCP: $141 for ~1.2TB • Azure: $183 for ~1.1TB
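The bills above can be turned into an effective per-GB price. The following is a minimal sketch, assuming decimal units (1 TB = 1000 GB) and assuming each bill is dominated by network charges; the names `MEASURED` and `effective_rate_per_gb` are introduced here for illustration only.

```python
# Bills (USD) and approximate volumes (TB) as reported on the slide,
# measured on-prem to on-prem.
MEASURED = {
    "AWS": (146.0, 1.3),
    "GCP": (141.0, 1.2),
    "Azure": (183.0, 1.1),
}


def effective_rate_per_gb(bill_usd: float, volume_tb: float) -> float:
    """Return the effective price in USD per GB, assuming 1 TB = 1000 GB."""
    return bill_usd / (volume_tb * 1000.0)


# Effective rate for each provider, keyed by provider name.
RATES = {
    name: effective_rate_per_gb(bill, volume)
    for name, (bill, volume) in MEASURED.items()
}

if __name__ == "__main__":
    for name, rate in RATES.items():
        print(f"{name}: about ${rate:.3f} per GB")
```

Because the volumes are only quoted to two significant figures, the resulting rates should be read as rough indications rather than exact prices.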
  • 12. Tiered pricing • My test was the worst-case scenario – Top tier • Larger transfers will be charged at a lower rate (No separate TransAtlantic charge)
  • 14. Tiered pricing • My test was the worst-case scenario – Top tier • Larger transfers will be charged at a lower rate • Really big transfers expected to happen under specially negotiated prices
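How tiered pricing lowers the average rate for larger transfers can be sketched as follows. The tier sizes and prices in `EXAMPLE_TIERS` are hypothetical placeholders chosen for illustration; they are not taken from any provider's price list or from the slides.

```python
from typing import List, Optional, Tuple

# Hypothetical tiers: (tier size in TB, or None for "everything above", USD per TB).
# Placeholder numbers for illustration only, not a real price list.
EXAMPLE_TIERS: List[Tuple[Optional[float], float]] = [
    (10.0, 120.0),   # first 10 TB
    (40.0, 100.0),   # next 40 TB
    (None, 80.0),    # everything beyond 50 TB
]


def tiered_cost(volume_tb: float,
                tiers: List[Tuple[Optional[float], float]]) -> float:
    """Charge each slice of the transfer at the rate of the tier it falls in."""
    remaining = volume_tb
    total = 0.0
    for size_tb, usd_per_tb in tiers:
        if remaining <= 0:
            break
        # An open-ended tier (size None) absorbs whatever volume is left.
        billed = remaining if size_tb is None else min(remaining, size_tb)
        total += billed * usd_per_tb
        remaining -= billed
    if remaining > 0:
        raise ValueError("volume exceeds the tiers provided")
    return total


if __name__ == "__main__":
    for volume in (1.0, 200.0):
        cost = tiered_cost(volume, EXAMPLE_TIERS)
        print(f"{volume:g} TB: ${cost:,.0f} total, ${cost / volume:.0f} per TB")
```

With these placeholder tiers, a 1 TB test is billed entirely at the top rate, while a 200 TB transfer averages a noticeably lower price per TB, which is the effect the slide describes.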
  • 15. Egress to on-prem price can be managed • All Cloud providers have an option to peer at a lower price • I am aware of an Internet2 setup with AWS (AWS Direct)
  • 16. Estimating large transfer cost: 200 TB/workday • Let’s assume FNAL to CERN (or the other way around) • 200 TB in 6h would average approx. 100 Gbps • Using the list pricing this should cost approximately • AWS: $19k • GCP: $16k • Azure: $22k • Assuming we can get AWS Direct to CERN (and that I understand the pricing right): • AWS: $7k • I do not yet understand the peering pricing for the other providers
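The average rate implied by a volume and a time window is simple arithmetic. This is a minimal sketch, assuming decimal units (1 TB = 10^12 bytes) and a perfectly steady transfer; the function name `average_gbps` is introduced here for illustration.

```python
def average_gbps(volume_tb: float, hours: float) -> float:
    """Average rate in Gbps needed to move volume_tb within the given hours.

    Assumes 1 TB = 1e12 bytes and a constant transfer rate.
    """
    bits = volume_tb * 1e12 * 8
    seconds = hours * 3600.0
    return bits / seconds / 1e9


if __name__ == "__main__":
    print(f"200 TB in 6 h: {average_gbps(200.0, 6.0):.0f} Gbps on average")
```

Under these assumptions the strict average comes out somewhat below the round 100 Gbps quoted on the slide, so the slide's figure is best read as an order-of-magnitude estimate. A real transfer is rarely perfectly steady, so the peak rate needed would be higher than the average.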
  • 17. Estimating large transfer cost: 4 PB/day • Let’s assume FNAL to CERN (or the other way around) • 4 PB in 24h would average approx. 450 Gbps • Assuming we do not get a large discount, list price: • AWS: $280k • GCP: $320k • Azure: $400k • Assuming we can get AWS Direct to CERN (and that I understand the pricing right): • AWS: $120k • I do not yet understand the peering pricing for the other providers
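The two estimates can be compared by dividing each quoted cost by the volume moved. This is a minimal sketch using only the figures from the two estimation slides, assuming 4 PB = 4000 TB; the names `ESTIMATES` and `usd_per_tb` are introduced here for illustration.

```python
# Volume (TB) and estimated cost (USD) per provider, as quoted on the slides.
# "AWS Direct" is the peering-based estimate for AWS.
ESTIMATES = {
    "200 TB/workday": (200.0, {
        "AWS": 19_000, "GCP": 16_000, "Azure": 22_000, "AWS Direct": 7_000,
    }),
    "4 PB/day": (4000.0, {
        "AWS": 280_000, "GCP": 320_000, "Azure": 400_000, "AWS Direct": 120_000,
    }),
}


def usd_per_tb(cost_usd: float, volume_tb: float) -> float:
    """Implied average price per TB for a quoted total cost."""
    return cost_usd / volume_tb


# Implied per-TB price for every scenario and provider.
IMPLIED = {
    scenario: {name: usd_per_tb(cost, volume) for name, cost in costs.items()}
    for scenario, (volume, costs) in ESTIMATES.items()
}

if __name__ == "__main__":
    for scenario, rates in IMPLIED.items():
        summary = ", ".join(f"{name} ${rate:.0f}/TB" for name, rate in rates.items())
        print(f"{scenario}: {summary}")
```

Read this way, the quoted list-price estimates fall between roughly $70 and $110 per TB, while the AWS peering estimates are less than half the corresponding AWS list-price figure in both scenarios.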
  • 19. Cloud TransAtlantic networking like compute • Cloud providers have plenty of high-speed TransAtlantic networking capacity • But it comes at a cost • Story very similar to what you see in Cloud computing • Plenty of capacity, but at a cost • I believe same approach should be taken • Use “on-prem” resources when possible, use Cloud for bursts • For networking, this means the ESNet TransAtlantic link
  • 20. Acknowledgments • This work has been partially sponsored by NSF grants OAC-1826967, OAC-1541349, OAC-1841530, OAC-1836650, MPS-1148698, OAC-1941481, OPP-1600823 and OAC-190444. • I kindly thank Amazon, Microsoft and Google for providing Cloud credits that covered most of the incurred Cloud costs