SlideShare a Scribd company logo
1 of 30
The Earth System Grid Federation:
Origins, Current State, Evolution
Presented by: Ian Foster, Argonne National Laboratory and University of Chicago
PIs: Forrest M. Hoffman (ORNL), Ian Foster (ANL), Sasha Ames (LLNL)
And: Rachana Ananthakrishnan, Jason Boutte, Nathan Collier, Scott Collis, Carlos Downie, Max Grover,
Robert Jacob, Jitendra Kumar, Lee Liming, Giri Prakash, Sarat Sreepathi, Steven Turoscy, Min Xu
A DOE project to address the challenges of terabyte climate data
Image from 2010 report
In the beginning was the Earth System Grid
2006 web interface
What is the Earth System Grid Federation?
● The Earth System Grid Federation
(ESGF) is a globally distributed peer-to-
peer network of data servers using a
common set of protocols and
interfaces to archive and distribute
Earth system model (ESM) output
● ESM output data are used by scientists
all over the world to investigate
consequences of possible climate
change scenarios and the resulting
Earth system feedbacks
ESGF has led data archiving for CMIP since its
conception
● Gathering and sharing of climate data is a key effort of the Coupled Model Intercomparison
Project (CMIP), the worldwide standard experimental protocol for studying general circulation
model output
● This climate modeling research requires enormous scientific and computational resources that
involves over 100 models and spans more than 20 countries
● The World Climate Research Program (WCRP) serves as the primary coordinating body for this
research activity
● The WCRP Working Group on Coupled Modeling (WGCM) relies on the ESGF to support these
activities by coordinating and maintaining the distributed petabyte data archive
● CMIP simulation model runs are key components of periodic assessments by the
Intergovernmental Panel on Climate Change (IPCC).
ESGF Holdings are Large and Growing
● CMIP5 totals >5 PB
● CMIP6 totals >20 PB
● We expect CMIP7
output, including high
resolutions simulations
and more ensembles,
to total >100 PB
● We also plan to expand
Federation holdings by
adding other Earth
science data projects
As of August 22, 2021
Total Distinct Replica
Success of ESGF (2021)
● ESGF data distribution solution has scaled to:
○ Dozens of data nodes
○ Dozens of modeling centers with multiple models
○ 1000s of simulation experiments/variants/ensembles
○ Tens of millions of files
○ Tens of petabytes
○ Thousands of research papers
● ESGF achieved a new level of stability and
reliability for CMIP6
● Much of ESGF’s success is due to the fact that it
has been a paragon of international collaboration,
including strong international governance
But also new challenges:
• Much more data
• Desire for yet more data
• Many more consumers
• New applications
• Aging software stack
• Mixed user support
-US: A New Consortium Project in the USA
● New team from Oak Ridge National Laboratory, Argonne National
Laboratory, and Lawrence Livermore National Laboratory proposed to
modernize the data backplane based on the Globus platform
● ESGF2 proposal was reviewed by panel of 8 scientists on August 30–31, 2021,
and was selected for funding by the US Department of Energy in September
● In collaboration with the ESGF Executive Committee, we will develop and
deploy a new architecture based on the Future Architecture Roadmap
● In addition, we will develop new data discovery tools and data access
interfaces, server-side computing (subsetting & summarizing), and user
computing (Kubernetes & JupyterHub) with improved user & system metrics
● We will add a Resource & Project Liaison group and a Science, User & Facility
Advisory Board; hold outreach activities; and offer a help desk/user support
Design and implementation principles
● Open architecture and protocols
○ Enable substitution of alternative implementations
● Leverage highly available and scalable central services from Globus
○ Reduce complexity, increase reliability, provide economies of scale
● Use proven, modern security technologies and practices
○ Integrated access control; protect against attacks and intrusions
● Use case approach to design, implementation, and evaluation
○ Ensure that solutions meet real user needs
● Integrated instrumentation
○ Metrics drive data management, data access features, capability development
● Focus on performance to deal with big data
○ High-speed data transfer, scalable search, HPC server-side processing
HTTP/S
Globus
Transfer
OpenDAP
Search
data
Data
Node
ESGF
webapp
Custom application
(e.g. portals, scripts)
Identity
Provider
Proxy
Managed
Publication
Service
Central
Search
Service
Authorization
Service
ESGF
Central
Services
Identity Providers
(InCommon, ORCID,
eduGAIN, etc.)
Custom Identity
Provider
Transfer
data
Publish
data
Analysis environment
(e.g. JupyterHub)
Standard APIs
Identity
Provider
Proxy
Managed
Publication
Service
Central
Search
Service
Authorization
Service
…
On prem/cloud
Compute
Data
Node
(1) Use ESGF
webapp to
download or
transfer data
(2) Access with
subsetting or
reduction via
API
(3) Custom web
and thick client
applications
access data via
API
(4) Analysis
application
directly access
data
Personal
Laptop
-US: Open architecture and protocols
-US: “Services from Globus” – What does that mean?
● Globus is a set of cloud-hosted data management services
operated by the University of the Chicago for the community
● Managed services for:
○ Auth: Identity broker and delegation
○ Transfer and Sharing: reliable file transfer
○ Search: Indexing and discovery
○ Flows: Automation
○ Compute: Remote computing
● Hybrid model:
○ Management services in the cloud
○ Agents/Connectors at institutions
All services are:
 Accessible by Web
 Accessible via simple APIs
 Secure
 Reliable
 Easily automatable
[Sanjay Aiyagari]
Globus platform
Operated by UChicago for researchers worldwide
Made possible by the support of 200+ subscribers
DOE’s Current
Earth System
Grid Federation
● Primary server
at LLNL
● Replicating
data from the
global
Federation
● Independent
data node at
ANL
-US
DOE’s Next
Generation
Earth System
Grid Federation
● Co-located at
DOE’s major
computing
facilities
● Replicating
data from the
global
Federation
● Providing
cloud indexing,
automated
migration, and
tape archiving
60,000+ GPUs
-US
Enabling a new level of research productivity
Logging in with her institutional credentials,
Samantha is presented with new data, code, and
papers relevant to her current research. Intrigued by a
new report on extreme precipitation events, she
examines a Jupyter notebook that implements the
method used. Wondering how this method would work
with higher-resolution E3SM data, she quickly locates
required datasets and runs the notebook on a
subset. Results are promising, so she shares them
with collaborators via ESGF2 federated storage, and
they agree that a larger ensemble analysis is called for.
ESGF2 confirms that the full ensemble data are
available at OLCF, so they submit a request to execute
the analysis there. Within 24 hours, results have been
published to ESGF2 for broader consumption, along
with the notebook used to produce and validate results. Flood risk increases with
water availability
New: ESGF2-US Failsafe Data Replication
● In the US, LLNL operates the primary ESGF node, which replicates much of
the CMIP6 and related model output from around the globe
● Since the data at LLNL are contained only on spinning disk, we decided to
replicate the entire ~7.5 PB collection of data to Argonne National Laboratory
(ANL) and Oak Ridge National Laboratory (ORNL)
Solution: Use Globus to transfer all the data over ESnet
● We used custom Globus scripting (thanks to Lukasz Lacinski), ESnet network
monitoring and diagnostics (thanks to Eli Dart), DTN and GPFS optimized
configurations (thanks to Cameron Harr and others), and debugging and
problem-solving (thanks to Sasha Ames, Lee Liming, and others)
ESnet Global Connectivity
Global ESnet interconnectivity—including high
speed connections to London, Amsterdam, and
Geneva—will enable rapid data replication
across most of the Federation data nodes
ESGF2 will make use of the
high bandwidth between DOE
labs and HPC centers
Data will be automatically
migrated and cached
across ORNL, ANL, and
LLNL sites
An ESnet representative is part of our Resource & Project Liaisons group
https://dashboard.globus.org/esgf As of May 4, 2022
1.5 GB/s
4 to 6 GB/s
7.5 PB transferred between mid-Feb and May 4
17,347,671 directories and 28,907,532 files
Cumulative Data Transferred Over Time
GPFS Errors &
Unreadable Files
Transfer Rates Over Time
>7.5 GB/s OLCF ➔ ALCF
New Data Discovery Methods
● Scalable, highly available index
● Globus Search for cloud-based, security-enabled index
● Bulk ingest operations for publication and replication
● Globus Flows to automate publication/replication
● Support discovery (faceted search) and query
● MetaGrid, other interfaces
● (In progress/discussion) STAC interface
New: CMIP6 metadata in Globus Search
● Metadata copied from LLNL index
○ ~5.4 million datasets
○ ~26 million files
● Metadata includes pointers
to files at ALCF
○ HTTPS and Globus
Transfer
● ~5 second response time
to return pointers to all
● 5.4 million datasets
● Future:
● Include OLCF pointers
● Automated metadata
enhancement
An index is created
and managed in
Globus Search
service
Globus search
allows for fine-
grained access
controls for
metadata visibility
New publication pipelines
● Repeatable, reliable process with well-defined steps
● Coordinate across heterogeneous resources/services
● Asynchronous processing of publication requests
● Enforce authorization to ensure only permitted accounts can
publish
Automating publication with Globus Flows
Search
Ingest
Share
Set policy
Identifier
Mint DOI
Auth
Login
Globus
Auth
Web
form
User input
Transfer
Stage
data
funcX
Extract
funcX
Validate
Data publication service architecture
New: Data Proximate Compute
Data Proximate Compute (DPC) is an
important part of ESGF2, as:
○ Volumes for CMIP6 are huge and
CMIP7 will be larger
○ There are active efforts that we want
to enable in HPC workflows: e.g.,
Pangeo community is an example.
○ Not all users have computers or
resources to respond to to LCF calls
With DOE computer centers, we
pursue a staged, multi-faceted DPC
strategy:
1. Built-in server-side operations: e.g.,
subsetting
2. Dynamic replication of subsets to
(elastic) compute platforms
3. Vetted Jupyter notebooks running on
leadership computers
We aim for convergence of HPC and cloud environments so that (1) software
development effort can be shared and (2) each consumer can be directed to the
environment most appropriate for their situation. We do not want to replicate anything!
Unsupervised cloud classification with a rotation-invariant autoencoder
https://takglobus.github.io/AICCA-explorer/
Another approach to making large datasets
accessible is to use unsupervised learning to
generate compact representations
AICCA: AI-Driven Cloud Classification Atlas
801 TB
Unsupervised cloud classification with a rotation-invariant autoencoder
https://takglobus.github.io/AICCA-explorer/
54 GB
15,000x reduction
Account for decreased low cloud off California/Baja: losses in stratocumulus deck
Summary
● ESGF2-US is exploring a next generation Earth System Grid Federation (ESGF2)
○ Designed for an order of magnitude increase in data sizes
○ Highly available, scalable, and fast
○ Automatically migrate data as needed
○ Improved data discovery and sharing tools
○ Server-side computing (e.g., subsetting) for derived data
○ User computing capabilities (e.g., JupyterHub/JupyterLab) near data
● The Globus platform is expected to provide many of the central services of the
ESGF2-US data backplane
● We are eager to understand your use cases for ESGF data, and to connect with
other data sources and services to broaden applications and user base
Thank You!
Ian Foster, foster@anl.gov

More Related Content

Similar to The Earth System Grid Federation: Origins, Current State, Evolution

Global Network Advancement Group - Next Generation Network-Integrated Systems
Global Network Advancement Group - Next Generation Network-Integrated SystemsGlobal Network Advancement Group - Next Generation Network-Integrated Systems
Global Network Advancement Group - Next Generation Network-Integrated SystemsLarry Smarr
 
Global Network Advancement Group Next Generation Network-Integrated Sys...
      Global Network Advancement GroupNext Generation Network-Integrated Sys...      Global Network Advancement GroupNext Generation Network-Integrated Sys...
Global Network Advancement Group Next Generation Network-Integrated Sys...Larry Smarr
 
Accelerating Data-driven Discovery in Energy Science
Accelerating Data-driven Discovery in Energy ScienceAccelerating Data-driven Discovery in Energy Science
Accelerating Data-driven Discovery in Energy ScienceIan Foster
 
Enabling efficient movement of data into & out of a high-performance analysis...
Enabling efficient movement of data into & out of a high-performance analysis...Enabling efficient movement of data into & out of a high-performance analysis...
Enabling efficient movement of data into & out of a high-performance analysis...Jisc
 
Data-intensive applications on cloud computing resources: Applications in lif...
Data-intensive applications on cloud computing resources: Applications in lif...Data-intensive applications on cloud computing resources: Applications in lif...
Data-intensive applications on cloud computing resources: Applications in lif...Ola Spjuth
 
Looking Back, Looking Forward NSF CI Funding 1985-2025
Looking Back, Looking Forward NSF CI Funding 1985-2025Looking Back, Looking Forward NSF CI Funding 1985-2025
Looking Back, Looking Forward NSF CI Funding 1985-2025Larry Smarr
 
First online hangout SC5 - Big Data Europe first pilot-presentation-hangout
First online hangout SC5 - Big Data Europe  first pilot-presentation-hangoutFirst online hangout SC5 - Big Data Europe  first pilot-presentation-hangout
First online hangout SC5 - Big Data Europe first pilot-presentation-hangoutBigData_Europe
 
A Global Research Data Platform: How Globus Services Enable Scientific Discovery
A Global Research Data Platform: How Globus Services Enable Scientific DiscoveryA Global Research Data Platform: How Globus Services Enable Scientific Discovery
A Global Research Data Platform: How Globus Services Enable Scientific DiscoveryIan Foster
 
Accelerating Science with Cloud Technologies in the ABoVE Science Cloud
Accelerating Science with Cloud Technologies in the ABoVE Science CloudAccelerating Science with Cloud Technologies in the ABoVE Science Cloud
Accelerating Science with Cloud Technologies in the ABoVE Science CloudGlobus
 
Rpi talk foster september 2011
Rpi talk foster september 2011Rpi talk foster september 2011
Rpi talk foster september 2011Ian Foster
 
Victoria A. White Head, Computing Division Fermilab
Victoria A. White Head, Computing Division FermilabVictoria A. White Head, Computing Division Fermilab
Victoria A. White Head, Computing Division FermilabVideoguy
 
GEOSHARE_GLP_pres.pdf
GEOSHARE_GLP_pres.pdfGEOSHARE_GLP_pres.pdf
GEOSHARE_GLP_pres.pdfJose Lozano
 
EarthCube Monthly Community Webinar- Nov. 22, 2013
EarthCube Monthly Community Webinar- Nov. 22, 2013EarthCube Monthly Community Webinar- Nov. 22, 2013
EarthCube Monthly Community Webinar- Nov. 22, 2013EarthCube
 
NIST Big Data Public Working Group NBD-PWG
NIST Big Data Public Working Group NBD-PWGNIST Big Data Public Working Group NBD-PWG
NIST Big Data Public Working Group NBD-PWGGeoffrey Fox
 
So Long Computer Overlords
So Long Computer OverlordsSo Long Computer Overlords
So Long Computer OverlordsIan Foster
 
AusCover portal presentation
AusCover portal presentationAusCover portal presentation
AusCover portal presentationTERN Australia
 
Data accessibility and the role of informatics in predicting the biosphere
Data accessibility and the role of informatics in predicting the biosphereData accessibility and the role of informatics in predicting the biosphere
Data accessibility and the role of informatics in predicting the biosphereAlex Hardisty
 
Grid Projects In The US July 2008
Grid Projects In The US July 2008Grid Projects In The US July 2008
Grid Projects In The US July 2008Ian Foster
 

Similar to The Earth System Grid Federation: Origins, Current State, Evolution (20)

Global Network Advancement Group - Next Generation Network-Integrated Systems
Global Network Advancement Group - Next Generation Network-Integrated SystemsGlobal Network Advancement Group - Next Generation Network-Integrated Systems
Global Network Advancement Group - Next Generation Network-Integrated Systems
 
Global Network Advancement Group Next Generation Network-Integrated Sys...
      Global Network Advancement GroupNext Generation Network-Integrated Sys...      Global Network Advancement GroupNext Generation Network-Integrated Sys...
Global Network Advancement Group Next Generation Network-Integrated Sys...
 
Accelerating Data-driven Discovery in Energy Science
Accelerating Data-driven Discovery in Energy ScienceAccelerating Data-driven Discovery in Energy Science
Accelerating Data-driven Discovery in Energy Science
 
Enabling efficient movement of data into & out of a high-performance analysis...
Enabling efficient movement of data into & out of a high-performance analysis...Enabling efficient movement of data into & out of a high-performance analysis...
Enabling efficient movement of data into & out of a high-performance analysis...
 
Data-intensive applications on cloud computing resources: Applications in lif...
Data-intensive applications on cloud computing resources: Applications in lif...Data-intensive applications on cloud computing resources: Applications in lif...
Data-intensive applications on cloud computing resources: Applications in lif...
 
Looking Back, Looking Forward NSF CI Funding 1985-2025
Looking Back, Looking Forward NSF CI Funding 1985-2025Looking Back, Looking Forward NSF CI Funding 1985-2025
Looking Back, Looking Forward NSF CI Funding 1985-2025
 
First online hangout SC5 - Big Data Europe first pilot-presentation-hangout
First online hangout SC5 - Big Data Europe  first pilot-presentation-hangoutFirst online hangout SC5 - Big Data Europe  first pilot-presentation-hangout
First online hangout SC5 - Big Data Europe first pilot-presentation-hangout
 
ApacheCon NA 2013
ApacheCon NA 2013ApacheCon NA 2013
ApacheCon NA 2013
 
A Global Research Data Platform: How Globus Services Enable Scientific Discovery
A Global Research Data Platform: How Globus Services Enable Scientific DiscoveryA Global Research Data Platform: How Globus Services Enable Scientific Discovery
A Global Research Data Platform: How Globus Services Enable Scientific Discovery
 
Accelerating Science with Cloud Technologies in the ABoVE Science Cloud
Accelerating Science with Cloud Technologies in the ABoVE Science CloudAccelerating Science with Cloud Technologies in the ABoVE Science Cloud
Accelerating Science with Cloud Technologies in the ABoVE Science Cloud
 
Rpi talk foster september 2011
Rpi talk foster september 2011Rpi talk foster september 2011
Rpi talk foster september 2011
 
Victoria A. White Head, Computing Division Fermilab
Victoria A. White Head, Computing Division FermilabVictoria A. White Head, Computing Division Fermilab
Victoria A. White Head, Computing Division Fermilab
 
GEOSHARE_GLP_pres.pdf
GEOSHARE_GLP_pres.pdfGEOSHARE_GLP_pres.pdf
GEOSHARE_GLP_pres.pdf
 
EarthCube Monthly Community Webinar- Nov. 22, 2013
EarthCube Monthly Community Webinar- Nov. 22, 2013EarthCube Monthly Community Webinar- Nov. 22, 2013
EarthCube Monthly Community Webinar- Nov. 22, 2013
 
NIST Big Data Public Working Group NBD-PWG
NIST Big Data Public Working Group NBD-PWGNIST Big Data Public Working Group NBD-PWG
NIST Big Data Public Working Group NBD-PWG
 
So Long Computer Overlords
So Long Computer OverlordsSo Long Computer Overlords
So Long Computer Overlords
 
SomeSlides
SomeSlidesSomeSlides
SomeSlides
 
AusCover portal presentation
AusCover portal presentationAusCover portal presentation
AusCover portal presentation
 
Data accessibility and the role of informatics in predicting the biosphere
Data accessibility and the role of informatics in predicting the biosphereData accessibility and the role of informatics in predicting the biosphere
Data accessibility and the role of informatics in predicting the biosphere
 
Grid Projects In The US July 2008
Grid Projects In The US July 2008Grid Projects In The US July 2008
Grid Projects In The US July 2008
 

More from Ian Foster

Global Services for Global Science March 2023.pptx
Global Services for Global Science March 2023.pptxGlobal Services for Global Science March 2023.pptx
Global Services for Global Science March 2023.pptxIan Foster
 
Better Information Faster: Programming the Continuum
Better Information Faster: Programming the ContinuumBetter Information Faster: Programming the Continuum
Better Information Faster: Programming the ContinuumIan Foster
 
ESnet6 and Smart Instruments
ESnet6 and Smart InstrumentsESnet6 and Smart Instruments
ESnet6 and Smart InstrumentsIan Foster
 
Linking Scientific Instruments and Computation
Linking Scientific Instruments and ComputationLinking Scientific Instruments and Computation
Linking Scientific Instruments and ComputationIan Foster
 
Foster CRA March 2022.pptx
Foster CRA March 2022.pptxFoster CRA March 2022.pptx
Foster CRA March 2022.pptxIan Foster
 
Big Data, Big Computing, AI, and Environmental Science
Big Data, Big Computing, AI, and Environmental ScienceBig Data, Big Computing, AI, and Environmental Science
Big Data, Big Computing, AI, and Environmental ScienceIan Foster
 
AI at Scale for Materials and Chemistry
AI at Scale for Materials and ChemistryAI at Scale for Materials and Chemistry
AI at Scale for Materials and ChemistryIan Foster
 
Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the ContinuumIan Foster
 
Data Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud AutomationData Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud AutomationIan Foster
 
Research Automation for Data-Driven Discovery
Research Automation for Data-Driven DiscoveryResearch Automation for Data-Driven Discovery
Research Automation for Data-Driven DiscoveryIan Foster
 
Scaling collaborative data science with Globus and Jupyter
Scaling collaborative data science with Globus and JupyterScaling collaborative data science with Globus and Jupyter
Scaling collaborative data science with Globus and JupyterIan Foster
 
Learning Systems for Science
Learning Systems for ScienceLearning Systems for Science
Learning Systems for ScienceIan Foster
 
Data Automation at Light Sources
Data Automation at Light SourcesData Automation at Light Sources
Data Automation at Light SourcesIan Foster
 
Team Argon Summary
Team Argon SummaryTeam Argon Summary
Team Argon SummaryIan Foster
 
Thoughts on interoperability
Thoughts on interoperabilityThoughts on interoperability
Thoughts on interoperabilityIan Foster
 
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...Ian Foster
 
NIH Data Commons Architecture Ideas
NIH Data Commons Architecture IdeasNIH Data Commons Architecture Ideas
NIH Data Commons Architecture IdeasIan Foster
 
Going Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCFGoing Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCFIan Foster
 
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...Ian Foster
 
Software Infrastructure for a National Research Platform
Software Infrastructure for a National Research PlatformSoftware Infrastructure for a National Research Platform
Software Infrastructure for a National Research PlatformIan Foster
 

More from Ian Foster (20)

Global Services for Global Science March 2023.pptx
Global Services for Global Science March 2023.pptxGlobal Services for Global Science March 2023.pptx
Global Services for Global Science March 2023.pptx
 
Better Information Faster: Programming the Continuum
Better Information Faster: Programming the ContinuumBetter Information Faster: Programming the Continuum
Better Information Faster: Programming the Continuum
 
ESnet6 and Smart Instruments
ESnet6 and Smart InstrumentsESnet6 and Smart Instruments
ESnet6 and Smart Instruments
 
Linking Scientific Instruments and Computation
Linking Scientific Instruments and ComputationLinking Scientific Instruments and Computation
Linking Scientific Instruments and Computation
 
Foster CRA March 2022.pptx
Foster CRA March 2022.pptxFoster CRA March 2022.pptx
Foster CRA March 2022.pptx
 
Big Data, Big Computing, AI, and Environmental Science
Big Data, Big Computing, AI, and Environmental ScienceBig Data, Big Computing, AI, and Environmental Science
Big Data, Big Computing, AI, and Environmental Science
 
AI at Scale for Materials and Chemistry
AI at Scale for Materials and ChemistryAI at Scale for Materials and Chemistry
AI at Scale for Materials and Chemistry
 
Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the Continuum
 
Data Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud AutomationData Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud Automation
 
Research Automation for Data-Driven Discovery
Research Automation for Data-Driven DiscoveryResearch Automation for Data-Driven Discovery
Research Automation for Data-Driven Discovery
 
Scaling collaborative data science with Globus and Jupyter
Scaling collaborative data science with Globus and JupyterScaling collaborative data science with Globus and Jupyter
Scaling collaborative data science with Globus and Jupyter
 
Learning Systems for Science
Learning Systems for ScienceLearning Systems for Science
Learning Systems for Science
 
Data Automation at Light Sources
Data Automation at Light SourcesData Automation at Light Sources
Data Automation at Light Sources
 
Team Argon Summary
Team Argon SummaryTeam Argon Summary
Team Argon Summary
 
Thoughts on interoperability
Thoughts on interoperabilityThoughts on interoperability
Thoughts on interoperability
 
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
 
NIH Data Commons Architecture Ideas
NIH Data Commons Architecture IdeasNIH Data Commons Architecture Ideas
NIH Data Commons Architecture Ideas
 
Going Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCFGoing Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCF
 
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
 
Software Infrastructure for a National Research Platform
Software Infrastructure for a National Research PlatformSoftware Infrastructure for a National Research Platform
Software Infrastructure for a National Research Platform
 

Recently uploaded

Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfSumit Kumar yadav
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)Areesha Ahmad
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)Areesha Ahmad
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​kaibalyasahoo82800
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencySheetal Arora
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPirithiRaju
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptxRajatChauhan518211
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)Areesha Ahmad
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.Nitya salvi
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPirithiRaju
 

Recently uploaded (20)

Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 

The Earth System Grid Federation: Origins, Current State, Evolution

  • 1. The Earth System Grid Federation: Origins, Current State, Evolution Presented by: Ian Foster, Argonne National Laboratory and University of Chicago PIs: Forrest M. Hoffman (ORNL), Ian Foster (ANL), Sasha Ames (LLNL) And: Rachana Ananthakrishnan, Jason Boutte, Nathan Collier, Scott Collis, Carlos Downie, Max Grover, Robert Jacob, Jitendra Kumar, Lee Liming, Giri Prakash, Sarat Sreepathi, Steven Turoscy, Min Xu
  • 2. A DOE project to address the challenges of terabyte climate data Image from 2010 report In the beginning was the Earth System Grid 2006 web interface
  • 3. What is the Earth System Grid Federation? ● The Earth System Grid Federation (ESGF) is a globally distributed peer-to- peer network of data servers using a common set of protocols and interfaces to archive and distribute Earth system model (ESM) output ● ESM output data are used by scientists all over the world to investigate consequences of possible climate change scenarios and the resulting Earth system feedbacks
  • 4. ESGF has led data archiving for CMIP since its conception ● Gathering and sharing of climate data is a key effort of the Coupled Model Intercomparison Project (CMIP), the worldwide standard experimental protocol for studying general circulation model output ● This climate modeling research requires enormous scientific and computational resources that involves over 100 models and spans more than 20 countries ● The World Climate Research Program (WCRP) serves as the primary coordinating body for this research activity ● The WCRP Working Group on Coupled Modeling (WGCM) relies on the ESGF to support these activities by coordinating and maintaining the distributed petabyte data archive ● CMIP simulation model runs are key components of periodic assessments by the Intergovernmental Panel on Climate Change (IPCC).
  • 5. ESGF Holdings are Large and Growing ● CMIP5 totals >5 PB ● CMIP6 totals >20 PB ● We expect CMIP7 output, including high resolutions simulations and more ensembles, to total >100 PB ● We also plan to expand Federation holdings by adding other Earth science data projects As of August 22, 2021 Total Distinct Replica
  • 6. Success of ESGF (2021) ● ESGF data distribution solution has scaled to: ○ Dozens of data nodes ○ Dozens of modeling centers with multiple models ○ 1000s of simulation experiments/variants/ensembles ○ Tens of millions of files ○ Tens of petabytes ○ Thousands of research papers ● ESGF achieved a new level of stability and reliability for CMIP6 ● Much of ESGF’s success is due to the fact that it has been a paragon of international collaboration, including strong international governance But also new challenges: • Much more data • Desire for yet more data • Many more consumers • New applications • Aging software stack • Mixed user support
  • 7. -US: A New Consortium Project in the USA ● New team from Oak Ridge National Laboratory, Argonne National Laboratory, and Lawrence Livermore National Laboratory proposed to modernize the data backplane based on the Globus platform ● ESGF2 proposal was reviewed by panel of 8 scientists on August 30–31, 2021, and was selected for funding by the US Department of Energy in September ● In collaboration with the ESGF Executive Committee, we will develop and deploy a new architecture based on the Future Architecture Roadmap ● In addition, we will develop new data discovery tools and data access interfaces, server-side computing (subsetting & summarizing), and user computing (Kubernetes & JupyterHub) with improved user & system metrics ● We will add a Resource & Project Liaison group and a Science, User & Facility Advisory Board; hold outreach activities; and offer a help desk/user support
  • 8. Design and implementation principles ● Open architecture and protocols ○ Enable substitution of alternative implementations ● Leverage highly available and scalable central services from Globus ○ Reduce complexity, increase reliability, provide economies of scale ● Use proven, modern security technologies and practices ○ Integrated access control; protect against attacks and intrusions ● Use case approach to design, implementation, and evaluation ○ Ensure that solutions meet real user needs ● Integrated instrumentation ○ Metrics drive data management, data access features, capability development ● Focus on performance to deal with big data ○ High-speed data transfer, scalable search, HPC server-side processing
  • 9. HTTP/S Globus Transfer OpenDAP Search data Data Node ESGF webapp Custom application (e.g. portals, scripts) Identity Provider Proxy Managed Publication Service Central Search Service Authorization Service ESGF Central Services Identity Providers (InCommon, ORCID, eduGAIN, etc.) Custom Identity Provider Transfer data Publish data Analysis environment (e.g. JupyterHub) Standard APIs Identity Provider Proxy Managed Publication Service Central Search Service Authorization Service … On prem/cloud Compute Data Node (1) Use ESGF webapp to download or transfer data (2) Access with subsetting or reduction via API (3) Custom web and thick client applications access data via API (4) Analysis application directly access data Personal Laptop -US: Open architecture and protocols
  • 10. -US: “Services from Globus” – What does that mean? ● Globus is a set of cloud-hosted data management services operated by the University of the Chicago for the community ● Managed services for: ○ Auth: Identity broker and delegation ○ Transfer and Sharing: reliable file transfer ○ Search: Indexing and discovery ○ Flows: Automation ○ Compute: Remote computing ● Hybrid model: ○ Management services in the cloud ○ Agents/Connectors at institutions All services are:  Accessible by Web  Accessible via simple APIs  Secure  Reliable  Easily automatable [Sanjay Aiyagari]
  • 11. Globus platform Operated by UChicago for researchers worldwide Made possible by the support of 200+ subscribers
  • 12. DOE’s Current Earth System Grid Federation ● Primary server at LLNL ● Replicating data from the global Federation ● Independent data node at ANL -US
  • 13. DOE’s Next Generation Earth System Grid Federation ● Co-located at DOE’s major computing facilities ● Replicating data from the global Federation ● Providing cloud indexing, automated migration, and tape archiving 60,000+ GPUs -US
  • 14. Enabling a new level of research productivity Logging in with her institutional credentials, Samantha is presented with new data, code, and papers relevant to her current research. Intrigued by a new report on extreme precipitation events, she examines a Jupyter notebook that implements the method used. Wondering how this method would work with higher-resolution E3SM data, she quickly locates required datasets and runs the notebook on a subset. Results are promising, so she shares them with collaborators via ESGF2 federated storage, and they agree that a larger ensemble analysis is called for. ESGF2 confirms that the full ensemble data are available at OLCF, so they submit a request to execute the analysis there. Within 24 hours, results have been published to ESGF2 for broader consumption, along with the notebook used to produce and validate results. Flood risk increases with water availability
  • 15. New: ESGF2-US Failsafe Data Replication ● In the US, LLNL operates the primary ESGF node, which replicates much of the CMIP6 and related model output from around the globe ● Since the data at LLNL are contained only on spinning disk, we decided to replicate the entire ~7.5 PB collection of data to Argonne National Laboratory (ANL) and Oak Ridge National Laboratory (ORNL) Solution: Use Globus to transfer all the data over ESnet ● We used custom Globus scripting (thanks to Lukasz Lacinski), ESnet network monitoring and diagnostics (thanks to Eli Dart), DTN and GPFS optimized configurations (thanks to Cameron Harr and others), and debugging and problem-solving (thanks to Sasha Ames, Lee Liming, and others)
  • 16. ESnet Global Connectivity Global ESnet interconnectivity—including high speed connections to London, Amsterdam, and Geneva—will enable rapid data replication across most of the Federation data nodes ESGF2 will make use of the high bandwidth between DOE labs and HPC centers Data will be automatically migrated and cached across ORNL, ANL, and LLNL sites An ESnet representative is part of our Resource & Project Liaisons group
  • 17. https://dashboard.globus.org/esgf As of May 4, 2022 1.5 GB/s 4 to 6 GB/s 7.5 PB transferred between mid-Feb and May 4 17,347,671 directories and 28,907,532 files
  • 18. Cumulative Data Transferred Over Time GPFS Errors & Unreadable Files
  • 19. Transfer Rates Over Time >7.5 GB/s OLCF ➔ ALCF
  • 20. New Data Discovery Methods ● Scalable, highly available index ● Globus Search for cloud-based, security-enabled index ● Bulk ingest operations for publication and replication ● Globus Flows to automate publication/replication ● Support discovery (faceted search) and query ● MetaGrid, other interfaces ● (In progress/discussion) STAC interface
  • 21. New: CMIP6 metadata in Globus Search ● Metadata copied from LLNL index ○ ~5.4 million datasets ○ ~26 million files ● Metadata includes pointers to files at ALCF ○ HTTPS and Globus Transfer ● ~5 second response time to return pointers to all ● 5.4 million datasets ● Future: ● Include OLCF pointers ● Automated metadata enhancement An index is created and managed in Globus Search service Globus search allows for fine- grained access controls for metadata visibility
  • 22. New publication pipelines ● Repeatable, reliable process with well-defined steps ● Coordinate across heterogeneous resources/services ● Asynchronous processing of publication requests ● Enforce authorization to ensure only permitted accounts can publish
  • 23. Automating publication with Globus Flows Search Ingest Share Set policy Identifier Mint DOI Auth Login Globus Auth Web form User input Transfer Stage data funcX Extract funcX Validate
  • 24. Data publication service architecture
  • 25. New: Data Proximate Compute Data Proximate Compute (DPC) is an important part of ESGF2, as: ○ Volumes for CMIP6 are huge and CMIP7 will be larger ○ There are active efforts that we want to enable in HPC workflows: e.g., Pangeo community is an example. ○ Not all users have computers or resources to respond to to LCF calls With DOE computer centers, we pursue a staged, multi-faceted DPC strategy: 1. Built-in server-side operations: e.g., subsetting 2. Dynamic replication of subsets to (elastic) compute platforms 3. Vetted Jupyter notebooks running on leadership computers We aim for convergence of HPC and cloud environments so that (1) software development effort can be shared and (2) each consumer can be directed to the environment most appropriate for their situation. We do not want to replicate anything!
  • 26. Unsupervised cloud classification with a rotation-invariant autoencoder https://takglobus.github.io/AICCA-explorer/ Another approach to making large datasets accessible is to use unsupervised learning to generate compact representations
  • 27. AICCA: AI-Driven Cloud Classification Atlas 801 TB Unsupervised cloud classification with a rotation-invariant autoencoder https://takglobus.github.io/AICCA-explorer/ 54 GB 15,000x reduction
  • 28. Account for decreased low cloud off California/Baja: losses in stratocumulus deck
  • 29. Summary ● ESGF2-US is exploring a next generation Earth System Grid Federation (ESGF2) ○ Designed for an order of magnitude increase in data sizes ○ Highly available, scalable, and fast ○ Automatically migrate data as needed ○ Improved data discovery and sharing tools ○ Server-side computing (e.g., subsetting) for derived data ○ User computing capabilities (e.g., JupyterHub/JupyterLab) near data ● The Globus platform is expected to provide many of the central services of the ESGF2-US data backplane ● We are eager to understand your use cases for ESGF data, and to connect with other data sources and services to broaden applications and user base
  • 30. Thank You! Ian Foster, foster@anl.gov