This talk covers accessing and analyzing data from the ICOS (Integrated Carbon Observation System) Data Portal using Python. It introduces the "icoscp" Python library, which loads ICOS data into pandas DataFrames in a few lines of code. The examples cover loading methane concentration data from the Zeppelin station in the Arctic, exploring persistent identifiers to find related data, and combining ICOS data with datasets from other repositories such as Pangaea. The library is still in beta, but it aims to let scientists access and analyze ICOS data without downloading and managing large files locally.
2. • Access to ICOS data
• Make it as easy as possible
• Five lines of code to have a reproducible, high-quality graph
Motivation
Hermansen, O., Lund Myhre, C., Lunder, C., Platt, S., ICOS RI, 2019. ICOS ATC CH4 Release, Zeppelin (15.0 m), 2017-07-27–2019-04-30, https://hdl.handle.net/11676/YZtp9PTId2wyhcOjmeDAxBez
Figure: Zeppelin, Remote Arctic (lat: 78.9072, lon: 11.8867); unit: nmol mol-1
3. • Python Library “icoscp”
• Examples
• Conclusion & Outlook
Topics for this Talk
4. Claudio D’Onofrio & The Carbon Portal Team | September 2020 | 4
Python Library
https://pypi.org/project/icoscp/
• Easy to install with pip
• Links to source code
• Links to documentation
5. Python Library
6. Python Library
• Find a PID by searching the ICOS Data Portal
https://data.icos-cp.eu
• Load the data into a pandas dataframe
https://exploredata.icos-cp.eu
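The two steps above (find a PID, then load it) might look roughly like the sketch below. It is not taken from the slides. The import path `icoscp.cpb.dobj.Dobj` and the `.data` attribute are assumptions based on later library releases (early releases used `dobj.get()`), and the landing-page pattern `https://meta.icos-cp.eu/objects/<suffix>` is likewise assumed. The helper names are hypothetical.

```python
from urllib.parse import urlparse


def pid_suffix(pid: str) -> str:
    """Return the last path segment of a PID, handle URL, or landing-page URL."""
    path = urlparse(pid).path or pid
    return path.rstrip("/").split("/")[-1]


def landing_page(pid: str) -> str:
    """Build the landing-page URL (assumed pattern) for a data object."""
    return "https://meta.icos-cp.eu/objects/" + pid_suffix(pid)


def load_dataframe(pid: str):
    """Load a data object into a pandas DataFrame.

    Needs the icoscp package and network access.
    """
    # Imported lazily so the helpers above work without icoscp installed.
    from icoscp.cpb.dobj import Dobj

    dobj = Dobj(landing_page(pid))
    # Assumption: recent releases expose the table as `dobj.data`;
    # early releases used `dobj.get()` instead.
    return dobj.data
```

Calling `load_dataframe("https://hdl.handle.net/11676/YZtp9PTId2wyhcOjmeDAxBez")` would then be expected to return the Zeppelin CH4 table cited on the Motivation slide. The column names are not given in the slides, so inspect `df.columns` before plotting.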
7. • Python Library “icoscp”
• Examples
• Conclusion & Outlook
Topics for this Talk
8. Example: Data
13. Example: Explore
• Find a PID by searching the ICOS Data Portal
https://data.icos-cp.eu
• Explore persistent identifiers in Python
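One way to explore a persistent identifier without any extra package is to request its landing page as JSON and collect the links to other data objects. This is a minimal sketch using only the standard library. It assumes that the landing page honours an `Accept: application/json` header and that related objects appear as URLs under `https://meta.icos-cp.eu/objects/`. Neither detail is stated in the slides, and the function names are hypothetical.

```python
import json
import urllib.request

OBJECT_PREFIX = "https://meta.icos-cp.eu/objects/"  # assumed URL pattern


def build_metadata_request(landing_url: str) -> urllib.request.Request:
    """Prepare a request that asks the landing page for JSON metadata."""
    # Assumption: the server supports content negotiation via the Accept header.
    return urllib.request.Request(landing_url, headers={"Accept": "application/json"})


def fetch_metadata(landing_url: str, timeout: float = 30.0) -> dict:
    """Download and parse the metadata (needs network access)."""
    request = build_metadata_request(landing_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def linked_objects(node, prefix: str = OBJECT_PREFIX) -> set:
    """Recursively collect every string in the metadata that looks like an object URL."""
    found = set()
    if isinstance(node, dict):
        for value in node.values():
            found |= linked_objects(value, prefix)
    elif isinstance(node, list):
        for item in node:
            found |= linked_objects(item, prefix)
    elif isinstance(node, str) and node.startswith(prefix):
        found.add(node)
    return found
```

Walking the whole structure avoids relying on specific metadata key names, which are not documented in the slides.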
18. What if …
19. Example: Multisource
https://pangaea.de/ & https://data.icos-cp.eu/
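The slides do not show how the two sources are combined. A common approach for time series from different repositories is to pair each observation with the nearest-in-time observation from the other source. The sketch below illustrates that idea with the standard library only, assuming both datasets have already been reduced to sorted lists of `(timestamp, value)` pairs. The function name is hypothetical. With pandas DataFrames, `pandas.merge_asof` with `direction="nearest"` and a `tolerance` does a comparable join.

```python
from bisect import bisect_left


def nearest_join(left, right, tolerance):
    """Pair each (time, value) in `left` with the closest-in-time row of `right`.

    Both inputs must be sorted by time. Rows of `left` without a partner
    within `tolerance` get None as the joined value.
    """
    right_times = [t for t, _ in right]
    joined = []
    for t, value in left:
        i = bisect_left(right_times, t)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(right)]
        best = min(candidates, key=lambda j: abs(right_times[j] - t), default=None)
        if best is not None and abs(right_times[best] - t) <= tolerance:
            joined.append((t, value, right[best][1]))
        else:
            joined.append((t, value, None))
    return joined
```

Timestamps can be `datetime` objects with a `timedelta` tolerance, or plain numbers with a numeric tolerance.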
23. • Python Library “icoscp”
• Examples
• Conclusion & Outlook
Topics for this Talk
24. Paradigm Shift
Pros
• No local storage needed
• Collaboration: share your code without copying the data
• Concentrate on science, not data handling
• New versions/updates can be handled automatically
• Metadata available
Cons
• No offline development
• Trust your Data Portal: long-term availability of source and PID management
• If you work with a LOT of data, HTTP transfer might be slower than loading from local disk
25. Good to Know
• Exploratory search through the library is not as good as the faceted search on the website (yet)
• Not all columns might be available compared to the downloaded version
• No guarantee of a 100% accurate representation, but we are 99.9% confident
26. Good to Know
• This library is still considered a “beta” version. Hopefully a very stable beta …
Please give us feedback; that’s the only way to make it better and adjust the library to YOUR needs.
https://github.com/ICOS-Carbon-Portal/pylib/issues
27. Outlook
• Support for spatial datasets like netCDF files
• Convenience functions for collections
• Service to mint a persistent identifier for your work
28. Message to Take Home
• Play around on our public Jupyter Hub service
https://exploredata.icos-cp.eu
• It is easy: you don’t need to be a rocket scientist
• It is transparent and reproducible
30. • Python Library “icoscp”
• Examples
• Conclusion & Outlook
• Data Dissemination
Topics for this Talk
31. Dissemination
Digital Data Object
• Metadata
• File to download
• Snapshot
• Persistent identifier
• Faceted search
https://data.icos-cp.eu/
33. Dissemination
Digital Data Object
• Metadata
• File to download: untouched, exactly as it was uploaded (ZIP file, PNG, NetCDF, CSV, PDF, …)
• Snapshot
34. Dissemination
Digital Data Object
• Metadata
• File to download
• Snapshot: for tabular data; fast and easy; preview in the data portal; Python library
36. Data Flow
National Networks measure and observe greenhouse gases
37. Data Flow
Carbon Portal
• Check metadata
• Sanity check
• Ingest data
• Create persistent identifier (PID)
Domain Experts for GHG in Ecology, Ocean, Atmosphere
• Data curation
• Quality assurance
• Quality control
• Calculate fluxes
• …
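The slides do not say how the Carbon Portal derives a PID during ingestion. As an illustration of content-based identifiers in general, the sketch below builds a 24-character suffix from a file's SHA-256 checksum. It is an assumption, not a documented part of the talk, that ICOS PIDs follow this scheme (the first 18 bytes of the digest, base64url-encoded). The function name is hypothetical.

```python
import base64
import hashlib


def pid_suffix_from_bytes(content: bytes, n_bytes: int = 18) -> str:
    """Derive a URL-safe identifier suffix from the SHA-256 checksum of `content`.

    Assumption: truncating the digest to 18 bytes gives a 24-character
    base64url string, the same length as the suffix in the cited handle.
    """
    digest = hashlib.sha256(content).digest()
    encoded = base64.urlsafe_b64encode(digest[:n_bytes]).decode("ascii")
    return encoded.rstrip("=")  # no padding remains for multiples of 3 bytes
```

An identifier built this way changes whenever a single byte of the file changes, which is what lets a PID stand for one exact, immutable version of a data object.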
38. Data Flow
Raw Data: measurements and observations
Level 0: physical quantities, possibly converted
Level 1: working data generated as intermediate steps for NRT and L2
Level 1 NRT: near-real-time data using automated quality control
Level 2: final, quality-checked ICOS RI datasets
Level 3: elaborated products
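When filtering or labelling datasets in analysis code, the data levels above can be kept in a small enumeration. This is only a sketch: the labels and descriptions are copied from the slide, while the class and function names are hypothetical and not part of the icoscp library.

```python
from enum import Enum


class DataLevel(Enum):
    """ICOS data levels, with labels and descriptions taken from the slide."""

    RAW = ("Raw Data", "Measurements and observations")
    L0 = ("Level 0", "Physical quantities, possibly converted")
    L1 = ("Level 1", "Working data generated as intermediate steps for NRT and L2")
    L1_NRT = ("Level 1 NRT", "Near-real-time data using automated quality control")
    L2 = ("Level 2", "Final, quality-checked ICOS RI datasets")
    L3 = ("Level 3", "Elaborated products")

    def __init__(self, label, description):
        self.label = label
        self.description = description


def level_from_label(label: str) -> DataLevel:
    """Look up a level by its slide label, ignoring case and surrounding spaces."""
    wanted = label.strip().lower()
    for level in DataLevel:
        if level.label.lower() == wanted:
            return level
    raise ValueError(f"unknown data level: {label!r}")
```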