Brian Matthews presents the European Open Science Cloud (EOSC) and the EOSCpilot | OSFair2017 Workshop
Workshop title: How FAIR friendly is your data catalogue?
Workshop overview:
This workshop will build upon the work planned by the EOSCpilot data interoperability task and the BlueBridge workshop held on April 3 at the RDA meeting. We will investigate common mechanisms for interoperation of data catalogues that preserve established community standards, norms and resources, while simplifying the process of being/becoming FAIR. Can we have a simple interoperability architecture based on a common set of metadata types? What are the minimum metadata requirements to expose FAIR data to EOSC services and EOSC users?
DAY 3 - PARALLEL SESSION 6 & 7
3. Why Europe is not fully tapping into the potential of data:
Data not always open and lack of incentives and rewards for data sharing
Lack of interoperability required for data sharing … noting deep-rooted walls between disciplines.
Fragmentation between data infrastructures that are split by scientific and economic domains, countries and governance models
Surging demand for High Performance Computing at a scale above single member state resources
Data reuse employing advanced analysis techniques, with adequate protection of personal data and considering the forthcoming revision of copyright legislation.
4. Proposed a European Open Science Cloud
Make all scientific data produced by the Horizon 2020 programme open by default.
Raise awareness and change incentive structures for academics, industry and public services to share their data.
Develop specifications for interoperability and data sharing across disciplines and infrastructures.
Create a fit-for-purpose pan-European governance structure to federate scientific data infrastructures and overcome fragmentation.
Develop cloud-based services for Open Science supported by the necessary data infrastructure.
Enlarge the scientific user base to researchers and innovators from all disciplines.
5. High Level Expert Group for the "European Open Science Cloud".
http://ec.europa.eu/research/openscience/pdf/hleg/hleg-eosc-first-report_(draft).pdf
6. Definitions
European:
Research and innovation are global - EOSC cannot be built exclusively in and for Europe.
Europe is in a strong position to lead this initiative, as it is already distributed and collaborative.
Open:
Not all data and tools can be open, e.g. for reasons of confidentiality and privacy.
Open is also often confused with ‘for free’. Free data and services do not exist.
Intelligently open is what we mean.
Science:
Explicitly includes all disciplines, including the arts and humanities.
Also covers societal innovation and productivity.
Supports broad societal participation in Open Innovation and Open Science.
Cloud:
It can be misinterpreted to indicate that the EOSC is mostly about hard ICT infrastructure.
But it is much more a commons of data, software, standards, expertise and policy related to data-driven science and innovation.
9. EOSCpilot Challenges
Three types of challenges addressed by the EOSCpilot:
Scientific Challenges: deploying the EOSC to deliver Open Science
Technical Challenges: developing technical solutions that meet the scientific needs
Cultural Challenges: adopting new, more open ways of working
Scientific Challenges are really Opportunities
Technical Challenges are Barriers to overcome
Cultural Challenges are also Barriers
10. Actions to bring about an EOSC
• Bring the current Research Infrastructures together
• We do not want to replace their work
• Bring the e-Infrastructure projects together
• GEANT, PRACE
• EGI, EUDAT, OpenAIRE
• Interoperate between their services
• Catalogue of services
• Allow people to select services to build new infrastructures
• Set up appropriate rules of engagement
• Allow access to data and compute
• Interoperate between their data
• FAIR data catalogues – accessible outside their discipline.
• Interoperable standards and metadata
• Allow new resources to be added
• Cloud providers, HPC providers, data providers, service providers
• Within the common governance and resourcing processes
• And a skills and competencies framework
• Need some set of core services and processes to hold the EOSC together
03/07/17
11. EOSC-Pilot Project
Setting the EOSC in the right direction
First of the EOSC projects
10M€ over 2 years
• Jan 2017 – Dec 2018
33 Partners + 15 3rd parties
• Led by STFC
• A range of e-Infrastructure providers, research institutes and research consortia across disciplines.
• EGI, EUDAT, OpenAIRE, PRACE, GEANT
• ELIXIR, ICOS, ECRIN, BBMRI, DESY, CERN, XFEL, CEA
• STFC, CNR, DANS, DCC, BSC, MPG, CNRS
Try to answer some basic questions
• What is the EOSC going to provide?
• How is the EOSC going to operate?
• How is the EOSC going to change how science is done?
www.eoscpilot.eu
12. EOSCpilot: High Level Aims
The EOSCpilot project will support the first phase in the development of the EOSC.
Establish the governance framework for the EOSC and contribute to the development of European open science policy and best practice;
Develop a number of demonstrators functioning as high-profile pilots that integrate services and infrastructures to show interoperability and its benefits in a number of scientific domains;
Engage with a broad range of stakeholders, crossing borders and communities, to build the trust and skills required for adoption of an open approach to scientific research.
13. Workpackages
1. Governance
• Propose a governance framework
2. Policy
• Devise a policy environment
3. Demonstrators
• Use real demonstrators to drive the requirements for the EOSC
4. Services
• Specify service architecture, catalogue and pilot services
5. Interoperability
• Identify interfaces and standards to drive interoperability
6. Skills
• Specify a skills and competencies framework for the EOSC
7. Engagement
• Involve as many stakeholders as possible.
14. Science Demonstrators
First 5 Demonstrators
• Environmental & Earth Sciences - ENVRI Radiative Forcing Integration to enable harmonised data access and integration across multiple research communities
• High Energy Physics - WLCG: large-scale, long-term preservation and re-use of HEP data in the EOSC, open to other researchers
• Humanities – TEXTCROWD: Collaborative semantic enrichment of text-based datasets by making new software available on the EOSC.
• Life Sciences - Pan-Cancer Analyses & Cloud Computing within the EOSC to accelerate genomic analysis on the EOSC
• Physics - The photon-neutron community to improve the community’s computing facilities by creating a virtual platform for all users
Second 5 Demonstrators
• HPCaaS for Fusion - Culham Science Centre, UK
• Life Science: Leveraging EOSC to offload updating and standardizing life sciences datasets and to improve the reproducibility, reusability and interoperability of studies - CRG, Spain
• Seismology: EPOS Virtual Earthquake and Computational Earth Science e-science environment in Europe - University of Liverpool, UK
• CryoEM: Linking distributed data and data analysis resources as workflows in Structural Biology with cryo-Electron Microscopy: Interoperability and reuse - CSIC, Spain
• Astronomy: Open Science Cloud access to LOFAR data - ASTRON, NL
• 5 more demonstrators to be selected in the autumn.
15. Some Key Deliverables of EOSC Pilot
Governance framework
propose how the EOSC might be operated
Policies needed to run the EOSC
Architecture of the EOSC
Systems of Systems approach
Rules of engagement
Catalogue of services
Skills and Competencies framework to work with the EOSC
Interoperability
Service Interoperability
Data Interoperability
The European Open Science Cloud for Research pilot project is funded by the European Commission, DG Research & Innovation under contract no. 739563
16. Service Interoperability
Propose an architecture, validated technical solutions and best practices for enabling interoperability across multiple federated e-infrastructures, overcoming current gaps expressed by user communities and resource providers.
Reasons for GAPs
Gap1: Diversity and incompatibility of the AAIs
Gap2: Network services
Gap3: Diversity of services and providers
Gap4: Diversity of access policies
Gap5: Low awareness of the e-infrastructures and services
Gap6: Lack of expertise, training, easy tools, human networks
Bridging the GAPs
Gap1: Global AAI
Gap2: Network services improvement
Gap3: Services technical interoperability
Gap4: Multidisciplinary mutualised space
Gap5: Common vocabulary, global services catalogue, dissemination
Gap6: Foster adoption, expertise sharing, user friendly tools, human networks
17. Data Interoperability
Establish principles and develop mechanisms that enable the EOSC to provide research and data interoperability across the diversity of existing (and potential future) research communities, research infrastructures and other research organisations
Consider some examples of data interoperability in research infrastructures
Develop guidelines to ensure data interoperability for FAIR data
Define minimal metadata to allow cross-walks
Test in some trial scenarios
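The idea of minimal metadata that allows cross-walks between catalogues can be sketched in Python as follows. This is an illustration only, not an EOSCpilot specification: the minimal field set, the two schema names (datacite_like, dublin_core_like) and their field mappings are all assumptions made for this example.

```python
# Hypothetical minimal metadata fields; the real set is a deliverable of the
# data interoperability task and is NOT defined in these slides.
MINIMAL_FIELDS = ("identifier", "title", "creator", "publisher", "date")

# Hypothetical crosswalks: community field name -> minimal field name.
CROSSWALKS = {
    "datacite_like": {
        "doi": "identifier",
        "titles": "title",
        "creators": "creator",
        "publisher": "publisher",
        "publicationYear": "date",
    },
    "dublin_core_like": {
        "dc:identifier": "identifier",
        "dc:title": "title",
        "dc:creator": "creator",
        "dc:publisher": "publisher",
        "dc:date": "date",
    },
}


def to_minimal(record, schema):
    """Map a community record onto the minimal fields; unmapped fields are dropped."""
    mapping = CROSSWALKS[schema]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def missing_fields(minimal_record):
    """List the minimal fields a record does not yet provide."""
    return [field for field in MINIMAL_FIELDS if field not in minimal_record]


def crosswalk(record, source, target):
    """Translate a record from one community schema to another via the minimal set."""
    minimal = to_minimal(record, source)
    reverse = {minimal_name: name for name, minimal_name in CROSSWALKS[target].items()}
    return {reverse[key]: value for key, value in minimal.items() if key in reverse}


if __name__ == "__main__":
    # Made-up catalogue entry used only to exercise the functions above.
    record = {
        "dc:identifier": "example-id-1",
        "dc:title": "Sample dataset",
        "dc:subject": "physics",
    }
    minimal = to_minimal(record, "dublin_core_like")
    print(minimal)
    print(missing_fields(minimal))
    print(crosswalk(record, "dublin_core_like", "datacite_like"))
```

Routing every translation through the minimal set means each catalogue needs only one mapping, to and from the common fields, rather than one mapping per partner catalogue. Community-specific fields (such as dc:subject above) stay in the source catalogue and are simply not exposed through the crosswalk.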