IDCC workshop: OpenAIRE services and tools for Open Research Data in H2020
1. OpenAIRE Services & tools
for Open Research Data in H2020
Gwen Franck, EIFL/LIBER | Katerina Iatropoulou, ATHENA
Pedro Príncipe, Univ. of Minho | Sarah Jones, DCC
3. OUTLINE
1. Open Access and Open Data in H2020: requirements
2. OpenAIRE services & tools: showcase and demos
3. Data Management Planning: H2020 & OpenAIRE
4. OpenAIRE: what comes next
4. AGENDA
09h00-09h20: Welcome
OpenAIRE in a nutshell
09h20-09h40: Open Access and Open Data in H2020
Open Research Data pilot requirements
09h40-10h20: OpenAIRE services and tools
Showcase and demos
10h20-10h50: Breakout groups (1)
Showcase: your opinion matters
10h50-11h00: Warm up
11h00-11h30: Coffee break
11h30-12h30: Data Management Planning
H2020, FAIR guidelines and OpenAIRE (Breakout groups 2)
12h30-13h00: OpenAIRE: what comes next
5. It’s all about openness
OA is here to stay.
Policies and practices
hand in hand for a
sustainable OA Service provision
at all levels
for all stakeholders
Policy alignment
& advising
6. Who we are
• EU project(s)
• DRIVER
• DRIVER II
• OpenAIRE
• OpenAIREplus
• OpenAIRE2020
• In 24x7 operation since Dec 2010
• Consortium of 50 partners
• One of 5 key EU e-Infrastructures
• A legal entity in 2017
• Institutional, national and
international perspectives on OA
policies & e-Infrastructures
Open Access experts
• Building efficient e-Infra technologies
• State of the art technologies (big
data, linked data)
Information & Computer
Science experts
• Legal & policy recommendations
Legal experts
• Best practices for data
• Linking to data infrastructures
Data communities
7. Human Network e-infrastructure
NOADS: National Open Access Desks
Monitor and foster the adoption of Open
Access policies at the local level
Support the implementation of the Open Data
in H2020
FP7 post grant APCs Pilot
e-infrastructure for monitoring impact of OA
mandates and research projects
OpenAIRE guidelines for metadata exchange
Zenodo Repository for the deposition of research
products
THE POINT OF REFERENCE FOR OPEN SCIENCE IN EUROPE
50 Partners: EU countries, data centers, universities, libraries, repositories
Open Access infrastructure
for research in Europe
8. Infrastructure for Open Knowledge
• Foster and facilitate
the shift of scholarly
communication
towards making
science Open and
Reproducible
• Collaborative and
participatory
approach at European
and Global level
Research communities
Research admins
Researchers
Funders
SMEs
Content providers in scholarly communication
Networking & e-Infrastructure
9. No people, no infrastructure!
Linking people, ideas, and technologies
Networking
International alignment
NOADs (EU), COAR, RDA, CASRAI, SHARE (US), La
Referencia (South America), WDS
Policies and guidelines
Open Access and DMPs
Interoperability guidelines for content providers
Best practices
Data citation, data-literature interlinking
Alternative bibliometrics, Repository usage stats
Open Access peer-review
Technological liaisons
Existing e-infrastructures to re-use their content and
services
Research communities
Research admins
Researchers
Funders
SMEs
Content providers in scholarly communication
Networking & e-Infrastructure
17. RESEARCHER
DECIDES WHERE TO
PUBLISH
Check publishers'
policies on
www.sherpa.ac.uk/romeo
Open Access Journals
doaj.org
Check for Article
Processing Charges
Subscription-based journal Self-archive in a repository
Find at: openaire.eu
IMMEDIATE
OPEN ACCESS
IMMEDIATE OR DELAYED
OPEN ACCESS
H2020 Open Access Mandate
27. Depositing
1. INSTITUTIONAL REPOSITORY of the research institution with which they are affiliated
2. SUBJECT/THEMATIC REPOSITORY
3. ZENODO REPOSITORY: centralised option set up by the OpenAIRE project and CERN
Through OpenAIRE they can be directed to: Publication repositories (OpenDOAR),
Research Data repositories (RE3DATA). If no repository is available: Zenodo at CERN
(sponsored by OpenAIRE). OpenAIRE harvests directly from a number of OpenAIRE
compliant OA publishers and journal aggregators.
31. OpenAIRE Content Acquisition
Authoritative Information | Research Data | Publications
• Registries of Data Providers
• OpenDOAR, re3data, DOAJ journal list, …
• Funding Information
• Author/Contributor Information
32. Aggregation Activities and Results
Regular bibliographic metadata harvesting, validation and normalization
• 18M+ publications (de-duplicated)
• ~5M full-texts
• 300,000+ publication-project links from 6 funders
• 40,000+ datasets linked to publications or projects
• 60,000+ organizations (de-duplicated)
Collected from:
• 800+ “direct” data providers
• Key regional aggregators: CORE-RIOXX, JAIRO (Japan), LA Referencia
• 6,000+ “indirect” data providers (inherited from aggregators)
Jan. 2016
33. OpenAIRE data providers
1. OpenAIRE first collects everything from the DataCite OAI-PMH interface (~9M records);
2. OpenAIRE harmonizes and filters the records, keeping only those of defined typologies
('dataset', 'software', 'collection', 'film', 'sound', 'physicalobject', 'audiovisual') to
produce the cleaned collection (~3M records);
3. The OpenAIRE DataCite collection is then processed by the inference system, which
tries to mine links to projects and publications (OpenAIRE relies on the location of and
access to the full-text files);
4. Finally, only those datasets that have received at least one link are considered eligible
to reach the OpenAIRE portal (currently 31,426).
DataCite (aggregation workflow based on the OpenAIRE content policy)
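The filtering logic of steps 2 and 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`id`, `type`), the `links` mapping, and the toy data are hypothetical and are not the actual OpenAIRE data model or inference system.

```python
# Record typologies kept in step 2 (taken from the slide).
KEPT_TYPES = {
    "dataset", "software", "collection", "film",
    "sound", "physicalobject", "audiovisual",
}


def clean_collection(records):
    """Step 2: keep only records whose type is one of the defined typologies."""
    return [r for r in records if r.get("type", "").lower() in KEPT_TYPES]


def eligible_for_portal(records, links):
    """Step 4: keep only records that received at least one mined link.

    `links` maps a record id to the list of project/publication ids
    found for it by the inference step (hypothetical structure).
    """
    return [r for r in records if links.get(r["id"])]


# Toy data, invented for illustration only.
harvested = [
    {"id": "d1", "type": "Dataset"},
    {"id": "d2", "type": "text"},
    {"id": "d3", "type": "software"},
]
mined_links = {"d1": ["project:123"], "d3": []}

cleaned = clean_collection(harvested)
portal = eligible_for_portal(cleaned, mined_links)
print([r["id"] for r in cleaned])  # ['d1', 'd3']
print([r["id"] for r in portal])   # ['d1']
```

In this toy run the record of type "text" is dropped by the typology filter, and the software record is dropped later because no link was mined for it.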
35. Zenodo Repository
• Multiple data types
• Publications
• Long tail of research data
• Citable data (DOI)
• Links to funding, pubs, data, software
“Catch-all” repository: OpenAIRE-CERN joint effort
www.zenodo.org
H2020: Option to gather, preserve and share
project’s scientific output
43. Support, Usage, Content
• Support load
• Requests: 1576 requests in 2016 vs 800 in 2015
• Usage:
• Visits: 282k visits in 2016 vs 140k in 2015
• But fewer support requests per 1,000 visitors
• And faster handling per request!
• Infrastructure: 30 nodes, 9TB
• Content (150k records)
• Datasets: 4k (2k in 2016)
• Software: 14k (8k in 2016)
• Publications: 54k (34k in 2016)
• Images: 76k (50k in 2016)
• Other: 2k (1k in 2016)
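The claim about support requests per 1,000 visitors can be checked from the figures on this slide. The short calculation below assumes the rounded visit counts (140k and 282k) are close enough to the real totals; it also confirms that the content breakdown adds up to the stated 150k records.

```python
def requests_per_thousand_visits(requests, visits):
    """Support requests per 1,000 visits."""
    return requests / visits * 1000


# Figures from the slide; visit counts are rounded to the nearest thousand.
rate_2015 = requests_per_thousand_visits(800, 140_000)
rate_2016 = requests_per_thousand_visits(1576, 282_000)
print(f"2015: {rate_2015:.2f} requests per 1,000 visits")  # 2015: 5.71 ...
print(f"2016: {rate_2016:.2f} requests per 1,000 visits")  # 2016: 5.59 ...

# Content breakdown in thousands of records, as listed on the slide.
content_k = {"datasets": 4, "software": 14, "publications": 54, "images": 76, "other": 2}
total_k = sum(content_k.values())
print(f"Total content: {total_k}k records")  # Total content: 150k records
```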
44. Zenodo Upgrade
• Launched September 12.
• Faster everything – search, browsing, uploading, API
• Big Data Ready – 50GB per dataset, no file size limit.
• H2020 grants support.
• Significantly faster development and easier support handling.
• Impact: the number of visitors per week doubled after the upgrade.
September upgrade
51. SAFURE:
SAFety and secURity by design for interconnected
mixed-critical cyber-physical systems
ZENODO COMMUNITY: https://zenodo.org/communities/safure_h2020
OpenAIRE project landing page:
https://www.openaire.eu/search/project?projectId=corda__h2020::a118deea7a33c168a3084e4202bd10ab
Publications and Datasets available via Zenodo
58. Example H2020 DMPs in Zenodo
• Helix Nebula – High Energy Physics example
• https://zenodo.org/record/48171#.WATexnriF40
• Tweether – engineering (micro-electronics) example
• https://zenodo.org/record/55791#.WATei3riF40
• AutoPost – ICT example
• https://zenodo.org/record/56107#.WATefXriF40
65. Search all entities
OpenAIRE services and tools for Open Research Data in H2020 - IDCC 2017 Workshop
http://www.openaire.eu/search/find
66. Research data links
•Linking between research results
• Related publications and datasets
• References
• Similar Publications
• Bio Entities
75. LINK RESEARCH RESULTS TOOL
https://www.openaire.eu/participate/claim
Link publications or datasets
to projects.
Identify the project, select
publications or datasets and
set the access rights.
84. OpenAIRE services and tools for Open Research Data in H2020 - IDCC 2017 Workshop
All project publications in
HTML or CSV
One click away to EC
project reporting systems
92. OA Overview
66K pubs – 7.5K projects
FP7
8.5K pubs – 725 projects
SC39: FP7 OA Pilot
93. Project overview
Project productivity
over time
Post project-end
monitoring
Pubs location
OA mandate
conformance
96. Services and tools for projects
Open Access Depositing
Storing Research Data
Claiming publications and datasets
Reporting research outputs
Monitoring and analytics
Discovery and Access
99. INTEROPERABILITY:
GUIDELINES & VALIDATOR
Data providers
Common standards/best practices for data providers (Guidelines for
literature, data repositories, aggregators, OA journals, CRIS
systems).
Validator: web service or standalone
100. Guidelines for Data Providers
1. Literature Repositories (and journal platforms): Dublin Core (DRIVER)
2. Data Repositories (and archives/data centres): DataCite
3. CRIS systems: CERIF-XML
101. Test the OpenAIRE Compliance
Choose from the menu
Finally check results
107. FCT in OpenAIRE: what has been done
List of FCT projects
OpenAIRE info space
Link publications to
projects
FCT projects inferred
Results available on the
OpenAIRE portal
108. FUNDED PROJECTS INFO IN OPENAIRE: SUMMARY
Collect metadata
including project
grantID from
OpenAIRE compliant
repositories
Metadata publications
record enrichments by
OpenAIRE
deduplication
Link Publications to
projects by
inference (text
mining procedures)
Link Publications to
projects using the
end-user service:
claim publications
109. With this information, OpenAIRE
can offer funders…
• A unique view of the scientific outputs that derive from their funding.
• OpenAIRE enables advanced monitoring (including of compliance with
Open Access policies), reporting and analysis of research impact and
research trends.
• Funders can assess the impact of their funding by viewing advanced
statistics on research outputs (publications and data-sets) and the
funding programme/stream/project from which they derive (including
co-funded research results and research trends).
110. Features OpenAIRE can provide to funders
Using the OpenAIRE portal, funders can:
• filter publications/data by funder and browse by specific funding streams
• search via project title, acronym or grant agreement and view specific
statistics of the project: publications/data over time, OA status, where
they were published/deposited, etc.
• view overall funder/funding stream statistics (facets over time, data
source, institution, etc.)
• correlate author/institution output with funding information
• visualize clusters of publications/data or funding based on their
interlinking (national or ERA-wide level).
111. Monitor across funders
euroCRIS Strategic Membership Meeting @ Athens, 8-10 November,2016
At the project and funding scheme level
113. Inference process
Pilot BETA Production
• Deutsche Forschungsgemeinschaft (DFG), Germany
• RCUK, UK
• Fondation Tara Expéditions, France
• Slovak Centre of Scientific and Technical Information (CVTI SR),
Slovakia
• CONICYT (Chile)
OpenAIRE General Assembly – Oslo, February 14th, 2016
114. Inference process
Pilot BETA Production
• Netherlands Organisation for Scientific Research (NWO), Netherlands, ~10,725 publications,
~24,180 NWO projects [just approved for Production].
• Serbia National Funding Scheme (MESTD), Serbia, ~1,273 publications, ~777 projects [approved
for Production].
• Science Foundation, Ireland, ~2,005 publications, ~4,130 projects.
• “Ministry of Science Education and Sport” (MSES/MZOS) and “Croatian Science Foundation”
(CSF/HRZZ), Croatia, ~800 publications, ~2,120 MSES and ~881 CSF projects.
• National Institutes of Health (NIH), USA, ~124,054 publications, ~1,682,692 projects.
• Swiss National Science Foundation (SNSF), Switzerland, ~???? publications, ~64,278 projects.
• Austrian Science Fund (FWF), Austria, ~???? publications, ~12,565 projects
115. Inference process
Pilot BETA Production
• EC–FP7, ~205,275 publications, ~25,689 projects.
• EC–H2020, ~6,809 publications, ~11,242 projects.
• FCT funder, Portugal, ~24,162 publications, ~37,277 projects.
• Wellcome Trust, UK, ~17,516 publications, ~11,906 projects.
• Australian Research Council (ARC), Australia, ~9,206 publications, ~23,011
projects.
• National Health and Medical Research Council (NHMRC), Australia, ~7,730
publications, ~23,209 projects.
• National Science Foundation (NSF), USA, ~127,110 publications, ~497,646 projects.
120. OpenAIRE API - Bulk
• OAI-PMH
• http://api.openaire.eu/oai_pmh
• Bulk access to projects
• DSpace endpoint:
http://api.openaire.eu/projects/dspace/$fundingStream/ALL/ALL
• ePrints endpoint:
http://api.openaire.eu/projects/eprints/$fundingStream/ALL/ALL
• Examples:
• http://api.openaire.eu/projects/eprints/WT/ALL/ALL
• http://api.openaire.eu/projects/eprints/FP7/SP2/ALL
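The bulk endpoints above follow a simple URL pattern, which the sketch below reproduces. It only builds URLs and makes no network calls. The helper names and the parameter names `second` and `third` are hypothetical (the slide does not name the last two path segments); the OAI-PMH `verb` and `metadataPrefix` arguments come from the OAI-PMH protocol itself, not from this slide.

```python
from urllib.parse import urlencode

OAI_PMH_ENDPOINT = "http://api.openaire.eu/oai_pmh"
PROJECTS_ENDPOINT = "http://api.openaire.eu/projects"


def oai_pmh_url(verb, **arguments):
    """Build an OAI-PMH request URL, e.g. verb='Identify' or 'ListRecords'."""
    return f"{OAI_PMH_ENDPOINT}?{urlencode({'verb': verb, **arguments})}"


def bulk_projects_url(platform, funding_stream, second="ALL", third="ALL"):
    """Build a bulk project-list URL for a DSpace or ePrints repository.

    `second` and `third` are hypothetical names for the two trailing
    path segments, which default to ALL as in the slide's template.
    """
    if platform not in ("dspace", "eprints"):
        raise ValueError("platform must be 'dspace' or 'eprints'")
    return f"{PROJECTS_ENDPOINT}/{platform}/{funding_stream}/{second}/{third}"


print(oai_pmh_url("ListRecords", metadataPrefix="oai_dc"))
print(bulk_projects_url("eprints", "WT"))
print(bulk_projects_url("eprints", "FP7", "SP2"))
```

The last two calls reproduce the two example URLs listed on the slide.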
121. OpenAIRE API - Selective
• The number of total results returned by one query is
limited to 10,000.
• http://api.openaire.eu/search/publications
• http://api.openaire.eu/search/datasets
• http://api.openaire.eu/search/projects
• XML, JSON, TSV, CSV.
• Examples:
• http://api.openaire.eu/search/publications?FP7ProjectID=246686
• http://api.openaire.eu/search/projects?funder=EC
• Participant portal.
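A selective query is just one of the three search endpoints plus query parameters. The sketch below builds such URLs without calling the service; the helper name `search_url` is hypothetical, and only the two parameters shown on the slide (`FP7ProjectID`, `funder`) are used. Remember that a single query returns at most 10,000 results in total.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "http://api.openaire.eu/search"
ENTITIES = ("publications", "datasets", "projects")


def search_url(entity, **params):
    """Build a selective-search URL for one of the three entity types."""
    if entity not in ENTITIES:
        raise ValueError(f"entity must be one of {ENTITIES}")
    query = urlencode(params)
    base = f"{SEARCH_ENDPOINT}/{entity}"
    return f"{base}?{query}" if query else base


print(search_url("publications", FP7ProjectID=246686))
print(search_url("projects", funder="EC"))
```

Both calls reproduce the example URLs from the slide; fetching them, for instance with `urllib.request.urlopen`, is left out so the sketch runs offline.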
122. OpenAIRE data as LOD
http://beta.lod.openaire.eu
DBLP
CiteSeer
CEUR
LAK
• LOD entry point published online
• Increase interoperability of OpenAIRE and
enable interlinking with other data
sources
• Interlinking OpenAIRE LOD to related
LOD datasets
123. PROMOTE & INFORM
General Information
126. http://www.openaire.eu/eu-member-states/noads/oa-member-states
127. SCHOLARLY LINK RESOLUTION
Publishers, data providers,
research infrastructures, SMEs
Data-publication links exchange and resolution
in production by November 2016 (coop. RDA/WDS, CrossRef, DataCite)
BETA at: http://dliservice.research-infrastructures.eu
128. ANONYMIZATION
Data anonymization made easy
Original data → anonymization → anonymous data
129. Data anonymization
• Data anonymization
• Removal of direct identifiers, e.g., names, SSN, etc.
• Removal of infrequent combinations of quasi-identifiers, e.g., unique combinations of
birth dates and zip codes.
• Infrequent combinations are removed through generalization, e.g., birth date
14/01/1977 becomes **/**/1977.
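The three ideas above can be sketched as follows. This is a minimal illustration, not how any particular anonymization tool works: the field names, the toy rows, and the threshold `k` are assumptions, and real tools choose generalizations far more carefully.

```python
from collections import Counter


def drop_direct_identifiers(row, identifiers):
    """Remove direct identifiers (e.g. name, SSN) from one record."""
    return {key: value for key, value in row.items() if key not in identifiers}


def generalize_birth_date(date_str):
    """Generalize DD/MM/YYYY to **/**/YYYY, keeping only the year."""
    _day, _month, year = date_str.split("/")
    return f"**/**/{year}"


def infrequent_combinations(rows, quasi_identifiers, k):
    """Return quasi-identifier combinations shared by fewer than k rows."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return {combo for combo, n in counts.items() if n < k}


# Toy records, invented for illustration only.
rows = [
    {"name": "A", "birth_date": "14/01/1977", "zipcode": "11528"},
    {"name": "B", "birth_date": "02/09/1977", "zipcode": "11528"},
]
quasi = ("birth_date", "zipcode")

before = infrequent_combinations(rows, quasi, k=2)

anonymized = []
for row in rows:
    clean = drop_direct_identifiers(row, {"name"})
    clean["birth_date"] = generalize_birth_date(clean["birth_date"])
    anonymized.append(clean)

after = infrequent_combinations(anonymized, quasi, k=2)
print(len(before), len(after))  # 2 0
```

Before generalization each row has a unique (birth date, zip code) pair; afterwards both rows share the same pair, so no combination occurs fewer than `k = 2` times.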
130. Amnesia
• http://amnesia.imis.athena-innovation.gr:8080/amnesia/
• Amnesia is a scalable anonymization tool
• It offers several versions of k-anonymity
• It allows the user to select and customize possible solutions
• It offers graphical tools that allow the user to analyze the
anonymized dataset
• Web service or stand alone
• Integrated within research deposition workflows in
repositories/Zenodo
OpenAIRE General Assembly 2017 – Oslo, 15 February
131. Amnesia Challenges
When to anonymize?
• We need to determine the nature
and size of the available data
• Issues like updates and overlaps
between different datasets are
important
• What are the data exchange
scenarios?
• Basic actors
• Requirements for data utility
• Frequency of data exchange
• Data anonymization tools allow for
scenarios that we did not think
possible
How to anonymize?
• Data anonymity reduces the quality
of the data
• The reduction must be guided to
minimize the effect on applications
• Different applications and users
require different anonymization
methods
• We need to talk to experts
• To understand what they need
• To show them what we can do
133. Breakout groups (1)
1. Did you already know these services and tools?
2. How could they be useful for your
institution/project?
3. What would you like to see that is not
there?
OpenAIRE Services and tools: showcase and demos
#IDCC17 Workshop: OpenAIRE Services & Tools for ORD in H2020 - Edinburgh, February 23 2017
136. Data Management
Planning, H2020 &
OpenAIRE
137. Data Management Planning,
H2020 & OpenAIRE
Sarah Jones
Digital Curation Centre, Glasgow
sarah.jones@glasgow.ac.uk
Twitter: @sjDCC
OpenAIRE workshop, International Digital Curation Conference, Edinburgh, 23 February 2017 #idcc17
138. What is a DMP?
A DMP is a brief plan to define:
• how the data will be created
• how it will be documented
• who will be able to access it
• where it will be stored
• who will back it up
• whether (and how) it will be shared & preserved
DMPs are often submitted as part of grant applications, but are
useful whenever researchers are creating data.
139. FAIR Data Management guidelines
• Notes the extension of the pilot
• Clarifies concept of FAIR data
• Explains what a DMP is and when
it should be updated
• Notes what happens at proposal,
submission and evaluation
• Explains that costs are eligible
• Provides a DMP template
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
140. A FAIR approach to DMPs
Findable
– Assign persistent IDs, provide metadata, register in a searchable resource...
Accessible
– Retrievable by their ID using a standard protocol, metadata remain
accessible even if data aren’t...
Interoperable
– Use formal, broadly applicable languages, use standard vocabularies,
qualified references...
Reusable
– Rich metadata, clear licences, provenance, use of community standards...
www.force11.org/group/fairgroup/fairprinciples
141. H2020 template
1. Data summary
2. FAIR data
2.1 Making data findable, including provisions for metadata
2.2 Making data openly accessible
2.3 Making data interoperable
2.4 Increase data re-use (through clarifying licences)
3. Allocation of resources
4. Data security
5. Ethical aspects
6. Other issues
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
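One simple way to use the template is as a checklist against a draft plan. The sketch below is only an illustration: the section titles are copied from this slide, while the dictionary layout of the draft and the helper `missing_sections` are hypothetical and unrelated to any existing DMP tool.

```python
# Section titles of the H2020 DMP template, as listed on the slide.
H2020_DMP_SECTIONS = [
    "1. Data summary",
    "2.1 Making data findable, including provisions for metadata",
    "2.2 Making data openly accessible",
    "2.3 Making data interoperable",
    "2.4 Increase data re-use (through clarifying licences)",
    "3. Allocation of resources",
    "4. Data security",
    "5. Ethical aspects",
    "6. Other issues",
]


def missing_sections(draft):
    """Return the template sections that are absent or empty in a draft.

    `draft` maps a section title to its text (hypothetical structure).
    """
    return [s for s in H2020_DMP_SECTIONS if not draft.get(s, "").strip()]


# Toy draft, invented for illustration only.
draft = {
    "1. Data summary": "Sensor readings collected during field trials.",
    "4. Data security": "   ",
}
todo = missing_sections(draft)
print(f"{len(todo)} of {len(H2020_DMP_SECTIONS)} sections still to write")
```

Because a DMP is updated over the life of a project, a check like this could be rerun at each review point.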
142. Key differences in H2020
• The Commission does NOT require applicants to submit a DMP
at the proposal stage. It’s a deliverable (due by month 6).
• A DMP is therefore NOT part of the evaluation
• Optional section on data management in proposal is worth
doing, especially to help justify costs
• A DMP is a living or “active” document that should be updated
whenever important changes occur (or at review times)
143. Reviewing DMPs in H2020
• DMPs are a deliverable, checked primarily by project
officers and in some cases external reviewers too
• Guidelines are being developed to give reviewers pointers
on what to check. These are based on the template.
• The reviewer has access to the full project documentation
• Process is only just evolving and this is a pilot so feedback
may be variable initially
144. Example H2020 DMPs in Zenodo
Helix Nebula – High Energy Physics example
https://zenodo.org/record/48171#.WATexnriF40
Tweether – engineering (micro-electronics) example
https://zenodo.org/record/55791#.WATei3riF40
AutoPost – ICT example
https://zenodo.org/record/56107#.WATefXriF40
More listed at: www.dcc.ac.uk/resources/data-management-plans/guidance-examples
145. Example: AutoPost
An industry-driven
innovation action that
will deliver ICT-based
solutions to enhance
established post-
production workflows
The project adopted its own
structure for the DMP
146. Example: AutoPost
Covers a range of existing and new data:
1. Evaluation / test data
2. Computer software
3. Research data and metadata
4. Manuscripts
5. Dissemination material
147. What is DMPonline?
A web-based tool to help researchers write DMPs
Includes a template for Horizon 2020
https://dmponline.dcc.ac.uk
148. National / local DMP Tools
https://github.com/DMPRoadmap/roadmap/wiki/Local-installations-inventory
151. OpenAIRE plans with DMPonline
• Adding an option to allow projects to deposit their DMP in
Zenodo as a way to publish the plan and obtain a DOI
• Considering using OpenAIRE API to let PIs select
their H2020 project to automatically populate grant
ID field and link the DMP with other outputs
152. Discussion: what support do you need?
• What are your experiences in supporting
researchers to write DMPs?
• What kind of support would be useful from
OpenAIRE to help you in this role?
1. Take 15 mins to share your experiences
2. Take 20 minutes to brainstorm ideas for support
3. Circulate and vote for your favourite 5 ideas
153. Thanks for listening
DCC resources on Data Management
www.dcc.ac.uk/resources
Follow us on twitter:
@digitalcuration and #ukdcc
156. Breakout groups (2)
1. What are your experiences in supporting researchers to write DMPs
for H2020 or other contexts? (Get each group to go round in turns.)
2. What kind of support would be useful from OpenAIRE to help you in
this role? (General brainstorm to gather as many ideas as possible.) I
could ask Stephanie to give us some sticky dots if you want to try to
vote on these as part of the exercise.
DMPs
160. What comes next
From today to 2019
• Technical services (Open Science as a Service concept)
• Repository Dashboard
• Extension of data model to research methods and research
objects
• Catch-All Notification Broker service
• Research Community Dashboard
• Networking services
• Research Community Open Science Desk
161. Plans
• Q1
• Documentation revamp
• Versioning
• Q1-2
• All OpenAIRE grants (in addition to FP7, H2020)
• WT (UK), FCT (Portugal), ARC & NHMRC (Australia)
• Possibly: SFI (Ireland) CSF (Croatia) MSES (Croatia), …
• REST APIs
• Q2-3
• Community teams
• Q3-4
• Data Seal of Approval Certification
• Integration of anonymization service
163. ONE STOP SHOP
FOR OPENAIRE DATA PROVIDER MANAGERS
for friends… “the repository managers dashboard”
164. OpenAIRE Data Provider Dashboard
REGISTRATION & VALIDATION :: current validator
ENRICHMENT & ADDITION :: broker service
USAGE STATISTICS & METRICS :: stats service
NOTIFICATIONS & UPDATES :: manage datasource
168. Open Science publishing
Supporting reuse/reproducibility and transparent evaluation
Diagram labels: Scientific process; Publishing; e-infra Tools & Services; Research data; Research methods; Research literature (articles, docs, white papers); Publication Repository; Data Repository; Method Repository; Package Repository; citation; Enabling Reproducibility; Enabling Transparent evaluation
Methods: e.g. software, workflows, protocols, algorithms,
scripts
169. Open Science as-a-Service (OSaaS)
Jan 2017- June 2019
Diagram labels: Catch-All Notification Broker (Subscribe & Receive Notification); Research Community Dashboard (Deposit-Search-Browse-Monitor-Research Impact); Aggregation; Researchers; Content Providers; Articles; Data; Projects; Methods; Packages
OpenAIRE as a mediation gateway between research
communities and the scientific communication
landscape