3. A FAIRLY short timeline
• January 2014 Workshop in Leiden (the Netherlands)
• 2014 Results on Force11 site: Guiding Principles for Findable,
Accessible, Interoperable and Re-usable Data Publishing version b1.0
• 15 March 2016 Article in ‘Scientific Data’
• 26 July 2016 H2020 Programme Guidelines
• December 2016 Webinar FAIR / repositories: discussion about
indicators of ‘FAIRness’
4. A bit longer timeline
December 2010 First discussions with selected scientists
October 2012 First data management course for PhD candidates
April 2014 Data management plan mandatory
(PhD projects and research groups)
May 2015 Data management support hub
including data librarian, code repository, ELN (electronic lab notebook)
2017 ? Guidelines on ownership
2017 ? Guidelines for data storage during research
5. What ‘FAIR’ does NOT want to be and what it wants to achieve
• It is NOT a specification
• It is NOT a syntax (it aims to be syntax agnostic)
• It is meant to precede technology and other implementation choices
• In my own words: these guidelines aim to create a research data
environment that is FAIR to machines and humans
6. F
to be findable
• F1. (meta)data are assigned a globally unique and
persistent identifier
• F2. data are described with rich metadata (defined by
R1 below)
• F3. metadata clearly and explicitly include the
identifier of the data it describes
• F4. (meta)data are registered or indexed in a
searchable resource
7. Proposed indicators F(indable)
• 1. No PID and no metadata/documentation
• 2. PID without or with insufficient* metadata
• 3. Sufficient* metadata without PID
• 4. PID with sufficient* metadata – information on data provenance
• 5. PID, rich metadata and additional documentation – additional
explanation of how data can be used
* Sufficient = enough metadata to understand what the data is about
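The five-step Findable scale above can be read as a small decision rule. The following is a minimal sketch, not part of the proposal itself: the function name, the metadata labels and the handling of the two cases the scale does not list (no PID with insufficient metadata, no PID with rich metadata) are assumptions made for illustration.

```python
# Hypothetical labels for metadata quality; "rich" stands for rich metadata
# plus additional documentation (level 5 on the slide).
METADATA_LEVELS = ("none", "insufficient", "sufficient", "rich")


def findable_score(has_pid: bool, metadata: str) -> int:
    """Return the proposed Findable indicator (1-5) for a dataset."""
    if metadata not in METADATA_LEVELS:
        raise ValueError(f"unknown metadata level: {metadata!r}")

    understandable = metadata in ("sufficient", "rich")

    if has_pid:
        if metadata == "rich":
            return 5  # PID, rich metadata and additional documentation
        # PID with sufficient metadata, or PID with none/insufficient metadata
        return 4 if understandable else 2

    # No PID. The scale only lists "no metadata" (1) and "sufficient" (3).
    # Assumption: insufficient metadata counts as 1, rich metadata as 3.
    return 3 if understandable else 1


if __name__ == "__main__":
    print(findable_score(True, "sufficient"))
```

Under these assumptions a dataset cannot score higher than 3 without a persistent identifier, however good its documentation is.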
8. F(indable) @ Wageningen
• Presently departments decide what data is published
• At best, data underlying publications are published (pressure from
journals helps a lot...)
• There are ongoing (series of) datasets that are only known to insiders
9. A
to be accessible
• A1. (meta)data are retrievable by their identifier using
a standardized communications protocol
• A1.1 the protocol is open, free, and universally
implementable
• A1.2 the protocol allows for an authentication and
authorization procedure, where necessary
• A2. metadata are accessible, even when the data are
no longer available
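As an illustration of A1, a DOI is one kind of identifier that can be retrieved over plain HTTPS through the public resolver at https://doi.org/. The sketch below only builds the resolver URL and makes no network call; the helper name and the simple validity check are assumptions, and the example DOI is that of the Scientific Data article listed in the references.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Turn a DOI into a URL that any HTTPS client can retrieve."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Rough sanity check only (an assumption, not a full DOI syntax check).
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"does not look like a DOI: {doi!r}")
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return DOI_RESOLVER + quote(doi, safe="/")


if __name__ == "__main__":
    print(doi_to_url("doi:10.1038/sdata.2016.18"))
```

HTTPS is open, free and universally implementable (A1.1), and a repository can still require a login behind the resolved landing page where necessary (A1.2).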
10. Proposed indicators A(ccessible)
1. No user license / unclear conditions of reuse / neither metadata nor
data are accessible
2. Metadata are accessible (even when the data are not or no longer
available)
3. User restrictions apply (of any kind, including privacy, commercial
interests, embargo period, etc.)
4. Public Access (after registration)
5. Open Access (unrestricted, CC0; perhaps also CC BY?)
11. Accessible @ Wageningen
• Probably the most important problem: who decides who can get
access (and who will grant the permission technically)
• We have been awaiting guidelines on ownership / usage rights for
three years.
12. I
to be interoperable
• I1. (meta)data use a formal, accessible, shared, and broadly
applicable language for knowledge representation.
• I2. (meta)data use vocabularies that follow FAIR principles
• I3. (meta)data include qualified references to other (meta)data
13. Proposed indicators I(nteroperable)
1. Proprietary, non-open format data
2. Proprietary format, accepted by DSA Certified Trusted Data
Repository
3. Non-proprietary, open format (= “preferred” or “archival” format)
4. Data is additionally harmonized/standardized, using standard
vocabularies
5. Data is additionally linked to other data to provide context
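The Interoperable scale reads as a ladder in which each step from level 3 onwards adds something to the previous one. A minimal sketch of that reading follows; the class, field and function names are hypothetical, and it is an assumption that levels 4 and 5 require an open format and that linking without standard vocabularies stays at level 3.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a dataset for scoring purposes."""
    open_format: bool
    accepted_by_certified_repository: bool = False
    uses_standard_vocabularies: bool = False
    linked_to_other_data: bool = False


def interoperable_score(d: Dataset) -> int:
    """Return the proposed Interoperable indicator (1-5)."""
    if not d.open_format:
        # Proprietary formats stop at level 2 (assumption: "additionally"
        # in levels 4 and 5 builds on the open format of level 3).
        return 2 if d.accepted_by_certified_repository else 1
    if not d.uses_standard_vocabularies:
        return 3
    return 5 if d.linked_to_other_data else 4


if __name__ == "__main__":
    print(interoperable_score(Dataset(open_format=True,
                                      uses_standard_vocabularies=True)))
```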
14. I(nteroperable) @ Wageningen
• In response to a blog post about this, the people working with
ontologies met for the first time
• Their main concerns:
• How to find the relevant ontologies
• Can we rely on them to justify investments (consistency, process of
maintenance)?
• H2020 coordinators have no clue what all this is about
15. R
to be reusable
• R1. (meta)data are richly described with a plurality of
accurate and relevant attributes
• R1.1. (meta)data are released with a clear and
accessible data usage license
• R1.2. (meta)data are associated with detailed
provenance
• R1.3. (meta)data meet domain-relevant community
standards
Also in F4
Also in F2, I1
Also in I1
16. Proposed indicators R(e-usable)
“First we attempted to operationalise R – Re-usable as well ... but we
changed our mind.
Reusable – is it a separate dimension? Partly subjective: it
depends on what you want to use the data for!”
18. References
Guiding principles for findable, accessible, interoperable and re-usable data publishing version B1.0
https://www.force11.org/fairprinciples
The FAIR Guiding Principles for scientific data management and stewardship
https://www.nature.com/articles/sdata201618
Guidelines on FAIR Data Management in Horizon 2020
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
FAIR Data in Trustworthy Data Repositories Webinar
https://eudat.eu/events/webinar/fair-data-in-trustworthy-data-repositories-webinar
Two blogs about FAIR @ Wageningen
• https://weblog.wur.eu/openscience/can-wageningen-fair/
• https://weblog.wur.eu/openscience/vocabularies-and-the-i-in-fair-data-principles/