1. FAIR data requires FAIR ontologies,
how do we do it?
Clement Jonquet, PhD
Assistant Professor – LIRMM, University of Montpellier
Visiting Scholar at Stanford University
EUDAT Semantics Services in EOSC meeting
2. Like any data, ontologies need to be FAIR
• The FAIR principles have established the importance of using standard
vocabularies or ontologies to describe FAIR data and to facilitate
interoperability and reuse…
• Explosion of the number of ontologies/vocabularies
• Cumbersome to identify the ontologies we need and manage their overlap.
8. Ontology libraries, registries, repositories
• Ontology libraries defined as
• “a library system that offers various functions for managing, adapting and
standardizing groups of ontologies. It should fulfill the needs for re-use of
ontologies. In this sense, an ontology library system should be easily
accessible and offer efficient support for re-using existing relevant
ontologies and standardizing them based on upper-level ontologies and
ontology representation languages.” [Ding & Fensel, 2001]
• Ontology repositories defined as
• “a structured collection of ontologies (…) by using an Ontology Metadata
Vocabulary. References and relations between ontologies and their
modules build the semantic model of an ontology repository. Access to
resources is realized through semantically-enabled interfaces applicable
for humans and machines. Therefore a repository provides a formal query
language” [Hartmann, Palma, Gomez-Perez, 2009]
9. What are the ontology libraries out there?
• Ontology repositories / portal
• NCBO BioPortal
• Ontobee
• AberOWL
• EBI Ontology Lookup Service
• OKFN Linked Open Vocabularies
• ONKI Ontology Library Service
• MMI Ontology Registry and Repository
• ESIPportal
• AgroPortal
• SIFR BioPortal
• CISMEF HeTOP
• OntoHub
• Web indexes
• Watson, Swoogle, Sindice, Falcons
• Ontology libraries / listings (more or less updated)
• OBO Foundry
• WebProtégé
• Romulus
• DAML ontology library
• Colore
• FAO VEST Registry
• BioSharing
• DERI Vocabularies, OntologyDesignPatterns, Semanticweb.org, W3C Good ontologies
• Platform technology
• Mondeca ITM, LexEVS
• Abandoned projects
• Cupboard, Knoodl, Schemapedia, SchemaWeb, OntoSelect, OntoSearch, TONES
10. Focus on NCBO BioPortal: a “one stop shop” for biomedical ontologies
• Web repository for biomedical ontologies
• Make ontologies accessible and usable – abstraction on format, locations, structure, etc.
• Users can publish, download, browse, search, comment, align ontologies and use them for annotations both online and via a web services API.
11.
• Online support for ontology
• Peer review & notes
• Versioning
• Mapping
• Search
• Resources
• Annotation
• Open source technology
• Packaged in a “virtual appliance”
• Set up your own “bioportal” in a few hours
12. http://bioportal.bioontology.org
Ontology Services
• Search
• Traverse
• Comment
• Download
Widgets
• Tree-view
• Auto-complete
• Graph-view
Annotation
Data Access
Mapping Services
• Create
• Upload
• Download
Term recognition
Search “data” annotated with a given term
http://data.bioontology.org
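As a rough illustration of how the search and term recognition services could be called over REST, the sketch below only builds request URLs against http://data.bioontology.org. The `search` and `annotator` paths and the `q`, `text`, `ontologies` and `apikey` parameters reflect the public BioPortal API as commonly documented; the helper names and the key value are placeholders introduced here.

```python
from urllib.parse import urlencode

BASE_URL = "http://data.bioontology.org"  # REST endpoint shown on the slide


def build_url(path, api_key, **params):
    """Return a full request URL for a REST path plus query parameters."""
    query = urlencode({**params, "apikey": api_key})
    return f"{BASE_URL}/{path.lstrip('/')}?{query}"


def search_url(term, api_key):
    """URL for the ontology term search service."""
    return build_url("search", api_key, q=term)


def annotator_url(text, api_key, ontologies=()):
    """URL for the term recognition (Annotator) service.

    `ontologies` optionally restricts recognition to the given acronyms.
    """
    params = {"text": text}
    if ontologies:
        params["ontologies"] = ",".join(ontologies)
    return build_url("annotator", api_key, **params)
```

Calling `search_url("melanoma", "YOUR_API_KEY")` yields a URL that can be fetched with any HTTP client; a personal API key from the portal is required for real requests.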
13. Who has been reusing NCBO technology so far?
• Recently
• AgroPortal (http://agroportal.lirmm.fr) – agronomy, food, plant sciences, biodiversity
• SIFR/French BioPortal (http://bioportal.lirmm.fr) – French biomedical ontologies & terminologies
• BiblioPortal (http://biblio.ontoportal.org) – libraries and metadata standards
• EcoPortal – ongoing discussion with the Lifewatch/LTER projects for a more focused portal on ecology & biodiversity
• Historically
• NCI term browser (https://nciterms.nci.nih.gov) – BioPortal first, then LexEVS
• Open Ontology Repository (OOR) Initiative (http://www.oor.net) – now discontinued; also looked at OntoHub
• Marine Metadata Interoperability Ontology Registry and Repository (http://mmisw.org)
• ESIPPortal (Earth Science Information Partners - http://semanticportal.esipfed.org )
• And a few hospitals, research labs, with private data and specific needs (often in-house annotation)
15. SIFR: Semantic Indexing of French Biomedical Data Resources
http://www.lirmm.fr/sifr
• Ontology-based services to index, mine and retrieve French biomedical data
• In France, there is already a reference repository for medical terminologies, but nothing public for annotation
• Crucial need for tools & services for French biomedical data
16. A dedicated version of BioPortal for French ontologies
http://bioportal.lirmm.fr
26 monolingual ontologies/terminologies
• From the UMLS or EHTOP
• Cleaned and checked for use with the Annotator
C. Jonquet et al. SIFR BioPortal: French biomedical ontologies and terminologies available for semantic annotation. In 16th Journées Francophones d'Informatique Médicale, JFIM'16. Geneva, Switzerland, July 2016.
A. Tchechmedjiev, ..., C. Jonquet. Enhanced Functionalities for
Annotating and Indexing Clinical Text with the NCBO
Annotator+, Bioinformatics, January 2018.
17. AgroPortal: ontology repository for the agronomic domain
http://agroportal.lirmm.fr
• Develop and support a reference ontology repository
• Primary focus on agronomy & closely related domains (food, plant sciences and biodiversity)
• Reusing the NCBO BioPortal technology
• Avoid re-implementing what has been done; facilitate interoperability
• Reusing the scientific outcomes, experience & methods of the biomedical domain
• Enable straightforward use of agronomy ontologies
• Respect the requirements & specificities of the agronomic community
• Fully semantic web compliant infrastructure
• Enable new science
Jonquet, C., Toulet, A., Arnaud, E., Aubin, S., Dzalé Yeumo, E., Emonet, V., Graybeal, J., Laporte, M.-A., Musen, M.A., Pesce, V., Larmande, P.: AgroPortal: A vocabulary and ontology repository for agronomy. Comput. Electron. Agric. 144 (Jan 2018).
18. AgroPortal: an ontology repository for agronomy, food, plant sciences & biodiversity
http://agroportal.lirmm.fr
• Publish, search, download
• Browse, visualize
• Peer review
• Versioning
• Annotation
• Recommendation
• Mapping
• Notes
• Projects
80 ontologies, 95 candidates
5 driving use cases
~90 registered users
20. Harnessing the power of metadata to
facilitate the comprehension of the
agronomical ontology landscape
21. A new metadata model to better support description of ontologies and their relations
• Building a list of properties to describe ontologies
• Pick properties and relations from 23 existing vocabularies
• Existing properties in ontology repositories (especially BioPortal)
• Non-specific properties that “belong to the ontology”
346 relevant properties that could be used to describe ontologies
127 used to build a new metadata model inside AgroPortal
Sources of properties:
• Ontology repositories metadata
• Other interesting vocabularies (e.g., IDOT, PAV, SD, DOAP, …)
• Standards & relevant vocabularies (e.g., DC, DCAT, SKOS, OWL, PROV, OMV, VOID, VOAF, MOD …)
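To make the idea of describing an ontology with properties picked from several existing vocabularies more concrete, here is a minimal sketch. The record, its values and the helper function are hypothetical; the property names (`dct:title`, `dct:license`, `omv:acronym`, `omv:hasOntologyLanguage`, `pav:version`) exist in Dublin Core Terms, OMV and PAV, but this is not claimed to be the selection actually made in the AgroPortal model.

```python
from collections import defaultdict

# Hypothetical description of one ontology, using prefixed property names
# borrowed from existing vocabularies (values are made up for illustration).
DESCRIPTION = {
    "dct:title": "Example Crop Ontology",
    "dct:license": "https://creativecommons.org/licenses/by/4.0/",
    "omv:acronym": "EXCROP",
    "omv:hasOntologyLanguage": "OWL",
    "pav:version": "1.2",
}


def properties_by_vocabulary(description):
    """Group the property names of a description by vocabulary prefix."""
    grouped = defaultdict(list)
    for prop in description:
        prefix, _, local_name = prop.partition(":")
        grouped[prefix].append(local_name)
    return dict(grouped)
```

Grouping by prefix gives a quick view of which vocabularies a description relies on, which is the kind of bookkeeping needed when reviewing candidate properties across many vocabularies.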
22. AgroPortal landscape page
Display “per property”
• Global presentation of the properties
• Synthesis diagrams & listing
• Metadata automatically extracted from the files and authored by
us and the ontology developers
• Allows exploring the agronomical ontology landscape by
automatically aggregating the metadata fields of each ontology
in explicit visualizations (charts, term clouds and graphs).
Jonquet, C., Toulet, A., Dutta, B., Emonet, V.: Harnessing the power of unified
metadata in an ontology repository: the case of AgroPortal. Data Semant.
UNDER REVIEW.
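The "per property" aggregation behind such a landscape page can be sketched as a simple count of metadata values across ontologies. The records and field names below are invented for illustration and do not come from AgroPortal; the real page presumably works on the repository's own metadata model.

```python
from collections import Counter

# Hypothetical metadata records; field names and values are illustrative only.
ONTOLOGIES = [
    {"acronym": "ONTO_A", "license": "CC-BY", "format": "OWL", "language": ["en"]},
    {"acronym": "ONTO_B", "license": "CC-BY", "format": "SKOS", "language": ["en", "fr"]},
    {"acronym": "ONTO_C", "license": None, "format": "OWL", "language": ["fr"]},
]


def aggregate(records, prop):
    """Count the values of one metadata property across all records."""
    counts = Counter()
    for record in records:
        value = record.get(prop)
        if value is None:
            # Missing metadata is itself worth showing on a landscape chart.
            counts["(not specified)"] += 1
        elif isinstance(value, (list, tuple, set)):
            counts.update(value)  # multi-valued property: count each value
        else:
            counts[value] += 1
    return counts
```

The resulting counts per property (format, license, language, and so on) are what a chart or term cloud would then display.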
23. Generalizing this with MOD
• Metadata vocabulary for Ontology Description and publication
• 88 properties
• https://github.com/sifrproject/MOD-Ontology
• To be discussed within the RDA Vocabulary and Semantic Services Interest Group (VSSIG)
Dutta, B., … Jonquet, C.: New Generation Metadata vocabulary for
Ontology Description and Publication. 11th Metadata and
Semantics Research Conference, MTSR’17, Tallinn, Estonia (2017).
24. Challenges for ontology repositories
• Better ontology identification & selection (via ontology metadata)
• Multilingualism
• Ontology alignment (creation & use)
• Catching up with relevant data: annotations and linked data
• Generalized ontology-based services (keep quality while enabling horizontal studies)
• Scale to multiple domains and to the number/variety of ontologies
Jonquet, C.: Challenges for ontology repositories
and applications to biomedicine & agronomy.
Keynote at 4th Annual International Symposium on
Information Management and Big Data, SIMBig’17.
pp. 25–37, Lima, Peru (2017).
25. Scale to multiple domains and to the number/variety of ontologies
• There are more than 600 ontologies and more than 110 ontology views in BioPortal right now
• Mostly biology and medicine
• Overlaps with other domains
• Lots of upper level ontologies
• Lots of vocabularies
• Swoogle in 2007: “Search over 10.000 ontologies”
• Today, a Google search for “filetype:owl” returns around 34K results
26. Pool efforts, harmonize, enhance robustness
• No repository (except the web itself) will handle them all while keeping the level of features (and curation!)
• Avoid duplicating ontologies
• Connect repositories to one another
• Sharing the technology is the best way to guarantee long-term support and future development (more interoperability)
• Developers all around the world, different funders & support
• We should be able to make a new portal for another community in minutes
• What matters is the motivation of a community
• Technology must follow and make this easy
28. FAIR ontologies are needed for
the European Open Science Cloud
• What’s needed
• Standardization of ontologies (OWL, RDF and SKOS)
• Strong community to develop good quality ontologies
• Standardization of Ontology Repositories
• Next
• Semantic Lookup Services prototype (project with Yann Le Franc)
• Convergence with B2NOTE (annotation of chunks of text inside Web pages with the Annotators)
• Possible work with the EU project OpenMinTeD to make our ontologies accessible to their research infrastructure
• Collaboration on EcoPortal
• Toward a new standard for ontology metadata with the VSSIG
Will the EU take the lead on developing and maintaining ontology repositories for
Open Science in any domain?
AgroPortal reuses the outcomes of the biomedical domain: (i) to avoid re-developing technologies and tools that have already been designed and extensively used; (ii) to offer the same tools, services and formats in both domains to facilitate the interface and interaction between the domains, e.g., to enable a user to query BioPortal or AgroPortal without changing a line of code.
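One way to picture "without changing a line of code" is client code that takes the portal as a parameter, so only the base URL differs. In the sketch below, the BioPortal address comes from the slides, while the AgroPortal REST address, the `ontologies` path and the `acronym` field in the JSON response are assumptions based on the shared NCBO technology; the `Portal` class and function names are introduced here for illustration.

```python
import json
from dataclasses import dataclass
from urllib.parse import urlencode
from urllib.request import urlopen


@dataclass(frozen=True)
class Portal:
    name: str
    rest_url: str  # base URL of the portal's REST API


BIOPORTAL = Portal("BioPortal", "http://data.bioontology.org")
# Assumption: AgroPortal exposes the same REST API at this address.
AGROPORTAL = Portal("AgroPortal", "http://data.agroportal.lirmm.fr")


def ontologies_url(portal, api_key):
    """URL listing the ontologies hosted by the given portal."""
    return f"{portal.rest_url}/ontologies?{urlencode({'apikey': api_key})}"


def list_acronyms(portal, api_key):
    """Fetch the ontology list and return acronyms.

    Needs network access and a valid key; assumes a JSON array of
    records that each carry an "acronym" field.
    """
    with urlopen(ontologies_url(portal, api_key)) as response:
        records = json.load(response)
    return [record["acronym"] for record in records]
```

Switching from `list_acronyms(BIOPORTAL, key)` to `list_acronyms(AGROPORTAL, key)` changes only the argument, not the client logic, although each portal issues its own API keys.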