Semantic Mapping in
CLARIN Component Metadata
Matej Durco
Institute for Corpus Linguistics and Text Technology
matej.durco@assoc.oeaw.ac.at

Menzo Windhouwer
The Language Archive - DANS
menzo.windhouwer@dans.knaw.nl
MTSR 2013
Thessaloniki, Greece
Outline
• CLARIN: a European infrastructure for language resources
• Component Metadata Infrastructure (CMDI)
• Semantic Mapping in CMDI
• Semantic mapping in the CLARIN joint metadata domain
• Conclusions and future work
CLARIN
• CLARIN = Common Language Resources and Technology Infrastructure, a European ESFRI infrastructure project
• Aims to give scholars in the humanities and social sciences easy and sustainable access to digital language data (in written, spoken, video or multimodal form) and to advanced tools to discover, explore, exploit, annotate, analyze or combine them, independent of where they are located.
• Is building a networked federation of European data repositories, service centers and centers of expertise.
• One pillar of this infrastructure is a joint metadata domain
http://www.clarin.eu/
Component Metadata Infrastructure
Rationale for CMDI
• Limitations of existing metadata schemas (OLAC/DCMI, IMDI, TEI header):
  ◦ Inflexible: too many (IMDI) or too few (OLAC) metadata elements
  ◦ Limited interoperability (both semantic and syntactic)
  ◦ Problematic (unfamiliar) terminology for some sub-communities
  ◦ Limited support for descriptions of LT tools & services
• CMDI addresses this by:
  ◦ Explicitly defined schema & semantics
  ◦ User/project/community defined components
http://www.clarin.eu/cmdi/
CMDI - example
Let's describe a speech recording
[Diagram, built up over several slides: the description is assembled from reusable components, each grouping related elements]
• Technical Metadata: Sample frequency, Format, Size, …
• Language: Name, Id, …
• Actor: Name, Age, Sex, Language, …
• Location: Continent, Country, Address, …
• Project: Name, Contact, …
The components Project, Location, Actor, Language and Technical Metadata are combined into a Metadata Profile. The profile corresponds to a metadata schema (W3C XML Schema), and a metadata description is an XML document.
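The composition shown in the example can be sketched in a few lines of Python. This is only an illustrative model, not the CMDI component format or any CLARIN API: the `Component` class, the `element_paths` helper and the profile name `SpeechRecording` are assumptions made for this sketch, while the component and element names are taken from the slides.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Component:
    """A reusable metadata building block (hypothetical model, not the CMDI API)."""
    name: str
    elements: list[str] = field(default_factory=list)
    children: list[Component] = field(default_factory=list)

    def element_paths(self, prefix: str = "") -> list[str]:
        """Return a dotted path for every element in this component tree."""
        here = f"{prefix}.{self.name}" if prefix else self.name
        paths = [f"{here}.{element}" for element in self.elements]
        for child in self.children:
            paths.extend(child.element_paths(here))
        return paths


# Components from the slides; Language is reused inside Actor.
language = Component("Language", ["Name", "Id"])
actor = Component("Actor", ["Name", "Age", "Sex"], [language])
technical = Component("TechnicalMetadata", ["SampleFrequency", "Format", "Size"])
location = Component("Location", ["Continent", "Country", "Address"])
project = Component("Project", ["Name", "Contact"])

# The profile is just the top-level component (the name is an assumption).
profile = Component(
    "SpeechRecording",
    children=[project, location, actor, language, technical],
)
paths = profile.element_paths()
```

Because `Language` is reused, the same element shows up under two different paths (`SpeechRecording.Language.Name` and `SpeechRecording.Actor.Language.Name`), which is one reason the enclosing context matters when searching.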
CMDI - workflow
[Diagram: CMDI workflow]
• Parts of the infrastructure: component registry & editor, ISOcat, Relation Registry, metadata editor, Local metadata repository (OAI-PMH data provider), Joint metadata repository (OAI-PMH service provider), metadata catalogue, search & semantic mapping, DATA
• Roles: metadata modeler, metadata creator, metadata curator, metadata user
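The harvesting step between a local and the joint metadata repository uses OAI-PMH, in which the service provider sends plain HTTP requests to the data provider. A minimal sketch of building such a request follows. `ListRecords` and `metadataPrefix` are part of the OAI-PMH protocol; the endpoint URL is a placeholder, and the prefix value `cmdi` is an assumption, since each data provider advertises the prefixes it actually supports.

```python
from urllib.parse import urlencode


def list_records_url(base_url: str, metadata_prefix: str = "cmdi") -> str:
    """Build an OAI-PMH ListRecords request URL for a data provider."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


# Placeholder endpoint; a real harvester would take this from its provider list.
url = list_records_url("https://repository.example.org/oai")
print(url)
```

A real harvester would also follow the resumption tokens returned by the data provider; that part is left out here.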
Semantic Mapping in CMDI
• A CMD component, element or value should be linked to a ‘concept’, i.e., a URI that points to a semantic description
  ◦ ‘Concepts’ can be shared, indicating shared semantics
• Current components mainly use:
  ◦ Dublin Core elements or terms
  ◦ ISOcat Data Categories
• ISOcat (www.isocat.org) is an ISO 12620:2009 compliant Data Category Registry
  ◦ allows elaborate specifications, e.g., a definition, (alternative) names, examples, explanations, value domains (all in various languages)
  ◦ can be freely used by anyone, including for the creation of new data categories
  ◦ the Athens Core group has created many metadata data categories inspired by OLAC, TEI Header and IMDI
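The idea that shared ‘concepts’ indicate shared semantics can be illustrated with a small lookup. The profile names and element paths below are invented for this sketch; the concept URIs are ISOcat data categories named elsewhere in this deck (DC-2482 language ID, DC-2484 language name, DC-4115 author). The helper names are assumptions, not part of any CLARIN tool.

```python
from collections import defaultdict

# (profile, element path) -> concept URI. Profiles and paths are hypothetical.
CONCEPT_LINKS = {
    ("SpeechRecording", "Language.Id"): "http://www.isocat.org/datcat/DC-2482",
    ("SpeechRecording", "Language.Name"): "http://www.isocat.org/datcat/DC-2484",
    ("Dictionary", "Language.Id"): "http://www.isocat.org/datcat/DC-2482",
    ("Dictionary", "Author"): "http://www.isocat.org/datcat/DC-4115",
}


def elements_by_concept(links):
    """Group elements by the concept they are linked to."""
    grouped = defaultdict(list)
    for element, concept in links.items():
        grouped[concept].append(element)
    return dict(grouped)


def shared_concepts(links):
    """Return the concepts that are used by more than one profile."""
    shared = {}
    for concept, elements in elements_by_concept(links).items():
        profiles = sorted({profile for profile, _path in elements})
        if len(profiles) > 1:
            shared[concept] = profiles
    return shared


shared = shared_concepts(CONCEPT_LINKS)
```

Here only DC-2482 is reported as shared, because it is the only concept linked from both hypothetical profiles.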
Semantic Mapping in CMDI
[Diagram: the Name and Id elements of the Language component, which is also shown in a second context next to Dictionary and Author, are linked to entries in a Semantic Registry]
Language Name: A human understandable name of the language that ...
Language ID: Identifier of the language as defined by ISO 639 that …
Semantic Mapping in CMDI
• Because multiple ‘concept’ registries are in use, and some of them are open, (almost) same-as relationships have to be specified
• RELcat (under development) is a Relation Registry that allows these relationships to be stored in sets, which may be user or community specific
[Diagram: example relations]
  language ID (isocat:DC-2482) and language name (isocat:DC-2484) are both linked to dc:language
  time coverage (isocat:DC-1502) relcat:subClassOf dc:coverage
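A relation set like the one in the diagram can be pictured as a list of triples that a search component walks to collect related concepts. The sketch below is not the RELcat data model or API. The predicate `relcat:subClassOf` appears on the slide; `relcat:sameAs` is an assumed name for the (almost) same-as links, and following every relation in both directions is a simplification chosen here to favour recall.

```python
# Relation set as (subject, predicate, object) triples.
# "relcat:sameAs" is an assumed predicate name; "relcat:subClassOf" is from the slide.
RELATIONS = [
    ("isocat:DC-2482", "relcat:sameAs", "dc:language"),
    ("isocat:DC-2484", "relcat:sameAs", "dc:language"),
    ("isocat:DC-1502", "relcat:subClassOf", "dc:coverage"),
]


def expand(concept, relations):
    """Return the concept plus every concept reachable through the relation set.

    Relations are followed in both directions, regardless of predicate.
    """
    seen = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for subject, _predicate, obj in relations:
            if subject == current:
                neighbour = obj
            elif obj == current:
                neighbour = subject
            else:
                continue
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append(neighbour)
    return seen


language_concepts = expand("isocat:DC-2482", RELATIONS)
coverage_concepts = expand("dc:coverage", RELATIONS)
```

Note that expanding from language ID also reaches language name via `dc:language`. That widens the search but loses precision, which is why such relations are only "almost" same-as and why keeping them in separate, user-selectable sets is useful.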
CMDI in CLARIN
                                  2011-01  2012-06  2013-01  2013-06
Profiles                               40       53       87      124
Components                            164      298      542      828
Elements                              511      893     1505     2399
Distinct Data Categories (DCs)        203      266      436      499
Metadata DCs                          277      712      774      791
% Elements w/o DCs                  24.7%    17.6%    21.5%    26.5%

• CMD profiles for existing metadata schemas like OLAC/DCMI, TEI Header and META-SHARE have been created
• Profiles differ a lot in structure:
  ◦ Small and flat profiles with 5 to 10 elements
  ◦ Large and complex profiles of up to 10 component levels with hundreds of elements
• Around half a million CMD records are harvested from around 70 providers
http://catalog.clarin.eu/vlo/
CMD Semantic Mapping in CLARIN
• 791 metadata Data Categories
  ◦ 222 from Athens Core (recommended)
• 2 showcases (of very common concepts):
  ◦ Language
  ◦ Name
• SMC (Semantic Mapping Component) Browser
  ◦ http://clarin.aac.ac.at/smc-browser
  ◦ Allows the metadata modeller to explore the semantic overlap between profiles, components and elements in an interactive graph
CMD Semantic Mapping in CLARIN
• Language
  ◦ LanguageID (http://www.isocat.org/datcat/DC-2482)
  ◦ languageName (http://www.isocat.org/datcat/DC-2484)
  ◦ Linked in the Relation Registry with the Dublin Core term language
  ◦ http://lux13.mpi.nl/relcat/set/cmdi (graph)
• Together these ‘concepts’ are linked with 80 profiles
• Other related language Data Categories could be considered
  ◦ sourceLanguage, languageMother
• The Relation Registry allows these to be included to maximize the recall for a specific language
CMD Semantic Mapping in CLARIN
• Name
  ◦ Is a more ambiguous term, used by 72 CMD elements
  ◦ 12 different Data Categories are used by these elements, e.g.:
    resourceName (http://www.isocat.org/datcat/DC-2544)
    resourceTitle (http://www.isocat.org/datcat/DC-2545)
    author (http://www.isocat.org/datcat/DC-4115)
    contact full name (http://www.isocat.org/datcat/DC-2454)
    dcterms:Contributor
    ...
• A naive search on ‘name’ would yield semantically very heterogeneous results; instead use
  ◦ The ‘concept’ links
  ◦ Context, i.e., the enclosing components of an element
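The difference between a naive label search and a search by concept link or by context can be sketched as follows. The `Element` class, the three search helpers and the element paths are hypothetical and only serve the illustration; the concept URIs are the ISOcat data categories listed above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    path: tuple[str, ...]   # enclosing components, ending in the element name
    concept: str | None     # concept URI, or None if the element has no link


# Hypothetical elements; the URIs are the data categories named on the slide.
ELEMENTS = [
    Element(("Resource", "Name"), "http://www.isocat.org/datcat/DC-2544"),
    Element(("Resource", "Title"), "http://www.isocat.org/datcat/DC-2545"),
    Element(("Contact", "Name"), "http://www.isocat.org/datcat/DC-2454"),
    Element(("Actor", "Name"), None),
]


def by_label(elements, label):
    """Naive search: match on the element name only."""
    return [e for e in elements if e.path[-1].lower() == label.lower()]


def by_concept(elements, concept):
    """Search via the concept link, independent of the element name."""
    return [e for e in elements if e.concept == concept]


def by_context(elements, label, component):
    """Search on the element name, restricted to an enclosing component."""
    return [e for e in by_label(elements, label) if component in e.path[:-1]]


naive = by_label(ELEMENTS, "name")
resource_names = by_concept(ELEMENTS, "http://www.isocat.org/datcat/DC-2544")
actor_names = by_context(ELEMENTS, "name", "Actor")
```

The naive search mixes resource, contact and actor names, while the concept link selects exactly one meaning. The context filter still works for the element that has no concept link at all.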
Conclusion & future work
• The CMD Infrastructure is very flexible with regard to metadata structures, but also provides an integrated semantic layer to achieve semantic interoperability
• All the necessary registries are in place and have proven useful, e.g., in the central CLARIN catalogue
  ◦ Users can search and navigate the metadata based on semantics and are not directly confronted with the structural diversity
  ◦ Future work: sometimes more context is needed for disambiguation
• However, for metadata modellers the perceived proliferation of reusable profiles and components can be a burden
  ◦ The SMC browser already gives insight into (semantic) overlap and differences
  ◦ Future work: statistics based on the instance data will also help to select among profiles and components

 
Call Girl Coimbatore Prisha☎️ 8250192130 Independent Escort Service Coimbatore
Call Girl Coimbatore Prisha☎️  8250192130 Independent Escort Service CoimbatoreCall Girl Coimbatore Prisha☎️  8250192130 Independent Escort Service Coimbatore
Call Girl Coimbatore Prisha☎️ 8250192130 Independent Escort Service Coimbatorenarwatsonia7
 
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escorts
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore EscortsCall Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escorts
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escortsvidya singh
 
VIP Service Call Girls Sindhi Colony 📳 7877925207 For 18+ VIP Call Girl At Th...
VIP Service Call Girls Sindhi Colony 📳 7877925207 For 18+ VIP Call Girl At Th...VIP Service Call Girls Sindhi Colony 📳 7877925207 For 18+ VIP Call Girl At Th...
VIP Service Call Girls Sindhi Colony 📳 7877925207 For 18+ VIP Call Girl At Th...jageshsingh5554
 
Call Girls Service Surat Samaira ❤️🍑 8250192130 👄 Independent Escort Service ...
Call Girls Service Surat Samaira ❤️🍑 8250192130 👄 Independent Escort Service ...Call Girls Service Surat Samaira ❤️🍑 8250192130 👄 Independent Escort Service ...
Call Girls Service Surat Samaira ❤️🍑 8250192130 👄 Independent Escort Service ...CALL GIRLS
 
VIP Mumbai Call Girls Hiranandani Gardens Just Call 9920874524 with A/C Room ...
VIP Mumbai Call Girls Hiranandani Gardens Just Call 9920874524 with A/C Room ...VIP Mumbai Call Girls Hiranandani Gardens Just Call 9920874524 with A/C Room ...
VIP Mumbai Call Girls Hiranandani Gardens Just Call 9920874524 with A/C Room ...Garima Khatri
 
Call Girls Ludhiana Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Ludhiana Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 9907093804 Top Class Call Girl Service AvailableDipal Arora
 
Book Paid Powai Call Girls Mumbai 𖠋 9930245274 𖠋Low Budget Full Independent H...
Book Paid Powai Call Girls Mumbai 𖠋 9930245274 𖠋Low Budget Full Independent H...Book Paid Powai Call Girls Mumbai 𖠋 9930245274 𖠋Low Budget Full Independent H...
Book Paid Powai Call Girls Mumbai 𖠋 9930245274 𖠋Low Budget Full Independent H...Call Girls in Nagpur High Profile
 

Recently uploaded (20)

Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...
Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...
Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...
 
VIP Russian Call Girls in Varanasi Samaira 8250192130 Independent Escort Serv...
VIP Russian Call Girls in Varanasi Samaira 8250192130 Independent Escort Serv...VIP Russian Call Girls in Varanasi Samaira 8250192130 Independent Escort Serv...
VIP Russian Call Girls in Varanasi Samaira 8250192130 Independent Escort Serv...
 
Call Girls Aurangabad Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Aurangabad Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Aurangabad Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Aurangabad Just Call 9907093804 Top Class Call Girl Service Available
 
Call Girl Number in Panvel Mumbai📲 9833363713 💞 Full Night Enjoy
Call Girl Number in Panvel Mumbai📲 9833363713 💞 Full Night EnjoyCall Girl Number in Panvel Mumbai📲 9833363713 💞 Full Night Enjoy
Call Girl Number in Panvel Mumbai📲 9833363713 💞 Full Night Enjoy
 
Chandrapur Call girls 8617370543 Provides all area service COD available
Chandrapur Call girls 8617370543 Provides all area service COD availableChandrapur Call girls 8617370543 Provides all area service COD available
Chandrapur Call girls 8617370543 Provides all area service COD available
 
Night 7k to 12k Chennai City Center Call Girls 👉👉 7427069034⭐⭐ 100% Genuine E...
Night 7k to 12k Chennai City Center Call Girls 👉👉 7427069034⭐⭐ 100% Genuine E...Night 7k to 12k Chennai City Center Call Girls 👉👉 7427069034⭐⭐ 100% Genuine E...
Night 7k to 12k Chennai City Center Call Girls 👉👉 7427069034⭐⭐ 100% Genuine E...
 
High Profile Call Girls Coimbatore Saanvi☎️ 8250192130 Independent Escort Se...
High Profile Call Girls Coimbatore Saanvi☎️  8250192130 Independent Escort Se...High Profile Call Girls Coimbatore Saanvi☎️  8250192130 Independent Escort Se...
High Profile Call Girls Coimbatore Saanvi☎️ 8250192130 Independent Escort Se...
 
Call Girls Bhubaneswar Just Call 9907093804 Top Class Call Girl Service Avail...
Call Girls Bhubaneswar Just Call 9907093804 Top Class Call Girl Service Avail...Call Girls Bhubaneswar Just Call 9907093804 Top Class Call Girl Service Avail...
Call Girls Bhubaneswar Just Call 9907093804 Top Class Call Girl Service Avail...
 
Low Rate Call Girls Kochi Anika 8250192130 Independent Escort Service Kochi
Low Rate Call Girls Kochi Anika 8250192130 Independent Escort Service KochiLow Rate Call Girls Kochi Anika 8250192130 Independent Escort Service Kochi
Low Rate Call Girls Kochi Anika 8250192130 Independent Escort Service Kochi
 
💎VVIP Kolkata Call Girls Parganas🩱7001035870🩱Independent Girl ( Ac Rooms Avai...
💎VVIP Kolkata Call Girls Parganas🩱7001035870🩱Independent Girl ( Ac Rooms Avai...💎VVIP Kolkata Call Girls Parganas🩱7001035870🩱Independent Girl ( Ac Rooms Avai...
💎VVIP Kolkata Call Girls Parganas🩱7001035870🩱Independent Girl ( Ac Rooms Avai...
 
VIP Call Girls Indore Kirti 💚😋 9256729539 🚀 Indore Escorts
VIP Call Girls Indore Kirti 💚😋  9256729539 🚀 Indore EscortsVIP Call Girls Indore Kirti 💚😋  9256729539 🚀 Indore Escorts
VIP Call Girls Indore Kirti 💚😋 9256729539 🚀 Indore Escorts
 
Russian Call Girls in Delhi Tanvi ➡️ 9711199012 💋📞 Independent Escort Service...
Russian Call Girls in Delhi Tanvi ➡️ 9711199012 💋📞 Independent Escort Service...Russian Call Girls in Delhi Tanvi ➡️ 9711199012 💋📞 Independent Escort Service...
Russian Call Girls in Delhi Tanvi ➡️ 9711199012 💋📞 Independent Escort Service...
 
VIP Call Girls Tirunelveli Aaradhya 8250192130 Independent Escort Service Tir...
VIP Call Girls Tirunelveli Aaradhya 8250192130 Independent Escort Service Tir...VIP Call Girls Tirunelveli Aaradhya 8250192130 Independent Escort Service Tir...
VIP Call Girls Tirunelveli Aaradhya 8250192130 Independent Escort Service Tir...
 
Call Girl Coimbatore Prisha☎️ 8250192130 Independent Escort Service Coimbatore
Call Girl Coimbatore Prisha☎️  8250192130 Independent Escort Service CoimbatoreCall Girl Coimbatore Prisha☎️  8250192130 Independent Escort Service Coimbatore
Call Girl Coimbatore Prisha☎️ 8250192130 Independent Escort Service Coimbatore
 
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escorts
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore EscortsCall Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escorts
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escorts
 
VIP Service Call Girls Sindhi Colony 📳 7877925207 For 18+ VIP Call Girl At Th...
VIP Service Call Girls Sindhi Colony 📳 7877925207 For 18+ VIP Call Girl At Th...VIP Service Call Girls Sindhi Colony 📳 7877925207 For 18+ VIP Call Girl At Th...
VIP Service Call Girls Sindhi Colony 📳 7877925207 For 18+ VIP Call Girl At Th...
 
Call Girls Service Surat Samaira ❤️🍑 8250192130 👄 Independent Escort Service ...
Call Girls Service Surat Samaira ❤️🍑 8250192130 👄 Independent Escort Service ...Call Girls Service Surat Samaira ❤️🍑 8250192130 👄 Independent Escort Service ...
Call Girls Service Surat Samaira ❤️🍑 8250192130 👄 Independent Escort Service ...
 
VIP Mumbai Call Girls Hiranandani Gardens Just Call 9920874524 with A/C Room ...
VIP Mumbai Call Girls Hiranandani Gardens Just Call 9920874524 with A/C Room ...VIP Mumbai Call Girls Hiranandani Gardens Just Call 9920874524 with A/C Room ...
VIP Mumbai Call Girls Hiranandani Gardens Just Call 9920874524 with A/C Room ...
 
Call Girls Ludhiana Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Ludhiana Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 9907093804 Top Class Call Girl Service Available
 
Book Paid Powai Call Girls Mumbai 𖠋 9930245274 𖠋Low Budget Full Independent H...
Book Paid Powai Call Girls Mumbai 𖠋 9930245274 𖠋Low Budget Full Independent H...Book Paid Powai Call Girls Mumbai 𖠋 9930245274 𖠋Low Budget Full Independent H...
Book Paid Powai Call Girls Mumbai 𖠋 9930245274 𖠋Low Budget Full Independent H...
 

Semantic Mapping in CLARIN Component Metadata.

  • 1. Semantic Mapping in CLARIN Component Metadata Matej Durco Institute for Corpus Linguistics and Text Technology matej.durco@assoc.oeaw.ac.at Menzo Windhouwer The Language Archive - DANS menzo.windhouwer@dans.knaw.nl MTSR 2013 Thessaloniki, Greece
  • 2. Outline
      • CLARIN, a European infrastructure for language resources
      • Component Metadata Infrastructure (CMDI)
      • Semantic Mapping in CMDI
      • Semantic mapping in the CLARIN joint metadata domain
      • Conclusions and future work
  • 3. CLARIN
      • CLARIN = Common Language Resources and Technology Infrastructure = a European ESFRI infrastructure project
      • Aims at providing easy and sustainable access for scholars in the humanities and social sciences to digital language data (in written, spoken, video or multimodal form) and advanced tools to discover, explore, exploit, annotate, analyze or combine them, independent of where they are located.
      • Building a networked federation of European data repositories, service centers and centers of expertise.
      • One pillar of this infrastructure is a joint metadata domain
      • http://www.clarin.eu/
  • 4. Component Metadata Infrastructure: Rationale for CMDI
      • Limitations of existing metadata schemas (OLAC/DCMI, IMDI, TEI header)
          • Inflexible: too many (IMDI) or too few (OLAC) metadata elements
          • Limited interoperability (both semantic and syntactic)
          • Problematic (unfamiliar) terminology for some sub-communities
          • Limited support for LT tool & service descriptions
      • CMDI addresses this by:
          • Explicitly defined schema & semantics
          • User/project/community defined components
      • http://www.clarin.eu/cmdi/
  • 5. CMDI - example: Let's describe a speech recording
      • Technical Metadata: Sample frequency, Format, Size, …
  • 6. CMDI - example: Let's describe a speech recording
      • Language: Name, Id, …
      • Technical Metadata
  • 7. CMDI - example: Let's describe a speech recording
      • Actor: Name, Age, Sex, Language, …
      • Language
      • Technical Metadata
  • 10. CMDI - example: Let's describe a speech recording
      • Components: Project, Location, Actor, Language, Technical Metadata
      • Diagram labels: Metadata Profile; Metadata schema (W3C XML Schema); Metadata description (XML document)
  • 11. CMDI - workflow
      • Diagram labels: metadata catalogue; component registry & editor; ISOcat; Relation Registry; search & semantic mapping; metadata editor; Joint metadata repository; Local metadata repository; OAI-PMH Service provider; OAI-PMH Data provider; DATA
      • Roles: metadata modeler, metadata user, metadata curator, metadata creator
  • 12. Semantic Mapping in CMDI
      • A CMD component, element or value should be linked to a ‘concept’, i.e., a URI that points to a semantic description
      • ‘Concepts’ can be shared, indicating shared semantics
      • Current components use mainly:
          • Dublin Core elements or terms
          • ISOcat Data Categories
      • ISOcat (www.isocat.org) is an ISO 12620:2009 compliant Data Category Registry
          • allows elaborate specifications, e.g., a definition, (alternative) names, examples, explanations, value domains (all in various languages)
          • can be freely used by anyone, including the creation of new data categories
          • the Athens Core group has created many metadata data categories inspired by OLAC, TEI Header and IMDI
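
The idea that shared 'concept' links indicate shared semantics can be sketched in a few lines of Python. This is only an illustration: the XML fragment is a simplified, made-up component specification, it assumes the `ConceptLink` attribute used in CMDI component specifications, and the component and element names are hypothetical. The two ISOcat URIs are the ones the deck cites for languageID and languageName.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Simplified, hypothetical component specification (not a real registry entry).
SPEC = """
<CMD_Component name="SpeechRecording">
  <CMD_Component name="Language">
    <CMD_Element name="Name" ConceptLink="http://www.isocat.org/datcat/DC-2484"/>
    <CMD_Element name="Id" ConceptLink="http://www.isocat.org/datcat/DC-2482"/>
  </CMD_Component>
  <CMD_Component name="Actor">
    <CMD_Element name="Language" ConceptLink="http://www.isocat.org/datcat/DC-2484"/>
    <CMD_Element name="Age"/>
  </CMD_Component>
</CMD_Component>
"""


def elements_by_concept(spec_xml):
    """Map each concept URI to the dotted paths of the elements linked to it."""
    index = defaultdict(list)

    def walk(node, path):
        here = path + [node.get("name")]
        if node.tag == "CMD_Element":
            # Elements without a concept link are collected under the key None.
            index[node.get("ConceptLink")].append(".".join(here))
        for child in node:
            walk(child, here)

    walk(ET.fromstring(spec_xml), [])
    return dict(index)


index = elements_by_concept(SPEC)
for concept, paths in index.items():
    print(concept, paths)
```

Two elements with different names (`Language.Name` and `Actor.Language`) end up under the same concept URI, which is what makes them comparable; the element grouped under `None` corresponds to the "elements without DCs" counted in the statistics slide.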
  • 13. Semantic Mapping in CMDI
      • Language component with elements Name, Id, … linked to entries in the Semantic Registry:
          • Language Name: A human understandable name of the language that ...
          • Language ID: Identifier of the language as defined by ISO 639 that …
      • Other diagram labels: Language, Dictionary, Author, …
  • 14. Semantic Mapping in CMDI
      • Because multiple ‘concept’ registries are in use, and some of them are open, (almost) same-as relationships have to be specified
      • RELcat (under development) is a Relation Registry that allows these relationships to be stored in sets, which may be user or community specific
      • Diagram:
          • language ID (isocat:DC-2482) and language name (isocat:DC-2484) are linked to dc:language
          • time coverage (isocat:DC-1502) relcat:subClassOf dc:coverage
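
A minimal sketch of how such a relation set could be used to widen a search, written in Python. It is not RELcat's actual data model or API: the triple layout, the predicate name `sameAs` (the slide only says "(almost) same-as") and the function name `expand` are assumptions made for illustration. The concept identifiers and the `subClassOf` relation are taken from the slide.

```python
from collections import defaultdict

# One relation set as (subject, predicate, object) triples.
# "sameAs" is an assumed predicate name standing in for "(almost) same-as".
CMDI_SET = [
    ("isocat:DC-2482", "sameAs", "dc:language"),      # language ID
    ("isocat:DC-2484", "sameAs", "dc:language"),      # language name
    ("isocat:DC-1502", "subClassOf", "dc:coverage"),  # time coverage
]


def expand(concept, relations):
    """Return the concept plus every concept a search for it should also match."""
    same = defaultdict(set)      # sameAs is followed in both directions
    children = defaultdict(set)  # subClassOf is followed from parent to child only
    for subj, pred, obj in relations:
        if pred == "sameAs":
            same[subj].add(obj)
            same[obj].add(subj)
        elif pred == "subClassOf":
            children[obj].add(subj)

    seen = set()
    todo = [concept]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        todo.extend(same[current])
        todo.extend(children[current])
    return seen


print(sorted(expand("dc:language", CMDI_SET)))
print(sorted(expand("dc:coverage", CMDI_SET)))
print(sorted(expand("isocat:DC-1502", CMDI_SET)))
```

Treating `subClassOf` as one-directional is a design choice in this sketch: a search for the broader `dc:coverage` also matches time coverage, but a search for time coverage does not pull in everything tagged with `dc:coverage`. Keeping relations in separate sets lets a user or community swap in a stricter or looser notion of equivalence.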
  • 15. CMDI in CLARIN

                                        2011-01  2012-06  2013-01  2013-06
        Profiles                             40       53       87      124
        Components                          164      298      542      828
        Elements                            511      893     1505     2399
        Distinct Data Categories (DCs)      203      266      436      499
        Metadata DCs                        277      712      774      791
        % Elements w/o DCs                24.7%    17.6%    21.5%    26.5%

      • CMD profiles for existing metadata schemas like OLAC/DCMI, TEI Header and META-SHARE have been created
      • Profiles differ a lot in structure:
          • Small and flat profiles with 5 to 10 elements
          • Large and complex profiles of up to 10 component levels with hundreds of elements
      • Around half a million CMD records are harvested from around 70 providers
      • http://catalog.clarin.eu/vlo/
  • 16. CMD Semantic Mapping in CLARIN
      • 791 metadata Data Categories
          • 222 from Athens Core (recommended)
      • 2 showcases (of very common concepts):
          • Language
          • Name
      • SMC (Semantic Mapping Component) Browser
          • http://clarin.aac.ac.at/smc-browser
          • Allows the metadata modeller to explore the semantic overlap between profiles, components and elements in an interactive graph
  • 17. CMD Semantic Mapping in CLARIN
      • Language
          • LanguageID (http://www.isocat.org/datcat/DC-2482)
          • languageName (http://www.isocat.org/datcat/DC-2484)
          • Linked in the Relation Registry with the Dublin Core term language
              • http://lux13.mpi.nl/relcat/set/cmdi (graph)
          • Together these ‘concepts’ are linked with 80 profiles
          • Other related language Data Categories could be considered
              • sourceLanguage, languageMother
              • The Relation Registry allows them to be included to maximize the recall for a specific language
  • 18. CMD Semantic Mapping in CLARIN
  • 19. CMD Semantic Mapping in CLARIN
      • Name
          • Is a more ambiguous term, used by 72 CMD elements
          • 12 different Data Categories are used by these elements
              • resourceName (http://www.isocat.org/datcat/DC-2544)
              • resourceTitle (http://www.isocat.org/datcat/DC-2545)
              • author (http://www.isocat.org/datcat/DC-4115)
              • contact full name (http://www.isocat.org/datcat/DC-2454)
              • dcterms:Contributor
              • ...
          • A naive search on ‘name’ would yield semantically very heterogeneous results; instead use
              • The ‘concept’ links
              • Context, i.e., the enclosing components of an element
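
The difference between a naive search on 'name' and a search that uses concept links or context can be sketched in Python as follows. The element paths and function names are hypothetical, invented for this illustration; only the four ISOcat Data Category URIs come from the slide.

```python
ISOCAT = "http://www.isocat.org/datcat/"

# Hypothetical CMD elements: a component path ending in the element name,
# plus the concept the element is linked to.
ELEMENTS = [
    {"path": ("Corpus", "GeneralInfo", "Name"), "concept": ISOCAT + "DC-2544"},  # resourceName
    {"path": ("Corpus", "Contact", "Name"), "concept": ISOCAT + "DC-2454"},      # contact full name
    {"path": ("Book", "Author", "Name"), "concept": ISOCAT + "DC-4115"},         # author
    {"path": ("Book", "Title"), "concept": ISOCAT + "DC-2545"},                  # resourceTitle
]


def naive_search(term):
    """Match on the element name only, ignoring semantics."""
    return [e for e in ELEMENTS if e["path"][-1].lower() == term.lower()]


def search_by_concept(concepts):
    """Match elements linked to any of the given concept URIs."""
    return [e for e in ELEMENTS if e["concept"] in concepts]


def search_in_context(term, component):
    """Match on the element name, but only inside an enclosing component."""
    return [e for e in naive_search(term) if component in e["path"][:-1]]


def paths(hits):
    return [".".join(e["path"]) for e in hits]


print(paths(naive_search("name")))
print(paths(search_by_concept({ISOCAT + "DC-2544", ISOCAT + "DC-2545"})))
print(paths(search_in_context("name", "Contact")))
```

In this toy data the naive search mixes a resource name, a contact name and an author name, while the concept-based search finds the resource name and also the differently named `Book.Title`. The context-based search is a fallback for elements whose concept link is missing or too generic.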
  • 20. Conclusion & future work
      • The CMD Infrastructure is very flexible with regard to metadata structures, but also provides an integrated semantic layer to achieve semantic interoperability
      • All the proper registries are in place and prove to be useful, e.g., in the central CLARIN catalogue
          • Users can search and navigate the metadata based on semantics and are not directly confronted with the structural diversity
          • Future work: sometimes more context is needed for disambiguation
      • However, for metadata modellers the perceived proliferation of reusable profiles and components can be a burden
          • The SMC browser already gives insight into (semantic) overlap and differences
          • Future work: statistics based on the instance data will also help to select among profiles and components