Exploring Comparative Evaluation of Semantic
Enrichment Tools for Cultural Heritage Metadata
Hugo Manguinhas, Nuno Freire, Antoine Isaac, Valentine Charles | Europeana Foundation
Juliane Stiller | Humboldt-Universität zu Berlin
Aitor Soroa | University of the Basque Country
Rainer Simon | Austrian Institute of Technology
Vladimir Alexiev | Ontotext Corp
Agenda
TPDL 2016 - Exploring Comparative Evaluation of Semantic
Enrichment Tools for Cultural Heritage Metadata
CC BY-SA
• Europeana and Semantic Enrichments
• Goal and Steps of the Evaluation
• Results and Evaluation
• Conclusion
What is Europeana?
We aggregate metadata:
• From all EU countries
• ~3,500 galleries, libraries,
archives and museums
• More than 52M objects
• In about 50 languages
• Huge amount of references to
places, agents, concepts, time
Europeana aggregation infrastructure
(image: Europeana, CC BY-SA)
The Platform for Europe’s Digital Cultural Heritage
What are Semantic Enrichments?
● Key technique to contextualize objects
● Aims at adding new information at the semantic level to the
metadata about certain resources
● Linked Data context -> creation of new links between the
enriched resources and others
● Information Retrieval -> adding new terms to a query or
document and therefore reaching a higher visibility of
documents within the document space
Semantic Enrichments in Europeana
Offers richer context to items by linking them to
relevant entities (people, places, objects, types)
Automatic Enrichment is Challenging Because of:
• Bewildering variety of objects;
• Differing provider practices for providing extra information;
• Data normalization issues;
• Information in a variety of languages, often without the language
tags needed to identify them.
Established a Task Force Gathering Domain Experts
Goals of the Task Force
• Perform concrete evaluations and comparisons of enrichment
tools in the CH domain
• Identify methodologies for evaluating enrichment in CH that:
• are realistic with respect to the amount of resources required;
• make it possible to measure enrichment progress over time;
• build a reference set of metadata and enrichments that can be reused
for future comparative evaluations.
Enrichment Tools Evaluated
• Semantic Enrichment Framework used at Europeana
• Dictionary-based, against a curated subset of DBpedia, GeoNames and
SemiumTime
• The European Library named entity recognition process
• NERD and heuristic-based, against GeoNames and the Gemeinsame Normdatei (GND)
• The Background Link service developed within the LoCloud project
• NERD applying supervised statistical methods, against DBpedia
• The Vocabulary Matching service (also from the LoCloud project)
• Dictionary-based, against several thesauri in SKOS
• The Recogito open-source tool used in the Pelagios 3 project
• NERD and heuristic-based, against Wikidata
• The Ontotext Semantic Platform enrichment tool
• NERD with NLP, against MediaGraph
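Several of these tools are dictionary-based: they match metadata values against a curated gazetteer of entity labels. A minimal sketch of that technique, with a made-up two-entry gazetteer (the URIs and labels below are illustrative, not Europeana's actual data):

```python
# Toy gazetteer mapping lowercase labels to entity URIs (illustrative only).
GAZETTEER = {
    "paris": "http://sws.geonames.org/2988507/",
    "mozart": "http://d-nb.info/gnd/118584596",
}

def dictionary_enrich(value):
    """Return (surface form, entity URI) pairs for gazetteer hits
    on whole tokens of a metadata value."""
    hits = []
    for token in value.replace(",", " ").split():
        uri = GAZETTEER.get(token.lower())
        if uri:
            hits.append((token, uri))
    return hits

hits = dictionary_enrich("Mozart in Paris")
```

Real tools add normalization, multi-word matching and disambiguation on top of this core lookup; the sketch shows only the dictionary step.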
Evaluation of Enrichments Tools
Steps performed:
1. Select a sample dataset for enrichment,
2. Apply different tools to enrich the sample dataset,
3. Manually create reference annotated corpus,
4. Compare automatic annotations with the manual ones to
determine performance of enrichment tools.
The Sample Dataset
• A subset of The European Library (TEL) data submitted to
Europeana
• TEL is the biggest data aggregator for Europeana -> widest coverage
of countries and languages.
• For each of the 19 countries, we randomly selected a
maximum of 1000 records.
Resulting in a dataset of 17,300 metadata
records in 10 languages
Building the corpus
● Each tool was applied to the dataset resulting in a total of 360k
enrichments
● Each enrichment set for each tool was:
○ normalized to a reference target vocabulary using coreferencing
information (~62.16% success rate)
■ GeoNames (places), DBpedia (other resource types)
○ compared to identify the overlap between tools, resulting in 26
agreement combinations (plus non-agreements)
● At most 1000 enrichments were selected per set
○ resulting in a main corpus of 1757 enrichments
○ a secondary corpus composed of a subset of 46 enrichments for
measuring the inter-rater agreement
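The normalization step can be sketched as a lookup in a table of coreference (owl:sameAs-style) links that rewrites each tool's target URIs onto the reference vocabularies. The link table and URIs below are illustrative assumptions, not the actual coreferencing data used:

```python
# Illustrative coreference links to the reference vocabulary (GeoNames here).
SAME_AS = {
    "http://www.wikidata.org/entity/Q64": "http://sws.geonames.org/2950159/",
    "http://dbpedia.org/resource/Berlin": "http://sws.geonames.org/2950159/",
}

def normalize(enrichments):
    """Rewrite URIs to the reference vocabulary where a coreference link
    exists; also report the fraction that could be normalized."""
    normalized, linked = [], 0
    for uri in enrichments:
        target = SAME_AS.get(uri)
        if target:
            linked += 1
            normalized.append(target)
        else:
            normalized.append(uri)  # no link known: keep as-is
    return normalized, linked / len(enrichments)

norm, rate = normalize([
    "http://www.wikidata.org/entity/Q64",
    "http://dbpedia.org/resource/Vienna",
])
```

In the evaluation, the analogous step succeeded for roughly 62% of the raw enrichments.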
Building the corpus (overview)
TEL Dataset (~175K)
-> randomly selected ~1000 records per country (19 countries)
Subset of TEL Dataset (17.3K)
-> applied 7 enrichment tools / tool settings
Raw Enrichments (~360K)
-> normalized using a reference target dataset
Normalized Enrichments (~360K)
-> compared amongst tools
Agreement Combinations (26 sets)
-> selected at most 1000 distinct enrichments per distinct set
Evaluation Corpus (1757 distinct enrichments)
-> annotated according to the Annotation Guidelines
Annotated Corpus (1757 enrichments) + Inter-Annotator Corpus (46 enrichments)
Annotation criteria and guidelines
Category: Semantic correctness
Annotations: C = Correct, I = Incorrect, U = Unsure
Description: Is the enrichment semantically correct or not?

Category: Completeness of name match
Annotations: F = Full match, P = Partial match, U = Unsure
Description: Was the whole phrase/named entity enriched, or only part of it?

Category: Completeness of concept match
Annotations: F = Full match, B = Broader than, N = Narrower than, U = Unsure
Description: Is the matched concept at the same level of conceptual
abstraction as the named entity/phrase? Since the exact concept is
sometimes unavailable in the target vocabulary, a broader or narrower
concept may be used in the enrichment. Use B when the identified (target)
concept is broader than the intended concept, and N when it is narrower.
The same applies to geographical regions: use N when the identified
concept describes a smaller region than intended, and B when it describes
a larger one.
In addition to the criteria, the guidelines also included:
● explanation of how to apply the criteria
● concrete examples matching all possible combinations of the criteria
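The three criteria can be captured as a small validated record; this is a hypothetical data model for illustration, not the Task Force's actual annotation format:

```python
# Codes taken from the annotation guidelines above.
from dataclasses import dataclass

CORRECTNESS = {"C", "I", "U"}           # Correct / Incorrect / Unsure
NAME_MATCH = {"F", "P", "U"}            # Full / Partial / Unsure
CONCEPT_MATCH = {"F", "B", "N", "U"}    # Full / Broader / Narrower / Unsure

@dataclass
class Annotation:
    """One manual judgement of an automatic enrichment."""
    enrichment_uri: str
    correctness: str
    name_match: str
    concept_match: str

    def __post_init__(self):
        # Reject codes outside the guidelines.
        if (self.correctness not in CORRECTNESS
                or self.name_match not in NAME_MATCH
                or self.concept_match not in CONCEPT_MATCH):
            raise ValueError("unknown annotation code")

# Correct enrichment, full name match, broader-than concept match:
a = Annotation("http://sws.geonames.org/2988507/", "C", "F", "B")
```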
Example of the annotated corpus
16 members of our Task Force
manually assessed the correctness of
both corpora applying the criteria
Measuring performance of each tool
● We adapted the Information Retrieval notions of precision and
recall to measure the performance of enrichment tools
● Two measures were defined considering the criteria:
○ relaxed: all enrichments annotated as semantically correct are
considered “true” regardless of their completeness
○ strict: considers as “true” the semantically correct enrichments with a
full name and concept completeness
● As it was not possible within the Task Force to identify all possible
true enrichments, we opted for a pooled recall computed over the total
number of enrichments
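The relaxed and strict measures can be sketched as follows; the annotation tuples and the pooled-recall denominator are simplified illustrations of the definitions above, not the Task Force's actual implementation:

```python
# An annotation is a (correctness, name_match, concept_match) tuple using
# the guideline codes (C/I/U, F/P/U, F/B/N/U).
def is_true(ann, strict):
    """Relaxed: semantically correct. Strict: correct AND full name
    match AND full concept match."""
    correct, name, concept = ann
    if correct != "C":
        return False
    return (name == "F" and concept == "F") if strict else True

def precision_and_pooled_recall(annotations, pool_size, strict=False):
    """Precision over the tool's annotated enrichments; pooled recall
    uses the total number of pooled enrichments as denominator."""
    true = sum(is_true(a, strict) for a in annotations)
    return true / len(annotations), true / pool_size

anns = [("C", "F", "F"), ("C", "P", "F"), ("I", "F", "F"), ("C", "F", "B")]
p_rel, r_rel = precision_and_pooled_recall(anns, pool_size=10)
p_str, r_str = precision_and_pooled_recall(anns, pool_size=10, strict=True)
```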
Measuring performance of each tool
Tools         Annotated          Precision       Max     Estimated       Estimated
              Enrichments                        Pooled  Recall          F-measure
              (% of full corpus) Relax.  Strict  Recall  Relax.  Strict  Relax.  Strict
Group A
  EF          550 (31.3%)        0.985   0.965   0.458   0.355   0.432   0.522   0.597
  TEL         391 (22.3%)        0.982   0.982   0.325   0.254   0.315   0.404   0.477
Group B
  BgLinks     427 (24.3%)        0.888   0.574   0.355   0.249   0.200   0.389   0.296
  Pelagios    502 (28.6%)        0.854   0.820   0.418   0.286   0.340   0.428   0.481
  VocMatch    100 (05.7%)        0.774   0.312   0.083   0.048   0.024   0.091   0.045
  Ontotext v1 489 (27.8%)        0.842   0.505   0.407   0.272   0.202   0.411   0.289
  Ontotext v2 682 (38.8%)        0.924   0.632   0.567   0.418   0.354   0.576   0.454
The observed percentage agreement underlying Fleiss' kappa, measured on the inter-annotator corpus, was 79.9%
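As an illustration, the observed agreement underlying Fleiss' kappa is the fraction of agreeing rater pairs per item, averaged over items. A sketch with made-up ratings (not the Task Force's data or tooling):

```python
from itertools import combinations

def observed_agreement(ratings_per_item):
    """ratings_per_item: one list of ratings (one per rater) per item.
    Returns the mean fraction of rater pairs that agree."""
    scores = []
    for ratings in ratings_per_item:
        pairs = list(combinations(ratings, 2))
        scores.append(sum(a == b for a, b in pairs) / len(pairs))
    return sum(scores) / len(scores)

agreement = observed_agreement([
    ["C", "C", "C"],  # all three raters agree: 3/3 pairs
    ["C", "C", "I"],  # one of three pairs agrees: 1/3 pairs
])
```

Fleiss' kappa then corrects this observed agreement for the agreement expected by chance.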
Conclusions
1. Select a dataset for your evaluation that represents the diversity of
your (meta)data
2. Building a gold standard is ideal but not always possible
3. Consider using the semantics of target datasets
4. Keep the evaluation balanced across the tools
5. Give clear guidelines on how to annotate the corpus
6. Use the right tool for annotating the corpus
Recommendation for Enrichment Tools
● Consider applying different techniques depending on the kind of
values used within the enriched property
● Matches on parts of a field's textual content may result in too
general or even meaningless enrichments if they fail to recognize
compound expressions
○ In the Europeana context, concepts as broad as "general period" add
no value as enrichments
● Apply a strong disambiguation mechanism that considers the
accuracy of a name reference together with the relevance of the
entity
● Quality issues originating in the metadata mapping process are a
major obstacle to obtaining good enrichments
Next steps...
● Build a gold standard with
○ significant language and spatial coverage and
○ diversity of subjects and domains
● Develop a tool for helping annotators in creating and managing
the gold standard
○ tuned for the annotation task, collaborative and easily accessible
● Develop (or extend) a solution for running evaluations
○ possibly inspired by GERBIL
● Re-evaluate the tools
● Integrate enrichment evaluation into a holistic evaluation
framework for search
Thank you!
References
● Monti, Johanna, Mario Monteleone, Maria Pia di Buono, and Federica Marano. 2013. “Cross-
Lingual Information Retrieval and Semantic Interoperability for Cultural Heritage Repositories.”
In Recent Advances in Natural Language Processing, RANLP 2013, 9-11 September, 2013, Hissar,
Bulgaria, 483–90. http://aclweb.org/anthology/R/R13/R13-1063.pdf.
● Petras, Vivien, Toine Bogers, Elaine G. Toms, Mark M. Hall, Jacques Savoy, Piotr Malak, Adam
Pawlowski, Nicola Ferro, and Ivano Masiero. 2013. “Cultural Heritage in CLEF (CHiC) 2013.” In
Information Access Evaluation. Multilinguality, Multimodality, and Visualization - 4th International
Conference of the CLEF Initiative, CLEF 2013, Valencia, Spain, September 23-26, 2013. Proceedings,
192–211. doi:10.1007/978-3-642-40802-1_23.
● Stiller, Juliane, Vivien Petras, Maria Gäde, and Antoine Isaac. 2014. “Automatic Enrichments with
Controlled Vocabularies in Europeana: Challenges and Consequences.” In Digital Heritage.
Progress in Cultural Heritage: Documentation, Preservation, and Protection, 238–47.
● Usbeck, Ricardo, Michael Röder, Axel-Cyrille Ngonga Ngomo, Ciro Baron, Andreas Both, Martin
Brümmer, Diego Ceccarelli, et al. 2015. “GERBIL: General Entity Annotator Benchmarking
Framework.” In Proceedings of the 24th International Conference on World Wide Web, 1133–43.
WWW ’15. New York, NY, USA: ACM. doi:10.1145/2736277.2741626.
Previous Studies
• Evaluations of enrichments were performed sporadically:
• Sample evaluation of alignments of places and agents to DBpedia
• Evaluation of the relationship between enrichments, objects and
queries which retrieve enriched objects (Stiller et al. 2014)
• Evaluations of retrieval performance in the CH domain that indirectly
also target enrichments (e.g. Monti et al. 2013, Petras et al. 2013)
• Tools and frameworks for enrichment:
• GERBIL (Usbeck et al. 2015)

DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 

Exploring comparative evaluation of semantic enrichment tools for cultural heritage metadata

  • 1. Exploring Comparative Evaluation of Semantic Enrichment Tools for Cultural Heritage Metadata
    Hugo Manguinhas, Nuno Freire, Antoine Isaac, Valentine Charles | Europeana Foundation
    Juliane Stiller | Humboldt-Universität zu Berlin
    Aitor Soroa | University of the Basque Country
    Rainer Simon | Austrian Institute of Technology
    Vladimir Alexiev | Ontotext Corp
  • 2. Agenda
    TPDL 2016 - Exploring Comparative Evaluation of Semantic Enrichment Tools for Cultural Heritage Metadata | CC BY-SA
    • Europeana and Semantic Enrichments
    • Goal and Steps of the Evaluation
    • Results and Evaluation
    • Conclusion
  • 3. What is Europeana?
    The Platform for Europe’s Digital Cultural Heritage
    We aggregate metadata:
    • From all EU countries
    • ~3,500 galleries, libraries, archives and museums
    • More than 52M objects
    • In about 50 languages
    • A huge number of references to places, agents, concepts and time periods
    Europeana aggregation infrastructure (Europeana | CC BY-SA)
  • 4. What are Semantic Enrichments?
    ● A key technique for contextualizing objects
    ● Aims at adding new information at the semantic level to the metadata about certain resources
    ● In a Linked Data context -> creation of new links between the enriched resources and others
    ● In Information Retrieval -> adding new terms to a query or document, thereby giving documents higher visibility within the document space
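To make the idea above concrete, here is a minimal sketch of what a dictionary-based enrichment does: a free-text metadata value is linked to an entity URI in a target vocabulary. The record, the `gazetteer` lookup table and the `enrich_spatial` helper are illustrative assumptions, not any specific tool's API (the GeoNames URI shown is the real entry for Paris).

```python
# Minimal sketch (hypothetical data): an enrichment links a literal metadata
# value to a contextual entity in a target vocabulary, adding semantics.
record = {
    "title": "View of the Eiffel Tower",
    "spatial": "Paris",  # plain string, no semantics attached
}

# A tiny illustrative gazetteer mapping lowercase names to GeoNames URIs.
gazetteer = {"paris": "http://sws.geonames.org/2988507/"}

def enrich_spatial(rec, gaz):
    """Return a copy of the record with the place literal linked to an entity URI."""
    uri = gaz.get(rec["spatial"].strip().lower())
    if uri:
        rec = dict(rec, spatial={"literal": rec["spatial"], "entity": uri})
    return rec

enriched = enrich_spatial(record, gazetteer)
```

The enriched record now carries both the original literal and a dereferenceable link, which is what enables the Linked Data and retrieval benefits described above.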
  • 5. Semantic Enrichments in Europeana
    Offer richer context to items by linking them to relevant entities (people, places, objects, types)
  • 6. Automatic Enrichments are Challenging as:
    • Bewildering variety of objects;
    • Differing provider practices for supplying extra information;
    • Data normalization issues;
    • Information in a variety of languages, often without the language tags needed to determine the language.
  • 7. Established a Task Force Gathering Domain Experts
  • 8. Goals of the Task Force
    • Perform concrete evaluations and comparisons of enrichment tools in the CH domain
    • Identify methodologies for evaluating enrichment in CH which:
      • are realistic with respect to the amount of resources to be employed;
      • facilitate measuring enrichment progress over time;
      • build a reference set of metadata and enrichments that can be used for future comparative evaluations.
  • 9. Enrichment Tools Evaluated
    • The Semantic Enrichment Framework used at Europeana
      • Dictionary-based, against a curated subset of DBpedia, GeoNames and SemiumTime
    • The European Library named entity recognition process
      • NERD and heuristic-based, against GeoNames and the Gemeinsame Normdatei
    • The Background Link service developed within the LoCloud project
      • NERD applying supervised statistical methods, against DBpedia
    • The Vocabulary Matching service (also from the LoCloud project)
      • Dictionary-based, against several thesauri in SKOS
    • The Recogito open-source tool used in the Pelagios 3 project
      • NERD and heuristic-based, against Wikidata
    • The Ontotext Semantic Platform enrichment tool
      • NERD with NLP, against MediaGraph
  • 10. Evaluation of Enrichment Tools
    Steps performed:
    1. Select a sample dataset for enrichment,
    2. Apply the different tools to enrich the sample dataset,
    3. Manually create a reference annotated corpus,
    4. Compare the automatic annotations with the manual ones to determine the performance of the enrichment tools.
  • 11. The Sample Dataset
    • A subset of The European Library (TEL) data submitted to Europeana
      • TEL is the biggest data aggregator for Europeana -> widest coverage of countries and languages.
    • For each of the 19 countries, we randomly selected a maximum of 1,000 records,
      resulting in a dataset of 17,300 metadata records in 10 languages.
  • 12. Building the Corpus
    ● Each tool was applied to the dataset, resulting in a total of ~360k enrichments
    ● Each tool's enrichment set was:
      ○ normalized to a reference target vocabulary using coreferencing information (~62.16% success rate)
        ■ GeoNames (places), DBpedia (other resource types)
      ○ compared with the other sets to identify overlaps between tools, resulting in 26 agreement combinations (plus non-agreements)
    ● At most 1,000 enrichments were selected per set
      ○ resulting in a main corpus of 1,757 enrichments
      ○ plus a secondary corpus of 46 enrichments for measuring inter-rater agreement
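The normalization and overlap steps above can be sketched as follows. The coreference map, tool names and enrichment tuples are hypothetical stand-ins (the Wikidata/DBpedia pair shown is the real coreference for Paris); the point is the mechanism: map each tool's target URIs to one reference vocabulary, then group identical enrichments by the combination of tools that agree on them.

```python
from collections import defaultdict

# Illustrative coreference map: equivalent URIs are folded onto one reference
# vocabulary (here, DBpedia) before comparing tools.
coref = {
    "http://www.wikidata.org/entity/Q90": "http://dbpedia.org/resource/Paris",
}

def normalize(uri):
    return coref.get(uri, uri)

# Hypothetical raw output of three tools: (record id, source text, target URI).
raw = {
    "toolA": [("rec1", "Paris", "http://www.wikidata.org/entity/Q90")],
    "toolB": [("rec1", "Paris", "http://dbpedia.org/resource/Paris")],
    "toolC": [("rec2", "Berlin", "http://dbpedia.org/resource/Berlin")],
}

agreement = defaultdict(set)  # normalized enrichment -> tools that produced it
for tool, enrichments in raw.items():
    for rec_id, text, uri in enrichments:
        agreement[(rec_id, text, normalize(uri))].add(tool)

combos = defaultdict(list)    # agreement combination -> its enrichments
for enrichment, tools in agreement.items():
    combos[frozenset(tools)].append(enrichment)
```

After normalization, toolA and toolB agree on the Paris enrichment even though they targeted different vocabularies, which is exactly how the 26 agreement combinations arise.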
  • 13. Building the Corpus (overview)
    TEL Dataset (~175K)
      -> randomly selected ~1,000 records per country (19) ->
    Subset of TEL Dataset (17.3K)
      -> applied 7 enrichment tools / tool settings ->
    Raw Enrichments (~360K)
      -> normalized using a reference target dataset ->
    Normalized Enrichments (~360K)
      -> compared amongst tools ->
    Agreement Combinations (26 sets)
      -> selected at most 1,000 distinct enrichments per distinct set ->
    Evaluation Corpus (1,757 distinct enrichments)
      -> annotated according to the Annotation Guidelines ->
    Annotated Corpus (1,757 enrichments) + Inter-Annotator Corpus (46 enrichments)
  • 14. Annotation Criteria and Guidelines
    | Category | Annotation | Description |
    |---|---|---|
    | Semantic correctness | C = Correct, I = Incorrect, U = Unsure | Is the enrichment semantically correct or not? |
    | Completeness of name match | F = Full match, P = Partial match, U = Unsure | Was the whole phrase/named entity enriched, or only parts of it? |
    | Completeness of concept match | F = Full match, B = Broader than, N = Narrower than, U = Unsure | Whether the matched concept is at the same level of conceptual abstraction as the named entity/phrase. Since the exact concept is sometimes not available in the target vocabulary, a narrower or broader concept may be used in the enrichment. Use B when the concept identified (i.e. the target) is broader than the intended concept; use N when it is narrower. The same applies to geographical regions: N marks a target describing a smaller entity, B one referring to a bigger geographical region. |
    In addition to the criteria, the guidelines also included:
    ● an explanation of how to apply the criteria
    ● concrete examples covering all possible combinations of the criteria
  • 15. Example of the Annotated Corpus
    16 members of the Task Force manually assessed the correctness of both corpora by applying the criteria.
  • 16. Measuring the Performance of Each Tool
    ● We adapted the Information Retrieval notions of precision and recall to measure the performance of enrichment tools
    ● Two measures were defined, considering the criteria:
      ○ relaxed: all enrichments annotated as semantically correct are counted as “true”, regardless of their completeness
      ○ strict: only semantically correct enrichments with full name and concept completeness are counted as “true”
    ● As it was not possible within the Task Force to identify all possible true enrichments, we opted for a pooled recall using the total number of enrichments
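The relaxed and strict measures above can be sketched as follows. The annotation field names and the pooled denominator are illustrative assumptions following the slide definitions, not the Task Force's actual implementation.

```python
# An annotation carries the three criteria codes from the guidelines:
# "correct" in {C, I, U}, "name" in {F, P, U}, "concept" in {F, B, N, U}.
def is_true(a, strict=False):
    """Relaxed: semantically correct. Strict: also full name and concept match."""
    if a["correct"] != "C":
        return False
    return (not strict) or (a["name"] == "F" and a["concept"] == "F")

def precision(annotations, strict=False):
    return sum(is_true(a, strict) for a in annotations) / len(annotations)

def pooled_recall(annotations, pool_size, strict=False):
    """Recall against the pool of true enrichments found by any tool."""
    return sum(is_true(a, strict) for a in annotations) / pool_size

# Tiny illustrative sample: one fully correct, one correct with a partial
# name match, one incorrect enrichment.
anns = [
    {"correct": "C", "name": "F", "concept": "F"},
    {"correct": "C", "name": "P", "concept": "F"},
    {"correct": "I", "name": "F", "concept": "F"},
]
p_relaxed = precision(anns)         # counts both correct enrichments
p_strict = precision(anns, True)    # counts only the fully complete one
```

Note how a partial name match is a "true" enrichment under the relaxed measure but not under the strict one, which is what drives the gap between the two columns in the results table.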
  • 17. Measuring the Performance of Each Tool
    | Group | Tool | Annotated Enrichments (% of full corpus) | Precision (Relax. / Strict) | Max Pooled Recall | Estimated Recall (Relax. / Strict) | Estimated F-measure (Relax. / Strict) |
    |---|---|---|---|---|---|---|
    | A | EF | 550 (31.3%) | 0.985 / 0.965 | 0.458 | 0.355 / 0.432 | 0.522 / 0.597 |
    | A | TEL | 391 (22.3%) | 0.982 / 0.982 | 0.325 | 0.254 / 0.315 | 0.404 / 0.477 |
    | B | BgLinks | 427 (24.3%) | 0.888 / 0.574 | 0.355 | 0.249 / 0.200 | 0.389 / 0.296 |
    | B | Pelagios | 502 (28.6%) | 0.854 / 0.820 | 0.418 | 0.286 / 0.340 | 0.428 / 0.481 |
    | B | VocMatch | 100 (5.7%) | 0.774 / 0.312 | 0.083 | 0.048 / 0.024 | 0.091 / 0.045 |
    | B | Ontotext v1 | 489 (27.8%) | 0.842 / 0.505 | 0.407 | 0.272 / 0.202 | 0.411 / 0.289 |
    | B | Ontotext v2 | 682 (38.8%) | 0.924 / 0.632 | 0.567 | 0.418 / 0.354 | 0.576 / 0.454 |
    The observed percentage agreement (the basis for Fleiss' kappa) was 79.9%.
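The estimated F-measure column combines precision with estimated recall; assuming the standard harmonic mean (which the slides do not state explicitly but which the table values are consistent with), a spot check against the BgLinks relaxed row:

```python
def f_measure(p, r):
    """Harmonic mean of precision and recall (standard F1)."""
    return 2 * p * r / (p + r) if p + r else 0.0

# BgLinks, relaxed: precision 0.888, estimated recall 0.249
f_bglinks_relaxed = f_measure(0.888, 0.249)
print(round(f_bglinks_relaxed, 3))  # 0.389, matching the table
```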
  • 18. Conclusions
    1. Select a dataset for your evaluation that represents the diversity of your (meta)data
    2. Building a gold standard is ideal, but not always possible
    3. Consider using the semantics of the target datasets
    4. Try to keep a balance between the evaluated tools
    5. Give clear guidelines on how to annotate the corpus
    6. Use the right tool for annotating the corpus
  • 19. Recommendations for Enrichment Tools
    ● Consider applying different techniques depending on the kind of values used within the enriched property
    ● Matches on parts of a field's textual content may result in overly general or even meaningless enrichments if they fail to recognize compound expressions
      ○ In the Europeana context, concepts as broad as "general period" do not add any value as enrichments
    ● Apply a strong disambiguation mechanism that considers the accuracy of a name reference together with the relevance of the entity
    ● Quality issues originating in the metadata mapping process are a major obstacle to getting good enrichments
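The compound-expression problem above can be illustrated with a sketch: a naive substring matcher links the overly broad "period" concept even when the text actually names a specific compound term, whereas preferring the longest dictionary match recovers the specific concept. The vocabulary entries and URIs are purely illustrative.

```python
# Illustrative vocabulary: a broad concept and a compound, more specific one.
vocab = {
    "period": "http://example.org/concept/period",                    # too general
    "migration period": "http://example.org/concept/migration-period",
}

def longest_match(text):
    """Return the URI of the longest vocabulary term found in the text, if any."""
    text_l = text.lower()
    hits = [term for term in vocab if term in text_l]
    if not hits:
        return None
    return vocab[max(hits, key=len)]  # prefer the compound expression

uri = longest_match("Brooch from the Migration Period")
```

A matcher that stopped at the first hit would have emitted the near-meaningless "period" enrichment; longest-match-first is one simple way to respect compound expressions.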
  • 20. Next steps...
    ● Build a gold standard with
      ○ significant language and spatial coverage, and
      ○ diversity of subjects and domains
    ● Develop a tool to help annotators create and manage the gold standard
      ○ tuned for the annotation task, collaborative and easily accessible
    ● Develop (or extend) a solution for running evaluations
      ○ possibly inspired by GERBIL
    ● Re-evaluate the tools
    ● Integrate enrichment evaluation into a holistic evaluation framework for search
  • 22. References
    ● Monti, Johanna, Mario Monteleone, Maria Pia di Buono, and Federica Marano. 2013. “Cross-Lingual Information Retrieval and Semantic Interoperability for Cultural Heritage Repositories.” In Recent Advances in Natural Language Processing, RANLP 2013, 9-11 September 2013, Hissar, Bulgaria, 483–90. http://aclweb.org/anthology/R/R13/R13-1063.pdf
    ● Petras, Vivien, Toine Bogers, Elaine G. Toms, Mark M. Hall, Jacques Savoy, Piotr Malak, Adam Pawlowski, Nicola Ferro, and Ivano Masiero. 2013. “Cultural Heritage in CLEF (CHiC) 2013.” In Information Access Evaluation. Multilinguality, Multimodality, and Visualization - 4th International Conference of the CLEF Initiative, CLEF 2013, Valencia, Spain, September 23-26, 2013. Proceedings, 192–211. doi:10.1007/978-3-642-40802-1_23
    ● Stiller, Juliane, Vivien Petras, Maria Gäde, and Antoine Isaac. 2014. “Automatic Enrichments with Controlled Vocabularies in Europeana: Challenges and Consequences.” In Digital Heritage. Progress in Cultural Heritage: Documentation, Preservation, and Protection, 238–47.
    ● Usbeck, Ricardo, Michael Röder, Axel-Cyrille Ngonga Ngomo, Ciro Baron, Andreas Both, Martin Brümmer, Diego Ceccarelli, et al. 2015. “GERBIL: General Entity Annotator Benchmarking Framework.” In Proceedings of the 24th International Conference on World Wide Web, 1133–43. WWW ’15. New York, NY, USA: ACM. doi:10.1145/2736277.2741626
  • 23. Previous Studies
    • Evaluations of enrichments have been performed sporadically:
      • Sample evaluation of alignments of places and agents to DBpedia
      • Evaluation of the relationship between enrichments, objects and the queries that retrieve enriched objects (Stiller et al. 2014)
      • Evaluations of retrieval performance in the CH domain that indirectly also target enrichments (e.g. Monti et al. 2013, Petras et al. 2013)
    • Tools and frameworks for enrichment evaluation:
      • GERBIL (Usbeck et al. 2015)