Mappings Validation
Data Quality Tutorial - SEMANTICS2016
Anastasia Dimou
Anastasia.Dimou@ugent.be ● @natadimou
Ghent University – iMinds
Linked (Open) Data
semantically annotated & interlinked data
using different vocabularies or ontologies
published in the form of RDF datasets
Linked (Open) Data
derive from originally heterogeneous
(semi-)structured data
e.g.
Eurostat from TSV
DBLP from DBLP database
DBpedia from Wikipedia
LinkedBrainz from MusicBrainz database
... … …
Linked Data Quality
in the context of Linked Data
generation and publication workflow
Linked Data Quality dimensions
Representational dimension
Intrinsic dimension
Accessibility dimension
Contextual dimension
A. Zaveri, A. Rula, A. Maurino, R. Pietrobon, J. Lehmann, and S. Auer.
Quality Assessment for Linked Data: A Survey.
Semantic Web Journal, 2016.
Linked Data Quality dimensions
Representational dimension: data modeling
Intrinsic dimension: Linked Data generation
Accessibility dimension: Linked Data publication
Contextual dimension: Linked Data consumption
Linked Data Quality - Intrinsic Dimension
determines the RDF Dataset Quality
by assessing it for possible violations
with respect to
accuracy (e.g. malformed datatype literals)
consistency (e.g. disjoint classes/properties)
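As an illustration of the accuracy example above (malformed datatype literals), the following is a minimal sketch of a lexical check for xsd:date values. The function name is hypothetical, and the check is deliberately simplified: it assumes plain YYYY-MM-DD forms and ignores timezones and negative years, which the XML Schema datatype also allows.

```python
import re
from datetime import date

# Simplified lexical form: four-digit year, two-digit month, two-digit day.
_DATE_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def is_well_formed_xsd_date(lexical):
    """Return True if `lexical` looks like a valid (simplified) xsd:date."""
    match = _DATE_RE.fullmatch(lexical)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    try:
        date(year, month, day)  # rejects impossible dates such as month 13
    except ValueError:
        return False
    return True


# A year-only value typed as xsd:date is a malformed literal.
for value in ["1980-01-01", "1980", "1980-13-01"]:
    print(value, is_well_formed_xsd_date(value))
```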
Instead of applying Quality Assessment
to the already published Linked Data
as part of Linked Data consumption
Apply Quality Assessment
to the Mappings
that generate the Linked Data
as part of Linked Data production
Linked Dataset Quality Assessment (DQA)
Mappings Quality Assessment (MQA)
Mapping & Dataset Quality Assessment Workflow
Mappings & Quality Assessment Evaluation Results
[Figure: diagram labelled dbo:Person and xsd:date]
Linked Data Quality Assessment
Linked Data Quality Assessment (DQA)
RDFUnit http://rdfunit.aksw.org
test-driven data-debugging framework
based on SPARQL-patterns
D. Kontokostas, P. Westphal, S. Auer, S. Hellmann, J. Lehmann, R. Cornelissen, and A. J. Zaveri
Test-driven evaluation of linked data quality.
In Proceedings of the 23rd International Conference on World Wide Web
DQA with RDFUnit
…WHERE { ?resource %%P1%% ?c.
FILTER (DATATYPE(?c) != %%D1%%) }
…WHERE { ?resource dbo:birthDate ?c.
FILTER (DATATYPE(?c) != xsd:date) }
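The step from the generic pattern to the concrete test case can be sketched as plain placeholder substitution. This is only an illustration of the idea, not RDFUnit's implementation: the helper name `instantiate` is hypothetical, and the elided head of the query (the "…" before WHERE) is left out rather than guessed.

```python
# Pattern body as shown on the slide; the elided query head is omitted.
PATTERN = (
    "WHERE { ?resource %%P1%% ?c. "
    "FILTER (DATATYPE(?c) != %%D1%%) }"
)


def instantiate(pattern, bindings):
    """Replace every %%NAME%% placeholder with its bound value."""
    query = pattern
    for name, value in bindings.items():
        query = query.replace("%%" + name + "%%", value)
    return query


# Bind the property and the datatype expected by its range.
test_case = instantiate(PATTERN, {"P1": "dbo:birthDate", "D1": "xsd:date"})
print(test_case)
```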
10 domain violations
10 datatype violations
1,000,000 domain violations!!!
1,000,000 datatype violations!!!
Linked Data Quality Assessment (DQA)
Similar violations occur repeatedly
within a single Linked Data set
Linked Data Quality Assessment (DQA)
Sets of triples of a dataset have
repetitive patterns
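A small sketch of why the violation counts grow with the dataset: if every triple follows the same pattern, one modelling mistake is reported once per entity. The triples, the entity IRIs and the `EXPECTED` table below are made up for illustration.

```python
from collections import Counter

# Toy dataset: (subject, predicate, literal datatype); all values are hypothetical.
triples = [
    ("http://example.com/person%d" % i, "dbo:birthDate", "xsd:gYear")
    for i in range(1000)
]

# Datatype expected for each property (assumed here, normally taken from the schema).
EXPECTED = {"dbo:birthDate": "xsd:date"}

# One violation is reported per offending triple ...
violations = [
    (s, p, dt) for (s, p, dt) in triples
    if p in EXPECTED and dt != EXPECTED[p]
]

# ... but grouping by (property, datatype) shows a single underlying cause.
causes = Counter((p, dt) for (_, p, dt) in violations)

print(len(violations), "violations,", len(causes), "distinct cause")
```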
DQA: Linked Data Quality Assessment
is applied by third parties
to already published Linked Data sets
DQA: Linked Data Quality Assessment
Adjustments are NOT applied
at the root of the problem
DQA: Linked Data Quality Assessment
Adjustments are overwritten
if a new version of the original data
is annotated and published as Linked Data
Instead of applying Quality Assessment
to the already published Linked Data set
as part of data consumption
Apply Quality Assessment to the Mappings
that generate the Linked Data
A. Dimou, D. Kontokostas, M. Freudenberg, R. Verborgh, J. Lehmann, E. Mannens, S. Hellmann, R. Van de Walle
Assessing and Refining Mappings to RDF to Improve Dataset Quality.
In Proceedings of The Semantic Web - ISWC 2015
Mapping languages
formalize patterns into rules to generate
Linked Data from some original data
RDF Mapping Language (RML) http://rml.io
extends the W3C-recommended R2RML
specify the mapping rules to
generate Linked Data
from heterogeneous data sources
mapping rules are Linked Data sets too!
A. Dimou, M. Vander Sande, P. Colpaert, R. Verborgh, E. Mannens, and R. Van de Walle.
RML: A Generic Language for Integrated RDF Mappings of Heterogeneous Data.
In Proceedings of the 7th Workshop on Linked Data on the Web (LDOW2014), 2014.
RDF Mapping Language (RML) http://rml.io
<#Mapping>
    rr:subjectMap [ rr:class dbo:Event ;
                    rr:template "http://example.com/{Name}" ] ;
    rr:predicateObjectMap [ rr:predicate dbo:birthDate ;
                            rr:objectMap [ rml:reference "Birth" ; rr:datatype xsd:gYear ] ] .
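To make concrete how such a rule turns records into triples, here is a minimal sketch of what a mapping processor does with the rule above. It is not the RML processor: the dictionary layout, the function names and the two input records are assumptions, and details such as IRI percent-encoding and logical sources are left out.

```python
import re

# The mapping rule from the slide, flattened into a dictionary (hypothetical layout).
mapping = {
    "class": "dbo:Event",
    "template": "http://example.com/{Name}",
    "predicate": "dbo:birthDate",
    "reference": "Birth",
    "datatype": "xsd:gYear",
}

# Made-up input records standing in for the original data.
records = [
    {"Name": "Alice", "Birth": "1980"},
    {"Name": "Bob", "Birth": "1975"},
]


def expand(template, record):
    """Fill each {Field} in the template with the record's value."""
    return re.sub(r"\{(\w+)\}", lambda m: str(record[m.group(1)]), template)


def generate(mapping, records):
    """Apply the same rule to every record, yielding one pair of triples each."""
    triples = []
    for record in records:
        subject = "<" + expand(mapping["template"], record) + ">"
        value = record[mapping["reference"]]
        literal = '"%s"^^%s' % (value, mapping["datatype"])
        triples.append((subject, "rdf:type", mapping["class"]))
        triples.append((subject, mapping["predicate"], literal))
    return triples


triples = generate(mapping, records)
for triple in triples:
    print(" ".join(triple), ".")
```

Because the rule is applied unchanged to each record, any mistake in it (here the class and the datatype) is copied into every generated entity.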
[Figure: diagram labelled "data", "map doc", "Mapping Processor"]
DQA: Linked Data Quality Assessment
[Figure: diagram labelled "data", "map doc", "Mapping Processor", "violations", "DQA"]
MQA: Mapping Quality Assessment
[Figure: diagram labelled "data", "map doc", "Mapping Processor", "violations", "MQA"]
D→MQA with RDFUnit over RML
…WHERE { ?resource %%P1%% ?c.
FILTER (DATATYPE(?c) != %%D1%%) }
…WHERE { ?resource dbo:birthDate ?c.
FILTER (DATATYPE(?c) != xsd:date) }
… WHERE {
    ?resource rr:predicateObjectMap ?poMap.
    ?poMap rr:predicate %%P1%%;
           rr:objectMap ?objM.
    ?objM rr:datatype ?c.
    FILTER (?c != %%D1%%) }
<#Mapping>
    rr:subjectMap [ rr:class dbo:Event ;
                    rr:template "http://example.com/{Name}" ] ;
    rr:predicateObjectMap [ rr:predicate dbo:birthDate ;
                            rr:objectMap [ rml:reference "Birth" ; rr:datatype xsd:gYear ] ] .
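The mapping-level pattern above can be read as a small join over the triples of the mapping document. The sketch below walks that join by hand in plain Python instead of running SPARQL; the blank-node labels `_:po` and `_:om` and the helper names are hypothetical, and this is not how RDFUnit evaluates the test.

```python
# The mapping document as (subject, predicate, object) tuples.
# Blank-node labels are made up for illustration.
mapping_triples = {
    ("<#Mapping>", "rr:predicateObjectMap", "_:po"),
    ("_:po", "rr:predicate", "dbo:birthDate"),
    ("_:po", "rr:objectMap", "_:om"),
    ("_:om", "rr:datatype", "xsd:gYear"),
}


def objects(triples, subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]


def datatype_violations(triples, predicate, expected):
    """Mirror the pattern: ?resource -> ?poMap -> ?objM -> rr:datatype ?c, ?c != expected."""
    found = []
    for (resource, p, po_map) in triples:
        if p != "rr:predicateObjectMap":
            continue
        if predicate not in objects(triples, po_map, "rr:predicate"):
            continue
        for obj_map in objects(triples, po_map, "rr:objectMap"):
            for datatype in objects(triples, obj_map, "rr:datatype"):
                if datatype != expected:
                    found.append((resource, obj_map, datatype))
    return found


result = datatype_violations(mapping_triples, "dbo:birthDate", "xsd:date")
print(result)
```

The check touches only the handful of triples in the mapping document, so the faulty datatype is reported once, regardless of how many entities the rule would generate.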
ONLY 1 domain violation!!!
ONLY 1 datatype violation!!!
MDQA: Uniform Mapping & Dataset Quality Assessment
[Figure: diagram labelled "data", "map doc", "Mapping Processor", "violations", "MDQA"]
MQA: Mapping Quality Assessment
discover not only the violations
but also their origin
before they are even generated
MQA: Mapping Quality Assessment
easily apply structural adjustments
prevent the same violations from
appearing repeatedly over distinct entities
allow intuitively combining
different ontologies and vocabularies
MDQA: Uniform Mapping & Dataset Quality Assessment
<#Result>
    rut:testCase rut:datatypeError ;
    spin:violationRoot <#ObjectMap> ;
    spin:violationPath rr:datatype ;
    spin:violationValue xsd:gYear ;
    rut:missingValue xsd:date .
Correcting MQA violations with RML Editor
DEL: <#ObjectMap> rr:datatype xsd:gYear.
ADD: <#ObjectMap> rr:datatype xsd:date.
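The DEL/ADD pair can be derived mechanically from the fields of a violation result. The following sketch assumes the result has been read into a dictionary keyed by the property names shown in the `<#Result>` description; the function name `refinement` and the dictionary layout are hypothetical, not part of RDFUnit or the RML tooling.

```python
# Violation result, keyed by the properties used in the <#Result> description.
violation = {
    "spin:violationRoot": "<#ObjectMap>",
    "spin:violationPath": "rr:datatype",
    "spin:violationValue": "xsd:gYear",
    "rut:missingValue": "xsd:date",
}


def refinement(result):
    """Turn one violation into a delete/add pair on the mapping document."""
    root = result["spin:violationRoot"]
    path = result["spin:violationPath"]
    return [
        "DEL: %s %s %s." % (root, path, result["spin:violationValue"]),
        "ADD: %s %s %s." % (root, path, result["rut:missingValue"]),
    ]


changes = refinement(violation)
for line in changes:
    print(line)
```

The violation root and path identify exactly which triple of the mapping to change, which is what makes the correction a single, local edit.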
MQA with RDFUnit over RML
<#Result>
    rut:testCase rut:datatypeError ;
    spin:violationRoot <#ObjectMap> ;
    spin:violationPath rr:datatype ;
    spin:violationValue xsd:float ;
    rut:missingValue xsd:int .
DEL: <#SubjectMap> rr:class dbo:Event.
ADD: <#SubjectMap> rr:class dbo:Person.
<#Mapping>
    rr:subjectMap [ rr:class dbo:Person ;
                    rr:template "http://example.com/{Name}" ] ;
    rr:predicateObjectMap [ rr:predicate dbo:birthDate ;
                            rr:objectMap [ rml:reference "Birth" ; rr:datatype xsd:date ] ] .
Uniform Mapping & Dataset Quality Assessment Workflow
[Figure: workflow diagram labelled "data", "map doc", "new map doc", "Mapping Processor", "Mapping Refinements", "violations", "MDQA (optional)"]
Mapping Quality Assessment: Limitations
certain test cases inevitably
require the complete Linked Data set:
cardinality,
functionality,
symmetry
in defense of the Mappings:
these are more of a data issue,
NOT affected by the mapping rules
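A short sketch of why a functionality check needs the generated data: whether a subject ends up with two different values depends on the input records, not on the rule. Treating dbo:birthDate as functional and the three triples below are assumptions made for illustration.

```python
from collections import defaultdict


def functional_violations(triples, prop):
    """Subjects that have more than one distinct value for `prop`."""
    values = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == prop:
            values[subject].add(obj)
    return {s: vals for s, vals in values.items() if len(vals) > 1}


# Generated data (made up): the same mapping rule produced all three triples.
triples = [
    ("ex:Alice", "dbo:birthDate", "1980-01-01"),
    ("ex:Alice", "dbo:birthDate", "1981-05-05"),
    ("ex:Bob", "dbo:birthDate", "1975-03-02"),
]

conflicts = functional_violations(triples, "dbo:birthDate")
print(conflicts)
```

The mapping rule is identical for both subjects, so inspecting the rule alone cannot reveal that one of them received two birth dates.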
Dataset vs. Mapping Quality Assessment: Number of Violations
Number of violations per quality assessment
              Dataset Assessment    Mappings Assessment
DBpedia EN    3.2M                  160
DBLP          8.1M                  8
* DBpedia and DBLP D2RQ mappings were translated to RML mappings
A. Dimou, D. Kontokostas, M. Freudenberg, R. Verborgh, J. Lehmann, E. Mannens, S. Hellmann, R. Van de Walle
Assessing and Refining Mappings to RDF to Improve Dataset Quality.
In Proceedings of The Semantic Web - ISWC 2015
Dataset vs. Mapping Quality Assessment: Time
              Dataset Quality Assessment    Mappings Quality Assessment
              size      time                size      time
DBpedia EN    62M       16h                 115K      11s
DBpedia NL    21M       1.5h                53K       6s
DBLP          12M       12h                 368       12s
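To put the timings in the table side by side, the sketch below converts them to seconds and computes how many times faster the mapping assessment ran for each dataset. Only the figures from the table are used; the helper name `to_seconds` is hypothetical.

```python
# Seconds per unit for the suffixes that appear in the table.
SECONDS = {"s": 1, "h": 3600}


def to_seconds(value):
    """Convert a duration such as '1.5h' or '11s' to seconds."""
    return float(value[:-1]) * SECONDS[value[-1]]


# (dataset assessment time, mapping assessment time), copied from the table.
timings = {
    "DBpedia EN": ("16h", "11s"),
    "DBpedia NL": ("1.5h", "6s"),
    "DBLP": ("12h", "12s"),
}

speedup = {
    name: to_seconds(dataset) / to_seconds(mapping)
    for name, (dataset, mapping) in timings.items()
}

for name, factor in speedup.items():
    print("%s: about %d times faster" % (name, round(factor)))
```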
Mapping Quality Assessment
* http://mappings.dbpedia.org/validation
Live update of DBpedia Mapping Quality Assessment results every night! ☺
Mapping Quality Assessment
               size     time
DBpedia EN     115K     11s
DBpedia NL     53K      6s
DBpedia All    511K     32s
DBpedia Mappings Quality Assessment
A. Dimou, D. Kontokostas, M. Freudenberg, R. Verborgh, J. Lehmann, E. Mannens, S. Hellmann
DBpedia Mappings Quality Assessment.
To be published in Proceedings of the 15th International Semantic Web Conference: Posters and Demos 2016
Violations
are related to the dataset's schema
(vocabularies or ontologies)
occur repeatedly
within a single RDF dataset
The situation worsens the more
ontologies and vocabularies
are reused and combined
Linked Data Quality Assessment
shifted from data consumption
to data publication
integrated systematically
in the publishing workflow
violations are identified,
resolved and will not re-appear
Linked Data of higher Quality is generated!!!

More Related Content

What's hot

The WorldCat Search API
The WorldCat Search APIThe WorldCat Search API
The WorldCat Search APIOCLC Research
 
Knowledge graphs on the Web
Knowledge graphs on the WebKnowledge graphs on the Web
Knowledge graphs on the WebArmin Haller
 
Relations for Reusing (R4R) in A Shared Context: An Exploration on Research P...
Relations for Reusing (R4R) in A Shared Context: An Exploration on Research P...Relations for Reusing (R4R) in A Shared Context: An Exploration on Research P...
Relations for Reusing (R4R) in A Shared Context: An Exploration on Research P...andrea huang
 
A Linked Data Prototype for the Union Catalog of Digital Archives Taiwan
A Linked Data Prototype for the Union Catalog of Digital Archives TaiwanA Linked Data Prototype for the Union Catalog of Digital Archives Taiwan
A Linked Data Prototype for the Union Catalog of Digital Archives Taiwanandrea huang
 
Crowdsourcing Linked Data Quality Assessment
Crowdsourcing Linked Data Quality AssessmentCrowdsourcing Linked Data Quality Assessment
Crowdsourcing Linked Data Quality AssessmentMaribel Acosta Deibe
 
Presentation of Profiling Similarity Links in LOD @ DesWEB, ICDE 2016
Presentation of Profiling Similarity Links in LOD @ DesWEB, ICDE 2016Presentation of Profiling Similarity Links in LOD @ DesWEB, ICDE 2016
Presentation of Profiling Similarity Links in LOD @ DesWEB, ICDE 2016Blerina Spahiu
 
A Practical Ontology for the Large-Scale Modeling of Scholarly Artifacts and ...
A Practical Ontology for the Large-Scale Modeling of Scholarly Artifacts and ...A Practical Ontology for the Large-Scale Modeling of Scholarly Artifacts and ...
A Practical Ontology for the Large-Scale Modeling of Scholarly Artifacts and ...Marko Rodriguez
 
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the CloudFirst Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the CloudOntotext
 
Interaction with Linked Data
Interaction with Linked DataInteraction with Linked Data
Interaction with Linked DataEUCLID project
 
The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...
The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...
The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...Gezim Sejdiu
 
Hack U Barcelona 2011
Hack U Barcelona 2011Hack U Barcelona 2011
Hack U Barcelona 2011Peter Mika
 
Efficient Distributed In-Memory Processing of RDF Datasets - PhD Viva
Efficient Distributed In-Memory Processing of RDF Datasets - PhD VivaEfficient Distributed In-Memory Processing of RDF Datasets - PhD Viva
Efficient Distributed In-Memory Processing of RDF Datasets - PhD VivaGezim Sejdiu
 
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...Armin Haller
 
euclid_linkedup WWW tutorial (Besnik Fetahu)
euclid_linkedup WWW tutorial (Besnik Fetahu)euclid_linkedup WWW tutorial (Besnik Fetahu)
euclid_linkedup WWW tutorial (Besnik Fetahu)Besnik Fetahu
 
Named Entity Recognition from Online News
Named Entity Recognition from Online NewsNamed Entity Recognition from Online News
Named Entity Recognition from Online NewsBernardo Najlis
 
Efficient Practices for Large Scale Text Mining Process
Efficient Practices for Large Scale Text Mining ProcessEfficient Practices for Large Scale Text Mining Process
Efficient Practices for Large Scale Text Mining ProcessOntotext
 
Data Integration And Visualization
Data Integration And VisualizationData Integration And Visualization
Data Integration And VisualizationIvan Ermilov
 
Managing RDF data with graph databases
Managing RDF data with graph databasesManaging RDF data with graph databases
Managing RDF data with graph databasesGraph-TA
 
The Network Data Structure in Computing
The Network Data Structure in ComputingThe Network Data Structure in Computing
The Network Data Structure in ComputingMarko Rodriguez
 

What's hot (20)

The WorldCat Search API
The WorldCat Search APIThe WorldCat Search API
The WorldCat Search API
 
Knowledge graphs on the Web
Knowledge graphs on the WebKnowledge graphs on the Web
Knowledge graphs on the Web
 
Relations for Reusing (R4R) in A Shared Context: An Exploration on Research P...
Relations for Reusing (R4R) in A Shared Context: An Exploration on Research P...Relations for Reusing (R4R) in A Shared Context: An Exploration on Research P...
Relations for Reusing (R4R) in A Shared Context: An Exploration on Research P...
 
A Linked Data Prototype for the Union Catalog of Digital Archives Taiwan
A Linked Data Prototype for the Union Catalog of Digital Archives TaiwanA Linked Data Prototype for the Union Catalog of Digital Archives Taiwan
A Linked Data Prototype for the Union Catalog of Digital Archives Taiwan
 
Semantic Web in Action
Semantic Web in ActionSemantic Web in Action
Semantic Web in Action
 
Crowdsourcing Linked Data Quality Assessment
Crowdsourcing Linked Data Quality AssessmentCrowdsourcing Linked Data Quality Assessment
Crowdsourcing Linked Data Quality Assessment
 
Presentation of Profiling Similarity Links in LOD @ DesWEB, ICDE 2016
Presentation of Profiling Similarity Links in LOD @ DesWEB, ICDE 2016Presentation of Profiling Similarity Links in LOD @ DesWEB, ICDE 2016
Presentation of Profiling Similarity Links in LOD @ DesWEB, ICDE 2016
 
A Practical Ontology for the Large-Scale Modeling of Scholarly Artifacts and ...
A Practical Ontology for the Large-Scale Modeling of Scholarly Artifacts and ...A Practical Ontology for the Large-Scale Modeling of Scholarly Artifacts and ...
A Practical Ontology for the Large-Scale Modeling of Scholarly Artifacts and ...
 
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the CloudFirst Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
 
Interaction with Linked Data
Interaction with Linked DataInteraction with Linked Data
Interaction with Linked Data
 
The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...
The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...
The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...
 
Hack U Barcelona 2011
Hack U Barcelona 2011Hack U Barcelona 2011
Hack U Barcelona 2011
 
Efficient Distributed In-Memory Processing of RDF Datasets - PhD Viva
Efficient Distributed In-Memory Processing of RDF Datasets - PhD VivaEfficient Distributed In-Memory Processing of RDF Datasets - PhD Viva
Efficient Distributed In-Memory Processing of RDF Datasets - PhD Viva
 
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
 
euclid_linkedup WWW tutorial (Besnik Fetahu)
euclid_linkedup WWW tutorial (Besnik Fetahu)euclid_linkedup WWW tutorial (Besnik Fetahu)
euclid_linkedup WWW tutorial (Besnik Fetahu)
 
Named Entity Recognition from Online News
Named Entity Recognition from Online NewsNamed Entity Recognition from Online News
Named Entity Recognition from Online News
 
Efficient Practices for Large Scale Text Mining Process
Efficient Practices for Large Scale Text Mining ProcessEfficient Practices for Large Scale Text Mining Process
Efficient Practices for Large Scale Text Mining Process
 
Data Integration And Visualization
Data Integration And VisualizationData Integration And Visualization
Data Integration And Visualization
 
Managing RDF data with graph databases
Managing RDF data with graph databasesManaging RDF data with graph databases
Managing RDF data with graph databases
 
The Network Data Structure in Computing
The Network Data Structure in ComputingThe Network Data Structure in Computing
The Network Data Structure in Computing
 

Viewers also liked

FAIR Data Prototype - Interoperability and FAIRness through a novel combinati...
FAIR Data Prototype - Interoperability and FAIRness through a novel combinati...FAIR Data Prototype - Interoperability and FAIRness through a novel combinati...
FAIR Data Prototype - Interoperability and FAIRness through a novel combinati...Mark Wilkinson
 
Extraction and Semantic Annotation of Workshop Proceedings in HTML using RML
Extraction and Semantic Annotation of Workshop Proceedings in HTML using RMLExtraction and Semantic Annotation of Workshop Proceedings in HTML using RML
Extraction and Semantic Annotation of Workshop Proceedings in HTML using RMLandimou
 
2014 review of data quality assessment methods
2014 review of data quality assessment methods2014 review of data quality assessment methods
2014 review of data quality assessment methodsRoger Zapata
 
Data Usability Assessment for Remote Sensing Data: Accuracy of Interactive Da...
Data Usability Assessment for Remote Sensing Data: Accuracy of Interactive Da...Data Usability Assessment for Remote Sensing Data: Accuracy of Interactive Da...
Data Usability Assessment for Remote Sensing Data: Accuracy of Interactive Da...Beniamino Murgante
 
LDIF Lightening Talk
LDIF Lightening TalkLDIF Lightening Talk
LDIF Lightening TalkWilliam Smith
 
Assessment & adjustment for data quality used in the South African DISTRICT ...
Assessment & adjustment for data quality used in the South African DISTRICT ...Assessment & adjustment for data quality used in the South African DISTRICT ...
Assessment & adjustment for data quality used in the South African DISTRICT ...Routine Health Information NetwOrk (RHINO)
 
Leveraging DBpedia for Adaptive Crowdsourcing in Linked Data Quality Assessment
Leveraging DBpedia for Adaptive Crowdsourcing in Linked Data Quality AssessmentLeveraging DBpedia for Adaptive Crowdsourcing in Linked Data Quality Assessment
Leveraging DBpedia for Adaptive Crowdsourcing in Linked Data Quality AssessmentUmair ul Hassan
 
Data quality assessment of OSM datasets of Ringroad, Kathmandu, Nepal
Data quality assessment of OSM datasets of Ringroad, Kathmandu, NepalData quality assessment of OSM datasets of Ringroad, Kathmandu, Nepal
Data quality assessment of OSM datasets of Ringroad, Kathmandu, NepalSurvey Department
 
Using Web Data Provenance for Quality Assessment
Using Web Data Provenance for Quality AssessmentUsing Web Data Provenance for Quality Assessment
Using Web Data Provenance for Quality AssessmentOlaf Hartig
 
METHODS, MATHEMATICAL MODELS, DATA QUALITY ASSESSMENT AND RESULT INTERPRETATI...
METHODS, MATHEMATICAL MODELS, DATA QUALITY ASSESSMENT AND RESULT INTERPRETATI...METHODS, MATHEMATICAL MODELS, DATA QUALITY ASSESSMENT AND RESULT INTERPRETATI...
METHODS, MATHEMATICAL MODELS, DATA QUALITY ASSESSMENT AND RESULT INTERPRETATI...HTAi Bilbao 2012
 
MEASURE Evaluation Data Quality Assessment Methodology and Tools
MEASURE Evaluation Data Quality Assessment Methodology and ToolsMEASURE Evaluation Data Quality Assessment Methodology and Tools
MEASURE Evaluation Data Quality Assessment Methodology and ToolsMEASURE Evaluation
 
Data Quality Rules introduction
Data Quality Rules introductionData Quality Rules introduction
Data Quality Rules introductiondatatovalue
 
Data quality overview
Data quality overviewData quality overview
Data quality overviewAlex Meadows
 
Data Quality Dashboards
Data Quality DashboardsData Quality Dashboards
Data Quality DashboardsWilliam Sharp
 
Building a Data Quality Program from Scratch
Building a Data Quality Program from ScratchBuilding a Data Quality Program from Scratch
Building a Data Quality Program from Scratchdmurph4
 
Data quality and data profiling
Data quality and data profilingData quality and data profiling
Data quality and data profilingShailja Khurana
 
Data quality architecture
Data quality architectureData quality architecture
Data quality architectureanicewick
 

Viewers also liked (19)

FAIR Data Prototype - Interoperability and FAIRness through a novel combinati...
FAIR Data Prototype - Interoperability and FAIRness through a novel combinati...FAIR Data Prototype - Interoperability and FAIRness through a novel combinati...
FAIR Data Prototype - Interoperability and FAIRness through a novel combinati...
 
Extraction and Semantic Annotation of Workshop Proceedings in HTML using RML
Extraction and Semantic Annotation of Workshop Proceedings in HTML using RMLExtraction and Semantic Annotation of Workshop Proceedings in HTML using RML
Extraction and Semantic Annotation of Workshop Proceedings in HTML using RML
 
2014 review of data quality assessment methods
2014 review of data quality assessment methods2014 review of data quality assessment methods
2014 review of data quality assessment methods
 
Data Usability Assessment for Remote Sensing Data: Accuracy of Interactive Da...
Data Usability Assessment for Remote Sensing Data: Accuracy of Interactive Da...Data Usability Assessment for Remote Sensing Data: Accuracy of Interactive Da...
Data Usability Assessment for Remote Sensing Data: Accuracy of Interactive Da...
 
LDIF Lightening Talk
LDIF Lightening TalkLDIF Lightening Talk
LDIF Lightening Talk
 
Assessment & adjustment for data quality used in the South African DISTRICT ...
Assessment & adjustment for data quality used in the South African DISTRICT ...Assessment & adjustment for data quality used in the South African DISTRICT ...
Assessment & adjustment for data quality used in the South African DISTRICT ...
 
Leveraging DBpedia for Adaptive Crowdsourcing in Linked Data Quality Assessment
Leveraging DBpedia for Adaptive Crowdsourcing in Linked Data Quality AssessmentLeveraging DBpedia for Adaptive Crowdsourcing in Linked Data Quality Assessment
Leveraging DBpedia for Adaptive Crowdsourcing in Linked Data Quality Assessment
 
LDQ 2014 DQ Methodology
LDQ 2014 DQ MethodologyLDQ 2014 DQ Methodology
LDQ 2014 DQ Methodology
 
Data quality assessment of OSM datasets of Ringroad, Kathmandu, Nepal
Data quality assessment of OSM datasets of Ringroad, Kathmandu, NepalData quality assessment of OSM datasets of Ringroad, Kathmandu, Nepal
Data quality assessment of OSM datasets of Ringroad, Kathmandu, Nepal
 
Using Web Data Provenance for Quality Assessment
Using Web Data Provenance for Quality AssessmentUsing Web Data Provenance for Quality Assessment
Using Web Data Provenance for Quality Assessment
 
METHODS, MATHEMATICAL MODELS, DATA QUALITY ASSESSMENT AND RESULT INTERPRETATI...
METHODS, MATHEMATICAL MODELS, DATA QUALITY ASSESSMENT AND RESULT INTERPRETATI...METHODS, MATHEMATICAL MODELS, DATA QUALITY ASSESSMENT AND RESULT INTERPRETATI...
METHODS, MATHEMATICAL MODELS, DATA QUALITY ASSESSMENT AND RESULT INTERPRETATI...
 
MEASURE Evaluation Data Quality Assessment Methodology and Tools
MEASURE Evaluation Data Quality Assessment Methodology and ToolsMEASURE Evaluation Data Quality Assessment Methodology and Tools
MEASURE Evaluation Data Quality Assessment Methodology and Tools
 
Data Quality Rules introduction
Data Quality Rules introductionData Quality Rules introduction
Data Quality Rules introduction
 
Data quality overview
Data quality overviewData quality overview
Data quality overview
 
Data Quality Dashboards
Data Quality DashboardsData Quality Dashboards
Data Quality Dashboards
 
Building a Data Quality Program from Scratch
Building a Data Quality Program from ScratchBuilding a Data Quality Program from Scratch
Building a Data Quality Program from Scratch
 
Data Quality Definitions
Data Quality DefinitionsData Quality Definitions
Data Quality Definitions
 
Data quality and data profiling
Data quality and data profilingData quality and data profiling
Data quality and data profiling
 
Data quality architecture
Data quality architectureData quality architecture
Data quality architecture
 

Similar to Mappings Validation

Test-driven Assessment of [R2]RML Mappings to Improve Dataset Quality
Test-driven Assessment of [R2]RML Mappings to Improve Dataset Quality Test-driven Assessment of [R2]RML Mappings to Improve Dataset Quality
Test-driven Assessment of [R2]RML Mappings to Improve Dataset Quality andimou
 
Crowdsourcing Linked Data Quality Assessment
Crowdsourcing Linked Data Quality AssessmentCrowdsourcing Linked Data Quality Assessment
Crowdsourcing Linked Data Quality AssessmentAmrapali Zaveri, PhD
 
High quality Linked Data generation for librarians
High quality Linked Data generation for librariansHigh quality Linked Data generation for librarians
Mappings Validation

  • 1. Mappings Validation Data Quality Tutorial - SEMANTICS2016 Anastasia Dimou Anastasia.Dimou@ugent.be ● @natadimou Ghent University – iMinds
  • 2. Linked (Open) Data semantically annotated & interlinked data using different vocabularies or ontologies published in the form of RDF datasets
  • 3. Linked (Open) Data derive from originally heterogeneous (semi-)structured data e.g. Eurostat from TSV DBLP from DBLP database DBpedia from Wikipedia LinkedBrainz from MusicBrainz database ... … …
  • 4. Linked Data Quality in the context of Linked Data generation and publication workflow
  • 5. Linked Data Quality dimensions Representational dimension Intrinsic dimension Accessibility dimension Contextual dimension A. Zaveri, A. Rula, A. Maurino, R. Pietrobon, J. Lehmann, and S. Auer. Quality Assessment for Linked Data: A Survey. Semantic Web Journal, 2016.
  • 6. Linked Data Quality dimensions Representational dimension data modeling Intrinsic dimension Linked Data generation Accessibility dimension Linked Data publication Contextual dimension Linked Data consumption
  • 8. Linked Data Quality - Intrinsic Dimension determines the RDF Dataset Quality by assessing it for possible violations with respect to accuracy (e.g. malformed datatype literals) consistency (e.g. disjoint classes/properties)
  • 9. Instead of applying Quality Assessment to the already published Linked Data as part of Linked Data consumption Apply Quality Assessment to the Mappings that generate the Linked Data as part of Linked Data production
  • 10. Linked Dataset Quality Assessment (DQA) Mappings Quality Assessment (MQA) Mapping & Dataset Quality Assessment Workflow Mappings & Quality Assessment Evaluation Results
  • 17. Linked Data Quality Assessment (DQA) RDFUnit http://rdfunit.aksw.org test-driven data-debugging framework based on SPARQL patterns. D. Kontokostas, P. Westphal, S. Auer, S. Hellmann, J. Lehmann, R. Cornelissen, and A. J. Zaveri. Test-driven evaluation of linked data quality. In Proceedings of the 23rd International Conference on World Wide Web.
  • 18. DQA with RDFUnit …WHERE { ?resource %%P1%% ?c. FILTER (DATATYPE(?c) != %%D1%%) } …WHERE { ?resource dbo:birthDate ?c. FILTER (DATATYPE(?c) != xsd:date) }
  • 19. 10 domain violations 10 datatype violations
  • 21. Linked Data Quality Assessment (DQA) Similar violations occur repeatedly within a single Linked Data set
  • 22. Linked Data Quality Assessment (DQA) Sets of triples of a dataset have repetitive patterns
  • 24. DQA: Linked Data Quality Assessment is applied by third parties to already published Linked Data sets violations DQA
  • 25. DQA: Linked Data Quality Assessment Adjustments are NOT applied at the root of the problem violations DQA
  • 26. DQA: Linked Data Quality Assessment Adjustments are overwritten if a new version of the original data is annotated and published as Linked Data violations DQA
  • 27. Instead of applying Quality Assessment to the already published Linked Data set as part of data consumption
  • 28. Apply Quality Assessment to the Mappings that generate the Linked Data. A. Dimou, D. Kontokostas, M. Freudenberg, R. Verborgh, J. Lehmann, E. Mannens, S. Hellmann, R. Van de Walle. Assessing and Refining Mappings to RDF to Improve Dataset Quality. In Proceedings of The Semantic Web - ISWC 2015.
  • 29. Linked Dataset Quality Assessment (DQA) Mappings Quality Assessment (MQA) Mapping & Dataset Quality Assessment Workflow Mappings & Quality Assessment Evaluation Results
  • 30. Mapping languages formalize patterns into rules to generate Linked Data from some original data
  • 31. RDF Mapping Language (RML) http://rml.io extends the W3C-recommended R2RML specify the mapping rules to generate Linked Data from heterogeneous data sources mapping rules are Linked Data sets too! A. Dimou, M. Vander Sande, P. Colpaert, R. Verborgh, E. Mannens, and R. Van de Walle. RML: A Generic Language for Integrated RDF Mappings of Heterogeneous Data. In Proceedings of the 7th Workshop on Linked Data on the Web (LDOW2014), 2014.
  • 32. RDF Mapping Language (RML) http://rml.io <#Mapping> rr:subjectMap [ rr:class dbo:Event ; rr:template "http://example.com/{Name}" ] ; rr:predicateObjectMap [ rr:predicate dbo:birthDate ; rr:objectMap [ rml:reference "Birth" ; rr:datatype xsd:gYear ] ] .
  • 33. RDF Mapping Language (RML) http://rml.io
  • 34. data map doc Mapping Processor RDF Mapping Language (RML) http://rml.io
  • 35. data map doc Mapping Processor violations DQA DQA: Linked Data Quality Assessment
  • 39. DQA with RDFUnit over RML …WHERE { ?resource %%P1%% ?c. FILTER (DATATYPE(?c) != %%D1%%) } …WHERE { ?resource dbo:birthDate ?c. FILTER (DATATYPE(?c) != xsd:date) }
  • 40. D→MQA with RDFUnit over RML …WHERE { ?resource %%P1%% ?c. FILTER (DATATYPE(?c) != %%D1%%) } …WHERE { ?resource dbo:birthDate ?c. FILTER (DATATYPE(?c) != xsd:date) }
  • 41. D→MQA with RDFUnit over RML …WHERE { ?resource %%P1%% ?c. FILTER (DATATYPE(?c) != %%D1%%) } …WHERE { ?resource dbo:birthDate ?c. FILTER (DATATYPE(?c) != xsd:date) } <#Mapping> rr:subjectMap [ rr:class dbo:Event ; rr:template "http://example.com/{Name}" ] ; rr:predicateObjectMap [ rr:predicate dbo:birthDate ; rr:objectMap [ rml:reference "Birth" ; rr:datatype xsd:gYear ] ] .
  • 42. D→MQA with RDFUnit over RML …WHERE { ?resource %%P1%% ?c. FILTER (DATATYPE(?c) != %%D1%%) } …WHERE { ?resource dbo:birthDate ?c. FILTER (DATATYPE(?c) != xsd:date) } … WHERE { ?resource rr:predicateObjectMap ?poMap. ?poMap rr:predicate %%P1%%; rr:objectMap ?objM. ?objM rr:datatype ?c. FILTER (?c != %%D1%%) } <#Mapping> rr:subjectMap [ rr:class dbo:Event ; rr:template "http://example.com/{Name}" ] ; rr:predicateObjectMap [ rr:predicate dbo:birthDate ; rr:objectMap [ rml:reference "Birth" ; rr:datatype xsd:gYear ] ] .
  • 44. MQA with RDFUnit over RML …WHERE { ?resource %%P1%% ?c. FILTER (DATATYPE(?c) != %%D1%%) } …WHERE { ?resource dbo:birthDate ?c. FILTER (DATATYPE(?c) != xsd:date) } … WHERE { ?resource rr:predicateObjectMap ?poMap. ?poMap rr:predicate %%P1%%; rr:objectMap ?objM. ?objM rr:datatype ?c. FILTER (?c != %%D1%%) } <#Mapping> rr:subjectMap [ rr:class dbo:Event ; rr:template "http://example.com/{Name}" ] ; rr:predicateObjectMap [ rr:predicate dbo:birthDate ; rr:objectMap [ rml:reference "Birth" ; rr:datatype xsd:gYear ] ] . ONLY 1 domain violation! ONLY 1 datatype violation!
  • 45. data map doc Mapping Processor violations MDQA MDQA: Uniform Mapping & Dataset Quality Assessment
  • 46. Linked Dataset Quality Assessment (DQA) Mappings Quality Assessment (MQA) Mapping & Dataset Quality Assessment Workflow Mappings & Quality Assessment Evaluation Results
  • 47. MQA: Mapping Quality Assessment discover not only the violations but also their origin before they are even generated
  • 48. MQA: Mapping Quality Assessment easily apply structural adjustments; prevent the same violations from appearing repeatedly over distinct entities; allow intuitively combining different ontologies and vocabularies
  • 49. data map doc Mapping Processor violations MDQA MDQA: Uniform Mapping & Dataset Quality Assessment
  • 50. data map doc Mapping Processor violations MDQA MDQA: Uniform Mapping & Dataset Quality Assessment <#Result> rut:testCase rut:datatypeError ; spin:violationRoot <#ObjectMap> ; spin:violationPath rr:datatype ; spin:violationValue xsd:gYear ; rut:missingValue xsd:date .
  • 51. data map doc Mapping Processor Mapping Refinements violations MDQA Uniform Mapping & Dataset Quality Assessment Workflow
  • 52. Correcting MQA violations with RML Editor
  • 55. data map doc Mapping Processor violations MDQA MDQA: Uniform Mapping & Dataset Quality Assessment <#Result> rut:testCase rut:datatypeError ; spin:violationRoot <#ObjectMap> ; spin:violationPath rr:datatype ; spin:violationValue xsd:gYear ; rut:missingValue xsd:date . DEL: <#ObjectMap> rr:datatype xsd:gYear. ADD: <#ObjectMap> rr:datatype xsd:date.
  • 56. MQA with RDFUnit over RML <#Result> rut:testCase rut:datatypeError ; spin:violationRoot <#ObjectMap> ; spin:violationPath rr:datatype ; spin:violationValue xsd:float ; rut:missingValue xsd:int . DEL: <#ObjectMap> rr:datatype xsd:gYear. ADD: <#ObjectMap> rr:datatype xsd:date. DEL: <#SubjectMap> rr:class dbo:Event. ADD: <#SubjectMap> rr:class dbo:Person.
  • 57. MQA with RDFUnit over RML <#Result> rut:testCase rut:datatypeError ; spin:violationRoot <#ObjectMap> ; spin:violationPath rr:datatype ; spin:violationValue xsd:float ; rut:missingValue xsd:int . DEL: <#ObjectMap> rr:datatype xsd:gYear. ADD: <#ObjectMap> rr:datatype xsd:date. <#Mapping> rr:subjectMap [ rr:class dbo:Person ; rr:template "http://example.com/{Name}" ] ; rr:predicateObjectMap [ rr:predicate dbo:birthDate ; rr:objectMap [ rml:reference "Birth" ; rr:datatype xsd:date ] ] . DEL: <#SubjectMap> rr:class dbo:Event. ADD: <#SubjectMap> rr:class dbo:Person.
  • 60. Uniform Mapping & Dataset Quality Assessment Workflow
  • 62. Mapping Quality Assessment: Limitations certain test cases inevitably require the complete Linked Data set
  • 63. Mapping Quality Assessment: Limitations certain test cases inevitably require the complete Linked Data set cardinality, functionality, symmetricity
  • 64. Mapping Quality Assessment: Limitations certain test cases inevitably require the complete Linked Data set: cardinality, functionality, symmetricity. In the mappings' defense: these are more of a data issue, NOT affected by the mapping rules
  • 65. Linked Dataset Quality Assessment (DQA) Mappings Quality Assessment (MQA) Mapping & Dataset Quality Assessment Workflow Mappings & Quality Assessment Evaluation Results
  • 66. Dataset vs Mapping Quality Assessment: Number of Violations. #violations per Quality Assessment: DBpedia EN: 3.2M (Dataset Assessment) vs 160 (Mappings Assessment); DBLP: 8.1M (Dataset Assessment) vs 8 (Mappings Assessment). * DBpedia and DBLP D2RQ Mappings were translated to RML mappings. A. Dimou, D. Kontokostas, M. Freudenberg, R. Verborgh, J. Lehmann, E. Mannens, S. Hellmann, R. Van de Walle. Assessing and Refining Mappings to RDF to Improve Dataset Quality. In Proceedings of The Semantic Web - ISWC 2015.
  • 67. Dataset vs Mapping Quality Assessment: Time. DBpedia EN: Dataset Quality Assessment size 62M, time 16h; Mappings Quality Assessment size 115K, time 11s. DBpedia NL: Dataset Quality Assessment size 21M, time 1.5h; Mappings Quality Assessment size 53K, time 6s. DBLP: Dataset Quality Assessment size 12M, time 12h; Mappings Quality Assessment size 368, time 12s. A. Dimou, D. Kontokostas, M. Freudenberg, R. Verborgh, J. Lehmann, E. Mannens, S. Hellmann, R. Van de Walle. Assessing and Refining Mappings to RDF to Improve Dataset Quality. In Proceedings of The Semantic Web - ISWC 2015.
  • 68. Mapping Quality Assessment * http://mappings.dbpedia.org/validation Live update of DBpedia Mapping Quality Assessment results every night! ☺ Mapping Quality Assessment: DBpedia EN: size 115K, time 11s; DBpedia NL: size 53K, time 6s; DBpedia All: size 511K, time 32s. A. Dimou, D. Kontokostas, M. Freudenberg, R. Verborgh, J. Lehmann, E. Mannens, S. Hellmann, R. Van de Walle. Assessing and Refining Mappings to RDF to Improve Dataset Quality. In Proceedings of The Semantic Web - ISWC 2015.
  • 69. * http://mappings.dbpedia.org/validation DBpedia Mappings Quality Assessment. A. Dimou, D. Kontokostas, M. Freudenberg, R. Verborgh, J. Lehmann, E. Mannens, S. Hellmann. DBpedia Mappings Quality Assessment. To be published in Proceedings of the 15th International Semantic Web Conference: Posters and Demos, 2016. Live update of DBpedia Mapping Quality Assessment results every night! ☺
  • 70. Linked Dataset Quality Assessment (DQA) Mappings Quality Assessment (MQA) Mapping & Dataset Quality Assessment Workflow Mappings & Quality Assessment Evaluation Results
  • 71. Violations are related to the dataset's schema (vocabularies or ontologies) and occur repeatedly within a single RDF dataset. The situation worsens the more ontologies and vocabularies are reused and combined
  • 72. Linked Data Quality Assessment shifted from data consumption to data publication integrated systematically in the publishing workflow violations are identified, resolved and will not re-appear Linked Data of higher Quality is generated!!!
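
The mapping-level check on slides 42 to 57 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not RDFUnit's implementation: the mapping from slide 32 is flattened by hand into triples, the blank-node labels (`_:sm`, `_:pom`, `_:om`) and the `EXPECTED_RANGE` / `EXPECTED_DOMAIN` tables are hypothetical stand-ins for what a real setup would read from the ontology, and the domain check uses plain equality with no subclass reasoning. The datatype check follows the same graph pattern as the SPARQL query on slide 42.

```python
# Mapping rules from slide 32, flattened into (subject, predicate, object) triples.
# The blank-node labels (_:sm, _:pom, _:om) are hypothetical names for this sketch.
MAPPING = {
    ("<#Mapping>", "rr:subjectMap", "_:sm"),
    ("_:sm", "rr:class", "dbo:Event"),
    ("_:sm", "rr:template", "http://example.com/{Name}"),
    ("<#Mapping>", "rr:predicateObjectMap", "_:pom"),
    ("_:pom", "rr:predicate", "dbo:birthDate"),
    ("_:pom", "rr:objectMap", "_:om"),
    ("_:om", "rml:reference", "Birth"),
    ("_:om", "rr:datatype", "xsd:gYear"),
}

# Assumed schema expectations, written by hand for this example.
EXPECTED_RANGE = {"dbo:birthDate": "xsd:date"}
EXPECTED_DOMAIN = {"dbo:birthDate": "dbo:Person"}


def objects(triples, subject, predicate):
    """Return every o such that (subject, predicate, o) is in triples."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def datatype_violations(triples, expected_range):
    """Same pattern as slide 42: follow predicateObjectMap -> objectMap -> datatype."""
    found = []
    for _, p, po_map in sorted(triples):
        if p != "rr:predicateObjectMap":
            continue
        for predicate in objects(triples, po_map, "rr:predicate"):
            if predicate not in expected_range:
                continue
            for obj_map in objects(triples, po_map, "rr:objectMap"):
                for datatype in objects(triples, obj_map, "rr:datatype"):
                    if datatype != expected_range[predicate]:
                        found.append((obj_map, datatype, expected_range[predicate]))
    return found


def domain_violations(triples, expected_domain):
    """Flag subject maps whose rr:class does not include the expected domain."""
    found = []
    for mapping, p, po_map in sorted(triples):
        if p != "rr:predicateObjectMap":
            continue
        for predicate in objects(triples, po_map, "rr:predicate"):
            if predicate not in expected_domain:
                continue
            for subj_map in objects(triples, mapping, "rr:subjectMap"):
                classes = objects(triples, subj_map, "rr:class")
                if expected_domain[predicate] not in classes:
                    found.append((subj_map, classes, expected_domain[predicate]))
    return found


def refine(triples, delete, add):
    """Apply DEL/ADD refinements, as on slides 55 to 57."""
    return (set(triples) - set(delete)) | set(add)


if __name__ == "__main__":
    print(datatype_violations(MAPPING, EXPECTED_RANGE))
    print(domain_violations(MAPPING, EXPECTED_DOMAIN))
    refined = refine(
        MAPPING,
        delete=[("_:om", "rr:datatype", "xsd:gYear"), ("_:sm", "rr:class", "dbo:Event")],
        add=[("_:om", "rr:datatype", "xsd:date"), ("_:sm", "rr:class", "dbo:Person")],
    )
    print(datatype_violations(refined, EXPECTED_RANGE))
    print(domain_violations(refined, EXPECTED_DOMAIN))
```

Because the check runs over the handful of mapping triples rather than over every generated triple, each faulty rule is reported once, which is the effect the deck contrasts with dataset-level assessment.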