SlideShare a Scribd company logo
1 of 63
Validating RDF data:
Challenges and perspectives
Jose Emilio Labr Gayo
WESO Research Group (WEb Semantics Oviedo)
University of Oviedo, Spain
About me…
Founded WESO (Web Semantics Oviedo) research group
Practical applications of semantic technologies since 2004
Several domains: e-Government, e-Health
Some books:
"Web semántica" (in Spanish), 2012
"Validating RDF data", 2017
…and software:
SHaclEX (Scala library, implements ShEx & SHACL)
RDFShape (RDF playground)
HTML version: http://book.validatingrdf.com
Examples: https://github.com/labra/validatingRDFBookExamples
Contents
Short intro to semantic Web & knowledge graphs
Short overview of RDF
Understanding the RDF validation problem
2 technologies: ShEx/SHACL
Challenges & perspectives
Semantic Web
Vision of the Web as web of data
Not only pages, but also data: linked data
Data understandable by machines
Related with:
Big Data
Artificial intelligence
Knowledge representation
Example: Wikipedia vs Wikidata
Tim Berners-Lee
Source: Wikipedia
Source: http://www.internetlivestats.com/total-number-of-websites/
Date: 05/04/2019 (19:05h)
Who consumes Web information?
People
We access through some device
...and Machines
They show us web pages (browsers)
...but also manage information (bots)
Filtering content
Doing suggestions
...
"If Google doesn't understand your Web page, it almost doesn't exist"
Web information
Documentos
Information
retrieval
DocumentosDocumentos
Posts
Tweets
Web pages
...
Publish
Web
documents
Reality
Mental model
Web information must be machine-processable
Semantic web technologies focus on how to achieve this goal
Sensors Generate
User
Content
generation &
filtering
People vs Machines
Programmed for some tasks
Predictable (no mistakes*)
No problem with repetitive tasks
Problems to understand the context
Creativity, imagination
Unpredictable (we do mistakes)
We get tired with repetitive tasks
Understanding based on context
*when well programmed
Information understandable by machines?
Example: "Oviedo has a temperature of 36 degrees"
Problem: Ambiguity and context identification
Oviedo ? It can be... A city in Spain
...or a city in Florida, USA
...or a soccer player http://www.wikidata.org/entity/Q325997
http://wikidata.org/entity/Q14317
http://www.wikidata.org/entity/Q1813449
...has a temperatrure of... https://www.wikidata.org/wiki/Property:P2076
Identifiers (URIs) in Wikidata
Machine representation
http://www.wikidata.org/entity/Q14317
https://www.wikidata.org/prop/direct/P2076
36^^<http://www.w3.org/2001/XMLSchema#integer>
Knowledge representation for machines
Simple statement (triple 3 elements): subject, predicate, object
http://www.wikidata.org/entity/Q14317
https://www.wikidata.org/prop/direct/P2076
subject predicate object
36^^<http://www.w3.org/2001/XMLSchema#integer>
wd:Q14317
wdt:P2076
36^^xsd:integer
wd: <http://www.wikidata.org/entity/>
wdt: <https://www.wikidata.org/prop/direct/>
xsd: <http://www.w3.org/2001/XMLSchema#>
Prefix declarations help simplify long URIs
Knowledge graphs
wd:Q14317 36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
BirthPlace
Picture BirthDate
capital
temperature
population
Oviedo
Asturias
Fernando
Alonso
Wikidata statistics (17-07-2019)
57,462,853 entities
728,979,397 triples
Source: https://www.wikidata.org/wiki/Wikidata:Statistics
Wikidata is an example of a knowledge graph
There are a lot more:
Open: DBpedia, BabelNet, etc
Proprietary: Google, facebook, Microsoft, etc. also have knowledge graphs
Web information & knowledge graphs
Documentos
Information
retrieval
Knowledge
graphs
DocumentosDocumentos
Posts
Tweets
Web pages
...
Publish
Web
documents
Reality
Mental model
Knowledge graphs are a key part of the Web nowadays
Sensors Generate
User
Content
generation &
filtering
RDF
RDF (Resource Description Framework)
Allows to represent knowledge graphs
RDF data model = graph as a set of statements
Predicates = URIs
Several syntaxes
Turtle, N-Triples, JSON-LD,...
36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
BirthPlace
Picture
BirthDate
capital
temperature
population
Oviedo
Asturias
Fernando
Alonso
wd:Q14317
prefix wd: <http://www.wikidata.org/entity/>
prefix wdt: <https://www.wikidata.org/prop/direct/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
wd:Q14317 wdt:P1082 220020 ;
wdt:P2076 36 ;
wdt:P1376 wd:Q3934 .
wd:Q10514 wdt:P19 wd:Q14317 ;
wdt:P569 "1981-07-29"^^xsd:date ;
wdt:P18 <picture.jpg> .
RDF syntaxes: Turtle
36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
BirthPlace
Picture
BirthDate
capital
temperature
population
Oviedo
Asturias
Fernando
Alonso
wd:Q14317
<http://www.wikidata.org/entity/Q14317> <https://www.wikidata.org/prop/direct/P1082> "220020"^^<http://www.w3.org/2001/XMLSchema#integer> .
<http://www.wikidata.org/entity/Q14317> <https://www.wikidata.org/prop/direct/P1376> <http://www.wikidata.org/entity/Q3934> .
<http://www.wikidata.org/entity/Q14317> <https://www.wikidata.org/prop/direct/P2076> "36"^^<http://www.w3.org/2001/XMLSchema#integer> .
<http://www.wikidata.org/entity/Q10514> <https://www.wikidata.org/prop/direct/P18> <picture.jpg> .
<http://www.wikidata.org/entity/Q10514> <https://www.wikidata.org/prop/direct/P19> <http://www.wikidata.org/entity/Q14317> .
<http://www.wikidata.org/entity/Q10514> <https://www.wikidata.org/prop/direct/P569> "1981-07-29"^^<http://www.w3.org/2001/XMLSchema#date> .
RDF syntaxes: N-Triples
36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
BirthPlace
Picture
BirthDate
capital
temperature
population
Oviedo
Asturias
Fernando
Alonso
wd:Q14317
{
"@context" : "http://...wikidata.json",
"@graph" : [ {
"@id" : "wd:Q10514",
"wdt:P18" : "picture.jpg",
"wdt:P19" : "wd:Q14317",
"P569" : "1981-07-29"
}, {
"@id" : "wd:Q14317",
"wdt:P1082" : 220020,
"wdt:P1376" : "wd:Q3934",
"wdt:P2076" : 36
}
]
}
RDF syntaxes: JSON-LD
36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
BirthPlace
Picture
BirthDate
capital
temperature
population
Oviedo
Asturias
Fernando
Alonso
wd:Q14317
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:wdt="https://www.wikidata.org/prop/direct/"
xmlns:wd="http://www.wikidata.org/entity/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
<rdf:Description rdf:about="http://www.wikidata.org/entity/Q10514">
<wdt:P18>picture.jpg</wdt:P18>
<wdt:P19>
<rdf:Description rdf:about="http://www.wikidata.org/entity/Q14317">
<wdt:P1082 rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">220020</wdt:P1082>
<wdt:P1376 rdf:resource="http://www.wikidata.org/entity/Q3934"/>
<wdt:P2076 rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">36</wdt:P2076>
</rdf:Description>
</wdt:P19>
<wdt:P569 rdf:datatype=http://www.w3.org/2001/XMLSchema#date>1981-07-29</wdt:P569>
</rdf:Description>
</rdf:RDF>
RDF syntaxes: RDF/XML
36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
BirthPlace
Picture
BirthDate
capital
temperature
population
Oviedo
Asturias
Fernando
Alonso
wd:Q14317
RDF graphs are composable
RDF helps information integration
36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
wd:Q14317
:alice
:bob
:carol
schema:birthPlace
schema:birthPlace
schema:knows
schema:knows
Alice Cooper
Schema:name Robert Smith
schema:nameschema:knows
Carol King
schema:name
RDF data model
Graph = set of statements (triples)
Predicates (arcs) = URIs
Subjects: URIs*
Objects: URIs or literals* 36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
wd:Q14317
*Exception: blank nodes (next slide)
RDF data model: Blank nodes
A statement about something
Example: "Alice knows someone who knows Bob"
:alice
schema:knows
:bob
schema:knows
:alice schema:knows _:x ;
_:x schema:knows :bob .
Notation (in Turtle)
:alice schema:knows [
schema:knows :bob
] .
 x ( :alice :knows x 
x :knows :bob )
RDF data model: Literals
:bob schema:name "Robert"^^xsd:string ;
schema:age "18"^^xsd:integer ;
schema:birthDate "2010-04-12"^^xsd:date .
:bob schema:name "Robert" ;
schema:age 18 ;
schema:birthDate "2010-04-12"^^xsd:date .
Objects can also be literals
Literals contain a lexical form and a datatype
Common datatypes: XML Schema primitive datatypes
If not specified, a literal has type xsd:string
:bob
schema:name
"2010-04-12"^^xsd:date18"Robert Smith"
schema:age
schema:birthDate
RDF data model: Language tagged strings
String literals can be qualified by a language tag
They have datatype rdfs:langString
:spain rdfs:label "Spain"@en ;
rdfs:label "España"@es .
:Spain
"España"@es"Spain"@en
rdfs:label
rdfs:label
...and that's all
Yes, the RDF Data model is simple
RDF ecosystem
SPARQL
Vocabularies
Ontologies and inference
SELECT ?name ?birthDate ?picture WHERE {
?person wdt:P19 wd:Q14317 ;
rdfs:label ?name ;
wdt:P569 ?birthDate ;
wdt:P18 ?picture .
FILTER (Lang(?name) = 'es')
}
SPARQL
SPARQL = Simple Protocol and RDF Query Language
Similar to SQL, but for RDF
Example: People born in Oviedo
wdt:P19
"1981-07-29"^^xsd:date
wdt:P18
wdt:P569
Oviedo
wd:Q14317
"Fernando Alonso"@en
rdfs:label
?person
?birthPlace?name ?picture
wdt:P19
wdt:P18wdt:P569
wd:Q14317
rdfs:label
wd:Q10514
wdt:P19
...
Shared entities and vocabularies
URIs as global identifiers allow to reuse them
Lots of vocabularies: domain specific or general purpose
Examples:
schema.org: properties that major browsers can recognize
Linked Open Vocabularies
Inference and ontologies
Some vocabularies define properties that can infer new information
RDF Schema: Classes, subclasses, etc.
OWL: Ontologies using description logics
Trade-off between expressiveness and complexity
rdf:type
:Student
:alice
:Person
rdfs:subClassOf
:HumanBeing
owl:sameAs
rdf:type
rdf:type
RDF, the good parts...
RDF as an integration language
RDF as a lingua franca for semantic web and linked data
RDF flexibility & integration
Data can be adapted to multiple environments
Open and reusable data by default
RDF for knowledge representation
RDF data stores & SPARQL
RDF, the other parts
Consuming & producing RDF
Multiple syntaxes: Turtle, RDF/XML, JSON-LD, ...
Embedding RDF in HTML
Describing and validating RDF content
Producer Consumer
Why describe & validate RDF?
For producers
Developers can understand the contents they are going to produce
Ensure they produce the expected structure
Advertise and document the structure
Generate interfaces
For consumers
Understand the contents
Verify the structure before processing it
Query generation & optimization
Producer Consumer
Shapes
Similar technologies
Technology Schema
Relational Databases DDL
XML DTD, XML Schema, RelaxNG, Schematron
Json Json Schema
RDF ?
Fill that gap
Understanding the problem
Identifying the shape of graphs...
Shapes can describe the form of a node (node constraint)
...and the number of possible arcs incoming/outgoing from a node
...and the possible values associated with those arcs
:alice schema:name "Alice";
schema:knows :bob, :carol .
IRI schema:name string 1
schema:knows IRI 0, 1,...
RDF Node
Shape
RDF Node that
represents a User
<UserShape> IRI {
schema:name xsd:string ;
schema:knows IRI *
}
ShEx
Understanding the problem
Repeated properties
The same property can be used for different purposes in the same data
Example: A product must have 2 codes with different structure
:product schema:productID "isbn:123-456-789";
schema:productID "code456" .
A practical example from FHIR
See: http://hl7-fhir.github.io/observation-example-bloodpressure.ttl.html
Understanding the problem
Shapes ≠ types
Nodes in RDF graphs can have 0, 1 or many rdf:type declarations
A type can be used in multiple contexts, e.g. foaf:Person
Nodes are not necessarily annotated with discriminating types
Nodes with type :Person can represent friends, students, patients,...
Different meanings and different structure depending on context
Specific validation constraints for different contexts
Understanding the problem
RDF validation ≠ ontology definition ≠ instance data
Ontologies are usually focused on domain entities
RDF validation is focused on RDF graph features (lower level)
Ontology
Constraints
RDF Validation
Instance data
Different levels
:alice schema:name "Alice";
schema:knows :bob .
<User> IRI {
schema:name xsd:string ;
schema:knows IRI
}
schema:knows a owl:ObjectProperty ;
rdfs:domain schema:Person ;
rdfs:range schema:Person .
A user must have only two properties:
schema:name of value xsd:string
schema:knows with an IRI value
Why not using SPARQL to validate?
Pros:
Expressive
Ubiquitous
Cons
Expressive
Idiomatic
many ways to encode the same constraint
ASK {{ SELECT ?Person {
?Person schema:name ?o .
} GROUP BY ?Person HAVING (COUNT(*)=1)
}
{ SELECT ?Person {
?Person schema:name ?o .
FILTER ( isLiteral(?o) &&
datatype(?o) = xsd:string )
} GROUP BY ?Person HAVING (COUNT(*)=1)
}
{ SELECT ?Person (COUNT(*) AS ?c1) {
?Person schema:gender ?o .
} GROUP BY ?Person HAVING (COUNT(*)=1)}
{ SELECT ?Person (COUNT(*) AS ?c2) {
?S schema:gender ?o .
FILTER ((?o = schema:Female ||
?o = schema:Male))
} GROUP BY ?Person HAVING (COUNT(*)=1)}
FILTER (?c1 = ?c2)
}
Example: Define SPARQL query to check:
There must be one schema:name which
must be a xsd:string, and one
schema:gender which must be
schema:Male or schema:Female
Previous/other approaches
SPIN, by TopQuadrant http://spinrdf.org/
SPARQL templates, Influenced SHACL
Stardog ICV: http://docs.stardog.com/icv/icv-specification.html
OWL with UNA and CWA
OSLC resource shapes: https://www.w3.org/Submission/shapes/
Vocabulary for RDF validation
Dublin Core Application profiles (K. Coyle, T. Baker)
http://dublincore.org/documents/dc-dsp/
RDF Data Descriptions (Fischer et al)
http://ceur-ws.org/Vol-1330/paper-33.pdf
RDFUnit (D. Kontokostas)
http://aksw.org/Projects/RDFUnit.html
...
ShEx and SHACL
2013 RDF Validation Workshop
Conclusions of the workshop:
There is a need of a higher level, concise language for RDF Validation
ShEx initially proposed (v 1.0)
2014 W3c Data Shapes WG chartered
2017 SHACL accepted as W3C recommendation
2017 ShEx 2.0 released as Community group draft
2018 ShEx adopted by Wikidata
Short intro to ShEx
ShEx (Shape Expressions Language)
Concise and human-readable language for RDF validation & description
Syntax similar to SPARQL, Turtle
Semantics inspired by regular expressions & RelaxNG
2 syntaxes: Compact and RDF/JSON-LD
Official info: http://shex.io
Semantics: http://shex.io/shex-semantics/, primer: http://shex.io/shex-primer
ShEx implementations and playgrounds
Libraries:
shex.js: Javascript
SHaclEX: Scala (Jena/RDF4j)
PyShEx: Python
shex-java: Java
Ruby-ShEx: Ruby
Online demos & playgrounds
ShEx-simple
RDFShape
ShEx-Java
ShExValidata
Simple example
Nodes conforming to <User> shape must:
• Be IRIs
• Have exactly one schema:name with a value of type xsd:string
• Have zero or more schema:knows whose values conform to <User>
prefix schema: <http://schema.org/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
<User> IRI {
schema:name xsd:string ;
schema:knows @<User> *
}
Prefix
declarations
as in
Turtle/SPARQL
RDF Validation using ShEx
:alice schema:name "Alice" ;
schema:knows :alice .
:bob schema:knows :alice ;
schema:name "Robert".
:carol schema:name "Carol", "Carole" .
:dave schema:name 234 .
:emily foaf:name "Emily" .
:frank schema:name "Frank" ;
schema:email <mailto:frank@example.org> ;
schema:knows :alice, :bob .
:grace schema:name "Grace" ;
schema:knows :alice, _:1 .
_:1 schema:name "Unknown" .
Try it (RDFShape): https://goo.gl/97bYdv
Try it (ShExDemo):https://goo.gl/Y8hBsW
Schema
Data
<User> IRI {
schema:name xsd:string ;
schema:knows @<User> *
}
:alice@<User>,
:bob @<User>,
:carol@<User>,
:dave @<User>,
:emily@<User>,
:frank@<User>,
:grace@<User>
Shape map







Validation process
:alice@:User, :bob@:User, :carol@:User
ShEx
Validator
Result shape map
:User {
schema:name xsd:string ;
schema:knows @:User *
}
ShEx Schema
:alice schema:name "Alice" ;
schema:knows :alice .
:bob schema:knows :alice ;
schema:name "Robert".
:carol schema:name "Carol", "Carole" .
RDF data
Shape map :alice@:User,
:bob@:User,
:carol@!:User
Input: RDF data, ShEx schema, Shape map
Output: Result shape map
Example with more ShEx features
:AdultPerson EXTRA rdf:type {
rdf:type [ schema:Person ] ;
:name xsd:string ;
:age MinInclusive 18 ;
:gender [:Male :Female] OR xsd:string ;
:address @:Address ? ;
:worksFor @:Company + ;
}
:Address CLOSED {
:addressLine xsd:string {1,3} ;
:postalCode /[0-9]{5}/ ;
:state @:State ;
:city xsd:string
}
:Company {
:name xsd:string ;
:state @:State ;
:employee @:AdultPerson * ;
}
:State /[A-Z]{2}/
:alice rdf:type :Student, schema:Person ;
:name "Alice" ;
:age 20 ;
:gender :Male ;
:address [
:addressLine "Bancroft Way" ;
:city "Berkeley" ;
:postalCode "55123" ;
:state "CA"
] ;
:worksFor [
:name "Company" ;
:state "CA" ;
:employee :alice
] . Try it: https://tinyurl.com/yd5hp9z4
More info about ShEx
See:
ShEx by Example (slides):
https://figshare.com/articles/ShExByExample_pptx/6291464
ShEx chapter from Validating RDF data book:
http://book.validatingrdf.com/bookHtml010.html
Short intro to SHACL
W3C recommendation:
https://www.w3.org/TR/shacl/ (July 2017)
RDF vocabulary
2 parts: SHACL-Core, SHACL-SPARQL
SHACL implementations
Name Parts Language - Library Comments
Topbraid SHACL API SHACL Core, SPARQL Java (Jena) Used by TopBraid composer
SHACL playground SHACL Core Javascript (rdflib.js) http://shacl.org/playground/
SHACLEX SHACL Core Scala (Jena, RDF4j) http://rdfshape.weso.es
pySHACL SHACL Core, SPARQL Python (rdflib) https://github.com/RDFLib/pySHACL
Corese SHACL SHACL Core, SPARQL Java (STTL) http://wimmics.inria.fr/corese
RDFUnit SHACL Core, SPARQL Java (Jena) https://github.com/AKSW/RDFUnit
Basic example
Try it. RDFShape https://goo.gl/T1uuzv
prefix : <http://example.org/>
prefix sh: <http://www.w3.org/ns/shacl#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
prefix schema: <http://schema.org/>
:UserShape a sh:NodeShape ;
sh:targetNode :alice, :bob, :carol ;
sh:nodeKind sh:IRI ;
sh:property :hasName,
:hasEmail .
:hasName sh:path schema:name ;
sh:minCount 1;
sh:maxCount 1;
sh:datatype xsd:string .
:hasEmail sh:path schema:email ;
sh:minCount 1;
sh:maxCount 1;
sh:nodeKind sh:IRI .
:alice schema:name "Alice Cooper" ;
schema:email <mailto:alice@mail.org> .
:bob schema:firstName "Bob" ;
schema:email <mailto:bob@mail.org> .
:carol schema:name "Carol" ;
schema:email "carol@mail.org" .
Shapes graph
Data graph


Same example with blank nodes
Try it. RDFShape https://goo.gl/ukY5vq
prefix : <http://example.org/>
prefix sh: <http://www.w3.org/ns/shacl#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
prefix schema: <http://schema.org/>
:UserShape a sh:NodeShape ;
sh:targetNode :alice, :bob, :carol ;
sh:nodeKind sh:IRI ;
sh:property [
sh:path schema:name ;
sh:minCount 1; sh:maxCount 1;
sh:datatype xsd:string ;
] ;
sh:property [
sh:path schema:email ;
sh:minCount 1; sh:maxCount 1;
sh:nodeKind sh:IRI ;
] .
:alice schema:name "Alice Cooper" ;
schema:email <mailto:alice@mail.org> .
:bob schema:firstName "Bob" ;
schema:email <mailto:bob@mail.org> .
:carol schema:name "Carol" ;
schema:email "carol@mail.org" .
Shapes graph
Data graph


Some definitions about SHACL
Shape: collection of targets and constraints components
Targets: specify which nodes in the data graph must conform to a shape
Constraint components: Determine how to validate a node
Shape
target declarations
constraint components
:UserShape a sh:NodeShape ;
sh:targetNode :alice, :bob, :carol ;
sh:nodeKind sh:IRI ;
sh:property :hasName,
:hasEmail .
:hasName sh:path schema:name ;
sh:minCount 1;
sh:maxCount 1;
sh:datatype xsd:string .
. . .
Validation Report
The output of the validation process is a list of violation errors
No errors  RDF conforms to shapes graph
[ a sh:ValidationReport ;
sh:conforms false ;
sh:result [
a sh:ValidationResult ;
sh:focusNode :bob ;
sh:message
"MinCount violation. Expected 1, obtained: 0" ;
sh:resultPath schema:name ;
sh:resultSeverity sh:Violation ;
sh:sourceConstraintComponent
sh:MinCountConstraintComponent ;
sh:sourceShape :hasName
] ;
...
[ a sh:ValidationReport ;
sh:conforms true
].
SHACL processor
SHACL
Processor
Validation report
:UserShape a sh:NodeShape ;
sh:targetNode :alice, :bob, :carol ;
sh:nodeKind sh:IRI ;
sh:property :hasName,
:hasEmail .
:hasName sh:path schema:name ;
sh:minCount 1;
sh:maxCount 1;
sh:datatype xsd:string .
. . .
Shapes
graph
:alice schema:name "Alice Cooper" ;
schema:email <mailto:alice@mail.org>.
:bob schema:name "Bob" ;
schema:email <mailto:bob@mail.org> .
:carol schema:name "Carol" ;
schema:email <mailto:carol@mail.org> .
Data
Graph
[ a sh:ValidationReport ;
sh:conforms true
].
SHACL Core built-in constraint components
Type Constraints
Cardinality minCount, maxCount
Types of values class, datatype, nodeKind
Values node, in, hasValue, property
Range of values minInclusive, maxInclusive
minExclusive, maxExclusive
String based minLength, maxLength, pattern
Language based languageIn, uniqueLang
Logical constraints not, and, or, xone
Closed shapes closed, ignoredProperties
Property pair constraints equals, disjoint, lessThan, lessThanOrEquals
Non-validating constraints name, description, order, group
Qualified shapes qualifiedValueShape, qualifiedValueShapesDisjoint
qualifiedMinCount, qualifiedMaxCount
Longer example
:AdultPerson EXTRA a {
a [ schema:Person ] ;
:name xsd:string ;
:age MinInclusive 18 ;
:gender [:Male :Female] OR xsd:string ;
:address @:Address ? ;
:worksFor @:Company + ;
}
:Address CLOSED {
:addressLine xsd:string {1,3} ;
:postalCode /[0-9]{5}/ ;
:state @:State ;
:city xsd:string
}
:Company {
:name xsd:string ;
:state @:State ;
:employee @:AdultPerson * ;
}
:State /[A-Z]{2}/
:AdultPerson a sh:NodeShape ;
sh:property [
sh:path rdf:type ;
sh:qualifiedValueShape [
sh:hasValue schema:Person
];
sh:qualifiedMinCount 1 ;
sh:qualifiedMaxCount 1 ;
] ;
sh:targetNode :alice ;
sh:property [ sh:path :name ;
sh:minCount 1; sh:maxCount 1;
sh:datatype xsd:string ;
] ;
sh:property [ sh:path :gender ;
sh:minCount 1; sh:maxCount 1;
sh:in (:Male :Female);
] ;
sh:property [ sh:path :age ;
sh:maxCount 1;
sh:minInclusive 18
] ;
sh:property [ sh:path :address ;
sh:node :Address ;
sh:minCount 1 ; sh:maxCount 1
] ;
sh:property [ sh:path :worksFor ;
sh:node :Company ;
sh:minCount 1 ; sh:maxCount 1
].
:Address a sh:NodeShape ;
sh:closed true ;
sh:property [ sh:path :addressLine;
sh:datatype xsd:string ;
sh:minCount 1 ; sh:maxCount 3
] ;
sh:property [ sh:path :postalCode ;
sh:pattern "[0-9]{5}" ;
sh:minCount 1 ; sh:maxCount 3
] ;
sh:property [ sh:path :city ;
sh:datatype xsd:string ;
sh:minCount 1 ; sh:maxCount 1
] ;
sh:property [ sh:path :state ;
sh:node :State ;
] .
:Company a sh:NodeShape ;
sh:property [ sh:path :name ;
sh:datatype xsd:string
] ;
sh:property [
sh:path :state ;
sh:node :State
] ;
sh:property [ sh:path :employee ;
sh:node :AdultPerson ;
] ;.
:State a sh:NodeShape ;
sh:pattern "[A-Z]{2}" .
In ShEx
In SHACL
Its recursive!!! (not well defined SHACL)
Implementation dependent feature
Try it: https://tinyurl.com/ycl3mkzr
More info about SHACL
SHACL by example (slides):
https://figshare.com/articles/SHACL_by_example/6449645
SHACL chapter at Validating RDF data book
http://book.validatingrdf.com/bookHtml011.html
Some challenges and perspectives
Theoretical foundations of ShEx/SHACL
Inference shapes from data
Validation Usability
RDF Stream validation
Schema ecosystems
Wikidata
Solid
Theoretical foundations of ShEx/SHACL
Conversion between ShEx and SHACL
SHaclEX library converts subsets of both
Challenges
Recursion and negation
Performance and algorithmic complexity
Detect useful subsets of the languages
Convert to SPARQL
Schema/data mapping Shapes Shapes
ShEx SHACL
Inference of Shapes from Data
Useful use case in practice
Knowledge Graph summarization
Some prototypes:
ShExer, RDFShape, ShapeArchitect
Try it with RDFShape:
https://tinyurl.com/y8pjcbyf
Shape Expression
generated for
wd:Q51613194
Shapes
RDF data
infer
Validation usability
Learning from users
Early adopters: WebIndex, HL7 FHIR, Eclipse Lyo, GenWiki,…
Improve error information/visualization/navigation/repairing
Authoring/visualization tools
Propose annotation sets
UI generation
Error reporting/suggestion (SHOULD/MUST/…)
Shapes
RDF Stream validation
Validation of RDF streams
Challenges:
Incremental validation
Named graphs
Addition/removal of triples
Sensor
Stream
RDF
data
Shapes
Validator
Stream
Validation
results
Schema ecosystems: Wikidata
In May, 2019, Wikidata announced ShEx adoption
New namespace for schemas
Example: https://www.wikidata.org/wiki/EntitySchema:E2
It opens lots of opportunities/challenges
Schema evolution and comparison
Schema ecosystems: Solid project
SOLID (SOcial Linked Data): Promoted by Tim Berners-Lee
Goal: Re-decentralize the Web
Separate data from apps
Give users more control about their data
Internally using linked data & RDF
Shapes needed for interoperability
See:
https://ruben.verborgh.org/blog/2019/06/17/shaping-linked-data-apps/
Data
pod
Shape
A
Shape
C
Shape
B
App
1
App
2
App
3
Conclusions
RDF as a basis for knowledge graphs
Explicit schemas can help improve data quality
2 languages proposed: ShEx/SHACL
Lots of new challenges and opportunities
End of presentation

More Related Content

What's hot

RDF 개념 및 구문 소개
RDF 개념 및 구문 소개RDF 개념 및 구문 소개
RDF 개념 및 구문 소개Dongbum Kim
 
SHACL: Shaping the Big Ball of Data Mud
SHACL: Shaping the Big Ball of Data MudSHACL: Shaping the Big Ball of Data Mud
SHACL: Shaping the Big Ball of Data MudRichard Cyganiak
 
Neo4j 4.1 overview
Neo4j 4.1 overviewNeo4j 4.1 overview
Neo4j 4.1 overviewNeo4j
 
The Web of data and web data commons
The Web of data and web data commonsThe Web of data and web data commons
The Web of data and web data commonsJesse Wang
 
LOD (linked open data) part 2 lod 구축과 현황
LOD (linked open data) part 2   lod 구축과 현황LOD (linked open data) part 2   lod 구축과 현황
LOD (linked open data) part 2 lod 구축과 현황LiST Inc
 
Will Lyon- Entity Resolution
Will Lyon- Entity ResolutionWill Lyon- Entity Resolution
Will Lyon- Entity ResolutionNeo4j
 
Introduction To RDF and RDFS
Introduction To RDF and RDFSIntroduction To RDF and RDFS
Introduction To RDF and RDFSNilesh Wagmare
 
RDB2RDF Tutorial (R2RML and Direct Mapping) at ISWC 2013
RDB2RDF Tutorial (R2RML and Direct Mapping) at ISWC 2013RDB2RDF Tutorial (R2RML and Direct Mapping) at ISWC 2013
RDB2RDF Tutorial (R2RML and Direct Mapping) at ISWC 2013Juan Sequeda
 
SPARQL in a nutshell
SPARQL in a nutshellSPARQL in a nutshell
SPARQL in a nutshellFabien Gandon
 
Mining a Large Web Corpus
Mining a Large Web CorpusMining a Large Web Corpus
Mining a Large Web CorpusRobert Meusel
 
Diving into Delta Lake: Unpacking the Transaction Log
Diving into Delta Lake: Unpacking the Transaction LogDiving into Delta Lake: Unpacking the Transaction Log
Diving into Delta Lake: Unpacking the Transaction LogDatabricks
 
Relational Database to RDF (RDB2RDF)
Relational Database to RDF (RDB2RDF)Relational Database to RDF (RDB2RDF)
Relational Database to RDF (RDB2RDF)EUCLID project
 
Data Lineage with Apache Airflow using Marquez
Data Lineage with Apache Airflow using Marquez Data Lineage with Apache Airflow using Marquez
Data Lineage with Apache Airflow using Marquez Willy Lulciuc
 
Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Sadayuki Furuhashi
 

What's hot (20)

RDF 개념 및 구문 소개
RDF 개념 및 구문 소개RDF 개념 및 구문 소개
RDF 개념 및 구문 소개
 
SHACL: Shaping the Big Ball of Data Mud
SHACL: Shaping the Big Ball of Data MudSHACL: Shaping the Big Ball of Data Mud
SHACL: Shaping the Big Ball of Data Mud
 
HyperGraphQL
HyperGraphQLHyperGraphQL
HyperGraphQL
 
Neo4j 4.1 overview
Neo4j 4.1 overviewNeo4j 4.1 overview
Neo4j 4.1 overview
 
The Web of data and web data commons
The Web of data and web data commonsThe Web of data and web data commons
The Web of data and web data commons
 
LOD (linked open data) part 2 lod 구축과 현황
LOD (linked open data) part 2   lod 구축과 현황LOD (linked open data) part 2   lod 구축과 현황
LOD (linked open data) part 2 lod 구축과 현황
 
Will Lyon- Entity Resolution
Will Lyon- Entity ResolutionWill Lyon- Entity Resolution
Will Lyon- Entity Resolution
 
MongodB Internals
MongodB InternalsMongodB Internals
MongodB Internals
 
Graph based data models
Graph based data modelsGraph based data models
Graph based data models
 
Introduction To RDF and RDFS
Introduction To RDF and RDFSIntroduction To RDF and RDFS
Introduction To RDF and RDFS
 
RDB2RDF Tutorial (R2RML and Direct Mapping) at ISWC 2013
RDB2RDF Tutorial (R2RML and Direct Mapping) at ISWC 2013RDB2RDF Tutorial (R2RML and Direct Mapping) at ISWC 2013
RDB2RDF Tutorial (R2RML and Direct Mapping) at ISWC 2013
 
SPARQL in a nutshell
SPARQL in a nutshellSPARQL in a nutshell
SPARQL in a nutshell
 
Mining a Large Web Corpus
Mining a Large Web CorpusMining a Large Web Corpus
Mining a Large Web Corpus
 
Diving into Delta Lake: Unpacking the Transaction Log
Diving into Delta Lake: Unpacking the Transaction LogDiving into Delta Lake: Unpacking the Transaction Log
Diving into Delta Lake: Unpacking the Transaction Log
 
MongoDB
MongoDBMongoDB
MongoDB
 
Relational Database to RDF (RDB2RDF)
Relational Database to RDF (RDB2RDF)Relational Database to RDF (RDB2RDF)
Relational Database to RDF (RDB2RDF)
 
Data Lineage with Apache Airflow using Marquez
Data Lineage with Apache Airflow using Marquez Data Lineage with Apache Airflow using Marquez
Data Lineage with Apache Airflow using Marquez
 
L18 Object Relational Mapping
L18 Object Relational MappingL18 Object Relational Mapping
L18 Object Relational Mapping
 
Practical Celery
Practical CeleryPractical Celery
Practical Celery
 
Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1
 

Similar to Validating RDF data: Challenges and perspectives

Two graph data models : RDF and Property Graphs
Two graph data models : RDF and Property GraphsTwo graph data models : RDF and Property Graphs
Two graph data models : RDF and Property Graphsandyseaborne
 
Challenges and applications of RDF shapes
Challenges and applications of RDF shapesChallenges and applications of RDF shapes
Challenges and applications of RDF shapesJose Emilio Labra Gayo
 
Graph databases & data integration v2
Graph databases & data integration v2Graph databases & data integration v2
Graph databases & data integration v2Dimitris Kontokostas
 
A Little SPARQL in your Analytics
A Little SPARQL in your AnalyticsA Little SPARQL in your Analytics
A Little SPARQL in your AnalyticsDr. Neil Brittliff
 
Validating and Describing Linked Data Portals using RDF Shape Expressions
Validating and Describing Linked Data Portals using RDF Shape ExpressionsValidating and Describing Linked Data Portals using RDF Shape Expressions
Validating and Describing Linked Data Portals using RDF Shape ExpressionsJose Emilio Labra Gayo
 
Rdf data-model-and-storage
Rdf data-model-and-storageRdf data-model-and-storage
Rdf data-model-and-storage灿辉 葛
 
Moving Library Metadata Toward Linked Data: Opportunities Provided by the eX...
Moving Library Metadata Toward Linked Data:  Opportunities Provided by the eX...Moving Library Metadata Toward Linked Data:  Opportunities Provided by the eX...
Moving Library Metadata Toward Linked Data: Opportunities Provided by the eX...Jennifer Bowen
 
Semantic Web
Semantic WebSemantic Web
Semantic Webhardchiu
 
Linked open data sandwich
Linked open data sandwichLinked open data sandwich
Linked open data sandwichThimo Thoeye
 
MWCon KnowledgeGraph Extension Krabina 2024.pdf
MWCon KnowledgeGraph Extension Krabina 2024.pdfMWCon KnowledgeGraph Extension Krabina 2024.pdf
MWCon KnowledgeGraph Extension Krabina 2024.pdfBernhard Krabina
 
Culture Geeks Feb talk: Adventures in Linked Data Land
Culture Geeks Feb talk: Adventures in Linked Data LandCulture Geeks Feb talk: Adventures in Linked Data Land
Culture Geeks Feb talk: Adventures in Linked Data Landval.cartei
 
Linked opendata parisemantique.fr - 24062011
Linked opendata   parisemantique.fr - 24062011Linked opendata   parisemantique.fr - 24062011
Linked opendata parisemantique.fr - 24062011Loïc Dias Da Silva
 
Introducción a la web semántica - Linkatu - irekia 2012
Introducción a la web semántica - Linkatu - irekia 2012Introducción a la web semántica - Linkatu - irekia 2012
Introducción a la web semántica - Linkatu - irekia 2012Alberto Labarga
 
2011 05-01 linked data
2011 05-01 linked data2011 05-01 linked data
2011 05-01 linked datavafopoulos
 
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIsJosef Petrák
 

Similar to Validating RDF data: Challenges and perspectives (20)

Two graph data models : RDF and Property Graphs
Two graph data models : RDF and Property GraphsTwo graph data models : RDF and Property Graphs
Two graph data models : RDF and Property Graphs
 
Challenges and applications of RDF shapes
Challenges and applications of RDF shapesChallenges and applications of RDF shapes
Challenges and applications of RDF shapes
 
Graph databases & data integration v2
Graph databases & data integration v2Graph databases & data integration v2
Graph databases & data integration v2
 
A Little SPARQL in your Analytics
A Little SPARQL in your AnalyticsA Little SPARQL in your Analytics
A Little SPARQL in your Analytics
 
Validating and Describing Linked Data Portals using RDF Shape Expressions
Validating and Describing Linked Data Portals using RDF Shape ExpressionsValidating and Describing Linked Data Portals using RDF Shape Expressions
Validating and Describing Linked Data Portals using RDF Shape Expressions
 
Sem webmaubeuge
Sem webmaubeugeSem webmaubeuge
Sem webmaubeuge
 
RDF Data Model
RDF Data ModelRDF Data Model
RDF Data Model
 
Rdf data-model-and-storage
Rdf data-model-and-storageRdf data-model-and-storage
Rdf data-model-and-storage
 
Moving Library Metadata Toward Linked Data: Opportunities Provided by the eX...
Moving Library Metadata Toward Linked Data:  Opportunities Provided by the eX...Moving Library Metadata Toward Linked Data:  Opportunities Provided by the eX...
Moving Library Metadata Toward Linked Data: Opportunities Provided by the eX...
 
RDF validation tutorial
RDF validation tutorialRDF validation tutorial
RDF validation tutorial
 
Semantic Web
Semantic WebSemantic Web
Semantic Web
 
Linked open data sandwich
Linked open data sandwichLinked open data sandwich
Linked open data sandwich
 
MWCon KnowledgeGraph Extension Krabina 2024.pdf
MWCon KnowledgeGraph Extension Krabina 2024.pdfMWCon KnowledgeGraph Extension Krabina 2024.pdf
MWCon KnowledgeGraph Extension Krabina 2024.pdf
 
Culture Geeks Feb talk: Adventures in Linked Data Land
Culture Geeks Feb talk: Adventures in Linked Data LandCulture Geeks Feb talk: Adventures in Linked Data Land
Culture Geeks Feb talk: Adventures in Linked Data Land
 
Semantic Web talk TEMPLATE
Semantic Web talk TEMPLATESemantic Web talk TEMPLATE
Semantic Web talk TEMPLATE
 
Linked opendata parisemantique.fr - 24062011
Linked opendata   parisemantique.fr - 24062011Linked opendata   parisemantique.fr - 24062011
Linked opendata parisemantique.fr - 24062011
 
Introducción a la web semántica - Linkatu - irekia 2012
Introducción a la web semántica - Linkatu - irekia 2012Introducción a la web semántica - Linkatu - irekia 2012
Introducción a la web semántica - Linkatu - irekia 2012
 
RDFa Tutorial
RDFa TutorialRDFa Tutorial
RDFa Tutorial
 
2011 05-01 linked data
2011 05-01 linked data2011 05-01 linked data
2011 05-01 linked data
 
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
 

More from Jose Emilio Labra Gayo

Introducción a la investigación/doctorado
Introducción a la investigación/doctoradoIntroducción a la investigación/doctorado
Introducción a la investigación/doctoradoJose Emilio Labra Gayo
 
Legislative data portals and linked data quality
Legislative data portals and linked data qualityLegislative data portals and linked data quality
Legislative data portals and linked data qualityJose Emilio Labra Gayo
 
Legislative document content extraction based on Semantic Web technologies
Legislative document content extraction based on Semantic Web technologiesLegislative document content extraction based on Semantic Web technologies
Legislative document content extraction based on Semantic Web technologiesJose Emilio Labra Gayo
 
Como publicar datos: hacia los datos abiertos enlazados
Como publicar datos: hacia los datos abiertos enlazadosComo publicar datos: hacia los datos abiertos enlazados
Como publicar datos: hacia los datos abiertos enlazadosJose Emilio Labra Gayo
 
Arquitectura de la Web y Computación en el Servidor
Arquitectura de la Web y Computación en el ServidorArquitectura de la Web y Computación en el Servidor
Arquitectura de la Web y Computación en el ServidorJose Emilio Labra Gayo
 
RDF Validation Future work and applications
RDF Validation Future work and applicationsRDF Validation Future work and applications
RDF Validation Future work and applicationsJose Emilio Labra Gayo
 

More from Jose Emilio Labra Gayo (20)

Publicaciones de investigación
Publicaciones de investigaciónPublicaciones de investigación
Publicaciones de investigación
 
Introducción a la investigación/doctorado
Introducción a la investigación/doctoradoIntroducción a la investigación/doctorado
Introducción a la investigación/doctorado
 
Legislative data portals and linked data quality
Legislative data portals and linked data qualityLegislative data portals and linked data quality
Legislative data portals and linked data quality
 
Wikidata
WikidataWikidata
Wikidata
 
Legislative document content extraction based on Semantic Web technologies
Legislative document content extraction based on Semantic Web technologiesLegislative document content extraction based on Semantic Web technologies
Legislative document content extraction based on Semantic Web technologies
 
Introduction to SPARQL
Introduction to SPARQLIntroduction to SPARQL
Introduction to SPARQL
 
Introducción a la Web Semántica
Introducción a la Web SemánticaIntroducción a la Web Semántica
Introducción a la Web Semántica
 
2017 Tendencias en informática
2017 Tendencias en informática2017 Tendencias en informática
2017 Tendencias en informática
 
RDF, linked data and semantic web
RDF, linked data and semantic webRDF, linked data and semantic web
RDF, linked data and semantic web
 
Introduction to SPARQL
Introduction to SPARQLIntroduction to SPARQL
Introduction to SPARQL
 
19 javascript servidor
19 javascript servidor19 javascript servidor
19 javascript servidor
 
Como publicar datos: hacia los datos abiertos enlazados
Como publicar datos: hacia los datos abiertos enlazadosComo publicar datos: hacia los datos abiertos enlazados
Como publicar datos: hacia los datos abiertos enlazados
 
16 Alternativas XML
16 Alternativas XML16 Alternativas XML
16 Alternativas XML
 
XSLT
XSLTXSLT
XSLT
 
XPath
XPathXPath
XPath
 
Arquitectura de la Web y Computación en el Servidor
Arquitectura de la Web y Computación en el ServidorArquitectura de la Web y Computación en el Servidor
Arquitectura de la Web y Computación en el Servidor
 
RDF Validation Future work and applications
RDF Validation Future work and applicationsRDF Validation Future work and applications
RDF Validation Future work and applications
 
ShEx vs SHACL
ShEx vs SHACLShEx vs SHACL
ShEx vs SHACL
 
SHACL by example
SHACL by exampleSHACL by example
SHACL by example
 
RDF data model
RDF data modelRDF data model
RDF data model
 

Recently uploaded

✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663
✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663
✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663Call Girls Mumbai
 
Nanded City ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready ...
Nanded City ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready ...Nanded City ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready ...
Nanded City ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready ...tanu pandey
 
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...Delhi Call girls
 
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service AvailableCall Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service AvailableSeo
 
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...Sheetaleventcompany
 
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...tanu pandey
 
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxellan12
 
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779Delhi Call girls
 
Networking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOGNetworking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOGAPNIC
 
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersMoving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersDamian Radcliffe
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebJames Anderson
 
Delhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024APNIC
 
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 

Recently uploaded (20)

✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663
✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663
✂️ 👅 Independent Andheri Escorts With Room Vashi Call Girls 💃 9004004663
 
Nanded City ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready ...
Nanded City ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready ...Nanded City ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready ...
Nanded City ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready ...
 
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
 
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service AvailableCall Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
 
VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
 
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
 
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
 
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
 
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
 
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
 
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
 
Networking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOGNetworking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOG
 
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersMoving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
 
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
Delhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Rohini 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024
 
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 

Validating RDF data: Challenges and perspectives

  • 1. Validating RDF data: Challenges and perspectives Jose Emilio Labr Gayo WESO Research Group (WEb Semantics Oviedo) University of Oviedo, Spain
  • 2. About me… Founded WESO (Web Semantics Oviedo) research group Practical applications of semantic technologies since 2004 Several domains: e-Government, e-Health Some books: "Web semántica" (in Spanish), 2012 "Validating RDF data", 2017 …and software: SHaclEX (Scala library, implements ShEx & SHACL) RDFShape (RDF playground) HTML version: http://book.validatingrdf.com Examples: https://github.com/labra/validatingRDFBookExamples
  • 3. Contents Short intro to semantic Web & knowledge graphs Short overview of RDF Understanding the RDF validation problem 2 technologies: ShEx/SHACL Challenges & perspectives
  • 4. Semantic Web Vision of the Web as web of data Not only pages, but also data: linked data Data understandable by machines Related with: Big Data Artificial intelligence Knowledge representation Example: Wikipedia vs Wikidata Tim Berners-Lee Source: Wikipedia Source: http://www.internetlivestats.com/total-number-of-websites/ Date: 05/04/2019 (19:05h)
  • 5. Who consumes Web information? People We access through some device ...and Machines They show us web pages (browsers) ...but also manage information (bots) Filtering content Doing suggestions ... "If Google doesn't understand your Web page, it almost doesn't exist"
  • 6. Web information Documentos Information retrieval DocumentosDocumentos Posts Tweets Web pages ... Publish Web documents Reality Mental model Web information must be machine-processable Semantic web technologies focus on how to achieve this goal Sensors Generate User Content generation & filtering
  • 7. People vs Machines Programmed for some tasks Predictable (no mistakes*) No problem with repetitive tasks Problems to understand the context Creativity, imagination Unpredictable (we do mistakes) We get tired with repetitive tasks Understanding based on context *when well programmed
  • 8. Information understandable by machines? Example: "Oviedo has a temperature of 36 degrees" Problem: Ambiguity and context identification Oviedo ? It can be... A city in Spain ...or a city in Florida, USA ...or a soccer player http://www.wikidata.org/entity/Q325997 http://wikidata.org/entity/Q14317 http://www.wikidata.org/entity/Q1813449 ...has a temperatrure of... https://www.wikidata.org/wiki/Property:P2076 Identifiers (URIs) in Wikidata Machine representation http://www.wikidata.org/entity/Q14317 https://www.wikidata.org/prop/direct/P2076 36^^<http://www.w3.org/2001/XMLSchema#integer>
  • 9. Knowledge representation for machines Simple statement (triple 3 elements): subject, predicate, object http://www.wikidata.org/entity/Q14317 https://www.wikidata.org/prop/direct/P2076 subject predicate object 36^^<http://www.w3.org/2001/XMLSchema#integer> wd:Q14317 wdt:P2076 36^^xsd:integer wd: <http://www.wikidata.org/entity/> wdt: <https://www.wikidata.org/prop/direct/> xsd: <http://www.w3.org/2001/XMLSchema#> Prefix declarations help simplify long URIs
  • 10. Knowledge graphs wd:Q14317 36^^xsd:integer wdt:P2076 wd:Q10514 wd:Q3934 wdt:P19 wdt:P1376 "1981-07-29"^^xsd:date 220020^^xsd:integer wdt:P1082 wdt:P18 wdt:P569 ...jpg BirthPlace Picture BirthDate capital temperature population Oviedo Asturias Fernando Alonso Wikidata statistics (17-07-2019) 57,462,853 entities 728,979,397 triples Source: https://www.wikidata.org/wiki/Wikidata:Statistics Wikidata is an example of a knowledge graph There are a lot more: Open: DBpedia, BabelNet, etc Proprietary: Google, facebook, Microsoft, etc. also have knowledge graphs
  • 11. Web information & knowledge graphs Documentos Information retrieval Knowledge graphs DocumentosDocumentos Posts Tweets Web pages ... Publish Web documents Reality Mental model Knowledge graphs are a key part of the Web nowadays Sensors Generate User Content generation & filtering
  • 12. RDF RDF (Resource Description Framework) Allows to represent knowledge graphs RDF data model = graph as a set of statements Predicates = URIs Several syntaxes Turtle, N-Triples, JSON-LD,... 36^^xsd:integer wdt:P2076 wd:Q10514 wd:Q3934 wdt:P19 wdt:P1376 "1981-07-29"^^xsd:date 220020^^xsd:integer wdt:P1082 wdt:P18 wdt:P569 ...jpg BirthPlace Picture BirthDate capital temperature population Oviedo Asturias Fernando Alonso wd:Q14317
  • 13. prefix wd: <http://www.wikidata.org/entity/> prefix wdt: <https://www.wikidata.org/prop/direct/> prefix xsd: <http://www.w3.org/2001/XMLSchema#> wd:Q14317 wdt:P1082 220020 ; wdt:P2076 36 ; wdt:P1376 wd:Q3934 . wd:Q10514 wdt:P19 wd:Q14317 ; wdt:P569 "1981-07-29"^^xsd:date ; wdt:P18 <picture.jpg> . RDF syntaxes: Turtle 36^^xsd:integer wdt:P2076 wd:Q10514 wd:Q3934 wdt:P19 wdt:P1376 "1981-07-29"^^xsd:date 220020^^xsd:integer wdt:P1082 wdt:P18 wdt:P569 ...jpg BirthPlace Picture BirthDate capital temperature population Oviedo Asturias Fernando Alonso wd:Q14317
  • 14. <http://www.wikidata.org/entity/Q14317> <https://www.wikidata.org/prop/direct/P1082> "220020"^^<http://www.w3.org/2001/XMLSchema#integer> . <http://www.wikidata.org/entity/Q14317> <https://www.wikidata.org/prop/direct/P1376> <http://www.wikidata.org/entity/Q3934> . <http://www.wikidata.org/entity/Q14317> <https://www.wikidata.org/prop/direct/P2076> "36"^^<http://www.w3.org/2001/XMLSchema#integer> . <http://www.wikidata.org/entity/Q10514> <https://www.wikidata.org/prop/direct/P18> <picture.jpg> . <http://www.wikidata.org/entity/Q10514> <https://www.wikidata.org/prop/direct/P19> <http://www.wikidata.org/entity/Q14317> . <http://www.wikidata.org/entity/Q10514> <https://www.wikidata.org/prop/direct/P569> "1981-07-29"^^<http://www.w3.org/2001/XMLSchema#date> . RDF syntaxes: N-Triples 36^^xsd:integer wdt:P2076 wd:Q10514 wd:Q3934 wdt:P19 wdt:P1376 "1981-07-29"^^xsd:date 220020^^xsd:integer wdt:P1082 wdt:P18 wdt:P569 ...jpg BirthPlace Picture BirthDate capital temperature population Oviedo Asturias Fernando Alonso wd:Q14317
  • 15. { "@context" : "http://...wikidata.json", "@graph" : [ { "@id" : "wd:Q10514", "wdt:P18" : "picture.jpg", "wdt:P19" : "wd:Q14317", "P569" : "1981-07-29" }, { "@id" : "wd:Q14317", "wdt:P1082" : 220020, "wdt:P1376" : "wd:Q3934", "wdt:P2076" : 36 } ] } RDF syntaxes: JSON-LD 36^^xsd:integer wdt:P2076 wd:Q10514 wd:Q3934 wdt:P19 wdt:P1376 "1981-07-29"^^xsd:date 220020^^xsd:integer wdt:P1082 wdt:P18 wdt:P569 ...jpg BirthPlace Picture BirthDate capital temperature population Oviedo Asturias Fernando Alonso wd:Q14317
  • 16. <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:wdt="https://www.wikidata.org/prop/direct/" xmlns:wd="http://www.wikidata.org/entity/" xmlns:xsd="http://www.w3.org/2001/XMLSchema#"> <rdf:Description rdf:about="http://www.wikidata.org/entity/Q10514"> <wdt:P18>picture.jpg</wdt:P18> <wdt:P19> <rdf:Description rdf:about="http://www.wikidata.org/entity/Q14317"> <wdt:P1082 rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">220020</wdt:P1082> <wdt:P1376 rdf:resource="http://www.wikidata.org/entity/Q3934"/> <wdt:P2076 rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">36</wdt:P2076> </rdf:Description> </wdt:P19> <wdt:P569 rdf:datatype=http://www.w3.org/2001/XMLSchema#date>1981-07-29</wdt:P569> </rdf:Description> </rdf:RDF> RDF syntaxes: RDF/XML 36^^xsd:integer wdt:P2076 wd:Q10514 wd:Q3934 wdt:P19 wdt:P1376 "1981-07-29"^^xsd:date 220020^^xsd:integer wdt:P1082 wdt:P18 wdt:P569 ...jpg BirthPlace Picture BirthDate capital temperature population Oviedo Asturias Fernando Alonso wd:Q14317
  • 17. RDF graphs are composable RDF helps information integration 36^^xsd:integer wdt:P2076 wd:Q10514 wd:Q3934 wdt:P19 wdt:P1376 "1981-07-29"^^xsd:date 220020^^xsd:integer wdt:P1082 wdt:P18 wdt:P569 ...jpg wd:Q14317 :alice :bob :carol schema:birthPlace schema:birthPlace schema:knows schema:knows Alice Cooper Schema:name Robert Smith schema:nameschema:knows Carol King schema:name
  • 18. RDF data model Graph = set of statements (triples) Predicates (arcs) = URIs Subjects: URIs* Objects: URIs or literals* 36^^xsd:integer wdt:P2076 wd:Q10514 wd:Q3934 wdt:P19 wdt:P1376 "1981-07-29"^^xsd:date 220020^^xsd:integer wdt:P1082 wdt:P18 wdt:P569 ...jpg wd:Q14317 *Exception: blank nodes (next slide)
  • 19. RDF data model: Blank nodes A statement about something Example: "Alice knows someone who knows Bob" :alice schema:knows :bob schema:knows :alice schema:knows _:x ; _:x schema:knows :bob . Notation (in Turtle) :alice schema:knows [ schema:knows :bob ] .  x ( :alice :knows x  x :knows :bob )
  • 20. RDF data model: Literals :bob schema:name "Robert"^^xsd:string ; schema:age "18"^^xsd:integer ; schema:birthDate "2010-04-12"^^xsd:date . :bob schema:name "Robert" ; schema:age 18 ; schema:birthDate "2010-04-12"^^xsd:date . Objects can also be literals Literals contain a lexical form and a datatype Common datatypes: XML Schema primitive datatypes If not specified, a literal has type xsd:string :bob schema:name "2010-04-12"^^xsd:date18"Robert Smith" schema:age schema:birthDate
  • 21. RDF data model: Language tagged strings String literals can be qualified by a language tag They have datatype rdfs:langString :spain rdfs:label "Spain"@en ; rdfs:label "España"@es . :Spain "España"@es"Spain"@en rdfs:label rdfs:label
  • 22. ...and that's all Yes, the RDF Data model is simple
  • 24. SELECT ?name ?birthDate ?picture WHERE { ?person wdt:P19 wd:Q14317 ; rdfs:label ?name ; wdt:P569 ?birthDate ; wdt:P18 ?picture . FILTER (Lang(?name) = 'es') } SPARQL SPARQL = Simple Protocol and RDF Query Language Similar to SQL, but for RDF Example: People born in Oviedo wdt:P19 "1981-07-29"^^xsd:date wdt:P18 wdt:P569 Oviedo wd:Q14317 "Fernando Alonso"@en rdfs:label ?person ?birthPlace?name ?picture wdt:P19 wdt:P18wdt:P569 wd:Q14317 rdfs:label wd:Q10514 wdt:P19 ...
  • 25. Shared entities and vocabularies URIs as global identifiers allow to reuse them Lots of vocabularies: domain specific or general purpose Examples: schema.org: properties that major browsers can recognize Linked Open Vocabularies
  • 26. Inference and ontologies Some vocabularies define properties that can infer new information RDF Schema: Classes, subclasses, etc. OWL: Ontologies using description logics Trade-off between expressiveness and complexity rdf:type :Student :alice :Person rdfs:subClassOf :HumanBeing owl:sameAs rdf:type rdf:type
  • 27. RDF, the good parts... RDF as an integration language RDF as a lingua franca for semantic web and linked data RDF flexibility & integration Data can be adapted to multiple environments Open and reusable data by default RDF for knowledge representation RDF data stores & SPARQL
  • 28. RDF, the other parts Consuming & producing RDF Multiple syntaxes: Turtle, RDF/XML, JSON-LD, ... Embedding RDF in HTML Describing and validating RDF content Producer Consumer
  • 29. Why describe & validate RDF? For producers Developers can understand the contents they are going to produce Ensure they produce the expected structure Advertise and document the structure Generate interfaces For consumers Understand the contents Verify the structure before processing it Query generation & optimization Producer Consumer Shapes
  • 30. Similar technologies Technology Schema Relational Databases DDL XML DTD, XML Schema, RelaxNG, Schematron Json Json Schema RDF ? Fill that gap
  • 31. Understanding the problem Identifying the shape of graphs... Shapes can describe the form of a node (node constraint) ...and the number of possible arcs incoming/outgoing from a node ...and the possible values associated with those arcs :alice schema:name "Alice"; schema:knows :bob, :carol . IRI schema:name string 1 schema:knows IRI 0, 1,... RDF Node Shape RDF Node that represents a User <UserShape> IRI { schema:name xsd:string ; schema:knows IRI * } ShEx
  • 32. Understanding the problem Repeated properties The same property can be used for different purposes in the same data Example: A product must have 2 codes with different structure :product schema:productID "isbn:123-456-789"; schema:productID "code456" . A practical example from FHIR See: http://hl7-fhir.github.io/observation-example-bloodpressure.ttl.html
  • 33. Understanding the problem Shapes ≠ types Nodes in RDF graphs can have 0, 1 or many rdf:type declarations A type can be used in multiple contexts, e.g. foaf:Person Nodes are not necessarily annotated with discriminating types Nodes with type :Person can represent friends, students, patients,... Different meanings and different structure depending on context Specific validation constraints for different contexts
  • 34. Understanding the problem RDF validation ≠ ontology definition ≠ instance data Ontologies are usually focused on domain entities RDF validation is focused on RDF graph features (lower level) Ontology Constraints RDF Validation Instance data Different levels :alice schema:name "Alice"; schema:knows :bob . <User> IRI { schema:name xsd:string ; schema:knows IRI } schema:knows a owl:ObjectProperty ; rdfs:domain schema:Person ; rdfs:range schema:Person . A user must have only two properties: schema:name of value xsd:string schema:knows with an IRI value
  • 35. Why not using SPARQL to validate? Pros: Expressive Ubiquitous Cons Expressive Idiomatic many ways to encode the same constraint ASK {{ SELECT ?Person { ?Person schema:name ?o . } GROUP BY ?Person HAVING (COUNT(*)=1) } { SELECT ?Person { ?Person schema:name ?o . FILTER ( isLiteral(?o) && datatype(?o) = xsd:string ) } GROUP BY ?Person HAVING (COUNT(*)=1) } { SELECT ?Person (COUNT(*) AS ?c1) { ?Person schema:gender ?o . } GROUP BY ?Person HAVING (COUNT(*)=1)} { SELECT ?Person (COUNT(*) AS ?c2) { ?S schema:gender ?o . FILTER ((?o = schema:Female || ?o = schema:Male)) } GROUP BY ?Person HAVING (COUNT(*)=1)} FILTER (?c1 = ?c2) } Example: Define SPARQL query to check: There must be one schema:name which must be a xsd:string, and one schema:gender which must be schema:Male or schema:Female
  • 36. Previous/other approaches SPIN, by TopQuadrant http://spinrdf.org/ SPARQL templates, Influenced SHACL Stardog ICV: http://docs.stardog.com/icv/icv-specification.html OWL with UNA and CWA OSLC resource shapes: https://www.w3.org/Submission/shapes/ Vocabulary for RDF validation Dublin Core Application profiles (K. Coyle, T. Baker) http://dublincore.org/documents/dc-dsp/ RDF Data Descriptions (Fischer et al) http://ceur-ws.org/Vol-1330/paper-33.pdf RDFUnit (D. Kontokostas) http://aksw.org/Projects/RDFUnit.html ...
  • 37. ShEx and SHACL 2013 RDF Validation Workshop Conclusions of the workshop: There is a need of a higher level, concise language for RDF Validation ShEx initially proposed (v 1.0) 2014 W3c Data Shapes WG chartered 2017 SHACL accepted as W3C recommendation 2017 ShEx 2.0 released as Community group draft 2018 ShEx adopted by Wikidata
  • 38. Short intro to ShEx ShEx (Shape Expressions Language) Concise and human-readable language for RDF validation & description Syntax similar to SPARQL, Turtle Semantics inspired by regular expressions & RelaxNG 2 syntaxes: Compact and RDF/JSON-LD Official info: http://shex.io Semantics: http://shex.io/shex-semantics/, primer: http://shex.io/shex-primer
  • 39. ShEx implementations and playgrounds Libraries: shex.js: Javascript SHaclEX: Scala (Jena/RDF4j) PyShEx: Python shex-java: Java Ruby-ShEx: Ruby Online demos & playgrounds ShEx-simple RDFShape ShEx-Java ShExValidata
  • 40. Simple example Nodes conforming to <User> shape must: • Be IRIs • Have exactly one schema:name with a value of type xsd:string • Have zero or more schema:knows whose values conform to <User> prefix schema: <http://schema.org/> prefix xsd: <http://www.w3.org/2001/XMLSchema#> <User> IRI { schema:name xsd:string ; schema:knows @<User> * } Prefix declarations as in Turtle/SPARQL
  • 41. RDF Validation using ShEx :alice schema:name "Alice" ; schema:knows :alice . :bob schema:knows :alice ; schema:name "Robert". :carol schema:name "Carol", "Carole" . :dave schema:name 234 . :emily foaf:name "Emily" . :frank schema:name "Frank" ; schema:email <mailto:frank@example.org> ; schema:knows :alice, :bob . :grace schema:name "Grace" ; schema:knows :alice, _:1 . _:1 schema:name "Unknown" . Try it (RDFShape): https://goo.gl/97bYdv Try it (ShExDemo):https://goo.gl/Y8hBsW Schema Data <User> IRI { schema:name xsd:string ; schema:knows @<User> * } :alice@<User>, :bob @<User>, :carol@<User>, :dave @<User>, :emily@<User>, :frank@<User>, :grace@<User> Shape map       
  • 42. Validation process :alice@:User, :bob@:User, :carol@:User ShEx Validator Result shape map :User { schema:name xsd:string ; schema:knows @:User * } ShEx Schema :alice schema:name "Alice" ; schema:knows :alice . :bob schema:knows :alice ; schema:name "Robert". :carol schema:name "Carol", "Carole" . RDF data Shape map :alice@:User, :bob@:User, :carol@!:User Input: RDF data, ShEx schema, Shape map Output: Result shape map
  • 43. Example with more ShEx features :AdultPerson EXTRA rdf:type { rdf:type [ schema:Person ] ; :name xsd:string ; :age MinInclusive 18 ; :gender [:Male :Female] OR xsd:string ; :address @:Address ? ; :worksFor @:Company + ; } :Address CLOSED { :addressLine xsd:string {1,3} ; :postalCode /[0-9]{5}/ ; :state @:State ; :city xsd:string } :Company { :name xsd:string ; :state @:State ; :employee @:AdultPerson * ; } :State /[A-Z]{2}/ :alice rdf:type :Student, schema:Person ; :name "Alice" ; :age 20 ; :gender :Male ; :address [ :addressLine "Bancroft Way" ; :city "Berkeley" ; :postalCode "55123" ; :state "CA" ] ; :worksFor [ :name "Company" ; :state "CA" ; :employee :alice ] . Try it: https://tinyurl.com/yd5hp9z4
  • 44. More info about ShEx See: ShEx by Example (slides): https://figshare.com/articles/ShExByExample_pptx/6291464 ShEx chapter from Validating RDF data book: http://book.validatingrdf.com/bookHtml010.html
  • 45. Short intro to SHACL W3C recommendation: https://www.w3.org/TR/shacl/ (July 2017) RDF vocabulary 2 parts: SHACL-Core, SHACL-SPARQL
  • 46. SHACL implementations Name Parts Language - Library Comments Topbraid SHACL API SHACL Core, SPARQL Java (Jena) Used by TopBraid composer SHACL playground SHACL Core Javascript (rdflib.js) http://shacl.org/playground/ SHACLEX SHACL Core Scala (Jena, RDF4j) http://rdfshape.weso.es pySHACL SHACL Core, SPARQL Python (rdflib) https://github.com/RDFLib/pySHACL Corese SHACL SHACL Core, SPARQL Java (STTL) http://wimmics.inria.fr/corese RDFUnit SHACL Core, SPARQL Java (Jena) https://github.com/AKSW/RDFUnit
  • 47. Basic example Try it. RDFShape https://goo.gl/T1uuzv prefix : <http://example.org/> prefix sh: <http://www.w3.org/ns/shacl#> prefix xsd: <http://www.w3.org/2001/XMLSchema#> prefix schema: <http://schema.org/> :UserShape a sh:NodeShape ; sh:targetNode :alice, :bob, :carol ; sh:nodeKind sh:IRI ; sh:property :hasName, :hasEmail . :hasName sh:path schema:name ; sh:minCount 1; sh:maxCount 1; sh:datatype xsd:string . :hasEmail sh:path schema:email ; sh:minCount 1; sh:maxCount 1; sh:nodeKind sh:IRI . :alice schema:name "Alice Cooper" ; schema:email <mailto:alice@mail.org> . :bob schema:firstName "Bob" ; schema:email <mailto:bob@mail.org> . :carol schema:name "Carol" ; schema:email "carol@mail.org" . Shapes graph Data graph  
  • 48. Same example with blank nodes Try it. RDFShape https://goo.gl/ukY5vq prefix : <http://example.org/> prefix sh: <http://www.w3.org/ns/shacl#> prefix xsd: <http://www.w3.org/2001/XMLSchema#> prefix schema: <http://schema.org/> :UserShape a sh:NodeShape ; sh:targetNode :alice, :bob, :carol ; sh:nodeKind sh:IRI ; sh:property [ sh:path schema:name ; sh:minCount 1; sh:maxCount 1; sh:datatype xsd:string ; ] ; sh:property [ sh:path schema:email ; sh:minCount 1; sh:maxCount 1; sh:nodeKind sh:IRI ; ] . :alice schema:name "Alice Cooper" ; schema:email <mailto:alice@mail.org> . :bob schema:firstName "Bob" ; schema:email <mailto:bob@mail.org> . :carol schema:name "Carol" ; schema:email "carol@mail.org" . Shapes graph Data graph  
  • 49. Some definitions about SHACL Shape: collection of targets and constraints components Targets: specify which nodes in the data graph must conform to a shape Constraint components: Determine how to validate a node Shape target declarations constraint components :UserShape a sh:NodeShape ; sh:targetNode :alice, :bob, :carol ; sh:nodeKind sh:IRI ; sh:property :hasName, :hasEmail . :hasName sh:path schema:name ; sh:minCount 1; sh:maxCount 1; sh:datatype xsd:string . . . .
  • 50. Validation Report The output of the validation process is a list of violation errors No errors  RDF conforms to shapes graph [ a sh:ValidationReport ; sh:conforms false ; sh:result [ a sh:ValidationResult ; sh:focusNode :bob ; sh:message "MinCount violation. Expected 1, obtained: 0" ; sh:resultPath schema:name ; sh:resultSeverity sh:Violation ; sh:sourceConstraintComponent sh:MinCountConstraintComponent ; sh:sourceShape :hasName ] ; ... [ a sh:ValidationReport ; sh:conforms true ].
  • 51. SHACL processor SHACL Processor Validation report :UserShape a sh:NodeShape ; sh:targetNode :alice, :bob, :carol ; sh:nodeKind sh:IRI ; sh:property :hasName, :hasEmail . :hasName sh:path schema:name ; sh:minCount 1; sh:maxCount 1; sh:datatype xsd:string . . . . Shapes graph :alice schema:name "Alice Cooper" ; schema:email <mailto:alice@mail.org>. :bob schema:name "Bob" ; schema:email <mailto:bob@mail.org> . :carol schema:name "Carol" ; schema:email <mailto:carol@mail.org> . Data Graph [ a sh:ValidationReport ; sh:conforms true ].
  • 52. SHACL Core built-in constraint components Type Constraints Cardinality minCount, maxCount Types of values class, datatype, nodeKind Values node, in, hasValue, property Range of values minInclusive, maxInclusive minExclusive, maxExclusive String based minLength, maxLength, pattern Language based languageIn, uniqueLang Logical constraints not, and, or, xone Closed shapes closed, ignoredProperties Property pair constraints equals, disjoint, lessThan, lessThanOrEquals Non-validating constraints name, description, order, group Qualified shapes qualifiedValueShape, qualifiedValueShapesDisjoint qualifiedMinCount, qualifiedMaxCount
  • 53. Longer example :AdultPerson EXTRA a { a [ schema:Person ] ; :name xsd:string ; :age MinInclusive 18 ; :gender [:Male :Female] OR xsd:string ; :address @:Address ? ; :worksFor @:Company + ; } :Address CLOSED { :addressLine xsd:string {1,3} ; :postalCode /[0-9]{5}/ ; :state @:State ; :city xsd:string } :Company { :name xsd:string ; :state @:State ; :employee @:AdultPerson * ; } :State /[A-Z]{2}/ :AdultPerson a sh:NodeShape ; sh:property [ sh:path rdf:type ; sh:qualifiedValueShape [ sh:hasValue schema:Person ]; sh:qualifiedMinCount 1 ; sh:qualifiedMaxCount 1 ; ] ; sh:targetNode :alice ; sh:property [ sh:path :name ; sh:minCount 1; sh:maxCount 1; sh:datatype xsd:string ; ] ; sh:property [ sh:path :gender ; sh:minCount 1; sh:maxCount 1; sh:in (:Male :Female); ] ; sh:property [ sh:path :age ; sh:maxCount 1; sh:minInclusive 18 ] ; sh:property [ sh:path :address ; sh:node :Address ; sh:minCount 1 ; sh:maxCount 1 ] ; sh:property [ sh:path :worksFor ; sh:node :Company ; sh:minCount 1 ; sh:maxCount 1 ]. :Address a sh:NodeShape ; sh:closed true ; sh:property [ sh:path :addressLine; sh:datatype xsd:string ; sh:minCount 1 ; sh:maxCount 3 ] ; sh:property [ sh:path :postalCode ; sh:pattern "[0-9]{5}" ; sh:minCount 1 ; sh:maxCount 3 ] ; sh:property [ sh:path :city ; sh:datatype xsd:string ; sh:minCount 1 ; sh:maxCount 1 ] ; sh:property [ sh:path :state ; sh:node :State ; ] . :Company a sh:NodeShape ; sh:property [ sh:path :name ; sh:datatype xsd:string ] ; sh:property [ sh:path :state ; sh:node :State ] ; sh:property [ sh:path :employee ; sh:node :AdultPerson ; ] ;. :State a sh:NodeShape ; sh:pattern "[A-Z]{2}" . In ShEx In SHACL Its recursive!!! (not well defined SHACL) Implementation dependent feature Try it: https://tinyurl.com/ycl3mkzr
  • 54. More info about SHACL SHACL by example (slides): https://figshare.com/articles/SHACL_by_example/6449645 SHACL chapter at Validating RDF data book http://book.validatingrdf.com/bookHtml011.html
  • 55. Some challenges and perspectives Theoretical foundations of ShEx/SHACL Inference shapes from data Validation Usability RDF Stream validation Schema ecosystems Wikidata Solid
  • 56. Theoretical foundations of ShEx/SHACL Conversion between ShEx and SHACL SHaclEX library converts subsets of both Challenges Recursion and negation Performance and algorithmic complexity Detect useful subsets of the languages Convert to SPARQL Schema/data mapping Shapes Shapes ShEx SHACL
  • 57. Inference of Shapes from Data Useful use case in practice Knowledge Graph summarization Some prototypes: ShExer, RDFShape, ShapeArchitect Try it with RDFShape: https://tinyurl.com/y8pjcbyf Shape Expression generated for wd:Q51613194 Shapes RDF data infer
  • 58. Validation usability Learning from users Early adopters: WebIndex, HL7 FHIR, Eclipse Lyo, GenWiki,… Improve error information/visualization/navigation/repairing Authoring/visualization tools Propose annotation sets UI generation Error reporting/suggestion (SHOULD/MUST/…) Shapes
  • 59. RDF Stream validation Validation of RDF streams Challenges: Incremental validation Named graphs Addition/removal of triples Sensor Stream RDF data Shapes Validator Stream Validation results
  • 60. Schema ecosystems: Wikidata In May, 2019, Wikidata announced ShEx adoption New namespace for schemas Example: https://www.wikidata.org/wiki/EntitySchema:E2 It opens lots of opportunities/challenges Schema evolution and comparison
  • 61. Schema ecosystems: Solid project SOLID (SOcial Linked Data): Promoted by Tim Berners-Lee Goal: Re-decentralize the Web Separate data from apps Give users more control about their data Internally using linked data & RDF Shapes needed for interoperability See: https://ruben.verborgh.org/blog/2019/06/17/shaping-linked-data-apps/ Data pod Shape A Shape C Shape B App 1 App 2 App 3
  • 62. Conclusions RDF as a basis for knowledge graphs Explicit schemas can help improve data quality 2 languages proposed: ShEx/SHACL Lots of new challenges and opportunities

Editor's Notes

  1. ShEx: 20 lines of code SHACL: 60 lines of code