Machine Learning Methods 
for Analysing and Linking RDF Data 
Jens Lehmann 
September 16, 2014 
Jens Lehmann (AKSW, Uni Leipzig) Analysing and Linking RDF Data September 16, 2014 1 / 35
Structured Machine Learning 
How to analyse 
structured data? 
Detecting Crime Patterns: Series Finder
Construct "Modus operandi" of criminals - identified 9 new crime 
patterns in Cambridge MA, USA 
Wang, Tong, et al. "Detecting Patterns of Crime with Series Finder." AAAI 2013. 
Discovery of Laws of Physics 
Background data generated using experiments 
Mathematical functions on input variables form hypothesis space 
Schmidt, Lipson. "Distilling free-form natural laws from experimental data." Science 2009. 
Protein Interaction 
Rules learned via Inductive Logic Programming (ProGolem) 
understandable by experts and competitive with statistical learners 
Possibly better drug design and reduction of side effects 
Santos et al. "Automated identification of protein-ligand interaction features using Inductive 
Logic Programming: a hexose binding case study." BMC Bioinformatics 2012. 
Background Knowledge 
RDF and the Linked Data Principles 
RDF Triple: 
Example:
http://cs.ox.ac.uk/John (Subject)
http://cs.ox.ac.uk/studies (Predicate)
http://cs.ox.ac.uk/CS (Object)
The term Linked Data refers to a set of best practices for publishing and 
interlinking structured data on the Web. 
Linked Data principles (simplified version): 
1 Use RDF and URLs as identifiers 
2 Include links to other datasets 
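A minimal sketch of the triple structure and the two simplified principles, in plain Python. The `Triple` type, both helper functions and the `example.org` URL are illustrative assumptions and not part of the talk; the host comparison is only a rough stand-in for "links to other datasets".

```python
from typing import NamedTuple
from urllib.parse import urlparse


class Triple(NamedTuple):
    """One RDF statement; all three positions are URLs in this sketch."""
    subject: str
    predicate: str
    obj: str


def uses_http_urls(triple: Triple) -> bool:
    """Principle 1 (simplified): every identifier is an HTTP(S) URL."""
    return all(urlparse(term).scheme in ("http", "https") for term in triple)


def links_to_other_dataset(triple: Triple) -> bool:
    """Principle 2 (rough proxy): subject and object live on different hosts."""
    return urlparse(triple.subject).netloc != urlparse(triple.obj).netloc


# The triple from the slide.
john_studies_cs = Triple(
    "http://cs.ox.ac.uk/John",
    "http://cs.ox.ac.uk/studies",
    "http://cs.ox.ac.uk/CS",
)

# Hypothetical outgoing link; the target URL is made up for illustration.
john_same_as = Triple(
    "http://cs.ox.ac.uk/John",
    "http://www.w3.org/2002/07/owl#sameAs",
    "http://example.org/people/John",
)
```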
OWL Ontologies 
Web Ontology Language (OWL) builds on RDF and Description 
Logics 
Objects 
Specific resources (constants) 
Examples: MARIA, LEIPZIG 
Classes 
Sets of objects (unary predicates) 
Examples: Student, Car, Country 
Properties 
Connections between objects (binary predicates) 
Examples: hasChild, partOf 
Can be combined into complex concepts (OWL Class Expressions), e.g.:
$\text{Child} \sqcap \exists\,\text{hasParent}.\text{Professor}$
Learning OWL Class Expressions - Definition 
Given: 
Background Knowledge (OWL ontologies and RDF datasets) 
Positive and negative examples (objects in datasets) 
Goal: 
Find OWL class expression describing positive but not negative 
examples 
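The learning problem can be sketched in a few lines if a class expression is represented only by the set of objects it covers. This is a simplification for illustration: a real system would obtain that set from a reasoner, and the function names below are assumptions.

```python
def is_solution(instances, positives, negatives):
    """True if the expression covers every positive and no negative example."""
    return positives <= instances and not (negatives & instances)


def accuracy(instances, positives, negatives):
    """Fraction of examples classified correctly by the expression."""
    correct = len(positives & instances) + len(negatives - instances)
    return correct / (len(positives) + len(negatives))
```

An expression with accuracy 1.0 is a solution in the sense above; lower values give a graded score that a search heuristic can use.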
Application Example: Therapy Response Prediction
0.5-1% of population affected by Rheumatoid Arthritis
Anti-TNF therapy not effective for several million persons, for unknown reasons
Learning OWL Class Expressions - Approaches 
Least common subsumers 
Cohen et al. Computing least common subsumers in description 
logics. AAAI 1992 
Terminological decision trees 
Fanizzi et al. Induction of concepts in web ontologies through 
terminological decision trees. ECML PKDD 2010 
Rule-based 
Fanizzi et al. DL-FOIL concept learning in description logics. ILP 
2008 
Genetic Programming 
Lehmann, Jens. Hybrid learning of ontology classes. MLDM 2007 
Refinement operators 
Lehmann et al. Concept learning in description logics using refinement 
operators. ML 2010 
Iannone et al. An algorithm based on counterfactuals for concept 
learning in the semantic web. AI 2007 
Refinement Operators - Definitions 
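Given a DL $L$, consider the quasi-ordered space $\langle C(L), \sqsubseteq_T \rangle$ over concepts of $L$
$\rho : C(L) \to 2^{C(L)}$ is a downward $L$ refinement operator if for any $C \in C(L)$:
$D \in \rho(C)$ implies $D \sqsubseteq_T C$
Notation: Write $C \rightsquigarrow_\rho D$ instead of $D \in \rho(C)$
Example refinement chain:
$\top \rightsquigarrow_\rho \text{Person} \rightsquigarrow_\rho \text{Man} \rightsquigarrow_\rho \text{Man} \sqcap \exists\,\text{hasChild}.\top$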
Learning using Refinement Operators
Search tree (heuristic score after each concept):
$\top$ (0.45)
  Car (too weak)
  Person (0.73)
    $\text{Person} \sqcap \exists\,\text{attends}.\top$ (0.78)
      $\text{Person} \sqcap \exists\,\text{attends}.\text{Talk}$ (0.97)
      ...
    ...
  ...
Start with most general concept (top down)
Heuristic evaluates using pos/neg examples
Operator specialises
Continue until termination criterion met
= Learning Algorithm
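The loop described above can be sketched as a best-first search. Everything here is a toy stand-in and an assumption: concepts are plain conjunctions of class names, the knowledge base is a hand-written dictionary, and the refinement operator only adds one conjunct. It is not the operator or heuristic used in DL-Learner.

```python
import heapq
from itertools import count

# Hypothetical toy knowledge base: class name -> individuals.
MEMBERS = {
    "Person": {"anna", "bob", "carl"},
    "Car": {"beetle"},
    "TalkAttendee": {"anna", "bob"},
}
INDIVIDUALS = set().union(*MEMBERS.values())


def instances(concept):
    """Extension of a conjunction of class names; the empty conjunction is top."""
    result = set(INDIVIDUALS)
    for name in concept:
        result &= MEMBERS[name]
    return result


def refine(concept):
    """Toy downward refinement: add one conjunct, which can only shrink the extension."""
    return [concept | {name} for name in sorted(MEMBERS) if name not in concept]


def score(concept, pos, neg):
    """Heuristic: fraction of examples classified correctly."""
    inst = instances(concept)
    return (len(pos & inst) + len(neg - inst)) / (len(pos) + len(neg))


def learn(pos, neg, max_steps=100):
    """Expand the best-scoring concept first until a perfect one is found."""
    tie = count()  # tie-breaker so the heap never has to compare frozensets
    top = frozenset()
    frontier = [(-score(top, pos, neg), next(tie), top)]
    seen = {top}
    for _ in range(max_steps):  # termination criterion: step budget
        if not frontier:
            break
        negated_score, _, concept = heapq.heappop(frontier)
        if -negated_score == 1.0:
            return concept
        for child in refine(concept):
            if child in seen:
                continue
            seen.add(child)
            if not pos <= instances(child):
                continue  # too weak: no specialisation can regain lost positives
            heapq.heappush(frontier, (-score(child, pos, neg), next(tie), child))
    return None
```

Pruning "too weak" nodes is safe in this sketch because every refinement step only removes individuals from the extension.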
Properties of Refinement Operators 
An $L$ downward refinement operator $\rho$ is called
Finite iff $\rho(C)$ is finite for any concept $C \in C(L)$
Redundant iff there exist two different $\rho$ refinement chains from a concept $C$ to a concept $D$
Proper iff for $C, D \in C(L)$, $C \rightsquigarrow_\rho D$ implies $C \not\equiv_T D$
Complete iff for $C, D \in C(L)$ with $D \sqsubset_T C$ there is a concept $E$ with $E \equiv_T D$ and a refinement chain $C \rightsquigarrow_\rho \cdots \rightsquigarrow_\rho E$
Weakly complete iff for any concept $C$ with $C \sqsubset_T \top$ we can reach a concept $E$ with $E \equiv_T C$ from $\top$ by $\rho$
Properties of Refinement Operators 
Properties indicate how suitable a refinement operator is for solving 
the learning problem: 
Incomplete operators may miss solutions 
Redundant operators may lead to duplicate concepts in the search tree 
Improper operators may produce equivalent concepts (which cover the 
same examples) 
For infinite operators it may not be possible to compute all refinements 
of a given concept 
Key question: Which properties can be combined? 
Theorem: Properties of L Refinement Operators 
Theorem 
Maximum sets of combinable properties of $L$ refinement operators for
$L \in \{\mathcal{ALC}, \mathcal{ALCN}, \mathcal{SHOIN}, \mathcal{SROIQ}\}$ are:
1 {weakly complete, complete, finite}
2 {weakly complete, complete, proper}
3 {weakly complete, non-redundant, finite}
4 {weakly complete, non-redundant, proper}
5 {non-redundant, finite, proper}
Concept Learning in Description Logics Using Refinement Operators; Lehmann, Hitzler; Machine Learning journal, 2010
Foundations of Refinement Operators for Description Logics; Lehmann, Hitzler; ILP conference, 2008
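Definition of $\rho$
\[
\rho(C) =
\begin{cases}
\{\bot\} \cup \rho_\top(C) & \text{if } C = \top \\
\rho_\top(C) & \text{otherwise}
\end{cases}
\]
\[
\rho_B(C) =
\begin{cases}
\emptyset & \text{if } C = \bot \\
\{C_1 \sqcup \cdots \sqcup C_n \mid C_i \in M_B\ (1 \le i \le n)\} & \text{if } C = \top \\
\{A' \mid A' \in sh_{\downarrow}(A)\} \cup \{A \sqcap D \mid D \in \rho_B(\top)\} & \text{if } C = A\ (A \in N_C) \\
\{\neg A' \mid A' \in sh_{\uparrow}(A)\} \cup \{\neg A \sqcap D \mid D \in \rho_B(\top)\} & \text{if } C = \neg A\ (A \in N_C) \\
\{\exists r.E \mid A = ar(r),\ E \in \rho_A(D)\} \cup \{\exists r.D \sqcap E \mid E \in \rho_B(\top)\} \cup \{\exists s.D \mid s \in sh_{\downarrow}(r)\} & \text{if } C = \exists r.D \\
\{\forall r.E \mid A = ar(r),\ E \in \rho_A(D)\} \cup \{\forall r.D \sqcap E \mid E \in \rho_B(\top)\} \cup \{\forall r.\bot \mid D = A \in N_C \text{ and } sh_{\downarrow}(A) = \emptyset\} \cup \{\forall s.D \mid s \in sh_{\downarrow}(r)\} & \text{if } C = \forall r.D \\
\{C_1 \sqcap \cdots \sqcap C_{i-1} \sqcap D \sqcap C_{i+1} \sqcap \cdots \sqcap C_n \mid D \in \rho_B(C_i),\ 1 \le i \le n\} & \text{if } C = C_1 \sqcap \cdots \sqcap C_n\ (n \ge 2) \\
\{C_1 \sqcup \cdots \sqcup C_{i-1} \sqcup D \sqcup C_{i+1} \sqcup \cdots \sqcup C_n \mid D \in \rho_B(C_i),\ 1 \le i \le n\} \cup \{(C_1 \sqcup \cdots \sqcup C_n) \sqcap D \mid D \in \rho_B(\top)\} & \text{if } C = C_1 \sqcup \cdots \sqcup C_n\ (n \ge 2)
\end{cases}
\]
Base Operator (Excerpt)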
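Definition of $\rho$
$\{\exists r.E \mid A = ar(r),\ E \in \rho_A(D)\} \cup \{\exists r.D \sqcap E \mid E \in \rho_B(\top)\} \cup \{\exists s.D \mid s \in sh_{\downarrow}(r)\}$ if $C = \exists r.D$
Examples: $\exists\,\text{takesPartIn}.\text{SocialEvent} \rightsquigarrow_\rho$
$\exists\,\text{takesPartIn}.\text{Meeting}$
$\text{Student} \sqcap \exists\,\text{takesPartIn}.\text{SocialEvent}$
$\exists\,\text{leads}.\text{SocialEvent}$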
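The three parts of the rule for existential restrictions can be illustrated with a small sketch. The class hierarchy, role hierarchy and the finite list standing in for the refinements of the top concept are assumptions chosen to reproduce the three examples; the range-based index and recursion into complex fillers are left out.

```python
# Hypothetical hierarchies chosen to match the slide's examples.
SUBCLASSES = {"SocialEvent": ["Meeting"]}   # direct subclasses (sh-down on classes)
SUBROLES = {"takesPartIn": ["leads"]}       # direct sub-roles (sh-down on roles)
TOP_REFINEMENTS = ["Student"]               # finite stand-in for the refinements of top


def refine_exists(role, filler):
    """Refinements of `exists role.filler` for an atomic filler (simplified)."""
    concept = ("exists", role, filler)
    refinements = []
    # Part 1: specialise the filler.
    for sub_class in SUBCLASSES.get(filler, []):
        refinements.append(("exists", role, sub_class))
    # Part 2: add a conjunct taken from the refinements of top.
    for extra in TOP_REFINEMENTS:
        refinements.append(("and", concept, extra))
    # Part 3: replace the role by a direct sub-role.
    for sub_role in SUBROLES.get(role, []):
        refinements.append(("exists", sub_role, filler))
    return refinements
```

Calling `refine_exists("takesPartIn", "SocialEvent")` yields one refinement per part, matching the three examples in order.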
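Properties of $\rho$
$\rho_{\downarrow}$ is complete
$\rho_{\downarrow}$ is infinite, e.g. there are infinitely many refinement steps of the form:
$\top \rightsquigarrow_{\rho_{\downarrow}} C_1 \sqcup C_2 \sqcup C_3 \sqcup \ldots$
$\rho_{\downarrow}^{cl}$ is proper
$\rho_{\downarrow}$ is redundant, e.g. two different chains reach the same concept:
$\forall r_1.A_1 \sqcup \forall r_2.A_1 \rightsquigarrow_{\rho_{\downarrow}} \forall r_1.(A_1 \sqcap A_2) \sqcup \forall r_2.A_1 \rightsquigarrow_{\rho_{\downarrow}} \forall r_1.(A_1 \sqcap A_2) \sqcup \forall r_2.(A_1 \sqcap A_2)$
$\forall r_1.A_1 \sqcup \forall r_2.A_1 \rightsquigarrow_{\rho_{\downarrow}} \forall r_1.A_1 \sqcup \forall r_2.(A_1 \sqcap A_2) \rightsquigarrow_{\rho_{\downarrow}} \forall r_1.(A_1 \sqcap A_2) \sqcup \forall r_2.(A_1 \sqcap A_2)$
"DL-Learner: Learning Concepts in Description Logics",
Jens Lehmann, Journal of Machine Learning Research (JMLR), 2009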
Learning using Refinement Operators
[Figure: search tree over $\top$, Car (too weak), Person, $\text{Person} \sqcap \exists\,\text{attends}.\top$ and $\text{Person} \sqcap \exists\,\text{attends}.\text{Talk}$; each node carries a score and an expansion value in brackets]
Redundancy elimination technique with polynomial complexity wrt. search tree size
Length of children limited by expansion value
Infinite $\rho$ applicable
Expansion value (he) used by heuristic (bias towards short concepts, Occam's Razor)
Scalability 
Refinement operator should build coherent concepts 
Inference: 
Complete & sound vs. approximation
Open World Assumption (OWA) vs. Closed World Assumption (CWA)
Stochastic coverage computation
Pick random example → perform instance check → compute
confidence interval (e.g. via Wald method) wrt. objective function
(e.g. F-measure)
Up to 99% fewer instance checks in test examples
Low influence on accuracy shown for 380 learning tasks using 7
ontologies (0.2% ± 0.4% F-measure difference)
Fragment extraction for application on large knowledge bases 
Class Expression Learning for Ontology Engineering; Jens Lehmann, Sören Auer, Lorenz 
Bühmann, Sebastian Tramp; Journal of Web Semantics (JWS), 2011 
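A minimal sketch of stochastic coverage computation under simplifying assumptions: the interval is computed on the plain coverage proportion rather than on F-measure, the 95% level, the width threshold and the minimum number of checks are arbitrary choices, and `instance_check` stands for whatever reasoner call decides membership.

```python
import math
import random


def wald_interval(successes, trials, z=1.96):
    """Wald confidence interval for a proportion, clipped to [0, 1]."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    half_width = z * math.sqrt(p * (1 - p) / trials)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def estimate_coverage(examples, instance_check, max_width=0.1, min_checks=30, seed=0):
    """Check examples in random order and stop once the interval is narrow enough.

    Returns (estimated coverage, number of instance checks performed).
    """
    order = list(examples)
    if not order:
        raise ValueError("need at least one example")
    random.Random(seed).shuffle(order)
    covered = 0
    for checks, example in enumerate(order, start=1):
        covered += bool(instance_check(example))
        if checks >= min_checks:
            low, high = wald_interval(covered, checks)
            if high - low <= max_width:
                return covered / checks, checks
    return covered / len(order), len(order)
```

The Wald interval collapses to zero width when the observed proportion is exactly 0 or 1, which is why the sketch enforces a minimum number of checks before it is allowed to stop.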
Carcinogenesis 
Goal: predict whether substance causes cancer 
Why: 
Each year 1000 new substances developed 
Substances can often only be validated using time consuming and
expensive experiments with mice → prioritise those with high risk
Background knowledge: 
Database of the US National Toxicology Program (NTP) 
“Obtaining accurate structural alerts for the causes of chemical cancers is 
a problem of great scientific and humanitarian value.” (A. Srinivasan, R.D. 
King, S.H. Muggleton, M.J.E. Sternberg 1997) 
Knowledge Base Enrichment 
Pattern Based Knowledge Base Enrichment; Lorenz Bühmann, Jens Lehmann; International 
Semantic Web Conference (ISWC) 2013 
Universal OWL Axiom Enrichment for Large Knowledge Bases; Lorenz Bühmann, Jens 
Lehmann; Knowledge Engineering and Knowledge Management (EKAW) 2012 
Protégé Plugin 
Support for ontology creation and maintenance 
Ontology Debugging: ORE 
ORE - A Tool for Repairing and Enriching Knowledge Bases; Lehmann, Bühmann; International
Semantic Web Conference (ISWC) 2010
Data Quality Measurement: RDFUnit 
Test-driven Evaluation of Linked Data Quality; World Wide Web Conference (WWW), 
ACM, 2014; Dimitris Kontokostas, Patrick Westphal, Sören Auer, Sebastian Hellmann, Jens 
Lehmann, Roland Cornelissen, Amrapali J. Zaveri 
Robot Scientists Adam & Eve
Abduction to form hypotheses and 1 000 experiments per day
12 new scientific discoveries regarding functions of genes in yeast 
King, Ross D et al. The automation of science. Science 324 (2009): 85-89. 
Link Discovery - Motivation 
Links are backbone of traditional WWW and Data Web 
Links are central for data integration, deduplication, cross-ontology 
question answering, reasoning, federated queries . . . 
Central problem for many large IT companies 
Automated tools (LIMES, SILK) can create a high number of links 
between RDF resources by using heuristics 
Link Discovery - Definition 
Definition (Link Discovery) 
Given sets S and T of resources and relation R (often owl:sameAs) 
Common approach: Find $M = \{(s, t) \in S \times T : \delta(s, t) \le \theta\}$
S: DBpedia
rdfs:label: African Elephant
T: BBC Wildlife
dc:title: African Bush Elephant
dbpedia:AfricanElephant owl:sameAs bbc:hfzw82929 ?
$\delta$ = levenshtein(S.rdfs:label, T.dc:title)
$\delta$(dbpedia:AfricanElephant, bbc:hfzw82929) = 5
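The distance in the example can be checked with a short sketch. The dynamic-programming Levenshtein function is the textbook algorithm; `find_links` and the dictionary layout (resource identifier mapped to its label) are illustrative assumptions, and the quadratic comparison of all pairs is exactly what scalable tools avoid.

```python
def levenshtein(a, b):
    """Edit distance: minimum number of insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute or keep
            ))
        previous = current
    return previous[-1]


def find_links(source, target, theta):
    """All (s, t) pairs whose labels are at most theta edits apart (naive scan)."""
    return [
        (s, t)
        for s, s_label in source.items()
        for t, t_label in target.items()
        if levenshtein(s_label, t_label) <= theta
    ]


source = {"dbpedia:AfricanElephant": "African Elephant"}
target = {"bbc:hfzw82929": "African Bush Elephant"}
print(levenshtein("African Elephant", "African Bush Elephant"))  # 5
```

With a threshold of 5 the two resources are linked; with a threshold of 4 they are not, which shows how sensitive the result is to the choice of threshold.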
Example: Link Specification 
f(trigrams(:name, :label), 0.5) $\sqcup$ f(edit(:socId, :socId), 0.5)
Link Specification Syntax and Semantics 
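Link specification $LS$ and its semantics $[\![LS]\!]$:
$[\![f(m, \theta, M)]\!] = \{(s, t, r) \mid (s, t, r) \in M \wedge m(s, t) \ge \theta\}$
$[\![LS_1 \sqcap LS_2]\!] = \{(s, t, r) \mid (s, t, r_1) \in [\![LS_1]\!] \wedge (s, t, r_2) \in [\![LS_2]\!] \wedge r = \min(r_1, r_2)\}$
\[
[\![LS_1 \sqcup LS_2]\!] = \left\{ (s, t, r) \;\middle|\;
\begin{cases}
r = r_1 & \text{if } \exists (s, t, r_1) \in [\![LS_1]\!] \wedge \neg(\exists r_2 : (s, t, r_2) \in [\![LS_2]\!]) \\
r = r_2 & \text{if } \exists (s, t, r_2) \in [\![LS_2]\!] \wedge \neg(\exists r_1 : (s, t, r_1) \in [\![LS_1]\!]) \\
r = \max(r_1, r_2) & \text{if } (s, t, r_1) \in [\![LS_1]\!] \wedge (s, t, r_2) \in [\![LS_2]\!]
\end{cases}
\right\}
\]
Syntax and semantics allow defining an ordering similar to
subsumption (more specific specs generate fewer links)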
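The semantics above can be sketched with mappings stored as dictionaries from (s, t) pairs to the value r. The function names and the dictionary representation are assumptions made for illustration and are not the LIMES API.

```python
def atomic(measure, theta, mapping):
    """f(m, theta, M): keep the entries of M whose measure value reaches theta."""
    return {
        (s, t): r
        for (s, t), r in mapping.items()
        if measure(s, t) >= theta
    }


def conjunction(left, right):
    """Pairs present in both mappings, with the smaller of the two values."""
    return {pair: min(left[pair], right[pair]) for pair in left.keys() & right.keys()}


def disjunction(left, right):
    """Pairs present in either mapping; the larger value wins where both have one."""
    result = dict(left)
    for pair, r in right.items():
        result[pair] = max(r, result[pair]) if pair in result else r
    return result
```

A conjunction can only return pairs that both operands return and a disjunction returns every pair of either operand, which is the link-count ordering the last line above refers to.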
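Link Specification Refinement Operator
\[
\rho(LS) =
\begin{cases}
\{f(m_1, 1, \top) \sqcap \cdots \sqcap f(m_n, 1, \top) \mid m_i \in SM,\ 1 \le i \le n,\ n \le 2^{|SM|}\} & \text{if } LS = \bot \\
f(m, dt(\theta), M) \cup LS \sqcup f(m', 1, M) \quad (m \in SM,\ m \ne m') & \text{if } LS = f(m, \theta, M) \text{ (atomic)} \\
LS_1 \sqcap \cdots \sqcap LS_{i-1} \sqcap LS' \sqcap LS_{i+1} \sqcap \cdots \sqcap LS_n \text{ with } LS' \in \rho(LS_i) & \text{if } LS = LS_1 \sqcap \cdots \sqcap LS_n\ (n \ge 2) \\
LS_1 \sqcup \cdots \sqcup LS_{i-1} \sqcup LS' \sqcup LS_{i+1} \sqcup \cdots \sqcup LS_n \text{ with } LS' \in \rho(LS_i) \cup LS \sqcup f(m, 1, M) \quad (m \in SM,\ m \text{ not used in } LS) & \text{if } LS = LS_1 \sqcup \cdots \sqcup LS_n\ (n \ge 2)
\end{cases}
\]
Upward refinement operator
Positive: Weakly complete, finite
Negative: Not complete, redundant, not proper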
Refinement Chain Example
f(edit(:socId, :socId), 1.0)
$\rightsquigarrow$ f(edit(:socId, :socId), 0.5)
$\rightsquigarrow$ f(trigrams(:name, :label), 1.0) $\sqcup$ f(edit(:socId, :socId), 0.5)
$\rightsquigarrow$ f(trigrams(:name, :label), 0.5) $\sqcup$ f(edit(:socId, :socId), 0.5)
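The chain can be replayed with a simplified sketch of the atomic and disjunction cases. Several things here are assumptions: specifications are nested tuples without the mapping argument, the threshold decrease is a fixed step of 0.5 (the slides do not define it), only two measures exist, and conjunctions and nested disjunctions are left out.

```python
# Hypothetical set of similarity measures, named as on the slide.
MEASURES = ["edit(:socId, :socId)", "trigrams(:name, :label)"]


def decrease_threshold(theta, step=0.5):
    """Assumed threshold decrease: lower theta by a fixed step, None below zero."""
    lowered = round(theta - step, 10)
    return lowered if lowered > 0 else None


def measures_in(spec):
    """Collect the measure names used anywhere in a specification."""
    if spec[0] == "f":
        return {spec[1]}
    return set().union(*(measures_in(child) for child in spec[1]))


def refine(spec):
    """Upward refinements of an atomic spec ("f", m, theta) or ("or", children)."""
    results = []
    if spec[0] == "f":
        _, measure, theta = spec
        lowered = decrease_threshold(theta)
        if lowered is not None:
            results.append(("f", measure, lowered))  # accept more links
        for other in MEASURES:
            if other != measure:
                results.append(("or", (("f", other, 1.0), spec)))
    elif spec[0] == "or":
        children = spec[1]
        for i, child in enumerate(children):
            for refined in refine(child):
                if refined[0] == "f":  # keep the disjunction flat in this sketch
                    results.append(("or", children[:i] + (refined,) + children[i + 1:]))
        for other in MEASURES:
            if other not in measures_in(spec):
                results.append(("or", children + (("f", other, 1.0),)))
    return results


start = ("f", "edit(:socId, :socId)", 1.0)
step1 = ("f", "edit(:socId, :socId)", 0.5)
step2 = ("or", (("f", "trigrams(:name, :label)", 1.0), step1))
step3 = ("or", (("f", "trigrams(:name, :label)", 0.5), step1))
```

Each step of the chain appears among the refinements of its predecessor, so the three arrows correspond to one threshold decrease, one added disjunct and a second threshold decrease.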
Projects: DL-Learner and LIMES 
DL-Learner 
Open-Source-Project: http://dl-learner.org 
Extensible Platform for concept learning algorithms 
Supports all RDF/OWL serialisations and major reasoners 
Several thousand downloads 
LIMES (http://aksw.org/Projects/LIMES.html) 
Highly scalable engine (fastest RDF link discovery tool) 
Several machine learning approaches integrated (including the one 
presented) 
“DL-Learner: Learning Concepts in Description Logics”, 
Jens Lehmann, Journal of Machine Learning Research (JMLR), 2009 
Summary & Conclusions
Many interesting applications of structured machine learning (therapy
response prediction, disease prediction, protein folding, data quality
measurement, ontology debugging)
Still few machine learning tools for working with RDF/OWL although
more and more data available
Refinement operators allow applying supervised machine learning to
complex background knowledge
Can be applied to other languages like link specifications
Jens Lehmann (AKSW, Uni Leipzig) Analysing and Linking RDF Data September 16, 2014 35 / 35

More Related Content

What's hot

Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...Olaf Hartig
 
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...Olaf Hartig
 
Ranking Objects by Exploiting Relationships: Computing Top-K over Aggregation
Ranking Objects by Exploiting Relationships: Computing Top-K over AggregationRanking Objects by Exploiting Relationships: Computing Top-K over Aggregation
Ranking Objects by Exploiting Relationships: Computing Top-K over AggregationJason Yang
 
A Survey of Evaluation Techniques and Systems for Answer Set Programming
A Survey of Evaluation Techniques and Systems for Answer Set ProgrammingA Survey of Evaluation Techniques and Systems for Answer Set Programming
A Survey of Evaluation Techniques and Systems for Answer Set ProgrammingFörderverein Technische Fakultät
 
RDF APIs for .NET Framework
RDF APIs for .NET FrameworkRDF APIs for .NET Framework
RDF APIs for .NET FrameworkAdriana Ivanciu
 
Getty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic RepresentationGetty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic RepresentationVladimir Alexiev, PhD, PMP
 
Modelling and Querying Lists in RDF. A Pragmatic Study
Modelling and Querying Lists in RDF. A Pragmatic StudyModelling and Querying Lists in RDF. A Pragmatic Study
Modelling and Querying Lists in RDF. A Pragmatic StudyAlbert Meroño-Peñuela
 
File Handling in C Part I
File Handling in C Part IFile Handling in C Part I
File Handling in C Part IArpana Awasthi
 
Csr2011 june16 17_00_lohrey
Csr2011 june16 17_00_lohreyCsr2011 june16 17_00_lohrey
Csr2011 june16 17_00_lohreyCSR2011
 
Context-Enhanced Adaptive Entity Linking
Context-Enhanced Adaptive Entity LinkingContext-Enhanced Adaptive Entity Linking
Context-Enhanced Adaptive Entity LinkingGiuseppe Rizzo
 
Optimized index structures for querying rdf from the web
Optimized index structures for querying rdf from the webOptimized index structures for querying rdf from the web
Optimized index structures for querying rdf from the webMahdi Atawneh
 
Information Content based Ranking Metric for Linked Open Vocabularies
Information Content based Ranking Metric for Linked Open VocabulariesInformation Content based Ranking Metric for Linked Open Vocabularies
Information Content based Ranking Metric for Linked Open VocabulariesGhislain Atemezing
 
The Statistical Significance of "R"
The Statistical Significance of "R"The Statistical Significance of "R"
The Statistical Significance of "R"ppvora
 
What's next in Julia
What's next in JuliaWhat's next in Julia
What's next in JuliaJiahao Chen
 
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...Oscar Corcho
 

Machine Learning Methods for Analysing and Linking RDF Data

  • 1. Machine Learning Methods for Analysing and Linking RDF Data Jens Lehmann September 16, 2014 Jens Lehmann (AKSW, Uni Leipzig) Analysing and Linking RDF Data September 16, 2014 1 / 35
  • 3. Structured Machine Learning: How to analyse structured data?
  • 4. Detecting Crime Patterns: Series Finder. Constructs the "modus operandi" of criminals; identified 9 new crime patterns in Cambridge, MA, USA. Wang, Tong, et al. "Detecting Patterns of Crime with Series Finder." AAAI 2013.
  • 5. Discovery of Laws of Physics. Background data generated using experiments. Mathematical functions on input variables form the hypothesis space. Schmidt, Lipson. "Distilling free-form natural laws from experimental data." Science 2009.
  • 6. Protein Interaction. Rules learned via Inductive Logic Programming (ProGolem) are understandable by experts and competitive with statistical learners. Possibly better drug design and reduction of side effects. Santos et al. "Automated identification of protein-ligand interaction features using Inductive Logic Programming: a hexose binding case study." BMC Bioinformatics 2012.
  • 7. Background Knowledge
  • 11. RDF and the Linked Data Principles. RDF triple example: subject http://cs.ox.ac.uk/John, predicate http://cs.ox.ac.uk/studies, object http://cs.ox.ac.uk/CS. The term Linked Data refers to a set of best practices for publishing and interlinking structured data on the Web. Linked Data principles (simplified version): (1) use RDF and URLs as identifiers; (2) include links to other datasets.
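The triple structure on this slide can be sketched in a few lines of Python. This is only an illustration using plain tuples; the `Triple` type and the `objects_of` helper are hypothetical names, not part of any RDF library.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object (all URLs here)."""
    subject: str
    predicate: str
    obj: str

# The example triple from the slide, with URLs used as identifiers.
triples = [
    Triple("http://cs.ox.ac.uk/John",
           "http://cs.ox.ac.uk/studies",
           "http://cs.ox.ac.uk/CS"),
]

def objects_of(graph, subject, predicate):
    """Return every object o such that (subject, predicate, o) is in the graph."""
    return [t.obj for t in graph
            if t.subject == subject and t.predicate == predicate]

print(objects_of(triples, "http://cs.ox.ac.uk/John", "http://cs.ox.ac.uk/studies"))
```

Because identifiers are URLs, a triple in one dataset can point at a resource published in another, which is what the second Linked Data principle asks for.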
  • 14. OWL Ontologies. The Web Ontology Language (OWL) builds on RDF and Description Logics. Objects: specific resources (constants), e.g. MARIA, LEIPZIG. Classes: sets of objects (unary predicates), e.g. Student, Car, Country. Properties: connections between objects (binary predicates), e.g. hasChild, partOf. These can be combined into complex concepts (OWL class expressions), e.g. $\mathrm{Child} \sqcap \exists \mathrm{hasParent}.\mathrm{Professor}$.
  • 15. Learning OWL Class Expressions - Definition. Given: background knowledge (OWL ontologies and RDF datasets); positive and negative examples (objects in datasets). Goal: find an OWL class expression describing the positive but not the negative examples.
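To make the learning problem concrete, here is a minimal sketch that checks whether a candidate class expression covers all positive and no negative examples. The data, the tuple encoding of expressions, and the `instances` function are all assumptions made for illustration. Note also that the check reads the data under a closed-world assumption, whereas an OWL reasoner works under open-world semantics.

```python
# Hypothetical toy data: class memberships and property assertions.
classes = {
    "Child": {"anna", "ben", "carl"},
    "Professor": {"paula"},
}
properties = {
    "hasParent": {("anna", "paula"), ("ben", "peter"), ("carl", "paula")},
}
individuals = {"anna", "ben", "carl", "paula", "peter"}

def instances(expr):
    """Individuals satisfying expr (closed-world reading of the toy data)."""
    kind = expr[0]
    if kind == "top":
        return set(individuals)
    if kind == "atom":
        return set(classes.get(expr[1], set()))
    if kind == "and":
        return instances(expr[1]) & instances(expr[2])
    if kind == "exists":
        fillers = instances(expr[2])
        return {s for (s, o) in properties.get(expr[1], set()) if o in fillers}
    raise ValueError(f"unknown constructor: {kind}")

# Child ⊓ ∃hasParent.Professor, encoded as nested tuples.
candidate = ("and", ("atom", "Child"),
             ("exists", "hasParent", ("atom", "Professor")))

positives = {"anna", "carl"}
negatives = {"ben"}

covered = instances(candidate)
is_solution = positives <= covered and not (negatives & covered)
print(sorted(covered), is_solution)
```

A learner searches the space of such expressions for one where `is_solution` holds, or where coverage is as good as possible.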
  • 16. Application Example: Therapy Response Prediction. 0.5-1% of the population is affected by Rheumatoid Arthritis. Anti-TNF is not effective for several million persons for unknown reasons.
  • 17. Learning OWL Class Expressions - Approaches. Least common subsumers: Cohen et al. Computing least common subsumers in description logics. AAAI 1992. Terminological decision trees: Fanizzi et al. Induction of concepts in web ontologies through terminological decision trees. ECML PKDD 2010. Rule-based: Fanizzi et al. DL-FOIL concept learning in description logics. ILP 2008. Genetic Programming: Lehmann, Jens. Hybrid learning of ontology classes. MLDM 2007. Refinement operators: Lehmann et al. Concept learning in description logics using refinement operators. ML 2010; Iannone et al. An algorithm based on counterfactuals for concept learning in the semantic web. AI 2007.
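  • 18. Refinement Operators - Definitions. Given a DL $\mathcal{L}$, consider the quasi-ordered space $\langle \mathcal{C}(\mathcal{L}), \sqsubseteq_T \rangle$ over concepts of $\mathcal{L}$. $\rho : \mathcal{C}(\mathcal{L}) \to 2^{\mathcal{C}(\mathcal{L})}$ is a downward $\mathcal{L}$ refinement operator if for any $C \in \mathcal{C}(\mathcal{L})$: $D \in \rho(C)$ implies $D \sqsubseteq_T C$. Notation: write $C \rightsquigarrow_{\rho} D$ instead of $D \in \rho(C)$. Example refinement chain: $\mathrm{Person} \rightsquigarrow_{\rho} \mathrm{Man} \rightsquigarrow_{\rho} \mathrm{Man} \sqcap \exists \mathrm{hasChild}.\top$.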
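The downward condition ($D \in \rho(C)$ implies $D \sqsubseteq_T C$) can be checked mechanically on a toy example. The sketch below assumes a hypothetical hierarchy of atomic classes only; it does not handle complex concepts such as the last step of the chain on the slide, and `rho`, `subsumed_by` and `is_downward` are illustrative names.

```python
# Hypothetical class hierarchy: class -> direct superclass.
parent = {"Person": "Top", "Car": "Top", "Man": "Person",
          "Woman": "Person", "Father": "Man"}
concepts = set(parent) | {"Top"}

def subsumed_by(d, c):
    """True if d ⊑ c follows from the hierarchy (reflexive, transitive)."""
    while d is not None:
        if d == c:
            return True
        d = parent.get(d)
    return False

def rho(c):
    """Toy downward operator: refine a class to its direct subclasses."""
    return {d for d, p in parent.items() if p == c}

def generalise(c):
    """An operator that moves upwards, for contrast."""
    return {parent[c]} if c in parent else set()

def is_downward(op):
    """Check that every refinement of every concept is subsumed by it."""
    return all(subsumed_by(d, c) for c in concepts for d in op(c))

print(sorted(rho("Person")), is_downward(rho), is_downward(generalise))
```

In this toy setting the chain Top, Person, Man, Father plays the role of the refinement chain from the slide.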
  • 22. Learning using Refinement Operators. Search tree (figure): $\top$ (score 0.45); Car (too weak); Person (0.73); $\mathrm{Person} \sqcap \exists \mathrm{attends}.\top$ (0.78); $\mathrm{Person} \sqcap \exists \mathrm{attends}.\mathrm{Talk}$ (0.97). Start with the most general concept (top down). A heuristic evaluates each concept using the positive and negative examples. The operator specialises concepts. Continue until a termination criterion is met. Together, these steps form the learning algorithm.
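The loop described on this slide (start from the most general concept, score with a heuristic, specialise, stop on a termination criterion) can be sketched as a best-first search. This is a simplified illustration, not the algorithm from the talk: concepts are flattened to conjunctions of atomic features, the heuristic is plain accuracy, and all data and names are hypothetical.

```python
import heapq

# Hypothetical data: each individual is described by a set of atomic features.
features = {
    "alice": {"Person", "attendsTalk"},
    "bob": {"Person", "attendsTalk"},
    "carol": {"Person"},
    "kitt": {"Car"},
    "rover": {"Car", "attendsTalk"},
}
positives = {"alice", "bob"}
negatives = {"carol", "kitt", "rover"}
vocabulary = sorted({f for fs in features.values() for f in fs})

def covers(concept, individual):
    """A conjunction covers an individual that has all of its features."""
    return concept <= features[individual]

def accuracy(concept):
    tp = sum(covers(concept, i) for i in positives)
    tn = sum(not covers(concept, i) for i in negatives)
    return (tp + tn) / (len(positives) + len(negatives))

def refine(concept):
    """Downward refinement: add one more conjunct."""
    return [concept | {f} for f in vocabulary if f not in concept]

def learn(max_expansions=100):
    start = frozenset()  # most general concept, covers everything
    frontier = [(-accuracy(start), sorted(start), start)]
    seen = {start}
    best = start
    for _ in range(max_expansions):
        if not frontier:
            break
        neg_score, _, concept = heapq.heappop(frontier)
        if -neg_score > accuracy(best):
            best = concept
        if accuracy(best) == 1.0:  # termination criterion
            break
        for child in refine(concept):
            # Skip duplicates and concepts covering no positive ("too weak").
            if child in seen or not any(covers(child, p) for p in positives):
                continue
            seen.add(child)
            heapq.heappush(frontier, (-accuracy(child), sorted(child), child))
    return best

print(sorted(learn()))
```

On this toy data, the concept with the single feature `Car` is pruned as too weak, and the search specialises `Person` further until no negative example is covered.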
  • 26. Properties of Refinement Operators. An $\mathcal{L}$ downward refinement operator $\rho$ is called: Finite iff $\rho(C)$ is finite for any concept $C \in \mathcal{C}(\mathcal{L})$. Redundant iff there exist two different $\rho$ refinement chains from a concept $C$ to a concept $D$. Proper iff for $C, D \in \mathcal{C}(\mathcal{L})$, $C \rightsquigarrow_{\rho} D$ implies $C \not\equiv_T D$. Complete iff for $C, D \in \mathcal{C}(\mathcal{L})$ with $D \sqsubset_T C$ there is a concept $E$ with $E \equiv_T D$ and a refinement chain $C \rightsquigarrow_{\rho} \cdots \rightsquigarrow_{\rho} E$. Weakly complete iff for any concept $C$ with $C \sqsubset_T \top$ we can reach a concept $E$ with $E \equiv_T C$ from $\top$ by $\rho$.
  • 27. Properties of Refinement Operators. Properties indicate how suitable a refinement operator is for solving the learning problem: incomplete operators may miss solutions; redundant operators may lead to duplicate concepts in the search tree; improper operators may produce equivalent concepts (which cover the same examples); for infinite operators it may not be possible to compute all refinements of a given concept. Key question: which properties can be combined?
  • 28. Theorem: Properties of $\mathcal{L}$ Refinement Operators. The maximal sets of combinable properties of $\mathcal{L}$ refinement operators for $\mathcal{L} \in \{\mathcal{ALC}, \mathcal{ALCN}, \mathcal{SHOIN}, \mathcal{SROIQ}\}$ are: (1) {weakly complete, complete, finite}; (2) {weakly complete, complete, proper}; (3) {weakly complete, non-redundant, finite}; (4) {weakly complete, non-redundant, proper}; (5) {non-redundant, finite, proper}. Concept Learning in Description Logics Using Refinement Operators, Lehmann, Hitzler, Machine Learning journal, 2010. Foundations of Refinement Operators for Description Logics, Lehmann, Hitzler, ILP conference, 2008.
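One way to read the theorem: a set of desired properties can be achieved by a single operator exactly when it fits inside one of the five maximal sets. The short sketch below encodes the sets listed on the slide; the `combinable` function is an illustrative helper, not something from the cited papers.

```python
# The five maximal combinable property sets stated in the theorem.
MAXIMAL_SETS = [
    {"weakly complete", "complete", "finite"},
    {"weakly complete", "complete", "proper"},
    {"weakly complete", "non-redundant", "finite"},
    {"weakly complete", "non-redundant", "proper"},
    {"non-redundant", "finite", "proper"},
]

def combinable(properties):
    """True if the wanted properties fit inside one maximal set."""
    wanted = set(properties)
    return any(wanted <= maximal for maximal in MAXIMAL_SETS)

print(combinable({"complete", "finite"}))            # fits set 1
print(combinable({"complete", "finite", "proper"}))  # fits no set
print(combinable({"complete", "non-redundant"}))     # fits no set
```

In particular, the lookup shows that no operator for these logics is complete, finite and proper at once, so a learning algorithm has to compensate for whichever property its operator gives up.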
  • 29. Definition of $\rho_\downarrow$, with the base operator $\rho_B$ (excerpt):

$$
\rho_\downarrow(C) = \begin{cases}
\{\bot\} \cup \rho_\top(C) & \text{if } C = \top \\
\rho_\top(C) & \text{otherwise}
\end{cases}
$$

$$
\rho_B(C) = \begin{cases}
\emptyset & \text{if } C = \bot \\
\{C_1 \sqcup \dots \sqcup C_n \mid C_i \in M_B\ (1 \le i \le n)\} & \text{if } C = \top \\
\{A' \mid A' \in sh_\downarrow(A)\} \cup \{A \sqcap D \mid D \in \rho_B(\top)\} & \text{if } C = A\ (A \in N_C) \\
\{\neg A' \mid A' \in sh_\uparrow(A)\} \cup \{\neg A \sqcap D \mid D \in \rho_B(\top)\} & \text{if } C = \neg A\ (A \in N_C) \\
\{\exists r.E \mid A = ar(r), E \in \rho_A(D)\} \cup \{\exists r.D \sqcap E \mid E \in \rho_B(\top)\} \cup \{\exists s.D \mid s \in sh_\downarrow(r)\} & \text{if } C = \exists r.D \\
\{\forall r.E \mid A = ar(r), E \in \rho_A(D)\} \cup \{\forall r.D \sqcap E \mid E \in \rho_B(\top)\} \cup \{\forall r.\bot \mid D = A \in N_C \text{ and } sh_\downarrow(A) = \emptyset\} \cup \{\forall s.D \mid s \in sh_\downarrow(r)\} & \text{if } C = \forall r.D \\
\{C_1 \sqcap \dots \sqcap C_{i-1} \sqcap D \sqcap C_{i+1} \sqcap \dots \sqcap C_n \mid D \in \rho_B(C_i), 1 \le i \le n\} & \text{if } C = C_1 \sqcap \dots \sqcap C_n\ (n \ge 2) \\
\{C_1 \sqcup \dots \sqcup C_{i-1} \sqcup D \sqcup C_{i+1} \sqcup \dots \sqcup C_n \mid D \in \rho_B(C_i), 1 \le i \le n\} \cup \{(C_1 \sqcup \dots \sqcup C_n) \sqcap D \mid D \in \rho_B(\top)\} & \text{if } C = C_1 \sqcup \dots \sqcup C_n\ (n \ge 2)
\end{cases}
$$
  • 33. Definition of $\rho_\downarrow$, case $C = \exists r.D$: $\{\exists r.E \mid A = ar(r), E \in \rho_A(D)\} \cup \{\exists r.D \sqcap E \mid E \in \rho_B(\top)\} \cup \{\exists s.D \mid s \in sh_\downarrow(r)\}$. Examples: $\exists \mathrm{takesPartIn}.\mathrm{SocialEvent} \rightsquigarrow \exists \mathrm{takesPartIn}.\mathrm{Meeting}$; $\exists \mathrm{takesPartIn}.\mathrm{SocialEvent} \rightsquigarrow \mathrm{Student} \sqcap \exists \mathrm{takesPartIn}.\mathrm{SocialEvent}$; $\exists \mathrm{takesPartIn}.\mathrm{SocialEvent} \rightsquigarrow \exists \mathrm{leads}.\mathrm{SocialEvent}$.
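The three parts of the existential case correspond to three ways of specialising: narrow the filler, add a conjunct, or narrow the role. A rough sketch that reproduces the examples on this slide follows. It is heavily simplified: the hierarchies are hypothetical, only atomic fillers are handled, the role-range context $ar(r)$ is ignored, and a fixed list stands in for the set of refinements of the top concept.

```python
# Hypothetical ontology fragments.
sub_classes = {"SocialEvent": ["Meeting"]}    # direct subclasses of a class
sub_properties = {"takesPartIn": ["leads"]}   # direct subproperties of a role
top_refinements = ["Student"]                 # stand-in for refinements of ⊤

def refine_exists(role, filler):
    """Refinements of ∃role.filler for an atomic filler (simplified)."""
    current = f"∃{role}.{filler}"
    results = []
    # 1. Specialise the filler to a direct subclass.
    results += [f"∃{role}.{e}" for e in sub_classes.get(filler, [])]
    # 2. Add a conjunct taken from the refinements of ⊤.
    results += [f"{e} ⊓ {current}" for e in top_refinements]
    # 3. Specialise the role to a direct subproperty.
    results += [f"∃{s}.{filler}" for s in sub_properties.get(role, [])]
    return results

for refinement in refine_exists("takesPartIn", "SocialEvent"):
    print(refinement)
```

Each branch yields a concept that is subsumed by the input, so the sketch stays a downward refinement step under the assumed hierarchies.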
  • 34. Properties of $\rho_\downarrow$. $\rho_\downarrow$ is complete. $\rho_\downarrow$ is infinite, e.g. there are infinitely many refinement steps of the form $\top \rightsquigarrow_{\rho_\downarrow} C_1 \sqcup C_2 \sqcup C_3 \sqcup \dots$ The closure $\rho^{cl}_\downarrow$ is proper. $\rho_\downarrow$ is redundant: $\forall r_1.A_1 \sqcup \forall r_2.A_1$ refines both to $\forall r_1.(A_1 \sqcap A_2) \sqcup \forall r_2.A_1$ and to $\forall r_1.A_1 \sqcup \forall r_2.(A_1 \sqcap A_2)$, and each of these refines to $\forall r_1.(A_1 \sqcap A_2) \sqcup \forall r_2.(A_1 \sqcap A_2)$. "DL-Learner: Learning Concepts in Description Logics", Jens Lehmann, Journal of Machine Learning Research (JMLR), 2009.
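Redundancy means that two different refinement chains lead from one concept to another. The sketch below hard-codes the four concepts of the slide's example as strings in a hypothetical `refinements` table and counts the chains between them; it illustrates the definition and does not implement the operator itself.

```python
# The refinement steps from the redundancy example, as a small graph.
refinements = {
    "∀r1.A1 ⊔ ∀r2.A1": [
        "∀r1.(A1 ⊓ A2) ⊔ ∀r2.A1",
        "∀r1.A1 ⊔ ∀r2.(A1 ⊓ A2)",
    ],
    "∀r1.(A1 ⊓ A2) ⊔ ∀r2.A1": ["∀r1.(A1 ⊓ A2) ⊔ ∀r2.(A1 ⊓ A2)"],
    "∀r1.A1 ⊔ ∀r2.(A1 ⊓ A2)": ["∀r1.(A1 ⊓ A2) ⊔ ∀r2.(A1 ⊓ A2)"],
}

def count_chains(start, goal):
    """Number of distinct refinement chains from start to goal (acyclic graph)."""
    if start == goal:
        return 1
    return sum(count_chains(nxt, goal) for nxt in refinements.get(start, []))

chains = count_chains("∀r1.A1 ⊔ ∀r2.A1", "∀r1.(A1 ⊓ A2) ⊔ ∀r2.(A1 ⊓ A2)")
print(chains)  # more than one chain means the operator is redundant here
```

A search that does not detect such diamonds would evaluate the final concept twice, which is why a redundancy elimination step is useful.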
  • 35. Learning using Refinement Operators
    Search tree (figure): $\top$: 0,457 [01]; $\mathit{Car}$: too weak; $\mathit{Person}$: 0,7345789 [012345]; $\mathit{Person} \sqcap \exists \mathit{attends}.\top$: 0,789 [45]; $\mathit{Person} \sqcap \exists \mathit{attends}.\mathit{Talk}$: 0,97 [4]; ...
    Redundancy elimination technique with polynomial complexity w.r.t. search tree size.
    Length of children limited by expansion value, so the infinite operator becomes applicable.
    Expansion value used by the heuristic (bias towards short concepts, Occam’s Razor).
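The search idea on this slide can be illustrated with a small best-first search. The sketch below is a simplification under stated assumptions, not the DL-Learner algorithm: concepts are plain conjunctions of atomic classes, the toy individuals and examples are hypothetical, redundancy elimination is reduced to a `seen` set, and the bias towards short concepts is a fixed length penalty.

```python
import heapq

# Hypothetical toy data: each individual is described by its atomic classes.
INDIVIDUALS = {
    "anna": {"Person", "Student"},
    "bob": {"Person"},
    "carl": {"Person", "Student"},
    "car1": {"Car"},
}
POSITIVES = {"anna", "carl"}
NEGATIVES = {"bob", "car1"}
ATOMIC = ["Car", "Person", "Student"]


def covers(concept, individual):
    """A concept is a frozenset of atomic classes read as a conjunction; the empty set is top."""
    return concept <= INDIVIDUALS[individual]


def accuracy(concept):
    tp = sum(covers(concept, i) for i in POSITIVES)
    tn = sum(not covers(concept, i) for i in NEGATIVES)
    return (tp + tn) / (len(POSITIVES) + len(NEGATIVES))


def refine(concept):
    """Toy downward refinement: add one more conjunct."""
    return [concept | {a} for a in ATOMIC if a not in concept]


def too_weak(concept):
    """A concept that misses a positive example cannot be repaired by specialising it."""
    return any(not covers(concept, i) for i in POSITIVES)


def learn(length_penalty=0.05, max_steps=50):
    start = frozenset()
    # heapq is a min-heap, so scores are negated; the sorted tuple breaks ties.
    frontier = [(-accuracy(start), (), start)]
    seen = {start}  # simplified redundancy elimination
    best = start
    for _ in range(max_steps):
        if not frontier:
            break
        _, _, node = heapq.heappop(frontier)
        if accuracy(node) > accuracy(best):
            best = node
        if accuracy(best) == 1.0:
            return best
        for child in refine(node):
            if child in seen or too_weak(child):
                continue
            seen.add(child)
            score = accuracy(child) - length_penalty * len(child)
            heapq.heappush(frontier, (-score, tuple(sorted(child)), child))
    return best


print(sorted(learn()))
```

On this toy data `Car` is pruned as too weak, and the search prefers the short concept `Student` because it separates the examples without further conjuncts.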
  • 43. Scalability Refinement operator should build coherent concepts Class Expression Learning for Ontology Engineering; Jens Lehmann, Sören Auer, Lorenz Bühmann, Sebastian Tramp; Journal of Web Semantics (JWS), 2011 Jens Lehmann (AKSW, Uni Leipzig) Analysing and Linking RDF Data September 16, 2014 21 / 35
  • 44. Scalability Refinement operator should build coherent concepts Inference: Complete sound vs. approximation Open World Assumption (OWA) vs. Closed World Assumption (CWA) Class Expression Learning for Ontology Engineering; Jens Lehmann, Sören Auer, Lorenz Bühmann, Sebastian Tramp; Journal of Web Semantics (JWS), 2011 Jens Lehmann (AKSW, Uni Leipzig) Analysing and Linking RDF Data September 16, 2014 21 / 35
  • 45. Scalability Refinement operator should build coherent concepts Inference: Complete sound vs. approximation Open World Assumption (OWA) vs. Closed World Assumption (CWA) Stochastic coverage computation Pick random example ! perform instance check ! compute confidence interval (e.g. via Wald Method) wrt. objective function (e.g. F-measure) Up to 99% less instance checks in test examples Low influence on accuracy shown for 380 learning tasks using 7 ontologies (0, 2% ± 0, 4% F-measure difference) Class Expression Learning for Ontology Engineering; Jens Lehmann, Sören Auer, Lorenz Bühmann, Sebastian Tramp; Journal of Web Semantics (JWS), 2011 Jens Lehmann (AKSW, Uni Leipzig) Analysing and Linking RDF Data September 16, 2014 21 / 35
  • 46. Scalability
    Refinement operator should build coherent concepts.
    Inference: complete and sound vs. approximation; Open World Assumption (OWA) vs. Closed World Assumption (CWA).
    Stochastic coverage computation: pick random example → perform instance check → compute confidence interval (e.g. via Wald method) w.r.t. objective function (e.g. F-measure).
    Up to 99% fewer instance checks in test examples.
    Low influence on accuracy shown for 380 learning tasks using 7 ontologies (0.2% ± 0.4% F-measure difference).
    Fragment extraction for application on large knowledge bases.
    Class Expression Learning for Ontology Engineering; Jens Lehmann, Sören Auer, Lorenz Bühmann, Sebastian Tramp; Journal of Web Semantics (JWS), 2011
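The stochastic coverage loop (pick a random example, perform an instance check, update a confidence interval) can be sketched as follows. This is an illustration under assumptions: it estimates a plain coverage proportion rather than F-measure, and the stopping width, minimum sample size and the `instance_check` callback are hypothetical parameters, not values from the cited paper.

```python
import math
import random


def wald_interval(successes, n, z=1.96):
    """Wald confidence interval for a proportion estimated from n samples."""
    p = successes / n
    half = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)


def estimate_coverage(examples, instance_check, max_width=0.1, min_samples=30, seed=0):
    """Sample examples in random order and stop once the interval is narrow enough.

    Returns (estimate, (low, high), number_of_instance_checks).
    """
    order = list(examples)
    if not order:
        raise ValueError("need at least one example")
    random.Random(seed).shuffle(order)
    hits = 0
    for n, example in enumerate(order, start=1):
        hits += bool(instance_check(example))  # the expensive reasoner call in practice
        if n >= min_samples:
            low, high = wald_interval(hits, n)
            if high - low <= max_width:
                return hits / n, (low, high), n
    n = len(order)
    return hits / n, wald_interval(hits, n), n


# Usage: a cheap stand-in for an instance check over 10,000 examples.
estimate, interval, checks = estimate_coverage(range(10_000), lambda x: x % 4 == 0)
print(estimate, interval, checks)
```

The Wald interval degenerates when the observed proportion is 0 or 1, which is one reason a minimum sample size is enforced before stopping.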
  • 47. Carcinogenesis
    Goal: predict whether a substance causes cancer.
    Why: each year 1000 new substances are developed. Substances can often only be validated using time-consuming and expensive experiments with mice → prioritise those with high risk.
    Background knowledge: database of the US National Toxicology Program (NTP).
    “Obtaining accurate structural alerts for the causes of chemical cancers is a problem of great scientific and humanitarian value.” (A. Srinivasan, R.D. King, S.H. Muggleton, M.J.E. Sternberg 1997)
  • 48. Knowledge Base Enrichment
    Pattern Based Knowledge Base Enrichment; Lorenz Bühmann, Jens Lehmann; International Semantic Web Conference (ISWC) 2013
    Universal OWL Axiom Enrichment for Large Knowledge Bases; Lorenz Bühmann, Jens Lehmann; Knowledge Engineering and Knowledge Management (EKAW) 2012
  • 49. Protégé Plugin: Support for ontology creation and maintenance
  • 50. Ontology Debugging: ORE
    ORE - A Tool for Repairing and Enriching Knowledge Bases; Lehmann, Bühmann; International Semantic Web Conference (ISWC) 2010
  • 51. Data Quality Measurement: RDFUnit
    Test-driven Evaluation of Linked Data Quality; World Wide Web Conference (WWW), ACM, 2014; Dimitris Kontokostas, Patrick Westphal, Sören Auer, Sebastian Hellmann, Jens Lehmann, Roland Cornelissen, Amrapali J. Zaveri
  • 52. Robot Scientists: Adam and Eve
    Abduction to form hypotheses and 1,000 experiments per day.
    12 new scientific discoveries regarding functions of genes in yeast.
    King, Ross D., et al. "The automation of science." Science 324 (2009): 85-89.
  • 54. Link Discovery - Motivation
    Links are the backbone of the traditional WWW and the Data Web.
    Links are central for data integration, deduplication, cross-ontology question answering, reasoning, federated queries, ...
    Central problem for many large IT companies.
    Automated tools (LIMES, SILK) can create a high number of links between RDF resources by using heuristics.
  • 59. Link Discovery - Definition
    Definition (Link Discovery): Given sets $S$ and $T$ of resources and relation $R$ (often owl:sameAs).
    Common approach: Find $M = \{(s, t) \in S \times T : \delta(s, t) \le \theta\}$
    S: DBpedia, rdfs:label: African Elephant
    T: BBC Wildlife, dc:title: African Bush Elephant
    dbpedia:AfricanElephant owl:sameAs bbc:hfzw82929 ?
    $\delta$ = levenshtein(S.rdfs:label, T.dc:title)
    $\delta$(dbpedia:AfricanElephant, bbc:hfzw82929) = 5
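The elephant example can be reproduced with a short Python sketch. It is a brute-force illustration, not how LIMES or SILK are implemented: every pair in S × T is compared, the resources are given as hypothetical dictionaries from identifier to label, and the threshold value 5 is an assumption chosen so that the example pair is accepted.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard dynamic programme, keeping one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        prev = cur
    return prev[-1]


def discover_links(source, target, threshold):
    """Return all (s, t) whose label distance is at most the threshold."""
    return {
        (s, t)
        for s, s_label in source.items()
        for t, t_label in target.items()
        if levenshtein(s_label, t_label) <= threshold
    }


# Hypothetical in-memory stand-ins for the two datasets.
S = {"dbpedia:AfricanElephant": "African Elephant"}
T = {"bbc:hfzw82929": "African Bush Elephant"}

print(levenshtein("African Elephant", "African Bush Elephant"))  # 5
print(discover_links(S, T, threshold=5))
```

The distance is 5 because the two labels differ only by the inserted substring "Bush ". Comparing all pairs costs |S| · |T| distance computations, which is the cost that dedicated link discovery engines try to avoid.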
  • 60. Example: Link Specification
    $f(\text{trigrams(:name, :label)}, 0.5) \sqcup f(\text{edit(:socId, :socId)}, 0.5)$
  • 61. Link Specification Syntax and Semantics
    Syntax $LS$ and semantics $[\![LS]\!]$:
    $f(m, \theta, M)$: $\{(s, t, r) \mid (s, t, r) \in M \wedge m(s, t) \ge \theta\}$
    $LS_1 \sqcap LS_2$: $\{(s, t, r) \mid (s, t, r_1) \in [\![LS_1]\!] \wedge (s, t, r_2) \in [\![LS_2]\!] \wedge r = \min(r_1, r_2)\}$
    $LS_1 \sqcup LS_2$: the set of all $(s, t, r)$ with
    $$r = \begin{cases}
    r_1 & \text{if } \exists (s, t, r_1) \in [\![LS_1]\!] \wedge \neg(\exists r_2 : (s, t, r_2) \in [\![LS_2]\!]) \\
    r_2 & \text{if } \exists (s, t, r_2) \in [\![LS_2]\!] \wedge \neg(\exists r_1 : (s, t, r_1) \in [\![LS_1]\!]) \\
    \max(r_1, r_2) & \text{if } (s, t, r_1) \in [\![LS_1]\!] \wedge (s, t, r_2) \in [\![LS_2]\!]
    \end{cases}$$
    Syntax and semantics make it possible to define an ordering similar to subsumption (more specific specifications generate fewer links).
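The three semantic rules translate directly into operations on mappings. In the sketch below a mapping is represented as a Python dictionary from a pair (s, t) to its score r; this representation, the function names and the toy similarity measure are assumptions made for illustration.

```python
def atomic(measure, theta, mapping):
    """f(m, theta, M): keep the entries of M whose pair reaches the threshold."""
    return {(s, t): r for (s, t), r in mapping.items() if measure(s, t) >= theta}


def intersection(m1, m2):
    """LS1 ⊓ LS2: pairs present in both mappings, scored with the minimum."""
    return {pair: min(m1[pair], m2[pair]) for pair in m1.keys() & m2.keys()}


def union(m1, m2):
    """LS1 ⊔ LS2: pairs present in either mapping, scored with the maximum when in both."""
    out = dict(m1)
    for pair, r in m2.items():
        out[pair] = max(out[pair], r) if pair in out else r
    return out


# Usage with hypothetical pairs and a hypothetical similarity measure.
M = {("a", "x"): 1.0, ("b", "y"): 1.0}
sim = lambda s, t: 0.8 if (s, t) == ("a", "x") else 0.3

left = {("a", "x"): 0.9, ("b", "y"): 0.4}
right = {("a", "x"): 0.6, ("c", "z"): 0.7}

print(atomic(sim, 0.5, M))        # only ("a", "x") passes the threshold
print(intersection(left, right))  # {("a", "x"): 0.6}
print(union(left, right))
```

Intersection can only shrink a mapping and union can only grow it, which is the intuition behind the subsumption-like ordering mentioned on the slide.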
  • 62. Link Specification Refinement Operator
    $$\rho(LS) = \begin{cases}
    \{f(m_1, 1, \top) \sqcap \dots \sqcap f(m_n, 1, \top) \mid m_i \in SM,\ 1 \le i \le n,\ n \le 2^{|SM|}\} & \text{if } LS = \bot \\
    \{f(m, dt(\theta), M)\} \cup \{LS \sqcup f(m', 1, M) \mid m' \in SM,\ m \ne m'\} & \text{if } LS = f(m, \theta, M) \text{ (atomic)} \\
    \{LS_1 \sqcap \dots \sqcap LS_{i-1} \sqcap LS' \sqcap LS_{i+1} \sqcap \dots \sqcap LS_n \mid LS' \in \rho(LS_i)\} & \text{if } LS = LS_1 \sqcap \dots \sqcap LS_n\ (n \ge 2) \\
    \{LS_1 \sqcup \dots \sqcup LS_{i-1} \sqcup LS' \sqcup LS_{i+1} \sqcup \dots \sqcup LS_n \mid LS' \in \rho(LS_i)\} \cup \{LS \sqcup f(m, 1, M) \mid m \in SM,\ m \text{ not used in } LS\} & \text{if } LS = LS_1 \sqcup \dots \sqcup LS_n\ (n \ge 2)
    \end{cases}$$
    Upward refinement operator.
    Positive: weakly complete, finite.
    Negative: not complete, redundant, not proper.
  • 66. Refinement Chain Example
    Step 1: $f(\text{edit(:socId, :socId)}, 1.0)$
    Step 2: $f(\text{edit(:socId, :socId)}, 0.5)$
    Step 3: $f(\text{trigrams(:name, :label)}, 1.0) \sqcup f(\text{edit(:socId, :socId)}, 0.5)$
    Step 4: $f(\text{trigrams(:name, :label)}, 0.5) \sqcup f(\text{edit(:socId, :socId)}, 0.5)$
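The chain above can be replayed with a reduced version of the upward operator. The sketch covers only the atomic and disjunction cases, omits the mapping argument as the chain notation does, and fixes the threshold decrement at 0.5; the decrement, the class names and the restriction to threshold changes inside a disjunction are simplifying assumptions, not the operator from the talk.

```python
from dataclasses import dataclass

MEASURES = ("edit(:socId, :socId)", "trigrams(:name, :label)")
STEP = 0.5  # hypothetical threshold decrement, standing in for dt


@dataclass(frozen=True)
class Atom:
    measure: str
    theta: float

    def __str__(self) -> str:
        return f"f({self.measure}, {self.theta})"


@dataclass(frozen=True)
class Union:
    parts: tuple

    def __str__(self) -> str:
        return " ⊔ ".join(str(p) for p in self.parts)


def used(spec):
    """Measures that already occur in a specification."""
    if isinstance(spec, Atom):
        return {spec.measure}
    return set().union(*(used(p) for p in spec.parts))


def refine(spec):
    """Upward refinements: lower a threshold or add a disjunct with threshold 1."""
    out = []
    if isinstance(spec, Atom):
        if spec.theta - STEP > 0:
            out.append(Atom(spec.measure, spec.theta - STEP))
        out += [Union((Atom(m, 1.0), spec)) for m in MEASURES if m != spec.measure]
    else:
        for i, part in enumerate(spec.parts):
            for q in refine(part):
                if isinstance(q, Atom):  # keep disjunctions flat in this sketch
                    out.append(Union(spec.parts[:i] + (q,) + spec.parts[i + 1:]))
        out += [Union((Atom(m, 1.0),) + spec.parts) for m in MEASURES if m not in used(spec)]
    return out


step1 = Atom("edit(:socId, :socId)", 1.0)
step2 = Atom("edit(:socId, :socId)", 0.5)
step3 = Union((Atom("trigrams(:name, :label)", 1.0), step2))
step4 = Union((Atom("trigrams(:name, :label)", 0.5), step2))

for spec in (step1, step2, step3, step4):
    print(spec)
```

Each step of the chain is one application of `refine` to the previous step, and every step can only enlarge the set of generated links, which is what makes the operator an upward one.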
  • 67. Projects: DL-Learner and LIMES
    DL-Learner: open-source project (http://dl-learner.org); extensible platform for concept learning algorithms; supports all RDF/OWL serialisations and major reasoners; several thousand downloads.
    LIMES (http://aksw.org/Projects/LIMES.html): highly scalable engine (fastest RDF link discovery tool); several machine learning approaches integrated (including the one presented).
    “DL-Learner: Learning Concepts in Description Logics”, Jens Lehmann, Journal of Machine Learning Research (JMLR), 2009
  • 68. Summary and Conclusions
    Many interesting applications of structured machine learning (therapy response prediction, disease prediction, protein folding, data quality measurement, ontology debugging).
    Still few machine learning tools for working with RDF/OWL, although more and more data is available.
    Refinement operators make it possible to apply supervised machine learning to complex background knowledge.
    They can also be applied to other languages, such as link specifications.