Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Integrating Relational Databases with the Semantic Web: A Reflection
Juan F. Sequeda
Joint work with Daniel P. Miranker (UT Austin) and Marcelo Arenas (PUC Chile)
Thanks to: Oscar Corcho, Aibo Tian, Mayank Kejriwal, Hamid Tirmizi
13th Reasoning Web Summer School (RW 2017), July 7 to 11, 2017, London, UK
Take-away message of this talk
• Reflect on 10 years of (our) research on Integrating Relational Databases with the Semantic Web
  – DISCLAIMER: This is NOT a survey
  – W3C Relational Database to RDF Standards (Science vs Engineering)
• Provide an answer to the research question: How and to what extent can Relational Databases be integrated with the Semantic Web?
• Thesis: Much of the existing Relational Database infrastructure can be reused to support the Semantic Web
Let's put History in Today's Context
[Figure: timeline running 1970s, 1980s, 1990s, 2000s, Today, relating Data and Logic. Labels on the slide: Relational Algebra, RDBMS, Views, Triggers, SQL99 Recursion; Semantic Networks, KL-ONE, Description Logic, RDF, OWL, Semantic Web; Workshop on Logic and Data Bases, Toulouse 1977 (Gallaire, Nicolas & Minker); Workshops on Expert Systems; Deductive Databases; Japanese 5th Generation Project; MCC, Austin, TX; KRDB.]
What is the relationship between Relational Databases and the Semantic Web?
[Figure: along a TIME axis, Relational Databases (Relational Model, Table Definition, Constraints, Triggers, SQL) are set against the Semantic Web (RDF, RDFS, OWL, Rules, SPARQL). Example graph: node 2 has type Programmer and name "Bob" (a Literal); Programmer is subClassOf ITEmployee.]
SELECT ?s ?n {
  ?s type ITEmployee .
  ?s name ?n
}
Sequeda et al. SQL Databases are a Moving Target. W3C Workshop on RDF Access on RDB. 2007
Once upon a time …
• D2R (Map, Q, Server), Virtuoso RDF Views, SquirrelRDF, R2D2, Relational.OWL, DB2OWL, R2O, Triplify, Dartgrid, RDBToOnto, METAmorphoses, …
https://www.w3.org/2007/03/RdfRDB/
[Figure: timeline from March 2008 to February 2009, with the F2F Meeting at ISWC 2008 in October 2008]
1. Recommendation to standardize a mapping language
   http://www.w3.org/2005/Incubator/rdb2rdf/XGR-rdb2rdf-20090126/
2. RDB2RDF Survey
   http://www.w3.org/2005/Incubator/rdb2rdf/RDB2RDF_SurveyReport.pdf
[Figure: timeline from Sept 2009 to Sept 2012]
[Figure: chart with a monthly axis from Sep-09 to Oct-12 and a vertical axis from 0 to 250, annotated with milestones: First F2F @Semtech 2010; FPWD R2RML; FPWD DM; WD R2RML + DM (repeated); Candidate Rec R2RML + DM; Proposed Rec R2RML + DM; Rec R2RML + DM. Photo from cygri http://www.flickr.com/photos/cygri/4719458268/]
W3C Relational Database to RDF (RDB2RDF) Standards
• Tools: Ultrawrap, Morph, ontop, …
• Ontology Based Data Access (OBDA)
Outline
• 9:00 – 10:30
  – Intro
  – Relational Database → Semantic Web: Data Mapping
• 10:30 – 11:00
  – Coffee Break
• 11:00 – 12:30
  – Semantic Web → Relational Database: Data Access
  – Conclusion
RELATIONAL DATABASES → SEMANTIC WEB: DATA MAPPING
W3C Direct Mapping Overview
Relational Database → Direct Mapping Engine → RDF
Input: Database (Schema and Data), Primary Keys, Foreign Keys
Output: RDF graph
W3C Direct Mapping
Input
• Relational Schema
• Primary Keys PK and Foreign Keys FK over R
• Relational Data
Output
• RDF graph

Order
orderid  date        total  currency  status
1234     2017-07-07  100    USD       1

LineItem
lineid  price  quantity  product     orderid
6789    30     2         Shoes Foo   1234
6790    20     2         Tshirt Bar  1234

[Figure: the resulting RDF graph. <Order/orderid=1234> carries Order#orderid 1234, Order#date 2017-07-07, Order#total 100, Order#currency USD and Order#status 1. <LineItem/lineid=6789> carries LineItem#lineid 6789, LineItem#price 30, LineItem#quantity 2, LineItem#product "Shoes Foo" and LineItem#orderid 1234; <LineItem/lineid=6790> carries the corresponding values 6790, 20, 2, "Tshirt Bar" and 1234. Both line items link to <Order/orderid=1234> through LineItem#ref-orderid.]
What do we need to automatically generate?
• Generate Identifiers
  – IRI
  – Blank Nodes
• Generate Triples
  – Table
  – Literal
  – Reference
Generating Identifiers
• Identifiers for rows, tables, columns and foreign keys
• If a table has a primary key,
  – then the row identifier will be an IRI,
  – otherwise a blank node
• The identifiers for tables, columns and foreign keys are IRIs
• IRIs are generated by appending to a given base IRI
• All strings are percent-encoded
Row Node
1) <http://www.ex.com/Person/ID=1>
   Base IRI + “Table Name”/“PK attr”=“PK value”
2) <http://www.ex.com/Person/ID=1;SID=123>
   Base IRI + “Table Name”/“PK attr”=“PK value”;“PK attr”=“PK value” (composite primary key)
3) Fresh Blank Node (table without a primary key)
More IRI
1) <http://www.ex.com/Person>
   Base IRI + “Table Name”
2) <http://www.ex.com/Person#NAME>
   Base IRI + “Table Name”#“Attribute”
3) <http://www.ex.com/Person#ref-CID>
   Base IRI + “Table Name”#ref-“Attribute”
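The identifier patterns above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the blank-node labels and the use of `urllib.parse.quote` for percent-encoding are choices made for this sketch, not part of the W3C specification text. The base IRI is the one used on the slides.

```python
import itertools
from urllib.parse import quote

BASE = "http://www.ex.com/"   # base IRI from the slide examples
_blank_ids = itertools.count(1)  # hypothetical source of fresh blank-node labels


def enc(value):
    """Percent-encode a table name, attribute name or value (assumption: encode everything non-unreserved)."""
    return quote(str(value), safe="")


def table_iri(table):
    return BASE + enc(table)


def attribute_iri(table, attr):
    return f"{table_iri(table)}#{enc(attr)}"


def reference_iri(table, attr):
    return f"{table_iri(table)}#ref-{enc(attr)}"


def row_node(table, pk_attrs, row):
    """IRI if the table has a primary key, otherwise a fresh blank node."""
    if not pk_attrs:
        return f"_:b{next(_blank_ids)}"
    key = ";".join(f"{enc(a)}={enc(row[a])}" for a in pk_attrs)
    return f"{table_iri(table)}/{key}"


print(row_node("Person", ["ID"], {"ID": 1}))
print(row_node("Person", ["ID", "SID"], {"ID": 1, "SID": 123}))
print(attribute_iri("Person", "NAME"))
print(reference_iri("Person", "CID"))
```

Keeping the encoding in one helper makes it easy to swap in a stricter encoding rule without touching the IRI patterns themselves.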
Table Triple
Person
ID (pk)  NAME   AGE
1        Alice  25
2        Bob    NULL

<http://www.ex.com/Person/ID=1> rdf:type <http://www.ex.com/Person> .
Literal Triples
Person
ID (pk)  NAME   AGE
1        Alice  25
2        Bob    NULL

<http://www.ex.com/Person/ID=1> <http://www.ex.com/Person#NAME> “Alice” .
Reference Triples
Person
ID (pk)  NAME   AGE   CID (fk)
1        Alice  25    100
2        Bob    NULL  200

City
CID (pk)  TITLE
100       Austin
200       Madrid

<http://www.ex.com/Person/ID=1> <http://www.ex.com/Person#ref-CID> <http://www.ex.com/City/CID=100> .
Direct Mapping Result
Person
ID  NAME   AGE   CID
1   Alice  25    100
2   Bob    NULL  100

City
CID  NAME
100  Austin
200  Madrid

Resulting graph (as drawn on the slide; no triple is generated for the NULL age):
<Person/ID=1> <Person#NAME> “Alice”
<Person/ID=1> <Person#AGE> 25
<Person/ID=1> <Person#ref-CID> <City/CID=100>
<Person/ID=2> <Person#NAME> “Bob”
<Person/ID=2> <Person#ref-CID> <City/CID=100>
<City/CID=100> <City#NAME> “Austin”
<City/CID=200> <City#NAME> “Madrid”
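As a rough sketch, the three kinds of triples (table, literal, reference) can be generated with one pass over the rows. The dictionary layout of `TABLES` and the function name `direct_mapping` are assumptions made for this example; it also assumes every table has a primary key and every foreign key is a single column, and it skips percent-encoding for brevity. Unlike the slide figure, it also emits the rdf:type triples and the literal triples for the key columns.

```python
BASE = "http://www.ex.com/"

# Hypothetical in-memory encoding of the Person/City example.
TABLES = {
    "Person": {
        "pk": ["ID"],
        "fks": {"CID": ("City", "CID")},  # column -> (referenced table, referenced column)
        "rows": [
            {"ID": 1, "NAME": "Alice", "AGE": 25, "CID": 100},
            {"ID": 2, "NAME": "Bob", "AGE": None, "CID": 100},
        ],
    },
    "City": {
        "pk": ["CID"],
        "fks": {},
        "rows": [
            {"CID": 100, "NAME": "Austin"},
            {"CID": 200, "NAME": "Madrid"},
        ],
    },
}


def row_iri(table, row):
    pk = TABLES[table]["pk"]
    return BASE + table + "/" + ";".join(f"{a}={row[a]}" for a in pk)


def direct_mapping(tables):
    triples = []
    for name, table in tables.items():
        for row in table["rows"]:
            subject = row_iri(name, row)
            # Table triple: the row is an instance of its table.
            triples.append((subject, "rdf:type", BASE + name))
            for attr, value in row.items():
                if value is None:
                    continue  # NULL: no triple is generated
                # Literal triple for the column value.
                triples.append((subject, f"{BASE}{name}#{attr}", value))
                if attr in table["fks"]:
                    # Reference triple pointing at the referenced row.
                    target, target_attr = table["fks"][attr]
                    target_iri = f"{BASE}{target}/{target_attr}={value}"
                    triples.append((subject, f"{BASE}{name}#ref-{attr}", target_iri))
    return triples


triples = direct_mapping(TABLES)
for t in triples:
    print(t)
```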
Summary of W3C Direct Mapping
• Default and Automatic Mapping
• URIs are automatically generated
  – <table>
  – <table#attribute>
  – <table#ref-attribute>
  – <Table/pkAttr=pkValue>
• RDF represents the same relational schema
• RDF can be transformed by SPARQL CONSTRUCT
  – RDF represents the structure and ontology of mapping author's choice
Issues with the W3C Direct Mapping
1. Mapping is only from Relational data to RDF data.
   – The relational schema is not taken into account. Hence no relational schema to OWL
2. Semantics is not defined for NULL values
   – “The direct mapping does not generate triples for NULL values. Note that it is not known how to relate the behavior of the obtained RDF graph with the standard SQL semantics of the NULL values of the source RDB.”
Research Problem with Direct Mapping
• How can a relational database schema and data, including nulls, be automatically mapped to RDF and OWL?
• How can we assure correctness of the mapping?
  – Information Preservation: no information is lost
  – Query Preservation: no queries are lost
  – Monotonicity: inserts do not alter what has already been mapped
  – Semantics Preservation: constraints are not lost
Hypothesis: Relational Databases can be automatically mapped to RDF and OWL under a correct mapping
Direct Mapping as Ontology Overview
Input
• Relational Schema R
• Set Σ of Primary Keys PK and Foreign Keys FK over R
• Instance I of R
Output
• RDF graph
• OWL Ontology as RDF graph

[Figure: the Order and LineItem tables (orderid 1234; lineid 6789 and 6790) and their RDF graph from the W3C Direct Mapping example, now extended with the ontology: <Order> and <LineItem> are each rdf:type owl:Class, and LineItem#ref-orderid relates the two.]

We need to be careful about two issues
• Binary Relations
• NULLs
NULLs
• What should we do with NULLs?
  – Generate a Blank Node
  – Don't generate a triple
• How do we reconstruct the NULL?

LineItem
lineid  product     comment
6789    Shoes Foo   “…”
6790    Tshirt Bar  NULL

[Figure: the two options drawn as graphs. Labels shown: LineItem/lineid=6789 with value “…”, a blank node _:a, and the property pr:title.]
Direct Mapping
Input: A relational schema R, a set Σ of primary keys and foreign keys, and a database instance I of this schema
Output: An RDF Graph
Definition:
A direct mapping M is a total function from the set of all (R, Σ, I) to the set of all RDF graphs
Direct Mapping RDB to RDF and OWL
[Figure: pipeline from (R, Σ, I) to OWL and RDF, reconstructed from the slide labels]
1. Predicates to store (R, Σ, I): Rel(r), Attr(a, r), PKn(a1, …, an, r), Value(v, a, t, r), …
2. Datalog rules to generate O from R, Σ, e.g. Class(X) ← Rel(X), ¬IsBinRel(X)
3. Predicates to store the Ontology O: Class(C), DtP(p, C), ObjP(p, S, T), …
4. Datalog rules to generate OWL from O, e.g. Triple(U, "rdf:type", "owl:Class") ← Class(R), ClassIRI(R, U)
5. Datalog rules to generate RDF from O and I: Triple(s, p, o) ← …
On Directly Mapping Relational Databases to RDF and OWL. Sequeda, Arenas, Miranker. WWW 2012
Input: Relational Schema
• Rel(r):
  – Rel(Order)
• Attr(a, r):
  – Attr(total, Order)
• PKn(a1, …, an, r):
  – PK1(orderid, Order)
• FKn(a1, …, an, r, b1, …, bn, s):
  – FK1(orderid, LineItem, orderid, Order)

Order
orderid  date        total  currency  status
1234     2017-07-07  100    USD       1

LineItem
lineid  price  quantity  product     orderid
6789    30     2         Shoes Foo   1234
6790    20     2         Tshirt Bar  1234
Input: Relational Schema
• Value(v, a, t, r)
  – Value(1234, orderid, t1, Order)
  – Value(2017-07-07, date, t1, Order)
  – Value(100, total, t1, Order)

Order
orderid  date        total  currency  status
1234     2017-07-07  100    USD       1
Mapping to OWL
Triple(http://ex.org/Order, rdf:type, owl:Class)
Triple(U, "rdf:type", "owl:Class") ← Class(R), ClassIRI(R, U)
ClassIRI(R, X) ← Class(R), Concat2(base, R, X)
Class(X) ← Rel(X), ¬IsBinRel(X)
IsBinRel(X) ← BinRel(X, A, B, S, C, T, D)
BinRel(R, A, B, S, C, T, D) ←
    PK2(A, B, R), ¬ThreeAttr(R), FK1(A, R, C, S), R ≠ S, FK1(B, R, D, T), R ≠ T,
    ¬TwoFK(A, R), ¬TwoFK(B, R), ¬OneFK(A, B, R), ¬FKTo(R)
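The Class and BinRel rules can be approximated procedurally. The sketch below is a reading of the rules, not the authors' implementation: the helper predicates ThreeAttr, TwoFK, OneFK and FKTo are not defined on the slides, so their interpretation here (a third attribute, a second foreign key on the same column, a composite foreign key over both columns, an incoming foreign key) is an assumption inferred from their names. The Student, Course and Enrolled relations are hypothetical additions to the Order/LineItem example so that one binary relation is present.

```python
BASE = "http://ex.org/"

# Hypothetical schema encoding: attributes, primary key, outgoing foreign keys.
SCHEMA = {
    "Order": {"attrs": ["orderid", "date", "total", "currency", "status"],
              "pk": ["orderid"], "fks": []},
    "LineItem": {"attrs": ["lineid", "price", "quantity", "product", "orderid"],
                 "pk": ["lineid"],
                 "fks": [{"cols": ["orderid"], "target": "Order"}]},
    "Student": {"attrs": ["sid", "name"], "pk": ["sid"], "fks": []},
    "Course": {"attrs": ["cid", "title"], "pk": ["cid"], "fks": []},
    "Enrolled": {"attrs": ["sid", "cid"], "pk": ["sid", "cid"],
                 "fks": [{"cols": ["sid"], "target": "Student"},
                         {"cols": ["cid"], "target": "Course"}]},
}


def is_bin_rel(r, schema):
    rel = schema[r]
    # PK2(A, B, R) and not ThreeAttr(R): exactly two attributes, both in the key.
    if len(rel["pk"]) != 2 or set(rel["attrs"]) != set(rel["pk"]):
        return False
    # not OneFK(A, B, R): no composite foreign key over both columns (assumed reading).
    if any(len(fk["cols"]) > 1 for fk in rel["fks"]):
        return False
    for col in rel["pk"]:
        outgoing = [fk for fk in rel["fks"] if fk["cols"] == [col]]
        # FK1(col, R, _, S) with R != S, and not TwoFK(col, R).
        if len(outgoing) != 1 or outgoing[0]["target"] == r:
            return False
    # not FKTo(R): no relation has a foreign key pointing at R.
    return not any(fk["target"] == r
                   for other in schema.values() for fk in other["fks"])


# Class(X) <- Rel(X), not IsBinRel(X)
classes = {r for r in SCHEMA if not is_bin_rel(r, SCHEMA)}
# Triple(U, "rdf:type", "owl:Class") <- Class(R), ClassIRI(R, U)
triples = sorted((BASE + r, "rdf:type", "owl:Class") for r in classes)
print(triples)
```

Under this reading, Enrolled is treated as a binary relation and therefore does not become a class, while the other four relations do.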
Generating IRIs for Tuples
• Generate IRIs for the tuples of the relations having a primary key
• Generate blank nodes for the tuples of the relations not having a primary key
• Generate an identifier X of a tuple T of a relation R, which is an IRI if R has a primary key, or a blank node otherwise
Mapping to RDF: Table Triples
Table triples: for each relation, store the tuples that belong to it
Triple(http://ex.org/Order#orderid=1234, rdf:type, http://ex.org/Order)
Mapping to RDF: Literal Triples
Literal triples: for each tuple, store the values in each of its attributes
Triple(http://ex.org/person#ssn=123, http://ex.org/person#name, “Juan”)
Generate for every tuple t in a relation R and for every attribute A of R, a triple storing the value of t in A, which is called a literal triple.
Mapping to RDF: Reference Triples
Reference triples: store the references generated by the FKs
Triple(http://ex.org/student#id=3,
       http://ex.org/student,person#ssn,ssn,
       http://ex.org/person#ssn=123)
• Construct reference triples for object properties that are generated from binary relations
• Construct reference triples for object properties that are generated from foreign keys
Information Preservation
Theorem: The Direct Mapping DM is information preserving
Proof: Provide a computable mapping DM⁻ such that DM⁻(DM(R, Σ, I)) = I
[Figure: (R, Σ, I) is mapped to DM(R, Σ, I), and DM⁻ maps the result back to I.]
Query Preservation
[Figure: a query Q evaluated over the instance, eval(Q, I), next to a query Q* evaluated over the mapped graph, eval(Q*, DM(R, Σ, I)).]
Relational Algebra tuples vs. SPARQL mappings
person
ssn  name    age
789  Daniel  NULL

For this tuple t:
t.ssn = 789
t.name = Daniel
t.age = NULL
Then, tr(t) = μ:
• Domain of μ is {?ssn, ?name}
• μ(?ssn) = 789
• μ(?name) = Daniel
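The translation tr can be written as a one-line function. This is a minimal sketch: representing a tuple as a Python dict with `None` for NULL, and a SPARQL mapping as a dict keyed by variable name, are assumptions of the example.

```python
def tr(t):
    """Translate a relational tuple into a SPARQL solution mapping.

    Attributes holding NULL (None) are left out of the mapping's domain,
    so the variable stays unbound instead of being bound to a null value.
    """
    return {"?" + attr: value for attr, value in t.items() if value is not None}


t = {"ssn": 789, "name": "Daniel", "age": None}
mu = tr(t)
print(sorted(mu))  # the domain of the mapping
```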
Query Preservation
Theorem: The Direct Mapping is query preserving
tr(eval(Q, I)) = eval(Q*, DM(R, Σ, I))
Proof: By induction on the structure of Q
Bottom-up algorithm for translating Q into Q*
Monotonicity
Theorem: The Direct Mapping DM is monotone
If I1 ⊆ I2, then DM(R, Σ, I1) ⊆ DM(R, Σ, I2)
Proof: All negative atoms in the Datalog rules refer to the schema, where the schema is fixed
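A toy check of the monotonicity property, restricted to literal triples for a single relation. The function `literal_triples` is a stand-in written for this example, not the full DM; it shows only that each triple depends on one row, so adding rows can never retract a triple.

```python
def literal_triples(rows):
    """Literal triples for person(ssn, name, age), with ssn as the primary key."""
    return {
        (f"person/ssn={row['ssn']}", f"person#{attr}", value)
        for row in rows
        for attr, value in row.items()
        if value is not None  # NULLs generate no triple
    }


i1 = [{"ssn": 123, "name": "Juan", "age": None}]
i2 = i1 + [{"ssn": 789, "name": "Daniel", "age": None}]  # I1 is a subset of I2

g1, g2 = literal_triples(i1), literal_triples(i2)
print(g1 <= g2)  # subset test on the two graphs
```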
Semantics Preservation
• If I satisfies Σ, then DM(R, Σ, I) is consistent under OWL semantics
• If I does not satisfy Σ, then DM(R, Σ, I) is not consistent under OWL semantics

Counterexample (ssn is the PK):
person
ssn  name
123  Juan
123  Marcelo

[Figure: DM(R, Σ, I) contains a single node #ssn=123 with the names Juan and Marcelo and the person#ssn value 123.]
I does not satisfy Σ; however, DM(R, Σ, I) is consistent under OWL semantics
Proposition: The direct mapping DM is not semantics preserving.
Theorem: No monotone direct mapping is semantics preserving
Extending DM for Semantics Preservation
• Family of Datalog rules to determine violation
  – Primary Keys
  – Foreign Keys
  – Create artificial triple that will generate contradiction
• Non-monotone direct mapping
• Information Preserving
• Query Preserving
• Semantics Preserving
Reflection 1
• We studied how Relational Databases can be automatically and correctly mapped to the Semantic Web
  – HOW: Defined a Direct Mapping using Datalog rules
  – EXTENT: Information and Query Preserving. Monotonicity is an obstacle for Semantics Preservation
Recall the Hypothesis: Relational Databases can be automatically mapped to RDF and OWL under a correct mapping
Information Preserving, Query Preserving and Monotone
or
Information, Query and Semantics Preserving
W3C R2RML
Relational Database + OWL Ontologies (e.g. FOAF, etc.) + R2RML File → R2RML Mapping Engine → RDF
Input
• Database (schema and data)
• Target Ontologies
• Mappings between the Database and Target Ontologies in R2RML
Output
• RDF graph
Direct Mapping helps to “bootstrap”
Direct Mapping as R2RML
[Figure: the Person and City tables and the graph from the “Direct Mapping Result” slide, repeated.]
How can this be represented as R2RML?
Direct Mapping as R2RML
@prefix rr: <http://www.w3.org/ns/r2rml#> .
<TriplesMap1>
    a rr:TriplesMap ;
    rr:logicalTable [ rr:tableName "Person" ] ;
    rr:subjectMap [
        rr:template "http://www.ex.com/Person/ID={ID}" ;
        rr:class <http://www.ex.com/Person>
    ] ;
    rr:predicateObjectMap [
        rr:predicate <http://www.ex.com/Person#NAME> ;
        rr:objectMap [ rr:column "NAME" ]
    ] .
Direct Mapping as R2RML: the three parts
• Logical Table: What is being mapped?
• SubjectMap: How to generate the Subject?
• PredicateObjectMap: How to generate the Predicate and Object?
Logical Table
• What is being mapped? Here, rr:logicalTable [ rr:tableName "Person" ]
Subject URI Template
• rr:template generates the Subject URI
• rr:class generates <Subject URI> rdf:type <Class URI>
Predicate URI Constant
• rr:predicate gives the Predicate URI as a constant
Object Column Value
• rr:objectMap [ rr:column "NAME" ] gives the Object Literal from the column value
“Ugly” vs “Cool” URIs
Ugly                               Cool
<http://www.ex.com/Person/ID=1>    <http://www.ex.com/Person/1>
<http://www.ex.com/Person#NAME>    foaf:name
<http://www.ex.com/Person>         foaf:Person
Customization
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<TriplesMap1>
    a rr:TriplesMap ;
    rr:logicalTable [ rr:tableName "Person" ] ;
    rr:subjectMap [
        rr:template "http://www.ex.com/Person/{ID}" ;   # Customized Subject URI
        rr:class foaf:Person                            # Customized Class
    ] ;
    rr:predicateObjectMap [
        rr:predicate foaf:name ;                        # Customized Property
        rr:objectMap [ rr:column "NAME" ]
    ] .
What if …
Person
ID  NAME   GENDER
1   Alice  F
2   Bob    M

[Figure: desired output: <Person/1> foaf:name “Alice” and <Person/1> rdf:type <Woman>]
R2RML View:
SELECT ID, NAME
FROM Person
WHERE GENDER = 'F'
R2RML View: query instead of table
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<TriplesMap1>
    a rr:TriplesMap ;
    rr:logicalTable [ rr:sqlQuery """
        SELECT ID, NAME
        FROM Person WHERE GENDER = 'F' """ ] ;
    rr:subjectMap [
        rr:template "http://www.ex.com/Person/{ID}" ;
        rr:class <http://www.ex.com/Woman>
    ] ;
    rr:predicateObjectMap [
        rr:predicate foaf:name ;
        rr:objectMap [ rr:column "NAME" ]
    ] .
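To make the effect of an R2RML View concrete, the following sketch runs the view's SQL query against an in-memory SQLite database and builds the triples by hand. It is not an R2RML processor: the constants and the function `apply_view_mapping` are assumptions of this example, and the subject template, class and predicate are hard-coded from the mapping shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (ID INTEGER PRIMARY KEY, NAME TEXT, GENDER TEXT)")
conn.executemany("INSERT INTO Person VALUES (?, ?, ?)",
                 [(1, "Alice", "F"), (2, "Bob", "M")])

VIEW_SQL = "SELECT ID, NAME FROM Person WHERE GENDER = 'F'"   # the rr:sqlQuery
SUBJECT_TEMPLATE = "http://www.ex.com/Person/{ID}"            # the rr:template
CLASS_IRI = "http://www.ex.com/Woman"                         # the rr:class
NAME_PREDICATE = "http://xmlns.com/foaf/0.1/name"             # foaf:name


def apply_view_mapping(connection):
    triples = []
    # Each row of the query result plays the role of a logical table row.
    for person_id, name in connection.execute(VIEW_SQL):
        subject = SUBJECT_TEMPLATE.format(ID=person_id)
        triples.append((subject, "rdf:type", CLASS_IRI))
        if name is not None:  # a NULL column value yields no triple
            triples.append((subject, NAME_PREDICATE, name))
    return triples


triples = apply_view_mapping(conn)
for t in triples:
    print(t)
```

Only Alice satisfies the WHERE clause, so Bob contributes no triples at all; the filtering happens in SQL, before any RDF term is generated.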
Quick Overview of R2RML
• Manual and Customizable Language
• Learning Curve
• Direct Mapping bootstraps R2RML
• RDF represents the structure and ontology of mapping author's choice
W3C R2RML Details
• Logical Tables: What is being mapped
• Term Maps: How to create RDF terms
• How to create Triples from a table
• How to create Triples between two tables
• Languages
• Datatypes
R2RML Mapping
[Figure: Input Database → Logical Table → R2RML Mapping]
• Logical Table = existing table or view in database
• R2RML View = SQL Query
R2RML Mapping
Student
sid  name    pid
1    Juan    100
2    Martin  200

Professor
pid  name
100  Dan
200  Marcelo

Applying the R2RML Mapping yields:
ex:Student1 rdf:type ex:Student .
ex:Student2 rdf:type ex:Student .
ex:Professor100 rdf:type ex:Professor .
ex:Professor200 rdf:type ex:Professor .
ex:Student1 foaf:name “Juan” .
…
R2RML Mapping
• An R2RML Mapping M consists of a finite set of TriplesMaps.
• Each TriplesMap TM in this set consists of a tuple (LT, SM, POM)
  – LT: LogicalTable
  – SM: SubjectMap
  – POM: set of PredicateObjectMaps
• Each PredicateObjectMap in POM consists of a pair (PM, OM)*
  – PM: PredicateMap
  – OM: ObjectMap
* For simplicity
R2RML Mapping
• An R2RML Mapping is represented as an RDF Graph itself.
• Associated RDFS schema
  – http://www.w3.org/ns/r2rml
• Turtle is the recommended syntax
LogicalTable
• Tabular data mapped to RDF
  – rr:logicalTable
1. Existing Relational table or view
  – rr:tableName
2. R2RML (SQL) View
  – rr:sqlQuery
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<TriplesMap1>
    a rr:TriplesMap ;
    rr:logicalTable [ rr:tableName "Person" ] ;
    rr:subjectMap [
        rr:template "http://www.ex.com/Person/{ID}" ;
        rr:class foaf:Person
    ] ;
    rr:predicateObjectMap [
        rr:predicate foaf:name ;
        rr:objectMap [ rr:column "NAME" ]
    ] .
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<TriplesMap1>
    a rr:TriplesMap ;
    rr:logicalTable [ rr:sqlQuery """
        SELECT ID, NAME
        FROM Person WHERE GENDER = 'F' """ ] ;
    rr:subjectMap [
        rr:template "http://www.ex.com/Person/{ID}" ;
        rr:class <http://www.ex.com/Woman>
    ] ;
    rr:predicateObjectMap [
        rr:predicate foaf:name ;
        rr:objectMap [ rr:column "NAME" ]
    ] .
How to create RDF terms that define RDF Triples?
• RDF term is either an IRI, a blank node, or a literal
• Answer
  1. Constant Value
  2. Value in the database
     a. Raw Value in a Column
     b. Column Value applied to a template
TermMap
• A TermMap is a function that generates an RDF Term from a logical table row.
• An RDF Term is either an IRI, a Blank Node, or a Literal
[Figure: Logical Table Row → TermMap → RDF Term (IRI, Bnode or Literal)]
TermMap
• A TermMap must be exactly one of the following
  – Constant-valued TermMap
  – Column-valued TermMap
  – Template-valued TermMap
• If TermMaps are used to create S, P, O, then
  – 3 ways to create a subject
  – 3 ways to create a predicate
  – 3 ways to create an object
How many ways to create a Triple?
[Figure: a tree of all combinations]
• Subject: Stemplate, Sconstant or Scolumn
  – for each subject, Predicate: Ptemplate, Pconstant or Pcolumn
    · for each predicate, Object: Otemplate, Oconstant or Ocolumn
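The tree on this slide is the Cartesian product of three choices for each of the three positions. A short enumeration, with labels following the slide's naming, confirms the count:

```python
from itertools import product

KINDS = ("template", "constant", "column")  # the three kinds of TermMap

# One entry per way of building a triple: (subject kind, predicate kind, object kind).
combos = [("S" + s, "P" + p, "O" + o) for s, p, o in product(KINDS, repeat=3)]

print(len(combos))
print(combos[0])
```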
Constant-valued TermMap
• A TermMap that ignores the logical table row and always generates the same RDF term
• rr:constant
• Commonly used to generate constant IRIs as the predicate
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<TriplesMap1>
    a rr:TriplesMap ;
    rr:logicalTable [ rr:tableName "Person" ] ;
    rr:subjectMap [
        rr:template "http://www.ex.com/Person/{ID}" ;
        rr:class foaf:Person
    ] ;
    rr:predicateObjectMap [
        rr:predicateMap [ rr:constant foaf:name ] ;   # constant-valued TermMap
        rr:objectMap [ rr:column "NAME" ]
    ] .
Column-valued TermMap
• A TermMap that generates its RDF term from the value of a named column in a logical table row
• rr:column
• Commonly used to generate Literals as the object
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<TriplesMap1>
    a rr:TriplesMap ;
    rr:logicalTable [ rr:tableName "Person" ] ;
    rr:subjectMap [
        rr:template "http://www.ex.com/Person/{ID}" ;
        rr:class foaf:Person
    ] ;
    rr:predicateObjectMap [
        rr:predicateMap [ rr:constant foaf:name ] ;
        rr:objectMap [ rr:column "NAME" ]             # column-valued TermMap
    ] .
Template-valued TermMap
• A TermMap that maps the column values of a set of column names to a string template.
• A string template is a format that can be used to build strings from multiple components.
• rr:template
• Commonly used to generate IRIs as the subject or concatenate different attributes
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
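To make the idea of a string template concrete, here is a minimal Python sketch of how a template such as "http://www.ex.com/Person/{ID}" could be filled from one logical table row. The function name `expand_template` and the `iri_safe` flag are hypothetical, and the percent-encoding via `urllib.parse.quote` is only an approximation of the IRI-safe encoding an R2RML processor applies; backslash escapes in templates are not handled.

```python
import re
from urllib.parse import quote

# Matches a {COLUMN} placeholder inside a template string.
_PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def expand_template(template, row, iri_safe=True):
    """Fill each {COLUMN} placeholder with the row's value.

    Returns None when any referenced column is NULL (None), mirroring
    the rule that a NULL value generates no RDF term.
    """
    values = {}
    for column in _PLACEHOLDER.findall(template):
        value = row.get(column)
        if value is None:
            return None
        text = str(value)
        # Approximation: percent-encode everything except unreserved ASCII.
        values[column] = quote(text, safe="") if iri_safe else text
    return _PLACEHOLDER.sub(lambda m: values[m.group(1)], template)


# Illustrative row; the column names follow the slides, the values are made up.
row = {"ID": 7, "FIRST_NAME": "Juan", "LAST_NAME": "Sequeda"}
subject_iri = expand_template("http://www.ex.com/Person/{ID}", row)
full_name = expand_template("{FIRST_NAME} {LAST_NAME}", row, iri_safe=False)
```

The same function covers both common uses named above: building a subject IRI and concatenating attributes into a literal value.
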
Commonly used…
• … but any of these TermMaps can be used to create any RDF Term (s, p, o). Recall:
– 3 ways to create a subject
– 3 ways to create a predicate
– 3 ways to create an object
• Template-valued TermMaps are commonly used to create an IRI for a subject, but can also be used to create a Literal for an object.
• How do we specify the term type (IRI or Literal in this case)?

TermType
• Specifies the type of term that a TermMap should generate
• Forces what the RDF term should be
• Three types of TermType:
– rr:IRI
– rr:BlankNode
– rr:Literal

@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<TriplesMap1>
    a rr:TriplesMap;
    rr:logicalTable [ rr:tableName "Person" ];
    rr:subjectMap [
        rr:template "http://www.ex.com/Person/{ID}";
        rr:class foaf:Person
    ];
    rr:predicateObjectMap [
        rr:predicateMap [ rr:constant foaf:name ];
        rr:objectMap [
            rr:template "{FIRST_NAME} {LAST_NAME}";
            rr:termType rr:Literal;
        ]
    ]
    .

@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<TriplesMap1>
    a rr:TriplesMap;
    rr:logicalTable [ rr:tableName "Person" ];
    rr:subjectMap [
        rr:template "person{ID}";
        rr:termType rr:BlankNode;
        rr:class foaf:Person
    ];
    rr:predicateObjectMap [
        rr:predicateMap [ rr:constant foaf:name ];
        rr:objectMap [ rr:column "NAME" ]
    ]
    .

TermType (cont…)
• Can only be applied to Template-valued and Column-valued TermMaps
• Applying it to a Constant-valued TermMap has no effect
– i.e., if the constant is an IRI, the term type is automatically an IRI

TermType Rules
• If the TermMap is for a
1. Subject → TermType = IRI or Blank Node
2. Predicate → TermType = IRI
3. Object → TermType = IRI or Blank Node or Literal

TermType is Optional
• If a TermType is not specified, then
– Default = IRI
– Unless it is for an object defined by a Column-valued TermMap, or the TermMap has a language tag or a specified datatype; then the TermType is Literal
• That is why a template in an ObjectMap will always generate an IRI, unless a TermType of Literal is specified.

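The default rule can be written down as a small decision function. The following Python sketch is illustrative only; the function name and its string arguments are assumptions chosen for readability, not part of R2RML itself.

```python
def default_term_type(position, kind, has_language=False, has_datatype=False):
    """Return the term type used when rr:termType is not given.

    position: "subject", "predicate", or "object" (hypothetical labels)
    kind:     "constant", "column", or "template" (hypothetical labels)
    """
    is_literal_object = position == "object" and (
        kind == "column" or has_language or has_datatype
    )
    return "rr:Literal" if is_literal_object else "rr:IRI"


# A template in an ObjectMap defaults to an IRI ...
template_object = default_term_type("object", "template")
# ... while a column in an ObjectMap defaults to a Literal.
column_object = default_term_type("object", "column")
```

This is why a name built from "{FIRST_NAME} {LAST_NAME}" needs an explicit rr:termType rr:Literal.
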
rr:predicateObjectMap [
    rr:predicateMap [ rr:constant foaf:name ];
    rr:objectMap [
        rr:template "{FIRST_NAME} {LAST_NAME}";
        rr:termType rr:Literal;    # explicit TermType: generates a Literal
    ]
]

rr:predicateObjectMap [
    rr:predicateMap [ rr:constant ex:role ];
    rr:objectMap [
        rr:template "http://ex.com/role/{role}"    # no TermType: generates an IRI (the default)
    ]
]

rr:predicateObjectMap [
    rr:predicateMap [ rr:constant foaf:name ];
    rr:objectMap [
        rr:template "{FIRST_NAME} {LAST_NAME}"    # no TermType: also generates an IRI, not a Literal
    ]
]

NOW WE HAVE THE ELEMENTS TO CREATE TRIPLES

Generating an RDF Triple
• A TermMap specifies what the RDF term should be for S, P, O
– SubjectMap
– PredicateMap
– ObjectMap

SubjectMap
• A SubjectMap is a TermMap
• rr:subjectMap
• Specifies what the subject of a triple should be
• 3 ways to create a subject
– Template-valued TermMap
– Column-valued TermMap
– Constant-valued TermMap
• Has to be an IRI or Blank Node

SubjectMap
• SubjectMaps are usually Template-valued TermMaps
• Use case for a Column-valued TermMap
– Use a column value to create a blank node
– The URI already exists as a column value
• Use case for a Constant-valued TermMap
– For all tuples: <CompanyABC> <consistsOf> <Dep{id}>

SubjectMap
• Optionally, a SubjectMap may have one or more Class IRIs associated
– This will generate rdf:type triples
• rr:class

@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<TriplesMap1>
    a rr:TriplesMap;
    rr:logicalTable [ rr:tableName "Person" ];
    rr:subjectMap [
        rr:template "http://www.ex.com/Person/{ID}";
        rr:class foaf:Person    # Optional
    ];
    rr:predicateObjectMap [
        rr:predicate foaf:name;
        rr:objectMap [ rr:column "NAME" ]
    ]
    .

PredicateObjectMap
• A function that creates one or more predicate-object pairs for each logical table row.
• rr:predicateObjectMap
• It is used in conjunction with a SubjectMap to generate RDF triples in a TriplesMap.
• A predicate-object pair consists of
– One or more PredicateMaps
– One or more ObjectMaps or ReferencingObjectMaps

@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<TriplesMap1>
    a rr:TriplesMap;
    rr:logicalTable [ rr:tableName "Person" ];
    rr:subjectMap [
        rr:template "http://www.ex.com/Person/{ID}";
        rr:class foaf:Person
    ];
    rr:predicateObjectMap [
        rr:predicateMap [ rr:constant foaf:name ];
        rr:objectMap [ rr:column "NAME" ]
    ]
    .

PredicateMap
• A PredicateMap is a TermMap
• rr:predicateMap
• Specifies what the predicate of an RDF triple should be
• 3 ways to create a predicate
– Template-valued TermMap
– Column-valued TermMap
– Constant-valued TermMap
• Has to be an IRI

PredicateMap
• PredicateMaps are usually Constant-valued TermMaps
• Use case for a Column-valued TermMap
– …
• Use case for a Template-valued TermMap
– …

@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<TriplesMap1>
    a rr:TriplesMap;
    rr:logicalTable [ rr:tableName "Person" ];
    rr:subjectMap [
        rr:template "http://www.ex.com/Person/{ID}";
        rr:class foaf:Person
    ];
    rr:predicateObjectMap [
        rr:predicate foaf:name;    # Shortcut! Same as rr:predicateMap [ rr:constant foaf:name ]
        rr:objectMap [ rr:column "NAME" ]
    ]
    .

Constant Shortcut Properties
• ?x rr:predicate ?y is shorthand for ?x rr:predicateMap [ rr:constant ?y ]
• ?x rr:subject ?y is shorthand for ?x rr:subjectMap [ rr:constant ?y ]
• ?x rr:object ?y is shorthand for ?x rr:objectMap [ rr:constant ?y ]

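As an illustration of the constant shortcut properties, the following Python sketch expands them on a mapping held as a plain dictionary. The dictionary representation and the name `expand_shortcuts` are assumptions made for this example; a real processor would work on the RDF graph of the mapping.

```python
# Shortcut property -> the TermMap property it abbreviates.
SHORTCUTS = {
    "rr:subject": "rr:subjectMap",
    "rr:predicate": "rr:predicateMap",
    "rr:object": "rr:objectMap",
}


def expand_shortcuts(node):
    """Rewrite each shortcut into a constant-valued TermMap."""
    expanded = {}
    for prop, value in node.items():
        if prop in SHORTCUTS:
            expanded[SHORTCUTS[prop]] = {"rr:constant": value}
        else:
            expanded[prop] = value
    return expanded


# The PredicateObjectMap from the shortcut example, as a dictionary.
pom = {"rr:predicate": "foaf:name", "rr:objectMap": {"rr:column": "NAME"}}
expanded = expand_shortcuts(pom)
```
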
ObjectMap
• An ObjectMap is a TermMap
• rr:objectMap
• Specifies what the object of a triple should be
• 3 ways to create an object
– Template-valued TermMap
– Column-valued TermMap
– Constant-valued TermMap
• Has to be an IRI or Literal or Blank Node

ObjectMap
• ObjectMaps are usually Column-valued TermMaps
• Use case for a Template-valued TermMap
– Concatenate values
– Create IRIs
• Use case for a Constant-valued TermMap
– All rows in a table share a role

Example 1
• We now have sufficient elements to create a mapping that will generate
– A Subject IRI
– rdf:type triple(s)

Student
sid  name    pid
1    Juan    100
2    Martin  200

Output of the TriplesMap:
@prefix ex: <http://example.com/ns/>.
ex:Student1 rdf:type ex:Student .
ex:Student2 rdf:type ex:Student .

Example 1
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ex: <http://example.com/ns/>.
<#TriplesMap1>
    rr:logicalTable [ rr:tableName "Student" ];    # Logical Table is a Table Name
    rr:subjectMap [
        rr:template "http://example.com/ns/Student{sid}";    # SubjectMap is a Template-valued TermMap
        rr:class ex:Student;    # and it has one Class IRI
    ].

ρ1 : Student(s, x, y) ∧ p = rdf:type ∧ o = ex:Student → Triple(s, p, o)
In ρ1, Student(s, x, y) is the query over R that binds the subject, p = rdf:type fixes the predicate, and o = ex:Student fixes the object.

Class RDB2RDF Rule
• Given a relational schema R, a class RDB2RDF rule ρ over R is a first-order formula of the form:
∀s ∀p ∀o ∀x̄ α(s, x̄) ∧ p = type ∧ o = c → triple(s, p, o)
where α(s, x̄) is a query over R, c ∈ D, and D is a countably infinite domain of constants.

ρ1 : Student(s, x, y) ∧ p = rdf:type ∧ o = ex:Student → Triple(s, p, o)

Example 2
Student
sid  name    pid
1    Juan    100
2    Martin  200

Output of the TriplesMap:
@prefix ex: <http://example.com/ns/>.
ex:Student1 rdf:type ex:Student .
ex:Student1 ex:name "Juan" .
ex:Student2 rdf:type ex:Student .
ex:Student2 ex:name "Martin" .

Example 2
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ex: <http://example.com/ns/>.
<#TriplesMap1>
    rr:logicalTable [ rr:tableName "Student" ];    # Logical Table is a Table Name
    rr:subjectMap [
        rr:template "http://example.com/ns/Student{sid}";    # SubjectMap is a Template-valued TermMap
        rr:class ex:Student;    # and it has one Class IRI
    ];
    rr:predicateObjectMap [    # PredicateObjectMap
        rr:predicate ex:name;    # PredicateMap, which is a Constant-valued TermMap
        rr:objectMap [ rr:column "name" ];    # ObjectMap, which is a Column-valued TermMap
    ].

ρ2 : Student(s, o, y) ∧ p = ex:name → Triple(s, p, o)

Predicate RDB2RDF Rule
• Given a relational schema R, a predicate RDB2RDF rule ρ over R is a first-order formula of the form:
∀s ∀p ∀o ∀x̄ β(s, o, x̄) ∧ p = c → triple(s, p, o)
where β(s, o, x̄) is a query over R, c ∈ D, and D is a countably infinite domain of constants.

ρ2 : Student(s, o, y) ∧ p = ex:name → Triple(s, p, o)

RDB2RDF Mapping
• An RDB2RDF mapping M over R is a finite set of class or predicate RDB2RDF rules over R

M = {ρ1, ρ2}
ρ1 : Student(s, x, y) ∧ p = rdf:type ∧ o = ex:Student → Triple(s, p, o)
ρ2 : Student(s, o, y) ∧ p = ex:name → Triple(s, p, o)

@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ex: <http://example.com/ns/>.
<#TriplesMap1>
    rr:logicalTable [ rr:tableName "Student" ];
    rr:subjectMap [
        rr:template "http://example.com/ns/Student{sid}";
        rr:class ex:Student;
    ];
    rr:predicateObjectMap [
        rr:predicate ex:name;
        rr:objectMap [ rr:column "name" ];
    ].

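To see how a mapping M = {ρ1, ρ2} produces triples, the rules can be read as small generators over the Student relation. The Python sketch below is one possible reading, not the formal semantics: the helper names are hypothetical, and the subject IRI is built as "Student" plus sid so that the output matches the triples shown in the examples.

```python
EX = "http://example.com/ns/"

# The Student relation from the examples: (sid, name, pid).
STUDENT = [(1, "Juan", 100), (2, "Martin", 200)]


def subject(sid):
    """Assumed IRI scheme: ex:Student<sid>."""
    return f"{EX}Student{sid}"


def rho1(students):
    """Class rule: every Student row yields an rdf:type triple."""
    for sid, _name, _pid in students:
        yield (subject(sid), "rdf:type", f"{EX}Student")


def rho2(students):
    """Predicate rule: every non-NULL name yields an ex:name triple."""
    for sid, name, _pid in students:
        if name is not None:
            yield (subject(sid), f"{EX}name", name)


def apply_mapping(rules, students):
    """A mapping is a finite set of rules; its output is the union of their triples."""
    triples = set()
    for rule in rules:
        triples.update(rule(students))
    return triples


triples = apply_mapping([rho1, rho2], STUDENT)
```
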
Example 3
Student
sid  name    pid
1    Juan    100
2    Martin  200

Output of the TriplesMap:
@prefix ex: <http://example.com/ns/>.
ex:Student1 rdf:type ex:Student .
ex:Student1 ex:comment "Juan is a Student" .
ex:Student2 rdf:type ex:Student .
ex:Student2 ex:comment "Martin is a Student" .

Example 3
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ex: <http://example.com/ns/>.
<#TriplesMap1>
    rr:logicalTable [ rr:tableName "Student" ];    # Logical Table is a Table Name
    rr:subjectMap [
        rr:template "http://example.com/ns/Student{sid}";    # SubjectMap is a Template-valued TermMap
        rr:class ex:Student;    # and it has one Class IRI
    ];
    rr:predicateObjectMap [    # PredicateObjectMap
        rr:predicate ex:comment;    # PredicateMap, which is a Constant-valued TermMap
        rr:objectMap [    # ObjectMap, which is a Template-valued TermMap
            rr:template "{name} is a Student";
            rr:termType rr:Literal;    # TermType
        ];
    ].

Example 4
Student
sid  name    pid
1    Juan    100
2    Martin  200

Output of the TriplesMap:
@prefix ex: <http://example.com/ns/>.
ex:Student1 rdf:type ex:Student .
ex:Student1 ex:webpage <http://ex.com/Juan> .
ex:Student2 rdf:type ex:Student .
ex:Student2 ex:webpage <http://ex.com/Martin> .

Example 4
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ex: <http://example.com/ns/>.
<#TriplesMap1>
    rr:logicalTable [ rr:tableName "Student" ];    # Logical Table is a Table Name
    rr:subjectMap [
        rr:template "http://example.com/ns/Student{sid}";    # SubjectMap is a Template-valued TermMap
        rr:class ex:Student;    # and it has one Class IRI
    ];
    rr:predicateObjectMap [    # PredicateObjectMap
        rr:predicate ex:webpage;    # PredicateMap, which is a Constant-valued TermMap
        rr:objectMap [    # ObjectMap, which is a Template-valued TermMap
            rr:template "http://ex.com/{name}";    # Note that there is no TermType
        ];
    ].

Example 5
Student
sid  name    pid
1    Juan    100
2    Martin  200

Output of the TriplesMap:
@prefix ex: <http://example.com/ns/>.
ex:Student1 rdf:type ex:Student .
ex:Student1 ex:studentType ex:GradStudent .
ex:Student2 rdf:type ex:Student .
ex:Student2 ex:studentType ex:GradStudent .

Example 5
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ex: <http://example.com/ns/>.
<#TriplesMap1>
    rr:logicalTable [ rr:tableName "Student" ];    # Logical Table is a Table Name
    rr:subjectMap [
        rr:template "http://example.com/ns/Student{sid}";    # SubjectMap is a Template-valued TermMap
        rr:class ex:Student;    # and it has one Class IRI
    ];
    rr:predicateObjectMap [    # PredicateObjectMap
        rr:predicate ex:studentType;    # PredicateMap, which is a Constant-valued TermMap
        rr:object ex:GradStudent;    # ObjectMap, which is a Constant-valued TermMap
    ].

RefObjectMap
• A RefObjectMap (Referencing ObjectMap) allows using the subject of another TriplesMap as the object generated by an ObjectMap.
• rr:objectMap
• A RefObjectMap is defined by
– Exactly one parent TriplesMap, which must be a TriplesMap
– Optionally, one or more JoinConditions

<TriplesMap1>
    a rr:TriplesMap;
    rr:logicalTable [ rr:tableName "Person" ];
    rr:subjectMap [ rr:template "http://www.ex.com/Person/{ID}";
                    rr:class foaf:Person ];
    rr:predicateObjectMap [
        rr:predicate foaf:based_near;
        rr:objectMap [    # RefObjectMap
            rr:parentTriplesMap <TriplesMap2>;
            rr:joinCondition [
                rr:child "CID";
                rr:parent "CID";
            ]
        ]
    ]
    .
<TriplesMap2>
    a rr:TriplesMap;
    rr:logicalTable [ rr:tableName "City" ];
    rr:subjectMap [ rr:template "http://ex.com/City/{CID}";
                    rr:class ex:City ];
    rr:predicateObjectMap [
        rr:predicate foaf:name;
        rr:objectMap [ rr:column "TITLE" ]
    ]
    .

Parent TriplesMap
• The referenced TriplesMap, whose subjects are used as the objects
• rr:parentTriplesMap

rr:objectMap [
    rr:parentTriplesMap <TriplesMap2>;    # Parent TriplesMap
    rr:joinCondition [ rr:child "CID"; rr:parent "CID" ]
]

JoinCondition
• Join between child and parent attributes
• rr:joinCondition

rr:objectMap [
    rr:parentTriplesMap <TriplesMap2>;
    rr:joinCondition [    # JoinCondition
        rr:child "CID";
        rr:parent "CID";
    ]
]

JoinCondition
• Child Column, which must be a column name that exists in the logical table of the TriplesMap that contains the RefObjectMap
• Parent Column, which must be a column name that exists in the logical table of the RefObjectMap's parent TriplesMap.

<TriplesMap1>
    a rr:TriplesMap;
    rr:logicalTable [ rr:tableName "Person" ];
    ...
    rr:predicateObjectMap [
        rr:predicate foaf:based_near;
        rr:objectMap [
            rr:parentTriplesMap <TriplesMap2>;
            rr:joinCondition [
                rr:child "CID";
                rr:parent "CID"; ]
        ]
    ] .
<TriplesMap2>
    a rr:TriplesMap;
    rr:logicalTable [ rr:tableName "City" ];
    ...
    .

JoinCondition
• Child Query
– The Child Query of a RefObjectMap is the LogicalTable of the TriplesMap containing the RefObjectMap
• Parent Query
– The Parent Query of a RefObjectMap is the LogicalTable of the parent TriplesMap
• If the Child Query and the Parent Query are not identical, then a JoinCondition must exist

Example 7
Student
sid  name    pid
1    Juan    100
2    Martin  200

Professor
pid  name
100  Dan
200  Marcelo

Output of the R2RML Mapping:
ex:Student1 rdf:type ex:Student .
ex:Student2 rdf:type ex:Student .
ex:Professor100 rdf:type ex:Professor .
ex:Professor200 rdf:type ex:Professor .
ex:Student1 ex:hasAdvisor ex:Professor100 .
ex:Student2 ex:hasAdvisor ex:Professor200 .

ρ1 : Student(s, x, o) ∧ Professor(o, z) ∧ p = ex:hasAdvisor → Triple(s, p, o)
ρ2 : Student(s, x, y) ∧ p = rdf:type ∧ o = ex:Student → Triple(s, p, o)
ρ3 : Professor(s, x) ∧ p = rdf:type ∧ o = ex:Professor → Triple(s, p, o)

@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ex: <http://example.com/ns/>.
<#TriplesMap1>
    rr:logicalTable [ rr:tableName "Student" ];
    rr:subjectMap [
        rr:template "http://example.com/ns/Student{sid}";
        rr:class ex:Student;
    ];
    rr:predicateObjectMap [
        rr:predicate ex:hasAdvisor;
        rr:objectMap [    # RefObjectMap
            rr:parentTriplesMap <#TriplesMap2>;    # Parent TriplesMap
            rr:joinCondition [    # JoinCondition
                rr:child "pid";
                rr:parent "pid";
            ]
        ]
    ].
<#TriplesMap2>
    rr:logicalTable [ rr:tableName "Professor" ];
    rr:subjectMap [
        rr:template "http://example.com/ns/Professor{pid}";
        rr:class ex:Professor;
    ].

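The effect of a RefObjectMap with a JoinCondition can be sketched as a join between the child and parent logical tables. The Python below is a simplified illustration under stated assumptions: rows are dictionaries, the function and parameter names are hypothetical, and subject IRIs are shortened to prefixed names such as ex:Student1.

```python
STUDENT = [
    {"sid": 1, "name": "Juan", "pid": 100},
    {"sid": 2, "name": "Martin", "pid": 200},
]
PROFESSOR = [
    {"pid": 100, "name": "Dan"},
    {"pid": 200, "name": "Marcelo"},
]


def ref_object_triples(child_rows, parent_rows, child_col, parent_col,
                       child_subject, parent_subject, predicate):
    """Emit (child subject, predicate, parent subject) for every joining row pair."""
    triples = []
    for child in child_rows:
        if child[child_col] is None:
            continue  # NULL never satisfies the join condition
        for parent in parent_rows:
            if child[child_col] == parent[parent_col]:
                triples.append((child_subject(child), predicate, parent_subject(parent)))
    return triples


advisor_triples = ref_object_triples(
    STUDENT, PROFESSOR, "pid", "pid",
    child_subject=lambda row: f"ex:Student{row['sid']}",
    parent_subject=lambda row: f"ex:Professor{row['pid']}",
    predicate="ex:hasAdvisor",
)
```

The object of each triple is the subject that the parent TriplesMap would generate for the joined Professor row.
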
Summary
Languages
• A TermMap with a TermType of rr:Literal may have a language tag
• rr:language

<#TriplesMap1>
    rr:logicalTable [ rr:tableName "Student" ];
    rr:subjectMap [
        rr:template "http://example.com/ns/Student{sid}";
        rr:class ex:Student;
    ];
    rr:predicateObjectMap [
        rr:predicate ex:comment;
        rr:objectMap [
            rr:column "comment";
            rr:language "en";
        ];
    ].

Student
sid  name    comment
1    Juan    Excellent Student
2    Martin  Wonderful student

@prefix ex: <http://example.com/ns/>.
ex:Student1 rdf:type ex:Student .
ex:Student1 ex:comment "Excellent Student"@en .
ex:Student2 rdf:type ex:Student .
ex:Student2 ex:comment "Wonderful student"@en .

Issue with Languages
• What happens if the language value is in the data?

ID  COUNTRY_ID  LABEL           LANG
1   1           United States   en
2   1           Estados Unidos  es
3   2           England         en
4   2           Inglaterra      es

Desired output (which mapping generates it?):
@prefix ex: <http://example.com/ns/>.
ex:country1 rdfs:label "United States"@en .
ex:country1 rdfs:label "Estados Unidos"@es .
ex:country2 rdfs:label "England"@en .
ex:country2 rdfs:label "Inglaterra"@es .

Issue with Languages
• Mapping for each language

<#TriplesMap_Countries_EN>
    a rr:TriplesMap;
    rr:logicalTable [ rr:sqlQuery """SELECT COUNTRY_ID, LABEL FROM
        COUNTRY WHERE LANG = 'en'""" ];
    rr:subjectMap [
        rr:template "http://example.com/country{COUNTRY_ID}"
    ];
    rr:predicateObjectMap [
        rr:predicate rdfs:label;
        rr:objectMap [
            rr:column "LABEL";
            rr:language "en";
        ];
    ].

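Since one TriplesMap is needed per language, the mappings could be generated rather than written by hand. The Python sketch below is a hypothetical helper (the names `MAPPING_TEMPLATE` and `mappings_per_language` are not part of any R2RML tool) that stamps out the mapping text for each language code; it assumes the codes are plain alphabetic tags such as "en" or "es".

```python
# Text of one TriplesMap; {tag} and {lang} are filled per language.
# Doubled braces keep the R2RML {COUNTRY_ID} placeholder intact.
MAPPING_TEMPLATE = '''<#TriplesMap_Countries_{tag}>
    a rr:TriplesMap;
    rr:logicalTable [ rr:sqlQuery """SELECT COUNTRY_ID, LABEL FROM COUNTRY WHERE LANG = '{lang}'""" ];
    rr:subjectMap [ rr:template "http://example.com/country{{COUNTRY_ID}}" ];
    rr:predicateObjectMap [
        rr:predicate rdfs:label;
        rr:objectMap [ rr:column "LABEL"; rr:language "{lang}" ]
    ] .
'''


def mappings_per_language(languages):
    """Return {language code: TriplesMap text}, one mapping per language."""
    result = {}
    for lang in languages:
        if not lang.isalpha():
            # Guard against values that would break the embedded SQL string.
            raise ValueError(f"unexpected language code: {lang!r}")
        result[lang] = MAPPING_TEMPLATE.format(tag=lang.upper(), lang=lang)
    return result


mappings = mappings_per_language(["en", "es"])
```

In practice the list of codes could come from a query such as SELECT DISTINCT LANG FROM COUNTRY.
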
Datatypes
• A TermMap may specify a datatype (rr:datatype) if
– it has a TermType of rr:Literal
– it does not have rr:language

<#TriplesMap1>
    rr:logicalTable [ rr:tableName "Student" ];
    rr:subjectMap [
        rr:template "http://example.com/ns/Student{sid}";
        rr:class ex:Student;
    ];
    rr:predicateObjectMap [
        rr:predicate ex:startDate;
        rr:objectMap [
            rr:column "start_date";
            rr:datatype xsd:date;
        ];
    ].

Summary of Terminology
• R2RML Mapping
• Logical Table
• Input Database
• R2RML View
• TriplesMap
• Logical Table Row
• TermMap
• TermType
• SubjectMap
• PredicateObjectMap
• PredicateMap
• ObjectMap
• Constant-valued TermMap
• Column-valued TermMap
• Template-valued TermMap
• RefObjectMap
• JoinConditions
• ChildQuery
• ParentQuery
• Language
• Datatype

SEMANTIC WEB → RELATIONAL DATABASES: DATA ACCESS

Semantic Web → Relational Database

ETL
[Figure: ETL approach. The RDBMS is exported through a mapping into an RDF graph stored in a triplestore; SPARQL queries are posed to the triplestore, which returns SPARQL results]

NoETL (Wrapper)
[Figure: NoETL approach. An R2RML mapping exposes the RDBMS as virtual RDF; SPARQL queries are translated to SQL, and the SQL results are returned as SPARQL results]

“Comparing the overall performance […] of the fastest rewriter with the fastest relational database shows an overhead for query rewriting of 106%. This is an indicator that there is still room for improving the rewriting algorithms”
[Bizer and Schultz. Berlin SPARQL Benchmark 2009]
[Figure: benchmark chart for the 100M Triple Dataset; larger numbers are better]

Current rdb2rdf systems are not capable of providing the query execution performance required [...] it is likely that with more work on query translation, suitable mechanisms for translating queries could be developed. These mechanisms should focus on exploiting the underlying database system’s capabilities to optimize queries and process large quantities of structure data
[Gray et al. 2009]

https://sourceforge.net/p/d2rq-map/mailman/message/28055191/
Sept 2011

Why was this happening if …
ISWC 2008
Hypothesis: Existing commercial relational databases already subsume algorithms and optimizations needed to support effective SPARQL execution on relationally stored data

Ultrawrap: SPARQL to SQL under Direct Mapping
Compile Time
1. Translate SQL Schema to OWL and Mapping
2. Define RDF Triples, as a View
Run Time
3. SPARQL to SQL translation
4. SQL Optimizer creates relational query plan

Ultrawrap: SPARQL execution on relational data. Sequeda & Miranker. J. WebSem 2013
US Patent 8719252, 9396283

Creating Tripleview
• For every ontology element (Class, Object Property and Datatype Property), create a SQL SELECT query that outputs triples

SELECT 'Product' + ptID AS s, 'label' AS p, label AS o
FROM Product WHERE label IS NOT NULL

Product
ptID  label     prID
1     ACME Inc  4
2     Foo Bars  5

Result
S         P      O
Product1  label  ACME Inc
Product2  label  Foo Bars

Creating Tripleview
SELECT 'Product' + ptID AS s, ptID AS s_id, 'label' AS p, label AS o, NULL AS o_id
FROM Product WHERE label IS NOT NULL

Product
ptID  label     prID
1     ACME Inc  4
2     Foo Bars  5

Result
S         S_id  P      O         O_id
Product1  1     label  ACME Inc  NULL
Product2  2     label  Foo Bars  NULL

Creating Tripleview (…)
• Create TripleViews (SQL Views), which are unions of the SQL SELECT queries that have the same datatype

CREATE VIEW Tripleview_varchar AS
SELECT 'Product' + ptID AS s, ptID AS s_id, 'label' AS p, label AS o, NULL AS o_id FROM Product
UNION ALL
SELECT 'Producer' + prID AS s, prID AS s_id, 'title' AS p, title AS o, NULL AS o_id FROM Producer
UNION ALL …

S          S_id  P      O         O_id
Product1   1     label  ACME Inc  NULL
Product2   2     label  Foo Bars  NULL
Producer4  4     title  Foo       NULL
Producer5  5     title  Bars      NULL

SPARQL and SQL
• Translating a SPARQL query to a semantically equivalent SQL query

SPARQL:
SELECT ?label ?pnum1
WHERE {
    ?x label ?label.
    ?x pnum1 ?pnum1.
}

→ Semantically equivalent SQL:
SELECT label, pnum1
FROM product

SQL on Tripleview:
SELECT t1.o AS label, t2.o AS pnum1
FROM tripleview_varchar t1, tripleview_int t2
WHERE t1.p = 'label' AND
      t2.p = 'pnum1' AND
      t1.s_id = t2.s_id

What is the Query Plan?

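The claim that the SQL on the Tripleview is equivalent to the direct SQL can be checked on a toy database. The following Python sketch uses the standard-library sqlite3 module; the table contents are made up, the column `id` stands in for the product key, and SQLite's `||` operator replaces the `+` concatenation shown on the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, label TEXT, pnum1 INTEGER, pnum2 INTEGER);
INSERT INTO product VALUES (1, 'ACME Inc', 10, 20), (2, 'Foo Bars', 30, NULL);

-- One view per datatype, each a union of triple-producing SELECTs.
CREATE VIEW tripleview_varchar AS
  SELECT 'Product' || id AS s, id AS s_id, 'label' AS p, label AS o
  FROM product WHERE label IS NOT NULL;

CREATE VIEW tripleview_int AS
  SELECT 'Product' || id AS s, id AS s_id, 'pnum1' AS p, pnum1 AS o
  FROM product WHERE pnum1 IS NOT NULL
  UNION ALL
  SELECT 'Product' || id AS s, id AS s_id, 'pnum2' AS p, pnum2 AS o
  FROM product WHERE pnum2 IS NOT NULL;
""")

# The translated SPARQL query, expressed over the Tripleviews.
on_tripleview = conn.execute("""
    SELECT t1.o AS label, t2.o AS pnum1
    FROM tripleview_varchar t1, tripleview_int t2
    WHERE t1.p = 'label' AND t2.p = 'pnum1' AND t1.s_id = t2.s_id
    ORDER BY label
""").fetchall()

# The query a SQL developer would write directly against the table.
direct = conn.execute("""
    SELECT label, pnum1 FROM product
    WHERE label IS NOT NULL AND pnum1 IS NOT NULL
    ORDER BY label
""").fetchall()
```

Both queries return the same rows here; the interesting question is whether the optimizer also finds the same plan.
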
[Figure: initial query plan]
π t1.o AS label, t2.o AS pnum1
  σ p = 'label' over Tripleview_varchar t1, the union (U) of
    π Product+'id' AS s, 'label' AS p, label AS o over σ label ≠ NULL (Product)
    π Producer+'id' AS s, 'title' AS p, title AS o over σ title ≠ NULL (Producer): CONTRADICTION
  σ p = 'pnum1' over Tripleview_int t2, the union (U) of
    π Product+'id' AS s, 'pnum1' AS p, pnum1 AS o over σ pnum1 ≠ NULL (Product)
    π Product+'id' AS s, 'pnum2' AS p, pnum2 AS o over σ pnum2 ≠ NULL (Product): CONTRADICTION

Detection of Unsatisfiable Conditions
• Determine that the query result will be empty if the existence of an answer would violate some integrity constraint in the database.
• This implies that the answer to the query is empty, and therefore the database does not need to be accessed
Chakravarthy, Grant and Minker. (1990) Logic-Based Approach to Semantic Query Optimization.

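In the Tripleview setting, the unsatisfiable condition is a filter on p that contradicts the constant predicate of a union branch. The Python sketch below illustrates only that pruning step; the list-of-dictionaries plan representation and the name `prune_unsatisfiable` are assumptions for this example, not how an actual SQL optimizer is implemented.

```python
# Each branch of a Tripleview emits triples with one constant predicate.
TRIPLEVIEW_VARCHAR = [
    {"table": "Product", "p": "label", "o_column": "label"},
    {"table": "Producer", "p": "title", "o_column": "title"},
]


def prune_unsatisfiable(branches, required_predicate):
    """Drop branches whose constant p can never equal the filtered value."""
    return [branch for branch in branches if branch["p"] == required_predicate]


# The filter p = 'label' contradicts the Producer branch ('title' = 'label' is false).
remaining = prune_unsatisfiable(TRIPLEVIEW_VARCHAR, "label")
```

A pruned branch is never executed, so the Producer table is not accessed at all.
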
Query plan after Detection of Unsatisfiable Conditions:
π t1.o AS label, t2.o AS pnum1
  π Product+'id' AS s, 'label' AS p, label AS o  (σ label ≠ NULL on Product)
  π Product+'id' AS s, 'pnum1' AS p, pnum1 AS o  (σ pnum1 ≠ NULL on Product)
Join on the same table? → REDUNDANT
Self Join Elimination
• If attributes from the same table are projected separately and then joined, then the join can be dropped
Self Join Elimination of Projection
SELECT p1.label, p2.pnum1
FROM product p1, product p2
WHERE p1.id = 1 and p1.id = p2.id
→
SELECT label, pnum1
FROM product
WHERE id = 1
Self Join Elimination of Selection
SELECT p1.id
FROM product p1, product p2
WHERE p1.pnum1 > 100 and p2.pnum2 < 500 and p1.id = p2.id
→
SELECT id
FROM product
WHERE pnum1 > 100 and pnum2 < 500
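The two rewrites above can be checked for equal results on a small table. This is a sketch with Python's sqlite3 module; the sample rows are invented, and the equivalence relies on id being a key of product (an assumption stated here as PRIMARY KEY), since otherwise the self join could multiply rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical sample data; id is a key, which the rewrite depends on.
CREATE TABLE product (id INTEGER PRIMARY KEY, label TEXT, pnum1 INTEGER, pnum2 INTEGER);
INSERT INTO product VALUES (1, 'ABC', 150, 400), (2, 'XYZ', 50, 100), (3, 'DEF', 200, 900);
""")


def rows(sql):
    """Run a query and return its rows in a deterministic order."""
    return sorted(conn.execute(sql).fetchall())


# Self join elimination of projection.
proj_join = rows("""SELECT p1.label, p2.pnum1 FROM product p1, product p2
                    WHERE p1.id = 1 AND p1.id = p2.id""")
proj_single = rows("SELECT label, pnum1 FROM product WHERE id = 1")

# Self join elimination of selection.
sel_join = rows("""SELECT p1.id FROM product p1, product p2
                   WHERE p1.pnum1 > 100 AND p2.pnum2 < 500 AND p1.id = p2.id""")
sel_single = rows("SELECT id FROM product WHERE pnum1 > 100 AND pnum2 < 500")

print(proj_join == proj_single, sel_join == sel_single)
```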
Query plan after Self Join Elimination:
π label, pnum1
  σ label ≠ NULL AND pnum1 ≠ NULL
    Product
Evaluation
• Used two benchmarks that store data in relational databases and provide SPARQL queries and their semantically equivalent SQL queries
Detection of Unsatisfiable Conditions | Self Join Elimination
MYSQL
MSSQL
ORACLE	
  
DB2
✖
✔
✖
✖
✖ ✔
✔ ✔
Ultrawrap Experiment
Augmented Ultrawrap Experiment
• Implemented Detection of Unsatisfiable Conditions (DoUC)
Reflection 2
• We studied how SQL systems can be used to effectively evaluate SPARQL queries
– HOW: Defined an architecture based on SQL views, which allows the RDBMS to do the optimization, and identified two important optimizations that already exist in commercial RDBMSs.
– EXTENT: SPARQL 1.0 (relational core)
Recall the Hypothesis:
Existing commercial relational databases already subsume algorithms and optimizations needed to support effective SPARQL execution on relationally stored data
Ontology Based Data Access
UltrawrapOBDA: Ontology-Based Data Access
• Given
– a source relational database D
– a target OWL ontology O, and
– a mapping M from the source database to the target ontology
• Goal: Answer SPARQL queries in terms of the target ontology using mappings and the database
Hypothesis: We can effect optimizations for OBDA by pushing processing into the RDBMS, so that the RDBMS acts as a reasoner.
EMP
ID  Name   Age  Job
1   Alice  40   CTO
2   Bob    41   Java
3   John   42   SysAd
Ontology:
Executive subClassOf Employee
ITEmployee subClassOf Employee
CTO subClassOf Executive
Programmer subClassOf ITEmployee
SysAdmin subClassOf ITEmployee
Mapping:
EMP(s, y, z, "CTO") → Triple(s, type, "CTO")
EMP(s, y, z, "Java") → Triple(s, type, "Programmer")
EMP(s, y, z, "SysAd") → Triple(s, type, "SysAdmin")
Forward Chaining – Materialization
Backward Chaining – Query Rewriting
RDBMS, OWL Ontology
State of the Art: Materialization
Relational Database:
EMP(ID,NAME,AGE,JOB)
EMP(1,Alice,40,CTO)
EMP(2,Bob,41,Java)
EMP(3,John,42,SysAd)
Mapping:
EMP(s, y, z, "Java") → Triple(s, type, "Programmer")
EMP(s, y, z, "SysAd") → Triple(s, type, "SysAdmin")
OWL Ontology:
Programmer ⊑ ITEmployee
SysAdmin ⊑ ITEmployee
Materialization → RDF Database:
Triple(2,type,Programmer)
Triple(2,type,ITEmployee)
Triple(3,type,SysAdmin)
Triple(3,type,ITEmployee)
SPARQL Query:
SELECT ?x WHERE { ?x type ITEmployee }
Ans:
?x = 2
?x = 3
State of the Art: Query Rewriting
SPARQL Query:
SELECT ?x WHERE { ?x type ITEmployee }
OWL Ontology:
Programmer ⊑ ITEmployee
SysAdmin ⊑ ITEmployee
Rewriting → Qo:
SELECT ?x WHERE {
  { ?x type ITEmployee } UNION
  { ?x type Programmer } UNION
  { ?x type SysAdmin } }
Mapping:
EMP(s, y, z, "Java") → Triple(s, type, "Programmer")
EMP(s, y, z, "SysAd") → Triple(s, type, "SysAdmin")
Unfolding → Qsql:
SELECT ID FROM EMP WHERE JOB = 'Java'
UNION
SELECT ID FROM EMP WHERE JOB = 'SysAd'
Database:
EMP(ID,NAME,AGE,JOB)
EMP(1,Alice,40,CTO)
EMP(2,Bob,41,Java)
EMP(3,John,42,SysAd)
Evaluation → Ans:
?x = 2
?x = 3
Hybrid: Backward – Forward Chaining
RDBMS, OWL Ontology
RDB2RDF Mapping
SQL Optimizer: Views & Recursion
Inheritance & Transitivity
Hybrid Approach: UltrawrapOBDA
OWL Ontology:
Programmer ⊑ ITEmployee
SysAdmin ⊑ ITEmployee
Mapping:
EMP(s, y, z, "Java") → Triple(s, type, "Programmer")
EMP(s, y, z, "SysAd") → Triple(s, type, "SysAdmin")
Database:
EMP(ID,NAME,AGE,JOB)
EMP(1,Alice,40,CTO)
EMP(2,Bob,41,Java)
EMP(3,John,42,SysAd)
Compiler → Saturated Mapping:
EMP(s, y, z, "Java") → Triple(s, type, "Programmer")
EMP(s, y, z, "SysAd") → Triple(s, type, "SysAdmin")
EMP(s, y, z, "Java") → Triple(s, type, "ITEmployee")
EMP(s, y, z, "SysAd") → Triple(s, type, "ITEmployee")
OBDA: query rewriting or materialization? In practice, both!
Sequeda, Arenas, Miranker. ISWC 2014 (Best Paper)
RDFS Subclass Inference Rule
Programmer subClassOf ITEmployee    X type Programmer
-----------------------------------------------------
X type ITEmployee
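Generating Saturated Mappings
RDFS subclass inference rule:
Programmer subClassOf ITEmployee    X type Programmer
-----------------------------------------------------
X type ITEmployee
The same rule applied to a mapping:
Programmer subClassOf ITEmployee    EMP(s, x1, "Java") ∧ p = type ∧ o = Programmer → Triple(s, p, o)
-----------------------------------------------------
EMP(s, x1, "Java") ∧ p = type ∧ o = ITEmployee → Triple(s, p, o)
Parts of a mapping ρ:
ρ : EMP(s, y, z, 'Java') ∧ p = type ∧ o = Programmer → Triple(s, p, o)
EMP(s, y, z, 'Java'): Query over R and Subject; p = type: Predicate; o = Programmer: Object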
Transitivity and SQL Recursion
WITH MANAGER (X, Y) AS (
  SELECT ID, MAN FROM EMP
  UNION ALL
  SELECT EMP.ID, MANAGER.Y FROM EMP, MANAGER
  WHERE EMP.MAN = MANAGER.X
) SELECT X, Y FROM MANAGER
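The recursive query above can be tried on a three-row EMP table. This is a sketch with Python's sqlite3 module under a few assumptions: the MAN (manager id) column values are invented, a MAN IS NOT NULL filter is added to the base case so that the top-level employee does not produce NULL pairs, and the RECURSIVE keyword is written out because some systems (for example PostgreSQL) require it, while the slide's form without it matches systems such as SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical data: Alice manages Bob, Bob manages John.
CREATE TABLE EMP (ID INTEGER PRIMARY KEY, NAME TEXT, MAN INTEGER);
INSERT INTO EMP VALUES (1, 'Alice', NULL), (2, 'Bob', 1), (3, 'John', 2);
""")

# Transitive closure of the "is managed by" relation.
manager_pairs = sorted(conn.execute("""
    WITH RECURSIVE MANAGER (X, Y) AS (
        SELECT ID, MAN FROM EMP WHERE MAN IS NOT NULL   -- direct managers
        UNION ALL
        SELECT EMP.ID, MANAGER.Y FROM EMP, MANAGER      -- managers of managers
        WHERE EMP.MAN = MANAGER.X
    )
    SELECT X, Y FROM MANAGER
""").fetchall())

print(manager_pairs)
```

The pair (3, 1) is the inferred fact: John is transitively managed by Alice. With UNION ALL the recursion terminates only if the management relation has no cycles.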
OWL 2 profiles (diagram): OWL 2 QL, OWL 2 RL, OWL 2 EL, OWL 2 DL
             EL  QL  RL
subClass     X   X   X
subProp      X   X   X
domain       X   X   X
range        X   X   X
eqClass      X   X   X
eqProp       X   X   X
inverseProp      X   X
symProp          X   X
transProp    X       X
Check paper for all 9 Inference rules
Saturated Mappings
• Saturated Mappings are similar to T-Mapping per Rodriguez-Muro et al. (AMW2011, ISWC2013)
• Saturation is performed by exhaustively applying the inference rules
• We present a linear-time algorithm (check paper)
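Saturation by exhaustive rule application can be sketched as a fixpoint loop over the subclass rule. This is a naive illustration only, not the linear-time algorithm from the paper; the saturate function and the (job value, class) encoding of a mapping are hypothetical simplifications of the EMP(s, y, z, job) → Triple(s, type, class) mappings in the running example.

```python
def saturate(mappings, subclass_of):
    """Apply the subclass rule until no new mapping can be derived.

    mappings:    set of (job_value, cls) pairs, read as
                 EMP(s, y, z, job_value) -> Triple(s, type, cls)
    subclass_of: set of (sub, sup) pairs, read as sub subClassOf sup
    """
    saturated = set(mappings)
    changed = True
    while changed:
        changed = False
        for job, cls in list(saturated):
            for sub, sup in subclass_of:
                # If cls subClassOf sup, the same source rows also yield type sup.
                if sub == cls and (job, sup) not in saturated:
                    saturated.add((job, sup))
                    changed = True
    return saturated


mappings = {("Java", "Programmer"), ("SysAd", "SysAdmin")}
ontology = {("Programmer", "ITEmployee"), ("SysAdmin", "ITEmployee")}

saturated = saturate(mappings, ontology)
for job, cls in sorted(saturated):
    print(f'EMP(s, y, z, "{job}") -> Triple(s, type, "{cls}")')
```

The loop re-scans all mappings after every change, so it is quadratic in the worst case; it is meant to show what "exhaustively applying the inference rules" produces, not how to do it efficiently.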
Represent Saturated Mappings as SQL Views
CREATE VIEW ITEmployeeView AS
SELECT ID AS S, 'type' AS P, 'ITEmployee' AS O
FROM EMP WHERE JOB = 'Java' UNION …
From Saturated Mappings to SQL Views
Mapping:
EMP(s, x1, "Java") ∧ p = type ∧ o = ITEmployee → Triple(s, p, o)
EMP(s, x1, "SysAd") ∧ p = type ∧ o = ITEmployee → Triple(s, p, o)
Triplequery:
SELECT ID AS S, 'type' AS P, 'ITEmployee' AS O FROM EMP WHERE JOB = 'Java'
SELECT ID AS S, 'type' AS P, 'ITEmployee' AS O FROM EMP WHERE JOB = 'SysAd'
Tripleview:
CREATE VIEW ITEmployeeView AS
SELECT ID AS S, 'type' AS P, 'ITEmployee' AS O FROM EMP WHERE JOB = 'Java'
UNION
SELECT ID AS S, 'type' AS P, 'ITEmployee' AS O FROM EMP WHERE JOB = 'SysAd'
S P O
… … …
All of this is offline: the compiler saturates the mapping, and the saturated mapping is represented as SQL views.
Runtime: SPARQL Execution
SPARQL Query:
SELECT ?x WHERE { ?x type ITEmployee }
SQL Query on Views:
SELECT s FROM Tripleview
WHERE p = 'type' AND o = 'ITEmployee'
Views:
CREATE VIEW ITEmployeeView AS
SELECT ID AS S, 'type' AS P, 'ITEmployee' AS O
FROM EMP WHERE JOB = 'Java' UNION …
S P O
… … …
Database:
EMP(ID,NAME,AGE,JOB)
EMP(1,Alice,40,CTO)
EMP(2,Bob,41,Java)
EMP(3,John,42,SysAd)
Ans:
s = 2
s = 3
Theorem: Given an RDB2RDF Mapping M, every SPARQL query is SQL-rewritable under M
Proof: by induction on the structure of SPARQL
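The offline view and the runtime query can be put together on the running example. This is a sketch with Python's sqlite3 module; the slides do not spell out how Tripleview is assembled from the per-class views, so defining it directly over ITEmployeeView is an assumption, and string constants use single quotes as standard SQL requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMP (ID INTEGER PRIMARY KEY, NAME TEXT, AGE INTEGER, JOB TEXT);
INSERT INTO EMP VALUES (1, 'Alice', 40, 'CTO'), (2, 'Bob', 41, 'Java'), (3, 'John', 42, 'SysAd');

-- Offline: the saturated mappings for ITEmployee as a SQL view.
CREATE VIEW ITEmployeeView AS
  SELECT ID AS S, 'type' AS P, 'ITEmployee' AS O FROM EMP WHERE JOB = 'Java'
  UNION
  SELECT ID AS S, 'type' AS P, 'ITEmployee' AS O FROM EMP WHERE JOB = 'SysAd';

-- Assumption: Tripleview collects the per-class views (only one here).
CREATE VIEW Tripleview AS SELECT S, P, O FROM ITEmployeeView;
""")

# Runtime: SQL for  SELECT ?x WHERE { ?x type ITEmployee }
answers = [row[0] for row in conn.execute(
    "SELECT S FROM Tripleview WHERE P = 'type' AND O = 'ITEmployee' ORDER BY S"
)]
print(answers)
```

Nothing is materialized in this sketch: the view is expanded at query time, which is the "Materialize Nothing" end of the spectrum discussed next.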
Query Optimization: Materialize Views?
Materialize Everything: best query response time, most space consumed
Hybrid Approach
Materialize Nothing: worst query response time, less space consumed
Harinarayan et al. Implementing Data Cubes Efficiently. SIGMOD96
…..
Mami & Bellahsene. A Survey of View Selection Methods. SIGMOD Record 2012
Cost Model
Materialize All (best query response time, most space consumed):
Query cost = n x NR x S(A,R)
Space cost = NR + (NR x d)
Materialize Nothing (worst query response time, less space consumed):
Query cost = n x NR
Space cost = NR
Materialize views representing mappings to leaf classes:
Query cost = n x NR x S(A,R)
Space cost = 2NR
n is the number of leaf classes underneath the class that is being queried
NR is the number of tuples of the relation R in the mapping
S(A, R) is the selectivity of the attribute A of the relation R in the mapping
d is the depth of the ontology
Hypothesis: If a RDBMS rewrites queries in terms of materialized views, then … (check paper for details)
Texas Benchmark
Database:
id  ….  TWO  FIVE  TEN  TWENTY  FIFTY  HUNDRED
        1    1     1    1       1      1
        …    …     …    …       …      …
        2    5     10   20      50     100
Ontologies: 1 … 100 with d = 2; 1 … 100 with d = 5; …
Goal: Understand the behavior when querying for instances of a class depending on the depth of the ontology and the selectivity
Oracle implements Query Rewriting with Materialized Views
(Chart; axis: Seconds)
www.obda-benchmark.org
BSBM Extension for Transitivity
Ontology:
Product hasType ProductType
ProductType typeAncestor ProductType
Product label Literal
Product numProp Literal
Simple Query:
SELECT ?x WHERE {
  ?x typeAncestor ProductType7 }
Join Query:
SELECT ?product ?x WHERE {
  ?x typeAncestor ProductType7 .
  ?product hasType ?x. }
More Join Query:
SELECT ?product ?x WHERE {
  ?x typeAncestor ProductType7 .
  ?product hasType ?x.
  ?product label ?label .
  ?product numProp ?num. }
Query Plan for Transitivity
(Query plan figures: Unmaterialized View, Materialized View)
www.obda-benchmark.org
Reflection 3
• We studied how SQL systems can be used as reasoners for SPARQL queries in terms of Ontologies
– HOW: Incorporate the semantics of ontologies by saturating mappings, and take advantage of query rewriting using materialized views and recursion, which exist in RDBMS
– EXTENT: OWL-SQL
Recall the Hypothesis:
We can effect optimizations for OBDA by pushing processing into the RDBMS, so that the RDBMS acts as a reasoner.
RELATIONAL DATABASES AND SEMANTIC WEB IN PRACTICE
past → present → FUTURE
• Federated Semantic Data Management
– “Semantify” data by mapping to ontologies
• Business view of heterogeneous data
– Federation (NoETL) in order to avoid centralization (ETL)
– Dagstuhl seminar on this topic (June 2017)
• http://www.dagstuhl.de/17262
• “Start” of commercial interest
– Startups: Capsenta, …
– Industries: Pharma, Finance, …
– EU Project: Optique
Real World Data Integration Problem
Biz asks IT: “Total net sales of all Orders today”
IT delivers: Reports
What do you mean by …
How many orders were placed in June 2017?
Three systems (Billing, Shipping, E-Commerce) give three different answers: 317,595; 317,124; 316,899
It’s a Semantic Problem!
What is an Order?
• E-Commerce: When a user clicks “Order” on the website
• Shipping: When the customer has received the product
• Billing: When it comes out of the billing system and the CC has been charged
Cross Organizational Data Integration
Organization 1, Organization 2, …, Organization n
Status Quo 1
Biz asks IT: “Total net sales of all Orders today”
Data Architect: SELECT .. FROM …
Sources: csv, csv, csv, MS Access, XLS (T=1, T=2, T=3)
Reports: XLS, XLS
• Did the Biz User communicate the correct message to IT?
• Did IT understand correctly what the Biz User wanted?
• Did IT deliver the correct/precise results?
Status Quo 2
Biz asks IT: “Total net sales of all Orders today”, then “Total net sales of all Orders today with FX”
Data Architect: ETL, ETL, ETL → Enterprise Data Warehouse → Reports
Time and $
Integrating Data using Graphs and Semantics
Sources: HIVE, Impala, etc; Oracle; SQL Server; Postgres; Unstructured; Semi-Structured
Mappings → Enterprise Knowledge Graph
Search, Reports, API, Dashboard
Semantic Technology is not easy
Who creates this?
Using what tools?
Funny Note: I found my presentation from 2007 where I asked this same question
Real World
Real World Mappings are not easy and obvious
Mappings and Ontologies from Questions
A Pay-As-You-Go Methodology for Ontology-Based Data Access
Sequeda & Miranker. IEEE Internet Computing 2017
Real World Example
SELECT
  o.orderid, o.orderdate,
  o.ordertotal
    - ot.finaltax
    - CASE
        WHEN o.currencyid IN ('USD', 'CAD') THEN o.shippingcost
        ELSE o.shippingcost - ot.shippingtax
      END AS netsales,
  o.currencyid
FROM order o, ordertax ot
WHERE o.orderid = ot.orderid
  AND o.statusid NOT IN (4, 5)
Reflection 4
• We are studying how non-semantic web (and non-technical) users can integrate data using semantic web technologies
– HOW: We need better tools
– EXTENT: I don’t know
CONCLUSION
HOW and to what EXTENT can RDB be integrated with the SW?
1. RDB can be automatically directly mapped to RDF and OWL
– Monotone, Information and Query Preserving
– Monotonicity is an obstacle to Semantics Preservation
2. RDB can evaluate and optimize SPARQL 1.0 queries
– Two important optimizations
3. RDB can act as a reasoner for Ontologies with inheritance and transitivity
– Saturated mappings, query rewriting using materialized views and recursion
Tipping Point
Relational Database, Semantic Web
• Semantics
• “Graphy” Queries
• Data Integration
• Flexible
• Metadata
• Provenance
• Graph Visualizations
OWL 2 QL, OWL 2 RL, OWL 2 EL, OWL 2 DL, OWL SQL
HOW and to what EXTENT can RDB be integrated with the SW?
• RDB can be automatically directly mapped to RDF and OWL and preserve information and queries
• RDB can evaluate and optimize SPARQL 1.0 queries
• RDB can act as a reasoner for Ontologies with inheritance and transitivity
THANK YOU!
Juan Sequeda, Ph.D
Co-Founder – Capsenta
juan@capsenta.com
@juansequeda
Sequeda J. Integrating Relational Databases with the Semantic Web. IOS Press. 2016
http://www.iospress.nl/book/integrating-relational-databases-with-the-semantic-web/
We are always looking for smart people

 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 

Integrating Relational Databases with the Semantic Web: A Reflection

  • 1. Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
      Integrating Relational Databases with the Semantic Web: A Reflection
      Juan F. Sequeda
      Joint work with Daniel P. Miranker (UT Austin) and Marcelo Arenas (PUC Chile)
      Thanks to: Oscar Corcho, Aibo Tian, Mayank Kejriwal, Hamid Tirmizi
      13th Reasoning Web Summer School (RW 2017) – July 7 to 11, 2017 – London, UK
  • 3. Take away message of this talk
      • Reflect on 10 years of (our) research on integrating Relational Databases with the Semantic Web
        – DISCLAIMER: This is NOT a Survey
        – W3C Relational Database to RDF Standards (Science vs Engineering)
      • Provide an answer to the research question: How and to what extent can Relational Databases be integrated with the Semantic Web?
      • Thesis: Much of the existing Relational Database infrastructure can be reused to support the Semantic Web
  • 4. Let's put History in Today's Context
      [Timeline figure, 1970s, 1980s, 1990s, 2000s, Today. Tracks: Data, Logic, RDBMS, Semantic Web. Labels: Relational Algebra; Workshop on Logic and Data Bases, Toulouse 1977 (Gallaire, Nicolas & Minker); Workshops on Expert Systems; Deductive Databases; Japanese 5th Generation Project; MCC, Austin, TX; KRDB; Views; Triggers; SQL99 Recursion; Semantic Networks; KL-ONE; Description Logic; RDF; OWL]
  • 5. What is the relationship between Relational Databases and the Semantic Web?
      [Figure, along a TIME axis. Relational Databases: Relational Model, Table Definition, Constraints, Triggers; SQL. Semantic Web: RDF, RDFS, OWL, Rules; SPARQL. Example labels: Programmer, type, 2, "Bob", name, Literal, ITEmployee, subClassOf]
      SELECT ?s ?n { ?s type ITEmployee. ?s name ?n }
      Sequeda et al. SQL Databases are a Moving Target. W3C Workshop on RDF Access on RDB. 2007
  • 6. Once upon a time …
      • D2R (Map, Q, Server), Virtuoso RDF Views, SquirrelRDF, R2D2, Relational.OWL, DB2OWL, R2O, Triplify, Dartgrid, RDBToOnto, METAmorphoses, …
      https://www.w3.org/2007/03/RdfRDB/
  • 7. March 2008 to February 2009 (F2F Meeting at ISWC 2008, October 2008)
      1. Recommendation to standardize a mapping language: http://www.w3.org/2005/Incubator/rdb2rdf/XGR-rdb2rdf-20090126/
      2. RDB2RDF Survey: http://www.w3.org/2005/Incubator/rdb2rdf/RDB2RDF_SurveyReport.pdf
  • 8. Sept 2009 – Sept 2012
  • 9. [Chart: monthly values (axis 0 to 250) from Sep 2009 to Oct 2012, annotated with milestones: First F2F @Semtech 2010; FPWD R2RML; FPWD DM; WD R2RML + DM (three times); Candidate Rec R2RML + DM; Proposed Rec R2RML + DM; Rec R2RML + DM]
      Photo from cygri http://www.flickr.com/photos/cygri/4719458268/
  • 10. W3C Relational Database to RDF (RDB2RDF) Standards
      • Tools: Ultrawrap, Morph, ontop, …
      • Ontology Based Data Access (OBDA)
  • 11. Outline
      • 9:00 – 10:30
        – Intro
        – Relational Database → Semantic Web: Data Mapping
      • 10:30 – 11:00
        – Coffee Break
      • 11:00 – 12:30
        – Semantic Web → Relational Database: Data Access
        – Conclusion
  • 12. RELATIONAL DATABASES → SEMANTIC WEB: DATA MAPPING
  • 13. W3C Direct Mapping Overview
      Relational Database → Direct Mapping Engine → RDF
      Input: Database (Schema and Data), Primary Keys, Foreign Keys
      Output: RDF graph
  • 14. W3C Direct Mapping
      Input: Relational Schema R; Primary Keys PK and Foreign Keys FK over R; Relational Data
      Output: RDF graph
      Order
        orderid  date        total  currency  status
        1234     2017-07-07  100    USD       1
      LineItem
        lineid  price  quantity  product     orderid
        6789    30     2         Shoes Foo   1234
        6790    20     2         Tshirt Bar  1234
      [Graph: row nodes <Order/orderid=1234>, <LineItem/lineid=6789> and <LineItem/lineid=6790>. Each row node has one edge per column to that column's value (Order#orderid, Order#date, Order#total, Order#currency, Order#status; LineItem#lineid, LineItem#price, LineItem#quantity, LineItem#product, LineItem#orderid). Both LineItem nodes link to the Order node via LineItem#ref-orderid.]
  • 15. What do we need to automatically generate?
      • Generate Identifiers
        – IRI
        – Blank Nodes
      • Generate Triples
        – Table
        – Literal
        – Reference
  • 16. Generating Identifiers
      • Identifiers for rows, tables, columns and foreign keys
      • If a table has a primary key,
        – then the row identifier will be an IRI,
        – otherwise a blank node
      • The identifiers for tables, columns and foreign keys are IRIs
      • IRIs are generated by appending to a given base IRI
      • All strings are percent encoded
  • 17. Row Node
      1) <http://www.ex.com/Person/ID=1>
         Base IRI + "Table Name"/"PK attr"="PK value"
      2) <http://www.ex.com/Person/ID=1;SID=123>
         Base IRI + "Table Name"/"PK attr"="PK value", with one attr=value pair per primary key attribute, separated by ";"
      3) Fresh Blank Node
  • 18. More IRI
      1) <http://www.ex.com/Person>
         Base IRI + "Table Name"
      2) <http://www.ex.com/Person#NAME>
         Base IRI + "Table Name"#"Attribute"
      3) <http://www.ex.com/Person#ref-CID>
         Base IRI + "Table Name"#ref-"Attribute"
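The identifier rules on the two slides above can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper names (`pct`, `table_iri`, `column_iri`, `reference_iri`, `row_node`) and the blank-node label format are made up here, and percent-encoding is approximated with `urllib.parse.quote`.

```python
from itertools import count
from urllib.parse import quote

BASE = "http://www.ex.com/"   # base IRI used on the slides
_blank_ids = count(1)         # hypothetical source of fresh blank-node labels


def pct(value):
    """Percent-encode a name or value (everything outside the unreserved set)."""
    return quote(str(value), safe="")


def table_iri(table):
    return f"<{BASE}{pct(table)}>"


def column_iri(table, column):
    return f"<{BASE}{pct(table)}#{pct(column)}>"


def reference_iri(table, fk_columns):
    cols = ";".join(pct(c) for c in fk_columns)
    return f"<{BASE}{pct(table)}#ref-{cols}>"


def row_node(table, pk_columns, row):
    """Return an IRI if the table has a primary key, otherwise a fresh blank node."""
    if not pk_columns:
        return f"_:b{next(_blank_ids)}"
    pairs = ";".join(f"{pct(c)}={pct(row[c])}" for c in pk_columns)
    return f"<{BASE}{pct(table)}/{pairs}>"


print(row_node("Person", ["ID"], {"ID": 1}))
print(row_node("Person", ["ID", "SID"], {"ID": 1, "SID": 123}))
print(column_iri("Person", "NAME"))
print(reference_iri("Person", ["CID"]))
```

The four printed identifiers correspond to the example IRIs shown on the Row Node and More IRI slides.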
  • 19. Table Triple
      Person
        ID (pk)  NAME   AGE
        1        Alice  25
        2        Bob    NULL
      <http://www.ex.com/Person/ID=1> rdf:type <http://www.ex.com/Person> .
  • 20. Literal Triples
      Person
        ID (pk)  NAME   AGE
        1        Alice  25
        2        Bob    NULL
      <http://www.ex.com/Person/ID=1> <http://www.ex.com/Person#NAME> "Alice" .
  • 21. Reference Triples
      Person
        ID (pk)  NAME   AGE   CID (fk)
        1        Alice  25    100
        2        Bob    NULL  200
      City
        CID (pk)  TITLE
        100       Austin
        200       Madrid
      <http://www.ex.com/Person/ID=1> <http://www.ex.com/Person#ref-CID> <http://www.ex.com/City/CID=100> .
  • 22. Direct Mapping Result
      Person
        ID  NAME   AGE   CID
        1   Alice  25    100
        2   Bob    NULL  100
      City
        CID  NAME
        100  Austin
        200  Madrid
      Resulting graph (edges shown on the slide, with the labels aligned to the tables):
        <Person/ID=1> <Person#NAME> "Alice" ; <Person#AGE> 25 ; <Person#ref-CID> <City/CID=100>
        <Person/ID=2> <Person#NAME> "Bob" ; <Person#ref-CID> <City/CID=100>
        <City/CID=100> <City#NAME> "Austin"
        <City/CID=200> <City#NAME> "Madrid"
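A minimal Python sketch of how the table, literal and reference triples for these two tables could be produced. It makes simplifying assumptions that are not part of the W3C specification: single-column primary and foreign keys, no percent-encoding, and every literal rendered as a plain quoted string. The function name `direct_mapping` and the `fks` parameter are hypothetical.

```python
BASE = "http://www.ex.com/"


def direct_mapping(table, pk, rows, fks=None):
    """Map rows (dicts, None = NULL) of one table to a list of triples.

    fks maps a foreign-key column to (referenced table, referenced key column).
    """
    fks = fks or {}
    triples = []
    for row in rows:
        subject = f"<{BASE}{table}/{pk}={row[pk]}>"
        # Table triple: the row is an instance of its table.
        triples.append((subject, "rdf:type", f"<{BASE}{table}>"))
        for column, value in row.items():
            if value is None:
                continue  # NULL values generate no triple
            # Literal triple for the column value.
            triples.append((subject, f"<{BASE}{table}#{column}>", f'"{value}"'))
            if column in fks:
                # Reference triple pointing at the referenced row node.
                ref_table, ref_pk = fks[column]
                target = f"<{BASE}{ref_table}/{ref_pk}={value}>"
                triples.append((subject, f"<{BASE}{table}#ref-{column}>", target))
    return triples


PERSON = [
    {"ID": 1, "NAME": "Alice", "AGE": 25, "CID": 100},
    {"ID": 2, "NAME": "Bob", "AGE": None, "CID": 100},
]
CITY = [
    {"CID": 100, "NAME": "Austin"},
    {"CID": 200, "NAME": "Madrid"},
]

triples = direct_mapping("Person", "ID", PERSON, {"CID": ("City", "CID")})
triples += direct_mapping("City", "CID", CITY)
for t in triples:
    print(*t, ".")
```

Unlike the slide's graph, the sketch also emits the rdf:type triples and the literal triples for the key columns, which the figure leaves out for readability.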
  • 23. Summary of W3C Direct Mapping
      • Default and Automatic Mapping
      • URIs are automatically generated
        – <table>
        – <table#attribute>
        – <table#ref-attribute>
        – <Table/pkAttr=pkValue>
      • RDF represents the same relational schema
      • RDF can be transformed by SPARQL CONSTRUCT
        – RDF represents the structure and ontology of mapping author's choice
  • 24. Issues with the W3C Direct Mapping
      1. Mapping is only from Relational data to RDF data.
        – The relational schema is not taken into account. Hence no relational schema to OWL
      2. Semantics is not defined for NULL values
        – "The direct mapping does not generate triples for NULL values. Note that it is not known how to relate the behavior of the obtained RDF graph with the standard SQL semantics of the NULL values of the source RDB."
  • 25. Research Problem with Direct Mapping
      • How can a relational database schema and data, including nulls, be automatically mapped to RDF and OWL?
      • How can we assure correctness of the mapping?
        – Information Preservation: no information is lost
        – Query Preservation: no queries are lost
        – Monotonicity: inserting new data does not affect the triples already generated
        – Semantics Preservation: constraints are not lost
      Hypothesis: Relational Databases can be automatically mapped to RDF and OWL under a correct mapping
  • 26. Direct Mapping as Ontology Overview
      Input: Relational Schema R; Set Σ of Primary Keys PK and Foreign Keys FK over R; Instance I of R
      Output: RDF graph; OWL Ontology as RDF graph
      Order
        orderid  date        total  currency  status
        1234     2017-07-07  100    USD       1
      LineItem
        lineid  price  quantity  product     orderid
        6789    30     2         Shoes Foo   1234
        6790    20     2         Tshirt Bar  1234
      [Graph: the same row nodes and column edges as in the W3C Direct Mapping example, plus the ontology part: <Order> and <LineItem> are each rdf:type owl:Class, and LineItem#ref-orderid relates them.]
      We need to be careful about two issues
      • Binary Relations
      • NULLs
  • 27. NULLs
      • What should we do with NULLs?
        – Generate a Blank Node
        – Don't generate a triple
      How do we reconstruct the NULL?
      LineItem
        lineid  product     comment
        6789    Shoes Foo   "…"
        6790    Tshirt Bar  NULL
      [Figure labels: LineItem/lineid=6789, "…", _:a, pr:title]
  • 28. Direct Mapping
      Input: A relational schema R, a set Σ of primary keys and foreign keys, and a database instance I of this schema
      Output: An RDF Graph
      Definition: A direct mapping M is a total function from the set of all (R, Σ, I) to the set of all RDF graphs
  • 29. Direct Mapping RDB to RDF and OWL
      • Predicates to store (R, Σ, I): Rel(r), Attr(a, r), PKn(a1, …, an, r), Value(v, a, t, r), …
      • Datalog rules to generate O from R, Σ, e.g. Class(X) ← Rel(X), ¬IsBinRel(X)
      • Predicates to store the ontology O: Class(C), DtP(p, C), ObjP(p, S, T), …
      • Datalog rules to generate OWL from O, e.g. Triple(U, "rdf:type", "owl:Class") ← Class(R), ClassIRI(R, U)
      • Datalog rules to generate RDF from O and I: Triple(s, p, o) ← …
      On Directly Mapping Relational Databases to RDF and OWL. Sequeda, Arenas, Miranker. WWW 2012
  • 30. Input: Relational Schema
      • Rel(r):
        – Rel(Order)
      • Attr(a, r):
        – Attr(total, Order)
      • PKn(a1, …, an, r):
        – PK1(orderid, Order)
      • FKn(a1, …, an, r, b1, …, bn, s):
        – FK1(orderid, LineItem, orderid, Order)
      Order
        orderid  date        total  currency  status
        1234     2017-07-07  100    USD       1
      LineItem
        lineid  price  quantity  product     orderid
        6789    30     2         Shoes Foo   1234
        6790    20     2         Tshirt Bar  1234
  • 34. Input: Relational Schema
      • Value(v, a, t, r)
        – Value(1234, orderid, t1, Order)
        – Value(2017-07-07, date, t1, Order)
        – Value(100, total, t1, Order)
      Order
        orderid  date        total  currency  status
        1234     2017-07-07  100    USD       1
  • 35. Mapping to OWL
      Triple(http://ex.org/Order, rdf:type, owl:Class)
      Triple(U, "rdf:type", "owl:Class") ← Class(R), ClassIRI(R, U)
      ClassIRI(R, X) ← Class(R), Concat2(base, R, X)
      Class(X) ← Rel(X), ¬IsBinRel(X)
      IsBinRel(X) ← BinRel(X, A, B, S, C, T, D)
      BinRel(R, A, B, S, C, T, D) ← PK2(A, B, R), ¬ThreeAttr(R), FK1(A, R, C, S), R ≠ S, FK1(B, R, D, T), R ≠ T, ¬TwoFK(A, R), ¬TwoFK(B, R), ¬OneFK(A, B, R), ¬FKTo(R)
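The rule chain above can be imitated procedurally. The Python sketch below is a simplified reading, not the authors' Datalog program: it only models single-column foreign keys, and it interprets the helper predicates as assumptions (a relation counts as binary when it has exactly two attributes that together form its primary key, each key attribute carries exactly one foreign key to a different relation, and no foreign key points at the relation). The multi-column case behind OneFK is not modelled. All function and variable names are hypothetical.

```python
BASE = "http://ex.org/"


def is_bin_rel(r, attrs, pk, fk1):
    """Simplified binary-relation test (see the assumptions in the lead-in).

    fk1 holds tuples (column, relation, referenced column, referenced relation).
    """
    key = pk.get(r, ())
    if len(key) != 2 or set(attrs[r]) != set(key):
        return False  # needs PK2(A, B, R) and no further attributes
    for a in key:
        targets = [s for (col, rel, _, s) in fk1 if rel == r and col == a]
        if len(targets) != 1 or targets[0] == r:
            return False  # exactly one FK1 per key attribute, pointing elsewhere
    return not any(s == r for (_, _, _, s) in fk1)  # nothing references R


def class_triples(attrs, pk, fk1):
    """Class(X) <- Rel(X), not IsBinRel(X), then emit the owl:Class triples."""
    return [
        (BASE + r, "rdf:type", "owl:Class")
        for r in sorted(attrs)
        if not is_bin_rel(r, attrs, pk, fk1)
    ]


ATTRS = {
    "Order": ["orderid", "date", "total", "currency", "status"],
    "LineItem": ["lineid", "price", "quantity", "product", "orderid"],
}
PK = {"Order": ("orderid",), "LineItem": ("lineid",)}
FK1 = [("orderid", "LineItem", "orderid", "Order")]

for triple in class_triples(ATTRS, PK, FK1):
    print(triple)
```

For the Order and LineItem schema neither relation is binary, so both become OWL classes, including the Triple(http://ex.org/Order, rdf:type, owl:Class) fact shown on the slide.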
  • 36. Generating IRIs for Tuples
      • Generate IRIs for the tuples of the relations having a primary key
      • Generate blank nodes for the tuples of the relations not having a primary key
      • Generate an identifier X of a tuple T of a relation R, which is an IRI if R has a primary key or a blank node otherwise
  • 37. Mapping to RDF: Table Triples
      Table triples: for each relation, store the tuples that belong to it
      Triple(http://ex.org/Order#orderid=1234, rdf:type, http://ex.org/Order)
  • 38. Mapping to RDF: Literal Triples
      Literal triples: for each tuple, store the values in each of its attributes
      Triple(http://ex.org/person#ssn=123, http://ex.org/person#name, "Juan")
      Generate for every tuple t in a relation R and for every attribute A of R, a triple storing the value of t in A, which is called a literal triple.
  • 39. Mapping to RDF: Reference Triples
      Reference triples: store the references generated by the FKs
      Triple(http://ex.org/student#id=3, http://ex.org/student,person#ssn,ssn, http://ex.org/person#ssn=123)
      • Construct reference triples for object properties that are generated from binary relations
      • Construct reference triples for object properties that are generated from foreign keys
  • 40. Information Preservation
      [Diagram: (R, Σ, I) is mapped to DM(R, Σ, I), and DM⁻ maps the result back: DM⁻(DM(R, Σ, I)) = I]
      Theorem: The Direct Mapping DM is information preserving
      Proof: Provide a computable mapping DM⁻
  • 41. Query Preservation
      [Diagram: a query Q evaluated over I, eval(Q, I), next to a query Q* evaluated over the mapped graph, eval(Q*, DM(R, Σ, I))]
  • 42. Relational Algebra tuples vs. SPARQL mappings
      person
        ssn  name    age
        789  Daniel  NULL
      t.ssn = 789, t.name = Daniel, t.age = NULL
      Then, tr(t) = μ:
      • Domain of μ is {?ssn, ?name}
      • μ(?ssn) = 789
      • μ(?name) = Daniel
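The translation tr can be written down directly. In this small Python sketch a tuple is assumed to be a dict in which `None` stands for NULL, and a SPARQL solution mapping is a dict keyed by variable name; both representations are choices made here for illustration.

```python
def tr(t):
    """Translate a relational tuple into a SPARQL solution mapping.

    Attributes holding NULL (None) are left out of the mapping's domain.
    """
    return {"?" + attribute: value for attribute, value in t.items() if value is not None}


t = {"ssn": 789, "name": "Daniel", "age": None}
mu = tr(t)
print(sorted(mu))   # the domain of mu
print(mu["?ssn"], mu["?name"])
```

The NULL in `age` simply leaves `?age` unbound, which is why the domain of μ on the slide is {?ssn, ?name}.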
  • 43. Query Preservation
      tr(eval(Q, I)) = eval(Q*, DM(R, Σ, I))
      Theorem: The Direct Mapping is query preserving
      Proof: By induction on the structure of Q. Bottom-up algorithm for translating Q into Q*
  • 44. Monotonicity
      If I1 ⊆ I2, then DM(R, Σ, I1) ⊆ DM(R, Σ, I2)
      Theorem: The Direct Mapping DM is Monotone
      Proof: All negative atoms in the Datalog rules refer to the schema, where the schema is fixed
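Monotonicity can be checked on a toy example. The function `dm` below is a deliberately reduced stand-in for the direct mapping, not the Datalog program from the paper: an instance is assumed to be a set of facts (table, key attribute, key value, column, value), and each fact produces its triples independently of all other facts.

```python
def dm(instance):
    """Toy direct mapping over facts (table, pk_attr, pk_value, column, value)."""
    triples = set()
    for table, pk_attr, pk_value, column, value in instance:
        subject = f"<{table}/{pk_attr}={pk_value}>"
        triples.add((subject, "rdf:type", f"<{table}>"))
        if value is not None:  # NULL values generate no literal triple
            triples.add((subject, f"<{table}#{column}>", str(value)))
    return triples


I1 = {("Person", "ID", 1, "NAME", "Alice"), ("Person", "ID", 1, "AGE", 25)}
I2 = I1 | {("Person", "ID", 2, "NAME", "Bob"), ("Person", "ID", 2, "AGE", None)}

print(I1 <= I2)          # True: I1 is contained in I2
print(dm(I1) <= dm(I2))  # True: so is its image under the mapping
```

Because every triple depends on a single fact, adding facts can only add triples. This mirrors the proof idea on the slide, where negation only ever touches the fixed schema.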
  • 45. Semantics Preservation
      • If I satisfies Σ, then DM(R, Σ, I) is consistent under OWL semantics
      • If I does not satisfy Σ, then DM(R, Σ, I) is not consistent under OWL semantics
      Counterexample: person, where ssn is the PK
        ssn  name
        123  Juan
        123  Marcelo
      DM(R, Σ, I) contains a single node #ssn=123 with person#ssn 123 and the values Juan and Marcelo. I does not satisfy Σ, however DM(R, Σ, I) is consistent under OWL semantics.
      Proposition: The direct mapping DM is not semantics preserving.
      Theorem: No monotone direct mapping is semantics preserving
  • 46. Extending DM for Semantics Preservation
      • Family of Datalog rules to determine violation
        – Primary Keys
        – Foreign Keys
        – Create artificial triple that will generate contradiction
      • Non-monotone direct mapping
      • Information Preserving
      • Query Preserving
      • Semantics Preserving
  • 47. Reflection 1
      • We studied how Relational Databases can be automatically and correctly mapped to the Semantic Web
        – HOW: Defined a Direct Mapping using Datalog rules
        – EXTENT: Information and Query Preserving. Monotonicity is an obstacle for Semantics Preservation
      Recall the Hypothesis: Relational Databases can be automatically mapped to RDF and OWL under a correct mapping
      Correct mapping: Information Preserving, Query Preserving and Monotone, or Information, Query and Semantics Preserving
  • 48. W3C R2RML
      Relational Database + R2RML File → R2RML Mapping Engine → RDF, using OWL Ontologies (e.g. FOAF, etc.)
      Input: Database (schema and data); Target Ontologies; Mappings between the Database and Target Ontologies in R2RML
      Output: RDF graph
      Direct Mapping helps to "bootstrap"
  • 49. Direct Mapping as R2RML
      Person
        ID  NAME   AGE   CID
        1   Alice  25    100
        2   Bob    NULL  100
      City
        CID  NAME
        100  Austin
        200  Madrid
      Direct Mapping graph (edges shown on the slide, with the labels aligned to the tables):
        <Person/ID=1> <Person#NAME> "Alice" ; <Person#AGE> 25 ; <Person#ref-CID> <City/CID=100>
        <Person/ID=2> <Person#NAME> "Bob" ; <Person#ref-CID> <City/CID=100>
        <City/CID=100> <City#NAME> "Austin"
        <City/CID=200> <City#NAME> "Madrid"
      How can this be represented as R2RML?
  • 50. Direct Mapping as R2RML
      @prefix rr: <http://www.w3.org/ns/r2rml#> .
      <TriplesMap1> a rr:TriplesMap;
          rr:logicalTable [ rr:tableName "Person" ];
          rr:subjectMap [
              rr:template "http://www.ex.com/Person/ID={ID}";
              rr:class <http://www.ex.com/Person>
          ];
          rr:predicateObjectMap [
              rr:predicate <http://www.ex.com/Person#NAME>;
              rr:objectMap [ rr:column "NAME" ]
          ] .
  • 51. Direct Mapping as R2RML (same mapping as on slide 50, annotated)
      • Logical Table: What is being mapped?
      • SubjectMap: How to generate the Subject?
      • PredicateObjectMap: How to generate the Predicate and Object?
  • 52. Logical Table (same mapping as on slide 50)
      What is being mapped?
          rr:logicalTable [ rr:tableName "Person" ];
  • 53. Subject URI Template (same mapping as on slide 50)
      Subject URI:
          rr:template "http://www.ex.com/Person/ID={ID}";
      <Subject URI> rdf:type <Class URI>:
          rr:class <http://www.ex.com/Person>
  • 54. Predicate URI Constant (same mapping as on slide 50)
      Predicate URI:
          rr:predicate <http://www.ex.com/Person#NAME>;
  • 55. Object Column Value (same mapping as on slide 50)
      Object Literal:
          rr:objectMap [ rr:column "NAME" ]
  • 56. "Ugly" vs "Cool" URIs
      Ugly: <http://www.ex.com/Person/ID=1>, <http://www.ex.com/Person#NAME>, <http://www.ex.com/Person>
      Cool: <http://www.ex.com/Person/1>, foaf:name, foaf:Person
  • 57. Customization
      @prefix rr: <http://www.w3.org/ns/r2rml#> .
      @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      <TriplesMap1> a rr:TriplesMap;
          rr:logicalTable [ rr:tableName "Person" ];
          rr:subjectMap [
              rr:template "http://www.ex.com/Person/{ID}";
              rr:class foaf:Person
          ];
          rr:predicateObjectMap [
              rr:predicate foaf:name;
              rr:objectMap [ rr:column "NAME" ]
          ] .
      Customized Subject URI: rr:template. Customized Class: rr:class. Customized Property: rr:predicate.
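To make the roles of the template, class, predicate and column concrete, here is a small Python sketch that applies the customized mapping to rows. It does not parse R2RML: the dictionary `TRIPLES_MAP` and the helpers `expand` and `apply_map` are hypothetical stand-ins for what an R2RML engine would derive from the Turtle file, and only simple `{COLUMN}` placeholders are handled.

```python
import re
from urllib.parse import quote

# Hand-written stand-in for the triples map shown above (assumed structure).
TRIPLES_MAP = {
    "subject_template": "http://www.ex.com/Person/{ID}",
    "class": "foaf:Person",
    "predicate": "foaf:name",
    "object_column": "NAME",
}


def expand(template, row):
    """Replace each {COLUMN} placeholder with the percent-encoded column value."""
    return re.sub(
        r"\{(\w+)\}",
        lambda m: quote(str(row[m.group(1)]), safe=""),
        template,
    )


def apply_map(tm, rows):
    """Yield the triples one triples map produces for the given rows."""
    for row in rows:
        subject = "<" + expand(tm["subject_template"], row) + ">"
        yield (subject, "rdf:type", tm["class"])
        value = row[tm["object_column"]]
        if value is not None:  # a NULL column produces no object, hence no triple
            yield (subject, tm["predicate"], f'"{value}"')


rows = [{"ID": 1, "NAME": "Alice"}, {"ID": 2, "NAME": None}]
for triple in apply_map(TRIPLES_MAP, rows):
    print(*triple, ".")
```

Swapping the template back to `http://www.ex.com/Person/ID={ID}` and the class and predicate to the table and column IRIs reproduces the direct-mapping style output.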
  • 58. What if …
      Person
        ID  NAME   GENDER
        1   Alice  F
        2   Bob    M
      Desired triples:
        <Person/1> foaf:name "Alice" .
        <Person/1> rdf:type <Woman> .
      R2RML View:
        SELECT ID, NAME FROM Person WHERE GENDER = 'F'
  • 59. R2RML View: query instead of table
      @prefix rr: <http://www.w3.org/ns/r2rml#> .
      @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      <TriplesMap1> a rr:TriplesMap;
          rr:logicalTable [ rr:sqlQuery """SELECT ID, NAME FROM Person WHERE gender = 'F'""" ];
          rr:subjectMap [
              rr:template "http://www.ex.com/Person/{ID}";
              rr:class <http://www.ex.com/Woman>
          ];
          rr:predicateObjectMap [
              rr:predicate foaf:name;
              rr:objectMap [ rr:column "NAME" ]
          ] .
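The effect of using a query as the logical table can be tried out with an in-memory SQLite database. This sketch hard-codes the subject template, class and predicate from the mapping above rather than reading them from an R2RML file, so it shows the idea only; the function name `view_triples` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (ID INTEGER PRIMARY KEY, NAME TEXT, GENDER TEXT)")
conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?)",
    [(1, "Alice", "F"), (2, "Bob", "M")],
)

# The R2RML view: the logical table is the result of this query.
SQL_QUERY = "SELECT ID, NAME FROM Person WHERE GENDER = 'F'"


def view_triples(connection):
    """Generate the triples of the mapping, one logical-table row at a time."""
    triples = []
    for person_id, name in connection.execute(SQL_QUERY):
        subject = f"<http://www.ex.com/Person/{person_id}>"
        triples.append((subject, "rdf:type", "<http://www.ex.com/Woman>"))
        triples.append((subject, "foaf:name", f'"{name}"'))
    return triples


for triple in view_triples(conn):
    print(*triple, ".")
```

Only the row for Alice passes the WHERE clause, so Bob never becomes an instance of the Woman class.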
  • 60. Quick Overview of R2RML
      • Manual and Customizable Language
      • Learning Curve
      • Direct Mapping bootstraps R2RML
      • RDF represents the structure and ontology of mapping author's choice
  • 61. W3C R2RML Details
      • Logical Tables: What is being mapped
      • Term Maps: How to create RDF terms
      • How to create Triples from a table
      • How to create Triples between two tables
      • Languages
      • Datatypes
  • 62. R2RML Mapping
      Input Database → Logical Table → R2RML Mapping
      Logical Table = existing table or view in database
      R2RML View = SQL Query
  • 63. R2RML Mapping
      Student
        sid  name    pid
        1    Juan    100
        2    Martin  200
      Professor
        pid  name
        100  Dan
        200  Marcelo
      Output of the R2RML Mapping:
        ex:Student1 rdf:type ex:Student .
        ex:Student2 rdf:type ex:Student .
        ex:Professor100 rdf:type ex:Professor .
        ex:Professor200 rdf:type ex:Professor .
        ex:Student1 foaf:name "Juan" .
        …
  • 64. R2RML Mapping
      • An R2RML Mapping M consists of a finite set TM of TriplesMaps.
      • Each TM ∈ TM consists of a tuple (LT, SM, POM)
        – LT: LogicalTable
        – SM: SubjectMap
        – POM: PredicateObjectMap
      • Each POM ∈ POM consists of a pair (PM, OM)*
        – PM: PredicateMap
        – OM: ObjectMap
      * For simplicity
  • 65. R2RML Mapping
      • An R2RML Mapping is represented as an RDF Graph itself.
      • Associated RDFS schema
        – http://www.w3.org/ns/r2rml
      • Turtle is the recommended syntax
  • 66. LogicalTable
      • Tabular data mapped to RDF
        – rr:logicalTable
      1. Existing Relational table or view
        – rr:tableName
      2. R2RML (SQL) View
        – rr:sqlQuery
  • 67. @prefix rr: <http://www.w3.org/ns/r2rml#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . <TriplesMap1> a rr:TriplesMap; rr:logicalTable [ rr:tableName "Person" ]; rr:subjectMap [ rr:template "http://www.ex.com/Person/{ID}"; rr:class foaf:Person ]; rr:predicateObjectMap [ rr:predicate foaf:name; rr:objectMap [ rr:column "NAME" ] ] .
  • 68. @prefix rr: <http://www.w3.org/ns/r2rml#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . <TriplesMap1> a rr:TriplesMap; rr:logicalTable [ rr:sqlQuery """SELECT ID, NAME FROM Person WHERE gender = 'F'""" ]; rr:subjectMap [ rr:template "http://www.ex.com/Person/{ID}"; rr:class <http://www.ex.com/Woman> ]; rr:predicateObjectMap [ rr:predicate foaf:name; rr:objectMap [ rr:column "NAME" ] ] .
  • 70. How to create the RDF terms that make up RDF triples? • An RDF term is either an IRI, a blank node, or a literal • Answer: 1. Constant value 2. Value in the database a. Raw value in a column b. Column value applied to a template
  • 71. TermMap • A TermMap is a function that generates an RDF term from a logical table row. • An RDF term is either an IRI, a blank node, or a literal • Logical table row → TermMap → RDF term (IRI, blank node, or literal)
  • 72. TermMap • A TermMap must be exactly one of the following – Constant-valued TermMap – Column-valued TermMap – Template-valued TermMap • Since TermMaps are used to create S, P, and O, there are – 3 ways to create a subject – 3 ways to create a predicate – 3 ways to create an object
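The three kinds of TermMap can be sketched as functions from a logical table row to a term. This is an illustrative Python model, not part of R2RML or of any R2RML processor: the names `constant_valued`, `column_valued`, and `template_valued` are hypothetical, terms are kept as plain strings, and Python's `str.format` stands in for R2RML's template syntax.

```python
from typing import Callable, Dict

Row = Dict[str, object]          # one logical table row: column name -> value
TermMap = Callable[[Row], str]   # a term map turns a row into an RDF term (a string here)


def constant_valued(term: str) -> TermMap:
    """Ignore the row and always return the same term (rr:constant)."""
    return lambda row: term


def column_valued(column: str) -> TermMap:
    """Return the value of one named column of the row (rr:column)."""
    return lambda row: str(row[column])


def template_valued(template: str) -> TermMap:
    """Fill {COLUMN} placeholders with the row's values (rr:template)."""
    return lambda row: template.format(**row)


# Hypothetical row from a Person table.
row = {"ID": 7, "NAME": "Alice"}
triple = (
    template_valued("http://www.ex.com/Person/{ID}")(row),    # subject
    constant_valued("http://xmlns.com/foaf/0.1/name")(row),   # predicate
    column_valued("NAME")(row),                                # object
)
```

Each position of the triple could use any of the three constructors, which is where the "3 ways" per position comes from.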
  • 73. How many ways to create a triple? • Subject, predicate, and object can each be created by a template-valued, constant-valued, or column-valued TermMap • 3 subject choices × 3 predicate choices × 3 object choices = 27 combinations
  • 74. Constant-valued TermMap • A TermMap that ignores the logical table row and always generates the same RDF term • rr:constant • Commonly used to generate constant IRIs as the predicate
  • 75. @prefix rr: <http://www.w3.org/ns/r2rml#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . <TriplesMap1> a rr:TriplesMap; rr:logicalTable [ rr:tableName "Person" ]; rr:subjectMap [ rr:template "http://www.ex.com/Person/{ID}"; rr:class foaf:Person ]; rr:predicateObjectMap [ rr:predicateMap [ rr:constant foaf:name ]; rr:objectMap [ rr:column "NAME" ] ] .
  • 76. Column-valued TermMap • A TermMap that generates a term from the value of a named column in a logical table row • rr:column • Commonly used to generate literals as the object
  • 77. @prefix rr: <http://www.w3.org/ns/r2rml#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . <TriplesMap1> a rr:TriplesMap; rr:logicalTable [ rr:tableName "Person" ]; rr:subjectMap [ rr:template "http://www.ex.com/Person/{ID}"; rr:class foaf:Person ]; rr:predicateObjectMap [ rr:predicateMap [ rr:constant foaf:name ]; rr:objectMap [ rr:column "NAME" ] ] .
  • 78. Template-valued TermMap • A TermMap that maps the column values of a set of column names to a string template. • A string template is a format that can be used to build strings from multiple components. • rr:template • Commonly used to generate IRIs as the subject or to concatenate different attributes
  • 79. @prefix rr: <http://www.w3.org/ns/r2rml#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . <TriplesMap1> a rr:TriplesMap; rr:logicalTable [ rr:tableName "Person" ]; rr:subjectMap [ rr:template "http://www.ex.com/Person/{ID}"; rr:class foaf:Person ]; rr:predicateObjectMap [ rr:predicateMap [ rr:constant foaf:name ]; rr:objectMap [ rr:column "NAME" ] ] .
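A template can produce either an IRI or a literal, and the two cases differ in how column values are inserted: for IRIs the values need to be made IRI-safe, while for literals they are used as they are. The sketch below illustrates that difference under stated assumptions: `expand_template` is a hypothetical helper, `urllib.parse.quote` is used as an approximation of IRI-safe encoding (it also escapes non-ASCII characters, which a real processor may leave alone), and escaped braces in templates are not handled.

```python
import re
from urllib.parse import quote

# Matches {COLUMN_NAME} placeholders in a template string.
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def expand_template(template, row, iri_safe=True):
    """Replace each {COLUMN} with the row's value, percent-encoding it for IRIs."""
    def fill(match):
        value = str(row[match.group(1)])
        return quote(value, safe="") if iri_safe else value
    return PLACEHOLDER.sub(fill, template)


# Hypothetical row; the column names follow the slides' examples.
row = {"ID": 42, "FIRST_NAME": "Ada", "LAST_NAME": "Lovelace"}

iri = expand_template("http://www.ex.com/Person/{ID}", row)
literal = expand_template("{FIRST_NAME} {LAST_NAME}", row, iri_safe=False)
encoded = expand_template("http://ex.com/{FIRST_NAME}", {"FIRST_NAME": "Ada L"})
```

Here `encoded` shows why the distinction matters: the space in the column value is percent-encoded when the template builds an IRI.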
  • 80. Commonly used… • … but any of these TermMaps can be used to create any RDF term (s, p, o). Recall: – 3 ways to create a subject – 3 ways to create a predicate – 3 ways to create an object • Template-valued TermMaps are commonly used to create an IRI for a subject, but can also be used to create a literal for an object. • How do we specify the kind of term (IRI or literal in this case)?
  • 81. TermType • Specifies the type of term that a TermMap should generate • Forces what the RDF term should be • Three TermTypes: – rr:IRI – rr:BlankNode – rr:Literal
  • 82. @prefix rr: <http://www.w3.org/ns/r2rml#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . <TriplesMap1> a rr:TriplesMap; rr:logicalTable [ rr:tableName "Person" ]; rr:subjectMap [ rr:template "http://www.ex.com/Person/{ID}"; rr:class foaf:Person ]; rr:predicateObjectMap [ rr:predicateMap [ rr:constant foaf:name ]; rr:objectMap [ rr:template "{FIRST_NAME} {LAST_NAME}"; rr:termType rr:Literal ] ] .
  • 83. @prefix rr: <http://www.w3.org/ns/r2rml#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . <TriplesMap1> a rr:TriplesMap; rr:logicalTable [ rr:tableName "Person" ]; rr:subjectMap [ rr:template "person{ID}"; rr:termType rr:BlankNode; rr:class foaf:Person ]; rr:predicateObjectMap [ rr:predicateMap [ rr:constant foaf:name ]; rr:objectMap [ rr:column "NAME" ] ] .
  • 84. TermType (cont…) • Can only be applied to template-valued and column-valued TermMaps • Applying it to a constant-valued TermMap has no effect – i.e., if the constant is an IRI, the term type is automatically IRI
  • 85. TermType Rules • If the TermMap is for a 1. Subject → TermType = IRI or Blank Node 2. Predicate → TermType = IRI 3. Object → TermType = IRI, Blank Node, or Literal
  • 86. TermType is Optional • If a TermType is not specified, then – Default = IRI – Unless the TermMap is for an object and is column-valued, or has a language tag or a specified datatype; then the TermType is Literal • That is why a template in an ObjectMap always generates an IRI unless rr:termType rr:Literal is specified.
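The default rule above can be written as a small decision function. This is a sketch of the rule as stated on the slide, with a hypothetical function name and string-encoded arguments; it covers only column-valued and template-valued TermMaps, since a constant determines its own term type.

```python
def default_term_type(position, kind, has_language=False, has_datatype=False):
    """Term type used when rr:termType is absent.

    position: "subject", "predicate", or "object"
    kind:     "column" or "template"
    """
    if kind not in ("column", "template"):
        raise ValueError("the default rule applies to column- and template-valued term maps")
    # Literal only for object maps that are column-valued or carry a
    # language tag or datatype; everything else defaults to IRI.
    if position == "object" and (kind == "column" or has_language or has_datatype):
        return "rr:Literal"
    return "rr:IRI"
```

For example, `default_term_type("object", "template")` yields `"rr:IRI"`, which is why a name built from `{FIRST_NAME} {LAST_NAME}` needs an explicit `rr:termType rr:Literal`.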
  • 87. rr:predicateObjectMap [ rr:predicateMap [ rr:constant foaf:name ]; rr:objectMap [ rr:template "{FIRST_NAME} {LAST_NAME}"; rr:termType rr:Literal ] ] • rr:predicateObjectMap [ rr:predicateMap [ rr:constant ex:role ]; rr:objectMap [ rr:template "http://ex.com/role/{role}" ] ] • rr:predicateObjectMap [ rr:predicateMap [ rr:constant foaf:name ]; rr:objectMap [ rr:template "{FIRST_NAME} {LAST_NAME}" ] ]
  • 89. NOW WE HAVE THE ELEMENTS TO CREATE TRIPLES
  • 90. Generating an RDF Triple • TermMaps specify what the RDF term should be for S, P, and O – SubjectMap – PredicateMap – ObjectMap
  • 91. SubjectMap • A SubjectMap is a TermMap • rr:subjectMap • Specifies what the subject of a triple should be • 3 ways to create a subject – Template-valued TermMap – Column-valued TermMap – Constant-valued TermMap • Has to be an IRI or a blank node
  • 92. SubjectMap • SubjectMaps are usually template-valued TermMaps • Use cases for a column-valued TermMap – Use a column value to create a blank node – The IRI already exists as a column value • Use case for a constant-valued TermMap – The same subject for all tuples: <CompanyABC> <consistsOf> <Dep{id}>
  • 93. SubjectMap • Optionally, a SubjectMap may have one or more class IRIs associated with it – This generates rdf:type triples • rr:class
  • 94. @prefix rr: <http://www.w3.org/ns/r2rml#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . <TriplesMap1> a rr:TriplesMap; rr:logicalTable [ rr:tableName "Person" ]; rr:subjectMap [ rr:template "http://www.ex.com/Person/{ID}"; rr:class foaf:Person ]; rr:predicateObjectMap [ rr:predicate foaf:name; rr:objectMap [ rr:column "NAME" ] ] . • Callout: Optional
  • 95. PredicateObjectMap • A function that creates one or more predicate-object pairs for each logical table row. • rr:predicateObjectMap • It is used in conjunction with a SubjectMap to generate RDF triples in a TriplesMap. • A PredicateObjectMap consists of – One or more PredicateMaps – One or more ObjectMaps or ReferencingObjectMaps
  • 96. @prefix rr: <http://www.w3.org/ns/r2rml#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . <TriplesMap1> a rr:TriplesMap; rr:logicalTable [ rr:tableName "Person" ]; rr:subjectMap [ rr:template "http://www.ex.com/Person/{ID}"; rr:class foaf:Person ]; rr:predicateObjectMap [ rr:predicateMap [ rr:constant foaf:name ]; rr:objectMap [ rr:column "NAME" ] ] .
  • 97. PredicateMap • A PredicateMap is a TermMap • rr:predicateMap • Specifies what the predicate of an RDF triple should be • 3 ways to create a predicate – Template-valued TermMap – Column-valued TermMap – Constant-valued TermMap • Has to be an IRI
  • 98. PredicateMap • PredicateMaps are usually constant-valued TermMaps • Use case for a column-valued TermMap – … • Use case for a template-valued TermMap – …
  • 99. @prefix rr: <http://www.w3.org/ns/r2rml#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . <TriplesMap1> a rr:TriplesMap; rr:logicalTable [ rr:tableName "Person" ]; rr:subjectMap [ rr:template "http://www.ex.com/Person/{ID}"; rr:class foaf:Person ]; rr:predicateObjectMap [ rr:predicateMap [ rr:constant foaf:name ]; rr:objectMap [ rr:column "NAME" ] ] .
  • 100. @prefix rr: <http://www.w3.org/ns/r2rml#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . <TriplesMap1> a rr:TriplesMap; rr:logicalTable [ rr:tableName "Person" ]; rr:subjectMap [ rr:template "http://www.ex.com/Person/{ID}"; rr:class foaf:Person ]; rr:predicateObjectMap [ rr:predicate foaf:name; rr:objectMap [ rr:column "NAME" ] ] . • Callout: Shortcut!
  • 101. Constant Shortcut Properties • ?x rr:predicate ?y is shorthand for ?x rr:predicateMap [ rr:constant ?y ] • ?x rr:subject ?y is shorthand for ?x rr:subjectMap [ rr:constant ?y ] • ?x rr:object ?y is shorthand for ?x rr:objectMap [ rr:constant ?y ]
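The shortcut expansion is mechanical, as the following sketch shows. It assumes a simplified, hypothetical representation in which a mapping node is a Python dict from property names to values; a real processor would operate on the RDF graph of the mapping instead.

```python
# Shortcut property -> the full property it abbreviates.
SHORTCUTS = {
    "rr:subject": "rr:subjectMap",
    "rr:predicate": "rr:predicateMap",
    "rr:object": "rr:objectMap",
}


def expand_shortcuts(node):
    """Rewrite constant shortcut properties into constant-valued term maps."""
    expanded = {}
    for prop, value in node.items():
        if prop in SHORTCUTS:
            expanded[SHORTCUTS[prop]] = {"rr:constant": value}
        else:
            expanded[prop] = value
    return expanded


# A predicate-object map written with the rr:predicate shortcut.
pom = {"rr:predicate": "foaf:name", "rr:objectMap": {"rr:column": "NAME"}}
long_form = expand_shortcuts(pom)
```

After expansion, `long_form` has the same shape as the mapping that spells out `rr:predicateMap [ rr:constant foaf:name ]`.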
  • 102. ObjectMap • An ObjectMap is a TermMap • rr:objectMap • Specifies what the object of a triple should be • 3 ways to create an object – Template-valued TermMap – Column-valued TermMap – Constant-valued TermMap • Has to be an IRI, a literal, or a blank node
  • 103. ObjectMap • ObjectMaps are usually column-valued TermMaps • Use cases for a template-valued TermMap – Concatenate values – Create IRIs • Use case for a constant-valued TermMap – All rows in a table share a role
  • 104. @prefix rr: <http://www.w3.org/ns/r2rml#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . <TriplesMap1> a rr:TriplesMap; rr:logicalTable [ rr:tableName "Person" ]; rr:subjectMap [ rr:template "http://www.ex.com/Person/{ID}"; rr:class foaf:Person ]; rr:predicateObjectMap [ rr:predicateMap [ rr:constant foaf:name ]; rr:objectMap [ rr:column "NAME" ] ] .
  • 106. Example 1 • We now have sufficient elements to create a mapping that will generate – A subject IRI – rdf:type triple(s) • Student(sid, name, pid): (1, Juan, 100), (2, Martin, 200) • Output of the TriplesMap: @prefix ex: <http://example.com/ns/> . ex:Student1 rdf:type ex:Student . ex:Student2 rdf:type ex:Student .
  • 107. Example 1 • @prefix rr: <http://www.w3.org/ns/r2rml#> . @prefix ex: <http://example.com/ns/> . <#TriplesMap1> rr:logicalTable [ rr:tableName "Student" ]; rr:subjectMap [ rr:template "http://example.com/ns/Student{sid}"; rr:class ex:Student ] . • The Logical Table is a table name • The SubjectMap is a template-valued TermMap and has one class IRI • Student(sid, name, pid): (1, Juan, 100), (2, Martin, 200) • Output of the TriplesMap: ex:Student1 rdf:type ex:Student . ex:Student2 rdf:type ex:Student . • ρ1: Student(s, x, y) ∧ p = rdf:type ∧ o = ex:Student → triple(s, p, o), where Student(s, x, y) is the query over R that yields the subject, p = rdf:type fixes the predicate, and o = ex:Student fixes the object
  • 108. Class RDB2RDF Rule • Given a relational schema R, a class RDB2RDF rule ρ over R is a first-order formula of the form: ∀s ∀p ∀o ∀x̄ α(s, x̄) ∧ p = type ∧ o = c → triple(s, p, o), where α(s, x̄) is a query over R, c ∈ D, and D is a countably infinite domain of constants • Example: ρ1: Student(s, x, y) ∧ p = rdf:type ∧ o = ex:Student → triple(s, p, o)
  • 109. Example 2 • Student(sid, name, pid): (1, Juan, 100), (2, Martin, 200) • Output of the TriplesMap: @prefix ex: <http://example.com/ns/> . ex:Student1 rdf:type ex:Student . ex:Student1 ex:name "Juan" . ex:Student2 rdf:type ex:Student . ex:Student2 ex:name "Martin" .
  • 110. Example 2 • @prefix rr: <http://www.w3.org/ns/r2rml#> . @prefix ex: <http://example.com/ns/> . <#TriplesMap1> rr:logicalTable [ rr:tableName "Student" ]; rr:subjectMap [ rr:template "http://example.com/ns/Student{sid}"; rr:class ex:Student ]; rr:predicateObjectMap [ rr:predicate ex:name; rr:objectMap [ rr:column "name" ] ] . • The Logical Table is a table name • The SubjectMap is a template-valued TermMap and has one class IRI • The PredicateObjectMap has a PredicateMap that is a constant-valued TermMap and an ObjectMap that is a column-valued TermMap • ρ2: Student(s, o, y) ∧ p = ex:name → triple(s, p, o)
  • 111. Predicate RDB2RDF Rule • Given a relational schema R, a predicate RDB2RDF rule ρ over R is a first-order formula of the form: ∀s ∀p ∀o ∀x̄ β(s, o, x̄) ∧ p = c → triple(s, p, o), where β(s, o, x̄) is a query over R, c ∈ D, and D is a countably infinite domain of constants • Example: ρ2: Student(s, o, y) ∧ p = ex:name → triple(s, p, o)
  • 112. RDB2RDF Mapping • An RDB2RDF mapping M over R is a finite set of class or predicate RDB2RDF rules over R • M = {ρ1, ρ2} • ρ1: Student(s, x, y) ∧ p = rdf:type ∧ o = ex:Student → triple(s, p, o) • ρ2: Student(s, o, y) ∧ p = ex:name → triple(s, p, o) • @prefix rr: <http://www.w3.org/ns/r2rml#> . @prefix ex: <http://example.com/ns/> . <#TriplesMap1> rr:logicalTable [ rr:tableName "Student" ]; rr:subjectMap [ rr:template "http://example.com/ns/Student{sid}"; rr:class ex:Student ]; rr:predicateObjectMap [ rr:predicate ex:name; rr:objectMap [ rr:column "name" ] ] .
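To make the rule semantics concrete, the sketch below evaluates ρ1 and ρ2 over the two Student rows from the examples. It is an illustration only: the function names are hypothetical, terms are written as prefixed-name strings, and `subject_iri` adds the IRI construction step that the rules leave implicit by using the key value s directly.

```python
# Student(sid, name, pid), as in the slides' examples.
student = [(1, "Juan", 100), (2, "Martin", 200)]


def subject_iri(sid):
    """Build the subject term from the key (assumed naming: ex:Student<sid>)."""
    return f"ex:Student{sid}"


def rho1(rows):
    """Class rule: Student(s, x, y) AND p = rdf:type AND o = ex:Student -> triple(s, p, o)."""
    return {(subject_iri(sid), "rdf:type", "ex:Student") for sid, _name, _pid in rows}


def rho2(rows):
    """Predicate rule: Student(s, o, y) AND p = ex:name -> triple(s, p, o)."""
    return {(subject_iri(sid), "ex:name", name) for sid, name, _pid in rows}


def apply_mapping(rules, rows):
    """A mapping is a finite set of rules; its output is the union of their triples."""
    triples = set()
    for rule in rules:
        triples |= rule(rows)
    return triples


triples = apply_mapping([rho1, rho2], student)
```

The result contains the four triples listed for Example 2: one rdf:type triple and one ex:name triple per student.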
  • 113. Example 3 • Student(sid, name, pid): (1, Juan, 100), (2, Martin, 200) • Output of the TriplesMap: @prefix ex: <http://example.com/ns/> . ex:Student1 rdf:type ex:Student . ex:Student1 ex:comment "Juan is a Student" . ex:Student2 rdf:type ex:Student . ex:Student2 ex:comment "Martin is a Student" .
  • 114. Example 3 • @prefix rr: <http://www.w3.org/ns/r2rml#> . @prefix ex: <http://example.com/ns/> . <#TriplesMap1> rr:logicalTable [ rr:tableName "Student" ]; rr:subjectMap [ rr:template "http://example.com/ns/Student{sid}"; rr:class ex:Student ]; rr:predicateObjectMap [ rr:predicate ex:comment; rr:objectMap [ rr:template "{name} is a Student"; rr:termType rr:Literal ] ] . • The Logical Table is a table name • The SubjectMap is a template-valued TermMap and has one class IRI • The PredicateObjectMap has a PredicateMap that is a constant-valued TermMap and an ObjectMap that is a template-valued TermMap with a TermType
  • 115. Example 4 • Student(sid, name, pid): (1, Juan, 100), (2, Martin, 200) • Output of the TriplesMap: @prefix ex: <http://example.com/ns/> . ex:Student1 rdf:type ex:Student . ex:Student1 ex:webpage <http://ex.com/Juan> . ex:Student2 rdf:type ex:Student . ex:Student2 ex:webpage <http://ex.com/Martin> .
  • 116. Example 4 • @prefix rr: <http://www.w3.org/ns/r2rml#> . @prefix ex: <http://example.com/ns/> . <#TriplesMap1> rr:logicalTable [ rr:tableName "Student" ]; rr:subjectMap [ rr:template "http://example.com/ns/Student{sid}"; rr:class ex:Student ]; rr:predicateObjectMap [ rr:predicate ex:webpage; rr:objectMap [ rr:template "http://ex.com/{name}" ] ] . • The Logical Table is a table name • The SubjectMap is a template-valued TermMap and has one class IRI • The PredicateObjectMap has a PredicateMap that is a constant-valued TermMap and an ObjectMap that is a template-valued TermMap • Note that there is no TermType, so the object is an IRI
  • 117. Example 5 • Student(sid, name, pid): (1, Juan, 100), (2, Martin, 200) • Output of the TriplesMap: @prefix ex: <http://example.com/ns/> . ex:Student1 rdf:type ex:Student . ex:Student1 ex:studentType ex:GradStudent . ex:Student2 rdf:type ex:Student . ex:Student2 ex:studentType ex:GradStudent .
  • 118. Example 5 • @prefix rr: <http://www.w3.org/ns/r2rml#> . @prefix ex: <http://example.com/ns/> . <#TriplesMap1> rr:logicalTable [ rr:tableName "Student" ]; rr:subjectMap [ rr:template "http://example.com/ns/Student{sid}"; rr:class ex:Student ]; rr:predicateObjectMap [ rr:predicate ex:studentType; rr:object ex:GradStudent ] . • The Logical Table is a table name • The SubjectMap is a template-valued TermMap and has one class IRI • The PredicateObjectMap has a PredicateMap that is a constant-valued TermMap and an ObjectMap that is a constant-valued TermMap
  • 119. RefObjectMap • A RefObjectMap (Referencing ObjectMap) allows using the subject of another TriplesMap as the object generated by an ObjectMap. • rr:objectMap • A RefObjectMap is defined by – Exactly one parent TriplesMap, which must be a TriplesMap – May have one or more JoinConditions
  • 120. <TriplesMap1> a rr:TriplesMap; rr:logicalTable [ rr:tableName "Person" ]; rr:subjectMap [ rr:template "http://www.ex.com/Person/{ID}"; rr:class foaf:Person ]; rr:predicateObjectMap [ rr:predicate foaf:based_near; rr:objectMap [ rr:parentTriplesMap <TriplesMap2>; rr:joinCondition [ rr:child "CID"; rr:parent "CID" ] ] ] . <TriplesMap2> a rr:TriplesMap; rr:logicalTable [ rr:tableName "City" ]; rr:subjectMap [ rr:template "http://ex.com/City/{CID}"; rr:class ex:City ]; rr:predicateObjectMap [ rr:predicate foaf:name; rr:objectMap [ rr:column "TITLE" ] ] . • Callout: RefObjectMap
  • 121. Parent TriplesMap • The referenced TriplesMap, whose subjects are used as objects • rr:parentTriplesMap • <TriplesMap1> a rr:TriplesMap; rr:logicalTable [ rr:tableName "Person" ]; rr:subjectMap [ rr:template "http://www.ex.com/Person/{ID}"; rr:class foaf:Person ]; rr:predicateObjectMap [ rr:predicate foaf:based_near; rr:objectMap [ rr:parentTriplesMap <TriplesMap2>; rr:joinCondition [ rr:child "CID"; rr:parent "CID" ] ] ] . • Callout: Parent TriplesMap
  • 122. JoinCondition • Join between child and parent attributes • rr:joinCondition • <TriplesMap1> a rr:TriplesMap; rr:logicalTable [ rr:tableName "Person" ]; rr:subjectMap [ rr:template "http://www.ex.com/Person/{ID}"; rr:class foaf:Person ]; rr:predicateObjectMap [ rr:predicate foaf:based_near; rr:objectMap [ rr:parentTriplesMap <TriplesMap2>; rr:joinCondition [ rr:child "CID"; rr:parent "CID" ] ] ] . • Callout: JoinCondition
  • 123. <TriplesMap1> a rr:TriplesMap; rr:logicalTable [ rr:tableName "Person" ]; rr:subjectMap [ rr:template "http://www.ex.com/Person/{ID}"; rr:class foaf:Person ]; rr:predicateObjectMap [ rr:predicate foaf:based_near; rr:objectMap [ rr:parentTriplesMap <TriplesMap2>; rr:joinCondition [ rr:child "CID"; rr:parent "CID" ] ] ] . <TriplesMap2> a rr:TriplesMap; rr:logicalTable [ rr:tableName "City" ]; rr:subjectMap [ rr:template "http://ex.com/City/{CID}"; rr:class ex:City ]; rr:predicateObjectMap [ rr:predicate foaf:name; rr:objectMap [ rr:column "TITLE" ] ] . • Callouts: RefObjectMap, Parent TriplesMap, JoinCondition
  • 124. JoinCondition • Child column: must be a column name that exists in the logical table of the TriplesMap that contains the RefObjectMap • Parent column: must be a column name that exists in the logical table of the RefObjectMap's parent TriplesMap • <TriplesMap1> a rr:TriplesMap; rr:logicalTable [ rr:tableName "Person" ]; ... rr:predicateObjectMap [ rr:predicate foaf:based_near; rr:objectMap [ rr:parentTriplesMap <TriplesMap2>; rr:joinCondition [ rr:child "CID"; rr:parent "CID" ] ] ] . <TriplesMap2> a rr:TriplesMap; rr:logicalTable [ rr:tableName "City" ]; ... .
  • 125. JoinCondition • Child query – The child query of a RefObjectMap is the LogicalTable of the TriplesMap containing the RefObjectMap • Parent query – The parent query of a RefObjectMap is the LogicalTable of the parent TriplesMap • If the child query and parent query are not identical, then a JoinCondition must exist • <TriplesMap1> a rr:TriplesMap; rr:logicalTable [ rr:tableName "Person" ]; ... rr:predicateObjectMap [ rr:predicate foaf:based_near; rr:objectMap [ rr:parentTriplesMap <TriplesMap2>; rr:joinCondition [ rr:child "CID"; rr:parent "CID" ] ] ] . <TriplesMap2> a rr:TriplesMap; rr:logicalTable [ rr:tableName "City" ]; ... .
  • 127. Example 7 • Student(sid, name, pid): (1, Juan, 100), (2, Martin, 200) • Professor(pid, name): (100, Dan), (200, Marcelo) • Output of the R2RML mapping: ex:Student1 rdf:type ex:Student . ex:Student2 rdf:type ex:Student . ex:Professor100 rdf:type ex:Professor . ex:Professor200 rdf:type ex:Professor . ex:Student1 ex:hasAdvisor ex:Professor100 . ex:Student2 ex:hasAdvisor ex:Professor200 . • ρ1: Student(s, x, o) ∧ Professor(o, z) ∧ p = ex:hasAdvisor → triple(s, p, o) • ρ2: Student(s, x, y) ∧ p = rdf:type ∧ o = ex:Student → triple(s, p, o) • ρ3: Professor(s, x) ∧ p = rdf:type ∧ o = ex:Professor → triple(s, p, o)
  • 128. @prefix rr: <http://www.w3.org/ns/r2rml#> . @prefix ex: <http://example.com/ns/> . <#TriplesMap1> rr:logicalTable [ rr:tableName "Student" ]; rr:subjectMap [ rr:template "http://example.com/ns/Student{sid}"; rr:class ex:Student ]; rr:predicateObjectMap [ rr:predicate ex:hasAdvisor; rr:objectMap [ rr:parentTriplesMap <#TriplesMap2>; rr:joinCondition [ rr:child "pid"; rr:parent "pid" ] ] ] . <#TriplesMap2> rr:logicalTable [ rr:tableName "Professor" ]; rr:subjectMap [ rr:template "http://example.com/ns/Professor{pid}"; rr:class ex:Professor ] . • Callouts: RefObjectMap, Parent TriplesMap, JoinCondition
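The effect of a RefObjectMap with a JoinCondition can be sketched as a join between the child and parent logical tables, where the object of each triple is the subject generated by the parent TriplesMap. The helper `ref_object_triples` and the use of prefixed names as strings are assumptions made for illustration; a nested loop is used for clarity rather than efficiency.

```python
# Rows of the two logical tables from the Student/Professor example.
student = [
    {"sid": 1, "name": "Juan", "pid": 100},
    {"sid": 2, "name": "Martin", "pid": 200},
]
professor = [
    {"pid": 100, "name": "Dan"},
    {"pid": 200, "name": "Marcelo"},
]


def ref_object_triples(child_rows, parent_rows, child_col, parent_col,
                       predicate, child_subject, parent_subject):
    """Join child and parent rows; the parent's subject becomes the object."""
    triples = []
    for child in child_rows:
        for parent in parent_rows:
            if child[child_col] == parent[parent_col]:   # the join condition
                triples.append((child_subject(child), predicate, parent_subject(parent)))
    return triples


advisor_triples = ref_object_triples(
    student, professor, "pid", "pid", "ex:hasAdvisor",
    child_subject=lambda r: f"ex:Student{r['sid']}",
    parent_subject=lambda r: f"ex:Professor{r['pid']}",
)
```

This corresponds to rule ρ1 of Example 7, where the shared variable o expresses the join between Student and Professor.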
  • 129. Summary
  • 130. Languages • A TermMap with a TermType of rr:Literal may have a language tag • rr:language • <#TriplesMap1> rr:logicalTable [ rr:tableName "Student" ]; rr:subjectMap [ rr:template "http://example.com/ns/Student{sid}"; rr:class ex:Student ]; rr:predicateObjectMap [ rr:predicate ex:comment; rr:objectMap [ rr:column "comment"; rr:language "en" ] ] .
  • 131. Student(sid, name, comment): (1, Juan, Excellent Student), (2, Martin, Wonderful student) • Output: @prefix ex: <http://example.com/ns/> . ex:Student1 rdf:type ex:Student . ex:Student1 ex:comment "Excellent Student"@en . ex:Student2 rdf:type ex:Student . ex:Student2 ex:comment "Wonderful student"@en .
  • 132. Issue with Languages • What happens if the language value is in the data? • COUNTRY(ID, COUNTRY_ID, LABEL, LANG): (1, 1, United States, en), (2, 1, Estados Unidos, es), (3, 2, England, en), (4, 2, Inglaterra, es)
  • 133. COUNTRY(ID, COUNTRY_ID, LABEL, LANG): (1, 1, United States, en), (2, 1, Estados Unidos, es), (3, 2, England, en), (4, 2, Inglaterra, es) • How can this output be generated? @prefix ex: <http://example.com/ns/> . ex:country1 rdfs:label "United States"@en . ex:country1 rdfs:label "Estados Unidos"@es . ex:country2 rdfs:label "England"@en . ex:country2 rdfs:label "Inglaterra"@es .
  • 134. Issue with Languages • One mapping for each language • <#TripleMap_Countries_EN> a rr:TriplesMap; rr:logicalTable [ rr:sqlQuery """SELECT COUNTRY_ID, LABEL FROM COUNTRY WHERE LANG = 'en'""" ]; rr:subjectMap [ rr:template "http://example.com/ns/country{COUNTRY_ID}" ]; rr:predicateObjectMap [ rr:predicate rdfs:label; rr:objectMap [ rr:column "LABEL"; rr:language "en" ] ] .
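Because rr:language takes a fixed tag, the workaround is one TriplesMap per language, each over a query filtered on LANG. The sketch below mimics that with an in-memory SQLite database: it discovers the distinct languages and runs one filtered query per language. The function `triples_for_language` and the string encoding of terms are hypothetical, and SQLite is only a stand-in for the source RDBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE COUNTRY (ID INTEGER, COUNTRY_ID INTEGER, LABEL TEXT, LANG TEXT)")
conn.executemany(
    "INSERT INTO COUNTRY VALUES (?, ?, ?, ?)",
    [
        (1, 1, "United States", "en"),
        (2, 1, "Estados Unidos", "es"),
        (3, 2, "England", "en"),
        (4, 2, "Inglaterra", "es"),
    ],
)


def triples_for_language(lang):
    """Play the role of one per-language TriplesMap: filter rows, tag labels."""
    rows = conn.execute(
        "SELECT COUNTRY_ID, LABEL FROM COUNTRY WHERE LANG = ? ORDER BY ID", (lang,)
    )
    return [(f"ex:country{cid}", "rdfs:label", f'"{label}"@{lang}') for cid, label in rows]


# One "mapping" per language value found in the data.
languages = [lang for (lang,) in conn.execute("SELECT DISTINCT LANG FROM COUNTRY ORDER BY LANG")]
triples = [t for lang in languages for t in triples_for_language(lang)]
```

The drawback this makes visible is that the set of mappings depends on the data: a new language value in the LANG column requires a new TriplesMap.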
  • 135. Datatypes • A TermMap with a TermType of rr:Literal that does not have rr:language may have a datatype • rr:datatype • <#TriplesMap1> rr:logicalTable [ rr:tableName "Student" ]; rr:subjectMap [ rr:template "http://example.com/ns/Student{sid}"; rr:class ex:Student ]; rr:predicateObjectMap [ rr:predicate ex:startDate; rr:objectMap [ rr:column "start_date"; rr:datatype xsd:date ] ] .
  • 136. Summary of Terminology • R2RML Mapping • Logical Table • Input Database • R2RML View • TriplesMap • Logical Table Row • TermMap • TermType • SubjectMap • PredicateObjectMap • PredicateMap • ObjectMap • Constant-valued TermMap • Column-valued TermMap • Template-valued TermMap • RefObjectMap • JoinConditions • ChildQuery • ParentQuery • Language • Datatype
  • 137. SEMANTIC WEB → RELATIONAL DATABASES: DATA ACCESS
  • 138. Semantic Web → Relational Database
  • 139. ETL • RDBMS → ETL (Mapping) → RDF Graph in a Triplestore • SPARQL → Triplestore → SPARQL Results
  • 140. NoETL (Wrapper) • SPARQL → NoETL wrapper (R2RML Mapping, Virtual RDF) → SQL → RDBMS → SQL Results → SPARQL Results
  • 141. “Comparing the overall performance […] of the fastest rewriter with the fastest relational database shows an overhead for query rewriting of 106%. This is an indicator that there is still room for improving the rewriting algorithms” • [Chart: 100M triple dataset; larger numbers are better] • [Bizer and Schultz. Berlin SPARQL Benchmark 2009]
  • 142. Current rdb2rdf systems are not capable of providing the query execution performance required [...] it is likely that with more work on query translation, suitable mechanisms for translating queries could be developed. These mechanisms should focus on exploiting the underlying database system’s capabilities to optimize queries and process large quantities of structure data [Gray et al. 2009]
  • 143. https://sourceforge.net/p/d2rq-map/mailman/message/28055191/ • Sept 2011
  • 144. Why was this happening if … • ISWC 2008 • Hypothesis: Existing commercial relational databases already subsume algorithms and optimizations needed to support effective SPARQL execution on relationally stored data
  • 145. Ultrawrap [1]: SPARQL to SQL under Direct Mapping • Compile time: 1. Translate SQL schema to OWL and mapping 2. Define RDF triples as a view • Run time: 3. SPARQL to SQL translation 4. SQL optimizer creates relational query plan • [1] Ultrawrap: SPARQL execution on relational data. Sequeda & Miranker. J. WebSem 2013 • US Patent 8719252, 9396283
  • 146. Creating Tripleview • For every ontology element (class, object property, and datatype property), create a SQL SELECT query that outputs triples • SELECT 'Product'+ptID AS s, 'label' AS p, label AS o FROM Product WHERE label IS NOT NULL • Product(ptID, label, prID): (1, ACME Inc, 4), (2, Foo Bars, 5) • Result (S, P, O): (Product1, label, ACME Inc), (Product2, label, Foo Bars)
  • 147. Creating Tripleview • SELECT 'Product'+ptID AS s, ptID AS s_id, 'label' AS p, label AS o, NULL AS o_id FROM Product WHERE label IS NOT NULL • Product(ptID, label, prID): (1, ACME Inc, 4), (2, Foo Bars, 5) • Result (S, S_id, P, O, O_id): (Product1, 1, label, ACME Inc, NULL), (Product2, 2, label, Foo Bars, NULL)
  • 148. Creating Tripleview (…) • Create Tripleviews (SQL views), which are unions of the SQL SELECT queries whose objects have the same datatype • CREATE VIEW Tripleview_varchar AS SELECT 'Product'+ptID AS s, ptID AS s_id, 'label' AS p, label AS o, NULL AS o_id FROM Product UNION ALL SELECT 'Producer'+prID AS s, prID AS s_id, 'title' AS p, title AS o, NULL AS o_id FROM Producer UNION ALL … • Result (S, S_id, P, O, O_id): (Product1, 1, label, ACME Inc, NULL), (Product2, 2, label, Foo Bars, NULL), (Producer4, 4, title, Foo, NULL), (Producer5, 5, title, Bars, NULL)
  • 149. SPARQL and SQL
    • Translating a SPARQL query to a semantically equivalent SQL query
    SPARQL:
        SELECT ?label ?pnum1
        WHERE { ?x label ?label. ?x pnum1 ?pnum1. }
    → SQL:
        SELECT label, pnum1 FROM product
    SQL on Tripleview:
        SELECT t1.o AS label, t2.o AS pnum1
        FROM tripleview_varchar t1, tripleview_int t2
        WHERE t1.p = 'label' AND t2.p = 'pnum1' AND t1.s_id = t2.s_id
    What is the query plan?
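A small sketch can check that the query over the Tripleviews returns the same rows as the direct query on the base table. It assumes SQLite, a hypothetical `product` table with made-up `pnum1` and `pnum2` values, and adds the `IS NOT NULL` filters to the direct query so that both sides treat missing values the same way.

```python
import sqlite3

SETUP = """
CREATE TABLE product (id INTEGER PRIMARY KEY, label TEXT, pnum1 INTEGER, pnum2 INTEGER);
INSERT INTO product VALUES (1, 'ACME Inc', 10, 20), (2, 'Foo Bars', 30, NULL), (3, NULL, 50, 60);

CREATE VIEW tripleview_varchar AS
    SELECT 'Product' || id AS s, id AS s_id, 'label' AS p, label AS o
    FROM product WHERE label IS NOT NULL;

CREATE VIEW tripleview_int AS
    SELECT 'Product' || id AS s, id AS s_id, 'pnum1' AS p, pnum1 AS o
    FROM product WHERE pnum1 IS NOT NULL
    UNION ALL
    SELECT 'Product' || id AS s, id AS s_id, 'pnum2' AS p, pnum2 AS o
    FROM product WHERE pnum2 IS NOT NULL;
"""

# The translated SPARQL query, phrased over the Tripleviews as on the slide.
ON_TRIPLEVIEW = """
SELECT t1.o AS label, t2.o AS pnum1
FROM tripleview_varchar t1, tripleview_int t2
WHERE t1.p = 'label' AND t2.p = 'pnum1' AND t1.s_id = t2.s_id
ORDER BY label
"""

# The query a SQL developer would write directly against the table.
DIRECT = """
SELECT label, pnum1 FROM product
WHERE label IS NOT NULL AND pnum1 IS NOT NULL
ORDER BY label
"""

def compare():
    """Run both queries on the same in-memory database and return both results."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    return conn.execute(ON_TRIPLEVIEW).fetchall(), conn.execute(DIRECT).fetchall()

if __name__ == "__main__":
    via_views, direct = compare()
    print(via_views == direct, via_views)
```

The sketch only shows that the two queries agree on the answers; whether the optimizer also reduces the first plan to the second is exactly the question the following slides address.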
  • 150. Query plan over the Tripleviews:
    π t1.o AS label, t2.o AS pnum1
      σ p = 'label' on Tripleview_varchar t1, the union (∪) of:
        – Product: π Product+'id' AS s, 'label' AS p, label AS o; σ label ≠ NULL
        – Producer: π Producer+'id' AS s, 'title' AS p, title AS o; σ title ≠ NULL (CONTRADICTION)
      σ p = 'pnum1' on Tripleview_int t2, the union (∪) of:
        – Product: π Product+'id' AS s, 'pnum1' AS p, pnum1 AS o; σ pnum1 ≠ NULL
        – Product: π Product+'id' AS s, 'pnum2' AS p, pnum2 AS o; σ pnum2 ≠ NULL (CONTRADICTION)
  • 151. Detection of Unsatisfiable Conditions
    • Determine that the query result will be empty if the existence of an answer would violate some integrity constraint in the database.
    • This implies that the answer to the query is empty and therefore the database does not need to be accessed.
    Chakravarthy, Grant and Minker. (1990) Logic-Based Approach to Semantic Query Optimization.
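The idea can be illustrated with a toy sketch, which is an assumption-laden simplification and not how any RDBMS implements the optimization: each branch of a Tripleview union emits a constant in column `p`, so a filter `p = 'label'` can never be satisfied by a branch whose constant is different, and that branch can be dropped without touching the data. The `Branch` class and the `prune` function are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Branch:
    """One SELECT of a Tripleview union (hypothetical representation)."""
    table: str      # source table
    predicate: str  # constant emitted in column p
    column: str     # source column emitted as o

def prune(branches, required_predicate):
    """Keep only branches whose constant p can satisfy the filter p = required_predicate."""
    return [b for b in branches if b.predicate == required_predicate]

TRIPLEVIEW_VARCHAR = [
    Branch("Product", "label", "label"),
    Branch("Producer", "title", "title"),
]

if __name__ == "__main__":
    # The Producer branch contradicts p = 'label' and disappears from the plan.
    print(prune(TRIPLEVIEW_VARCHAR, "label"))
```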
  • 152. Query plan after removing the contradictory branches:
    π t1.o AS label, t2.o AS pnum1
      – Product: π Product+'id' AS s, 'label' AS p, label AS o; σ label ≠ NULL
      – Product: π Product+'id' AS s, 'pnum1' AS p, pnum1 AS o; σ pnum1 ≠ NULL
    Join on the same table? → REDUNDANT
  • 153. Self Join Elimination
    • If attributes from the same table are projected separately and then joined, then the join can be dropped
    Self Join Elimination of Projection:
        SELECT p1.label, p2.pnum1
        FROM product p1, product p2
        WHERE p1.id = 1 AND p1.id = p2.id
      →
        SELECT label, pnum1
        FROM product
        WHERE id = 1
    Self Join Elimination of Selection:
        SELECT p1.id
        FROM product p1, product p2
        WHERE p1.pnum1 > 100 AND p2.pnum2 < 500 AND p1.id = p2.id
      →
        SELECT id
        FROM product
        WHERE pnum1 > 100 AND pnum2 < 500
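Both rewrites can be checked on sample data with a short sketch. It assumes SQLite and a hypothetical `product` table in which `id` is the primary key; the equivalence relies on that key, because joining a table with itself on a unique column pairs every row only with itself.

```python
import sqlite3

# Each entry pairs the self-join form with the join-free form from the slide.
PAIRS = {
    "projection": (
        "SELECT p1.label, p2.pnum1 FROM product p1, product p2 "
        "WHERE p1.id = 1 AND p1.id = p2.id",
        "SELECT label, pnum1 FROM product WHERE id = 1",
    ),
    "selection": (
        "SELECT p1.id FROM product p1, product p2 "
        "WHERE p1.pnum1 > 100 AND p2.pnum2 < 500 AND p1.id = p2.id",
        "SELECT id FROM product WHERE pnum1 > 100 AND pnum2 < 500",
    ),
}

def run_pairs():
    """Return, per rewrite, the sorted results of the original and the simplified query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE product (id INTEGER PRIMARY KEY, label TEXT, pnum1 INTEGER, pnum2 INTEGER);
        INSERT INTO product VALUES
            (1, 'ACME Inc', 150, 300), (2, 'Foo Bars', 90, 100), (3, 'Baz', 200, 700);
    """)
    return {
        name: (sorted(conn.execute(with_join).fetchall()),
               sorted(conn.execute(without_join).fetchall()))
        for name, (with_join, without_join) in PAIRS.items()
    }

if __name__ == "__main__":
    for name, (joined, simplified) in run_pairs().items():
        print(name, joined == simplified, simplified)
```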
  • 154. Final query plan:
    π label, pnum1
      σ label ≠ NULL AND pnum1 ≠ NULL
        Product
  • 155. Evaluation
    • Used two benchmarks that store data in relational databases and provide SPARQL queries together with their semantically equivalent SQL queries
                                              | MySQL | MSSQL | Oracle | DB2
        Detection of Unsatisfiable Conditions | ✖     | ✔     | ✖      | ✖
        Self Join Elimination                 | ✖     | ✔     | ✔      | ✔
  • 156. Ultrawrap Experiment
  • 157. Augmented Ultrawrap Experiment
    • Implemented DoUC (Detection of Unsatisfiable Conditions)
  • 158. Reflection 2
    • We studied how SQL systems can be used to effectively evaluate SPARQL queries
      – HOW: Defined an architecture based on SQL views, which allows the RDBMS to do the optimization, and identified two important optimizations that already exist in commercial RDBMS.
      – EXTENT: SPARQL 1.0 (relational core)
    Recall the Hypothesis: Existing commercial relational databases already subsume the algorithms and optimizations needed to support effective SPARQL execution on relationally stored data.
  • 159. Ontology Based Data Access
  • 160. UltrawrapOBDA: Ontology-Based Data Access
    • Given
      – a source relational database D,
      – a target OWL ontology O, and
      – a mapping M from the source database to the target ontology
    • Goal: Answer SPARQL queries in terms of the target ontology using the mappings and the database
    Hypothesis: We can effect optimizations for OBDA by pushing processing into the RDBMS, which thus acts as a reasoner.
  • 161. Source table EMP:
        ID | Name  | Age | Job
        1  | Alice | 40  | CTO
        2  | Bob   | 41  | Java
        3  | John  | 42  | SysAd
    Ontology classes, linked by subClassOf: Employee, Executive, IT Employee, Programmer, SysAdmin, CTO
    Mapping:
        EMP(s, y, z, "CTO") → Triple(s, type, "CTO")
        EMP(s, y, z, "Java") → Triple(s, type, "Programmer")
        EMP(s, y, z, "SysAd") → Triple(s, type, "SysAdmin")
  • 162. Between the RDBMS and the OWL Ontology:
    – Forward Chaining: Materialization
    – Backward Chaining: Query Rewriting
  • 163. State of the Art: Materialization
    OWL Ontology: Programmer ⊑ ITEmployee; SysAdmin ⊑ ITEmployee
    Mapping:
        EMP(s, y, z, "Java") → Triple(s, type, "Programmer")
        EMP(s, y, z, "SysAd") → Triple(s, type, "SysAdmin")
    Relational Database: EMP(ID, NAME, AGE, JOB) with EMP(1, Alice, 40, CTO), EMP(2, Bob, 41, Java), EMP(3, John, 42, SysAd)
    Materialization into an RDF Database:
        Triple(2, type, Programmer)
        Triple(2, type, ITEmployee)
        Triple(3, type, SysAdmin)
        Triple(3, type, ITEmployee)
    SPARQL Query: SELECT ?x WHERE { ?x type ITEmployee }
    Ans: ?x = 2, ?x = 3
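The materialization pipeline on this slide can be mimicked with a minimal in-memory sketch. It is a plain-Python forward-chaining loop written for illustration, not the procedure of any particular triple store; the names `materialize` and `instances_of` and the dictionary encoding of the mapping and the ontology are assumptions.

```python
EMP = [(1, "Alice", 40, "CTO"), (2, "Bob", 41, "Java"), (3, "John", 42, "SysAd")]
MAPPING = {"Java": "Programmer", "SysAd": "SysAdmin"}  # JOB value -> class
SUBCLASS_OF = {"Programmer": "ITEmployee", "SysAdmin": "ITEmployee"}

def materialize(emp, mapping, subclass_of):
    """Apply the mapping, then add inferred type triples until nothing new appears."""
    triples = {(row[0], "type", mapping[row[3]]) for row in emp if row[3] in mapping}
    changed = True
    while changed:
        inferred = {(s, "type", subclass_of[o]) for (s, _, o) in triples if o in subclass_of}
        changed = not inferred <= triples  # stop once every inferred triple is already stored
        triples |= inferred
    return triples

def instances_of(triples, cls):
    """Answer SELECT ?x WHERE { ?x type cls } over the materialized triples."""
    return sorted(s for (s, p, o) in triples if p == "type" and o == cls)

if __name__ == "__main__":
    store = materialize(EMP, MAPPING, SUBCLASS_OF)
    print(instances_of(store, "ITEmployee"))
```

The query itself is trivial because all inferences were paid for up front, at the cost of storing four triples for two mapped rows.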
  • 164. State of the Art: Query Rewriting
    Pipeline: SPARQL Query → Rewriting (with the OWL Ontology) → Qo → Unfolding (with the Mapping) → Qsql → Evaluation (on the Database) → Ans
    OWL Ontology: Programmer ⊑ ITEmployee; SysAdmin ⊑ ITEmployee
    Mapping:
        EMP(s, y, z, "Java") → Triple(s, type, "Programmer")
        EMP(s, y, z, "SysAd") → Triple(s, type, "SysAdmin")
    Database: EMP(ID, NAME, AGE, JOB) with EMP(1, Alice, 40, CTO), EMP(2, Bob, 41, Java), EMP(3, John, 42, SysAd)
    SPARQL Query:
        SELECT ?x WHERE { ?x type ITEmployee }
    Qo:
        SELECT ?x WHERE { { ?x type ITEmployee } UNION { ?x type Programmer } UNION { ?x type SysAdmin } }
    Qsql:
        SELECT ID FROM EMP WHERE JOB = 'Java'
        UNION
        SELECT ID FROM EMP WHERE JOB = 'SysAd'
    Ans: ?x = 2, ?x = 3
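The three stages (rewriting, unfolding, evaluation) can be sketched for class queries only. This is a deliberately small illustration assuming SQLite and a subclass-only ontology; `rewrite`, `unfold` and `answer` are hypothetical helper names, and real OBDA rewriters handle far more than this.

```python
import sqlite3

SUBCLASS_OF = {"Programmer": "ITEmployee", "SysAdmin": "ITEmployee"}
MAPPING = {"Programmer": "Java", "SysAdmin": "SysAd"}  # class -> JOB value in EMP

def rewrite(cls, subclass_of=SUBCLASS_OF):
    """Rewriting: expand a class into itself plus all of its transitive subclasses."""
    classes = [cls]
    i = 0
    while i < len(classes):
        for sub, sup in sorted(subclass_of.items()):
            if sup == classes[i] and sub not in classes:
                classes.append(sub)
        i += 1
    return classes

def unfold(classes, mapping=MAPPING):
    """Unfolding: replace each mapped class by its SQL query and union the results."""
    jobs = [mapping[c] for c in classes if c in mapping]
    sql = " UNION ".join("SELECT ID FROM EMP WHERE JOB = ?" for _ in jobs)
    return sql, jobs

def answer(cls):
    """Evaluation: run the unfolded SQL on the source database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE EMP (ID INTEGER PRIMARY KEY, NAME TEXT, AGE INTEGER, JOB TEXT);
        INSERT INTO EMP VALUES (1, 'Alice', 40, 'CTO'), (2, 'Bob', 41, 'Java'), (3, 'John', 42, 'SysAd');
    """)
    sql, params = unfold(rewrite(cls))
    return sorted(row[0] for row in conn.execute(sql, params))

if __name__ == "__main__":
    print(rewrite("ITEmployee"))
    print(answer("ITEmployee"))
```

Nothing is stored beyond the source table, but the union grows with the number of subclasses, and that growth is paid on every query.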
  • 165. Hybrid: Backward and Forward Chaining
    Between the RDBMS and the OWL Ontology:
    – RDB2RDF Mapping
    – SQL Optimizer: Views & Recursion
    – Inheritance & Transitivity
  • 166. Hybrid Approach: UltrawrapOBDA
    OWL Ontology: Programmer ⊑ ITEmployee; SysAdmin ⊑ ITEmployee
    Mapping:
        EMP(s, y, z, "Java") → Triple(s, type, "Programmer")
        EMP(s, y, z, "SysAd") → Triple(s, type, "SysAdmin")
    Database: EMP(ID, NAME, AGE, JOB) with EMP(1, Alice, 40, CTO), EMP(2, Bob, 41, Java), EMP(3, John, 42, SysAd)
    Compiler output, the Saturated Mapping:
        EMP(s, y, z, "Java") → Triple(s, type, "Programmer")
        EMP(s, y, z, "SysAd") → Triple(s, type, "SysAdmin")
        EMP(s, y, z, "Java") → Triple(s, type, "ITEmployee")
        EMP(s, y, z, "SysAd") → Triple(s, type, "ITEmployee")
    OBDA: query rewriting or materialization? In practice, both! Sequeda, Arenas, Miranker. ISWC 2014 (Best Paper)
  • 167. RDFS Subclass Inference Rule
    Premises: Programmer subClassOf ITEmployee; X type Programmer
    Conclusion: X type ITEmployee
  • 168. Generating Saturated Mappings
    Inference rule: from Programmer subClassOf ITEmployee and X type Programmer, infer X type ITEmployee
    Applied to mappings: given Programmer subClassOf ITEmployee and
        EMP(s, y, z, "Java") ∧ p = type ∧ o = Programmer → Triple(s, p, o)
    generate
        EMP(s, y, z, "Java") ∧ p = type ∧ o = ITEmployee → Triple(s, p, o)
    Anatomy of a mapping ρ:
        ρ: EMP(s, y, z, 'Java') ∧ p = type ∧ o = Programmer → Triple(s, p, o)
    where EMP(s, y, z, 'Java') is the query over R that yields the subject, p = type fixes the predicate and o = Programmer fixes the object.
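Saturation with the subclass rule alone can be sketched as a simple fixpoint over (JOB value, class) pairs. This is an illustrative worklist loop, not the algorithm of the ISWC 2014 paper, and it covers only one of the inference rules; `saturate` and the encoding of the ontology as a dictionary from a class to its direct superclasses are assumptions.

```python
def saturate(mappings, superclasses):
    """Add a mapping (job, superclass) for every mapping (job, cls) with cls subClassOf superclass."""
    saturated = set(mappings)
    worklist = list(mappings)
    while worklist:
        job, cls = worklist.pop()
        for sup in superclasses.get(cls, ()):
            derived = (job, sup)
            if derived not in saturated:
                saturated.add(derived)
                worklist.append(derived)  # the derived mapping may trigger further rules
    return saturated

# EMP(s, y, z, job) -> Triple(s, type, cls), written as (job, cls).
MAPPINGS = [("Java", "Programmer"), ("SysAd", "SysAdmin")]
SUPERCLASSES = {"Programmer": ("ITEmployee",), "SysAdmin": ("ITEmployee",)}

if __name__ == "__main__":
    for job, cls in sorted(saturate(MAPPINGS, SUPERCLASSES)):
        print(f'EMP(s, y, z, "{job}") -> Triple(s, type, "{cls}")')
```

The reasoning happens once, over the mappings, so its cost depends on the size of the ontology and the mapping rather than on the size of the data.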
  • 169. Transitivity and SQL Recursion
        WITH MANAGER (X, Y) AS (
          SELECT ID, MAN FROM EMP
          UNION ALL
          SELECT EMP.ID, MANAGER.Y FROM EMP, MANAGER
          WHERE EMP.MAN = MANAGER.X
        )
        SELECT X, Y FROM MANAGER
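The recursive query can be run as-is on a small example with a minimal sketch, assuming SQLite, where the `RECURSIVE` keyword is spelled out. The three `EMP` rows are made up, and the `MAN IS NOT NULL` filter in the base case is an addition so that the employee without a manager does not contribute pairs with a NULL manager.

```python
import sqlite3

TRANSITIVE_MANAGERS = """
WITH RECURSIVE MANAGER (X, Y) AS (
    -- base case: direct manager of each employee
    SELECT ID, MAN FROM EMP WHERE MAN IS NOT NULL
    UNION ALL
    -- recursive case: the manager of my manager is also my manager
    SELECT EMP.ID, MANAGER.Y FROM EMP, MANAGER
    WHERE EMP.MAN = MANAGER.X
)
SELECT X, Y FROM MANAGER ORDER BY X, Y
"""

def all_managers():
    """Return every (employee, direct or indirect manager) pair."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE EMP (ID INTEGER PRIMARY KEY, MAN INTEGER);
        -- 1 has no manager, 2 reports to 1, 3 reports to 2
        INSERT INTO EMP VALUES (1, NULL), (2, 1), (3, 2);
    """)
    return conn.execute(TRANSITIVE_MANAGERS).fetchall()

if __name__ == "__main__":
    print(all_managers())
```

With `UNION ALL` the recursion terminates only when the management chain has no cycles, which holds for this sample data.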
  • 170. OWL 2 profiles (OWL 2 QL, OWL 2 RL and OWL 2 EL, all within OWL 2 DL) and the constructs considered:
                    | EL | QL | RL
        subClass    | X  | X  | X
        subProp     | X  | X  | X
        domain      | X  | X  | X
        range       | X  | X  | X
        eqClass     | X  | X  | X
        eqProp      | X  | X  | X
        inverseProp |    | X  | X
        symProp     |    | X  | X
        transProp   | X  |    | X
    Check the paper for all 9 inference rules.
  • 171. Saturated Mappings
    (Compilation diagram repeated from slide 166: the ontology and the mapping are compiled into the saturated mapping.)
    – Saturated Mappings are similar to T-Mappings per Rodriguez-Muro et al. (AMW 2011, ISWC 2013)
    – Saturation is performed by exhaustively applying the inference rules
    – We present a linear-time algorithm (check paper)
  • 172. Represent Saturated Mappings as SQL Views
    (Compilation diagram repeated from slide 166.) The saturated mapping is then expressed as a view:
        CREATE VIEW ITEmployeeView AS
        SELECT ID AS S, 'type' AS P, 'ITEmployee' AS O FROM EMP WHERE JOB = 'Java'
        UNION …
  • 173. From Saturated Mappings to SQL Views
    Mapping:
        EMP(s, y, z, "Java") ∧ p = type ∧ o = ITEmployee → Triple(s, p, o)
        EMP(s, y, z, "SysAd") ∧ p = type ∧ o = ITEmployee → Triple(s, p, o)
    Triplequery:
        SELECT ID AS S, 'type' AS P, 'ITEmployee' AS O FROM EMP WHERE JOB = 'Java'
        SELECT ID AS S, 'type' AS P, 'ITEmployee' AS O FROM EMP WHERE JOB = 'SysAd'
    Tripleview (columns S, P, O):
        CREATE VIEW ITEmployeeView AS
        SELECT ID AS S, 'type' AS P, 'ITEmployee' AS O FROM EMP WHERE JOB = 'Java'
        UNION
        SELECT ID AS S, 'type' AS P, 'ITEmployee' AS O FROM EMP WHERE JOB = 'SysAd'
  • 174. All of this is offline
    (Compilation diagram repeated from slide 166, ending in the view definition.)
        CREATE VIEW ITEmployeeView AS
        SELECT ID AS S, 'type' AS P, 'ITEmployee' AS O FROM EMP WHERE JOB = 'Java'
        UNION …
  • 175. Runtime: SPARQL Execution
    Database: EMP(ID, NAME, AGE, JOB) with EMP(1, Alice, 40, CTO), EMP(2, Bob, 41, Java), EMP(3, John, 42, SysAd)
    View (columns S, P, O):
        CREATE VIEW ITEmployeeView AS
        SELECT ID AS S, 'type' AS P, 'ITEmployee' AS O FROM EMP WHERE JOB = 'Java'
        UNION …
    SPARQL Query:
        SELECT ?x WHERE { ?x type ITEmployee }
    SQL Query on Views:
        SELECT s FROM Tripleview
        WHERE p = 'type' AND o = 'ITEmployee'
    Ans: s = 2, s = 3
    Theorem: Given a RDB2RDF mapping M, every SPARQL query is SQL-rewritable under M.
    Proof: by induction on the structure of SPARQL.
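The offline and runtime steps fit in one short sketch, assuming SQLite. For brevity the SQL query goes straight to `ITEmployeeView` rather than to an enclosing `Tripleview`, and the function name `it_employees` is made up for the example.

```python
import sqlite3

def it_employees():
    """Create the saturated-mapping view offline, then answer the class query at runtime."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE EMP (ID INTEGER PRIMARY KEY, NAME TEXT, AGE INTEGER, JOB TEXT);
        INSERT INTO EMP VALUES (1, 'Alice', 40, 'CTO'), (2, 'Bob', 41, 'Java'), (3, 'John', 42, 'SysAd');

        -- Offline: one SELECT per saturated mapping that produces ITEmployee triples.
        CREATE VIEW ITEmployeeView AS
            SELECT ID AS S, 'type' AS P, 'ITEmployee' AS O FROM EMP WHERE JOB = 'Java'
            UNION
            SELECT ID AS S, 'type' AS P, 'ITEmployee' AS O FROM EMP WHERE JOB = 'SysAd';
    """)
    # Runtime: SELECT ?x WHERE { ?x type ITEmployee } becomes a plain SQL query on the view.
    rows = conn.execute(
        "SELECT S FROM ITEmployeeView WHERE P = 'type' AND O = 'ITEmployee' ORDER BY S"
    ).fetchall()
    return [row[0] for row in rows]

if __name__ == "__main__":
    print(it_employees())
```

At runtime no ontology is consulted: the subclass knowledge already lives in the view definition.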
  • 176. Query Optimization: Materialize Views?
    – Materialize everything: best query response time, most space consumed
    – Materialize nothing: worst query response time, less space consumed
    – Hybrid approach: between the two extremes
    Harinarayan et al. Implementing Data Cubes Efficiently. SIGMOD 1996
    …..
    Mami & Bellahsene. A Survey of View Selection Methods. SIGMOD Record 2012
  • 177. Cost Model
    – Materialize all (best query response time, most space consumed): Query cost = n × NR × S(A, R); Space cost = NR + (NR × d)
    – Materialize nothing (worst query response time, less space consumed): Query cost = n × NR; Space cost = NR
    – Materialize views representing mappings to leaf classes: Query cost = n × NR × S(A, R); Space cost = 2NR
    Hypothesis: If a RDBMS rewrites queries in terms of materialized views, then … (check paper for details)
    where
      n is the number of leaf classes underneath the class that is being queried,
      NR is the number of tuples of the relation R in the mapping,
      S(A, R) is the selectivity of the attribute A of the relation R in the mapping,
      d is the depth of the ontology.
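Plugging numbers into the cost formulas makes the trade-off concrete. The sketch below assumes that the three pairs of formulas belong to "materialize all", "materialize nothing" and "materialize leaf classes" in the order in which they appear on the slide; that assignment, the function name `costs` and the sample values are assumptions, so the paper remains the reference for the actual model.

```python
def costs(n, nr, selectivity, depth):
    """Evaluate the slide's query and space cost formulas for each strategy.

    n: leaf classes under the queried class, nr: tuples in relation R,
    selectivity: S(A, R), depth: depth d of the ontology.
    """
    return {
        "materialize all": {
            "query": n * nr * selectivity,
            "space": nr + nr * depth,
        },
        "materialize nothing": {
            "query": n * nr,
            "space": nr,
        },
        "materialize leaf classes": {
            "query": n * nr * selectivity,
            "space": 2 * nr,
        },
    }

if __name__ == "__main__":
    # Hypothetical values: 4 leaf classes, 1000 tuples, selectivity 0.25, depth 3.
    for strategy, cost in costs(n=4, nr=1000, selectivity=0.25, depth=3).items():
        print(strategy, cost)
```

Under these assumptions the leaf-class strategy matches the query cost of full materialization while its space cost no longer grows with the depth of the ontology.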
  • 178. Texas Benchmark
    Database: a table with columns id, …, TWO, FIVE, TEN, TWENTY, FIFTY, HUNDRED, whose values run from 1 up to 2, 5, 10, 20, 50 and 100 respectively
    Ontologies: hierarchies of different depth (d = 2, d = 5, …)
    Goal: Understand the behavior when querying for instances of a class depending on the depth of the ontology and the selectivity
  • 179. Oracle implements Query Rewriting with Materialized Views
    [Chart: query time in seconds]
    www.obda-benchmark.org
  • 180. BSBM Extension for Transitivity
    Ontology: Product hasType ProductType; ProductType typeAncestor ProductType; Product label Literal; Product numProp Literal
    Simple Query:
        SELECT ?x WHERE {
          ?x typeAncestor ProductType7 }
    Join Query:
        SELECT ?product ?x WHERE {
          ?x typeAncestor ProductType7 .
          ?product hasType ?x . }
    More Join Query:
        SELECT ?product ?x WHERE {
          ?x typeAncestor ProductType7 .
          ?product hasType ?x .
          ?product label ?label .
          ?product numProp ?num . }
  • 181. Query Plan for Transitivity
    [Charts: Unmaterialized View vs. Materialized View]
    www.obda-benchmark.org
  • 182. Reflection 3
    • We studied how SQL systems can be used as reasoners for SPARQL queries in terms of ontologies
      – HOW: Incorporate the semantics of ontologies in Saturated Mappings and take advantage of query rewriting using materialized views and recursion, which exist in RDBMS
      – EXTENT: OWL-SQL
    Recall the Hypothesis: We can effect optimizations for OBDA by pushing processing into the RDBMS, which thus acts as a reasoner.
  • 183. RELATIONAL DATABASES AND SEMANTIC WEB IN PRACTICE
  • 184. past → present → FUTURE
    • Federated Semantic Data Management
      – "Semantify" data by mapping to ontologies
    • Business view of heterogeneous data
      – Federation (NoETL) in order to avoid centralization (ETL)
      – Dagstuhl seminar on this topic (June 2017)
        • http://www.dagstuhl.de/17262
    • "Start" of commercial interest
      – Startups: Capsenta, …
      – Industries: Pharma, Finance, …
      – EU Project: Optique
  • 185. Real World Data Integration Problem
    Biz asks IT for reports: "Total net sales of all Orders today"
  • 186. What do you mean by …
    How many orders were placed in June 2017?
    Three systems (Billing, Shipping, E-Commerce) give three different answers: 317,595; 317,124; 316,899
  • 187. It's a Semantic Problem!
    What is an Order?
    – E-Commerce: when a user clicks "Order" on the website
    – Shipping: when the customer has received the product
    – Billing: when it comes out of the billing system and the CC has been charged
  • 188. Cross Organizational Data Integration
    Organization 1, Organization 2, …, Organization n
  • 189. Status Quo 1
    Biz asks IT for "Total net sales of all Orders today"; a Data Architect writes SELECT .. FROM … queries, and the results travel as csv, MS Access and XLS files (T=1, T=2, T=3) before they end up in Reports.
    – Did the Biz User communicate the correct message to IT?
    – Did IT understand correctly what the Biz User wanted?
    – Did IT deliver the correct/precise results?
  • 190. Status Quo 2
    IT and the Data Architect load an Enterprise Data Warehouse through ETL to produce Reports for Biz, at the cost of time and $.
    Request: "Total net sales of all Orders today"
    Next request: "Total net sales of all Orders today with FX"
  • 191. Integrating Data using Graphs and Semantics
    Sources: HIVE, Impala, etc.; Oracle; SQL Server; Postgres; Unstructured; Semi-Structured
    → Mappings → Enterprise Knowledge Graph
    → Search, Reports, API, Dashboard
  • 192. Semantic Technology is not easy
    Who creates this? Using what tools?
    Funny note: I found my presentation from 2007 where I asked this same question
  • 193. Real World
  • 194. Real World Mappings are not easy and obvious
  • 195. Mappings and Ontologies from Questions
    A Pay-As-You-Go Methodology for Ontology-Based Data Access. Sequeda & Miranker. IEEE Internet Computing 2017
  • 196. Real World Example
        SELECT o.orderid, o.orderdate,
               o.ordertotal - ot.finaltax -
               CASE WHEN o.currencyid IN ('USD', 'CAD') THEN o.shippingcost
                    ELSE o.shippingcost - ot.shippingtax
               END AS netsales,
               o.currencyid
        FROM order o, ordertax ot
        WHERE o.orderid = ot.orderid
          AND o.statusid NOT IN (4, 5)
  • 197. Reflection 4
    • We are studying how non-semantic web (and non-technical) users can integrate data using semantic web technologies
      – HOW: We need better tools
      – EXTENT: I don't know
  • 198. CONCLUSION
  • 199. HOW and to what EXTENT can RDB be integrated with the SW?
    1. RDB can be automatically directly mapped to RDF and OWL
       – Monotone, information preserving and query preserving
       – Monotonicity is an obstacle to semantics preservation
    2. RDB can evaluate and optimize SPARQL 1.0 queries
       – Two important optimizations
    3. RDB can act as a reasoner for ontologies with inheritance and transitivity
       – Saturated mappings, query rewriting using materialized views, and recursion
  • 200. Tipping Point
    Between Relational Database and Semantic Web:
    • Semantics
    • "Graphy" Queries
    • Data Integration
    • Flexible
    • Metadata
    • Provenance
    • Graph Visualizations
    OWL profiles: OWL 2 QL, OWL 2 RL, OWL 2 EL, OWL 2 DL, OWL SQL
  • 201. HOW and to what EXTENT can RDB be integrated with the SW?
    – RDB can be automatically directly mapped to RDF and OWL and preserve information and queries
    – RDB can evaluate and optimize SPARQL 1.0 queries
    – RDB can act as a reasoner for ontologies with inheritance and transitivity
    THANK YOU!
    Juan Sequeda, Ph.D, Co-Founder, Capsenta. juan@capsenta.com, @juansequeda
    Sequeda J. Integrating Relational Databases with the Semantic Web. IOS Press. 2016. http://www.iospress.nl/book/integrating-relational-databases-with-the-semantic-web/
    We are always looking for smart people