Inconsistencies of Connection for Heterogeneity and a New Relation Discovery Method
1. Inconsistencies of Connection for Heterogeneity and a New Relation Discovery Method that Solved them
Takafumi NAKANISHI, Kiyotaka UCHIMOTO, Yutaka KIDAWARA
National Institute of Information and Communication Technology (NICT), Japan
2. What’s Big Data?
• Speed up? Processing a lot of data?
– What differences are there between VLDB (Very Large Database) and Big Data?
• Fragmental data exist
– Until now, scientists have worked with such data for simulation.
• Heterogeneous Database Integration (Cross database search)
– Still Considering?
3. Purposes of this presentation
• We should consider the paradigm shift in computer science.
– From the closed assumption to the opened assumption
– What problems are there?
• Businesspeople require not only EDW (Enterprise Data Warehouse) but also other analysis methods.
• Discovering relations between heterogeneous concepts, datasets, etc.
• Three Evils of the Opened Assumption
4. True Problem Definitions of Big Data
Relation Discovery in Heterogeneity
Big data
Speeding Up, Promotion of Streamlining, and Increasing Data Volume for Processing
Schemaless Data and New Data Processing Method
Distributed Parallel Processing, High Performance Computing (HPC), Network Delay, etc.
Construction of Big data environment (Hardware, middleware researches)
Big data analytics (Software researches)
Closed Assumption System → Open Assumption System
5. AI Community / DB Community
[Figure: graph of persons a1 to a9 (AI Community) and b1 to b10 (DB Community). Someone adds relationships between a3 and b4.]
Relationships among persons in communities AI and DB. ai, bj are researchers.
When someone adds symmetric and transitive relationships between a3 and b4, it is true that a1 is related to b5 because a1 is related to a3, a3 is related to b4, and b4 is related to b5.
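The reasoning on this slide can be sketched as a reachability check over a graph whose edges are treated as symmetric and transitive. This is only an illustrative sketch: the slide states just the links a1-a3, a3-b4, and b4-b5, so the other edges and the helper name `related` are assumptions made for the example.

```python
from collections import deque

# Only a1-a3 and b4-b5 are stated on the slide; the other two edges are
# hypothetical, added so each community has more than one link.
edges = [("a1", "a3"), ("a2", "a3"), ("b4", "b5"), ("b3", "b4")]

def related(edges, start, goal):
    """Return True if goal can be reached from start when every edge is
    symmetric and relatedness is transitive (breadth-first search)."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Before anyone links the two communities, a1 and b5 are unrelated.
before = related(edges, "a1", "b5")
# After the a3-b4 link is added, a1 reaches b5 via a3 and b4.
after = related(edges + [("a3", "b4")], "a1", "b5")
print(before, after)
```

Note that the check looks only at graph structure. It says nothing about whether the people on either side of the new link actually share a field.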
6. Office Community / Music Community
[Figure: graph of persons a1 to a9 (Office Community) and b1 to b10 (Music Community). Someone adds relationships between a3 and b4.]
Relationships among persons in workplace and music communities. ai are co-workers, and bj are musicians.
When someone adds symmetric and transitive relationships between a3 and b4, it is actually not true that a1 is related to b5.
In graph structure, it is true that a1 is related to b5. However, realistically, a1 and b5 do not share common ground without other definitions or analysis.
7. Difference of two examples
• “AI Community” ∩ “DB Community” ≠ ∅. → Closed Assumption
– Relations can be represented in the previous methods such as OWL, RDF, etc.
• “Office Community” ∩ “Music Community” = ∅. → Opened Assumption
– Relations cannot be represented in the previous methods
8. Proof of inconsistency of order relation between two certain sets [1/2]
• A = {a1, a2, … , an}, B = {b1, b2, …, bm}
• A ∩ B = ∅.
• Both sets A and B may define the order
relations differently.
• We prove that we cannot discover the relationship
between sets A and B, or other relationships,
when we get relationship f between a1 ∈ A
and b1 ∈ B. → b1 = f(a1)
9. Proof of inconsistency of order relation between two certain sets [2/2]
• We show, by attempting induction, that bi = f(ai) does not
hold in general.
– b1 = f(a1) is true by the above condition when i = 1.
– We assume that bk = f(ak) is true when i = k.
– When i = k + 1, bk+1 = f(ak+1) is not guaranteed to be true.
• Set A has an order relation. Set B has another order
relation.
– bk ≤ bk+1 may not be true even if ak ≤ ak+1 is true, and vice
versa. Furthermore, both ak ≤ ak+1 and bk ≤ bk+1 may
not be true.
• Although b1 = f(a1) is true, bi = f(ai) is not true in general.
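One way to see why the single pair b1 = f(a1) fixes nothing else is to count the maps that agree with it. The sketch below is a simplification made only for illustration, not the slide's proof: it assumes n = m = 3 and that f is a bijection, and the helper name `candidate_maps` is hypothetical.

```python
from itertools import permutations

A = ["a1", "a2", "a3"]
B = ["b1", "b2", "b3"]

def candidate_maps(known):
    """Enumerate every bijection A -> B that agrees with the known pairs.
    Assumption for illustration: f is a bijection and |A| == |B|."""
    maps = []
    for image in permutations(B):
        f = dict(zip(A, image))
        if all(f[a] == b for a, b in known.items()):
            maps.append(f)
    return maps

# Only the pair b1 = f(a1) is known.
maps = candidate_maps({"a1": "b1"})
for f in maps:
    print(f)
# More than one map fits, so f(a2) is not determined by the known pair.
print({f["a2"] for f in maps})
```

Under these assumptions two different maps remain, one sending a2 to b2 and one sending a2 to b3, so the observed pair alone cannot tell us which elements correspond for i > 1.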
10. Proof of inconsistency of the transitive relation between two certain sets [1/2]
• A = {a1, a2, … , an}, B = {b1, b2, …, bm}
• A ∩ B = ∅.
• Set B has order relation b1 ≤ b2 ≤ b3 ≤ b4…
– Transitive relation
– If b1 ≤ b2 and b2 ≤ b3 are true, b1 ≤ b3 is true
• Set A has its own order relation.
11. Proof of inconsistency of the transitive relation between two certain sets [2/2]
• Assume a1 = (1, 5), b1 = (2, 1), b2 = (3, 2), b3 = (4, 3).
• We examine whether a1 ≤ b3 is true when we get relation a1 ≤ b1.
• To state the conclusion first, a1 ≤ b3 may not hold.
• The relationship of a1 and b1 compares the first element of each.
• Then a1 ≤ b1 is true (1 ≤ 2).
• The order relation of set B compares the second element of each.
• Then b1 ≤ b2 ≤ b3, and since b1 ≤ b2 and b2 ≤ b3 are true, b1 ≤ b3 is
true.
• However, a1 ≤ b3 is not true in the order of set B (5 > 3).
• A relation such as the one between a1 and b1 therefore causes an inconsistency:
the order and transitive relations of set B are not guaranteed to extend to a1.
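The numbers on this slide can be checked directly. The sketch below uses the slide's four tuples; the function names `link_le` and `b_le` are hypothetical labels for the two comparisons the slide describes.

```python
a1 = (1, 5)
b1, b2, b3 = (2, 1), (3, 2), (4, 3)

def link_le(p, q):
    """Relation between a1 and b1: compares the first elements."""
    return p[0] <= q[0]

def b_le(p, q):
    """Order relation of set B: compares the second elements."""
    return p[1] <= q[1]

given = link_le(a1, b1)                # 1 <= 2
chain = b_le(b1, b2) and b_le(b2, b3)  # 1 <= 2 and 2 <= 3
transitive_in_b = b_le(b1, b3)         # 1 <= 3
conclusion = b_le(a1, b3)              # 5 <= 3
print(given, chain, transitive_in_b, conclusion)
```

Each premise holds on its own comparison, yet chaining a1 ≤ b1 with B's order fails, because the two relations compare different elements of the tuples.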
12. Inconsistencies – Three Evils of the Opened Assumption
• Inconsistency where a relation is not guaranteed to hold in the future
• Inconsistency where a transitive relation is not true when anyone connects links across heterogeneous fields
• Inconsistency where a relation in heterogeneous fields cannot be discovered in set theory
13. Misconception of Future Information Systems
• A user does NOT want to retrieve some data; the user needs some solutions
– A system works out some clues for a user from data by relative comparison
– It is important to compare data relatively.
• We can NOT write relationships anymore
– they change dynamically depending on user, situation, etc.
– when data are changing, relationships are changing
• We cannot create indexes.
• We cannot discover without writing relationships
– However, a system can compare data on a basis.
14. Functional Predicate
Set Theory
Coordinates System
• commutative property
• associative property
• distributive property
• reflexive relation
• antisymmetric relation
• transitive relation
• axis adaptability evaluation
• uniqueness evaluation
• certainty evaluation
• predicate satisfaction evaluation
Incomplete Mutual Map Transformation Framework between set theory and the Cartesian system of coordinates.
Mutual mapping by mathematical rule, formula, etc. (Because the mathematical rule and formula are closed assumption)
15. Overview of our method
Step 1: Sampling Data
• A query given by a user
• Sampling the data set depending on a query
Step 2: Selection of Basis
• A system selects some basis for solution of query
• Order relationships? Continuous or equal interval sampling?
Step 3: Mapping from set theory to the Cartesian system of coordinates
• Mathematical rule/formula → closed assumption
• Creation of the transformation operation on the closed assumption manually.
Step 4: Discovery of relationships on the Cartesian system of coordinates
• Predefinition of functional predicates
• Satisfying each functional predicate
Step 5: Re-mapping to set theory
• Representation of predicate in predicate functions
• Representation of reasons in basis
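The five steps can be pictured with a toy pipeline. Everything below is an assumption made for illustration rather than the authors' implementation: the record layout, the function names, the use of numeric attributes as the basis, the identity mapping to coordinates, and the reading of "lessThan" as "every point of one set lies below every point of the other on an axis".

```python
# Hypothetical records; the fields "set", "name", "year", "size" are invented.
records = [
    {"set": "A", "name": "a1", "year": 2001, "size": 3},
    {"set": "A", "name": "a2", "year": 2003, "size": 9},
    {"set": "B", "name": "b1", "year": 2010, "size": 5},
    {"set": "B", "name": "b2", "year": 2012, "size": 7},
    {"set": "C", "name": "c1", "year": 1990, "size": 1},
]

def sample(records, query_sets):
    """Step 1: keep only the records the query asks about."""
    return [r for r in records if r["set"] in query_sets]

def select_basis(sampled):
    """Step 2: choose as basis the numeric attributes shared by all records."""
    common = set(sampled[0])
    for r in sampled:
        common &= set(r)
    return sorted(k for k in common
                  if all(isinstance(r[k], (int, float)) for r in sampled))

def to_coordinates(sampled, basis):
    """Step 3: map each element to a point whose axes are the basis."""
    coords = {}
    for r in sampled:
        coords.setdefault(r["set"], []).append(tuple(r[k] for k in basis))
    return coords

def less_than(points_a, points_b, axis):
    """Step 4: assumed meaning of lessThan on a single axis."""
    return max(p[axis] for p in points_a) < min(p[axis] for p in points_b)

def discover(records, set_a, set_b):
    """Steps 1 to 5: return (predicate, set, set, basis) for each axis that holds."""
    sampled = sample(records, {set_a, set_b})
    basis = select_basis(sampled)
    coords = to_coordinates(sampled, basis)
    # Step 5: report the predicate together with the basis that justifies it.
    return [("lessThan", set_a, set_b, name)
            for axis, name in enumerate(basis)
            if less_than(coords[set_a], coords[set_b], axis)]

print(discover(records, "A", "B"))
```

In this toy data the predicate holds on the "year" axis but not on "size", which mirrors the idea of reporting the basis as the reason for a discovered relationship.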
16. Example: Creation of a Functional Predicate – dependOn
• “dependOn” means that set A relies on set X.
– The value of element ai of set A should only
change with the variation of the value of element xj
of set X.
• “dependOn” is represented as {A}(X), when set
A depends on set X.
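One possible reading of this definition is a functional-dependency check: if ai only changes when xj changes, then equal values of x must always come with equal values of a. The sketch below tests that reading on paired observations. The function name `depend_on` and the pair layout are assumptions, and the slide's own evaluation procedure may differ.

```python
def depend_on(pairs):
    """Return True if, in the observed (x, a) pairs, a never changes
    while x stays the same, i.e. a behaves as a function of x."""
    seen = {}
    for x, a in pairs:
        if x in seen and seen[x] != a:
            return False  # a changed although x did not
        seen[x] = a
    return True

# a changes only when x changes: consistent with {A}(X).
consistent = depend_on([(1, 10), (2, 20), (1, 10), (3, 20)])
# a changes while x stays at 1: not consistent with {A}(X).
inconsistent = depend_on([(1, 10), (1, 11), (2, 20)])
print(consistent, inconsistent)
```

A check like this can only refute the predicate on the sampled data. Passing it does not establish that A depends on X in general.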
19. Conclusion
• Three evils of the opened assumption
– We presented the inconsistencies of past researches that contributed to the interconnection of heterogeneous fields, such as Linked Data, and of our own past researches.
• Map transformation framework from set theory to the Cartesian system of coordinates
– defining such predicate functions as disjoint, meet, overlap,
coveredBy, covers, equal, contain, inside, correlate, moreThan,
lessThan, alongWith, join, etc.
• A preliminary evaluation of predicate function “dependOn”