2. Problem
• Text documents naturally include dependency relations
among textual elements
• Such dependencies let readers infer the flow of thought
and how the various elements affect one another
semantically
• Automatically identifying textual dependencies has been
the focus of various approaches. However, we observe
that aggregating, accessing and reusing dependencies for
further processing is still a challenge
3. Question
• Our aim is to answer the following question: how can we
make text dependencies more accessible for consumption
and reuse in text analysis?
• To that end, we focus on the following requirements:
• To have unique references to textual elements
• To preserve dependency links across text sources
• To store and serve the data for further consumption
7. Example
Figure: RDF graph for the sentence "It is an efficient service"
• Sentence node: NS:sentence/4c7aa81ba8fbcd3ad42996eb6bac18dc
• RDFS:hasDescription: "It is an efficient service"
• DCT:hasPart links the sentence to its term nodes; each term has an
RDFS:label and an ISA link to its part-of-speech class:
• NS:term/PRP/It/4c7aa81ba8fbcd3ad42996eb6bac18dc_1, label "It", ISA STD:PRP
• NS:term/VBZ/is/4c7aa81ba8fbcd3ad42996eb6bac18dc_2, label "is", ISA STD:VBZ
• NS:term/DT/an/4c7aa81ba8fbcd3ad42996eb6bac18dc_3, label "an", ISA STD:DT
• NS:term/JJ/efficient/4c7aa81ba8fbcd3ad42996eb6bac18dc_4, label "efficient", ISA STD:JJ
• NS:term/NN/service/4c7aa81ba8fbcd3ad42996eb6bac18dc_5, label "service", ISA STD:NN
• Dependency links drawn between term nodes: STD:nsubj, STD:det
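The identifier scheme visible in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say how the 32-hex-digit sentence identifier is produced, so an MD5 digest of the sentence text is used here as a stand-in, and the function names `sentence_id` and `sentence_triples` are hypothetical. Triples are plain string tuples that reuse the prefixes shown on the slide.

```python
import hashlib


def sentence_id(text):
    """Hypothetical ID scheme: MD5 hex digest of the UTF-8 sentence text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def sentence_triples(text, tagged_tokens):
    """Build (subject, predicate, object) triples for one POS-tagged sentence.

    tagged_tokens is a list of (token, pos_tag) pairs from any tagger.
    """
    sid = sentence_id(text)
    sentence = f"NS:sentence/{sid}"
    triples = [(sentence, "RDFS:hasDescription", text)]
    for position, (token, pos) in enumerate(tagged_tokens, start=1):
        # Unique term reference: POS tag, token, sentence ID and position.
        term = f"NS:term/{pos}/{token}/{sid}_{position}"
        triples.append((sentence, "DCT:hasPart", term))
        triples.append((term, "RDFS:label", token))
        triples.append((term, "ISA", f"STD:{pos}"))
    return triples


tagged = [("It", "PRP"), ("is", "VBZ"), ("an", "DT"),
          ("efficient", "JJ"), ("service", "NN")]
triples = sentence_triples("It is an efficient service", tagged)
for triple in triples[:4]:
    print(triple)
```

Including the position suffix keeps repeated words within one sentence distinct, and the sentence identifier keeps the same word in different sentences distinct, which is one way to meet the "unique references" requirement.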
9. Processing Dependencies through SPARQL – Example 1
• Which adjectives did users use to describe their
experience, ordered from most frequent to least frequent?
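The question above amounts to a group-and-count over term nodes typed as adjectives. The sketch below shows that aggregation in plain Python over string triples, alongside a SPARQL 1.1 query of the same shape that is included for reference only and is not executed. Several things are assumptions: the `std:` IRI is a placeholder, the slide's ISA edge is mapped to `rdf:type`, the function name `adjective_frequencies` is hypothetical, and the sample triples are toy data, not the user comments analysed in this work.

```python
from collections import Counter

# Reference only, not executed here. The std: IRI is a placeholder and
# treating the slide's "ISA" edge as rdf:type ("a") is an assumption.
ADJECTIVE_QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX std:  <http://example.org/std/>
SELECT ?label (COUNT(?term) AS ?uses)
WHERE { ?term a std:JJ ; rdfs:label ?label . }
GROUP BY ?label
ORDER BY DESC(?uses)
"""


def adjective_frequencies(triples):
    """Count labels of terms typed STD:JJ, most frequent first."""
    adjectives = {s for s, p, o in triples if p == "ISA" and o == "STD:JJ"}
    labels = [o for s, p, o in triples
              if p == "RDFS:label" and s in adjectives]
    return Counter(labels).most_common()


# Toy triples for illustration only.
sample = [
    ("NS:term/JJ/efficient/a_4", "ISA", "STD:JJ"),
    ("NS:term/JJ/efficient/a_4", "RDFS:label", "efficient"),
    ("NS:term/JJ/efficient/b_2", "ISA", "STD:JJ"),
    ("NS:term/JJ/efficient/b_2", "RDFS:label", "efficient"),
    ("NS:term/JJ/slow/c_3", "ISA", "STD:JJ"),
    ("NS:term/JJ/slow/c_3", "RDFS:label", "slow"),
    ("NS:term/NN/service/a_5", "ISA", "STD:NN"),
    ("NS:term/NN/service/a_5", "RDFS:label", "service"),
]
print(adjective_frequencies(sample))  # [('efficient', 2), ('slow', 1)]
```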
12. So What?
• Graph-based manipulation of dependencies offers
potential benefits such as:
• Aggregating and transforming distributed pieces of text into a
coherent, query-enabled dependency layer
• The possibility of “hardwiring” text dependency patterns at the query
level and hooking them to further analytical tools and techniques (e.g.
visualization)
• The ability to easily extend the text-based graph to capture further
data entities such as polarity dictionaries and perform further
analytics
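As a rough illustration of the last benefit, a polarity dictionary can be attached to the graph by emitting one extra triple per matching term label. The predicate `EX:hasPolarity`, the function name `polarity_triples`, and the two-entry lexicon are all hypothetical; the slides do not specify a vocabulary for polarity.

```python
def polarity_triples(triples, lexicon):
    """Return new triples tagging each term whose label is in the lexicon."""
    extra = []
    for subject, predicate, obj in triples:
        if predicate == "RDFS:label" and obj.lower() in lexicon:
            # EX:hasPolarity is a made-up predicate for this sketch.
            extra.append((subject, "EX:hasPolarity", lexicon[obj.lower()]))
    return extra


# Toy lexicon for illustration only.
lexicon = {"efficient": "positive", "slow": "negative"}
graph = [
    ("NS:term/JJ/efficient/4c7aa81ba8fbcd3ad42996eb6bac18dc_4",
     "RDFS:label", "efficient"),
    ("NS:term/NN/service/4c7aa81ba8fbcd3ad42996eb6bac18dc_5",
     "RDFS:label", "service"),
]
graph += polarity_triples(graph, lexicon)
print(graph[-1])
```

Because the polarity information lives in the same graph as the terms, a single query could then combine part-of-speech, dependency and polarity constraints.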
13. Future Directions
• At the level of the dependency RDF generator, the
extractor can be improved with filtering mechanisms
controlled by the analyst
• We are building an online tool that will let users
upload a corpus and generate the corresponding
dependency RDF, to be downloaded or pushed to a
triplestore
• We are planning to focus next on exploiting this graph
representation to perform business analytics around
decision models (e.g. user satisfaction and performance
models)
14. Conclusions
• We presented our work on generating a linked
dependency layer on top of text documents
• We highlighted the preliminary value of this layer by
applying the linking process to 3,140 disparate user
comments
• We believe that this layer will open the path for improving
the consumption and reuse of text dependencies in the
context of text and business analytics