This presentation describes three contributions of my PhD work:
1. Distributional Semantics for Entity Relatedness (DiSER)
2. Wikipedia Features for Entity Recommendations (WiFER)
3. Non-Orthogonal Explicit Semantic Analysis (NESA) for Word Relatedness
Further, it presents some of our work in collaboration with IBM Watson and Yahoo Research.
3. Motivation: Entity Recommendation
Example recommendations for the query entity "Semantic Web":
Technologies:
1. RDF
2. SPARQL
3. Ontology
4. Linked data
5. Turtle (syntax)
Companies:
1. Metaweb
2. Ontoprise GmbH
3. OpenLink Software
4. Ontotext
5. Powerset (company)
Example recommendations for the query entity "Myosin":
Proteins and cells:
1. Actin
2. Muscle contraction
3. Sarcomere
4. Myofibril
5. Cytoskeleton
Biologists:
1. Hugh Huxley
2. James Spudich
3. Ronald Vale
4. Manuel Morales
5. Brunó Ferenc Straub
4. Entity Relatedness
Determine the degree of relatedness between two entities, e.g., relatedness(Brad Pitt, Tom Cruise) = ?
5. Background: Entity
Example entity types: person, location, organization; time, date, money, percent; event, movie, disease, symptom, side effect, law, license, and more.
• Many such types are covered in Wikipedia
• More than 2K classes in DBpedia
• More than 350K classes in YAGO
• Every Wikipedia article is considered to describe an entity
7. Outline
• Motivation
• Entity Relatedness
• Distributional Semantics for Entity Relatedness (DiSER)
• Evaluation
• Entity Recommendation
• Wikipedia-based Features for Entity Recommendation (WiFER)
• Evaluation
• Text Relatedness
• Non-Orthogonal Explicit Semantic Analysis (NESA)
• Evaluation
• Application and Industry Use Cases
• Conclusion
8. Thesis Overview
• Chapter IV: Distributional Semantics for Entity Relatedness (DiSER), a distributional representation of entities
• Chapter V: Wikipedia-based Features for Entity Recommendation (WiFER), feature extraction for entity recommendation
• Chapter VI: Non-Orthogonal Explicit Semantic Analysis (NESA)
9. Thesis Overview: next, Chapter IV, Distributional Semantics for Entity Relatedness (DiSER)
11. Entity Relatedness: State of the Art
• Graph-based methods
  • Path distance in the Wikipedia graph (Strube and Ponzetto, 2006)
  • Normalized Google Distance on the Wikipedia graph (Milne and Witten, 2008)
  • Personalized PageRank on the Wikipedia graph (Agirre et al., 2015)
  • Path-based measures on the DBpedia graph (Hulpus et al., 2015)
• Corpus-based methods
  • Key-phrase Overlap for Related Entities (KORE): partial overlaps between key-phrases in the corresponding Wikipedia articles (Hoffart et al., 2012)
  • Text relatedness measures: use co-occurrence information in text
12. Explicit Semantic Analysis (ESA)
Entity Relatedness: State of the Art (Distributional Semantics)
ESA uses explicit (manually defined) concepts, such as Wikipedia articles, where every article is considered to describe a single concept (Gabrilovich and Markovitch, 2007).
Word-by-document matrix: each word_i is represented by a weight vector (W_i1, W_i2, ..., W_in) over documents doc_1 ... doc_n.
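To make the ESA representation concrete, here is a minimal Python sketch under simplifying assumptions: a hypothetical three-article toy corpus stands in for Wikipedia and plain TF-IDF stands in for the weighting scheme. A word's vector is its column of weights over the articles, and relatedness is the cosine of two such vectors.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

articles = [                      # toy stand-ins for Wikipedia articles (explicit concepts)
    "steve jobs co founded apple and pixar",
    "apple released the iphone and the ipad",
    "pixar is an animation studio bought by disney",
]
vec = TfidfVectorizer()
X = vec.fit_transform(articles)   # article x term weight matrix
def esa_vector(word):             # one weight per article = the word's explicit concept vector
    return X[:, vec.vocabulary_[word]].T.toarray()
print(cosine_similarity(esa_vector("jobs"), esa_vector("pixar")))

In the actual setting the corpus is the full Wikipedia dump with millions of article dimensions; the toy corpus is only for illustration.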
13. Entity Relatedness: State of the Art (Distributional Semantics)
Implicit/latent semantic analysis transforms the sparse document space into a dense latent topic space:
• Latent Semantic Analysis (LSA) (Deerwester et al., 1990)
• Latent Dirichlet Allocation (LDA) (Blei et al., 2003)
• Neural embeddings (Word2Vec) (Mikolov et al., 2013)
Dimensionality reduction: the word-by-document matrix, word_i -> (W_i1, ..., W_in) with n ~ 1M documents, is reduced to a word-by-topic matrix, word_i -> (W_i1, ..., W_ik) over topic_1 ... topic_k with k < 1000.
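For contrast with ESA's explicit dimensions, a minimal sketch of the latent route (LSA via truncated SVD); the four-document toy corpus and k = 2 are illustrative assumptions, not the setup evaluated in the thesis.

import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["steve jobs founded apple", "apple released the iphone",
        "pixar makes animated films", "jobs was ceo of pixar"]
vec = TfidfVectorizer()
X = vec.fit_transform(docs)                           # document x term matrix (sparse)
W = TruncatedSVD(n_components=2).fit_transform(X.T)   # term x topic matrix, k = 2 latent topics
def rel(w1, w2):                                      # cosine in the dense topic space
    a, b = W[vec.vocabulary_[w1]], W[vec.vocabulary_[w2]]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print(rel("jobs", "pixar"))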
14. Limitation of Text Relatedness Measures
• Compositionality
  • Most entities are multi-word expressions
  • Vector(Brad Pitt) = Vector(Brad) + Vector(Pitt)?
• Ambiguity
  • Vector of an entity with an ambiguous name, e.g., "Nice" (the French city)
15. Chapter IV: Distributional Semantics for Entity Relatedness (DiSER)
Wikipedia-based Distributional Semantics for Entity Relatedness. In: AAAI-FSS-2014
Entity-by-document matrix: each entity_i is represented by a weight vector (W_i1, W_i2, ..., W_in) over Wikipedia articles doc_1 ... doc_n.
DiSER is built from Wikipedia annotated with entities, applying the "one sense per document" heuristic so that every mention of an entity in an article is linked.
Wikipedia text with its original entity links:
[Steve Jobs] co-founded Apple in 1976 to sell Wozniak's [Apple I] [Personal Computer]. [Steve Jobs | Jobs] was CEO of [Apple Inc. | Apple] and largest shareholder of [Pixar]. Jobs is widely recognized as a pioneer of the [Microcomputer Revolution], along with [Steve Wozniak | Wozniak].
After annotating all entity mentions (one sense per document):
[Steve Jobs] co-founded [Apple Inc. | Apple] in 1976 to sell [Steve Wozniak | Wozniak]'s [Apple I] [Personal Computer]. [Steve Jobs | Jobs] was CEO of [Apple Inc. | Apple] and largest shareholder of [Pixar]. [Steve Jobs | Jobs] is widely recognized as a pioneer of the [Microcomputer Revolution], along with [Steve Wozniak | Wozniak].
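A minimal sketch of the DiSER idea under simplifying assumptions: a hypothetical four-article corpus reduced to its entity links, and raw counts in place of the TF-IDF-style weighting used in the thesis. Each entity's vector is its row of weights over Wikipedia articles; relatedness is the cosine of two rows.

import numpy as np

articles = {                                 # toy annotated Wikipedia: article -> linked entities
    "Steve Jobs": ["Steve Jobs", "Apple Inc.", "Pixar", "Steve Wozniak"],
    "Apple Inc.": ["Apple Inc.", "Steve Jobs", "Steve Wozniak", "Apple I"],
    "Pixar":      ["Pixar", "Steve Jobs"],
    "Bill Gates": ["Bill Gates", "Microsoft"],
}
entities = sorted({e for links in articles.values() for e in links})
# entity x article count matrix (one column per Wikipedia article)
M = np.array([[links.count(e) for links in articles.values()] for e in entities], dtype=float)
def diser(e1, e2):
    v1, v2 = M[entities.index(e1)], M[entities.index(e2)]
    return float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))
print(diser("Steve Jobs", "Apple Inc."))     # high: the entities co-occur as links
print(diser("Steve Jobs", "Bill Gates"))     # 0.0 in this toy corpus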
16. ESA vs DiSER Vector (Chapter IV)
Wikipedia-based Distributional Semantics for Entity Relatedness. In: AAAI-FSS-2014
Top dimensions of the DiSER vector for the entity "Brad Pitt": The Tree of Life (film), Falmouth, Cornwall, World War Z (film), What Just Happened, A Mighty Heart (film), Plan B Entertainment, Jamaican Patois, Richard: A Novel, Sobriquet, I Want a Famous Face
Top dimensions of the ESA vector for the text "Brad Pitt": Damiani (jewelry company), University of Pittsburgh Band, Brad Pitt, Make It Right Foundation, Pittsburgh men's basketball, Brangelina, Pittsburgh Panthers baseball, Pitt (Comics), Pitt River, Brad Pitt filmography
18. Entity Relatedness: Dataset
• Absolute relatedness score
  • Relatedness between "Apple Inc." and "Steve Jobs"
  • Very low inter-annotator agreement
• Relative relatedness score
  • Is "Steve Jobs" more related to "Apple Inc." than "Bill Gates" is?
  • High inter-annotator agreement
• KORE (Hoffart et al., 2012)
  • 21 seed entities
  • Every seed entity has a list of 20 candidate entities with relatedness scores
  • 420 entity pairs in total
19. Results: KORE Dataset
Wikipedia-based Distributional Semantics for Entity Relatedness. In: AAAI-FSS-2014
Spearman rank correlation on the KORE dataset:
• Graph-based measures
  Path-DBpedia (Hulpus et al., 2015)        0.610
  WLM (Milne and Witten, 2008)              0.659
  PPR (Agirre et al., 2015)                 0.662
• Corpus-based measures
  Word2Vec (Mikolov et al., 2013)           0.181
  GloVe (Pennington et al., 2014)           0.194
  LSA (Landauer et al., 1998)               0.375
  KORE (Hoffart et al., 2012)               0.679
  ESA (Gabrilovich and Markovitch, 2007)    0.691
  DiSER                                     0.781
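The numbers above are Spearman rank correlations between a measure's scores and the gold ranking. Per seed entity this can be computed as in the following sketch; the five candidate scores are made-up toy values.

from scipy.stats import spearmanr

gold_scores   = [20, 17, 12, 8, 3]             # gold relatedness for one seed's candidates
system_scores = [0.81, 0.77, 0.40, 0.52, 0.05] # scores from the measure under evaluation
rho, _ = spearmanr(gold_scores, system_scores)
print(rho)   # 1.0 only if the measure ranks the candidates exactly like the gold standard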
20. DiSER Vector for non-Wikipedia Entities
22. Context-DiSER: Example
[Figure] Context-DiSER vector for "Savita Halappanavar", an entity without a Wikipedia article at the time: entities appearing in her textual context (Abortion, Abortion-rights movement, The Irish Times, United States pro-life movement, Vincent Browne, Michael D. Higgins, Irish abortion law, Death of Savita, Galway University Hospital, Miscarriage, Catholic Country, ...) are used to build her Context-DiSER representation.
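A minimal sketch of the Context-DiSER idea under simplifying assumptions (hypothetical 4-dimensional DiSER vectors and plain summation as the aggregation step): an entity with no Wikipedia article is represented by combining the DiSER vectors of the entities linked in its textual context, and can then be compared to any Wikipedia entity.

import numpy as np

diser_vec = {   # hypothetical precomputed DiSER vectors (dimensions = Wikipedia articles)
    "Abortion":                   np.array([0.9, 0.1, 0.0, 0.2]),
    "Galway University Hospital": np.array([0.1, 0.8, 0.3, 0.0]),
    "Miscarriage":                np.array([0.4, 0.5, 0.1, 0.0]),
    "Irish abortion law":         np.array([0.8, 0.2, 0.1, 0.1]),
}
def context_diser(context_entities):          # aggregate the context entities' vectors
    v = np.sum([diser_vec[e] for e in context_entities], axis=0)
    return v / np.linalg.norm(v)

savita = context_diser(["Abortion", "Galway University Hospital", "Miscarriage"])
candidate = diser_vec["Irish abortion law"] / np.linalg.norm(diser_vec["Irish abortion law"])
print(float(savita @ candidate))              # relatedness of the out-of-Wikipedia entity to a candidate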
23. Context-DiSER: Results on the KORE Dataset
Wikipedia-based Distributional Semantics for Entity Relatedness. In: AAAI-FSS-2014
Spearman rank correlation:
  KORE (state of the art)              0.679
  Context-ESA                          0.684
  Context-DiSER (manual linking)       0.769
  Context-DiSER (automatic linking)    0.719
24. Thesis Overview: next, Chapter V, Wikipedia-based Features for Entity Recommendation (WiFER)
26. Entity Recommendation: State of the Art
• Classical recommender systems
  • Focus on personalized recommendation
  • Require user-item preferences
• Entity recommendation in web search (Blanco et al., 2013)
  • Co-occurrence features: query logs, query sessions, Flickr tags, tweets
  • Graph-based features: shared connections in the Yahoo knowledge graph and other domain-specific knowledge bases
  • Entity and relation types in the knowledge graph
  • More than 100 features
  • Combines features using learning to rank
27. Wikipedia-based Features for Entity Recommendation (WiFER)
Leveraging Wikipedia Knowledge for Entity Recommendations. In: ISWC 2015
Features (combined using learning to rank):
• Prior probability of entity1
• Prior probability of entity2
• Joint probability
• Conditional probability
• Reverse conditional probability
• Cosine similarity
• Pointwise mutual information
• Distributional semantic model
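Most of these features reduce to co-occurrence statistics over a corpus. A minimal sketch with a hypothetical four-article corpus in which each article is reduced to the set of entities it links to; the thesis computes the same statistics over the full Wikipedia text and entity corpora.

from math import log

articles = [                                  # toy corpus: entity sets per Wikipedia article
    {"Steve Jobs", "Apple Inc.", "Pixar"},
    {"Steve Jobs", "Apple Inc.", "Steve Wozniak"},
    {"Bill Gates", "Microsoft"},
    {"Apple Inc.", "Steve Wozniak"},
]
N = len(articles)
def p(*entities):                             # fraction of articles containing all given entities
    return sum(all(e in a for e in entities) for a in articles) / N

e1, e2 = "Steve Jobs", "Apple Inc."
prior1, prior2 = p(e1), p(e2)                 # prior probabilities
joint = p(e1, e2)                             # joint probability
cond, rev_cond = joint / prior2, joint / prior1   # conditional and reverse conditional
pmi = log(joint / (prior1 * prior2))          # pointwise mutual information
print(prior1, prior2, joint, cond, rev_cond, pmi)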
28. Wikipedia-based Features for Entity Recommendation (WiFER)
The same feature set is computed over two corpora:
• Wikipedia text: prior probability of entity1, prior probability of entity2, joint probability, conditional probability, reverse conditional probability, cosine similarity, pointwise mutual information, distributional semantic model (ESA)
• Wikipedia entities: the same probability and co-occurrence features, with DiSER as the distributional semantic model
29. Combining Features
• Learning to rank
  • Gradient Boosted Decision Trees (GBDT) (Hang Li, 2011)
  • Builds the model in a stage-wise fashion
• Dataset: entity recommendation in web search
  • 4,797 web search queries (entities)
  • Every entity query has a list of entity candidates (47,623 entity pairs)
  • All candidates are tagged on a 5-label scale: Excellent, Prefer, Good, Fair, and Bad
Candidate distribution by entity type:
  Type        Total instances   Percentage
  Location    22,062            46.32
  People      21,626            45.41
  Movies      3,031             6.36
  TV Shows    280               0.58
  Album       563               1.18
  Total       47,623            100
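A minimal pointwise approximation of the GBDT ranking step with scikit-learn; the feature vectors and labels below are toy values, and the actual system uses Yahoo's GBDT learning-to-rank setup over the full feature set rather than this simplification.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# one row per (query entity, candidate entity) pair; labels Bad..Excellent mapped to 0..4
X = np.array([[0.90, 0.80, 3.1],
              [0.20, 0.10, 0.4],
              [0.60, 0.50, 1.2],
              [0.05, 0.02, 0.1]])
y = np.array([4, 1, 2, 0])
ranker = GradientBoostingRegressor(n_estimators=50, max_depth=2).fit(X, y)
scores = ranker.predict(X)                    # candidates of a query are ranked by these scores
print(np.argsort(-scores))                    # indices from most to least recommended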
30. Entity Recommendation: Results
Insights into Entity Recommendation in Web Search. In: IESD at ISWC, 2015
• Evaluation
  • Normalized discounted cumulative gain (NDCG@10)
  • 10-fold cross-validation
  Features                       All      Person   Location
  Spark (Blanco et al., 2013)    0.9276   0.9479   0.8882
  WiFER                          0.9173   0.9431   0.8795
  Spark+WiFER                    0.9325   0.9505   0.8987
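NDCG@10 can be computed per query from the graded labels of the ranked candidates, as in this sketch; the label list is a toy example and a common gain/discount formulation is assumed, which may differ in small details from the one used in the evaluation.

import numpy as np

def dcg(labels):
    labels = np.asarray(labels, dtype=float)
    return float(np.sum((2 ** labels - 1) / np.log2(np.arange(2, len(labels) + 2))))

def ndcg_at_k(labels_in_ranked_order, k=10):
    ideal = sorted(labels_in_ranked_order, reverse=True)
    denom = dcg(ideal[:k])
    return dcg(list(labels_in_ranked_order)[:k]) / denom if denom > 0 else 0.0

# graded labels (Bad=0 .. Excellent=4) in the order the system ranked the candidates
print(ndcg_at_k([4, 2, 3, 0, 1], k=10))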
31. Entity Recommendation: Feature Analysis in Spark+WiFER
Insights into Entity Recommendation in Web Search. In: IESD at ISWC, 2015
[Figure] Feature importance in the combined Spark+WiFER model; features shown include:
• Relation type
• Cosine similarity over Flickr tags
• Probability of target entity over the Wikipedia text corpus
• CF7 over Flickr tags
• DSM over the Wikipedia entities corpus (DiSER)
• Conditional user probability over query terms
• DSM over the Wikipedia text corpus (ESA)
• Probability of source entity over the Wikipedia entities corpus
• Probability of target entity over Flickr tags
• Probability of target entity over the Wikipedia entities corpus
32. Thesis Overview: next, Chapter VI, Non-Orthogonal Explicit Semantic Analysis (NESA)
34. Orthogonality in ESA (Chapter VI)
Improving ESA with Document Similarity. In: ECIR-2013
ESA assumes that related words share highly weighted concepts in their distributional vectors.
Top concepts for "soccer": History of Soccer in the United States, Soccer in the United States, United States Soccer Federation, North American Soccer League, United Soccer Leagues
Top concepts for "football": FIFA, Football, History of association football, Football in England, Association football
The two vectors share no dimensions, so ESA(football, soccer) = 0.0
35. Non-Orthogonal Explicit Semantic Analysis (NESA) (Chapter VI)
Improving ESA with Document Similarity. In: ECIR-2013
Top concepts for "soccer": History of Soccer in the United States, Soccer in the United States, United States Soccer Federation, North American Soccer League, United Soccer Leagues
Top concepts for "football": FIFA, Football, History of association football, Football in England, Association football
NESA(football, soccer) = (FIFA x Soccer in the United States + FIFA x United Soccer Leagues + ...) = 0.38
36. Non-Orthogonal Explicit Semantic Analysis (NESA)
• ESA: v1 and v2 are the n-dimensional vectors for words w1 and w2
  rel_ESA(w1, w2) = v1^T . v2
• NESA: correlation between vector dimensions
  rel_NESA(w1, w2) = v1^T . C . v2, where C is the n x n dimension-correlation matrix, C = E^T . E
• Dimension correlation methods
  • DiSER scores between the corresponding Wikipedia articles
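A minimal numpy sketch of the NESA computation; the 4-dimensional ESA vectors and the random article representations E are toy assumptions, whereas in the thesis the dimension correlations come from DiSER or similar article-level relatedness scores.

import numpy as np

rng = np.random.default_rng(0)
n = 4                                   # explicit dimensions (Wikipedia articles)
E = rng.normal(size=(16, n))            # hypothetical representation of each article (one column each)
E /= np.linalg.norm(E, axis=0)          # unit columns, so C holds cosine similarities
C = E.T @ E                             # n x n dimension-correlation matrix, C = E^T E

v1 = np.array([0.0, 0.9, 0.4, 0.0])     # ESA vector of word 1 (toy weights)
v2 = np.array([0.7, 0.0, 0.0, 0.3])     # ESA vector of word 2: shares no dimensions with v1
print(float(v1 @ v2))                   # ESA relatedness: 0.0 (orthogonal dimensions)
print(float(v1 @ C @ v2))               # NESA relatedness: non-zero via correlated dimensions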
37. Word Relatedness Datasets
• WN353: 353 word pairs annotated by 13-15 experts on a scale of 0-10
• RG65: 65 word pairs annotated by 51 experts on a scale of 0-4
• MC30: 30 word pairs annotated by 38 experts on a scale of 0-4
• MT287: 287 word pairs annotated by 10-12 annotators (Amazon Mechanical Turk) on a scale of 0-1
38. NESA: Results
Non-Orthogonal Explicit Semantic Analysis. In: *SEM-2015 (Chapter VI)
Spearman rank correlation with word similarity gold standard datasets:
              WN353   MC30    RG65    MT287
  LSA         0.579   0.667   0.616   0.555
  LSA (Wiki)  0.538   0.744   0.697   0.353
  Word2Vec    0.663   0.824   0.751   0.560
  ESA         0.660   0.765   0.826   0.507
  NESA        0.696   0.784   0.839   0.572
39. NESA: Results
Non-Orthogonal Explicit Semantic Analysis. In: *SEM-2015 (Chapter VI)
• Word similarity vs relatedness (Agirre et al., 2009)
  • WN353Rel: 252 related word pairs from WN353
  • WN353Sim: 203 similar word pairs from WN353
Spearman rank correlation on the similarity vs relatedness splits:
              WN353Rel   WN353Sim
  LSA         0.521      0.662
  LSA (Wiki)  0.506      0.559
  Word2Vec    0.601      0.741
  ESA         0.643      0.663
  NESA        0.663      0.719
40. Outline (revisited): next, Application and Industry Use Cases
42. EnRG SPARQL Endpoint
[Screenshot] EnRG SPARQL endpoint, National University of Ireland, Galway
43. Industrial Use Cases
• Medical entity linking for question answering and relationship explanation in a knowledge graph
• Entity recommendation in web search
• Company name disambiguation for social profiling
44. Conclusion
• Entity Relatedness
  • Distributional Semantics for Entity Relatedness (DiSER)
  • Outperformed state-of-the-art entity relatedness measures
• Entity Recommendation
  • Wikipedia-based Features for Entity Recommendation (WiFER)
  • Effective features for entity recommendation in web search
• Text Relatedness
  • Non-Orthogonal Explicit Semantic Analysis (NESA)
  • Outperformed existing word relatedness measures
• Entity Relatedness Graph (EnRG)
  • Contains all Wikipedia entities and their pre-computed relatedness scores
  • Contains distributional vectors for all Wikipedia entities
45. Future Research Directions
• Relationship explanation for recommended entities
  • Best path in the knowledge graph
  • Best natural-language description
• Knowledge discovery
  • Analogy querying over the knowledge graph, e.g., Google to Motorola => Microsoft to ?
  • Example-based querying, e.g., Google to Motorola => ? to ?
Don’t call them concept try to be specific like technologies for SemWeb
Change box in horizontal boxes not vertical onces
Entity describe definitions in different communities and the definition we will carry out in our presentation
Relatedness describe it with the notion of similarity vs relatedness by illustrating wordnet relations (taxonomic vs others), further describe a simple van diagram
We do not distinguish between entity and concept like football player
Entity describe definitions in different communities and the definition we will carry out in our presentation
Relatedness describe it with the notion of similarity vs relatedness by illustrating wordnet relations (taxonomic vs others), further describe a simple van diagram
Relatedness reflects the degree of associativity, connectivity
Relatedness score: University => Student, building
Similarity score => Student, building
Relatedness score: University => Student, bio lab
Similarity score => Student, bio lab
Change box to chapter names
Entity Relatedness => DiSER
Context-VSM and Context-ESA
- Vector similarity between corresponding Wikipedia articles
- ESA score between corresponding Wikipedia article
Change to diser explanation
Change to diser explanation
Backup slide on Vector composition
Lucene based “Brad Pitt”
Add wiki markups to show one sense per document
Highlight the relevant articles in both vectors
Merge next slide with one
Explain entity disambiguation for context-diser
Wikipedia text and entity tagged
One thing to notice:
We only get the articles that contain the given entity as wikipedia links not only world
So, It performs better than text DSM
Describe GBDT
Change table
Change table
Change x to pairwise sim symbol
Change to equation
Consistency in subscript and superscript
Backup: Word Relatedness Dataset Details
• MC30: 30 noun pairs with relatedness scores on a scale of 0-4, prepared by Miller and Charles (1991); scores were provided by 38 human experts.
• WN353: 353 word pairs annotated by 13-15 human experts on a scale of 0-10 (10 = highly related, 0 = unrelated); contains generic words as well as named entities.
• WN353Sim and WN353Rel: a refinement of WN353 by Agirre et al. (2009) into similar and related word pairs. Two words are similar if they are connected through a taxonomic relation such as synonymy or hyponymy; two words are related if they are connected through relations such as meronymy or holonymy. WN353Rel and WN353Sim contain 252 and 203 word pairs respectively.
• RG65: 65 non-technical word pairs annotated by 51 human experts.
• MT771: 771 word pairs with relatedness scores; the words are generic and drawn from many domains.
• MT287: 287 word pairs with relatedness scores, prepared using Amazon Mechanical Turk (MT).