Leveraging Wikipedia-based
Features for Entity Relatedness and
Recommendations
Nitish Aggarwal
Supervised by Dr. Paul Buitelaar
PhD Viva
Brad Pitt
Motivation
Motivation
Entity Recommendation

Semantic Web
Technologies:
1. RDF
2. SPARQL
3. Ontology
4. Linked data
5. Turtle (syntax)
Companies:
1. Metaweb
2. Ontoprise GmbH
3. OpenLink Software
4. Ontotext
5. Powerset (company)

Myosin
Proteins and cells:
1. Actin
2. Muscle contraction
3. Sarcomere
4. Myofibril
5. Cytoskeleton
Biologists:
1. Hugh Huxley
2. James Spudich
3. Ronald Vale
4. Manuel Morales
5. Brunó Ferenc Straub
Entity Relatedness
Determine the degree of relatedness between two entities
Brad Pitt ? Tom Cruise
Background
Entity
• Person, location, organization
• Time, date, money, percent
• Event, movie, disease, symptom, side effect, law, license and more
• Many such types are covered in Wikipedia
• More than 2K classes in DBpedia
• More than 350k classes in Yago
• Every Wikipedia article is considered to be about an entity
Background
Relatedness
[Figure: terms Motor vehicle, Car, Motorcycle, Automobile, Auto, Car seat and Car window connected by edges labelled s, h and m, placed along a substitutability scale: Synonyms, Similar, Related]
Outline
• Motivation
• Entity Relatedness
• Distributional Semantics for Entity Relatedness (DiSER)
• Evaluation
• Entity Recommendation
• Wikipedia-based Features for Entity Recommendation (WiFER)
• Evaluation
• Text Relatedness
• Non-Orthogonal Explicit Semantic Analysis (NESA)
• Evaluation
• Application and Industry Use Cases
• Conclusion
Thesis Overview
• Chapter IV: Distributional Semantics for Entity Relatedness (DiSER): Distributional Representation
• Chapter V: Wikipedia Features for Entity Recommendation (WiFER): Feature Extraction
• Chapter VI: Non-Orthogonal Explicit Semantic Analysis (NESA)
Motivation Entity Relatedness Entity Recommendation Text Relatedness Application Conclusion
Entity Relatedness
Entity Relatedness: State of the Art
• Graph-based methods
• Path distance in Wikipedia graph (Strube and Ponzetto, 2006)
• Normalized Google Distance on Wikipedia graph (Witten and Milne, 2008)
• Personalized PageRank on Wikipedia graph (Agirre et al., 2015)
• Path-based measures on DBpedia graph (Hulpus et al., 2015)
• Corpus-based methods
• Key-phrase Overlap for Related Entities (KORE): partial overlaps between key-phrases in corresponding Wikipedia articles (Hoffart et al., 2012)
• Text relatedness measures: use co-occurrence information in text
Entity Relatedness: State of the Art
Distributional Semantics
Explicit Semantic Analysis (ESA)
Uses explicit (manually defined) concepts such as Wikipedia articles, where every article is considered to describe a single concept (Gabrilovich and Markovitch, 2007)

         doc1  doc2  doc3  doc4  ...  docn
word1    W11   W12   W13   W14   ...  W1n
word2    W21   W22   W23   W24   ...  W2n
word3    W31   W32   W33   W34   ...  W3n
...
wordm    Wm1   Wm2   Wm3   Wm4   ...  Wmn
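The word-by-document matrix above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four toy "articles" are invented, the weights are plain TF-IDF, and the relatedness score is the cosine between two word rows. The names `build_index` and `esa_relatedness` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus: article title -> article text (not real Wikipedia text).
ARTICLES = {
    "Association football": "football is a team sport played with a ball between two teams",
    "FIFA": "fifa governs international football and organises the world cup",
    "Actin": "actin is a protein that forms filaments in muscle cells",
    "Myosin": "myosin is a motor protein that drives muscle contraction",
}

def build_index(articles):
    """Return {word: {article: tf-idf weight}}, i.e. one sparse row per word."""
    n_docs = len(articles)
    term_counts = {title: Counter(text.split()) for title, text in articles.items()}
    doc_freq = Counter(word for counts in term_counts.values() for word in counts)
    index = {}
    for title, counts in term_counts.items():
        for word, tf in counts.items():
            idf = math.log(n_docs / doc_freq[word])
            if idf > 0:  # words that occur in every article carry no signal
                index.setdefault(word, {})[title] = tf * idf
    return index

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

def esa_relatedness(index, w1, w2):
    """Relatedness of two words as the cosine of their concept vectors."""
    return cosine(index.get(w1, {}), index.get(w2, {}))

INDEX = build_index(ARTICLES)
```

In this toy corpus "protein" and "muscle" occur in the same two articles, so their concept vectors point in the same direction, while "football" and "protein" share no article at all.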
Entity Relatedness: State of the Art
Distributional Semantics
Implicit/Latent Semantic Analysis (LSA)
Transforms sparse document space into a dense latent topic space
• Latent Semantic Analysis (LSA) (Deerwester et al., 1990)
• Latent Dirichlet Allocation (LDA) (Blei et al., 2003)
• Neural Embeddings (Word2Vec) (Mikolov et al., 2013)

Word-document matrix (n ~ 1M):
         doc1  doc2  doc3  doc4  ...  docn
word1    W11   W12   W13   W14   ...  W1n
word2    W21   W22   W23   W24   ...  W2n
word3    W31   W32   W33   W34   ...  W3n
...
wordm    Wm1   Wm2   Wm3   Wm4   ...  Wmn

Dimensionality Reduction (k < 1000):
         topic1  topic2  ...  topick
word1    W11     W12     ...  W1k
word2    W21     W22     ...  W2k
...
wordm    Wm1     Wm2     ...  Wmk
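The reduction from n document dimensions to k latent dimensions can be illustrated with a truncated singular value decomposition, which is the technique behind LSA. The sketch below is an assumption-laden toy: it finds the top-k right singular vectors by power iteration with deflation in pure Python (a real system would call a sparse SVD routine), and the 4x4 matrix and the names `top_right_singular_vectors` and `reduce_rows` are hypothetical.

```python
import math

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def top_right_singular_vectors(A, k, iters=200):
    """Top-k right singular vectors of A via power iteration with deflation."""
    n = len(A[0])
    # Gram matrix G = A^T A; its eigenvectors are the right singular vectors of A.
    G = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    vectors = []
    for _component in range(k):
        v = [1.0 / (i + 1) for i in range(n)]  # deterministic starting vector
        for _step in range(iters):
            w = matvec(G, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        eigenvalue = sum(x * y for x, y in zip(v, matvec(G, v)))
        vectors.append(v)
        # Deflate: subtract the component just found so the next one can emerge.
        G = [[G[i][j] - eigenvalue * v[i] * v[j] for j in range(n)] for i in range(n)]
    return vectors

def reduce_rows(A, k):
    """Project every row (word) of A onto the top-k latent dimensions."""
    V = top_right_singular_vectors(A, k)
    return [[sum(a * x for a, x in zip(row, v)) for v in V] for row in A]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical 4-word x 4-document count matrix with two clear "topics".
A = [
    [2, 1, 0, 0],
    [2, 1, 0, 0],
    [0, 0, 1, 3],
    [0, 0, 1, 3],
]
REDUCED = reduce_rows(A, k=2)
```

Each word is now a 2-dimensional vector: words from the same block of documents end up with the same direction, words from different blocks stay close to orthogonal.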
Limitations of Text Relatedness Measures
• Compositionality
• Most entities are multiword expressions
• Vector(Brad Pitt) = Vector(Brad) + Vector(Pitt) ?
• Ambiguity
• Vector of an entity with an ambiguous name like “Nice” (French city)
Chapter IV
Distributional Semantics for Entity Relatedness (DiSER)
Wikipedia-based Distributional Semantics for Entity Relatedness
In: AAAI-FSS-2014

          doc1  doc2  doc3  doc4  ...  docn
entity1   W11   W12   W13   W14   ...  W1n
entity2   W21   W22   W23   W24   ...  W2n
entity3   W31   W32   W33   W34   ...  W3n
...
entityn   Wn1   Wn2   Wn3   Wn4   ...  Wnn

[Steve Jobs] co-founded Apple in 1976 to sell Wozniak’s [Apple I] [Personal Computer]. [Steve Jobs | Jobs] was CEO of [Apple Inc. | Apple] and largest shareholder of [Pixar]. Jobs is widely recognized as a pioneer of the [Microcomputer Revolution], along with [Steve Wozniak | Wozniak].

Annotated Wikipedia with entities (Wikipedia entities, one sense per document):

[Steve Jobs] co-founded [Apple Inc. | Apple] in 1976 to sell [Steve Wozniak | Wozniak]’s [Apple I] [Personal Computer]. [Steve Jobs | Jobs] was CEO of [Apple Inc. | Apple] and largest shareholder of [Pixar]. [Steve Jobs | Jobs] is widely recognized as a pioneer of the [Microcomputer Revolution], along with [Steve Wozniak | Wozniak].
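The "one sense per document" annotation step shown above can be sketched as follows. This is a minimal illustration, not the thesis implementation: it assumes the bracket notation `[Entity | surface]` used on the slide, collects every surface form already linked in the document, and links the remaining unlinked mentions of those surface forms to the same entity. The function names are hypothetical.

```python
import re
from collections import Counter

BRACKET = re.compile(r"(\[[^\]]+\])")

def parse_annotation(token):
    """Split '[Entity | surface]' or '[Entity]' into (entity, surface)."""
    parts = [p.strip() for p in token[1:-1].split("|", 1)]
    return parts[0], parts[-1]

def one_sense_per_document(text):
    """Link unlinked mentions to the entity their surface form already has in this document."""
    segments = BRACKET.split(text)  # alternates plain text and bracketed annotations
    sense = {}
    for seg in segments:
        if BRACKET.fullmatch(seg):
            entity, surface = parse_annotation(seg)
            sense.setdefault(surface, entity)  # first sense seen wins
            sense.setdefault(entity, entity)
    if not sense:
        return text
    # Longest surface forms first, so "Apple I" is tried before "Apple".
    surfaces = sorted(sense, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(s) for s in surfaces) + r")\b")

    def link(match):
        surface = match.group(1)
        entity = sense[surface]
        return f"[{entity}]" if entity == surface else f"[{entity} | {surface}]"

    return "".join(seg if BRACKET.fullmatch(seg) else pattern.sub(link, seg)
                   for seg in segments)

def entity_counts(annotated_text):
    """Entity occurrence counts for one document, i.e. one column of the entity-document matrix."""
    return Counter(parse_annotation(tok)[0] for tok in BRACKET.findall(annotated_text))

RAW = (
    "[Steve Jobs] co-founded Apple in 1976 to sell Wozniak's [Apple I] "
    "[Personal Computer]. [Steve Jobs | Jobs] was CEO of [Apple Inc. | Apple] "
    "and largest shareholder of [Pixar]. Jobs is widely recognized as a pioneer "
    "of the [Microcomputer Revolution], along with [Steve Wozniak | Wozniak]."
)
ANNOTATED = one_sense_per_document(RAW)
COUNTS = entity_counts(ANNOTATED)
```

Counting entities rather than words in each annotated article gives the entity-document weights; how the counts are weighted afterwards (for example with TF-IDF) is left open here.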
ESA vs DiSER Vector

Brad Pitt (DiSER):
• The Tree of Life (film)
• Falmouth, Cornwall
• World War Z (film)
• What Just Happened
• A Mighty Heart (film)
• Plan B Entertainment
• Jamaican Patois
• Richard: A Novel
• Sobriquet
• I Want a Famous Face

Brad Pitt (ESA):
• Damiani (jewelry company)
• University of Pittsburgh Band
• Brad Pitt
• Make It Right Foundation
• Pittsburgh men’s basketball
• Brangelina
• Pittsburgh Panthers baseball
• Pitt (Comics)
• Pitt River
• Brad Pitt filmography
Entity Relatedness: Evaluation
Entity Relatedness: Dataset
• Absolute relatedness score
• Relatedness between “Apple Inc.” and “Steve Jobs”
• Very low inter-annotator agreement
• Relative relatedness score
• Is “Steve Jobs” more related to “Apple Inc.” than “Bill Gates” is?
• High inter-annotator agreement
• KORE (Hoffart et al., 2012)
• 21 seed entities
• Every seed entity has a list of 20 entities with their relatedness scores
• 420 entity pairs in total
Results: KORE Dataset

Approaches                                   Spearman Rank Correlation
Graph-based measures
  Path-DBpedia (Hulpus et al., 2015)         0.610
  WLM (Witten and Milne, 2008)               0.659
  PPR (Agirre et al., 2015)                  0.662
Corpus-based measures
  Word2Vec (Mikolov et al., 2013)            0.181
  GloVe (Pennington et al., 2014)            0.194
  LSA (Landauer et al., 1998)                0.375
  KORE (Hoffart et al., 2012)                0.679
  ESA (Gabrilovich and Markovitch, 2007)     0.691
  DiSER                                      0.781
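For readers unfamiliar with the metric, Spearman rank correlation compares the ranking induced by a system's scores with the gold ranking. The sketch below computes it in pure Python as the Pearson correlation of average ranks; the function names and the five-item example are hypothetical, and how scores are aggregated over the 21 seed entities is not specified here.

```python
import math

def average_ranks(values):
    """Rank values from 1..n in ascending order; ties get the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def spearman(gold, predicted):
    """Spearman rank correlation between gold scores and predicted scores."""
    return pearson(average_ranks(gold), average_ranks(predicted))

# Hypothetical gold relatedness values and system scores for five candidates of one seed.
GOLD = [5, 4, 3, 2, 1]
SYSTEM = [0.91, 0.75, 0.80, 0.30, 0.12]
RHO = spearman(GOLD, SYSTEM)
```

Only the order of the scores matters: rescaling a system's scores leaves the correlation unchanged, which is why measures with very different score ranges can be compared in one table.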
DiSER Vector for non-Wikipedia Entities
Context-DiSER
Article about Savita (BBC: http://www.bbc.com/news/world-europe-22204377)
• Noun phrase extraction: StanfordNLP
• Entity linking: Prior probability
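Linking by prior probability can be sketched as picking, for each extracted noun phrase, the entity that the phrase most often refers to. The sketch assumes the prior is estimated from (surface form, linked entity) pairs such as Wikipedia anchor texts; that estimation source, the toy counts, and the names `prior_probabilities` and `link_phrases` are assumptions for illustration only.

```python
from collections import Counter, defaultdict

# Hypothetical anchor statistics: (surface form, entity it was linked to).
ANCHORS = [
    ("Nice", "Nice"), ("Nice", "Nice"), ("Nice", "Nice"),
    ("Nice", "OGC Nice"),
    ("Galway", "Galway"), ("Galway", "Galway"),
    ("Galway", "County Galway"),
]

def prior_probabilities(anchors):
    """Estimate P(entity | surface form) from link counts."""
    counts = defaultdict(Counter)
    for surface, entity in anchors:
        counts[surface][entity] += 1
    return {
        surface: {entity: c / sum(entities.values()) for entity, c in entities.items()}
        for surface, entities in counts.items()
    }

def link_phrases(noun_phrases, priors):
    """Map each known noun phrase to its most probable entity; unknown phrases are skipped."""
    links = {}
    for phrase in noun_phrases:
        candidates = priors.get(phrase)
        if candidates:
            links[phrase] = max(candidates, key=candidates.get)
    return links

PRIORS = prior_probabilities(ANCHORS)
LINKS = link_phrases(["Nice", "Galway", "unknown phrase"], PRIORS)
```

The prior ignores the surrounding context, so it always chooses the dominant sense of a phrase.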
Context-DiSER
[Figure: Savita Halappanavar with the items extracted from the article: Abortion, Abortion-rights movement, The Irish Times, United States pro-life movement, Vincent Browne, Michael D. Higgins, Irish abortion law, Death of Savita, Galway University Hospital, Miscarriage, Catholic Country, ...]
Context-DiSER: Results on KORE Dataset

Approaches                            Spearman Rank Correlation
KORE (state of the art)               0.679
Context-ESA                           0.684
Context-DiSER (Manual linking)        0.769
Context-DiSER (Automatic linking)     0.719
Entity Recommendation
Entity Recommendation: State of the Art
• Classical Recommendation Systems
• Focus on personalized recommendation
• Require user-item preferences
• Entity Recommendation in Web Search (Blanco et al., 2013)
• Co-occurrence features: query logs, query sessions, Flickr tags, tweets
• Graph-based features: shared connections in the Yahoo knowledge graph and other domain-specific knowledge bases
• Entity and relation types in the knowledge graph
• More than 100 features
• Combines features using learning to rank
Wikipedia-based Features for Entity Recommendation (WiFER)
Leveraging Wikipedia Knowledge for Entity Recommendations
In: ISWC 2015
Features:
• Prior Probability of Entity1
• Prior Probability of Entity2
• Joint Probability
• Conditional Probability
• Reverse Conditional Probability
• Cosine Similarity
• Pointwise Mutual Information
• Distributional Semantic Model
The features are combined with Learning to Rank.
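The co-occurrence features listed above can be sketched from document-level counts. The definitions below are the usual textbook ones and are an assumption, since the slide only names the features: probabilities are document frequencies divided by the number of documents, "conditional" is taken as P(entity2 | entity1), and cosine is computed over binary occurrence vectors. The toy documents and the name `cooccurrence_features` are hypothetical, and the distributional semantic model feature is left out.

```python
import math

# Hypothetical corpus: each document is the set of entities annotated in it.
DOCS = [
    {"Brad Pitt", "Angelina Jolie", "Plan B Entertainment"},
    {"Brad Pitt", "Angelina Jolie"},
    {"Brad Pitt", "Tom Cruise"},
    {"Tom Cruise"},
]

def cooccurrence_features(docs, e1, e2):
    """Co-occurrence features for the ordered entity pair (e1, e2)."""
    n = len(docs)
    n1 = sum(1 for d in docs if e1 in d)
    n2 = sum(1 for d in docs if e2 in d)
    n12 = sum(1 for d in docs if e1 in d and e2 in d)
    p1, p2, p12 = n1 / n, n2 / n, n12 / n
    return {
        "prior_e1": p1,
        "prior_e2": p2,
        "joint": p12,
        "conditional": p12 / p1 if p1 else 0.0,          # assumed to mean P(e2 | e1)
        "reverse_conditional": p12 / p2 if p2 else 0.0,  # assumed to mean P(e1 | e2)
        "cosine": n12 / math.sqrt(n1 * n2) if n1 and n2 else 0.0,
        "pmi": math.log(p12 / (p1 * p2)) if p12 else float("-inf"),
    }

FEATURES = cooccurrence_features(DOCS, "Brad Pitt", "Angelina Jolie")
```

The same function can be run once over word occurrences in Wikipedia text and once over entity annotations, which gives two parallel feature sets for every entity pair.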
Wikipedia-based Features for Entity Recommendation (WiFER)

Wikipedia Text:
• Prior Probability of Entity1
• Prior Probability of Entity2
• Joint Probability
• Conditional Probability
• Cosine Similarity
• Pointwise Mutual Information
• Reverse Conditional Probability
• Distributional Semantic Model (ESA)

Wikipedia Entities:
• Prior Probability of Entity1
• Prior Probability of Entity2
• Joint Probability
• Conditional Probability
• Cosine Similarity
• Pointwise Mutual Information
• Reverse Conditional Probability
• Distributional Semantic Model (DiSER)
Combining Features
• Learning to Rank
• Gradient Boosted Decision Trees (GBDT) (Li Hang, 2011)
• It builds the model in a stage-wise fashion
• Dataset: Entity recommendation in web search
• 4,797 web search queries (entities)
• Every entity query has a list of entity candidates (47,623 entity pairs)
• All candidates are labelled on a 5-level scale: Excellent, Prefer, Good, Fair, and Bad

Type        Total instances   Percentage
Location    22,062            46.32
People      21,626            45.41
Movies      3,031             6.36
TV Shows    280               0.58
Album       563               1.18
Total       47,623            100
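The stage-wise construction mentioned for GBDT can be sketched with one-split regression trees (stumps) fitted to the residuals of a squared loss. This is a deliberately small pointwise illustration, not the ranking model used in the thesis: the feature values, the numeric labels 0 to 4 for the five grades, and all function names are hypothetical.

```python
def fit_stump(X, residuals):
    """Find the single feature/threshold split that minimises squared error."""
    best = None
    for f in range(len(X[0])):
        for threshold in sorted({row[f] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[f] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[f] > threshold]
            if not left or not right:
                continue
            left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
            error = (sum((r - left_mean) ** 2 for r in left)
                     + sum((r - right_mean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, f, threshold, left_mean, right_mean)
    return best[1:] if best else None

def predict_stump(stump, row):
    f, threshold, left_mean, right_mean = stump
    return left_mean if row[f] <= threshold else right_mean

def fit_gbdt(X, y, n_stages=50, learning_rate=0.1):
    """Add one stump per stage, each fitted to the current residuals."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _stage in range(n_stages):
        residuals = [t - p for t, p in zip(y, predictions)]  # negative gradient of squared loss
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, row)
                       for p, row in zip(predictions, X)]
    return base, stumps, learning_rate

def predict(model, row):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)

def training_mse(model, X, y):
    return sum((predict(model, row) - t) ** 2 for row, t in zip(X, y)) / len(y)

# Hypothetical feature vectors for five candidates and their graded labels (4 = best).
X = [[0.9, 0.8], [0.7, 0.1], [0.4, 0.5], [0.2, 0.3], [0.1, 0.0]]
Y = [4, 3, 2, 1, 0]
MODEL_1 = fit_gbdt(X, Y, n_stages=1)
MODEL_50 = fit_gbdt(X, Y, n_stages=50)
```

Candidates are then ordered by predicted score; each additional stage corrects part of the error left by the previous stages.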
Entity Recommendation: Results
Insights into Entity Recommendation in Web Search
In: IESD at ISWC, 2015
• Evaluation
• Normalized discounted cumulative gain (NDCG@10)
• 10-fold cross-validation

Features                       All      Person   Location
Spark (Blanco et al., 2013)    0.9276   0.9479   0.8882
WiFER                          0.9173   0.9431   0.8795
Spark+WiFER                    0.9325   0.9505   0.8987
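NDCG@10 rewards rankings that place highly graded candidates near the top. The sketch below uses one common formulation, with gain 2^rel - 1 and a log2 position discount; both that gain function and the mapping of the five labels to the integers 4 (Excellent) down to 0 (Bad) are assumptions, as the slides do not state them.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k items of a ranked list of grades."""
    return sum((2 ** rel - 1) / math.log2(position + 2)
               for position, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    """DCG normalised by the DCG of the ideal (sorted) ordering; 0.0 if nothing is relevant."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical grades of the candidates in the order a system ranked them.
SYSTEM_RANKING = [3, 4, 0, 2, 1]
SCORE = ndcg_at_k(SYSTEM_RANKING, k=10)
```

A perfect ordering scores 1.0, so the values in the table can be read as the fraction of the ideal gain each feature set achieves.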
Entity Recommendation: Feature Analysis in Spark+WiFER
• Relation type
• Cosine similarity over Flickr tags
• Probability of target entity over Wikipedia text corpus
• CF7 over Flickr tags
• DSM over Wikipedia entities corpus (DiSER)
• Conditional user probability over query terms
• DSM over Wikipedia text corpus (ESA)
• Probability of source entity over Wikipedia entities corpus
• Probability of target entity over Flickr tags
• Probability of target entity over Wikipedia entities corpus
Text Relatedness:
Non-Orthogonal Explicit Semantic Analysis (NESA)
Orthogonality in ESA
Chapter VI
Improving ESA with Document Similarity
In: ECIR-2013
ESA assumes that related words share highly weighted concepts in their distributional vectors

“soccer”:
• History of Soccer in the United States
• Soccer in the United States
• United States Soccer Federation
• North American Soccer League
• United Soccer Leagues

“football”:
• FIFA
• Football
• History of association football
• Football in England
• Association football

ESA(football, soccer) = 0.0
Non-Orthogonal Explicit Semantic Analysis (NESA)

“soccer”:
• History of Soccer in the United States
• Soccer in the United States
• United States Soccer Federation
• North American Soccer League
• United Soccer Leagues

“football”:
• FIFA
• Football
• History of association football
• Football in England
• Association football

NESA(football, soccer) = (FIFA × Soccer in the United States + FIFA × United Soccer Leagues + ...) = 0.38
Non-Orthogonal Explicit Semantic Analysis (NESA)
• ESA: $v_1$ and $v_2$ are the $n$-dimensional vectors for words $w_1$ and $w_2$
• $\mathrm{rel}_{\mathrm{ESA}}(w_1, w_2) = v_1^{T} \cdot v_2$
• NESA: Correlation between vector dimensions
• $\mathrm{rel}_{\mathrm{NESA}}(w_1, w_2) = v_1^{T} \cdot C \cdot v_2$
• $C_{n \times n} = E^{T} \cdot E$
• Dimension correlation methods
• DiSER scores between corresponding Wikipedia articles
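The difference between the two scores can be made concrete with a four-concept toy example. All numbers below are invented for illustration (the concept weights and the correlation matrix C are hypothetical, not DiSER scores), and the vectors are dense lists only for readability.

```python
def esa_score(v1, v2):
    """rel_ESA = v1^T . v2: only shared dimensions contribute."""
    return sum(a * b for a, b in zip(v1, v2))

def nesa_score(v1, C, v2):
    """rel_NESA = v1^T . C . v2: correlated dimensions contribute as well."""
    c_v2 = [sum(c * b for c, b in zip(row, v2)) for row in C]
    return sum(a * x for a, x in zip(v1, c_v2))

# Dimensions (concepts), in order:
# FIFA, Association football, Soccer in the United States, United Soccer Leagues
FOOTBALL = [0.8, 0.6, 0.0, 0.0]
SOCCER = [0.0, 0.0, 0.6, 0.8]

# Hypothetical symmetric concept-correlation matrix with ones on the diagonal.
C = [
    [1.0, 0.5, 0.3, 0.2],
    [0.5, 1.0, 0.4, 0.3],
    [0.3, 0.4, 1.0, 0.6],
    [0.2, 0.3, 0.6, 1.0],
]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

ESA_VALUE = esa_score(FOOTBALL, SOCCER)
NESA_VALUE = nesa_score(FOOTBALL, C, SOCCER)
```

With the identity matrix in place of C the two scores coincide, which shows that ESA is the special case in which all concepts are treated as orthogonal.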
Word Relatedness Datasets
• WN353
• 353 word pairs annotated by 13-15 experts on a scale of 1-10
• RG65
• 65 word pairs annotated by 51 experts on a scale of 0-4
• MC30
• 30 word pairs annotated by 38 experts on a scale of 0-1
• MT287
• 287 word pairs annotated by 10-12 experts on a scale of 0-1
NESA: Results
Chapter VI
Non-Orthogonal Explicit Semantic Analysis
In: *SEM-2015
Spearman rank correlation with word similarity gold standard datasets

              WN353   MC30    RG65    MT287
LSA           0.579   0.667   0.616   0.555
LSA (Wiki)    0.538   0.744   0.697   0.353
Word2Vec      0.663   0.824   0.751   0.560
ESA           0.66    0.765   0.826   0.507
NESA          0.696   0.784   0.839   0.572
NESA: Results
• Word similarity vs relatedness (Agirre et al., 2009)
• WN353Rel: 202 word pairs from WN353
• WN353Sim: 252 word pairs from WN353
Spearman rank correlation with word similarity vs relatedness datasets

              WN353Rel   WN353Sim
LSA           0.521      0.662
LSA (Wiki)    0.506      0.559
Word2Vec      0.601      0.741
ESA           0.643      0.663
NESA          0.663      0.719
Chapter VII
http://enrg.insight-centre.org/
EnRG SPARQL Endpoint
National University of Ireland, Galway
Industrial Use Cases
• Medical entity linking for question-answering and relationship explanation in Knowledge Graph
• Entity Recommendation in Web Search
• Company name disambiguation for social profiling
Conclusion
• Entity Relatedness
• Distributional Semantics for Entity Relatedness (DiSER)
• Outperformed state-of-the-art entity relatedness measures
• Entity Recommendation
• Wikipedia-based Features for Entity Recommendation (WiFER)
• Effective features for entity recommendation in web search
• Text Relatedness
• Non-Orthogonal Explicit Semantic Analysis (NESA)
• Outperformed other existing word relatedness measures
• Entity Relatedness Graph (EnRG)
• Contains all Wikipedia entities and their pre-computed relatedness scores
• Contains distributional vectors for all Wikipedia entities
Future Research Directions
• Relationship explanation for recommended entities
• Best path in knowledge graph
• Best natural language description
• Knowledge discovery
• Analogy querying over knowledge graph
e.g. Google to Motorola => Microsoft to ?
• Example-based querying
e.g. Google to Motorola => ? to ?
Related Queries?

Semantic Representation of Provenance in Wikipedia
 

Leveraging Wikipedia-based Features for Entity Relatedness and Recommendations

  • 1. Leveraging Wikipedia-based Features for Entity Relatedness and Recommendations Nitish Aggarwal Supervised by Dr. Paul Buitelaar PhD Viva
  • 3. Motivation: Entity Recommendation
      Semantic Web
        Technologies: 1. RDF 2. SPARQL 3. Ontology 4. Linked data 5. Turtle (syntax)
        Companies: 1. Metaweb 2. Ontoprise GmbH 3. OpenLink Software 4. Ontotext 5. Powerset (company)
      Myosin
        Proteins and cells: 1. Actin 2. Muscle contraction 3. Sarcomere 4. Myofibril 5. Cytoskeleton
        Biologists: 1. Hugh Huxley 2. James Spudich 3. Ronald Vale 4. Manuel Morales 5. Brunó Ferenc Straub
  • 4. Entity Relatedness: Determine the degree of relatedness between two entities, e.g. Brad Pitt ? Tom Cruise
  • 5. Background: Entity
      Entity types: person, location, organization; time, date, money, percent; event, movie, disease, symptom, side effect, law, license and more
      • Many such types are covered in Wikipedia
      • More than 2K classes in DBpedia
      • More than 350k classes in Yago
      • Every Wikipedia article is considered to be about an entity
  • 6. Background: Relatedness
      [Figure: diagram over the terms Motor vehicle, Car, Motorcycle, Automobile, Auto, Car seat and Car window, with edges labelled s, h and m; groupings labelled Synonyms, Similar and Related, along a Substitutability axis]
  • 7. Outline
      • Motivation
      • Entity Relatedness
          • Distributional Semantics for Entity Relatedness (DiSER)
          • Evaluation
      • Entity Recommendation
          • Wikipedia-based Features for Entity Recommendation (WiFER)
          • Evaluation
      • Text Relatedness
          • Non-Orthogonal Explicit Semantic Analysis (NESA)
          • Evaluation
      • Application and Industry Use Cases
      • Conclusion
  • 8. Thesis Overview
      Chapter IV: Distributional Semantics for Entity Relatedness (DiSER)
      Chapter V: Wikipedia Features for Entity Recommendation (WiFER)
      Chapter VI: Non-Orthogonal Explicit Semantic Analysis (NESA)
      Diagram labels: Feature Extraction, Distributional Representation
  • 9. Thesis Overview (Chapter IV)
      Wikipedia Features for Entity Recommendation (WiFER); Distributional Semantics for Entity Relatedness (DiSER); Non-Orthogonal Explicit Semantic Analysis (NESA)
      Diagram labels: Feature Extraction, Distributional Representation
      Navigation bar: Motivation | Entity Relatedness | Entity Recommendation | Text Relatedness | Application | Conclusion
  • 10. Entity Relatedness
  • 11. Entity Relatedness: State of the Art
      • Graph-based methods
          • Path distance in Wikipedia graph (Strube and Ponzetto, 2006)
          • Normalized Google Distance on Wikipedia graph (Witten and Milne, 2008)
          • Personalized PageRank on Wikipedia graph (Agirre et al., 2015)
          • Path-based measures on DBpedia graph (Hulpus et al., 2015)
      • Corpus-based methods
          • Key-phrase Overlap for Related Entities (KORE): partial overlaps between key-phrases in corresponding Wikipedia articles (Hoffart et al., 2012)
          • Text relatedness measures: use collocation information in text
  • 12. Entity Relatedness: State of the Art (Distributional Semantics)
      Explicit Semantic Analysis (ESA): uses explicit (manually defined) concepts like Wikipedia articles, where every article is considered to describe a single concept (Gabrilovich and Markovitch, 2007)
      [Matrix: rows word1 … wordm, columns doc1 … docn, entries W11 … Wmn]
  • 13. Entity Relatedness: State of the Art (Distributional Semantics)
      Implicit/Latent Semantic Analysis: transforms the sparse document space into a dense latent topic space
      • Latent Semantic Analysis (LSA) (Deerwester et al., 1990)
      • Latent Dirichlet Allocation (LDA) (Blei et al., 2003)
      • Neural Embeddings (Word2Vec) (Mikolov et al., 2013)
      Dimensionality reduction: word-by-document matrix (rows word1 … wordm, columns doc1 … docn, n ~ 1M) reduced to a word-by-topic matrix (rows word1 … wordm, columns topic1 … topick, k < 1000)
  • 14. Limitations of Text Relatedness Measures
      • Compositionality
          • Most entities are multiword expressions
          • Vector(Brad Pitt) = Vector(Brad) + Vector(Pitt)?
      • Ambiguity
          • Vector of an entity with an ambiguous name like “Nice” (French city)
  • 15. Chapter IV: Distributional Semantics for Entity Relatedness (DiSER)
      Wikipedia-based Distributional Semantics for Entity Relatedness. In: AAAI-FSS-2014
      [Matrix: rows entity1 … entityn, columns doc1 … docn, entries W11 … Wnn]
      Wikipedia entities: [Steve Jobs] co-founded Apple in 1976 to sell Wozniak’s [Apple I] [Personal Computer]. [Steve Jobs | Jobs] was CEO of [Apple Inc. | Apple] and largest shareholder of [Pixar]. Jobs is widely recognized as a pioneer of the [Microcomputer Revolution], along with [Steve Wozniak | Wozniak].
      Annotated Wikipedia with entities (one sense per document): [Steve Jobs] co-founded [Apple Inc. | Apple] in 1976 to sell [Steve Wozniak | Wozniak]’s [Apple I] [Personal Computer]. [Steve Jobs | Jobs] was CEO of [Apple Inc. | Apple] and largest shareholder of [Pixar]. [Steve Jobs | Jobs] is widely recognized as a pioneer of the [Microcomputer Revolution], along with [Steve Wozniak | Wozniak].
  • 16. Chapter IV: ESA vs DiSER Vector
      Wikipedia-based Distributional Semantics for Entity Relatedness. In: AAAI-FSS-2014
      Brad Pitt (DiSER): The Tree of Life (film); Falmouth, Cornwall; World War Z (film); What Just Happened; A Mighty Heart (film); Plan B Entertainment; Jamaican Patois; Richard: A Novel; Sobriquet; I Want a Famous Face
      Brad Pitt (ESA): Damiani (jewelry company); University of Pittsburgh Band; Brad Pitt; Make It Right Foundation; Pittsburgh men’s basketball; Brangelina; Pittsburgh Panthers baseball; Pitt (Comics); Pitt River; Brad Pitt filmography
  • 17. Entity Relatedness: Evaluation
  • 18. Entity Relatedness: Dataset
      • Absolute relatedness score
          • Relatedness between “Apple Inc.” and “Steve Jobs”
          • Very low inter-annotator agreement
      • Relative relatedness score
          • Is “Steve Jobs” more related to “Apple Inc.” than “Bill Gates”?
          • High inter-annotator agreement
      • KORE (Hoffart et al., 2012)
          • 21 seed entities
          • Every entity has a list of 20 entities with their relatedness scores
          • 420 entity pairs in total
  • 19. Results: KORE Dataset
      Wikipedia-based Distributional Semantics for Entity Relatedness. In: AAAI-FSS-2014
      Approaches                                  Spearman Rank Correlation
      Graph-based measures
        Path-DBpedia (Hulpus et al., 2015)        0.610
        WLM (Witten and Milne, 2008)              0.659
        PPR (Agirre et al., 2015)                 0.662
      Corpus-based measures
        Word2Vec (Mikolov et al., 2013)           0.181
        GloVe (Pennington et al., 2014)           0.194
        LSA (Landauer et al., 1998)               0.375
        KORE (Hoffart et al., 2012)               0.679
        ESA (Gabrilovich and Markovitch, 2007)    0.691
        DiSER                                     0.781
  • 20. DiSER Vector for non-Wikipedia Entities
  • 21. Context-DiSER
      Article about Savita (BBC): http://www.bbc.com/news/world-europe-22204377
      Noun phrase extraction: StanfordNLP
      Entity linking: Prior probability
  • 22. Context-DiSER: Savita Halappanavar
      Abortion; Abortion-rights movement; The Irish Times; United States pro-life movement; Vincent Browne; Michael D. Higgins
      Irish abortion law; Death of Savita; Galway University Hospital; Miscarriage; Catholic Country; …….
  • 23. Context-DiSER: Results on KORE Dataset
      Wikipedia-based Distributional Semantics for Entity Relatedness. In: AAAI-FSS-2014
      Approaches                            Spearman Rank Correlation
      KORE (state of the art)               0.679
      Context-ESA                           0.684
      Context-DiSER (Manual linking)        0.769
      Context-DiSER (Automatic linking)     0.719
  • 24. Thesis Overview (Chapter IV, Chapter V)
      Wikipedia Features for Entity Recommendation (WiFER); Distributional Semantics for Entity Relatedness (DiSER); Non-Orthogonal Explicit Semantic Analysis (NESA)
      Diagram labels: Feature Extraction, Distributional Representation
  • 25. Entity Recommendation
  • 26. Entity Recommendation: State of the Art
      • Classical Recommendation Systems
          • Focus on personalized recommendation
          • Require user-item preferences
      • Entity Recommendation in Web Search (Blanco et al., 2013)
          • Co-occurrence features: query logs, query sessions, Flickr tags, tweets
          • Graph-based features: shared connections in the Yahoo knowledge graph and other domain-specific knowledge bases
          • Entity and relation types in the knowledge graph
          • More than 100 features
          • Combines features using learning to rank
  • 27. Wikipedia-based Features for Entity Recommendation (WiFER)
      Leveraging Wikipedia Knowledge for Entity Recommendations. In: ISWC 2015
      Features: Prior Probability of Entity1; Prior Probability of Entity2; Joint Probability; Conditional Probability; Reverse Conditional Probability; Cosine Similarity; Pointwise Mutual Information; Distributional Semantic Model
      Features are combined with Learning to Rank
  • 28. Wikipedia-based Features for Entity Recommendation (WiFER)
      Wikipedia Text: Prior Probability of Entity1; Prior Probability of Entity2; Joint Probability; Conditional Probability; Reverse Conditional Probability; Cosine Similarity; Pointwise Mutual Information; Distributional Semantic Model (ESA)
      Wikipedia Entities: Prior Probability of Entity1; Prior Probability of Entity2; Joint Probability; Conditional Probability; Reverse Conditional Probability; Cosine Similarity; Pointwise Mutual Information; Distributional Semantic Model (DiSER)
  • 29. Combining Features
      • Learning to Rank
          • Gradient Boosted Decision Trees (GBDT) (Li Hang, 2011)
          • Builds the model in a stage-wise fashion
      • Dataset: Entity recommendation in web search
          • 4,797 web search queries (entities)
          • Every entity query has a list of entity candidates (47,623 entity pairs)
          • All candidates are labelled on a 5-level scale: Excellent, Prefer, Good, Fair, and Bad
      Type         Total instances    Percentage
      Location     22,062             46.32
      People       21,626             45.41
      Movies       3,031              6.36
      TV Shows     280                0.58
      Album        563                1.18
      Total        47,623             100
  • 30. Entity Recommendation: Results
      Insights into Entity Recommendation in Web Search. In: IESD at ISWC, 2015
      • Evaluation
          • Normalized discounted cumulative gain (NDCG@10)
          • 10-fold cross validation
      Features                        All       Person    Location
      Spark (Blanco et al., 2013)     0.9276    0.9479    0.8882
      WiFER                           0.9173    0.9431    0.8795
      Spark+WiFER                     0.9325    0.9505    0.8987
  • 31. Entity Recommendation: Feature Analysis in Spark+WiFER
      Insights into Entity Recommendation in Web Search. In: IESD at ISWC, 2015
      Features shown:
      • Relation type
      • Cosine similarity over Flickr tags
      • Probability of target entity over Wikipedia text corpus
      • CF7 over Flickr tags
      • DSM over Wikipedia entities corpus (DiSER)
      • Conditional user probability over query terms
      • DSM over Wikipedia text corpus (ESA)
      • Probability of source entity over Wikipedia entities corpus
      • Probability of target entity over Flickr tags
      • Probability of target entity over Wikipedia entities corpus
  • 32. Thesis Overview (Chapter IV, Chapter V, Chapter VI)
      Wikipedia Features for Entity Recommendation (WiFER); Distributional Semantics for Entity Relatedness (DiSER); Non-Orthogonal Explicit Semantic Analysis (NESA)
      Diagram labels: Feature Extraction, Distributional Representation
  • 33. Text Relatedness: Non-Orthogonal Explicit Semantic Analysis (NESA)
  • 34. Chapter VI: Orthogonality in ESA
      Improving ESA with Document Similarity. In: ECIR-2013
      ESA assumes that related words share highly weighted concepts in their distributional vectors
      “soccer”: History of Soccer in the United States; Soccer in the United States; United States Soccer Federation; North American Soccer League; United Soccer Leagues
      “football”: FIFA; Football; History of association football; Football in England; Association football
      ESA(football, soccer) = 0.0
  • 35. Chapter VI: Non-Orthogonal Explicit Semantic Analysis (NESA)
      Improving ESA with Document Similarity. In: ECIR-2013
      “soccer”: History of Soccer in the United States; Soccer in the United States; United States Soccer Federation; North American Soccer League; United Soccer Leagues
      “football”: FIFA; Football; History of association football; Football in England; Association football
      NESA(football, soccer) = (FIFA × Soccer in the United States + FIFA × United Soccer Leagues + …) = 0.38
  • 36. Non-Orthogonal Explicit Semantic Analysis (NESA)
      • ESA: $v_1$ and $v_2$ are the $n$-dimensional vectors for words $w_1$ and $w_2$
          • $\mathrm{rel}_{ESA}(w_1, w_2) = v_1^{T} \cdot v_2$
      • NESA: correlation between vector dimensions
          • $\mathrm{rel}_{NESA}(w_1, w_2) = v_1^{T} \cdot C \cdot v_2$
          • $C_{n \times n} = E^{T} \cdot E$
      • Dimension correlation methods
          • DiSER scores between corresponding Wikipedia articles
  • 37. Word Relatedness Datasets
      • WN353: 353 word pairs annotated by 13-15 experts on a scale of 1-10
      • RG65: 65 word pairs annotated by 51 experts on a scale of 0-4
      • MC30: 30 word pairs annotated by 38 experts on a scale of 0-1
      • MT287: 287 word pairs annotated by 10-12 experts on a scale of 0-1
  • 38. Chapter VI: NESA: Results
      Non-Orthogonal Explicit Semantic Analysis. In: *SEM-2015
      Spearman rank correlation with word similarity gold standard datasets
                    WN353    MC30     RG65     MT287
      LSA           0.579    0.667    0.616    0.555
      LSA (Wiki)    0.538    0.744    0.697    0.353
      Word2Vec      0.663    0.824    0.751    0.560
      ESA           0.66     0.765    0.826    0.507
      NESA          0.696    0.784    0.839    0.572
  • 39. Chapter VI: NESA: Results
      Non-Orthogonal Explicit Semantic Analysis. In: *SEM-2015
      • Word similarity vs relatedness (Agirre et al., 2009)
          • WN353Rel: 252 word pairs from WN353
          • WN353Sim: 203 word pairs from WN353
      Spearman rank correlation with word similarity vs relatedness datasets
                    WN353Rel    WN353Sim
      LSA           0.521       0.662
      LSA (Wiki)    0.506       0.559
      Word2Vec      0.601       0.741
      ESA           0.643       0.663
      NESA          0.663       0.719
  • 40. Outline (repeated from slide 7)
  • 41. Chapter VII: http://enrg.insight-centre.org/
  • 42. EnRG SPARQL Endpoint, National University of Ireland, Galway
  • 43. Industrial Use Cases
      • Medical entity linking for question-answering and relationship explanation in Knowledge Graph
      • Entity Recommendation in Web Search
      • Company name disambiguation for social profiling
  • 44. Conclusion
      • Entity Relatedness
          • Distributional Semantics for Entity Relatedness (DiSER)
          • Outperformed state-of-the-art entity relatedness measures
      • Entity Recommendation
          • Wikipedia-based Features for Entity Recommendation (WiFER)
          • Effective features for entity recommendation in web search
      • Text Relatedness
          • Non-Orthogonal Explicit Semantic Analysis (NESA)
          • Outperformed other existing word relatedness measures
      • Entity Relatedness Graph (EnRG)
          • Contains all Wikipedia entities and their pre-computed relatedness scores
          • Contains distributional vectors for all Wikipedia entities
  • 45. Future Research Directions
      • Relationship explanation for recommended entities
          • Best path in knowledge graph
          • Best natural language description
      • Knowledge discovery
          • Analogy querying over knowledge graph, e.g. Google to Motorola => Microsoft to ?
          • Example-based querying, e.g. Google to Motorola => ? to ?
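
The ESA and NESA scores defined on slide 36 can be illustrated with a minimal Python sketch. This is not the thesis implementation. The four concept dimensions, the vector weights and the correlation matrix C below are invented toy values, chosen only to reproduce the effect on slides 34 and 35, where "football" and "soccer" share no concept and so get an ESA score of 0.0. In the slides, C is derived from DiSER scores between the corresponding Wikipedia articles; here it is simply hard-coded.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rel_esa(v1, v2):
    """ESA relatedness: v1^T . v2 (concept dimensions treated as orthogonal)."""
    return dot(v1, v2)


def rel_nesa(v1, v2, C):
    """NESA relatedness: v1^T . C . v2, where C[i][j] is the correlation
    between concept dimensions i and j."""
    C_v2 = [dot(row, v2) for row in C]
    return dot(v1, C_v2)


# Hypothetical concept dimensions (toy example, not real ESA weights):
# 0: FIFA, 1: Football in England,
# 2: Soccer in the United States, 3: United Soccer Leagues
football = [0.8, 0.6, 0.0, 0.0]
soccer = [0.0, 0.0, 0.7, 0.5]

# Hypothetical symmetric concept-correlation matrix (assumed values).
C = [
    [1.0, 0.5, 0.3, 0.2],
    [0.5, 1.0, 0.1, 0.1],
    [0.3, 0.1, 1.0, 0.6],
    [0.2, 0.1, 0.6, 1.0],
]

# With no shared concepts the plain inner product vanishes,
# while the correlation matrix lets related concepts contribute.
print("ESA :", rel_esa(football, soccer))
print("NESA:", round(rel_nesa(football, soccer, C), 4))
```

If C is the identity matrix, `rel_nesa` reduces to `rel_esa`, which is one way to read the orthogonality assumption that NESA relaxes.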

Editor's Notes

  1. Text on side does not look good
  2. Don’t call them concepts; try to be specific, like technologies for SemWeb. Change the boxes to horizontal boxes, not vertical ones.
  3. Entity: describe definitions in different communities and the definition we will carry through our presentation. Relatedness: describe it with the notion of similarity vs relatedness by illustrating WordNet relations (taxonomic vs others); further describe a simple Venn diagram. We do not distinguish between entity and concept, like football player.
  4. Entity: describe definitions in different communities and the definition we will carry through our presentation. Relatedness: describe it with the notion of similarity vs relatedness by illustrating WordNet relations (taxonomic vs others); further describe a simple Venn diagram. Relatedness reflects the degree of associativity, connectivity. Relatedness score: University => Student, building. Similarity score => Student, building. Relatedness score: University => Student, bio lab. Similarity score => Student, bio lab.
  5. Change box to chapter names Entity Relatedness => DiSER
  6. Context-VSM and Context-ESA - Vector similarity between corresponding Wikipedia articles - ESA score between corresponding Wikipedia article
  7. Change to DiSER explanation
  8. Change to DiSER explanation
  9. Backup slide on Vector composition Lucene based “Brad Pitt”
  10. Add wiki markups to show one sense per document
  11. Highlight the relevant articles in both vectors
  12. Merge next slide with one
  13. Explain entity disambiguation for Context-DiSER
  14. Wikipedia text and entity tagged. One thing to notice: we only get the articles that contain the given entity as Wikipedia links, not only as a word. So it performs better than the text DSM.
  15. Describe GBDT
  16. Change table
  17. Change table
  18. Change x to pairwise sim symbol
  19. Change to equation Consistency in subscript and superscript
  20. MC30: It contains 30 pairs of nouns, and their relatedness scores are on a scale of 0-4. This dataset was prepared by Miller and Charles (1991). The scores were provided by 38 human experts. WN353: It contains 353 pairs of words annotated by 13-15 human experts on a scale of 0-10, where 10 stands for highly related and 0 stands for unrelated. It contains generic words as well as named entities. WN353Sim and WN353Rel: The WN353 dataset was refined by Agirre et al. (2009). It contained similar and related pairs of words. Two words are similar if they are connected through a taxonomic relation like synonym or hyponym. Two words are related if they are connected through relations like meronym or holonym. WN353Rel and WN353Sim contain 252 and 203 pairs of words respectively. RG65: It contains 65 non-technical word pairs. It was annotated by 51 human experts. MT771: It has 771 pairs of words and their relatedness scores. The words are very generic and come from all kinds of domains. MT287: It has 287 pairs of words and their relatedness scores, prepared using Amazon Mechanical Turk (MT).
  21. Backup slide on similarity vs relatedness
  22. Context-VSM and Context-ESA - Vector similarity between corresponding Wikipedia articles - ESA score between corresponding Wikipedia article
  23. Replace the screenshot with a better-quality one
  24. Change to a screenshot from a SPARQL editor with colour coding
  25. Remove the similarity and relatedness part and explain more on relationship explanation