SlideShare a Scribd company logo
1 of 22
Download to read offline
Dynamic Collective Entity
Representations for Entity Ranking
David Graus, Manos Tsagkias, Wouter Weerkamp, Edgar Meij, Maarten de Rijke
2
3
4
Entity search?
Ò Index = Knowledge Base (= Wikipedia)
Ò Documents = Entities
Ò “Real world entities” have a single representation
(in KB)
5
Representation is not static
Ò Associations between words and entities change
over time
Ò “ferguson shooting” -> Ferguson, Missouri
Ò People talk about entities all the time
6
*****
7
Dynamic Collective Entity
Representations
Ò Use “collective intelligence” to mine entity
descriptions to enrich representation.
Ò Is like document expansion (add terms found
through explicit links)
Ò Is not query expansion (terms found through
predicted links)
8
Advantages
Ò Cheap: Change document in index, leverage tried &
tested retrieval algorithms
Ò Free “smoothing”: (e.g., tweets) may capture ‘newly
evolving’ word associations (Ferguson shooting) and
incorporate out-of-document terms
Ò “move relevant documents closer to queries” (= close
the gap between searcher vocabulary & docs in index)
9
Haven’t we seen this before?
Ò Anchors & queries in particular have been shown to
improve retrieval [1]
Ò Tweets have been shown to be similar to anchors [2]
Ò Social tags, same [3]
Ò But: in batch (i.e., add data, see if/how it improves
retrieval)
[1] T. Westerveld, W. Kraaij, and D. Hiemstra. Retrieving web pages using content, links, urls and anchors. TREC 2001
[2] G. Mishne and J. Lin. Twanchor text: A preliminary study of the value of tweets as anchor text. SIGIR ’12
[3] C.-J. Lee and W. B. Croft. Incorporating social anchors for ad hoc retrieval. OAIR ’13
10
Description sources
Description sources
KB
Wikipedia dump
(Aug ‘14)
57M descriptions for
4.8M entities.
Web anchors
Anchors from Google
WikiLinks corpus.
9.8M descriptions for
876,063 entities.
Tweets
Tweets w/ links to
Wikipedia pages
(2011-2014)
52,631 descriptions for
38,269 entities.
Queries
Queries from MSN query
logs that yield Wikipedia
clicks.
47,002 descriptions for
18,724 entities.
Social tags
Delicious tags for
Wiki pages, from the
SocialBM0311 corpus.
4.4M descriptions for
289,015 entities.
Dynamic sources
Static sources
11
Original entity representation
Tupac Shakur
Tupac Amaru Shakur
(Previously known
as Lesane Parish
Crooks)(too-pahk
shə-koor;[1] June
16, 1971 – Septem-
ber 13, 1996), also
known by his stage
names 2Pac and
(briefly) Makaveli,
was an American
rapper, author,
actor, and poet.[2]
As of 2007, Shakur
has sold over 75
million records
worldwide, making
him one of the
best-selling music
artists of all
time.[3] His double
disc albums All
Eyez on Me and his
Greatest Hits are
among the [...]
Original entity description
Entity description
12
Static description sources
KB Anchors
2Pac
Tupac
Makaveli
KB Linked entities
The Notorious B.I.G.
Black Panther Party
Muammar Gaddafi
KB Redirects
2pac Shakur
Thug Immortal
KB Categories
Murdered Rappers
Death Row Record Artists
American deists
Web Anchors
What job did Tupac have before
he was a rapper
Tupac
Tupac is arguably more
influential
Tupac Amaru Shakur
Tupac Shakur-style drive-by
shooting
Tupac Shakur
Tupac Shakur reciting Shake-
speare at art school
Description sources
KB
Wikipedia dump
(Aug ‘14)
57M descriptions for
4.8M entities.
Web anchors
Anchors from Google
WikiLinks corpus.
9.8M descriptions for
876,063 entities.
Tweets
Tweets w/ links to
Wikipedia pages
(2011-2014)
52,631 descriptions for
38,269 entities.
Queries
Queries from MSN query
logs that yield Wikipedia
clicks.
47,002 descriptions for
18,724 entities.
Social tags
Delicious tags for
Wiki pages, from th
SocialBM0311 corpu
4.4M descriptions fo
289,015 entities.
Dynamic sources
Static sources
Description sources
KB
Wikipedia dump
(Aug ‘14)
57M descriptions for
4.8M entities.
Web anchors
Anchors from Google
WikiLinks corpus.
9.8M descriptions for
876,063 entities.
Q
Q
l
c
4
1Static sources
KB Anchors
2Pac
Tupac
Makaveli
KB Linked entities
The Notorious B.I.G.
Black Panther Party
Muammar Gaddafi
KB Redirects
2pac Shakur
Thug Immortal
KB Categories
Murdered Rappers
Death Row Record Artists
American deists
Web Anchors
What job did Tupac have before
he was a rapper
Tupac
Tupac is arguably more
influential
Tupac Amaru Shakur
Tupac Shakur-style drive-by
shooting
Tupac Shakur
Tupac Shakur reciting Shake-
speare at art school
13
Dynamic description sources
Dynamic expansions
tupac and the law
hiphop/icons
dead rappers
people influenced by tupac
awesomeartist rapd
Happy Birthday
Tupac!!! 2Pac Gemini
RT: Las cenizas de Tupac, el
mejor rapero de la historia,-
fueron mezcladas con marihuana y
fumadas por miembros de Outlawz
Even more crazy that this was an-
nounced just one day before what
would have been Pac’s 40th birth-
day.
Tweets TagsQueries
tion sources
dump
iptions for
ies.
Web anchors
Anchors from Google
WikiLinks corpus.
9.8M descriptions for
876,063 entities.
Tweets
Tweets w/ links to
Wikipedia pages
(2011-2014)
52,631 descriptions for
38,269 entities.
Queries
Queries from MSN query
logs that yield Wikipedia
clicks.
47,002 descriptions for
18,724 entities.
Social tags
Delicious tags for
Wiki pages, from the
SocialBM0311 corpus.
4.4M descriptions for
289,015 entities.
Dynamic sources
Static sources
tion sources
dump
iptions for
ies.
Web anchors
Anchors from Google
WikiLinks corpus.
9.8M descriptions for
876,063 entities.
Tweets
Tweets w/ links to
Wikipedia pages
(2011-2014)
52,631 descriptions for
38,269 entities.
Queries
Queries from MSN query
logs that yield Wikipedia
clicks.
47,002 descriptions for
18,724 entities.
Social tags
Delicious tags for
Wiki pages, from the
SocialBM0311 corpus.
4.4M descriptions for
289,015 entities.
Dynamic sources
Static sources
tion sources
dump
iptions for
ies.
Web anchors
Anchors from Google
WikiLinks corpus.
9.8M descriptions for
876,063 entities.
Tweets
Tweets w/ links to
Wikipedia pages
(2011-2014)
52,631 descriptions for
38,269 entities.
Queries
Queries from MSN query
logs that yield Wikipedia
clicks.
47,002 descriptions for
18,724 entities.
Social tags
Delicious tags for
Wiki pages, from the
SocialBM0311 corpus.
4.4M descriptions for
289,015 entities.
Dynamic sources
Static sources
14
Challenge
Ò Heterogeneity
1. Description sources
2. Entities
Ò Dynamic nature
Ò Content changes over time
15
Adaptive ranking
Ò Supervised single-field weighting model
Ò Features:
Ò field similarity: retrieval score per field.
Ò field “importance”: length, novel terms, etc.
Ò entity “importance”: time since last update.
Ò Learn optimal field weights from clicks
Supervised single-field weighting model
Eeach field’s contribution towards the final score is
individually weighted, learned from clicks at set intervals.
16
Experimental setup
1. Data:
Ò MSN Query log (62,841 queries that yield entity clicks)
Ò For each query:
Ò Produce ranking
Ò Observe click
Ò Evaluate ranking (MAP/P@1)
Ò Expand entities (w/ descriptions from dynamic
sources)
Ò [re-train ranker]
17
Results
Ò Comparing effectiveness of diff. description
sources
Ò Comparing adaptive vs. non-adaptive ranker
performance
18
Description sources
0.60
0.50
0.51
0.52
0.53
0.54
0.55
0.56
0.57
0.58
0.59
0 5000 10000 15000 20000 25000 30000
19
Feature weights over time
20
Adaptive vs. non-adaptive ranking
0.60
0.50
0.51
0.52
0.53
0.54
0.55
0.56
0.57
0.58
0.59
0 5000 10000 15000 20000 25000 30000
21
In summary
Ò Expanding entity representations with different
sources enables better matching of queries to
entities
Ò As new content comes in, it is beneficial to retrain
the ranker
Ò Informing ranker of “expansion state” further
improves performance
22
Thank you

More Related Content

Similar to Dynamic Entity Ranking with Collective Representations

Dynamic Collective Entity Representations for Entity Ranking
Dynamic Collective Entity Representations for Entity RankingDynamic Collective Entity Representations for Entity Ranking
Dynamic Collective Entity Representations for Entity RankingDavid Graus
 
WTF is Semantic Web?
WTF is Semantic Web?WTF is Semantic Web?
WTF is Semantic Web?milesw
 
Information retrieval and extraction
Information retrieval and extractionInformation retrieval and extraction
Information retrieval and extractionAnkit Sharma
 
Interpretation, Context, and Metadata: Examples from Open Context
Interpretation, Context, and Metadata: Examples from Open ContextInterpretation, Context, and Metadata: Examples from Open Context
Interpretation, Context, and Metadata: Examples from Open ContextEric Kansa
 
Importing life science at a into Neo4j
Importing life science at a into Neo4jImporting life science at a into Neo4j
Importing life science at a into Neo4jSimon Jupp
 
Leveraging Wikipedia-based Features for Entity Relatedness and Recommendations
Leveraging Wikipedia-based Features for Entity Relatedness and RecommendationsLeveraging Wikipedia-based Features for Entity Relatedness and Recommendations
Leveraging Wikipedia-based Features for Entity Relatedness and RecommendationsNitish Aggarwal
 
Information_retrieval_and_extraction_IIIT
Information_retrieval_and_extraction_IIITInformation_retrieval_and_extraction_IIIT
Information_retrieval_and_extraction_IIITAnkit Sharma
 
Schema.org - An Extending Influence
Schema.org - An Extending InfluenceSchema.org - An Extending Influence
Schema.org - An Extending InfluenceRichard Wallis
 
Year of the Monkey: Lessons from the first year of SearchMonkey
Year of the Monkey: Lessons from the first year of SearchMonkeyYear of the Monkey: Lessons from the first year of SearchMonkey
Year of the Monkey: Lessons from the first year of SearchMonkeyPeter Mika
 
Semantic Web Austin Yahoo
Semantic Web Austin YahooSemantic Web Austin Yahoo
Semantic Web Austin YahooPeter Mika
 
On the Value of Temporal Anchor Texts in Wikipedia
On the Value of Temporal Anchor Texts in WikipediaOn the Value of Temporal Anchor Texts in Wikipedia
On the Value of Temporal Anchor Texts in WikipediaNattiya Kanhabua
 
Linked data for librarians
Linked data for librariansLinked data for librarians
Linked data for librarianstrevorthornton
 
Schema.org - Extending Benefits
Schema.org - Extending BenefitsSchema.org - Extending Benefits
Schema.org - Extending BenefitsRichard Wallis
 
Watson at RPI - Summer 2013
Watson at RPI - Summer 2013Watson at RPI - Summer 2013
Watson at RPI - Summer 2013James Hendler
 
Deploying Semantic Technologies for Digital Publishing: A Case Study from Log...
Deploying Semantic Technologies for Digital Publishing: A Case Study from Log...Deploying Semantic Technologies for Digital Publishing: A Case Study from Log...
Deploying Semantic Technologies for Digital Publishing: A Case Study from Log...sboisen
 
MW2014 Workshop - Intro to Linked Open Data
MW2014 Workshop - Intro to Linked Open DataMW2014 Workshop - Intro to Linked Open Data
MW2014 Workshop - Intro to Linked Open DataDavid Henry
 
FSU SLIS Wk 4 Info Services: Databases & Indexes
FSU SLIS Wk 4 Info Services: Databases & IndexesFSU SLIS Wk 4 Info Services: Databases & Indexes
FSU SLIS Wk 4 Info Services: Databases & IndexesLorri Mon
 
FedX - Optimization Techniques for Federated Query Processing on Linked Data
FedX - Optimization Techniques for Federated Query Processing on Linked DataFedX - Optimization Techniques for Federated Query Processing on Linked Data
FedX - Optimization Techniques for Federated Query Processing on Linked Dataaschwarte
 

Similar to Dynamic Entity Ranking with Collective Representations (20)

Dynamic Collective Entity Representations for Entity Ranking
Dynamic Collective Entity Representations for Entity RankingDynamic Collective Entity Representations for Entity Ranking
Dynamic Collective Entity Representations for Entity Ranking
 
WTF is Semantic Web?
WTF is Semantic Web?WTF is Semantic Web?
WTF is Semantic Web?
 
Information retrieval and extraction
Information retrieval and extractionInformation retrieval and extraction
Information retrieval and extraction
 
Interpretation, Context, and Metadata: Examples from Open Context
Interpretation, Context, and Metadata: Examples from Open ContextInterpretation, Context, and Metadata: Examples from Open Context
Interpretation, Context, and Metadata: Examples from Open Context
 
Importing life science at a into Neo4j
Importing life science at a into Neo4jImporting life science at a into Neo4j
Importing life science at a into Neo4j
 
Leveraging Wikipedia-based Features for Entity Relatedness and Recommendations
Leveraging Wikipedia-based Features for Entity Relatedness and RecommendationsLeveraging Wikipedia-based Features for Entity Relatedness and Recommendations
Leveraging Wikipedia-based Features for Entity Relatedness and Recommendations
 
Information_retrieval_and_extraction_IIIT
Information_retrieval_and_extraction_IIITInformation_retrieval_and_extraction_IIIT
Information_retrieval_and_extraction_IIIT
 
Schema.org - An Extending Influence
Schema.org - An Extending InfluenceSchema.org - An Extending Influence
Schema.org - An Extending Influence
 
Year of the Monkey: Lessons from the first year of SearchMonkey
Year of the Monkey: Lessons from the first year of SearchMonkeyYear of the Monkey: Lessons from the first year of SearchMonkey
Year of the Monkey: Lessons from the first year of SearchMonkey
 
Semantic Web Austin Yahoo
Semantic Web Austin YahooSemantic Web Austin Yahoo
Semantic Web Austin Yahoo
 
DBpedia InsideOut
DBpedia InsideOutDBpedia InsideOut
DBpedia InsideOut
 
On the Value of Temporal Anchor Texts in Wikipedia
On the Value of Temporal Anchor Texts in WikipediaOn the Value of Temporal Anchor Texts in Wikipedia
On the Value of Temporal Anchor Texts in Wikipedia
 
Linked data for librarians
Linked data for librariansLinked data for librarians
Linked data for librarians
 
Schema.org - Extending Benefits
Schema.org - Extending BenefitsSchema.org - Extending Benefits
Schema.org - Extending Benefits
 
Watson at RPI - Summer 2013
Watson at RPI - Summer 2013Watson at RPI - Summer 2013
Watson at RPI - Summer 2013
 
Resources, resources, resources: the three rs of the Web
Resources, resources, resources: the three rs of the WebResources, resources, resources: the three rs of the Web
Resources, resources, resources: the three rs of the Web
 
Deploying Semantic Technologies for Digital Publishing: A Case Study from Log...
Deploying Semantic Technologies for Digital Publishing: A Case Study from Log...Deploying Semantic Technologies for Digital Publishing: A Case Study from Log...
Deploying Semantic Technologies for Digital Publishing: A Case Study from Log...
 
MW2014 Workshop - Intro to Linked Open Data
MW2014 Workshop - Intro to Linked Open DataMW2014 Workshop - Intro to Linked Open Data
MW2014 Workshop - Intro to Linked Open Data
 
FSU SLIS Wk 4 Info Services: Databases & Indexes
FSU SLIS Wk 4 Info Services: Databases & IndexesFSU SLIS Wk 4 Info Services: Databases & Indexes
FSU SLIS Wk 4 Info Services: Databases & Indexes
 
FedX - Optimization Techniques for Federated Query Processing on Linked Data
FedX - Optimization Techniques for Federated Query Processing on Linked DataFedX - Optimization Techniques for Federated Query Processing on Linked Data
FedX - Optimization Techniques for Federated Query Processing on Linked Data
 

More from David Graus

Pragmatic ethical and fair AI for data scientists
Pragmatic ethical and fair AI for data scientistsPragmatic ethical and fair AI for data scientists
Pragmatic ethical and fair AI for data scientistsDavid Graus
 
Bias in Recommendations
Bias in RecommendationsBias in Recommendations
Bias in RecommendationsDavid Graus
 
RecSys in the Media Industry: Relevance, Recency, Popularity, and Diversity.
RecSys in the Media Industry: Relevance, Recency, Popularity, and Diversity.RecSys in the Media Industry: Relevance, Recency, Popularity, and Diversity.
RecSys in the Media Industry: Relevance, Recency, Popularity, and Diversity.David Graus
 
CAT/AI: Computer Assisted Translation 
Assessment for Impact
CAT/AI: Computer Assisted Translation 
Assessment for ImpactCAT/AI: Computer Assisted Translation 
Assessment for Impact
CAT/AI: Computer Assisted Translation 
Assessment for ImpactDavid Graus
 
Opening the Black Box of User Profiles in Content-based Recommender Systems
Opening the Black Box of User Profiles in Content-based Recommender SystemsOpening the Black Box of User Profiles in Content-based Recommender Systems
Opening the Black Box of User Profiles in Content-based Recommender SystemsDavid Graus
 
Zoeken, vinden, en aanbevelen: personalisatie vs. privacy
Zoeken, vinden, en aanbevelen: personalisatie vs. privacyZoeken, vinden, en aanbevelen: personalisatie vs. privacy
Zoeken, vinden, en aanbevelen: personalisatie vs. privacyDavid Graus
 
Layman's Talk: Entities of Interest --- Discovery in Digital Traces
Layman's Talk: Entities of Interest --- Discovery in Digital TracesLayman's Talk: Entities of Interest --- Discovery in Digital Traces
Layman's Talk: Entities of Interest --- Discovery in Digital TracesDavid Graus
 
Financial News Mining @ PyData Amsterdam
Financial News Mining @ PyData AmsterdamFinancial News Mining @ PyData Amsterdam
Financial News Mining @ PyData AmsterdamDavid Graus
 
De Macht van Data --- Hoe algoritmen ons leven vormgeven
De Macht van Data --- Hoe algoritmen ons leven vormgevenDe Macht van Data --- Hoe algoritmen ons leven vormgeven
De Macht van Data --- Hoe algoritmen ons leven vormgevenDavid Graus
 
Financial News Mining @ FD Mediagroep/Company.info
Financial News Mining @ FD Mediagroep/Company.infoFinancial News Mining @ FD Mediagroep/Company.info
Financial News Mining @ FD Mediagroep/Company.infoDavid Graus
 
Big Data & Machine Learning - Mogelijkheden & Valkuilen
Big Data & Machine Learning - Mogelijkheden & ValkuilenBig Data & Machine Learning - Mogelijkheden & Valkuilen
Big Data & Machine Learning - Mogelijkheden & ValkuilenDavid Graus
 
Understanding Email Traffic
Understanding Email TrafficUnderstanding Email Traffic
Understanding Email TrafficDavid Graus
 
David Graus - Entity Linking (at SEA), Search Engines Amsterdam, Fri June 27th
David Graus - Entity Linking (at SEA), Search Engines Amsterdam, Fri June 27thDavid Graus - Entity Linking (at SEA), Search Engines Amsterdam, Fri June 27th
David Graus - Entity Linking (at SEA), Search Engines Amsterdam, Fri June 27thDavid Graus
 
Understanding Email Traffic (talk @ E-Discovery NL Symposium)
Understanding Email Traffic (talk @ E-Discovery NL Symposium)Understanding Email Traffic (talk @ E-Discovery NL Symposium)
Understanding Email Traffic (talk @ E-Discovery NL Symposium)David Graus
 
yourHistory - entity linking for a personalized timeline of historic events
yourHistory - entity linking for a personalized timeline of historic eventsyourHistory - entity linking for a personalized timeline of historic events
yourHistory - entity linking for a personalized timeline of historic eventsDavid Graus
 
Semantic Search in E-Discovery
Semantic Search in E-DiscoverySemantic Search in E-Discovery
Semantic Search in E-DiscoveryDavid Graus
 
Semantic Annotation of the Cyttron Database
Semantic Annotation of the Cyttron DatabaseSemantic Annotation of the Cyttron Database
Semantic Annotation of the Cyttron DatabaseDavid Graus
 
Semantic annotation, clustering and visualization
Semantic annotation, clustering and visualizationSemantic annotation, clustering and visualization
Semantic annotation, clustering and visualizationDavid Graus
 

More from David Graus (18)

Pragmatic ethical and fair AI for data scientists
Pragmatic ethical and fair AI for data scientistsPragmatic ethical and fair AI for data scientists
Pragmatic ethical and fair AI for data scientists
 
Bias in Recommendations
Bias in RecommendationsBias in Recommendations
Bias in Recommendations
 
RecSys in the Media Industry: Relevance, Recency, Popularity, and Diversity.
RecSys in the Media Industry: Relevance, Recency, Popularity, and Diversity.RecSys in the Media Industry: Relevance, Recency, Popularity, and Diversity.
RecSys in the Media Industry: Relevance, Recency, Popularity, and Diversity.
 
CAT/AI: Computer Assisted Translation 
Assessment for Impact
CAT/AI: Computer Assisted Translation 
Assessment for ImpactCAT/AI: Computer Assisted Translation 
Assessment for Impact
CAT/AI: Computer Assisted Translation 
Assessment for Impact
 
Opening the Black Box of User Profiles in Content-based Recommender Systems
Opening the Black Box of User Profiles in Content-based Recommender SystemsOpening the Black Box of User Profiles in Content-based Recommender Systems
Opening the Black Box of User Profiles in Content-based Recommender Systems
 
Zoeken, vinden, en aanbevelen: personalisatie vs. privacy
Zoeken, vinden, en aanbevelen: personalisatie vs. privacyZoeken, vinden, en aanbevelen: personalisatie vs. privacy
Zoeken, vinden, en aanbevelen: personalisatie vs. privacy
 
Layman's Talk: Entities of Interest --- Discovery in Digital Traces
Layman's Talk: Entities of Interest --- Discovery in Digital TracesLayman's Talk: Entities of Interest --- Discovery in Digital Traces
Layman's Talk: Entities of Interest --- Discovery in Digital Traces
 
Financial News Mining @ PyData Amsterdam
Financial News Mining @ PyData AmsterdamFinancial News Mining @ PyData Amsterdam
Financial News Mining @ PyData Amsterdam
 
De Macht van Data --- Hoe algoritmen ons leven vormgeven
De Macht van Data --- Hoe algoritmen ons leven vormgevenDe Macht van Data --- Hoe algoritmen ons leven vormgeven
De Macht van Data --- Hoe algoritmen ons leven vormgeven
 
Financial News Mining @ FD Mediagroep/Company.info
Financial News Mining @ FD Mediagroep/Company.infoFinancial News Mining @ FD Mediagroep/Company.info
Financial News Mining @ FD Mediagroep/Company.info
 
Big Data & Machine Learning - Mogelijkheden & Valkuilen
Big Data & Machine Learning - Mogelijkheden & ValkuilenBig Data & Machine Learning - Mogelijkheden & Valkuilen
Big Data & Machine Learning - Mogelijkheden & Valkuilen
 
Understanding Email Traffic
Understanding Email TrafficUnderstanding Email Traffic
Understanding Email Traffic
 
David Graus - Entity Linking (at SEA), Search Engines Amsterdam, Fri June 27th
David Graus - Entity Linking (at SEA), Search Engines Amsterdam, Fri June 27thDavid Graus - Entity Linking (at SEA), Search Engines Amsterdam, Fri June 27th
David Graus - Entity Linking (at SEA), Search Engines Amsterdam, Fri June 27th
 
Understanding Email Traffic (talk @ E-Discovery NL Symposium)
Understanding Email Traffic (talk @ E-Discovery NL Symposium)Understanding Email Traffic (talk @ E-Discovery NL Symposium)
Understanding Email Traffic (talk @ E-Discovery NL Symposium)
 
yourHistory - entity linking for a personalized timeline of historic events
yourHistory - entity linking for a personalized timeline of historic eventsyourHistory - entity linking for a personalized timeline of historic events
yourHistory - entity linking for a personalized timeline of historic events
 
Semantic Search in E-Discovery
Semantic Search in E-DiscoverySemantic Search in E-Discovery
Semantic Search in E-Discovery
 
Semantic Annotation of the Cyttron Database
Semantic Annotation of the Cyttron DatabaseSemantic Annotation of the Cyttron Database
Semantic Annotation of the Cyttron Database
 
Semantic annotation, clustering and visualization
Semantic annotation, clustering and visualizationSemantic annotation, clustering and visualization
Semantic annotation, clustering and visualization
 

Recently uploaded

Pests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdfPests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdfPirithiRaju
 
User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationColumbia Weather Systems
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxNandakishor Bhaurao Deshmukh
 
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In DubaiDubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubaikojalkojal131
 
Topic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxTopic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxJorenAcuavera1
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...lizamodels9
 
Citronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayCitronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayupadhyaymani499
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxmalonesandreagweneth
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024AyushiRastogi48
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPirithiRaju
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxBerniceCayabyab1
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxpriyankatabhane
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naJASISJULIANOELYNV
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationColumbia Weather Systems
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxpriyankatabhane
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trssuser06f238
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...Universidade Federal de Sergipe - UFS
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfSELF-EXPLANATORY
 

Recently uploaded (20)

Pests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdfPests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdf
 
User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather Station
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
 
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In DubaiDubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
 
Topic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxTopic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptx
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
 
Volatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -IVolatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -I
 
Citronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayCitronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyay
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptx
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by na
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather Station
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
 
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort ServiceHot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 tr
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 

Dynamic Entity Ranking with Collective Representations

  • 1. Dynamic Collective Entity Representations for Entity Ranking David Graus, Manos Tsagkias, Wouter Weerkamp, Edgar Meij, Maarten de Rijke
  • 2. 2
  • 3. 3
  • 4. 4 Entity search? Ò Index = Knowledge Base (= Wikipedia) Ò Documents = Entities Ò “Real world entities” have a single representation (in KB)
  • 5. 5 Representation is not static Ò Associations between words and entities change over time Ò “ferguson shooting” -> Ferguson, Missouri Ò People talk about entities all the time
  • 7. 7 Dynamic Collective Entity Representations Ò Use “collective intelligence” to mine entity descriptions to enrich representation. Ò Is like document expansion (add terms found through explicit links) Ò Is not query expansion (terms found through predicted links)
  • 8. 8 Advantages Ò Cheap: Change document in index, leverage tried & tested retrieval algorithms Ò Free “smoothing”: (e.g., tweets) may capture ‘newly evolving’ word associations (Ferguson shooting) and incorporate out-of-document terms Ò “move relevant documents closer to queries” (= close the gap between searcher vocabulary & docs in index)
  • 9. 9 Haven’t we seen this before? Ò Anchors & queries in particular have been shown to improve retrieval [1] Ò Tweets have been shown to be similar to anchors [2] Ò Social tags, same [3] Ò But: in batch (i.e., add data, see if/how it improves retrieval) [1] T. Westerveld, W. Kraaij, and D. Hiemstra. Retrieving web pages using content, links, urls and anchors. TREC 2001 [2] G. Mishne and J. Lin. Twanchor text: A preliminary study of the value of tweets as anchor text. SIGIR ’12 [3] C.-J. Lee and W. B. Croft. Incorporating social anchors for ad hoc retrieval. OAIR ’13
  • 10. 10 Description sources Description sources KB Wikipedia dump (Aug ‘14) 57M descriptions for 4.8M entities. Web anchors Anchors from Google WikiLinks corpus. 9.8M descriptions for 876,063 entities. Tweets Tweets w/ links to Wikipedia pages (2011-2014) 52,631 descriptions for 38,269 entities. Queries Queries from MSN query logs that yield Wikipedia clicks. 47,002 descriptions for 18,724 entities. Social tags Delicious tags for Wiki pages, from the SocialBM0311 corpus. 4.4M descriptions for 289,015 entities. Dynamic sources Static sources
  • 11. 11 Original entity representation Tupac Shakur Tupac Amaru Shakur (Previously known as Lesane Parish Crooks)(too-pahk shə-koor;[1] June 16, 1971 – Septem- ber 13, 1996), also known by his stage names 2Pac and (briefly) Makaveli, was an American rapper, author, actor, and poet.[2] As of 2007, Shakur has sold over 75 million records worldwide, making him one of the best-selling music artists of all time.[3] His double disc albums All Eyez on Me and his Greatest Hits are among the [...] Original entity description Entity description
  • 12. 12 Static description sources KB Anchors 2Pac Tupac Makaveli KB Linked entities The Notorious B.I.G. Black Panther Party Muammar Gaddafi KB Redirects 2pac Shakur Thug Immortal KB Categories Murdered Rappers Death Row Record Artists American deists Web Anchors What job did Tupac have before he was a rapper Tupac Tupac is arguably more influential Tupac Amaru Shakur Tupac Shakur-style drive-by shooting Tupac Shakur Tupac Shakur reciting Shake- speare at art school Description sources KB Wikipedia dump (Aug ‘14) 57M descriptions for 4.8M entities. Web anchors Anchors from Google WikiLinks corpus. 9.8M descriptions for 876,063 entities. Tweets Tweets w/ links to Wikipedia pages (2011-2014) 52,631 descriptions for 38,269 entities. Queries Queries from MSN query logs that yield Wikipedia clicks. 47,002 descriptions for 18,724 entities. Social tags Delicious tags for Wiki pages, from th SocialBM0311 corpu 4.4M descriptions fo 289,015 entities. Dynamic sources Static sources Description sources KB Wikipedia dump (Aug ‘14) 57M descriptions for 4.8M entities. Web anchors Anchors from Google WikiLinks corpus. 9.8M descriptions for 876,063 entities. Q Q l c 4 1Static sources KB Anchors 2Pac Tupac Makaveli KB Linked entities The Notorious B.I.G. Black Panther Party Muammar Gaddafi KB Redirects 2pac Shakur Thug Immortal KB Categories Murdered Rappers Death Row Record Artists American deists Web Anchors What job did Tupac have before he was a rapper Tupac Tupac is arguably more influential Tupac Amaru Shakur Tupac Shakur-style drive-by shooting Tupac Shakur Tupac Shakur reciting Shake- speare at art school
  • 13. 13 Dynamic description sources Dynamic expansions tupac and the law hiphop/icons dead rappers people influenced by tupac awesomeartist rapd Happy Birthday Tupac!!! 2Pac Gemini RT: Las cenizas de Tupac, el mejor rapero de la historia,- fueron mezcladas con marihuana y fumadas por miembros de Outlawz Even more crazy that this was an- nounced just one day before what would have been Pac’s 40th birth- day. Tweets TagsQueries tion sources dump iptions for ies. Web anchors Anchors from Google WikiLinks corpus. 9.8M descriptions for 876,063 entities. Tweets Tweets w/ links to Wikipedia pages (2011-2014) 52,631 descriptions for 38,269 entities. Queries Queries from MSN query logs that yield Wikipedia clicks. 47,002 descriptions for 18,724 entities. Social tags Delicious tags for Wiki pages, from the SocialBM0311 corpus. 4.4M descriptions for 289,015 entities. Dynamic sources Static sources tion sources dump iptions for ies. Web anchors Anchors from Google WikiLinks corpus. 9.8M descriptions for 876,063 entities. Tweets Tweets w/ links to Wikipedia pages (2011-2014) 52,631 descriptions for 38,269 entities. Queries Queries from MSN query logs that yield Wikipedia clicks. 47,002 descriptions for 18,724 entities. Social tags Delicious tags for Wiki pages, from the SocialBM0311 corpus. 4.4M descriptions for 289,015 entities. Dynamic sources Static sources tion sources dump iptions for ies. Web anchors Anchors from Google WikiLinks corpus. 9.8M descriptions for 876,063 entities. Tweets Tweets w/ links to Wikipedia pages (2011-2014) 52,631 descriptions for 38,269 entities. Queries Queries from MSN query logs that yield Wikipedia clicks. 47,002 descriptions for 18,724 entities. Social tags Delicious tags for Wiki pages, from the SocialBM0311 corpus. 4.4M descriptions for 289,015 entities. Dynamic sources Static sources
  • 14. 14 Challenge Ò Heterogeneity 1. Description sources 2. Entities Ò Dynamic nature Ò Content changes over time
  • 15. 15 Adaptive ranking Ò Supervised single-field weighting model Ò Features: Ò field similarity: retrieval score per field. Ò field “importance”: length, novel terms, etc. Ò entity “importance”: time since last update. Ò Learn optimal field weights from clicks Supervised single-field weighting model Eeach field’s contribution towards the final score is individually weighted, learned from clicks at set intervals.
  • 16. 16 Experimental setup 1. Data: Ò MSN Query log (62,841 queries that yield entity clicks) Ò For each query: Ò Produce ranking Ò Observe click Ò Evaluate ranking (MAP/P@1) Ò Expand entities (w/ descriptions from dynamic sources) Ò [re-train ranker]
  • 17. 17 Results Ò Comparing effectiveness of diff. description sources Ò Comparing adaptive vs. non-adaptive ranker performance
  • 20. 20 Adaptive vs. non-adaptive ranking 0.60 0.50 0.51 0.52 0.53 0.54 0.55 0.56 0.57 0.58 0.59 0 5000 10000 15000 20000 25000 30000
  • 21. 21 In summary Ò Expanding entity representations with different sources enables better matching of queries to entities Ò As new content comes in, it is beneficial to retrain the ranker Ò Informing ranker of “expansion state” further improves performance