Learning Semantic Relationships between Entities in Twitter. ICWE, Cyprus, June 22, 2011. Ilknur Celik, Fabian Abel, Geert-Jan Houben. Web Information Systems, TU Delft.
What we do: Science and Engineering for the Personal Web. Domains: news, social media, cultural heritage, public data, e-learning. Personalized Recommendations; Personalized Search; Adaptive Systems. Analysis and User Modeling. Semantic Enrichment, Linkage and Alignment. User/usage data. Social Web.
60,000,000: number of tweets published per day
1: number of tweets per day that are interesting for me
Searching on Twitter
Issues with Multiple Keywords Search
Let’s try to search with One Keyword
Page 1
Page 2
Page 3
Page 60!! The tweet I was looking for: "Next Saturday @thatsimpsonguy aka Guilty Simpson will be performing at Area51 in my hometwon Eindhoven. #realliveshit #iwillspinrecords" (about 9 hours ago via Blackberry). Highlighted entity types: Music Artist, Locations.
Is there an easier way?Faceted Search can help Current Query: Expand Query: Results: Yskiddd: Next saturday@thatsimpsonguy aka Guilty Simpson will be performing at Area51 in my homeytown Eindhoven. #realliveshit#iwillspinrecords2 Usee123: Cool #EV3door7980 !!! http://bit.ly/igyyRhL  sanmiquelmusic: This Saturday I'm joining @KrusadersMusic to Intents Eindhoven Music Locations more... Events more... Music Artists: +  Guilty Simpson +  Bryan Adams +  Elton John +  Golden Earring +  Rihanna +  The eagles +  3 Doors Down more...
Location: Eindhoven  Music Artist: Guilty Simpson Location: Area51 Semantic relationships between entities are essential to realize such applications.
Relation Discovery Framework. Input: microblog posts and news articles. Processing: entity extraction and semantic enrichment, followed by relation discovery. Configuration: source selection, weighting scheme, relation type, temporal constraints. Output: typed relations between entities such as Person A, Group A, Location A, Location B and Event A (example relation types: isLocatedIn, involvedIn). Applications include query suggestions and schema enrichment.
Relation Learning Strategies. Relation: relation(e1, e2, type, tstart, tend, weight), where e1 and e2 are the entities, type is the type/label of the relation, (tstart, tend) is the time period, and weight is the relatedness. Relation learning strategy: Input: entities e1 and e2, time period (tstart, tend). Challenge: infer weight and type of the relation for the given entities and time period. Weighting according to co-occurrence frequency: Tweet-based: count co-occurrence in tweets. News-based: count co-occurrence in news. Tweet-News-based: count co-occurrence in both tweets and news.
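The three weighting strategies on this slide can be illustrated with a minimal Python sketch. It assumes each post has already been reduced to a set of extracted entities; the `Post` structure, the function name `cooccurrence_weights`, and the sample posts (including their dates) are hypothetical and only serve the illustration. The slides do not say whether the raw co-occurrence count is normalized, so the sketch returns plain counts.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from itertools import combinations


@dataclass(frozen=True)
class Post:
    """A tweet or news article after entity extraction (hypothetical structure)."""
    source: str          # "tweet" or "news"
    published: date
    entities: frozenset  # identifiers of the entities mentioned in the post


def cooccurrence_weights(posts, t_start, t_end, sources=("tweet", "news")):
    """Count, per unordered entity pair, the posts within [t_start, t_end]
    from the selected sources that mention both entities."""
    counts = Counter()
    for post in posts:
        if post.source not in sources:
            continue
        if not (t_start <= post.published <= t_end):
            continue
        # Sorting makes (e1, e2) a canonical key for the unordered pair.
        for e1, e2 in combinations(sorted(post.entities), 2):
            counts[(e1, e2)] += 1
    return counts


# Made-up sample data, not taken from the dataset described in the slides.
posts = [
    Post("tweet", date(2010, 12, 7), frozenset({"Julian Assange", "London"})),
    Post("tweet", date(2010, 12, 8), frozenset({"Julian Assange", "London", "WikiLeaks"})),
    Post("news", date(2010, 12, 7), frozenset({"Julian Assange", "WikiLeaks"})),
]
start, end = date(2010, 11, 15), date(2011, 1, 15)

tweet_based = cooccurrence_weights(posts, start, end, sources=("tweet",))
news_based = cooccurrence_weights(posts, start, end, sources=("news",))
combined = cooccurrence_weights(posts, start, end)  # Tweet-News-based
```

Restricting `sources` switches between the tweet-based, news-based and combined strategies, and the time period acts as the temporal constraint of the relation.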
Research Questions: 1. Which strategy performs best in detecting relationships between entities? 2. Does the accuracy depend on the type of entities which are involved in a relation? 3. How do the strategies perform for discovering relationships which have temporal constraints (trending relationships)?
Dataset, more than: 20,000 Twitter users; 2 months (timeline: Nov 15, Dec 15, Jan 15); 10,000,000 tweets; 75,000 news articles. Sample headline on the timeline: "WikiLeaks founder, Julian Assange, under arrest in London".
Dataset Characteristics
Tweets and news articles per day: 50,000-400,000 tweets per day; 100-1,000 news articles per day.
Entities referenced per day: 10,000-100,000 entity references in tweets per day; 5,000-20,000 entity references in news per day. About 40% of the tweets do not mention any (recognizable) entity. 72.6% of the top 1000 mentioned entities in Twitter are also mentioned in the mainstream news media. 99.3% of the news articles mention at least one (recognizable) entity.
Number of Distinct Entities per Entity Type: 39 types of entities.
Performance of Relation Learning Strategies
Our Ground Truth of true relations. Based on DBpedia: we mapped entities to their corresponding DBpedia resources (no appropriate DBpedia URIs for more than 35% of the entities) and analyzed whether there is a direct relation between two entities. Based on user study: participants judged whether two entities are really (a) related (62.6% were rated as related) and (b) related in the given time period (57.3% were rated as related). Overall: 2588 judgments. Thank you!
1. Which strategy performs best in detecting relationships between entities?
Accuracy of relation discovery (charts: based on user study; based on DBpedia). Combining both tweet-based and news-based strategies allows for highest accuracy.
F-Measure@k: the combined strategy (and the news-based strategy) increase in performance as k grows. The tweet-based strategy saturates quickly.
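For readers unfamiliar with the metric, here is a minimal sketch of how F-Measure@k could be computed for a ranked list of discovered relations. The slides do not define the metric, so this assumes the common reading: the harmonic mean of precision and recall over the top k results, with precision divided by k. The function names and the sample data are hypothetical.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are in the ground truth."""
    if k <= 0:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def recall_at_k(ranked, relevant, k):
    """Fraction of the ground-truth items that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def f_measure_at_k(ranked, relevant, k):
    """Harmonic mean of precision@k and recall@k (0.0 if both are zero)."""
    p = precision_at_k(ranked, relevant, k)
    r = recall_at_k(ranked, relevant, k)
    if p + r == 0:
        return 0.0
    return 2 * p * r / (p + r)


# Made-up example: relations ranked by weight, and a ground-truth set.
ranked_relations = ["a", "b", "c", "d"]
true_relations = {"a", "c", "e"}

f_at_2 = f_measure_at_k(ranked_relations, true_relations, 2)
f_at_4 = f_measure_at_k(ranked_relations, true_relations, 4)
```

Under this definition a strategy "saturates" when increasing k adds few new true relations, so recall stops growing while precision drops.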
2. Does the accuracy depend on the type of entities which are involved in a relation?
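Does the accuracy depend on the type of entities? Relationships which involve events can be discovered with high precision: 92% (P@10) and 87% (P@20) for person/group-event relationships, compared with 23% (P@10) and 26% (P@20) for relations between people and products.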
Does the accuracy depend on the type of entities? (cont.) Relationships between events can be detected with highest precision. Relationships between persons/groups are difficult to detect.
3. How do the strategies perform for discovering relationships which have temporal constraints?
Relationships with temporal constraints: the tweet-based strategy performs better in discovering relationships that are valid only for a specific period in time.
Where do relationships emerge faster? The speed of the strategies is domain-dependent. Chart: time difference (in days) of the first occurrence of a relationship, with regions labeled "News is faster" and "Twitter is faster".
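The time difference plotted on this slide can be sketched as the gap between the first day a relationship is observed in each source. The function name `emergence_lead_days`, the sign convention (positive means Twitter is faster), and the sample dates are assumptions made for illustration.

```python
from datetime import date


def emergence_lead_days(tweet_dates, news_dates):
    """Days by which the first tweet co-occurrence precedes the first news
    co-occurrence of a relationship. Positive: Twitter is faster; negative:
    news is faster. Returns None if either source never shows the relationship."""
    if not tweet_dates or not news_dates:
        return None
    return (min(news_dates) - min(tweet_dates)).days


# Made-up observation dates for a single relationship.
tweet_dates = [date(2010, 12, 1), date(2010, 12, 5)]
news_dates = [date(2010, 12, 6)]

lead = emergence_lead_days(tweet_dates, news_dates)
```

Averaging this value over all relationships of one type (for example persons and movies) would give a per-domain figure of the kind the slide compares.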
Conclusions and Future Work. What we did: relation discovery framework based on Twitter. Findings: (1) A strategy that considers both tweets and (linked) news articles allows for highest accuracy. (2) Performance varies for different domains (e.g. event relationships can be detected with highest precision). (3) The tweet-based strategy allows for detecting relationships that have a restricted temporal validity with high precision (and fast). Ongoing work: Adaptive Faceted Search on Twitter, http://wis.ewi.tudelft.nl/tweetum/
Relation Discovery for Adaptive Faceted Search. 1. Analyze (temporal) relationships of entities of the “current query” to adapt facet ranking. 2. Analyze (temporal) relationships of entities that appear in the user profile to adapt facet ranking. Interface mockup: Current Query: Eindhoven, Music. Expand Query: Locations (more...); Events (more...); Music Artists: + Guilty Simpson, + Bryan Adams, + Elton John, + Golden Earring, + Rihanna, + The Eagles, + 3 Doors Down (more...). Results: Yskiddd: "Next saturday @thatsimpsonguy aka Guilty Simpson will be performing at Area51 in my homeytown Eindhoven. #realliveshit #iwillspinrecords"; Usee123: "Cool #EV3door7980 !!! http://bit.ly/igyyRhL"; sanmiquelmusic: "This Saturday I'm joining @KrusadersMusic to Intents".
Thank you! Ilknur Celik, Fabian Abel, Geert-Jan Houben. Twitter: @persweb. http://wis.ewi.tudelft.nl/tweetum/
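Step 1 on this slide (using relationships of the current query's entities to adapt facet ranking) might look roughly like the sketch below. The slides describe this as ongoing work and give no algorithm, so the scoring rule (sum of relation weights between a candidate facet value and the query entities), the function name `rank_facets`, and all weights are purely hypothetical.

```python
def rank_facets(candidates, query_entities, weights):
    """Order candidate facet values by their total relation weight to the
    entities of the current query (ties broken alphabetically).

    `weights` maps frozenset({entity_a, entity_b}) to a relation weight."""
    def score(candidate):
        return sum(weights.get(frozenset((candidate, q)), 0.0)
                   for q in query_entities)

    return sorted(candidates, key=lambda c: (-score(c), c))


# Made-up relation weights for illustration only.
weights = {
    frozenset(("Guilty Simpson", "Eindhoven")): 3.0,
    frozenset(("Guilty Simpson", "Area51")): 2.0,
    frozenset(("Rihanna", "Eindhoven")): 1.0,
}
query = ["Eindhoven", "Area51"]
artists = ["Rihanna", "Elton John", "Guilty Simpson"]

ranked_artists = rank_facets(artists, query, weights)
```

Step 2 could reuse the same function with the entities of the user profile in place of `query`.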

Slide 38: The Social Web. "Help me to tackle the information overload!" Who is this? What are his personal demands? How can we make him happy? "Recommend me news articles that now interest me!" "Help me to find interesting (social) media!" "Give me personalized support when I do my online training!" "Personalize my Web experience!" "Do not bother me with advertisements that are not interesting for me!"

Editor's Notes

  1. Motivation: information overload; personalised “better” search.
  2. Why do people search on Twitter rather than Google? Real-time info and opinion about almost anything.
  3. Example: HT’11 @Eindhoven, looking for some entertainment events... http://search.twitter.com/ http://search.twitter.com/advanced
  4. Space limitation + selecting keywords (abbreviations, shorthand notations, colloquial expressions).
  5. Highlight 60
  6. Very time consuming and overwhelming indeed!
  7. Entity extraction, semantic enrichment and relation discovery.
  8. Large dataset of more than 10 million tweets and 70,000 news articles.
  9. 100-1,000 news articles per day. Between 50,000 and 400,000 tweets per day. Two of the minima were caused by temporary unavailability of the Twitter monitoring service.
  10. Approximately 10,000-100,000 entity references per day for tweets. Approximately 5,000-20,000 entity references per day for news. About 40% of the tweets had no entities. 99.3% of the news articles had at least one entity. Overlap of entities: 72.6% of the top 1000 mentioned entities in Twitter are also mentioned in the news media.
  11. 39 different entity types. Persons, locations and organizations were mentioned most often, followed by movies, music albums, sport events and political events. We analyzed specific types of relations, such as relationships between persons and locations or organizations and events, in detail.
  12. Person/Group-Event relationships cover relations between persons and political events, persons and sport events, organizations and sport events, etc. It is interesting to see that the Tweet+News-based strategy discovers relationships between persons/groups and events with higher precision (0.92 and 0.87 regarding P@10 and P@20) than people's relations to products (0.23 and 0.26) or locations (0.73 and 0.6).
  13. Relations between entities that are of the same type: relationships between two events can also be discovered with high precision, followed by relations between locations...
  14. Twitter is more appropriate than the news media for inferring relationships which have temporal constraints. The tweet-based strategy improves precision (P@5) by 22.7%.
  15. Relationships between persons and movies or music albums emerge much faster (14.7 and 5.1 days respectively) in Twitter than in the traditional news media.
  16. Our framework extracts typed entities from enriched tweets/news and provides strategies for detecting semantic (trending) relationships between entities. We: (1) investigated the precision and recall of the relation detection strategies, (2) analyzed how the strategies perform for each type of relationship, and (3) evaluated the quality and speed for discovering trending relationships that possibly have a limited temporal validity. Research questions: Which strategy performs best in detecting relationships between entities? Does the accuracy depend on the type of entities which are involved in a relation? How do the strategies perform for discovering relationships which have temporal constraints, and how fast can the strategies detect (trending) relationships?
  17. Very time consuming and overwhelming indeed!