Live Topic Generation
from Event Streams
Vuk Milicic, José Luis Redondo Garcia,
Giuseppe Rizzo, Raphaël Troncy, Thomas Steiner
raphael.troncy@eurecom.fr / @rtroncy
Media Finder (www2013)
15/05/2013 22nd World Wide Web Conference (WWW) - Rio de Janeiro - 2
Media Finder (zooming on media items)
Media Finder (timeline view)
Media Finder (timeline view)
Media Server
• Composition of media item extractors (12 social networks)
• Relies on search APIs + a fixed 30 s timeout window to provide results
• Falls back on screen scraping when necessary (Twitter ecosystem)
• Implemented as a Node.js server
• Serializes results in a common schema (JSON)
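A minimal sketch of the "composition + fixed timeout" idea, written in Python for brevity (the Media Server itself is a Node.js server). The extractor signature, the function name `search_all` and the per-network result shape are assumptions made for illustration, not the project's actual API.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

# Hypothetical extractor signature: takes a search term, returns media items.
Extractor = Callable[[str], Awaitable[List[dict]]]


async def search_all(
    term: str,
    extractors: Dict[str, Extractor],
    timeout_s: float = 30.0,  # fixed window, as stated on the slide
) -> Dict[str, List[dict]]:
    """Query every extractor concurrently and keep what finished in time."""
    if not extractors:
        return {}
    tasks = {name: asyncio.create_task(fn(term)) for name, fn in extractors.items()}
    done, pending = await asyncio.wait(list(tasks.values()), timeout=timeout_s)

    # Extractors that did not answer within the window are cancelled.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    results: Dict[str, List[dict]] = {}
    for name, task in tasks.items():
        finished_ok = task in done and not task.cancelled() and task.exception() is None
        # A slow or failing network contributes an empty list instead of an error.
        results[name] = task.result() if finished_ok else []
    return results
```

One slow or failing network then never blocks the others: the caller always gets an answer once the window closes, possibly with empty lists for some networks.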
Deep link
Permalink
Clean text for NLP processing
Aggregate view of ALL social interactions
12 Social Networks
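The callouts above can be read as the fields of the common schema. The sketch below is one possible shape, assuming "deep link" means the URL of the media file itself and "permalink" the page where it was posted; all field names are hypothetical and are not taken from the Media Server code.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict


@dataclass
class MediaItem:
    """Hypothetical common record for one media item, whatever its source network."""
    media_url: str   # deep link: assumed to point at the image or video file itself
    permalink: str   # page on the social network where the item was posted
    clean_text: str  # caption with markup removed, ready for NLP processing
    network: str     # which of the 12 social networks the item came from
    interactions: Dict[str, int] = field(default_factory=dict)  # e.g. likes, shares

    def total_interactions(self) -> int:
        """Aggregate view of all social interactions for this item."""
        return sum(self.interactions.values())


def to_json(item: MediaItem) -> str:
    """Serialize an item to JSON with a stable key order."""
    return json.dumps(asdict(item), sort_keys=True)
```

Keeping one record type for every network is what lets the later steps (de-duplication, entity extraction, clustering) ignore where an item came from.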
Media Finder Architecture
• Media item harvesting using the Media Server
    http://eventmedia.eurecom.fr/media-server/search/{combined}/{term}
    https://github.com/vuknje/media-server (@tomayac fork)
• Near-duplicate image removal
    DCT signature on images and video frames,
    Hamming distance between image pairs
• Clustering and disambiguation
    Named Entity Extraction using NERD
    Topic Generation using LDA
    Density-based clustering using OPTICS
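The de-duplication step can be illustrated with a small sketch, assuming each image or video frame has already been reduced to a small square grayscale matrix. The signature length, the median threshold and the distance cut-off are illustrative choices, not the values used in Media Finder.

```python
import math
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def dct_2d(pixels: Matrix) -> List[List[float]]:
    """Unnormalized 2-D DCT-II of a square grayscale matrix."""
    n = len(pixels)
    cos = [[math.cos(math.pi * (2 * x + 1) * u / (2 * n)) for x in range(n)]
           for u in range(n)]
    return [[sum(pixels[x][y] * cos[u][x] * cos[v][y]
                 for x in range(n) for y in range(n))
             for v in range(n)]
            for u in range(n)]


def dct_signature(pixels: Matrix, keep: int = 4) -> List[int]:
    """Binary signature from the low-frequency DCT block (DC term dropped)."""
    coeffs = dct_2d(pixels)
    low = [coeffs[u][v] for u in range(keep) for v in range(keep)][1:]
    median = sorted(low)[len(low) // 2]
    # One bit per coefficient: above the median or not.
    return [1 if c > median else 0 for c in low]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length signatures differ."""
    return sum(x != y for x, y in zip(a, b))


def near_duplicate(a: Matrix, b: Matrix, max_distance: int = 3) -> bool:
    """Treat two images as near-duplicates when their signatures are close."""
    return hamming(dct_signature(a), dct_signature(b)) <= max_distance
```

Because the DC coefficient is dropped, a uniform brightness shift leaves the signature essentially unchanged, which is the kind of re-encoding noise a near-duplicate check is meant to tolerate.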
Named Entities are Pivotal
http://nerd.eurecom.fr/
REST API
Ontology
Dashboard UI
Media Finder (named entities clustering)
Media Finder (zooming in a cluster)
Summary
• Pick an event identified with a hashtag
• Use MediaServer to get media items
    aggregated over multiple social networks
• Use NERD to get entities
    aggregated over multiple extractors
• Cluster and identify meaningful topics (aka entities)
    with a meaningful label
    often disambiguated with a DBpedia URI giving access to more encyclopedic knowledge
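The last step of the summary can be sketched as follows. This is a deliberately simplified stand-in: it groups media items by shared entity instead of running LDA and OPTICS, and the `Entity` type, `topic_key` and `build_topics` are hypothetical names introduced for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Entity:
    label: str
    uri: Optional[str] = None  # e.g. a DBpedia URI when disambiguation succeeded


def topic_key(entity: Entity) -> str:
    """Prefer the URI as identity; fall back to the case-folded label."""
    return entity.uri or entity.label.lower()


def build_topics(
    items: Dict[str, List[Entity]],
) -> List[Tuple[str, Optional[str], List[str]]]:
    """Group item ids by entity and return (label, uri, item ids), largest first.

    `items` maps a media item id to the entities found in its text,
    already merged over all extractors.
    """
    members: Dict[str, Set[str]] = defaultdict(set)
    labels: Dict[str, str] = {}
    uris: Dict[str, str] = {}
    for item_id, entities in items.items():
        for entity in entities:
            key = topic_key(entity)
            members[key].add(item_id)
            labels.setdefault(key, entity.label)  # first label seen names the topic
            if entity.uri:
                uris[key] = entity.uri
    topics = [(labels[k], uris.get(k), sorted(ids)) for k, ids in members.items()]
    topics.sort(key=lambda t: (-len(t[2]), t[0]))
    return topics
```

Using the URI as the grouping key is what makes disambiguation pay off: two different surface forms that resolve to the same DBpedia resource end up in the same topic.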
Live Topic Generation from Event Streams
• Meet us at WWW 2013 Demo Session, Booth 14
http://www.youtube.com/watch?v=8iRiwz7cDYY
http://www.slideshare.net/troncy
