MediaFinder: Collect, Enrich
and Visualize Media Memes
Shared by the Crowd
Raphaël Troncy
raphael.troncy@eurecom.fr / @rtroncy
Conferences and natural disaster
14/05/2013 Real-Time Analysis and Mining of Social Streams (RAMSS) - Rio de Janeiro - 2
Social Media: some definitions
• Media Item: a photo or a video that is shared on a social network
• Micropost: a text status message that can optionally accompany a media item
• Social Network: an online service that focuses on building and reflecting social relationships among people sharing interests or activities
• Media Sharing Platform: the emphasis is on sharing media, but the boundary with social networks is blurred, since users are encouraged to react to media content (like, comment, favorite, etc.)
Social networks and media items
• First-order support:
  • Posting requires the inclusion of a media item
  • Examples: Flickr, YouTube
• Second-order support:
  • Media items can be posted, but so can text-only messages
  • Example: Facebook
• Third-order support:
  • No direct support for media items; relies on third-party applications to host them
  • Example: Twitter before the introduction of native photo support
Media Server
• Composition of media item extractors (12 social networks)
• Relies on search APIs plus a fixed 30 s timeout window to provide results
• Falls back on screen scraping when necessary (Twitter ecosystem)
• Implemented as a NodeJS server
• Serializes results in a common schema (JSON)
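The fan-out described on this slide can be sketched as follows. This is a minimal illustration in Python, not the Media Server's NodeJS code: the extractor functions, the network names, and the result fields (`mediaUrl`, `micropost`) are hypothetical stand-ins, and only the fixed 30 s window comes from the slide.

```python
import asyncio

TIMEOUT_SECONDS = 30  # fixed timeout window mentioned on the slide


async def search_all(extractors, term, timeout=TIMEOUT_SECONDS):
    """Query every extractor concurrently and keep what finished in time."""
    tasks = {asyncio.create_task(fn(term)): name
             for name, fn in extractors.items()}
    done, pending = await asyncio.wait(list(tasks), timeout=timeout)
    for task in pending:
        task.cancel()  # networks that answer too late are dropped
    results = {}
    for task in done:
        if task.exception() is None:  # skip extractors that failed
            results[tasks[task]] = task.result()
    return results


# Hypothetical extractors standing in for real social network clients.
async def fake_fast(term):
    await asyncio.sleep(0.01)
    return [{"mediaUrl": f"http://example.org/{term}.jpg", "micropost": term}]


async def fake_slow(term):
    await asyncio.sleep(5)
    return []
```

Calling `asyncio.run(search_all({"fast": fake_fast, "slow": fake_slow}, "rio", timeout=0.2))` returns only the items from the extractor that answered within the window, keyed by network name, which is the shape a common JSON schema would then serialize.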
Deep link
Permalink
Clean text for NLP
processing
Aggregate view of ALL
social interactions
12 Social Networks
Media Finder (www2013)
Media Finder (zooming on media items)
Media Finder (timeline view)
Named Entities are Pivotal
• Standalone software
  GATE
  Stanford CoreNLP
  Temis
• Web APIs
  http://nerd.eurecom.fr/
What is NERD?
ontology (1), REST API (2), UI (3)
(1) http://nerd.eurecom.fr/ontology
(2) http://nerd.eurecom.fr/api/application.wadl
(3) http://nerd.eurecom.fr
The NERD ontology has been integrated into the NIF project, developed in the context of the EU FP7 project LOD2: Creating Knowledge out of Interlinked Data
NERD REST API
GET, POST, PUT, DELETE
/document
/user
/annotation/{extractor}
/extraction
/evaluation
...
JSON/RDF*
"entities": [{
  "entity": "Tim Berners-Lee",
  "type": "Person",
  "uri": "http://dbpedia.org/resource/Tim_berners_lee",
  "nerdType": "http://nerd.eurecom.fr/ontology#Person",
  "startChar": 30,
  "endChar": 45,
  "confidence": 1,
  "relevance": 0.5
}]
Rizzo G., Troncy R. (2012), NERD: A Framework for Unifying Named Entity Recognition and Disambiguation Web Extraction
Tools. In: European chapter of the Association for Computational Linguistics (EACL'12), Avignon, France.
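A client consuming a response of this shape might filter entities by NERD class and confidence. The sketch below assumes the `entities` array is wrapped in a top-level JSON object, which the slide does not show; the function name and the threshold parameter are illustrative and not part of the NERD API.

```python
import json

# Sample payload built from the slide's snippet; the enclosing object is assumed.
SAMPLE = """
{"entities": [{
  "entity": "Tim Berners-Lee",
  "type": "Person",
  "uri": "http://dbpedia.org/resource/Tim_berners_lee",
  "nerdType": "http://nerd.eurecom.fr/ontology#Person",
  "startChar": 30,
  "endChar": 45,
  "confidence": 1,
  "relevance": 0.5
}]}
"""

NERD_PERSON = "http://nerd.eurecom.fr/ontology#Person"


def entities_of_type(payload, nerd_type, min_confidence=0.0):
    """Return (surface form, URI) pairs for entities of one NERD class."""
    data = json.loads(payload)
    return [
        (e["entity"], e["uri"])
        for e in data.get("entities", [])
        if e.get("nerdType") == nerd_type
        and e.get("confidence", 0) >= min_confidence
    ]


def span_length(entity):
    """Length of the annotated span, from the character offsets."""
    return entity["endChar"] - entity["startChar"]
```

In the sample, the offsets 30 and 45 delimit a 15-character span, which matches the length of the surface form "Tim Berners-Lee".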
Media Finder Architecture
• Media item harvesting using the Media Server
  http://eventmedia.eurecom.fr/media-server/search/{combined}/{term}
  https://github.com/vuknje/media-server (@tomayac fork)
• Near-duplicate image detection
  DCT signature on images and video frames,
  Hamming distance between image pairs
• Clustering and disambiguation
  Named entity extraction using NERD
  Topic generation using LDA
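The near-duplicate step can be illustrated with a perceptual-hash style sketch: compute a 2D DCT of a small grayscale image, threshold the low-frequency coefficients into a bit signature, and compare signatures by Hamming distance. The slides do not give the signature size or the distance threshold, so the 8x8 input, the 4x4 coefficient block, and `max_distance` below are assumptions rather than Media Finder's actual parameters.

```python
import math


def dct_2d(block):
    """Naive orthonormal 2D DCT-II of a square matrix of gray levels."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


def dct_signature(gray, keep=4):
    """Bit signature from the low-frequency coefficients (DC term skipped)."""
    coeffs = dct_2d(gray)
    low = [coeffs[u][v] for u in range(keep) for v in range(keep)][1:]
    median = sorted(low)[len(low) // 2]
    bits = 0
    for c in low:
        bits = (bits << 1) | (1 if c > median else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two signatures."""
    return bin(a ^ b).count("1")


def is_near_duplicate(sig_a, sig_b, max_distance=3):
    # max_distance is an illustrative threshold, not a value from the slides.
    return hamming(sig_a, sig_b) <= max_distance


# Tiny synthetic 8x8 image standing in for a downscaled photo or video frame.
IMAGE = [[(x * 8 + y) * 4 % 256 for y in range(8)] for x in range(8)]
```

In practice the input would be a photo or video frame converted to grayscale and downscaled first; the signature then stays stable under small changes such as recompression, which is what makes pairwise Hamming distance a cheap duplicate test.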
Media Finder (named entities clustering)
Media Finder (zooming in a cluster)
Media Finder
• Live Topic Generation from Event Streams
  Meet us at WWW 2013 Demo Session
  http://www.youtube.com/watch?v=8iRiwz7cDYY
Tracking an event: Italian Election
• Repeated queries over a period of time
  We tracked and analyzed media posts tagged elezioni2013 from 2013-02-26 to 2013-03-03
  Cron job: every 30 minutes over the 6 days
  Data sliced into 24-hour slots
• Research question:
  Can we re-create the news headlines?
• Storyboarding:
  http://mediafinder.eurecom.fr/story/elezioni2013
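The slicing into daily slots can be sketched as below. The start date and the 30-minute polling interval come from the slide; the `timestamp` field name and the use of naive datetimes (no time zone handling) are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

START = datetime(2013, 2, 26)          # first tracked day
SLOT = timedelta(hours=24)             # slot width
POLL_INTERVAL = timedelta(minutes=30)  # cron period
TRACKED_DAYS = 6


def slot_index(timestamp, start=START, slot=SLOT):
    """Zero-based index of the 24-hour slot containing the timestamp."""
    return (timestamp - start) // slot


def slice_by_slot(microposts):
    """Group microposts (dicts with a 'timestamp' key) by slot index."""
    slots = defaultdict(list)
    for post in microposts:
        slots[slot_index(post["timestamp"])].append(post)
    return dict(slots)


def expected_runs(days=TRACKED_DAYS, interval=POLL_INTERVAL):
    """Number of cron runs over the tracked period."""
    return (days * SLOT) // interval
```

With these settings the tracked period spans slots 0 to 5, and a job firing every 30 minutes over 6 days runs 288 times.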
Tracking an event: Italian Election
• Dataset:
  ~16501 microposts containing (duplicate) media items
  ~21087 named entities extracted
• Clustering
  NER and LDA
  Generate a Bag of Entities (BOE), each entity disambiguated with a DBpedia URI
• Examples:
  Monti, Bersani, Italia, Berlusconi, Grillo, Stelle
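A Bag of Entities can be pictured as a count of how many microposts mention each disambiguated URI. The sketch below shows only that counting step, not the NER and LDA clustering itself; the input shape (a list of posts, each with an `entities` list carrying a `uri`) and the sample URIs are illustrative assumptions, not data from the study.

```python
from collections import Counter


def bag_of_entities(microposts):
    """Count, per DBpedia URI, the number of microposts that mention it."""
    bag = Counter()
    for post in microposts:
        # A set ensures each URI is counted at most once per micropost.
        bag.update({e["uri"] for e in post["entities"]})
    return bag


# Illustrative sample, not taken from the elezioni2013 dataset.
SAMPLE_POSTS = [
    {"entities": [{"uri": "http://dbpedia.org/resource/Mario_Monti"},
                  {"uri": "http://dbpedia.org/resource/Mario_Monti"}]},
    {"entities": [{"uri": "http://dbpedia.org/resource/Mario_Monti"},
                  {"uri": "http://dbpedia.org/resource/Italy"}]},
    {"entities": []},
]
```

Keying the bag on URIs rather than surface strings means that different spellings resolving to the same DBpedia resource land in the same bucket, which is the point of disambiguating before clustering.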
Tracking an event: Italian Election
• Tracking and Analyzing the 2013 Italian Election
  To appear at ESWC 2013 Demo Session
  http://www.youtube.com/watch?v=jIMdnwMoWnk
Take Home Message
• Media Server / Media Finder:
  Aggregating fresh social media items
  Making sense of media collections for video hyperlinking
• NERD platform for extracting key information
• Vision: adoption of semantic multimedia technologies will foster a European market for media fragment re-purposing and re-selling
• Sneak preview:
  Interact with a Kinect and discover enriched hypervideo
  http://www.youtube.com/watch?v=4mSC685AG7k
Credits
• Vuk Milicic … interaction designer
• Giuseppe Rizzo … NERD guru
• José Luis Redondo Garcia … triplification and clustering
• Thomas Steiner … Media Server original code
http://www.slideshare.net/troncy
