Machine-Crowd Annotation Workflow
for Event Understanding
across Collections & Domains
Oana Inel
Extended Semantic Web Conference
PhD Symposium
May 30th 2016
Too much information ...
e.g., if you are interested in the topic of “whaling”
… and after a while it all looks the same
it is difficult to form a global picture on a topic
… thus, content without context is difficult to process
events can help create context around content
… but events are not easy to deal with
• Events are vague
• Event semantics are difficult
• Events can be viewed and interpreted from multiple perspectives
e.g., a participant’s interpretation: The mayor of the city called the celebration a success.
• Events can be presented at different levels of granularity
e.g., spatial disagreement: The celebration took place in every city in the Netherlands.
• People are not consistent in the way they talk about or use events
e.g., The celebration took place last week, fireworks shows were held everywhere.
… a lot of ground truth is needed to learn event specifics
• Traditional ground truth collection doesn’t scale:
  • there is no single ‘type of expert’ when it comes to events
  • the annotation guidelines for events are difficult to define
  • the annotation of events can be a tedious process
  • all of the above can result in high inter-annotator disagreement
• Crowdsourcing could be an alternative
  • but it is not yet a robust & replicable approach
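Inter-annotator disagreement is usually quantified with a chance-corrected agreement statistic. As a minimal sketch, Cohen's kappa for two annotators marking tokens as event / non-event could be computed as below. The two annotators and their labels are invented for illustration and do not come from the talk.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under independence, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a.keys() | freq_b.keys()) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)

# Hypothetical labels: 1 = token is an event, 0 = it is not.
annotator_1 = [1, 0, 1, 1, 0, 0, 1, 0]
annotator_2 = [1, 0, 0, 1, 0, 1, 1, 0]
kappa = cohens_kappa(annotator_1, annotator_2)
```

With these made-up labels the annotators agree on 6 of 8 tokens, yet kappa is only 0.5 because half of that agreement is expected by chance.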
… let’s look at some examples
According to department policy prosecutors must make
a strong showing that lawyers' fees came from assets
tainted by illegal profits before any attempts at seizure
are made.
The unit makes intravenous pumps used by hospitals
and had more than $110 million in sales last year
according to Advanced Medical.
… here is what experts annotate on these sentences
[According] to department policy prosecutors must make
a strong [showing] that lawyers' fees [came] from assets
tainted by illegal profits before any [attempts] at [seizure]
are [made].
The unit makes intravenous pumps used by hospitals
and [had] more than $110 million in [sales] last year
according to Advanced Medical.
… here is what the crowd annotates on them
According to department policy prosecutors must make
a [strong [showing]] that lawyers' fees [[came] from
assets] [tainted] by illegal profits before any [attempts] at
[seizure] are [made].
The unit [makes] intravenous pumps [used] by hospitals
and [[had] more than $110 million in [sales]] last year
according to Advanced Medical.
… here is what the machines can detect
According to department policy prosecutors must [make]
a strong showing that lawyers' fees [came] from assets
[tainted] by illegal profits before any attempts at seizure
are made.
The unit [makes] intravenous pumps [used] by hospitals
and [had] more than $110 million in sales last year
according to Advanced Medical.
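The gap between the expert and machine annotations shown above can be made concrete with a small overlap computation. This is a sketch under two assumptions that are not in the talk: the expert annotations are treated as the reference set, and only exact single-token matches count. The nested crowd spans are left out because they would need a span-matching rule.

```python
def prf(predicted, gold):
    """Precision, recall and F1 of a predicted set against a reference set."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1

# (sentence number, token) pairs transcribed from the bracketed slides above.
expert = {(1, "According"), (1, "showing"), (1, "came"), (1, "attempts"),
          (1, "seizure"), (1, "made"), (2, "had"), (2, "sales")}
machine = {(1, "make"), (1, "came"), (1, "tainted"),
           (2, "makes"), (2, "used"), (2, "had")}

precision, recall, f1 = prf(machine, expert)
```

Only "came" and "had" are shared, so on these two sentences the machine output reaches a precision of 2/6 and a recall of 2/8 against the expert annotations.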
Research Questions
• Can crowdsourcing help in improving event detection?
• Can we provide reliable crowdsourced training data?
• Can we optimize the crowdsourcing process by using results from
NLP tools?
• Can we achieve a replicable data collection process across different
data types and use cases?
Current Hypothesis:
A disagreement-based approach to crowdsourcing ground truth
is reliable and produces quality results
Preliminary Results - Crowd vs. Experts
● 200 news snippets from TimeBank
● 3019 tweets published in 2014
● potentially relevant tweets for events such as ‘whaling’,
‘Davos 2014’ among others
The CrowdTruth approach outperforms state-of-the-art
crowdsourcing approaches such as single annotator and
majority vote
The crowd performs almost as well as the experts due to
very linguistically specialized guidelines for expert annotators
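The difference between majority vote and a disagreement-preserving aggregation can be sketched as follows. This is a deliberate simplification and not the CrowdTruth metrics themselves: each token simply gets the fraction of workers who selected it. The five workers and their selections are hypothetical, and the function names are chosen for this illustration only.

```python
from collections import Counter

def token_scores(worker_selections):
    """Fraction of workers who marked each token as an event."""
    n_workers = len(worker_selections)
    counts = Counter(tok for sel in worker_selections for tok in set(sel))
    return {tok: c / n_workers for tok, c in counts.items()}

def majority_vote(scores):
    """Keep only tokens chosen by more than half of the workers."""
    return {tok for tok, s in scores.items() if s > 0.5}

# Hypothetical selections by five workers for the second example sentence.
workers = [
    {"makes", "had", "sales"},
    {"had", "sales"},
    {"makes", "used", "had"},
    {"had"},
    {"sales", "used"},
]
scores = token_scores(workers)
kept = majority_vote(scores)
```

Majority vote keeps only "had" and "sales" and discards "makes" and "used", while the graded scores retain the fact that two of five workers considered them events.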
Current Hypothesis:
A disagreement-based approach to crowdsourcing ground truth
can be optimised by using results from NLP tools
Preliminary Results - Hybrid Workflow
• ENTITY EXTRACTION
• EVENTS CROWDSOURCING AND LINKING TO CONCEPTS
• SEGMENTATION & KEYFRAMES
• LINKING EVENTS AND CONCEPTS TO KEYFRAMES
diveplus.beeldengeluid.nl
Preliminary Results - Hybrid Workflow Outcome
diveplus.beeldengeluid.nl
Approach: Disagreement is Signal
Principles for disagreement-based
crowdsourcing
• Do not enforce agreement
• Capture a multitude of views
• Take advantage of existing
tools, reuse their functionality
This results in teaching machines to reason in
the disagreement space
Overall Methodology
1. Instantiate the research methodology with specific data and a specific domain
  • Video synopses, news
2. Identify state-of-the-art IE approaches that can be used
  • NER tools for identifying events and their participating entities in the video synopses
3. Evaluate IE approaches and identify their drawbacks
  • Poor performance in extracting events
4. Combine IE with crowdsourcing tasks in a complementary way
  • Use crowdsourcing for identifying the events and linking them with their participating entities
5. Evaluate crowdsourcing results with the CrowdTruth disagreement-first approach
  • Evaluate the input unit, the workers and the annotations
6. Instantiate the same workflow with different data and/or a different domain
  • Tweets, Twitter
7. Perform cross-domain analysis
  • Event extraction in video synopses vs. event extraction in tweets
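Step 5 evaluates workers as well as units and annotations. One way to sketch a worker-level check is to compare each worker's annotation vector with the combined vectors of all other workers on the same unit, using cosine similarity. This is loosely modelled on disagreement-based worker metrics and is not the CrowdTruth library's API. The worker names, vectors and function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def worker_agreement(vectors):
    """For each worker, cosine between their vector and the sum of all other workers' vectors."""
    total = [sum(col) for col in zip(*vectors.values())]
    return {
        worker: cosine(vec, [t - x for t, x in zip(total, vec)])
        for worker, vec in vectors.items()
    }

# Hypothetical binary vectors over the candidate tokens [makes, used, had, sales].
vectors = {
    "w1": [1, 0, 1, 1],
    "w2": [0, 0, 1, 1],
    "w3": [1, 1, 1, 0],
    "w4": [0, 1, 0, 0],
}
agreement = worker_agreement(vectors)
lowest = min(agreement, key=agreement.get)
```

Here "w4" gets the lowest score because it selected a single token that most others skipped. In a disagreement-first setting, a low score on one unit is a prompt to look at that worker across many units, not grounds for removal on its own.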
Project Websites
http://CrowdTruth.org
http://diveproject.beeldengeluid.nl
Tools & Code
http://dev.CrowdTruth.org
http://github.com/CrowdTruth
http://diveplus.beeldengeluid.nl
Data
http://data.crowdtruth.org
http://data.dive.beeldengeluid.nl
20

More Related Content

Viewers also liked

Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)
Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)
Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)Lora Aroyo
 
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014 Lora Aroyo
 
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)Lora Aroyo
 
Towards Better Media Understanding and Searchability
Towards Better Media Understanding and SearchabilityTowards Better Media Understanding and Searchability
Towards Better Media Understanding and Searchabilityoanainel
 
Gamification of crowdsourcing tasks: What motivates a medical expert?
Gamification of crowdsourcing tasks: What motivates a medical expert?Gamification of crowdsourcing tasks: What motivates a medical expert?
Gamification of crowdsourcing tasks: What motivates a medical expert?CrowdTruth
 
Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data
Visualization of Disagreement-based Quality Metrics of Crowdsourcing DataVisualization of Disagreement-based Quality Metrics of Crowdsourcing Data
Visualization of Disagreement-based Quality Metrics of Crowdsourcing DataCrowdTruth
 
Crowdsourcing Disagreement on Open-Domain Questions
Crowdsourcing Disagreement on Open-Domain QuestionsCrowdsourcing Disagreement on Open-Domain Questions
Crowdsourcing Disagreement on Open-Domain QuestionsBenjamin Timmermans
 
Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...
Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...
Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...CrowdTruth
 
Crowds & Niches Teaching Machines to Diagnose: NLeSC Kick off eHumanities pr...
Crowds & Niches Teaching Machines to Diagnose: NLeSC Kick off eHumanities pr...Crowds & Niches Teaching Machines to Diagnose: NLeSC Kick off eHumanities pr...
Crowds & Niches Teaching Machines to Diagnose: NLeSC Kick off eHumanities pr...Lora Aroyo
 
Truth is a Lie: 7 Myths about Human Annotation @CogComputing Forum 2014
Truth is a Lie: 7 Myths about Human Annotation @CogComputing Forum 2014Truth is a Lie: 7 Myths about Human Annotation @CogComputing Forum 2014
Truth is a Lie: 7 Myths about Human Annotation @CogComputing Forum 2014Lora Aroyo
 
Dive+@ICTOpen2017
Dive+@ICTOpen2017Dive+@ICTOpen2017
Dive+@ICTOpen2017oanainel
 
Dive+ NL eScience symposium 2015
Dive+ NL eScience symposium 2015Dive+ NL eScience symposium 2015
Dive+ NL eScience symposium 2015CrowdTruth
 
CrowdTruth Games @NLeSc eHumanities day 2015
CrowdTruth Games @NLeSc eHumanities day 2015CrowdTruth Games @NLeSc eHumanities day 2015
CrowdTruth Games @NLeSc eHumanities day 2015Lora Aroyo
 
SXSW2017 @NewDutchMedia Talk: Exploration is the New Search
SXSW2017 @NewDutchMedia Talk: Exploration is the New SearchSXSW2017 @NewDutchMedia Talk: Exploration is the New Search
SXSW2017 @NewDutchMedia Talk: Exploration is the New SearchLora Aroyo
 
Harnessing the Power of Machines & Crowds for Event Extraction
Harnessing the Power of Machines & Crowds for Event ExtractionHarnessing the Power of Machines & Crowds for Event Extraction
Harnessing the Power of Machines & Crowds for Event Extractionoanainel
 
DIVE Semantic Web Challenge Presentation
DIVE Semantic Web Challenge Presentation DIVE Semantic Web Challenge Presentation
DIVE Semantic Web Challenge Presentation Victor de Boer
 
Europeana GA 2016: Harnessing Crowds, Niches & Professionals in the Digital Age
Europeana GA 2016: Harnessing Crowds, Niches & Professionals  in the Digital AgeEuropeana GA 2016: Harnessing Crowds, Niches & Professionals  in the Digital Age
Europeana GA 2016: Harnessing Crowds, Niches & Professionals in the Digital AgeLora Aroyo
 
Truth is a Lie - 7 Myths of Human Annotation
Truth is a Lie - 7 Myths of Human AnnotationTruth is a Lie - 7 Myths of Human Annotation
Truth is a Lie - 7 Myths of Human AnnotationAnca Dumitrache
 
Genuine semantic publishing
Genuine semantic publishingGenuine semantic publishing
Genuine semantic publishingTobias Kuhn
 

Viewers also liked (20)

Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)
Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)
Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)
 
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
 
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
 
Towards Better Media Understanding and Searchability
Towards Better Media Understanding and SearchabilityTowards Better Media Understanding and Searchability
Towards Better Media Understanding and Searchability
 
Gamification of crowdsourcing tasks: What motivates a medical expert?
Gamification of crowdsourcing tasks: What motivates a medical expert?Gamification of crowdsourcing tasks: What motivates a medical expert?
Gamification of crowdsourcing tasks: What motivates a medical expert?
 
Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data
Visualization of Disagreement-based Quality Metrics of Crowdsourcing DataVisualization of Disagreement-based Quality Metrics of Crowdsourcing Data
Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data
 
Crowdsourcing Disagreement on Open-Domain Questions
Crowdsourcing Disagreement on Open-Domain QuestionsCrowdsourcing Disagreement on Open-Domain Questions
Crowdsourcing Disagreement on Open-Domain Questions
 
Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...
Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...
Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...
 
Crowds & Niches Teaching Machines to Diagnose: NLeSC Kick off eHumanities pr...
Crowds & Niches Teaching Machines to Diagnose: NLeSC Kick off eHumanities pr...Crowds & Niches Teaching Machines to Diagnose: NLeSC Kick off eHumanities pr...
Crowds & Niches Teaching Machines to Diagnose: NLeSC Kick off eHumanities pr...
 
Truth is a Lie: 7 Myths about Human Annotation @CogComputing Forum 2014
Truth is a Lie: 7 Myths about Human Annotation @CogComputing Forum 2014Truth is a Lie: 7 Myths about Human Annotation @CogComputing Forum 2014
Truth is a Lie: 7 Myths about Human Annotation @CogComputing Forum 2014
 
Dive+@ICTOpen2017
Dive+@ICTOpen2017Dive+@ICTOpen2017
Dive+@ICTOpen2017
 
Dive+ NL eScience symposium 2015
Dive+ NL eScience symposium 2015Dive+ NL eScience symposium 2015
Dive+ NL eScience symposium 2015
 
CrowdTruth Games @NLeSc eHumanities day 2015
CrowdTruth Games @NLeSc eHumanities day 2015CrowdTruth Games @NLeSc eHumanities day 2015
CrowdTruth Games @NLeSc eHumanities day 2015
 
SXSW2017 @NewDutchMedia Talk: Exploration is the New Search
SXSW2017 @NewDutchMedia Talk: Exploration is the New SearchSXSW2017 @NewDutchMedia Talk: Exploration is the New Search
SXSW2017 @NewDutchMedia Talk: Exploration is the New Search
 
Harnessing the Power of Machines & Crowds for Event Extraction
Harnessing the Power of Machines & Crowds for Event ExtractionHarnessing the Power of Machines & Crowds for Event Extraction
Harnessing the Power of Machines & Crowds for Event Extraction
 
Kick-off meeting Linkflows project
Kick-off meeting Linkflows projectKick-off meeting Linkflows project
Kick-off meeting Linkflows project
 
DIVE Semantic Web Challenge Presentation
DIVE Semantic Web Challenge Presentation DIVE Semantic Web Challenge Presentation
DIVE Semantic Web Challenge Presentation
 
Europeana GA 2016: Harnessing Crowds, Niches & Professionals in the Digital Age
Europeana GA 2016: Harnessing Crowds, Niches & Professionals  in the Digital AgeEuropeana GA 2016: Harnessing Crowds, Niches & Professionals  in the Digital Age
Europeana GA 2016: Harnessing Crowds, Niches & Professionals in the Digital Age
 
Truth is a Lie - 7 Myths of Human Annotation
Truth is a Lie - 7 Myths of Human AnnotationTruth is a Lie - 7 Myths of Human Annotation
Truth is a Lie - 7 Myths of Human Annotation
 
Genuine semantic publishing
Genuine semantic publishingGenuine semantic publishing
Genuine semantic publishing
 

Similar to ESWC - PhD Symposium 2016

W4P-Launch - Open Source Crowdsourcing platform
W4P-Launch - Open Source Crowdsourcing platformW4P-Launch - Open Source Crowdsourcing platform
W4P-Launch - Open Source Crowdsourcing platformOpen Knowledge Belgium
 
Where to focus event innovation? - An audience led approach
Where to focus event innovation? - An audience led approachWhere to focus event innovation? - An audience led approach
Where to focus event innovation? - An audience led approachLive Union
 
CROWDFUNDING IN ACTION: HOW INSTITUTIONAL LOGICS ENCOURAGE AND CONSTRAIN AFFO...
CROWDFUNDING IN ACTION: HOW INSTITUTIONAL LOGICS ENCOURAGE AND CONSTRAIN AFFO...CROWDFUNDING IN ACTION: HOW INSTITUTIONAL LOGICS ENCOURAGE AND CONSTRAIN AFFO...
CROWDFUNDING IN ACTION: HOW INSTITUTIONAL LOGICS ENCOURAGE AND CONSTRAIN AFFO...Claire Ingram Bogusz
 
How Customer Intelligence Will Future Proof Your Event Portfolio
How Customer Intelligence Will Future Proof Your Event PortfolioHow Customer Intelligence Will Future Proof Your Event Portfolio
How Customer Intelligence Will Future Proof Your Event PortfolioBear Analytics
 
Queuing and The Age of Context: Release 1 The Digital Consumer Collaborative
Queuing and The Age of Context: Release 1 The Digital Consumer CollaborativeQueuing and The Age of Context: Release 1 The Digital Consumer Collaborative
Queuing and The Age of Context: Release 1 The Digital Consumer CollaborativeDave Norton
 
Intro For Informative Essay
Intro For Informative EssayIntro For Informative Essay
Intro For Informative EssayLisa Johnson
 
Essay Radiology Career
Essay Radiology CareerEssay Radiology Career
Essay Radiology CareerAmy Williams
 
Accountability in Action - Step Seven
Accountability in Action - Step SevenAccountability in Action - Step Seven
Accountability in Action - Step Seventincancollective
 
Essay On Current Affairs Of Pakistan 2014
Essay On Current Affairs Of Pakistan 2014Essay On Current Affairs Of Pakistan 2014
Essay On Current Affairs Of Pakistan 2014Shantel Jervey
 
10ictprojectforsocialchange
10ictprojectforsocialchange10ictprojectforsocialchange
10ictprojectforsocialchangeYoonaIm6
 
Crowdsourcing 101 for GLAMs
Crowdsourcing 101 for GLAMsCrowdsourcing 101 for GLAMs
Crowdsourcing 101 for GLAMsOlaf Janssen
 
ICT Project for Social Change - Empowerment Technologies
ICT Project for Social Change - Empowerment TechnologiesICT Project for Social Change - Empowerment Technologies
ICT Project for Social Change - Empowerment TechnologiesMark Jhon Oxillo
 
Bad Effects Of Smoking Short Essay. Online assignment writing service.
Bad Effects Of Smoking Short Essay. Online assignment writing service.Bad Effects Of Smoking Short Essay. Online assignment writing service.
Bad Effects Of Smoking Short Essay. Online assignment writing service.Lisa Richardson
 
Prospecting & Screening: A Beginners Guide
Prospecting & Screening: A Beginners GuideProspecting & Screening: A Beginners Guide
Prospecting & Screening: A Beginners GuideBen Rymer
 
Personal Data and Trust Network inaugural Event 11 march 2015 - record
Personal Data and Trust Network inaugural Event   11 march 2015 - recordPersonal Data and Trust Network inaugural Event   11 march 2015 - record
Personal Data and Trust Network inaugural Event 11 march 2015 - recordDigital Catapult
 
Speech Maarten Brouwer at Open Data for Development Camp, May 2011, Amsterdam
Speech Maarten Brouwer at  Open Data for Development Camp, May 2011,  AmsterdamSpeech Maarten Brouwer at  Open Data for Development Camp, May 2011,  Amsterdam
Speech Maarten Brouwer at Open Data for Development Camp, May 2011, Amsterdamopenforchange
 
UMAP 2013 - Link, Like, Follow, Friend: The Social Element in User Modeling a...
UMAP 2013 - Link, Like, Follow, Friend: The Social Element in User Modeling a...UMAP 2013 - Link, Like, Follow, Friend: The Social Element in User Modeling a...
UMAP 2013 - Link, Like, Follow, Friend: The Social Element in User Modeling a...gjhouben
 

Similar to ESWC - PhD Symposium 2016 (20)

W4P-Launch - Open Source Crowdsourcing platform
W4P-Launch - Open Source Crowdsourcing platformW4P-Launch - Open Source Crowdsourcing platform
W4P-Launch - Open Source Crowdsourcing platform
 
Where to focus event innovation? - An audience led approach
Where to focus event innovation? - An audience led approachWhere to focus event innovation? - An audience led approach
Where to focus event innovation? - An audience led approach
 
CROWDFUNDING IN ACTION: HOW INSTITUTIONAL LOGICS ENCOURAGE AND CONSTRAIN AFFO...
CROWDFUNDING IN ACTION: HOW INSTITUTIONAL LOGICS ENCOURAGE AND CONSTRAIN AFFO...CROWDFUNDING IN ACTION: HOW INSTITUTIONAL LOGICS ENCOURAGE AND CONSTRAIN AFFO...
CROWDFUNDING IN ACTION: HOW INSTITUTIONAL LOGICS ENCOURAGE AND CONSTRAIN AFFO...
 
How Customer Intelligence Will Future Proof Your Event Portfolio
How Customer Intelligence Will Future Proof Your Event PortfolioHow Customer Intelligence Will Future Proof Your Event Portfolio
How Customer Intelligence Will Future Proof Your Event Portfolio
 
Audience Lessons
Audience LessonsAudience Lessons
Audience Lessons
 
Matchbox presentation
Matchbox presentation Matchbox presentation
Matchbox presentation
 
Queuing and The Age of Context: Release 1 The Digital Consumer Collaborative
Queuing and The Age of Context: Release 1 The Digital Consumer CollaborativeQueuing and The Age of Context: Release 1 The Digital Consumer Collaborative
Queuing and The Age of Context: Release 1 The Digital Consumer Collaborative
 
Intro For Informative Essay
Intro For Informative EssayIntro For Informative Essay
Intro For Informative Essay
 
Essay Radiology Career
Essay Radiology CareerEssay Radiology Career
Essay Radiology Career
 
Accountability in Action - Step Seven
Accountability in Action - Step SevenAccountability in Action - Step Seven
Accountability in Action - Step Seven
 
Essay On Current Affairs Of Pakistan 2014
Essay On Current Affairs Of Pakistan 2014Essay On Current Affairs Of Pakistan 2014
Essay On Current Affairs Of Pakistan 2014
 
10ictprojectforsocialchange
10ictprojectforsocialchange10ictprojectforsocialchange
10ictprojectforsocialchange
 
Crowdsourcing 101 for GLAMs
Crowdsourcing 101 for GLAMsCrowdsourcing 101 for GLAMs
Crowdsourcing 101 for GLAMs
 
ICT Project for Social Change - Empowerment Technologies
ICT Project for Social Change - Empowerment TechnologiesICT Project for Social Change - Empowerment Technologies
ICT Project for Social Change - Empowerment Technologies
 
Bad Effects Of Smoking Short Essay. Online assignment writing service.
Bad Effects Of Smoking Short Essay. Online assignment writing service.Bad Effects Of Smoking Short Essay. Online assignment writing service.
Bad Effects Of Smoking Short Essay. Online assignment writing service.
 
EIA2016 Turin - Alberto Giusti. Crowdfunding
EIA2016 Turin - Alberto Giusti.  CrowdfundingEIA2016 Turin - Alberto Giusti.  Crowdfunding
EIA2016 Turin - Alberto Giusti. Crowdfunding
 
Prospecting & Screening: A Beginners Guide
Prospecting & Screening: A Beginners GuideProspecting & Screening: A Beginners Guide
Prospecting & Screening: A Beginners Guide
 
Personal Data and Trust Network inaugural Event 11 march 2015 - record
Personal Data and Trust Network inaugural Event   11 march 2015 - recordPersonal Data and Trust Network inaugural Event   11 march 2015 - record
Personal Data and Trust Network inaugural Event 11 march 2015 - record
 
Speech Maarten Brouwer at Open Data for Development Camp, May 2011, Amsterdam
Speech Maarten Brouwer at  Open Data for Development Camp, May 2011,  AmsterdamSpeech Maarten Brouwer at  Open Data for Development Camp, May 2011,  Amsterdam
Speech Maarten Brouwer at Open Data for Development Camp, May 2011, Amsterdam
 
UMAP 2013 - Link, Like, Follow, Friend: The Social Element in User Modeling a...
UMAP 2013 - Link, Like, Follow, Friend: The Social Element in User Modeling a...UMAP 2013 - Link, Like, Follow, Friend: The Social Element in User Modeling a...
UMAP 2013 - Link, Like, Follow, Friend: The Social Element in User Modeling a...
 

Recently uploaded

VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...SUHANI PANDEY
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...amitlee9823
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...amitlee9823
 
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...amitlee9823
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...amitlee9823
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...amitlee9823
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...karishmasinghjnh
 
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men 🔝Thrissur🔝 Escor...
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men  🔝Thrissur🔝   Escor...➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men  🔝Thrissur🔝   Escor...
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men 🔝Thrissur🔝 Escor...amitlee9823
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...only4webmaster01
 

Recently uploaded (20)

CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...

ESWC - PhD Symposium 2016

  • 1. Machine-Crowd Annotation Workflow for Event Understanding across Collections & Domains. Oana Inel, Extended Semantic Web Conference, PhD Symposium, May 30th 2016
  • 2. Too much information ... e.g., if you are interested in the topic of “whaling”
  • 3. … and after a while it all looks the same: it is difficult to form a global picture on a topic
  • 4. … thus, content without context is difficult to process; events can help create context around content
  • 5. …, but events are not easy to deal with • Events are vague • Event semantics are difficult • Events can be viewed and interpreted from multiple perspectives, e.g. of participants' interpretation: The mayor of the city called the celebration a success. • Events can be presented at different levels of granularity, e.g. of spatial disagreement: The celebration took place in every city in the Netherlands. • People are not consistent in the way they talk about or use events, e.g.: The celebration took place last week, fireworks shows were held everywhere.
  • 6. … a lot of ground truth is needed to learn event specifics • Traditional ground truth collection doesn’t scale: • there is not really ‘one type of expert’ when it comes to events • the annotation guidelines for events are difficult to define • the annotation of events can be a tedious process • all of the above can result in high inter-annotator disagreement • Crowdsourcing could be an alternative • but it is still not a robust & replicable approach
  • 7. … let’s look at some examples: According to department policy prosecutors must make a strong showing that lawyers' fees came from assets tainted by illegal profits before any attempts at seizure are made. The unit makes intravenous pumps used by hospitals and had more than $110 million in sales last year according to Advanced Medical.
  • 8. … here is what experts annotate on these sentences: [According] to department policy prosecutors must make a strong [showing] that lawyers' fees [came] from assets tainted by illegal profits before any [attempts] at [seizure] are [made]. The unit makes intravenous pumps used by hospitals and [had] more than $110 million in [sales] last year according to Advanced Medical.
  • 9. … here is what the crowd annotates on them: According to department policy prosecutors must make a [strong [showing]] that lawyers' fees [[came] from assets] [tainted] by illegal profits before any [attempts] at [seizure] are [made]. The unit [makes] intravenous pumps [used] by hospitals and [[had] more than $110 million in [sales]] last year according to Advanced Medical.
  • 10. … here is what the machines can detect: According to department policy prosecutors must [make] a strong showing that lawyers' fees [came] from assets [tainted] by illegal profits before any attempts at seizure are made. The unit [makes] intravenous pumps [used] by hospitals and [had] more than $110 million in sales last year according to Advanced Medical.
  • 11. Research Questions • Can crowdsourcing help in improving event detection? • Can we provide reliable crowdsourced training data? • Can we optimize the crowdsourcing process by using results from NLP tools? • Can we achieve a replicable data collection process across different data types and use cases?
  • 12. Current Hypothesis: Disagreement-based approach to crowdsource ground truth is reliable and produces quality results
  • 13. Preliminary Results - Crowd vs. Experts ● 200 news snippets from TimeBank ● 3019 tweets published in 2014 ● potentially relevant tweets for events such as ‘whaling’, ‘Davos 2014’, among others. The CrowdTruth approach outperforms state-of-the-art crowdsourcing approaches such as single annotator and majority vote. The crowd performs almost as well as the experts due to very linguistic-specialized guidelines for expert annotators.
  • 14. Current Hypothesis: Disagreement-based approach to crowdsource ground truth can be optimised by using results from NLP tools
  • 15. Preliminary Results - Hybrid Workflow: ENTITY EXTRACTION; EVENTS CROWDSOURCING AND LINKING TO CONCEPTS; SEGMENTATION & KEYFRAMES; LINKING EVENTS AND CONCEPTS TO KEYFRAMES (diveplus.beeldengeluid.nl)
  • 16. Preliminary Results - Hybrid Workflow Outcome (diveplus.beeldengeluid.nl)
  • 17. Approach: Disagreement is Signal. Principles for disagreement-based crowdsourcing: • Do not enforce agreement • Capture a multitude of views • Take advantage of existing tools, reuse their functionality. This results in teaching machines to reason in the disagreement space.
  • 18. Overall Methodology 1. Instantiate the research methodology with specific data and domain: video synopsis, news. 2. Identify state-of-the-art IE approaches that can be used: NER tools for identifying events and their participating entities in the video synopsis. 3. Evaluate IE approaches and identify their drawbacks: poor performance in extracting events. 4. Combine IE with crowdsourcing tasks in a complementary way: use crowdsourcing for identifying the events and linking them with their participating entities. 5. Evaluate crowdsourcing results with the CrowdTruth disagreement-first approach: evaluate the input unit, the workers and the annotations. 6. Instantiate the same workflow with different data and/or a different domain: tweets, Twitter. 7. Perform cross-domain analysis: event extraction in video synopsis vs. event extraction in tweets.
  • 19. Project Websites: http://CrowdTruth.org, http://diveproject.beeldengeluid.nl. Tools & Code: http://dev.CrowdTruth.org, http://github.com/CrowdTruth, http://diveplus.beeldengeluid.nl. Data: http://data.crowdtruth.org, http://data.dive.beeldengeluid.nl
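
The comparison on slides 8 to 10 can be made concrete with a small sketch. It transcribes the bracketed event mentions from the two example sentences and measures how many of the expert mentions the crowd and the machine also found. Two things here are assumptions for illustration, not part of the talk: nested crowd spans such as [strong [showing]] are reduced to their inner word, and the function name `coverage` is made up for this sketch.

```python
# Event mentions transcribed from slides 8-10 as (sentence number, word) pairs.
# Assumption: nested crowd spans are reduced to their innermost bracketed word.
EXPERTS = {
    (1, "According"), (1, "showing"), (1, "came"),
    (1, "attempts"), (1, "seizure"), (1, "made"),
    (2, "had"), (2, "sales"),
}
CROWD = {
    (1, "showing"), (1, "came"), (1, "tainted"),
    (1, "attempts"), (1, "seizure"), (1, "made"),
    (2, "makes"), (2, "used"), (2, "had"), (2, "sales"),
}
MACHINE = {
    (1, "make"), (1, "came"), (1, "tainted"),
    (2, "makes"), (2, "used"), (2, "had"),
}


def coverage(reference, candidate):
    """Fraction of reference mentions that also appear in candidate."""
    return len(reference & candidate) / len(reference)


if __name__ == "__main__":
    print("crowd covers experts:  ", coverage(EXPERTS, CROWD))
    print("machine covers experts:", coverage(EXPERTS, MACHINE))
    # Mentions the crowd found that the experts did not mark.
    print("crowd-only mentions:   ", sorted(CROWD - EXPERTS))
```

On these two sentences the crowd misses only "According", while it adds "tainted", "makes" and "used", which the experts did not mark.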

Editor's Notes

  1. Massive amount of information. One of the main characteristics of today is the massive, even overwhelming, amount of information around us. Just think of all the videos, images and the endless web pages and tweets that you get as search results when you want to learn about a topic.
  2. However, this inconceivable amount of information starts to ‘look all the same’ to users, and they are not able to properly consume the information and get an overview of the topic.
  3. This happens because content without context is difficult to process. But events can help create context around content.
  4. Experts can be inconsistent, despite the traditional belief that they are always right.
  5. The crowd overlaps with the experts by 88%, i.e. it detects almost the same events as the experts. The added value is that the crowd finds even more events and is more specific. Another point is that the crowd seems to be more consistent :-)
  6. And how little the machines are able to detect from this, so they need to learn more; thus more training data is needed for them.
  7. Majority vote: the answer that was picked by the majority of the workers, plus all the answers that were picked by at least half of the total number of workers. Single: an annotation randomly sampled from the set of workers annotating the unit, to show that having more annotators generates better-quality data. CT (CrowdTruth) scores consistently above majority vote and single annotator, and its performance is also comparable to that of domain experts. Crowdsourcing tasks where workers choose annotations from a fixed number of options (e.g. Twitter event extraction) perform better at higher thresholds, whereas open annotation tasks (event extraction) perform better when the threshold is at its lowest, thus ensuring the most diverse opinions are accounted for in the resulting ground truth.
  8. Message of the results. Data on which the experiments were performed.
  9. We have two hypotheses for this.
  10. Experts are inconsistent.
  11. Automatic tools detect less; it is difficult to see what the focus is. The crowd is much more specific than the experts. The crowd overlaps a lot with the experts. Experts have some difficult events. Experts are not consistent.
  12. Automatic tools detect less; it is difficult to see what the focus is. The crowd is much more specific than the experts. The crowd overlaps a lot with the experts. Experts have some difficult events. Experts are not consistent.
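
The three aggregation strategies described in note 7 can be sketched as follows. This is a minimal illustration under stated assumptions, not the CrowdTruth implementation: the worker judgments in `JUDGMENTS` are invented, and `threshold_select` uses the plain fraction of workers supporting an annotation as a simplified stand-in for the CrowdTruth annotation scores, whose exact definition is not given in these notes.

```python
import random
from collections import Counter

# Hypothetical judgments: each set holds the words one worker marked as events
# in the second example sentence. These values are invented for illustration.
JUDGMENTS = [
    {"makes", "had", "sales"},
    {"had", "sales"},
    {"used", "had"},
    {"had", "sales", "makes"},
    {"sales"},
]


def support(judgments):
    """Map each annotation to the fraction of workers who selected it."""
    counts = Counter(a for worker in judgments for a in worker)
    return {a: c / len(judgments) for a, c in counts.items()}


def majority_vote(judgments):
    """Keep annotations picked by at least half of the workers."""
    return {a for a, s in support(judgments).items() if s >= 0.5}


def single_annotator(judgments, rng):
    """Baseline: take the annotations of one randomly sampled worker."""
    return set(rng.choice(judgments))


def threshold_select(judgments, threshold):
    """Simplified stand-in for a score threshold: keep annotations whose
    worker support reaches the threshold (an assumption of this sketch)."""
    return {a for a, s in support(judgments).items() if s >= threshold}


if __name__ == "__main__":
    print("majority vote:   ", sorted(majority_vote(JUDGMENTS)))
    print("single annotator:", sorted(single_annotator(JUDGMENTS, random.Random(0))))
    print("threshold 0.1:   ", sorted(threshold_select(JUDGMENTS, 0.1)))
    print("threshold 0.7:   ", sorted(threshold_select(JUDGMENTS, 0.7)))
```

A low threshold keeps minority annotations such as "used", which matches the observation that open annotation tasks benefit from the lowest threshold, while a high threshold keeps only strongly supported annotations.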