This document describes a study that crowdsourced the annotation of rumourous conversations on Twitter. The researchers developed a definition of rumour and annotated rumours and non-rumours related to several events. They designed an annotation scheme to capture conversational aspects like certainty and evidence within rumour threads. The scheme was iteratively revised and validated through crowdsourcing. Future work involves larger-scale crowdsourced annotation of conversations and using machine learning to analyze rumour identification and veracity.
Crowdsourcing the Annotation of Rumourous Conversations in Social Media
1. Arkaitz Zubiaga1, Maria Liakata1, Rob Procter1, Kalina Bontcheva2, Peter Tolmie1
1 University of Warwick, UK
2 University of Sheffield, UK
11. Objectives
● Scenario: a journalist is tracking a breaking news story.
● Identify rumours, distinguishing them from non-rumours.
● Study the conversational aspects of rumours, towards determining their veracity.
12. Objectives
Study conversational aspects of rumours:
1) Build a dataset with diverse sets of rumourous stories.
2) Annotate linguistic and interaction patterns within rumours to enable automated analysis.
3) Analyse these patterns and use machine learning techniques to determine the veracity of rumours.
13. Related Work
Previous work on rumour detection in social media [Qazvinian et al. 2011, Procter et al. 2013, Castillo et al. 2013, Starbird et al. 2014]:
● Rumours are known a priori and retrieved by keyword search, e.g., “sandy sharks” or “london eye fire”.
● Tweets are looked at individually; no interactions are captured.
14. Our Approach
Our approach:
● Identify a diverse set of rumours, which can be unknown a priori, e.g., follow #hurricanesandy and see what comes up.
● Annotate conversational aspects (with respect to veracity), capturing the interaction between tweets.
15. Creating a corpus of rumourous conversations
Steps:
● Formal definition of rumour.
● Annotation of rumours and non-rumours.
● Annotation of conversational aspects.
16. Definition of rumour
Combining the OED definition with previous research on rumours:
Rumour: “a circulating story of questionable veracity,
which is apparently credible but hard to verify, and
produces sufficient skepticism and/or anxiety.”
17. Data Collection
• Track an event on the streaming API, e.g. #ferguson.
• Data sampling: following the definition of rumour, sample tweets that spark a high number of retweets (RTs).
• Conversations: collect the conversations associated with the sampled tweets.
21. Annotation scheme: conversational aspects of rumours
Designed annotation scheme to:
• Capture sequential features of the conversation thread.
• Analyse the effect of interaction at a given point.
• Break down annotation into tweet triples (or fewer).
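
One way to picture the breakdown into triples is the sketch below. The slides do not spell out what a triple contains, so this assumes (hypothetically) that each reply is shown together with the source tweet of the thread and the tweet it directly replies to, and that direct replies to the source collapse to pairs. The function name `annotation_units` and the tweet ids are invented for illustration.

```python
from typing import Dict, List, Tuple


def annotation_units(source_id: str, parent_of: Dict[str, str]) -> List[Tuple[str, ...]]:
    """Return one unit per tweet: (source,), (source, reply) or (source, parent, reply)."""
    # The source tweet is annotated on its own.
    units: List[Tuple[str, ...]] = [(source_id,)]
    for reply_id, parent_id in parent_of.items():
        if parent_id == source_id:
            # Direct reply to the source: only two distinct tweets to show.
            units.append((source_id, reply_id))
        else:
            # Nested reply: source, the tweet replied to, and the reply itself.
            units.append((source_id, parent_id, reply_id))
    return units


# Toy thread: r1 and r3 reply to the source "s", r2 replies to r1.
thread = {"r1": "s", "r2": "r1", "r3": "s"}
units = annotation_units("s", thread)
print(units)
```

Under this assumption no annotation unit ever contains more than three tweets, however deep the thread is.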
23. Crowdsourcing the annotation of tweets
• Used CrowdFlower for crowdsourcing, with 5-10 annotators per tweet-feature pair.
• All data also annotated by two of us, as a reference.
25. Crowdsourcing task results
• Annotation of 216 tweets in 8 threads.
  ○ 3-4 features per tweet: 4,974 units.
• 98 different annotators.
• Final set of annotations obtained through majority voting.
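
Majority voting over crowd judgments can be sketched as follows. This is a minimal illustration, not the aggregation code used in the study: the tuple layout, the feature and label names, and the choice to return `None` on a tie are all assumptions, since the slides do not say how ties were resolved.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Optional, Tuple

Judgment = Tuple[str, str, str, str]  # (tweet_id, feature, annotator_id, label)


def majority_vote(judgments: Iterable[Judgment]) -> Dict[Tuple[str, str], Optional[str]]:
    """Pick the most frequent label per (tweet, feature) pair; None if the top labels tie."""
    by_unit = defaultdict(list)
    for tweet_id, feature, _annotator, label in judgments:
        by_unit[(tweet_id, feature)].append(label)

    result: Dict[Tuple[str, str], Optional[str]] = {}
    for unit, labels in by_unit.items():
        counts = Counter(labels).most_common()
        top_label, top_count = counts[0]
        tied = len(counts) > 1 and counts[1][1] == top_count
        result[unit] = None if tied else top_label  # tie handling is an assumption
    return result


# Toy judgments: feature names and labels are illustrative only.
judgments = [
    ("t1", "support", "w1", "comment"),
    ("t1", "support", "w2", "comment"),
    ("t1", "support", "w3", "supporting"),
    ("t1", "support", "w4", "comment"),
    ("t1", "support", "w5", "denying"),
    ("t1", "evidence", "w1", "none"),
    ("t1", "evidence", "w2", "url"),
    ("t1", "evidence", "w3", "none"),
    ("t1", "evidence", "w4", "url"),
    ("t1", "evidence", "w5", "none"),
    ("t1", "evidence", "w6", "url"),
]
final = majority_vote(judgments)
print(final)
```

Returning `None` for ties keeps undecided units visible, so they could be sent for further judgments or checked against the reference annotations.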
29. Distribution of annotations
Skewed distribution of annotations:
• 66.5% of replies are comments.
• 79.8% of replies provide no evidence.
• 84% of comments provide no evidence.
30. Annotation scheme: conversational aspects of rumours
(+) Underspecified
• Comments should not be annotated for certainty and evidentiality (they're not adding anything to veracity anyway).
31. Conclusion
● Described a novel method to collect and annotate rumourous conversations from Twitter.
● Introduced an annotation scheme for conversation threads.
  ○ Annotations looking at tweet triples.
  ○ Differentiating source tweets and replies.
● Scheme iteratively revised and validated through crowdsourcing.
32. Future Work
● With the validated annotation scheme, perform larger-scale crowdsourced annotation of conversations.
● Annotation of a wider variety of events, e.g., the Charlie Hebdo shooting, the Germanwings plane crash, etc.
● Development of machine learning tools:
  ○ Rumour identification and veracity assessment.
  ○ Tweet classification: supporting/denying, providing evidence, etc.