Crisis Computing
Finding relevant and credible information on social
media during disasters
Big Data Analytics Conference
Delhi, India, December 2014
January 2010
How/when did it start for me?
Humanitarian Computing
At least 775 publications:
● Crisis Analysis (55)
● Crisis Management (309)
● Situational Awareness (67)
● Social Media (231)
● Mobile Phones (74)
● Crowdsourcing (116)
● Software and Tools (97)
● Human-Computer Interaction (28)
● Natural Language Processing (33)
● Trust and Security (33)
● Geographical Analysis (53)
Source: http://humanitariancomp.referata.com/
Humanitarian Computing Topics
http://www.youtube.com/watch?v=0UFsJhYBxzY
Carlos Castillo – chato@acm.org
http://www.chato.cl/research/
An earthquake hits a Twitter user
• When an earthquake strikes, the first tweets are posted 20-30 seconds later
• Damaging seismic waves travel at 3-5 km/s, while network communications travel at close to the speed of light over fiber/copper, plus latency
• After ~100 km, seismic waves may be overtaken by tweets about them
http://xkcd.com/723/
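The ~100 km figure follows from the two ranges quoted above. A minimal sketch of the arithmetic, assuming network latency is negligible next to the 20-30 second posting delay; the function name is illustrative and not from the talk:

```python
def overtake_distance_km(tweet_delay_s: float, wave_speed_km_s: float) -> float:
    """Distance the seismic wave front has covered when the first tweet is posted.

    Beyond this distance a tweet can arrive before the shaking does,
    assuming network latency is negligible compared with the posting delay.
    """
    return tweet_delay_s * wave_speed_km_s


# Ranges quoted on the slide: first tweets after 20-30 s, waves at 3-5 km/s.
nearest = overtake_distance_km(20, 3)   # fastest tweet, slowest wave
farthest = overtake_distance_km(30, 5)  # slowest tweet, fastest wave
print(f"Tweets can overtake the waves between {nearest:.0f} and {farthest:.0f} km away")
```

The quoted ~100 km sits inside the resulting range.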
Examples of crisis tweets
Alexandra Olteanu, Sarah Vieweg and Carlos Castillo: What to Expect When the
Unexpected Happens: Social Media Communications Across Crises.
To appear in CSCW 2015.
Examples of crisis tweets (cont.)
Fertile grounds for applied research
✔ Problems of global significance
✔ Solved with labor-intensive methods
✔ Better solution provides a public good
✔ Large and noisy data sets available
✔ Engage volunteer communities
• Relevance to practitioners?
Current collaborators
Patrick Meier – QCRI
Sarah Vieweg – QCRI
Muhammad Imran – QCRI
Irina Temnikova – QCRI
Alexandra Olteanu – EPFL
Aditi Gupta – IIIT Delhi
“P.K.” Kumaraguru – IIIT Delhi
Fernando Diaz – Microsoft
Outline
Crisis Maps
Extraction
Matching
Verification
Credibility
Crisis maps from social media
Carlos Castillo, Fernando Diaz, and Hemant Purohit:
Leveraging Social Media and Web of Data to Assist Crisis Response Coordination
Tutorial at SDM, Philadelphia, PA, USA. April 2014.
Hemant Purohit, Carlos Castillo, Patrick Meier and Amit Sheth:
Crisis Mapping, Citizen Sensing and Social Media Analytics
Tutorial at ICWSM, May 2013.
Patrick Meier, Social Innovation Director @ QCRI – http://irevolution.net/
“What can speed humanitarian
response to tsunami-ravaged
coasts? Expose human rights
atrocities? Launch helicopters to
rescue earthquake victims?
Outwit corrupt regimes?
A map.”
Crisis mapping goes mainstream (2011)
http://newsbeatsocial.com/watch/0_s6xxcr3p
Understanding Crisis Tweets
Alexandra Olteanu, Sarah Vieweg and Carlos Castillo: What to Expect When the
Unexpected Happens: Social Media Communications Across Crises.
To appear in CSCW 2015.
Types of Disaster
Our approach
1. Filtering
2. Classification
3. Extraction
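The three stages (filtering, classification, extraction) compose naturally as a pipeline. The sketch below only illustrates that structure: `run_pipeline` and the three `toy_*` functions are hypothetical keyword-based stand-ins, not the classifiers or extractors used in the work presented here.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple

Tweet = str


def run_pipeline(
    tweets: Iterable[Tweet],
    is_relevant: Callable[[Tweet], bool],
    classify: Callable[[Tweet], str],
    extract: Callable[[Tweet, str], Optional[str]],
) -> Iterator[Tuple[Tweet, str, Optional[str]]]:
    """Filter, then classify, then extract; irrelevant tweets are dropped early."""
    for tweet in tweets:
        if not is_relevant(tweet):
            continue  # stage 1: filtering
        label = classify(tweet)  # stage 2: classification
        yield tweet, label, extract(tweet, label)  # stage 3: extraction


# Hypothetical keyword heuristics, for illustration only.
def toy_is_relevant(tweet: Tweet) -> bool:
    return "tornado" in tweet.lower()


def toy_classify(tweet: Tweet) -> str:
    return "Caution & Advice" if "watch" in tweet.lower() else "Other"


def toy_extract(tweet: Tweet, label: str) -> Optional[str]:
    return "tornado watch" if "tornado watch" in tweet.lower() else None


results = list(
    run_pipeline(
        ["Lunch was great", "There is a tornado watch in effect tonight."],
        toy_is_relevant,
        toy_classify,
        toy_extract,
    )
)
print(results)
```

Passing the stages in as functions keeps each one replaceable, for example swapping a keyword filter for a trained classifier.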
Filtering
Is disaster-related? (Yes / No)
Contributes to situational awareness? (Yes / No)
Classification
Filtered tweets are classified along two dimensions:
• Information provided: Caution & Advice, Information Sources, Damage & Casualties, Donations, ...
• Information source: Gov, Eyewitness, Media, NGO, Outsider, ...
A large-scale study of crisis tweets
• Collect tweets from 26 disasters
• Classify according to:
● Informative / Not informative
● Information provided
● Information source
Advice on labeling
• Your instructions will never be correct the first
time you try
– e.g. personal / eyewitness
– Instructions must be re-written reactively
– Perform small-scale labeling first
• Instructions must be concrete and brief
– If you can't do it, the task has to be divided
Information Provided in Crisis Tweets
N=26; Data available at http://crisislex.org/
What do people tweet about?
• Affected individuals
– 20% on average (min. 5%, max. 57%)
– most prevalent in human-induced, focalized & instantaneous events
• Sympathy and emotional support
– 20% on average (min. 3%, max. 52%)
– most prevalent in instantaneous events
• Other useful information
– 32% on average (min. 7%, max. 59%)
– least prevalent in diffused events
What do people tweet about? (cont.)
• Infrastructure and utilities
– 7% on average (min. 0%, max. 22%)
– most prevalent in diffused events, in particular floods
• Caution and advice
– 10% on average (min. 0%, max. 34%)
– least prevalent in instantaneous & human-induced events
• Donations and volunteering
– 10% on average (min. 0%, max. 44%)
– most prevalent in natural hazards
Distribution over information sources
Distribution over time
Extracting information and matching
emergency-related resources
Muhammad Imran, Shady Elbassuoni, Carlos Castillo, Fernando Diaz and Patrick Meier:
Extracting Information Nuggets from Disaster-Related Messages in Social Media
In ISCRAM. Baden-Baden, Germany, 2013. Best paper award.
Hemant Purohit, Amit Sheth, Carlos Castillo, Patrick Meier, Fernando Diaz:
Emergency-Relief Coordination on Social Media: Automatically Matching Resource Requests and Offers
First Monday 19 (1), January 2014
Muhammad Imran, Shady Elbassuoni, Carlos Castillo, Fernando Diaz and Patrick Meier:
Practical Extraction of Disaster-Relevant Information from Social Media
In SWDM. Rio de Janeiro, Brazil, 2013
Information Extraction
...
Classified tweets
@JimFreund: Apparently we have no choice. There is a tornado watch in effect tonight.
Extraction
• #hashtags, @user mentions, URLs, etc.
– Regular expressions
– Text library from Twitter
• Temporal expressions
– Part-of-speech tagger + heuristics
– Natty library
• Supervised learning
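For the first bullet, a minimal regular-expression sketch for hashtags, user mentions and URLs. The patterns are deliberately simple assumptions; Twitter's own text library, mentioned above, handles many more edge cases such as punctuation inside URLs and non-Latin scripts.

```python
import re

# Simplified patterns (assumptions); real tweets need more careful rules.
HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")
URL = re.compile(r"https?://\S+")


def extract_entities(tweet: str) -> dict:
    """Return the hashtags, user mentions and URLs found in a tweet."""
    return {
        "hashtags": HASHTAG.findall(tweet),
        "mentions": MENTION.findall(tweet),
        "urls": URL.findall(tweet),
    }


tweet = (
    "RT @twc_hurricane: Wind gusts over 60 mph are being reported at "
    "Central Park and JFK airport in #NYC this hour. #Sandy"
)
entities = extract_entities(tweet)
print(entities)
```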
Labels for extraction
• Type-dependent instruction
• Ask evaluators to copy-paste a word/phrase from
each tweet
Learning: Conditional Random Fields
• Used extensively in NLP for part-of-speech tagging
and information extraction
• Representation of observations is important
(capitalization, position, etc.)
[Figure: graphical models of an HMM and a linear-chain CRF, showing hidden and observed variables]
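To make "representation of observations" concrete, here is a sketch of the kind of per-token feature dictionary typically fed to a linear-chain CRF. The feature names are illustrative assumptions, not the feature set of any particular tool.

```python
from typing import Dict, List


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Describe token i by its surface form, its position and its neighbours."""
    token = tokens[i]
    return {
        "lower": token.lower(),
        "is_capitalized": token[:1].isupper(),
        "is_all_caps": token.isupper(),
        "has_digit": any(ch.isdigit() for ch in token),
        "starts_with_hash": token.startswith("#"),
        "starts_with_at": token.startswith("@"),
        "position": i,
        "is_first": i == 0,
        "is_last": i == len(tokens) - 1,
        # Neighbouring words give the model local context.
        "prev_lower": tokens[i - 1].lower() if i > 0 else "<START>",
        "next_lower": tokens[i + 1].lower() if i < len(tokens) - 1 else "<END>",
    }


tokens = "Wind gusts over 60 mph at Central Park".split()
features = [token_features(tokens, i) for i in range(len(tokens))]
print(features[6])
```

A CRF learner would receive one such dictionary per token, together with the label sequence to predict.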
Tool
• CMU ARK Twitter NLP
– Tokenization
– Feature extraction
– CRF learning
• Very easy to use: simply change the training set
(part-of-speech tags) into anything, and re-train
Output examples
RT @weatherchannel: .@NYGovCuomo orders closing of NYC bridges. Only Staten Island bridges unaffected at this time. Bridges must close by 7pm. #Sandy #NYC
Wow what a mess #Sandy has made. Be sure to check on the elderly and homeless please! Thoughts and prayers to all affected
RT @twc_hurricane: Wind gusts over 60 mph are being reported at Central Park and JFK airport in #NYC this hour. #Sandy
RT @mitchellreports: Red Cross tells us grateful for Romney donation but prefer people send money or donate blood dont collect goods NOT best way to help #Sandy
Extractor evaluation
Setting                                     Recall  Precision
Train 2/3 Joplin, Test 1/3 Joplin           78%     90%
Train 2/3 Sandy, Test 1/3 Sandy             41%     79%
Train Joplin, Test Sandy                    11%     78%
Train Joplin + 10% Sandy, Test 90% Sandy    21%     81%
• Precision is: one word or more in common with
what humans extracted
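The precision criterion above (at least one word in common with the human extraction) can be sketched as follows. How recall was computed is not stated on the slide; the definition used here (hits divided by the number of tweets with a human extraction) is an assumption, and the example pairs are made up for illustration.

```python
from typing import List, Optional, Set, Tuple


def words(text: str) -> Set[str]:
    return set(text.lower().split())


def is_hit(predicted: str, gold: str) -> bool:
    """A prediction counts as correct if it shares at least one word with the gold span."""
    return bool(words(predicted) & words(gold))


def evaluate(pairs: List[Tuple[Optional[str], Optional[str]]]) -> Tuple[float, float]:
    """pairs holds (system extraction, human extraction) per tweet; None means nothing extracted."""
    predicted = [(p, g) for p, g in pairs if p is not None]
    gold_total = sum(1 for _, g in pairs if g is not None)
    hits = sum(1 for p, g in predicted if g is not None and is_hit(p, g))
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / gold_total if gold_total else 0.0  # assumed definition
    return precision, recall


# Hypothetical evaluation pairs.
pairs = [
    ("tornado watch", "a tornado watch in effect"),  # hit
    ("60 mph", "Wind gusts over 60 mph"),            # hit
    ("7pm", "NYC bridges"),                          # miss: no shared word
    (None, "send money or donate blood"),            # system extracted nothing
]
precision, recall = evaluate(pairs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Because a single shared word is enough, this criterion is lenient, which is worth keeping in mind when reading the precision column.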
Donations matching
• Identify and match requests/offers for donations
– Money, clothing, food, shelter, volunteers, blood
Average precision = 0.21 (0.16 if only text similarity is used)
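The "only text similarity" baseline can be pictured as ranking offers by their similarity to a request. The sketch below uses bag-of-words cosine similarity as one plausible choice; the similarity measure actually used is not given on the slide, and the request and offers are invented examples.

```python
import math
from collections import Counter
from typing import List, Tuple


def bag_of_words(text: str) -> Counter:
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def best_offer(request: str, offers: List[str]) -> Tuple[str, float]:
    """Return the offer most similar to the request, with its score."""
    req = bag_of_words(request)
    scored = [(offer, cosine(req, bag_of_words(offer))) for offer in offers]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical messages.
offers = [
    "we can donate blood this weekend",
    "offering shelter for two families",
    "selling concert tickets",
]
match, score = best_offer("urgent need blood donors", offers)
print(match, round(score, 3))
```

Word overlap alone is fragile (a shared stop word can outrank a genuine match), which is consistent with the lower figure reported for text similarity on its own.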
Crowdsourced stream processing systems
Muhammad Imran, Ioanna Lykourentzou and Carlos Castillo:
Engineering Crowdsourced Stream Processing Systems
http://arxiv.org/abs/1310.5463
Design objectives and principles
• Low latency (example metric: end-to-end time)
– Automatic components: keep items moving
– Crowdsourced components: trivial tasks
• High throughput (example metric: output items per unit of time)
– Automatic components: high-performance processing
– Crowdsourced components: task automation
• Load adaptability (example metric: rate response function)
– Automatic components: load shedding, load queueing
– Crowdsourced components: task prioritization
• Cost effectiveness (example metric: cost vs. quality, throughput, etc.)
– Automatic components: N/A
– Crowdsourced components: task frugality
• High quality (example metric: application-dependent)
– Automatic and crowdsourced components: redundancy, aggregation and quality control
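Load shedding and load queueing can be illustrated with a bounded buffer: items are queued while there is room and dropped once the buffer is full. This is a sketch under assumptions; `SheddingQueue` is a hypothetical class, and dropping the newest item is only one possible shedding policy.

```python
from collections import deque
from typing import Deque, Optional


class SheddingQueue:
    """Bounded FIFO queue that sheds (rejects) new items when it is full."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.items: Deque[str] = deque()
        self.shed_count = 0

    def offer(self, item: str) -> bool:
        """Queue the item if there is room; otherwise shed it and return False."""
        if len(self.items) >= self.capacity:
            self.shed_count += 1
            return False
        self.items.append(item)
        return True

    def poll(self) -> Optional[str]:
        """Remove and return the oldest queued item, or None if the queue is empty."""
        return self.items.popleft() if self.items else None


queue = SheddingQueue(capacity=3)
accepted = [queue.offer(f"tweet-{i}") for i in range(5)]
first = queue.poll()
print(accepted, queue.shed_count, first)
```

During a burst, the queue absorbs what it can and the shed counter records how much input was sacrificed to keep latency bounded.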
Design patterns
● QA loop
● Task assignment
● Process/verify
● Supervised learning
● Crowdwork sub-task chaining
● Humans are not a bottleneck
● Humans review every output element
http://aidr.qcri.org/
Self-service for crisis-related classification
[Diagram components: unstructured text reports; automatic classifier; categorized information; model builder; crowdsourced ground-truth; library of training data]
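One plausible reading of these components is: crowd-labeled examples feed a model builder, which produces a classifier that turns text reports into categories. The sketch below illustrates that loop with a small multinomial Naive Bayes classifier. The learning algorithm is a stand-in chosen for brevity, not the one the system is confirmed to use, and the training examples are invented.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Optional, Set, Tuple


def tokenize(text: str) -> List[str]:
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one smoothing."""

    def __init__(self) -> None:
        self.label_counts: Counter = Counter()
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        self.vocabulary: Set[str] = set()

    def train(self, examples: Iterable[Tuple[str, str]]) -> None:
        """The 'model builder' step: update counts from (text, label) pairs."""
        for text, label in examples:
            self.label_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)

    def classify(self, text: str) -> Optional[str]:
        """The 'automatic classifier' step: return the most probable label."""
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocabulary)
            for word in tokenize(text):
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical crowd-labeled examples.
ground_truth = [
    ("bridge closed due to flooding", "Infrastructure and utilities"),
    ("power lines down on main street", "Infrastructure and utilities"),
    ("donate blood at the red cross", "Donations and volunteering"),
    ("please send money to relief fund", "Donations and volunteering"),
]
classifier = NaiveBayesTextClassifier()
classifier.train(ground_truth)
label = classifier.classify("where can I donate money")
print(label)
```

Because training only updates counts, new crowd labels can be folded in incrementally by calling `train` again.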
Credibility and verification
Aditi Gupta, Ponnurangam Kumaraguru, Carlos Castillo and Patrick Meier:
TweetCred: A Real-time Web-based System for Credibility of Content on Twitter
In SocInfo 2014. Runner-up for best paper award.
Carlos Castillo, Marcelo Mendoza, Barbara Poblete:
Predicting Information Credibility in Time-Sensitive Social Media
In Internet Research, Vol. 23, Issue 5. October 2013.
A. Popoola, D. Krasnoshtan, A. Toth, V. Naroditskiy, C. Castillo, P. Meier and I. Rahwan:
Information Verification during Natural Disasters
Social Web and Disaster Management (SWDM) workshop, 2013.
http://www.youtube.com/watch?v=pAHoEO-K0Ek
Crowdsourced verification: Veri.ly
• Frame crowdwork correctly
• Not upvoting/downvoting a claim
• Instead, providing evidence for/against
@VeriDotLy — http://veri.ly/
Examples of evidence provided
Automatic credibility evaluation: TweetCred
• Real-time web-based service
• Used as a Chrome extension
• Annotates Twitter's timeline with credibility
scores
http://twitdigest.iiitd.edu.in/TweetCred/
Next steps
• Credibility facets
– Factually written
– Detailed
– Author on the ground
– ...
• Respond to searches about an event
Closing remarks
Computationally feasible
Supported by data
Useful
Good projects in this space
Temptation! Danger!
Poorly planned projects :-(
AI-complete problems
Some venues
• SWDM – Workshop on Social Web
for Disaster Management
– Deadline: January 24th
• ISCRAM – International Conference on Information Systems
for Crisis Response and Management
+ the usual suspects, depending on your area ;-)
Possibility of large impact by using computer
science to support humanitarian work
=
Applied computing at its best
Thank you!
Carlos Castillo · chato@acm.org
http://www.chato.cl/research/
With thanks to Patrick Meier for several slides

 
Sri Ganganagar Escorts 🥰 8617370543 Call Girls Offer VIP Hot Girls
Sri Ganganagar Escorts 🥰 8617370543 Call Girls Offer VIP Hot GirlsSri Ganganagar Escorts 🥰 8617370543 Call Girls Offer VIP Hot Girls
Sri Ganganagar Escorts 🥰 8617370543 Call Girls Offer VIP Hot Girls
 
💊💊 OBAT PENGGUGUR KANDUNGAN SEMARANG 087776-558899 ABORSI KLINIK SEMARANG
💊💊 OBAT PENGGUGUR KANDUNGAN SEMARANG 087776-558899 ABORSI KLINIK SEMARANG💊💊 OBAT PENGGUGUR KANDUNGAN SEMARANG 087776-558899 ABORSI KLINIK SEMARANG
💊💊 OBAT PENGGUGUR KANDUNGAN SEMARANG 087776-558899 ABORSI KLINIK SEMARANG
 
Meet Incall & Out Escort Service in D -9634446618 | #escort Service in GTB Na...
Meet Incall & Out Escort Service in D -9634446618 | #escort Service in GTB Na...Meet Incall & Out Escort Service in D -9634446618 | #escort Service in GTB Na...
Meet Incall & Out Escort Service in D -9634446618 | #escort Service in GTB Na...
 
Enhancing Consumer Trust Through Strategic Content Marketing
Enhancing Consumer Trust Through Strategic Content MarketingEnhancing Consumer Trust Through Strategic Content Marketing
Enhancing Consumer Trust Through Strategic Content Marketing
 
Jhunjhunu Escorts 🥰 8617370543 Call Girls Offer VIP Hot Girls
Jhunjhunu Escorts 🥰 8617370543 Call Girls Offer VIP Hot GirlsJhunjhunu Escorts 🥰 8617370543 Call Girls Offer VIP Hot Girls
Jhunjhunu Escorts 🥰 8617370543 Call Girls Offer VIP Hot Girls
 
SEO Expert in USA - 5 Ways to Improve Your Local Ranking - Macaw Digital.pdf
SEO Expert in USA - 5 Ways to Improve Your Local Ranking - Macaw Digital.pdfSEO Expert in USA - 5 Ways to Improve Your Local Ranking - Macaw Digital.pdf
SEO Expert in USA - 5 Ways to Improve Your Local Ranking - Macaw Digital.pdf
 
Coorg Escorts 🥰 8617370543 Call Girls Offer VIP Hot Girls
Coorg Escorts 🥰 8617370543 Call Girls Offer VIP Hot GirlsCoorg Escorts 🥰 8617370543 Call Girls Offer VIP Hot Girls
Coorg Escorts 🥰 8617370543 Call Girls Offer VIP Hot Girls
 
Jual Obat Aborsi Palu ( Taiwan No.1 ) 085657271886 Obat Penggugur Kandungan C...
Jual Obat Aborsi Palu ( Taiwan No.1 ) 085657271886 Obat Penggugur Kandungan C...Jual Obat Aborsi Palu ( Taiwan No.1 ) 085657271886 Obat Penggugur Kandungan C...
Jual Obat Aborsi Palu ( Taiwan No.1 ) 085657271886 Obat Penggugur Kandungan C...
 
Capstone slidedeck for my capstone final edition.pdf
Capstone slidedeck for my capstone final edition.pdfCapstone slidedeck for my capstone final edition.pdf
Capstone slidedeck for my capstone final edition.pdf
 
Marketing Plan - Social Media. The Sparks Foundation
Marketing Plan -  Social Media. The Sparks FoundationMarketing Plan -  Social Media. The Sparks Foundation
Marketing Plan - Social Media. The Sparks Foundation
 
Sociocosmos empowers you to go trendy on social media with a few clicks..pdf
Sociocosmos empowers you to go trendy on social media with a few clicks..pdfSociocosmos empowers you to go trendy on social media with a few clicks..pdf
Sociocosmos empowers you to go trendy on social media with a few clicks..pdf
 
+971565801893>> ORIGINAL CYTOTEC ABORTION PILLS FOR SALE IN DUBAI AND ABUDHABI<<
+971565801893>> ORIGINAL CYTOTEC ABORTION PILLS FOR SALE IN DUBAI AND ABUDHABI<<+971565801893>> ORIGINAL CYTOTEC ABORTION PILLS FOR SALE IN DUBAI AND ABUDHABI<<
+971565801893>> ORIGINAL CYTOTEC ABORTION PILLS FOR SALE IN DUBAI AND ABUDHABI<<
 
JUAL PILL CYTOTEC PALOPO SULAWESI 087776558899 OBAT PENGGUGUR KANDUNGAN PALOP...
JUAL PILL CYTOTEC PALOPO SULAWESI 087776558899 OBAT PENGGUGUR KANDUNGAN PALOP...JUAL PILL CYTOTEC PALOPO SULAWESI 087776558899 OBAT PENGGUGUR KANDUNGAN PALOP...
JUAL PILL CYTOTEC PALOPO SULAWESI 087776558899 OBAT PENGGUGUR KANDUNGAN PALOP...
 
Kayamkulam Escorts 🥰 8617370543 Call Girls Offer VIP Hot Girls
Kayamkulam Escorts 🥰 8617370543 Call Girls Offer VIP Hot GirlsKayamkulam Escorts 🥰 8617370543 Call Girls Offer VIP Hot Girls
Kayamkulam Escorts 🥰 8617370543 Call Girls Offer VIP Hot Girls
 
BVG BEACH CLEANING PROJECTS- ORISSA , ANDAMAN, PORT BLAIR
BVG BEACH CLEANING PROJECTS- ORISSA , ANDAMAN, PORT BLAIRBVG BEACH CLEANING PROJECTS- ORISSA , ANDAMAN, PORT BLAIR
BVG BEACH CLEANING PROJECTS- ORISSA , ANDAMAN, PORT BLAIR
 

Crisis Computing

  • 1. Crisis Computing: Finding relevant and credible information on social media during disasters. Big Data Analytics Conference, Delhi, India, December 2014
  • 2. January 2010 How/when did it start for me?
  • 3. Humanitarian Computing. At least 775 publications: Crisis Analysis (55); Crisis Management (309); Situational Awareness (67); Social Media (231); Mobile Phones (74); Crowdsourcing (116); Software and Tools (97); Human-Computer Interaction (28); Natural Language Processing (33); Trust and Security (33); Geographical Analysis (53). Source: http://humanitariancomp.referata.com/
  • 8. An earthquake hits a Twitter user (Carlos Castillo – chato@acm.org, http://www.chato.cl/research/)
      • When an earthquake strikes, the first tweets are posted 20-30 seconds later
      • Damaging seismic waves travel at 3-5 km/s, while network communications travel at close to light speed over fiber/copper, plus latency
      • After ~100 km, seismic waves may be overtaken by tweets about them
      http://xkcd.com/723/
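The ~100 km figure on this slide can be checked with a back-of-the-envelope calculation. The sketch below is an illustration only: it assumes the tweet's travel time is dominated by the posting delay (network propagation is treated as effectively instantaneous), and the function name and the optional latency parameter are hypothetical.

```python
def overtake_distance_km(wave_speed_km_s: float,
                         tweet_delay_s: float,
                         network_latency_s: float = 0.0) -> float:
    """Distance beyond which a tweet arrives before the seismic wave.

    The wave needs d / wave_speed seconds to reach distance d, while the
    tweet needs roughly tweet_delay + network_latency seconds regardless
    of d (assumption: propagation over fiber/copper is negligible).
    Setting both times equal and solving for d gives the result below.
    """
    return wave_speed_km_s * (tweet_delay_s + network_latency_s)


# Sweep the ranges quoted on the slide: 3-5 km/s and 20-30 s.
for speed in (3.0, 5.0):
    for delay in (20.0, 30.0):
        d = overtake_distance_km(speed, delay)
        print(f"{speed} km/s, {delay} s delay -> {d:.0f} km")
```

With the slide's ranges, the crossover falls between 60 km and 150 km, which is consistent with the "~100 km" rule of thumb.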
  • 10. Examples of crisis tweets (cont.) Source: Alexandra Olteanu, Sarah Vieweg and Carlos Castillo: What to Expect When the Unexpected Happens: Social Media Communications Across Crises. To appear in CSCW 2015.
  • 11. Fertile grounds for applied research: ✔ Problems of global significance ✔ Solved with labor-intensive methods ✔ Better solution provides a public good ✔ Large and noisy data sets available ✔ Engage volunteer communities
  • 12. Fertile grounds for applied research (cont.): the same five points, plus an open question: Relevance to practitioners?
  • 13. Current collaborators: Patrick Meier (QCRI); Sarah Vieweg (QCRI); Muhammad Imran (QCRI); Irina Temnikova (QCRI); Alexandra Olteanu (EPFL); Aditi Gupta (IIIT Delhi); “P.K.” Kumaraguru (IIIT Delhi); Fernando Diaz (Microsoft)
  • 14. Outline: Crisis Maps; Extraction; Matching; Verification; Credibility
  • 15. Crisis maps from social media. Carlos Castillo, Fernando Diaz, and Hemant Purohit: Leveraging Social Media and Web of Data to Assist Crisis Response Coordination. Tutorial at SDM, Philadelphia, PA, USA, April 2014. Hemant Purohit, Carlos Castillo, Patrick Meier and Amit Sheth: Crisis Mapping, Citizen Sensing and Social Media Analytics. Tutorial at ICWSM, May 2013.
  • 20. Patrick Meier, Social Innovation Director @ QCRI – http://irevolution.net/ “What can speed humanitarian response to tsunami-ravaged coasts? Expose human rights atrocities? Launch helicopters to rescue earthquake victims? Outwit corrupt regimes? A map.”
  • 21. Crisis mapping goes mainstream (2011)
  • 28. Understanding Crisis Tweets Alexandra Olteanu, Sarah Vieweg and Carlos Castillo: What to Expect When the Unexpected Happens: Social Media Communications Across Crises. To appear in CSCW 2015.
  • 29. Types of Disaster
  • 30. Our approach: 1. Filtering; 2. Classification; 3. Extraction
  • 31. Filtering: Is disaster-related? (Yes/No) Contributes to situational awareness? (Yes/No)
  • 32. Classification: Filtered tweets → Caution & Advice, Information Sources, Damage & Casualties, Donations, ...; Gov, Eyewitness, Media, NGO, Outsider, ...
  • 33. A large-scale study of crisis tweets
      • Collect tweets from 26 disasters
      • Classify according to: Informative / Not informative; Information provided; Information source
  • 34. Advice on labeling
      • Your instructions will never be correct the first time you try
        – e.g. personal / eyewitness
        – Instructions must be re-written reactively
        – Perform small-scale labeling first
      • Instructions must be concrete and brief
        – If you can't do it, the task has to be divided
  • 35. Information Provided in Crisis Tweets. N=26; data available at http://crisislex.org/
  • 36. What do people tweet about?
      • Affected individuals
        – 20% on average (min. 5%, max. 57%)
        – most prevalent in human-induced, focalized & instantaneous events
      • Sympathy and emotional support
        – 20% on average (min. 3%, max. 52%)
        – most prevalent in instantaneous events
      • Other useful information
        – 32% on average (min. 7%, max. 59%)
        – least prevalent in diffused events
  • 37. What do people tweet about? (cont.)
      • Infrastructure and utilities
        – 7% on average (min. 0%, max. 22%)
        – most prevalent in diffused events, in particular floods
      • Caution and advice
        – 10% on average (min. 0%, max. 34%)
        – least prevalent in instantaneous & human-induced events
      • Donations and volunteering
        – 10% on average (min. 0%, max. 44%)
        – most prevalent in natural hazards
  • 40. Extracting information and matching emergency-related resources. Muhammad Imran, Shady Elbassuoni, Carlos Castillo, Fernando Diaz and Patrick Meier: Extracting Information Nuggets from Disaster-Related Messages in Social Media. In ISCRAM, Baden-Baden, Germany, 2013. Best paper award. Hemant Purohit, Amit Sheth, Carlos Castillo, Patrick Meier, Fernando Diaz: Emergency-Relief Coord. on Social Media: Auto. Matching Resource Requests and Offers. First Monday 19 (1), January 2014. Muhammad Imran, Shady Elbassuoni, Carlos Castillo, Fernando Diaz and Patrick Meier: Practical Extraction of Disaster-Relevant Information from Social Media. In SWDM, Rio de Janeiro, Brazil, 2013.
  • 41. Information Extraction: ... Classified tweets. Example: "@JimFreund: Apparently we have no choice. There is a tornado watch in effect tonight."
  • 42. Extraction
      • #hashtags, @user mentions, URLs, etc.
        – Regular expressions
        – Text library from Twitter
      • Temporal expressions
        – Part-of-speech tagger + heuristics
        – Natty library
      • Supervised learning
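The regular-expression route for hashtags, mentions and URLs can be sketched as follows. This is a minimal illustration using only Python's standard `re` module; the patterns are deliberately simplified assumptions and are not the rules used by the Twitter text library or by the authors. The function name `extract_entities` is hypothetical.

```python
import re

# Simplified patterns (assumptions for illustration only).
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"(?<!\w)@(\w+)")   # lookbehind skips e-mail addresses
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")


def extract_entities(tweet: str) -> dict:
    """Return URLs, user mentions and hashtags found in a tweet."""
    urls = URL_RE.findall(tweet)
    # Remove URLs first so that fragments such as "page#top" are not
    # mistaken for hashtags.
    text = URL_RE.sub(" ", tweet)
    return {
        "urls": urls,
        "mentions": MENTION_RE.findall(text),
        "hashtags": HASHTAG_RE.findall(text),
    }


example = ("RT @twc_hurricane: Wind gusts over 60 mph are being reported "
           "at Central Park and JFK airport in #NYC this hour. #Sandy")
print(extract_entities(example))
```

Temporal expressions and free-text nuggets are harder to capture with fixed patterns, which is where the tagger-based heuristics and supervised learning listed on the slide come in.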
  • 43. Labels for extraction
      • Type-dependent instruction
      • Ask evaluators to copy-paste a word/phrase from each tweet
  • 44. Learning: Conditional Random Fields
      • Used extensively in NLP for part-of-speech tagging and information extraction
      • Representation of observations is important (capitalization, position, etc.)
      (Figure: HMM vs. linear-chain CRF, with hidden and observed nodes)
  • 45. Tool
      • CMU ARK Twitter NLP
        – Tokenization
        – Feature extraction
        – CRF learning
      • Very easy to use: simply change the training set (part-of-speech tags) into anything, and re-train
  • 46. Output examples
      • RT @weatherchannel: .@NYGovCuomo orders closing of NYC bridges. Only Staten Island bridges unaffected at this time. Bridges must close by 7pm. #Sandy #NYC
      • Wow what a mess #Sandy has made. Be sure to check on the elderly and homeless please! Thoughts and prayers to all affected
      • RT @twc_hurricane: Wind gusts over 60 mph are being reported at Central Park and JFK airport in #NYC this hour. #Sandy
      • RT @mitchellreports: Red Cross tells us grateful for Romney donation but prefer people send money or donate blood dont collect goods NOT best way to help #Sandy
  • 47. Extractor evaluation
      Setting | Rec | Prec
      Train 2/3 Joplin, Test 1/3 Joplin | 78% | 90%
      Train 2/3 Sandy, Test 1/3 Sandy | 41% | 79%
      Train Joplin, Test Sandy | 11% | 78%
      Train Joplin + 10% Sandy, Test 90% Sandy | 21% | 81%
      • Precision is: one word or more in common with what humans extracted
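The lenient matching criterion on this slide (an extraction counts as correct if it shares one word or more with what humans extracted) can be sketched as below. The slide only states the criterion for precision; applying the same overlap test to recall is an assumption made here, and the function names and example strings are hypothetical.

```python
def words(text: str) -> set:
    """Lower-cased whitespace tokens of a phrase."""
    return set(text.lower().split())


def overlaps(extracted: str, gold: str) -> bool:
    """True if the two phrases share at least one word."""
    return bool(words(extracted) & words(gold))


def precision_recall(pairs):
    """Compute lenient precision and recall.

    pairs: list of (extracted, gold) phrases, one per tweet; None means
    the extractor produced nothing, or the human labeled nothing.
    Assumption: recall uses the same one-word-overlap test as precision.
    """
    hits = sum(1 for e, g in pairs if e and g and overlaps(e, g))
    n_extracted = sum(1 for e, _ in pairs if e)
    n_gold = sum(1 for _, g in pairs if g)
    precision = hits / n_extracted if n_extracted else 0.0
    recall = hits / n_gold if n_gold else 0.0
    return precision, recall


# Hypothetical extractor outputs paired with hypothetical human labels.
pairs = [
    ("tornado watch in effect", "tornado watch"),        # hit
    ("bridges must close", "closing of NYC bridges"),    # hit ("bridges")
    (None, "wind gusts over 60 mph"),                    # missed by extractor
    ("send money", "donate blood"),                      # no word in common
]
precision, recall = precision_recall(pairs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Because a single shared word is enough, this measure is forgiving about span boundaries; a stricter exact-match criterion would give lower figures.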
  • 48. Donations matching
      • Identify and match requests/offers for donations
        – Money, clothing, food, shelter, volunteers, blood
      Average precision = 0.21 (0.16 if only text similarity is used)
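The text-similarity baseline mentioned on this slide can be illustrated with cosine similarity over bag-of-words vectors. This is a generic sketch, not the authors' matching method (which, per the slide, does better than text similarity alone); the function names and the example messages are made up for illustration.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Term-frequency vector from lower-cased whitespace tokens."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_offers(request: str, offers: list) -> list:
    """Return (score, offer) pairs sorted from most to least similar."""
    req = bag_of_words(request)
    scored = [(cosine(req, bag_of_words(o)), o) for o in offers]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical messages, not taken from the data set.
request = "need warm clothing and blankets for shelter"
offers = [
    "we can donate clothing and blankets",
    "giving blood at the red cross today",
    "offering free food",
]
for score, offer in rank_offers(request, offers):
    print(f"{score:.2f}  {offer}")
```

Pure word overlap misses matches that use different vocabulary for the same resource type, which is one plausible reason a text-only baseline scores lower.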
  • 49. Crowdsourced stream processing systems Muhammad Imran, Ioanna Lykourentzou and Carlos Castillo: Engineering Crowdsourced Stream Processing Systems http://arxiv.org/abs/1310.5463
  • 51. Design objectives and principles
      Design objective | Example metric | Design principle: automatic components | Design principle: crowdsourced components
      Low latency | End-to-end time | Keep items moving | Trivial tasks
      High throughput | Output items per unit of time | High-performance processing | Task automation
      Load adaptability | Rate response function | Load shedding, load queueing | Task prioritization
      Cost effectiveness | Cost vs. quality, throughput, etc. | N/A | Task frugality
      High quality | Application-dependent | Redundancy, aggregation and quality control
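The redundancy and aggregation principle listed for the high-quality objective can be illustrated with majority voting over several crowd labels for the same item. Majority voting is one common aggregation scheme and is used here as an assumption; the slide does not say which scheme the authors use. The function name and the agreement threshold are hypothetical.

```python
from collections import Counter


def aggregate_labels(labels, min_agreement=0.5):
    """Aggregate redundant crowd labels for one item by majority vote.

    Returns the winning label if strictly more than `min_agreement` of
    the workers chose it, otherwise None (meaning the item should be
    sent out for more labels or for expert review).
    """
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) > min_agreement else None


print(aggregate_labels(["donations", "donations", "caution"]))
print(aggregate_labels(["donations", "caution"]))
```

Asking for more labels per item raises quality but also cost, which is the trade-off behind the task-frugality principle in the same table.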
  • 52. Design patterns ● QA loop ● Task assignment ● Process/verify ● Supervised learning ● Crowdwork sub-task chaining ● Humans are not a bottleneck ● Humans review every output element
  • 53. http://aidr.qcri.org/
  • 54. Self-service for crisis-related classification: Unstructured text reports; Categorized information; Automatic classifier; Model Builder; Crowdsourced ground-truth; Library of training data
  • 57. Credibility and verification. Aditi Gupta, Ponnurangam Kumaraguru, Carlos Castillo and Patrick Meier: TweetCred: A Real-time Web-based System for Credibility of Content on Twitter. In SocInfo 2014. Runner-up for best paper award. Carlos Castillo, Marcelo Mendoza, Barbara Poblete: Predicting Information Credibility in Time-Sensitive Social Media. In Internet Research, Vol. 23, Issue 5, October 2013. A. Popoola, D. Krasnoshtan, A. Toth, V. Naroditskiy, C. Castillo, P. Meier and I. Rahwan: Information Verification during Natural Disasters. Social Web and Disaster Management (SWDM) workshop, 2013.
  • 62. Crowdsourced verification: Veri.ly
      • Frame crowdwork correctly
      • Not upvoting/downvoting a claim
      • Instead, providing evidence for/against
      @VeriDotLy, http://veri.ly/
  • 65. Examples of evidence provided
  • 66. Automatic credibility evaluation: TweetCred
      • Real-time web-based service
      • Used as a Chrome extension
      • Annotates Twitter's timeline with credibility scores
  • 67. http://twitdigest.iiitd.edu.in/TweetCred/
  • 68. Next steps
      • Credibility facets
        – Factually written
        – Detailed
        – Author on the ground
        – ...
      • Respond to searches about an event
  • 71. Good projects in this space: Computationally feasible; Supported by data; Useful
  • 72. Good projects in this space (cont.): Computationally feasible; Supported by data; Useful. Also labeled: Temptation!; Danger!; Poorly planned projects :-(; AI-complete problems
  • 73. Some venues
      • SWDM: Workshop on Social Web for Disaster Management (deadline: January 24th)
      • ISCRAM: International Conference on Information Systems for Crisis Response and Management
      + the usual suspects, depending on your area ;-)
  • 74. Possibility of large impact by using computer science to support humanitarian work = Applied computing at its best
  • 75. Thank you! Carlos Castillo · chato@acm.org http://www.chato.cl/research/ With thanks to Patrick Meier for several slides