Characterizing the Life Cycle of Online News Stories Using Social Media Reactions
Carlos Castillo, Mohammed El-Haddad, Matt Stempeck, Jürgen Pfeffer
Twitter: @ChaToX
Carlos Castillo – @chatox
http://www.chato.cl/research/
Outline
• Determining classes of news articles
• Predicting traffic using social media
Usage analysis in online news
• Aikat (1998)
– Short dwell times; more traffic on weekdays, less on weekends; bursty traffic
• Crane and Sornette (2008), Yang and Leskovec (2011), Lehmann et al. (2012)
– Behavioral classes of attention online
Analysis of social media responses
• SocialFlow whitepaper (Lotan, Gaffney, and Meyer 2011)
– Al Jazeera, BBC News, CNN, The Economist, Fox News and The New York Times
• Hu et al. (2011)
– Tweets during a speech by the US president
Predictive Web Analytics (references)
Data collection
• Three weeks in October 2012
• “Beacon” embedded in Al Jazeera pages
– Real-time data processing
– Apache S4 application for online processing
– Cassandra (NoSQL database) for storage
≈ 3M visits
≈ 200K social media reactions
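As a rough illustration of the kind of online processing listed above, the sketch below counts page-view events per article in fixed-width time buckets. It is a plain in-memory stand-in, not the Apache S4 application or the Cassandra schema used in the study; the event fields (a URL and a Unix timestamp) and the 5-minute bucket width are assumptions made for the example.

```python
from collections import defaultdict

BUCKET_SECONDS = 300  # assumed bucket width (5 minutes); not from the slides


def aggregate_visits(events, bucket_seconds=BUCKET_SECONDS):
    """Count visits per article URL and time bucket.

    `events` is an iterable of (url, unix_timestamp) pairs, a hypothetical
    minimal form of what a page beacon might report.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for url, timestamp in events:
        bucket = int(timestamp // bucket_seconds)
        counts[url][bucket] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {url: dict(buckets) for url, buckets in counts.items()}


# Made-up events for illustration only.
events = [
    ("/news/a", 0),
    ("/news/a", 120),
    ("/news/a", 310),
    ("/indepth/b", 50),
]
visits = aggregate_visits(events)
print(visits)
```

The per-bucket counts are the kind of time series that the visitation profiles and predictions later in the talk operate on.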
Summary of dataset
News
Examples:
• US state of Maryland abolishes death penalty (May 2nd, 2013)
• Hundreds arrested in China over 'fake' meat (May 3rd, 2013)
In-Depth
Examples:
• Spirits of Japan shrine haunt Asian relations (May 2nd, 2013)
• Interactive: Powering the Gulf (May 2nd, 2013)
News (322) In-Depth (139)
Tag clouds extracted from titles of articles
Average News profile
Average In-Depth profile
• In-Depth items grow more slowly
• In-Depth items have a longer shelf-life
• In-Depth items are shared on Facebook; News items are shared on Twitter
Typical visitation profiles (12 hours)
Decreasing (78%)
Steady (9%)
Increasing (3%)
Rebounding (10%)
Examples
Decreasing (78%):
● Almost all breaking news
● Sometimes delayed due to timezone differences, e.g. Hurricane Sandy
Steady or Increasing (12%):
● Ongoing news: Obama/Romney, worker strikes in SA, Syrian unrest
● Articles updated with supporting content
Rebounding (10%):
● Articles picked up by external sources or social media (typically a single source of traffic)
● Background articles to new developments
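To make the four profile labels concrete, here is a minimal sketch that assigns one of them to a 12-hour series of hourly visit counts. The slides do not say how the profiles were derived, so the rule below (comparing the average of the first, middle, and last thirds of the series with a 20% tolerance) is purely an assumption for illustration, and the sample series are made up.

```python
def classify_profile(hourly_visits, tolerance=0.2):
    """Label a visit series as decreasing, steady, increasing, or rebounding.

    Heuristic for illustration only: compares the mean of the first,
    middle, and last thirds of the series. `tolerance` is an assumed
    threshold, not a value from the study.
    """
    n = len(hourly_visits)
    if n < 3:
        raise ValueError("need at least three observations")
    third = n // 3
    first = sum(hourly_visits[:third]) / third
    middle = sum(hourly_visits[third:2 * third]) / third
    last = sum(hourly_visits[2 * third:]) / (n - 2 * third)

    # Rebound: traffic dips in the middle and then clearly recovers.
    if middle < first * (1 - tolerance) and last > middle * (1 + tolerance):
        return "rebounding"
    if last < first * (1 - tolerance):
        return "decreasing"
    if last > first * (1 + tolerance):
        return "increasing"
    return "steady"


# Made-up 12-hour series, one per profile.
samples = {
    "decreasing": [100, 80, 60, 50, 40, 30, 25, 20, 15, 10, 8, 5],
    "steady": [50] * 12,
    "increasing": [10, 12, 15, 18, 22, 26, 30, 35, 40, 45, 50, 55],
    "rebounding": [100, 70, 40, 20, 10, 8, 6, 5, 40, 80, 60, 30],
}
for expected, series in samples.items():
    print(expected, "->", classify_profile(series))
```

A clustering of normalized time series would be a more principled alternative; the threshold rule is used here only because it is easy to read.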
Prediction of visits
• Short-term traffic is to a large extent correlated with long-term traffic
• Social media signals are correlated with traffic and shelf-life
– More reactions → more traffic
– More discussion → longer shelf-life
• Can we predict traffic over 7 days after observing only the first 30 minutes?
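One common way to turn the short-term/long-term correlation into a prediction is a linear fit in log space. The sketch below is a minimal version of that idea, assuming a single predictor (visits in the first 30 minutes) and a 7-day target; it is not necessarily the model used in the study, and the training numbers are invented.

```python
import math


def fit_log_linear(early_visits, total_visits):
    """Least-squares fit of log(total) = a + b * log(early)."""
    xs = [math.log(v) for v in early_visits]
    ys = [math.log(v) for v in total_visits]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_total(early, a, b):
    """Predict long-term visits from early visits using the fitted line."""
    return math.exp(a + b * math.log(early))


# Made-up training data: visits in the first 30 minutes, visits after 7 days.
early = [10, 20, 40, 80]
total = [200, 400, 800, 1600]
a, b = fit_log_linear(early, total)
print(round(predict_total(30, a, b)))  # toy data is exactly 20x, so this prints 600
```

Social media counts (for example, early tweets or Facebook shares) could be added as further predictors in the same way, which would require a multiple regression instead of this single-variable fit.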
Predicting traffic and shelf-life online has a long history
• Predicting long-term behavior and half-life from short-term observations
– Observations = comments, visits, votes, …
– Behavior = total comments, total visits, …
– 10+ papers specifically on web traffic
• Bit.ly (2011, 2012)
– Studies half-life per topic and platform
Results (traffic predictions)
• Extrapolating visits: News items are more predictable than In-Depth items
• Improved predictions using social media variables
Selected variables, traffic prediction
Results (shelf-life prediction)
• Larger improvements for In-Depth articles
• Still, this is a 12-hour error in predicting something with an average of 48-72 hours
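To put the 12-hour error in proportion, the short calculation below expresses it relative to the two ends of the 48-72 hour average range. Only the three numbers come from the slide; the function name is made up for the example.

```python
def relative_error(error_hours, average_hours):
    """Error as a fraction of the average quantity being predicted."""
    return error_hours / average_hours


error_hours = 12
for average_hours in (48, 72):
    share = relative_error(error_hours, average_hours)
    print(f"{average_hours} h average: {share:.0%} relative error")
```

So the error is roughly one sixth to one quarter of the average shelf-life.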
http://fast.qcri.org/
What did we learn?
• Visit profiles: Decreasing; Steady or Increasing; Rebounding
– Roughly 80:10:10 ratio
• News vs In-Depth: different behavior
• Social media signals are useful to understand and predict visits
Invitation: ECML/PKDD Discovery Challenge 2014
• Open competition on predictive Web Analytics
• Data provided by Chartbeat Inc.
Thank you!
Carlos Castillo · chato@acm.org
http://www.chato.cl/research/