Tweeting for
Hillary
Li Meng, Matt Beaulieu, ML Tlachac, Yousef Fadila
DS 501 : Introduction To Data Science – Case Study 1: Collecting Data from Twitter
https://github.com/yousef-fadila/casestudy1/blob/master/CaseStudy1.ipynb
“The more compelling campaign
is a direct result of better data
collection, analysis and smart
decision making”
-PromptCloud
Motivation
Social media is a means of getting political news and initiating
political discussion
Being able to interpret election-related data would
give a campaign manager live feedback on how their
candidate's actions are likely to affect polling
This allows them to gain an advantage by reacting
to changing political climates
The Data
Pulled about 15.5K tweets from the
Twitter Streaming API
Filtered on:
Language: en
Tweets mentioning @HillaryClinton
Can then process hashtags,
mentions, and relevant words, to
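The filter described on this slide can be sketched as a small predicate over tweet objects. This is a minimal sketch, not the notebook's code: it assumes each tweet is a dictionary shaped like the Twitter JSON payload (a `lang` field and `entities.user_mentions[].screen_name`), and the function name and sample tweets are hypothetical.

```python
def mentions_account(tweet, screen_name="HillaryClinton"):
    """Return True for English tweets that mention the given account."""
    if tweet.get("lang") != "en":
        return False
    mentions = tweet.get("entities", {}).get("user_mentions", [])
    wanted = screen_name.lower()
    return any(m.get("screen_name", "").lower() == wanted for m in mentions)


# Hypothetical sample tweets, for illustration only.
sample = [
    {"id": 1, "lang": "en",
     "entities": {"user_mentions": [{"screen_name": "HillaryClinton"}]}},
    {"id": 2, "lang": "es",
     "entities": {"user_mentions": [{"screen_name": "HillaryClinton"}]}},
    {"id": 3, "lang": "en", "entities": {"user_mentions": []}},
]

kept = [t for t in sample if mentions_account(t)]
print([t["id"] for t in kept])
```

Matching on the lowercased screen name keeps the check robust to differences in capitalization between the filter and the payload.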
Most Frequent Words
Appearances Word
1240 trump
915 hillary
113 benghazi
346 cant
142 didnt
252 doesnt
146 poorest
117 trumps
130 wont
259 pneumonia
87 footing
192 liar
232 donors
541 dont
45 dnc
Appearances Word
245 thats
91 isnt
41 tweet
63 ive
85 nypd
142 systematically
66 whats
68 cough
61 hypocrisy
32 dishonesty
103 crooked
40 theres
47 stamina
66 unfit
30 scum
Types of Frequent Words
1. Opponent: trump, trumps
2. Criticism: unfit, liar, hypocrisy
3. Topics: bodyguards, benghazi, poorest, blackmail, pneumonia,
audiobooks
4. Patterns: cant, doesnt, didnt, wont, dont, isnt
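A word-frequency table like the one above can be produced with a counter over cleaned tweet text. The sketch below is an assumption about the preprocessing, not the authors' code: it deletes punctuation before splitting, which would explain tokens such as cant and dont in the Patterns group. The function names and the optional stopword set are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase, drop URLs, mentions and hashtags, then delete punctuation."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"[@#]\w+", " ", text)       # remove @mentions and #hashtags
    text = re.sub(r"[^a-z\s]", "", text)       # "can't" becomes "cant"
    return text.split()


def word_counts(texts, stopwords=frozenset()):
    """Count word appearances across all texts, skipping stopwords."""
    counts = Counter()
    for text in texts:
        counts.update(w for w in tokenize(text) if w not in stopwords)
    return counts


counts = word_counts(["Trump won't stop", "trump doesn't care"], stopwords={"stop"})
print(counts.most_common(3))
```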
Popular Tweets
Entity Popularity
Popular Mentions with @HillaryClinton
Screen Name Mentions
HillaryClinton 15421
RealDonaldTrump 2718
FoxNews 1532
POTUS 503
CNN 481
politico 283
timkaine 263
FLOTUS 245
MSNBC 244
USAneedsTRUMP 235
Popular #hashtags with @HillaryClinton
Hashtag Count
#MAGA 385
#ImWithHer 351
#SpecialReport 209
#NeverHillary 178
#DNCLeak 177
#HispanicHeritageMonth 163
#tcot 156
#Trump 149
#TrumpPence16 125
#HillaryHealth 102
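Mention and hashtag tallies of this kind can be computed directly from the entities attached to each tweet. This is a hedged sketch that assumes the Twitter JSON layout (`entities.user_mentions[].screen_name` and `entities.hashtags[].text`); the function name and sample data are hypothetical.

```python
from collections import Counter


def entity_counts(tweets):
    """Tally mentioned screen names and hashtags across a list of tweets."""
    mentions, hashtags = Counter(), Counter()
    for tweet in tweets:
        entities = tweet.get("entities", {})
        mentions.update(m["screen_name"] for m in entities.get("user_mentions", []))
        hashtags.update("#" + h["text"] for h in entities.get("hashtags", []))
    return mentions, hashtags


# Hypothetical sample tweets, for illustration only.
sample = [
    {"entities": {"user_mentions": [{"screen_name": "HillaryClinton"}],
                  "hashtags": [{"text": "ImWithHer"}]}},
    {"entities": {"user_mentions": [{"screen_name": "HillaryClinton"},
                                    {"screen_name": "CNN"}],
                  "hashtags": [{"text": "MAGA"}, {"text": "ImWithHer"}]}},
]

mentions, hashtags = entity_counts(sample)
print(mentions.most_common(10))
print(hashtags.most_common(10))
```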
Hillary’s Friends
ID Screen Name
571202103 Medium
21337440 ChildDefender
23449384 amberdiscko
128790234 Samynemir
1656913327 sarajacobs89
325886383 SammyKoppelman
802430450 Natasha_S_Law
729761993461248000 ktvibbs
115740215 SarahAudelo
34782406 Lincoln_Ross
3044781131 HillaryforAR
113298560 GunaRockYa
15972271 CdotDukes
582037089 MiguelAyala312
734768872625188864 AndrewBatesNC
41021335 TroyClair
4736170399 BrianZuzenak
150885854 SarahPeckVA
231673 yianni
125083946 GillDrummond
● Communication Directors
● Charities
● Media Websites
● United States Senators
● etc.
Sentiment Analysis
Using Python’s NLTK text classifier, we classified each tweet as “Positive”,
“Negative”, or “Neutral”.
This could give an idea of how Twitter users felt about Hillary Clinton
[Chart: tweets by sentiment class: Positive, Neutral, Negative]
Geographic Analysis
Using the “positivity” of each tweet, we formed a ratio of positive to
negative tweets and compared it to national polling data, to see how
tweet hashtags related to polling data, if at all.
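One way to read the ratio described here is as the share of positive tweets among positive and negative tweets, grouped by state. The sketch below takes that reading as an assumption (the slides do not give the exact formula), ignores neutral tweets, and uses hypothetical names and sample data.

```python
from collections import defaultdict


def positive_share_by_state(labeled_tweets):
    """Map each state to positive / (positive + negative), ignoring neutral tweets."""
    tallies = defaultdict(lambda: {"Positive": 0, "Negative": 0})
    for state, label in labeled_tweets:
        if label in ("Positive", "Negative"):
            tallies[state][label] += 1
    return {
        state: t["Positive"] / (t["Positive"] + t["Negative"])
        for state, t in tallies.items()
    }


# Hypothetical (state, sentiment label) pairs, for illustration only.
sample = [
    ("MA", "Positive"), ("MA", "Negative"), ("MA", "Positive"),
    ("TX", "Negative"), ("TX", "Neutral"),
    ("OH", "Neutral"),
]

shares = positive_share_by_state(sample)
print(shares)
```

States with only neutral tweets are left out rather than assigned a share, which avoids a division by zero and makes thin data visible when the result is lined up against polling figures.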
Sentiment Analysis on Text
Hashtags in Positive Tweets Count
#HispanicHeritageMonth 118
#ImWithHer 107
#MAGA 72
#tcot 65
#Democrats 50
#RedNationRising 46
#WakeUpAmerica 43
#NeverHillary 32
#HillaryClinton 31
Hashtags in Negative Tweets Count
#ImWithHer 74
#LatinosWithTrump 51
#AmericansUnitedForTrump 49
#MAGA 42
#NeverHillary 39
#CrookedHillary 38
● Broke down the most popular hashtags in
positive and negative tweets
● Some hashtags, in either table, seemed out
of place
● This could be part of the source of error in the
sentiment classification
Sentiment analysis on Hashtags
Manually identify positive and negative hashtags, then use them to
find the popular words in tweets containing those hashtags, in order
to re-train the NLTK algorithm
Positive Hashtags include...
● NeverTrump
● Hillary2016
● StrongerTogether
● Vote
● UnitedBlue
Negative Hashtags include...
● MAGA
● NeverHillary
● CrookedHillary
● LatinoswithTrump
● AmericansUnitedwithTrump
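The hashtag-based labeling step can be sketched as follows. This is an illustration under assumptions, not the notebook's code: the tag sets are the lists above in lowercase, tweets carrying tags from both lists are skipped, and the function names are hypothetical. The resulting word lists are the kind of input that could then be used to re-train a text classifier such as NLTK's.

```python
import re
from collections import Counter

# Hashtag lists from the slide, lowercased for case-insensitive matching.
POSITIVE_TAGS = {"nevertrump", "hillary2016", "strongertogether", "vote", "unitedblue"}
NEGATIVE_TAGS = {"maga", "neverhillary", "crookedhillary",
                 "latinoswithtrump", "americansunitedwithtrump"}


def label_by_hashtags(text):
    """Return "Positive" or "Negative" from known hashtags, else None."""
    tags = {t.lower() for t in re.findall(r"#(\w+)", text)}
    has_pos, has_neg = bool(tags & POSITIVE_TAGS), bool(tags & NEGATIVE_TAGS)
    if has_pos and not has_neg:
        return "Positive"
    if has_neg and not has_pos:
        return "Negative"
    return None  # no known hashtag, or conflicting hashtags


def top_words(texts, n=10):
    """Most common words in hashtag-labeled tweets, per label."""
    by_label = {"Positive": Counter(), "Negative": Counter()}
    for text in texts:
        label = label_by_hashtags(text)
        if label is None:
            continue
        cleaned = re.sub(r"[#@]\w+|https?://\S+", " ", text.lower())
        by_label[label].update(re.findall(r"[a-z]+", cleaned))
    return {label: counts.most_common(n) for label, counts in by_label.items()}


result = top_words(["great rally #StrongerTogether",
                    "great great lies #CrookedHillary"], n=1)
print(result)
```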
Conclusions
Word frequency analysis revealed tweets relevant to Clinton, and issues that
she could consider addressing, or at least be aware are being discussed.
Judging tweets by positive or negative sentiment gave mixed results.
Training the positive and negative classifier on positive or negative hashtags
proved more insightful.
Ultimately, 15.5K tweets is not enough data, especially when separating it by
state.
Twitter has great potential to be useful to campaigns.
Thank You
Questions?
Source code and Charts: https://github.com/yousef-fadila/casestudy1/blob/master/CaseStudy1.ipynb
