Evaluation Datasets for Twitter Sentiment Analysis
A survey and a new dataset, the STS-Gold

Hassan Saif, Miriam Fernandez, Yulan He and Harith Alani
Knowledge Media Institute, The Open University,
Milton Keynes, United Kingdom

1st Workshop on Emotion and Sentiment in Social and
Expressive Media: Approaches and Perspectives from AI
Outline
• Definition & Background
• Evaluation Datasets for Twitter Sentiment Analysis
• STS-Gold
• Comparative Study
• Conclusion
Sentiment Analysis – Definition
Sentiment Analysis
“Sentiment analysis is the task of identifying
positive and negative opinions, emotions and
evaluations in text”

Examples:
• "The main dish was delicious" → Positive
• "It is a Syrian dish" → Neutral
• "The main dish was salty and horrible" → Negative
Twitter Sentiment Analysis (Background)
• Sentiment Approaches: Supervised, Unsupervised, Hybrid
• Sentiment Levels: Tweet-level, Phrase-level, Entity-level
• Sentiment Tasks: Subjectivity, Polarity, Sentiment Strength, Emotion/Mood
Evaluation Datasets for Twitter Sentiment Analysis
Each dataset is described by:
• SA Level
• SA Task
• No. of Tweets
• Construction & Annotation
• Vocabulary Size
• Class Distribution
• Sparsity
Evaluation Datasets – Overview

Dataset                                       | SA Level         | SA Task                 | Annotation/Agreement
Stanford Twitter Corpus (STS)                 | Tweet            | Subjectivity            | Manual/UD
Health Care Reform (HCR)                      | Tweet/Target     | Subjectivity            | Manual/UD
Obama-McCain Debate (OMD)                     | Tweet            | Polarity*               | Manual/α=0.655
Sentiment Strength Twitter Dataset (SS-Tweet) | Tweet            | Strength/Subjectivity** | Manual/α≈0.56
Sanders Twitter Dataset                       | Tweet            | Subjectivity            | Manual/UD
Dialogue Earth Twitter Corpus (WAB, GASP)     | Tweet/Target     | Subjectivity            | Manual/UD
SemEval-2013 Dataset                          | Tweet/Expression | Subjectivity            | Manual/UD
What is Missing?
• Details about the annotation methodology (STS, HCR, Sanders)
• Entity-level Sentiment Evaluation:
  – Most works focus on assessing the performance of sentiment classifiers at the tweet level (STS, OMD, SS-Tweet, Sanders)
  – Datasets that allow for sentiment evaluation at the entity level assign similar sentiment labels to the tweet and the entities within it (HCR, WAB, GASP)
• Enables the evaluation at both the entity and tweet levels
• Tweets and entities are annotated independently
• Contains 58 Entities & 3000 Tweets
Data Collection
• STS Corpus: 180K Tweets
• Entity-Extraction (Alchemy API)
• Identify Frequent Concepts
• Select Top & Mid Frequent Entities → 28 Entities
• Select 100 Tweets/Entity → 2800 Tweets
• +200 tweets → 3000 Tweets
• Entity-Extraction → 147 Entities
• Result: STS-Gold
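The selection steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not define how "Top & Mid Frequent" entities are chosen, so the function `pick_top_and_mid`, its parameter `k`, the helper `sample_tweets`, and the toy mention list are all hypothetical. The sketch handles a single concept; the slides apply the selection across several concepts.

```python
import random
from collections import Counter


def pick_top_and_mid(entity_mentions, k=2):
    """Return the k most frequent entities plus k entities from the
    middle of the frequency ranking (assumed reading of "Top & Mid")."""
    ranked = [entity for entity, _ in Counter(entity_mentions).most_common()]
    top = ranked[:k]
    # Start the mid-frequency window around the median rank,
    # never overlapping the top-k entities.
    mid_start = max(k, len(ranked) // 2 - k // 2)
    mid = ranked[mid_start:mid_start + k]
    return top + mid


def sample_tweets(tweets_by_entity, entities, per_entity=100, seed=0):
    """Randomly draw up to per_entity tweets for each selected entity."""
    rng = random.Random(seed)
    sampled = {}
    for entity in entities:
        pool = tweets_by_entity.get(entity, [])
        sampled[entity] = rng.sample(pool, min(per_entity, len(pool)))
    return sampled


# Hypothetical toy input: entity mentions extracted from a tweet collection.
mentions = ["a"] * 6 + ["b"] * 5 + ["c"] * 4 + ["d"] * 3 + ["e"] * 2 + ["f"]
selected = pick_top_and_mid(mentions, k=2)

# Hypothetical tweet pools, five tweets per selected entity.
pools = {e: [f"tweet {i} about {e}" for i in range(5)] for e in selected}
sample = sample_tweets(pools, selected, per_entity=3)
print(selected)
```

Fixing the random seed keeps the sample reproducible, which matters when a gold-standard set is meant to be rebuilt or audited later.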
STS-Gold: selected entities by concept
• Person: Obama, Taylor Swift, LeBron, Oprah
• City: Vegas, London, Seattle, Sydney
• Company: YouTube, Facebook, McDonalds, Starbucks
• Technology: iPod, iPhone, Xbox, PSP
• Organization: Lakers, Cavs, NASA, UN
• Country: England, US, Brazil
• Health Condition: Headache, Flu, Cancer, Fever
Data Annotation
• Input: 3000 Tweets, 147 Entities
• Annotation tool: Tweenator.com
• Sentiment Classes: Positive, Negative, Neutral, Mixed, Other
• Inter-annotator Agreement:
  – Tweet: α=0.765
  – Entity: α1=0.416, α2=0.964
• After filtering, STS-Gold contains 2205 Tweets and 58 Entities
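The slides report agreement only as α. Assuming this denotes Krippendorff's alpha for nominal labels (an assumption, since the slides do not name the coefficient), a minimal sketch of the computation could look as follows. The function name `krippendorff_alpha_nominal` and the toy ratings are illustrative and are not the STS-Gold annotations.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels.

    units: one list of labels per item, holding the labels given by the
    annotators who rated that item.
    """
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # a single label carries no agreement information
        # Every ordered pair of labels from different annotators counts
        # 1 / (m - 1) towards the coincidence matrix.
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(totals[a] * totals[b]
                   for a in totals for b in totals if a != b)
    if expected == 0:
        raise ValueError("alpha is undefined when only one label is used")
    return 1.0 - (n - 1) * observed / expected


# Toy ratings from two hypothetical annotators on ten items.
first = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
second = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
alpha = krippendorff_alpha_nominal([list(pair) for pair in zip(first, second)])
print(round(alpha, 3))
```

Values near 1 indicate strong agreement, values near 0 indicate agreement no better than chance.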
Comparative Study
• Vocabulary Size
• Number of Tweets
• Data Sparsity
• Classification Performance
  – Polarity Classification
  – Naïve Bayes & Maximum Entropy
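As a rough illustration of polarity classification with Naïve Bayes, the sketch below implements a multinomial model with add-one smoothing using only the standard library. It is not the classifier configuration used in the study: the class name `NaiveBayes`, the whitespace tokenization, and the two-sentence training set (taken from the definition slide) are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(count / total_docs)
            denom = sum(self.word_counts[label].values()) + vocab_size
            for tok in tokens:
                if tok in self.vocab:  # tokens unseen in training are skipped
                    score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


train_docs = [
    "the main dish was delicious".split(),
    "the main dish was salty and horrible".split(),
]
train_labels = ["positive", "negative"]
model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict("delicious dish".split()))
print(model.predict("horrible dish".split()))
```

Working in log space avoids numeric underflow when tweets contain many tokens.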
Comparative Study.1
Vocabulary Size vs. No. of Tweets
- There is a high correlation between the vocabulary size and the number of
tweets (ρ = 0.95)
- However, increasing the number of tweets does not always increase the
vocabulary size (OMD)
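A correlation of this kind can be computed from per-dataset statistics. The sketch below assumes ρ is Pearson's coefficient, which the slides do not state, and the helper names `dataset_stats` and `pearson` as well as the numbers in `toy_sizes` and `toy_vocab` are hypothetical placeholders rather than figures from the study.

```python
import math


def dataset_stats(tokenized_tweets):
    """Return (number of tweets, vocabulary size) for one dataset."""
    vocabulary = set()
    for tokens in tokenized_tweets:
        vocabulary.update(tokens)
    return len(tokenized_tweets), len(vocabulary)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical (tweets, vocabulary size) pairs, one per dataset.
toy_sizes = [500, 1000, 2000, 4000]
toy_vocab = [2100, 3900, 3500, 9800]
print(round(pearson(toy_sizes, toy_vocab), 2))
```

In the toy numbers the third dataset has more tweets but a smaller vocabulary than the second, which mirrors the point that a high overall correlation does not rule out individual exceptions.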
Comparative Study.2
Data Sparsity
- Dataset sparsity is an important factor that affects the performance of
machine learning classifiers [17].
- Twitter datasets are generally very sparse.
- Increasing either the number of tweets or the vocabulary size increases the
sparsity degree of the dataset:
  - ρno_of_tweets = 0.71
  - ρvocabulary_size = 0.77
- The sparsity degree of a given dataset is calculated as [13]:

$$S_d = 1 - \frac{\sum_{i}^{n} N_i}{n \times |V|}$$

where $N_i$ is the number of distinct words in tweet $i$, $n$ is the number of
tweets in the dataset and $|V|$ the vocabulary size.
- The TweetNLP tokenizer can be downloaded from http:...TweetNLP/
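The sparsity degree formula translates directly into code. The sketch below is a minimal illustration: the function name `sparsity_degree` is hypothetical, and lowercased whitespace splitting stands in for the TweetNLP tokenizer, so values will differ from those computed with a real tokenizer. The example sentences are the ones from the definition slide.

```python
def sparsity_degree(tokenized_tweets):
    """Return S_d = 1 - sum_i(N_i) / (n * |V|) for a list of token lists."""
    n = len(tokenized_tweets)
    vocabulary = set()
    distinct_total = 0  # running sum of N_i, the distinct words per tweet
    for tokens in tokenized_tweets:
        distinct = set(tokens)
        vocabulary |= distinct
        distinct_total += len(distinct)
    if n == 0 or not vocabulary:
        raise ValueError("need at least one non-empty tweet")
    return 1.0 - distinct_total / (n * len(vocabulary))


examples = [
    "The main dish was delicious",
    "It is a Syrian dish",
    "The main dish was salty and horrible",
]
# Assumption: simple whitespace tokenization instead of a Twitter tokenizer.
tokenized = [text.lower().split() for text in examples]
print(round(sparsity_degree(tokenized), 3))
```

Here the three texts contain 5, 5 and 7 distinct words over a vocabulary of 12, so the result is 1 − 17/36. Real tweet collections have far larger vocabularies relative to tweet length, which pushes the value close to 1.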
Comparative Study.3
Classification Performance vs. Dataset Sparsity (1)
According to Makrehchi et al. (2008) and Saif et al. (2012), in a given dataset the
classification performance and the sparsity degree are negatively correlated, i.e.,
increasing the dataset sparsity hinders the classification performance.

[Figure, from M. Makrehchi and M.S. Kamel: Fig. 2. Classifier performance as a function of sparsity: (a) Rocchio, and (b) SVM. x-axis: Average Sparsity; y-axis: Average Classifier Performance; series: Industry Sectors, 20 newsgroups, Reuters.]
Comparative Study.3
Classification Performance vs. Dataset Sparsity (2)
- There is no correlation between the classification performance and the sparsity degree
across the datasets (ρacc = −0.06, ρf1 = 0.23).
- The sparsity-performance correlation is intrinsic, meaning that it might exist within
a dataset itself, but not necessarily across datasets.
Conclusion
• Current datasets to evaluate Twitter sentiment classifiers:
  – Focus on the tweet level.
  – Assign similar sentiment labels to the tweets and the entities within them.
• STS-Gold allows for sentiment evaluation at both the tweet and the entity levels.
• A correlation between the vocabulary size and the number of tweets does not always exist.
• The sparsity-performance correlation is intrinsic, i.e., it only exists within the dataset itself, but not across the different datasets.

Thank You
Email: hassan.saif@open.ac.uk
Twitter: hrsaif
Website: tweenator.com
