Semantic Patterns for Sentiment 
Analysis of Twitter 
Hassan Saif, Yulan He, Miriam Fernandez and Harith Alani 
The 13th International Semantic Web Conference (ISWC2014) 
May 2014
Outline
o Sentiment Analysis 
o Traditional Sentiment Analysis 
o Pattern-based Sentiment Analysis 
o Semantic Sentiment Patterns 
o Evaluation 
o Results 
o Conclusion
Sentiment Analysis 
“Sentiment analysis is the task of identifying 
positive and negative opinions, emotions and 
evaluations in text” 
Example tweets: 
• “Nooo, it is very humid :(” (Opinion) 
• “The weather is great today :)” (Opinion) 
• “I think its almost 30 degrees today” (Fact)
Traditional Sentiment Analysis 
(1) The Lexicon-based Approach 
Tweet: “Just got my new iPhone 6, looks and feel great! :D” 
Sentiment Lexicon: great, sad, down, wrong 
(2) The Machine Learning Approach 
Training Features: 
– Syntactic features (letter n-grams, word n-grams, POS tags, etc.) 
– Linguistic features (synonyms, glosses, etc.)
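As a rough illustration of the lexicon-based approach, the sketch below sums prior polarities of lexicon words found in a tweet. The four lexicon words come from the slide; the +1/-1 polarity values, the tokenizer, and the function names `lexicon_score` and `lexicon_label` are assumptions made for this example and are not taken from the original work.

```python
import re

# Toy lexicon: the four words appear on the slide; the +1/-1 polarities are assumed.
LEXICON = {"great": 1, "sad": -1, "down": -1, "wrong": -1}


def lexicon_score(tweet: str) -> int:
    """Sum the prior polarity of every lexicon word found in the tweet."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return sum(LEXICON.get(tok, 0) for tok in tokens)


def lexicon_label(tweet: str) -> str:
    """Turn the summed score into a coarse sentiment label."""
    score = lexicon_score(tweet)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(lexicon_label("Just got my new iPhone 6, looks and feel great! :D"))
```

Because each word is scored in isolation, a scorer of this kind ignores how the words in a tweet relate to one another.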
Traditional Sentiment Analysis 
However... 
Sentiment is often expressed via more subtle relations, 
patterns and dependencies among words in tweets: 
“Destroy Invading Germs” 
Destroy (Negative) + Invading Germs (Negative Concept) = Positive Sentiment
Pattern-based Sentiment Analysis 
Syntactic Pattern Approaches 
Semantic Pattern Approaches
Syntactic Pattern Approaches 
• Based on syntactic relations between words. 
• Rely on predefined POS templates: 
– <subject> passive-verb, e.g. <customer> was satisfied 
– <subject> active-verb, e.g. <she> complained 
• But they are semantically weak! The template <subject> verb cold matches both: 
– <beer> is cold 
– <weather> is cold
Semantic Pattern Approaches 
• Apply syntactic and semantic processing techniques 
• Use external semantic resources (Ontologies, Semantic 
Networks, etc.) 
• Capture the conceptual semantic relations in text that implicitly 
convey sentiment 
– Happy birthday (Positive) 
– Invading Germs (Negative)
Syntactic & Semantic Pattern Approaches are not tailored to Twitter
Syntactic & Semantic Pattern Approaches 
are designed to function on formal text, that is, text that is: 
1. Long enough 
2. Well-structured 
3. Made of formal sentences
Tweets are often: 
• Short! 
• Noisy and messy 
• Made of informal and ill-structured sentences
We Propose... 
• A pattern-based approach 
• Works on Twitter 
• Does not rely on the syntactic structures of tweets or pre-defined 
syntactic templates 
• Does not rely on semantic knowledge sources 
• Automatically extracts patterns from the 
contextual semantic and sentiment similarities of 
words in tweets
Contextual Semantics and Sentiment 
Contextual Semantics 
• Contextual semantics refers to semantics inferred 
from words’ co-occurrences in tweets. 
“Words that occur in similar context tend to have similar meaning” 
Wittgenstein (1953) 
[Figure: two contexts of “Trojan Horse”. 
Context 1: Threat, Hack, Dangerous, Code, Program, Harm, Malware. 
Context 2: Greek Tale, History, Troy, Wooden Class.]
Contextual Semantic Sentiment Patterns 
“Some words in different tweets tend to come with similar contextual semantics 
and sentiment, therefore forming specific clusters or patterns.” 
[Figure: “Trojan Horse” surrounded by its context words: Threat, Hack, Code, Dangerous, Spyware, Program, Harm, Malware]
Contextual Semantic Sentiment Patterns 
[Figure: the context words Threat, Hack, Code, Dangerous, Spyware, Program, Harm and Malware around “Trojan Horse” form a Negative Contextual Pattern; C_Semantics(Worms), C_Semantics(Adware) and C_Semantics(Time bombs) each follow this pattern]
Pattern Extraction 
Pipeline: 
1. Syntactical Preprocessing of tweets 
2. Capturing the Contextual Semantics and Sentiment of words 
3. Extracting Semantic Sentiment Patterns 
[Figure: Tweets -> Syntactical Preprocessing -> Capturing Contextual Semantics & Sentiment (with a Sentiment Lexicon) -> Bag of SentiCircles -> Extracting Semantic Sentiment Patterns -> Bag of SS-Patterns]
(1) Syntactical Preprocessing 
• Replace all URL links with the term “URL” 
• Remove all non-ASCII and non-English characters 
• Revert words that contain repeated letters to their original English form 
– e.g. “maaadddd” is converted to “mad” after processing
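The three preprocessing steps above could be sketched roughly as follows. This is a minimal sketch, not the authors' implementation: the regular expressions, the tiny `VOCAB` word list, and the function names `restore_word` and `preprocess` are all assumptions for illustration. Only non-ASCII characters are dropped here; detecting non-English text is left out.

```python
import re

# Hypothetical stand-in for an English word list; a real system would use a full dictionary.
VOCAB = {"mad", "good", "so", "cool"}

URL_RE = re.compile(r"(?:https?://|www\.)\S+")
REPEAT_RE = re.compile(r"([a-z])\1{2,}")  # a lowercase letter repeated three or more times


def restore_word(word: str) -> str:
    """Shrink an elongated word, preferring a form that is found in VOCAB."""
    if not REPEAT_RE.search(word):
        return word
    two = REPEAT_RE.sub(r"\1\1", word)  # e.g. "goood" -> "good"
    one = REPEAT_RE.sub(r"\1", word)    # e.g. "maaadddd" -> "mad"
    return two if two in VOCAB else one


def preprocess(tweet: str) -> str:
    text = URL_RE.sub("URL", tweet)                        # step 1: replace links
    text = text.encode("ascii", "ignore").decode("ascii")  # step 2: drop non-ASCII characters
    return " ".join(restore_word(w) for w in text.split())  # step 3: fix elongated words


print(preprocess("I am sooo maaadddd http://t.co/abc"))
```

Collapsing to two letters first keeps legitimate double letters such as those in “good”; words whose runs need mixed treatment would require a fuller dictionary search than this sketch performs.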
(2) Capturing Contextual Semantics & Sentiment 
The SentiCircle Approach 
Each context term Ci of a term m (e.g. “Trojan Horse”) is placed on a circle using: 
• ri = TDOC(Ci) (Degree of Correlation) 
• θi = Prior_Sentiment(Ci) * π (Prior Sentiment) 
• xi = ri * cos(θi), yi = ri * sin(θi) 
[Figure: SentiCircle of “Trojan Horse”, both axes ranging from -1 to +1, with regions Very Positive, Positive, Neutral, Negative and Very Negative; context terms shown include useful, discover, easily, fix, dangerous, destroy, threat, malicious and attack] 
Overall Contextual Sentiment (Senti-Median)
Saif, H., Fernandez, M., He, Y. and Alani, H. (2014) SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter, ESWC2014
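The polar-to-Cartesian mapping on this slide (ri = TDOC(Ci), θi = Prior_Sentiment(Ci) * π) can be sketched as below. The TDOC values and prior sentiment scores are invented numbers for illustration, the function name `senticircle_point` is hypothetical, and how TDOC itself is computed is not covered by this sketch.

```python
import math


def senticircle_point(tdoc: float, prior_sentiment: float) -> tuple[float, float]:
    """Map one context term to (x, y) using r = TDOC and theta = prior * pi."""
    theta = prior_sentiment * math.pi
    return tdoc * math.cos(theta), tdoc * math.sin(theta)


# Hypothetical context terms of "Trojan Horse": (TDOC radius, prior sentiment in [-1, 1]).
context = {
    "dangerous": (0.9, -0.6),
    "threat": (0.7, -0.5),
    "useful": (0.3, 0.4),
}

circle = {term: senticircle_point(r, s) for term, (r, s) in context.items()}
for term, (x, y) in circle.items():
    print(f"{term:10s} x={x:+.3f} y={y:+.3f}")
```

With this mapping, terms with a negative prior sentiment land below the x-axis (y < 0) and terms with a positive prior land above it, while the radius reflects how strongly the term correlates with the target term.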
(3) Extracting Semantic Sentiment Patterns 
Patterns are extracted by finding clusters of similar SentiCircles: 
(1) Represent each SentiCircle (e.g. those of iPod, Spyware, Oprah, Obama) as a feature vector of Geometry, Density and Dispersion features 
(2) Cluster the SentiCircles’ feature vectors with K-means to obtain the SS-Patterns
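The clustering step could look roughly like the plain k-means sketch below. The three numbers per term are placeholders standing in for geometry, density and dispersion features (the slides do not define how those are computed), the term list is illustrative, and seeding centroids with the first k points is a simplification chosen to keep the example deterministic.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment


# Hypothetical SentiCircle feature vectors (placeholder values, not real data).
features = {
    "spyware": (-0.80, 0.60, 0.20),
    "ipod": (0.70, 0.50, 0.30),
    "adware": (-0.70, 0.50, 0.25),
    "worms": (-0.75, 0.65, 0.15),
    "oprah": (0.60, 0.40, 0.35),
}

labels = dict(zip(features, kmeans(list(features.values()), k=2)))
print(labels)
```

Each resulting cluster id plays the role of one SS-Pattern: terms sharing an id are treated as following the same pattern.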
Evaluation 
SS-Patterns are used as features for training sentiment classifiers on two tasks: 
• Entity-level Sentiment Analysis: detect the sentiment (Positive, Negative, Neutral) of named entities extracted from tweets 
• Tweet-level Sentiment Analysis: detect the overall sentiment (Positive, Negative) of a tweet
Evaluation Setup (1) 
Sentiment Classifiers 
– Tweet-Level 
• Maximum Entropy (MaxEnt) 
• Naïve Bayes (NB) 
– Entity-Level 
• MLE Classifier
Evaluation Setup (2) 
Datasets 
– Tweet-Level: 9 Twitter datasets 
– Entity-Level: 58 manually annotated named entities
Evaluation Setup (3) 
Baseline Features 
Syntactic Features 
– Unigrams: individual unique terms in tweets 
– POS Features: words’ part-of-speech tags 
– Twitter Features: usernames, emoticons, hashtags, etc. 
– Lexicon Features: prior sentiment of words in a given sentiment lexicon (e.g., great -> positive, destroy -> negative) 
Semantic Features 
– LDA-Topic Features: topics generated by LDA 
– Semantic Concepts: semantic concepts of named entities in tweets (e.g., Obama -> Person, London -> City)
Results
Tweet-Level Sentiment Analysis (1) 
• The baseline model is a sentiment classifier trained 
on word unigram features. 
• MaxEnt outperforms NB in average Accuracy and 
F1-measure
Tweet-Level Sentiment Analysis (2) 
Win/Loss in Accuracy and F-measure of using different features for sentiment 
classification on all nine datasets.
Entity-Level Sentiment Analysis 
SS-Patterns produce 6.31% and 7.5% higher accuracy and F-measure, respectively, than other features 
[Figure: bar chart of Accuracy and F1 (axis from 55.00 to 67.00) for Unigrams, LDA-Topics, Semantic Concepts and SS-Patterns]
Within-Pattern Sentiment Consistency 
• Refers to the percentage of words having 
similar sentiment within a given pattern. 
• Strongly consistent patterns are those whose 
terms have similar sentiment.
Within-Pattern Sentiment Consistency 
• STS-Entity Dataset: 
– 58 Entities, 14 SS-Patterns 
– Consistency(Pattern12) = 88.89% (Strongly Consistent) 
– Consistency(Pattern5) = 50% (Poorly Consistent) 
– Average Sentiment Consistency (14 SS-Patterns) = 88%
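One plausible reading of within-pattern consistency is the share of a pattern's words that carry its most common sentiment label. The sketch below follows that reading; the majority-label interpretation, the function name `consistency`, and the nine-word example pattern are assumptions, since the slides do not list the members of any pattern.

```python
from collections import Counter


def consistency(labels):
    """Percentage of words carrying the pattern's most common sentiment label."""
    if not labels:
        raise ValueError("pattern has no words")
    majority_count = Counter(labels).most_common(1)[0][1]
    return 100.0 * majority_count / len(labels)


# Hypothetical pattern of nine entities, eight of them negative.
pattern = ["negative"] * 8 + ["positive"]
print(f"{consistency(pattern):.2f}%")  # prints 88.89%
```

Under this reading, a pattern split evenly between two labels scores 50%, the lowest possible value for a two-label pattern.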
Conclusion 
• We proposed a new approach for automatically extracting patterns 
from the contextual semantic and sentiment similarities of words in 
tweets. 
• We used the patterns as features in tweet- and entity-level sentiment 
classification tasks. 
• SS-Patterns consistently outperformed the syntactic and semantic 
types of features for entity- and tweet-level sentiment analysis. 
• We conducted a quantitative analysis on a sample of our extracted SS-Patterns 
and showed that our patterns are strongly consistent with 
the sentiment of the words within them.
Thank You 
Email: hassan.saif@open.ac.uk 
Twitter: hrsaif 
Website: tweenator.com
