This document summarizes the timeline of sentiment analysis and entity linking tasks for social media text from 2013-2016. It describes the SENTIPOLC shared task at Evalita 2014 on sentiment classification of tweets, which had the highest participation. The challenges of analyzing sentiment for figurative language and domain dependence of supervised systems are discussed. Integrating entity linking and sentiment analysis on shared Twitter datasets is proposed for Evalita 2016.
Deep Tweets Entity Linking Sentiment Analysis
1. Deep Tweets: from Entity Linking
to Sentiment Analysis
Pierpaolo Basile, Valerio Basile, Malvina Nissim, Nicole Novielli
{pierpaolo.basile,nicole.novielli}@uniba.it
{v.basile,m.nissim}@rug.nl
2. Timeline of Tasks
SemEval‘13
Sentiment Analysis
in Twitter
SemEval‘14
- Sentiment
Analysis in Twitter
- Aspect Based
Sentiment Analysis
Evalita 2014
SENTIPOLC
SemEval‘15
- Implicit Polarity of Events
- Sentiment Analysis in Twitter
- Sentiment Analysis of Figurative
Language in Twitter
- Aspect Based Sentiment Analysis
SemEval‘16
- Sentiment Analysis in Twitter
- Aspect Based Sentiment Analysis
- Detecting Stance in Tweets
3. SENTIPOLC @Evalita 2014
• Tasks
– Subjectivity Classification
– Polarity Classification (most popular)
– Irony Detection
• Best system supervised (Uniba)
– Two rule-based systems (Unibo, Ca’ Foscari-Venezia)
– All ML systems supervised
• Most popular task at Evalita 2014
– 11 Teams
– 35 Submitted runs (only from research institutions)
– Interest from industry
4. Timeline of Tasks
#Micropost2014
Named Entity Extraction
and Linking (NEEL)
#MSM2013
Concept Extraction
Challenge
SemEval‘13
Sentiment Analysis
in Twitter
SemEval‘14
- Sentiment
Analysis in Twitter
- Aspect Based
Sentiment Analysis
Evalita 2014
SENTIPOLC
SemEval‘15
- Implicit Polarity of Events
- Sentiment Analysis in Twitter
- Sentiment Analysis of Figurative
Language in Twitter
- Aspect Based Sentiment Analysis
#Micropost2015
Named Entity Extraction
and Linking (NEEL)
SemEval’15
Multilingual All-Words
Sense Disambiguation
and Entity Linking
SemEval‘16
- Sentiment Analysis in Twitter
- Aspect Based Sentiment Analysis
- Detecting Stance in Tweets
6. Entity-Based Sentiment Analysis
• Detecting the sentiment attached to an entity
in a tweet
• Stance detection
• Relevant for modelling socio-economic
phenomena
– Mining political sentiment, predicting election
results
– Commercial application
– Health issues
7. Annotation of Entities
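@FabioClerici sono altri a dire che un reato.
E il "politometro" come lo chiama #Grillo vale
per tutti. Anche per chi fa #antipolitica.
(English: "@FabioClerici it's others who say it's a crime. And the 'politometer', as #Grillo calls it, applies to everyone. Even to those who do #antipolitics.")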
FabioClerici (offsets 1-13)
linked as NIL
(no resources in DBpedia)
Grillo (offsets 85-91)
linked with the respective URI in DBpedia:
http://dbpedia.org/resource/Beppe_Grillo
8. Challenge-oriented Sentiment
Analysis?
• Prevalence of supervised ML systems in both
SemEval and Evalita
• Beyond the challenge, are they valid in the
real world?
– Domain-dependence and low temporal validity
– Political debates: countries afflicted by war
– Technology: ‘killer’ features in positive reviews
10. Sentiment Analysis of Figurative
Language
• Complex relation between sentiment and
figurative language
– Irony mainly acts as a polarity reverser
– Metaphor, sarcasm and other linguistic devices
might impact sentiment in different ways
• Necessary treatment: > 20% of tweets show
some form of figurative usage (irony/sarcasm)
11. Annotation of Irony
• Extension of the SENTIPOLC schema
subj  pos  neg  irony  opos  oneg  Description
1     1    0    1      0     1     Subjective tweet; positive literal polarity; negative overall polarity
Botta di ottimismo a #lInfedele: Governo
Monti, o la va o la spacca
(English: "A shot of optimism on #lInfedele: Monti government, it's make or break")
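The extended schema can be sketched in code to show how the literal and overall polarity fields relate. This is only an illustrative sketch: the class and function names are hypothetical, and it assumes that irony always swaps the literal polarity, which is a simplification (elsewhere the slides only say irony "mainly" acts as a polarity reverser).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SentipolcLabel:
    """One row of the extended schema; every field is 0 or 1."""
    subj: int   # subjective tweet
    pos: int    # positive literal polarity
    neg: int    # negative literal polarity
    irony: int  # ironic tweet
    opos: int   # positive overall polarity
    oneg: int   # negative overall polarity


def overall_from_literal(subj: int, pos: int, neg: int, irony: int) -> SentipolcLabel:
    """Derive overall polarity from literal polarity.

    Assumption for illustration only: an ironic tweet swaps pos and neg.
    Real annotation is done by humans and need not follow this rule.
    """
    if irony:
        opos, oneg = neg, pos
    else:
        opos, oneg = pos, neg
    return SentipolcLabel(subj, pos, neg, irony, opos, oneg)


# The annotated example: subjective, literally positive, ironic.
example = overall_from_literal(subj=1, pos=1, neg=0, irony=1)
print(example)
```

Under that assumption the function reproduces the annotation shown for the example tweet: literal polarity positive, overall polarity negative.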
12. Resources
• SENTIPOLC Dataset [1]
– Train set using tweets about political topic
• TWITA [2]
– Expand train set
– Test (no political topic)
• Italian dataset of manually annotated tweets
for Named Entity Linking [3]
– Add sentiment annotation
1 - http://www.di.unito.it/~tutreeb/sentipolc-evalita14/data.html (Basile et al., 2014)
2 - http://valeriobasile.github.io/twita/about.html (Basile and Nissim 2013)
3 - https://github.com/swapUniba/neel-it-twitter (Basile et al., @CLIC 2015)
13. Conclusion and Open Issues
• Entity linking and sentiment analysis on Twitter are
challenging, attractive, and timely tasks for the
Italian NLP community
– Options: running the two tasks on shared data?
– How does SA differ in message- and entity-level?
Techniques, features, results.
– How to deal with the layer of figurative language?
– How is annotation affected?
• How to prevent challenge-bound systems?
– Train and test set from different domains
– Multiple runs of submission
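One way to picture the "train and test set from different domains" option is a split keyed on a per-tweet domain label. The sketch below is hypothetical: the `domain` field, its values, and the toy texts are assumptions made for illustration and are not taken from any of the datasets mentioned in the slides.

```python
from typing import Dict, List, Tuple

Tweet = Dict[str, str]


def split_by_domain(tweets: List[Tweet], train_domain: str) -> Tuple[List[Tweet], List[Tweet]]:
    """Put every tweet of `train_domain` in the train set and all others in the test set."""
    train = [t for t in tweets if t["domain"] == train_domain]
    test = [t for t in tweets if t["domain"] != train_domain]
    return train, test


# Toy data with made-up domain labels.
tweets = [
    {"text": "tweet about an election", "domain": "politics"},
    {"text": "tweet about a phone", "domain": "technology"},
    {"text": "tweet about a debate", "domain": "politics"},
    {"text": "tweet about a football match", "domain": "sport"},
]

train, test = split_by_domain(tweets, train_domain="politics")
print(len(train), "training tweets;", len(test), "test tweets")
```

A system evaluated this way cannot rely on topic-specific vocabulary seen during training, which is the point of the proposal.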
The goal of entity linking is to automatically extract entities from text and link them to the corresponding entries in taxonomies and/or knowledge bases such as DBpedia or Freebase. The annotated data for entity linking tasks (such as the one we propose) typically include the start and end offsets of the entity mention in the tweet, the entity type, which belongs to one of the categories defined in the target taxonomy, and the link to the corresponding DBpedia resource or to a NIL reference.
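The annotation fields described above can be sketched as a small record type. This is a minimal sketch under stated assumptions, not the format of any released dataset: the class and helper names are hypothetical, offsets are assumed to be 0-based and end-exclusive (which is consistent with "FabioClerici (offsets 1-13)" after the leading "@"), the tweet is a shortened stand-in, and the "Person" type label is illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EntityAnnotation:
    start: int          # offset of the first character of the mention
    end: int            # offset one past the last character (assumed end-exclusive)
    entity_type: str    # category from the target taxonomy (illustrative values here)
    uri: Optional[str]  # DBpedia resource URI, or None for a NIL reference

    def mention(self, text: str) -> str:
        """Return the surface form covered by the offsets."""
        return text[self.start:self.end]

    @property
    def is_nil(self) -> bool:
        return self.uri is None


def annotate(text: str, surface: str, entity_type: str, uri: Optional[str]) -> EntityAnnotation:
    """Build an annotation for the first occurrence of `surface` in `text`."""
    start = text.index(surface)
    return EntityAnnotation(start, start + len(surface), entity_type, uri)


# Shortened, hypothetical tweet; not the full example from the slides.
tweet = "@FabioClerici come lo chiama #Grillo"
annotations = [
    annotate(tweet, "FabioClerici", "Person", None),  # no DBpedia resource: NIL
    annotate(tweet, "Grillo", "Person", "http://dbpedia.org/resource/Beppe_Grillo"),
]

for a in annotations:
    link = "NIL" if a.is_nil else a.uri
    print(a.mention(tweet), (a.start, a.end), a.entity_type, link)
```

Keeping the offsets rather than the surface string lets the same record be reused when a sentiment label is later attached to the entity mention.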