This document summarizes research on using convolutional neural networks for sentiment analysis on Italian tweets. It describes training a CNN model using word embeddings and sentiment-specific embeddings generated from tweets. The best performing model used sentiment-specific embeddings trained on 500k tweets, with filters of sizes 7, 8, 9, 10. This model achieved an F-score of 0.6837 on the binary sentiment classification task, outperforming the official run and models using plain word embeddings. The research demonstrated that CNNs and sentiment-specific embeddings are effective for sentiment analysis of Italian tweets.
CNN Sentiment Analysis of Italian Tweets
1. Convolutional Neural Networks for Sentiment Analysis on Italian Tweets
Giuseppe Attardi, Daniele Sartiano, Chiara Alzetta, Federica Semplici
Dipartimento di Informatica, Università di Pisa
2. Task 2: Polarity Classification
G. Attardi, D. Sartiano (2016) SemEval 2016, Task 4
[Figure: Convolutional Neural Network architecture. The example tweet "Not going to the beach tomorrow :-(" is represented as embeddings for each word, fed to a convolutional layer with multiple filters, followed by max-over-time pooling and a multilayer perceptron with dropout.]
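The pipeline in the figure (word embeddings, a convolutional layer with multiple filter widths, max-over-time pooling, feeding an MLP with dropout) can be sketched in plain Python. The MLP head and dropout are omitted here, and all sizes and weights are illustrative, not the trained model's.

```python
import math
import random

random.seed(0)

def conv_max_pool(embeddings, filter_widths, n_filters=4):
    """Slide filters of several widths over the word-embedding matrix
    (one row per word), compute a tanh activation per window position,
    and max-pool each feature map over time."""
    n_words = len(embeddings)
    dim = len(embeddings[0])
    features = []
    for w in filter_widths:
        for _ in range(n_filters):
            # random filter weights, shape (w, dim); illustrative only
            W = [[random.gauss(0, 0.1) for _ in range(dim)] for _ in range(w)]
            fmap = [
                math.tanh(sum(W[i][d] * embeddings[p + i][d]
                              for i in range(w) for d in range(dim)))
                for p in range(n_words - w + 1)
            ]
            # max-over-time pooling keeps the strongest activation
            features.append(max(fmap))
    return features

# toy "tweet": 7 words with 50-dimensional embeddings (sizes are illustrative)
tweet = [[random.gauss(0, 1) for _ in range(50)] for _ in range(7)]
feats = conv_max_pool(tweet, filter_widths=[2, 3, 5])
print(len(feats))  # 3 widths x 4 filters = 12 pooled features
```

Each (width, filter) pair contributes one pooled feature, so the feature vector length is independent of tweet length, which is what lets the MLP sit on top of variable-length input.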
3. Training the network
Plain Word Embeddings:
- word2vec on 167 million Italian tweets
- parameters: embedding size 300, window dimension 5, discarding words with frequency < 5
- 450k word embeddings obtained
Sentiment-Specific Word Embeddings:
- starting from the plain word embeddings
- inject the sentiment polarity of texts into the embeddings
- positive and negative tweets identified by emoticons
- more negative tweets than positive tweets
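Labeling tweets as positive or negative by emoticon presence can be sketched as below; the emoticon sets are illustrative, not the actual lists used in the paper.

```python
import re

# Illustrative emoticon sets; the paper's actual lists are not given here.
POS_RE = re.compile(r'(:-?\)|:-?D|;-?\))')
NEG_RE = re.compile(r'(:-?\(|:-?/)')

def emoticon_polarity(tweet):
    """Label a tweet positive/negative by emoticon presence; tweets with
    no emoticon, or with emoticons of both polarities, are discarded (None)."""
    pos = bool(POS_RE.search(tweet))
    neg = bool(NEG_RE.search(tweet))
    if pos and not neg:
        return 'positive'
    if neg and not pos:
        return 'negative'
    return None

print(emoticon_polarity('Not going to the beach tomorrow :-('))  # negative
```

Discarding tweets that match both polarities keeps the distant labels cleaner at the cost of corpus size.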
4. Distant Supervision
Silver corpus created as follows:
- randomly choose at most 10k tweets per class (mixed and neutral classes added)
- select tweets that are assigned the same class by:
  1. emoticon presence (regular-expression match)
  2. a classifier trained on the task training set (gold)
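The agreement filter above can be sketched as follows; `heuristic` and `classifier` are stand-ins for the emoticon matcher and the gold-trained classifier, not the paper's actual components.

```python
def build_silver_corpus(tweets, heuristic, classifier, max_per_class=10_000):
    """Keep a tweet only when the emoticon heuristic and the gold-trained
    classifier agree on its class; cap each class at max_per_class."""
    silver = {}
    for tweet in tweets:
        h = heuristic(tweet)
        c = classifier(tweet)
        if h is not None and h == c:
            bucket = silver.setdefault(h, [])
            if len(bucket) < max_per_class:
                bucket.append(tweet)
    return silver

# toy stand-ins: they agree only on tweets containing "good"
heur = lambda t: 'positive' if 'good' in t else 'negative'
clf = lambda t: 'positive' if 'good' in t else 'neutral'
corpus = build_silver_corpus(['good day', 'bad day', 'good food'], heur, clf)
print(corpus)  # {'positive': ['good day', 'good food']}
```

Requiring agreement between two independent labelers trades recall for precision, which is the usual bargain in distant supervision.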
5. Experiments
Extensive experiments with various configurations of the classifier:
- filters
- plain or sentiment-specific word embeddings
- gold or silver training set
Best settings:

                Run 1 (WE skipgram)        Run 2 (SWE)
Training set    Gold       Silver          Gold                    Silver
Filters         2, 3, 5    4, 5, 6, 7      7, 7, 7, 7, 8, 8, 8, 8  7, 8, 9, 10
6. Results
Top official results for polarity classification.
The extended silver corpus did not help, possibly because the resulting corpus was still unbalanced.

System       Positive F-score   Negative F-score   Combined F-score
UniPI_2.c    0.685              0.6426             0.6638
team1_1.u    0.6354             0.6885             0.662
team1_2.u    0.6312             0.6838             0.6575
team4_.c     0.644              0.6605             0.6522
team3_.1.c   0.6265             0.6743             0.6504
team5_2.c    0.6426             0.648              0.6453
team3_.2.c   0.6395             0.6469             0.6432
UniPI_1.u    0.6699             0.6146             0.6422
UniPI_1.c    0.6766             0.6002             0.6384
UniPI_2.u    0.6586             0.5654             0.612
7. New Results

UniPI_2.c run               Positive   Negative   Combined F-score
official run                0.685      0.6426     0.6638
plain embeddings            0.6851     0.6612     0.6731
SE 200k tweets, 25 epochs   0.6779     0.6826     0.6803
SE 500k tweets, 4 epochs    0.6818     0.6856     0.6837
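The combined F-score column is consistent (to rounding) with the mean of the positive and negative F-scores, the usual convention for this task; a quick check:

```python
# Combined F-score as the mean of the positive and negative F-scores
# (assuming the standard macro-averaged convention for this task).
runs = {
    'official run':             (0.685,  0.6426),
    'plain embeddings':         (0.6851, 0.6612),
    'SE 200k tweets 25 epochs': (0.6779, 0.6826),
    'SE 500k tweets 4 epochs':  (0.6818, 0.6856),
}
combined = {name: (pos + neg) / 2 for name, (pos, neg) in runs.items()}
for name, score in combined.items():
    print(f'{name}: {score:.4f}')
```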
8. Conclusions
The experiments confirmed the effectiveness of Convolutional Neural Networks for Twitter sentiment classification, also for the Italian language.
Sentiment-specific embeddings proved to be effective for sentiment classification.
Editor's Notes
We are going to talk about the results of task 2, polarity classification.
We used a Deep Learning approach, i.e. a Convolutional Neural Network
The same neural network was used at SemEval 2016 for English tweets.
We now apply the same approach to Italian tweets.
The architecture of the ConvNet is composed of 4 steps described in the picture
Architecture: the neural network is trained:
- once with word embeddings (created with word2vec)
- once with sentiment-specific word embeddings.
For both types we used preprocessed tweet text:
classic sentence splitting, tokenization and normalization of elements not useful for the task, such as URLs, mentions and numbers.
The corpus is a collection of 167 million Italian tweets, enlarged with a further 1.3 million tweets from Integris.
We created word embeddings with these parameters and obtained 450k of them.
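The normalization step can be sketched with a few regular expressions; the placeholder tokens below are illustrative, since the exact ones used in the paper are not specified.

```python
import re

def normalize_tweet(text):
    """Replace elements not useful for the task (URLs, mentions, numbers)
    with placeholder tokens."""
    text = re.sub(r'https?://\S+', '<URL>', text)      # URLs
    text = re.sub(r'@\w+', '<MENTION>', text)          # user mentions
    text = re.sub(r'\d+([.,]\d+)?', '<NUM>', text)     # integers and decimals
    return text

print(normalize_tweet('@utente guarda http://t.co/xyz alle 18,30'))
# <MENTION> guarda <URL> alle <NUM>
```

Collapsing these elements into shared tokens keeps the vocabulary small, so the frequency cutoff (freq < 5) does not discard them.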
Sentiment-specific word embeddings are word embeddings created from the same corpus, but with the sentences labeled with polarity.
We defined the polarity of tweets, positive or negative, based on the emoticons that appear in the text of the tweet.
Some emoticons are unambiguous; for the others we inspected a sample of tweets where they appear and made a decision.
Positive tweets are much more frequent than negative tweets.
Since
1. we noticed that the polarity distribution of the gold training set is skewed, and
2. the training set is quite small,
we created a silver corpus with distant supervision to add more training examples.
To create the silver corpus we selected no more than 10,000 tweets from each class.
We first assigned the polarity to the tweets with regular expressions looking for emoticons, this time belonging to 4 classes.
We assigned 2 new labels (mixed and neutral) based on the annotation of the gold corpus.
The 2 new classes were added to the original corpus.
Then we took the same tweets and tried to classify them using the classifier trained on the task training set.
If the two techniques assign the same label (both positive, both negative, etc.), then the tweet can be used for the silver corpus.
As a result the silver corpus is still unbalanced, with very few 'mixed' examples.
At this point we were able to run the classifier with many different configurations, varying several parameters.
In the end we found that the settings in the table are the best.
In the end our approach obtained the best score for polarity classification.
We also applied the same approach to subjectivity classification, without performing extensive experiments.
We again obtained quite successful results, even though not the top score.
In conclusion, our experiments confirm the effectiveness of CNNs and sentiment-specific embeddings for sentiment classification of Italian tweets.