Convolutional Neural Networks for
Sentiment Analysis on Italian Tweets
Giuseppe Attardi, Daniele
Sartiano, Chiara Alzetta,
Federica Semplici
Dipartimento di Informatica
Università di Pisa
Task 2. Polarity Classification
G. Attardi, D. Sartiano (2016) SemEval 2016, Task 4
[Figure: Convolutional Neural Network. The example tweet "Not going to the beach tomorrow :-(" passes through embeddings for each word, a convolutional layer with multiple filters, max-over-time pooling, and a multilayer perceptron with dropout.]
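The feature-extraction part of the pipeline in the figure can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the embedding size of 4, the random weights, the ReLU activation, the zero vector for unknown words, and the function names are all assumptions, and the multilayer perceptron with dropout is left out.

```python
import random

def relu(x):
    return x if x > 0.0 else 0.0

def max_over_time(embedded, filt):
    """Slide one filter (width x dim weights) over the tweet and keep the largest activation."""
    width = len(filt)
    activations = []
    for start in range(len(embedded) - width + 1):
        window = embedded[start:start + width]
        score = sum(w * x
                    for f_row, x_row in zip(filt, window)
                    for w, x in zip(f_row, x_row))
        activations.append(relu(score))
    return max(activations)

def tweet_features(tokens, embeddings, filters, dim):
    """Embed each token, convolve with every filter, pool to one value per filter."""
    unknown = [0.0] * dim  # assumption: zero vector for out-of-vocabulary words
    embedded = [embeddings.get(tok, unknown) for tok in tokens]
    widest = max(len(f) for f in filters)
    embedded += [unknown] * max(0, widest - len(embedded))  # pad short tweets
    return [max_over_time(embedded, f) for f in filters]

# Toy setup: random embeddings and one random filter per width (2, 3 and 5 words).
rng = random.Random(0)
dim = 4
tokens = "Not going to the beach tomorrow :-(".split()
embeddings = {tok: [rng.uniform(-1, 1) for _ in range(dim)] for tok in tokens}
filters = [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(width)]
           for width in (2, 3, 5)]
features = tweet_features(tokens, embeddings, filters, dim)
```

Max-over-time pooling yields a fixed-length vector (one value per filter) regardless of tweet length, which is what lets a standard multilayer perceptron sit on top.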
Training the network
Plain Word Embeddings
• Word2vec on 167 million Italian tweets
• Parameters:
  • embeddings size 300
  • window dimension 5
  • discarding words with freq < 5
• 450k word embeddings obtained
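To make the two corpus-level parameters concrete, the sketch below shows what a minimum frequency of 5 and a window of 5 mean for skip-gram training data. It only illustrates the parameters; it is not the word2vec tool itself (which also shrinks windows randomly and subsamples frequent words), and the function names are hypothetical.

```python
from collections import Counter

def build_vocab(tokenized_tweets, min_count=5):
    """Keep only words seen at least min_count times in the corpus."""
    counts = Counter(tok for tweet in tokenized_tweets for tok in tweet)
    return {word for word, count in counts.items() if count >= min_count}

def skipgram_pairs(tweet, vocab, window=5):
    """Return (center, context) pairs within `window` positions of each other."""
    kept = [tok for tok in tweet if tok in vocab]  # rare words are dropped first
    pairs = []
    for i, center in enumerate(kept):
        lo = max(0, i - window)
        hi = min(len(kept), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, kept[j]))
    return pairs
```

Each pair is one training example in which the center word's vector is used to predict the context word; the embedding size (300 here) is the length of that vector.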
Sentiment Specific WE
• Starting from plain WE
• Sentiment polarity of texts incorporated into the embeddings
• Positive and negative tweets identified based on emoticons
• More negative tweets than positive tweets
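Emoticon-based labelling of this kind can be sketched with two regular expressions. The patterns below are illustrative assumptions, not the emoticon lists actually used, and discarding tweets that contain both kinds of emoticon is also an assumption.

```python
import re

# Hypothetical emoticon patterns: :) ;) =) :-) :D and :( :-( :'(
POSITIVE = re.compile(r"[:;=]-?[)D]")
NEGATIVE = re.compile(r":'?-?\(")

def emoticon_polarity(text):
    """Return 'positive', 'negative', or None when there is no clear signal."""
    has_positive = bool(POSITIVE.search(text))
    has_negative = bool(NEGATIVE.search(text))
    if has_positive and not has_negative:
        return "positive"
    if has_negative and not has_positive:
        return "negative"
    return None  # no emoticon, or conflicting emoticons
```

For example, `emoticon_polarity("che noia :-(")` returns `"negative"`, while a tweet without emoticons returns `None` and is left unlabelled.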
Distant Supervision
• Silver corpus created as follows:
  • Randomly choose max 10k tweets per class (mixed and neutral added)
  • Select tweets which are assigned same class by:
    1. emoticon presence (RE match)
    2. classifier trained using the task trainset (gold).
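The selection procedure above can be sketched as an agreement filter between two labelling functions. Both `rule_label` and `model_label` are hypothetical callables standing in for the emoticon matcher and the gold-trained classifier, and sampling before filtering is an assumption about the order of the two steps.

```python
import random
from collections import defaultdict

def build_silver_corpus(tweets, rule_label, model_label,
                        max_per_class=10_000, seed=0):
    """Keep tweets on which the emoticon rule and the classifier agree."""
    by_class = defaultdict(list)
    for text in tweets:
        label = rule_label(text)
        if label is not None:
            by_class[label].append(text)

    rng = random.Random(seed)
    silver = []
    for label in sorted(by_class):
        candidates = by_class[label]
        # Randomly choose at most max_per_class tweets for this class.
        sample = rng.sample(candidates, min(max_per_class, len(candidates)))
        # Keep only the tweets the classifier assigns to the same class.
        silver.extend((text, label) for text in sample
                      if model_label(text) == label)
    return silver
```

Because the per-class cap is applied before the agreement filter in this sketch, classes on which the two labellers rarely agree end up much smaller than the cap, so the result can stay unbalanced.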
Experiments
• Extensive experiments with various configurations of the classifier:
  • filters
  • plain or sentiment specific word embeddings
  • gold or silver training set.
• Best settings:

              Run 1                      Run 2
Embeddings    WE skipgram                SWE
Training set  Gold       Silver          Gold                    Silver
Filters       2, 3, 5    4, 5, 6, 7      7, 7, 7, 7, 8, 8, 8, 8  7, 8, 9, 10
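One way to organise experiments over these three dimensions is an exhaustive sweep. The sketch below is an assumption about how such a sweep could be structured, not a description of how the experiments were run; the candidate filter sets are illustrative and `evaluate` is a hypothetical callable returning a development-set score for a configuration.

```python
from itertools import product

EMBEDDINGS = ("WE skipgram", "SWE")
TRAINING_SETS = ("gold", "silver")
FILTER_SETS = ((2, 3, 5), (4, 5, 6, 7), (7, 8, 9, 10))  # illustrative candidates

def configurations():
    """Yield every combination of embeddings, training set and filter widths."""
    for emb, train, filt in product(EMBEDDINGS, TRAINING_SETS, FILTER_SETS):
        yield {"embeddings": emb, "training_set": train, "filters": filt}

def best_configuration(evaluate):
    """Return the configuration with the highest score under `evaluate`."""
    return max(configurations(), key=evaluate)
```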
Results
• Top official results for polarity classification
• The extended silver corpus did not help, possibly because the resulting corpus was still unbalanced.

System       Positive F-score  Negative F-score  Combined F-score
UniPI_2.c    0.685             0.6426            0.6638
team1_1.u    0.6354            0.6885            0.662
team1_2.u    0.6312            0.6838            0.6575
team4_.c     0.644             0.6605            0.6522
team3_.1.c   0.6265            0.6743            0.6504
team5_2.c    0.6426            0.648             0.6453
team3_.2.c   0.6395            0.6469            0.6432
UniPI_1.u    0.6699            0.6146            0.6422
UniPI_1.c    0.6766            0.6002            0.6384
UniPI_2.u    0.6586            0.5654            0.612
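The combined scores in the table are consistent with the arithmetic mean of the positive and negative F-scores. The short check below verifies this for a few rows; treating the combined score as a plain mean is an inference from the numbers, and the function names are hypothetical.

```python
# (positive F, negative F, reported combined F), copied from the results table
OFFICIAL_RESULTS = {
    "UniPI_2.c": (0.685, 0.6426, 0.6638),
    "team1_1.u": (0.6354, 0.6885, 0.662),
    "team1_2.u": (0.6312, 0.6838, 0.6575),
    "UniPI_1.u": (0.6699, 0.6146, 0.6422),
    "UniPI_1.c": (0.6766, 0.6002, 0.6384),
    "UniPI_2.u": (0.6586, 0.5654, 0.612),
}

def combined_f(positive_f, negative_f):
    """Average the per-polarity F-scores."""
    return (positive_f + negative_f) / 2

def check_table(rows, tolerance=1e-4):
    """Map each system to whether its reported score matches the mean."""
    return {name: abs(combined_f(pos, neg) - reported) <= tolerance
            for name, (pos, neg, reported) in rows.items()}
```

The tolerance absorbs the rounding of the published figures to four decimal places.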
New Results
Unipi_2c                  Positive  Negative  F-score
official run              0.685     0.6426    0.6638
plain embeddings          0.6851    0.6612    0.6731
SE 200k tweets 25 epochs  0.6779    0.6826    0.6803
SE 500k tweets 4 epochs   0.6818    0.6856    0.6837
Conclusions
• The experiments confirmed the validity of Convolutional Neural Networks in Twitter sentiment classification, also for the Italian language.
• Sentiment Embeddings proved to be effective for sentiment classification

Editor's Notes

  1. We are going to talk about the results of Task 2, polarity classification. We used a deep learning approach, i.e. a Convolutional Neural Network. The same neural network was used at SemEval 2016 for English tweets; we now try the same approach on Italian tweets. The architecture of the ConvNet is composed of the 4 steps described in the picture.
  2. Architecture: the neural network is trained once with word embeddings (created with word2vec) and once with sentiment specific word embeddings. For both types we used preprocessed tweet text: classic sentence splitting, tokenization, and normalization of the elements not useful for the task, like URLs, mentions and numbers. The corpus is a collection of 167 million Italian tweets, enlarged with a further 1.3 million tweets from Integris. We created word embeddings with these parameters and obtained 450k of them. Sentiment word embeddings are word embeddings created from the same corpus, but with the sentences labeled with polarity. We defined the polarity of tweets, positive or negative, based on the emoticons that appear in the text of the tweet. Some emoticons are clear; for the others we observed a sample of tweets where they appear and then decided. Positive tweets are much more frequent than negative tweets. Since (1) we noticed that the polarity distribution of the gold training set is skewed too, and (2) the training set is quite small, we created a silver corpus with distant supervision to add to it.
  3. To create the silver corpus we selected no more than 10,000 tweets from each class. We first assigned the polarity to the tweets with regular expressions looking for emoticons, this time belonging to 4 classes. We assigned 2 new labels (mixed and neutral) based on the annotation of the gold corpus, and the 2 new classes were added to the original corpus. Then we took the same tweets and classified them using the classifier trained on the task trainset. If the two techniques assign the same label (both positive, both negative, and so on), then the tweet can be used for the silver corpus. As a result the silver corpus is still unbalanced, with very few 'mixed' examples.
  4. At this point we were able to run the classifier with many different configurations, varying different parameters. In the end we found that the settings in the table are the best.
  5. In the end our approach obtained the best score for polarity classification. We also applied the same approach to subjectivity classification, without performing extensive experiments. We again obtained quite successful results, even though not the top score.
  6. In conclusion we can say that our experiments confirm…
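
The normalization step mentioned in note 2 (URLs, mentions and numbers) could be sketched as follows. The regular expressions, the placeholder tokens `<url>`, `<user>` and `<num>`, and the whitespace tokenization are all assumptions made for illustration; the notes do not say which replacements or tokenizer were used.

```python
import re

URL = re.compile(r"https?://\S+|www\.\S+")
MENTION = re.compile(r"@\w+")
NUMBER = re.compile(r"\b\d+(?:[.,]\d+)*\b")

def normalize(text):
    """Replace URLs, mentions and numbers with placeholders, then split on whitespace."""
    text = URL.sub("<url>", text)          # URLs first, so digits inside them are not touched
    text = MENTION.sub("<user>", text)
    text = NUMBER.sub("<num>", text)
    return text.split()                    # simplification of real tokenization
```

Collapsing these items into a few placeholder tokens keeps the vocabulary small, since individual URLs, user names and numbers rarely carry polarity on their own.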