11/18/2015 Analyze Twitter Data
with Hortonworks
Hadoop
Intermediate Project Report
Bharat Khanna
UNIVERSITY AT BUFFALO
Sentiment Analysis of Mr. Narendra Modi’s Brand Image using Twitter Data
Summary: - I am doing sentiment analysis of Mr. Narendra Modi’s brand image across
different nations using data from Twitter. To fetch the Twitter data, I am using Apache
Flume, which is open source and comes installed by default in the Hortonworks sandbox
platform 1.3.
After the data is fetched from Twitter, it is loaded directly into HDFS (Hadoop Distributed
File System). This avoids the extra overhead of transferring the data from the local
system to HDFS.
Data loaded into HDFS is still in an unstructured format and is not suitable for ad-hoc
analysis, so I will convert the JSON data to a tabular format and store it in Hive. I will
also provide a graphical user interface so that end users can run their own ad-hoc analysis.
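The JSON-to-tabular step can be illustrated outside Hive with a minimal Python sketch. It is not the Hive SerDe used in the project. The column selection (`COLUMNS`) and the helper name `tweet_to_row` are hypothetical, and the field names assume the usual layout of a tweet JSON object (`id`, `created_at`, `user.screen_name`, `text`, `retweet_count`).

```python
import json

# Hypothetical column selection; the report does not list the Hive columns.
COLUMNS = ("id", "created_at", "screen_name", "text", "retweet_count")


def tweet_to_row(line):
    """Parse one JSON tweet and return a flat tuple matching COLUMNS."""
    tweet = json.loads(line)
    user = tweet.get("user") or {}  # nested object flattened into one column
    return (
        tweet.get("id"),
        tweet.get("created_at"),
        user.get("screen_name"),
        tweet.get("text"),
        tweet.get("retweet_count", 0),
    )


# A made-up sample record, one JSON object per line as a streaming source would write it.
sample = (
    '{"id": 1, "created_at": "Wed Nov 18 10:00:00 +0000 2015", '
    '"user": {"screen_name": "alice"}, "text": "Great speech", "retweet_count": 3}'
)

row = tweet_to_row(sample)
print(dict(zip(COLUMNS, row)))
```

Missing fields come back as `None` (or `0` for the retweet count) rather than raising an error, which is roughly how a tabular view over semi-structured data is expected to behave.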
The next step uses the dictionary file to score the sentiment of each tweet by comparing
the number of positive words with the number of negative words, and then assigns a
positive, negative, or neutral sentiment value to each tweet. I downloaded the dictionary
file from the link below.
Click here for Dictionary
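The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the positive-versus-negative word count. The word sets `POSITIVE` and `NEGATIVE` are tiny stand-ins for the downloaded dictionary file, and the function name `score_tweet` and the tokenization rule are assumptions.

```python
import re

# Tiny stand-in word lists; the project itself uses a downloaded dictionary file.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}


def score_tweet(text, positive=POSITIVE, negative=NEGATIVE):
    """Label a tweet by comparing its positive and negative word counts."""
    # Assumed tokenization: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(1 for w in words if w in positive)
    neg = sum(1 for w in words if w in negative)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"  # equal counts, including no dictionary words at all


for example in ("Great speech, love it", "Terrible policy", "Good idea but bad timing"):
    print(example, "->", score_tweet(example))
```

A tweet with equal counts, including one with no dictionary words, is treated as neutral in this sketch.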
The last part of the project is to show the results of the sentiment analysis as
visualizations, for which I will use Tableau. I will connect Tableau to Hive using the
Hortonworks ODBC driver, which I downloaded from the Hortonworks website (link in the
References section). I will show the results of the analysis in the form of graphs and
maps using Tableau’s built-in VizQL server.
Data sets and Software:
Sentiment Data: - Sentiment data is unstructured data that represents the opinions,
emotions, and attitudes contained in sources such as social media posts, online blogs, and
product reviews.
Why use sentiment data: - Organizations use sentiment data to learn how people feel about
their product and what they can do to market it effectively.
How I fetched Twitter data: - I created a Twitter app, configured flume.conf with the app
credentials, and ran Flume. All the steps for fetching data from Twitter using Apache Flume
are covered in a YouTube video and a slide deck, linked below. I have also uploaded the
video to the UBlearns discussion forum of DC.
YouTube: - https://youtu.be/E1w5SkE7Cco
SlideShare: - http://www.slideshare.net/bharat3khanna/extracting-twitter-data-using-apache-flume
Source code for Flume-Snapshot.jar: - I downloaded the source code of Flume-snapshot.jar from GitHub
and built the jar using maven package in the Hadoop cluster.
Click here for Flume Source Code
Size of Data: - Although there is no limit on the amount of data I can get from Twitter, for this
project I am going to do my analysis on approximately 100 MB of data.
Algorithms Used: - I am not using a MapReduce algorithm here, since I want to do the analysis on the
complete data and I don’t want to use aggregated measures. If I had used MapReduce, much of my
data would have been aggregated by the reducer. My source data is in JSON format, and I am using
Hive-serde.jar (SerDe stands for serializer and deserializer), which helps in parsing the JSON data
effectively into Hive tables.
Source code for Hive-serde.jar: - I downloaded the source code of Hive-serde.jar from GitHub and built
the jar using maven package in the Hadoop cluster.
Click here for Hive-serde.jar source code
Analysis to be done on Twitter data: - I am going to do the following analysis using Hive and Tableau:
a) Maximum tweet count per user.
b) Count of retweets.
c) Geographically mapping people’s sentiments towards Mr. Modi.
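The three analyses listed above can be sketched in plain Python on a handful of made-up records. The project runs them in Hive and Tableau, so this only illustrates the aggregation logic. The record layout `(screen_name, is_retweet, country, sentiment)` and all function names are hypothetical, and "count of retweets" is read here as the number of tweets that are themselves retweets, which is an assumption.

```python
from collections import Counter, defaultdict

# Hypothetical records: (screen_name, is_retweet, country, sentiment)
tweets = [
    ("alice", False, "IN", "positive"),
    ("bob", True, "US", "negative"),
    ("alice", True, "IN", "positive"),
    ("carol", False, "US", "neutral"),
]


def top_user(records):
    """a) Return (screen_name, tweet_count) for the most active user."""
    counts = Counter(r[0] for r in records)
    return counts.most_common(1)[0]


def retweet_count(records):
    """b) Count the tweets that are flagged as retweets."""
    return sum(1 for r in records if r[1])


def sentiment_by_country(records):
    """c) Tally sentiment labels per country, ready to plot on a map."""
    result = defaultdict(Counter)
    for _, _, country, sentiment in records:
        result[country][sentiment] += 1
    return {country: dict(tally) for country, tally in result.items()}


print(top_user(tweets))
print(retweet_count(tweets))
print(sentiment_by_country(tweets))
```

Each function corresponds to a grouped count, so the same logic could be expressed as a `GROUP BY` query once the tweets are in a table.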
References: -
http://blog.cloudera.com/blog/2012/09/analyzing-twitter-data-with-hadoop
https://github.com/cloudera/cdh-twitter-example
https://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html#lexicon
http://hortonworks.com/products/releases/hdp-1-3/#add_ons
