Amazon Reviews Sentiment Analysis

Amazon Reviews Sentiment Analysis - Data Warehouse and Data Mining (UCS625) Project Report
Akshit Arora (akshit.arora1995@gmail.com) and Arush Nagpal (arushngpl16@gmail.com).
Arush Nagpal, Akshit Arora
Thapar Institute of Engineering and Technology University, Patiala - 147004, Punjab, India
Sentiment analysis is an important step towards comprehension in natural language processing. Analyzing user sentiments towards
products through their review comments and ratings can be economically profitable to product designers. We propose a platform that
classifies the reviews given by users on an Amazon product page into positive and negative sentiments using a simple rule-based system.
Using a combination of qualitative and quantitative methods, we first construct and empirically validate a gold-standard list of lexical
features (along with their associated sentiment intensity measures) which are specifically attuned to sentiment in microblog-like contexts.
We then combine these lexical features with consideration for five general rules that embody grammatical and syntactical conventions
for expressing and emphasizing sentiment intensity.
Index Terms—Analysis, Amazon, Products, Reviews, Sentiment.
I. INTRODUCTION
1.1 Need of the system
More than ever before, people’s judgments of what to do, or
what to eat, are governed by the opinions of other people.
The internet has become the ultimate trove of the opinions of
many, many people. Today, sites like Amazon have become a
vast database for products that include reviews and opinions
written by everyday people.
Around the globe, there are more than 6,500 daily
newspapers selling close to 400 million copies every day.
Additionally, there are blogs, micro blogs, periodicals,
magazines, fanzines, etc. How can we make sense of all this
information? How can we classify it and aggregate it so that we
can perform quantitative analysis?
As a seller, it is essential to “stay on top of your game”, i.e.,
keep your product updated with the most requested features.
However, most e-commerce websites provide only an average
rating (out of 5) for each product. Consequently, it is difficult
to identify why people like or dislike a particular product. We
aim to solve this problem.
Apart from quantitative ratings (which are mostly skewed),
Amazon also records qualitative text reviews. The objective is
to assess those text reviews and determine whether they are
negative or positive. Beyond classifying the sentiment, we
also focus on determining how strong the sentiment is.
Sentiment analysis research focuses on understanding the
positive or negative tone of a sentence based on sentence
syntax, structure, and content.
This paper describes the working of a simple rule-based
system called VADER (Hutto & Gilbert, 2014), short for Valence
Aware Dictionary and sEntiment Reasoner.
1.2 Applications
Sentiment analysis is useful for a wide range of problems
that are of interest to human-computer interaction
practitioners and researchers, as well as those from fields such
as sociology, marketing and advertising, psychology,
economics, and political science.
Nowadays, social media has become a platform for people to
convey their voice to the public. Among various opinions that
people share and exchange, there are a lot of comments about
consumer products. Recently, it has been shown that the chatter
of consumers on social media such as Facebook, Twitter,
Myspace, and Google+ correlates strongly with the
product’s actual financial performance in the market. This
forms a beneficial database for companies to analyze
consumers’ demands in order to make quantitative predictions
about their potential customers.
1.3 Challenges in development
The goal of our project is to apply a rule-based approach to sentiment
analysis, or opinion mining, on user generated text on the web,
such as movie or product reviews, or comments on social
networks and forums. Given the content of this user generated
text, we are looking to classify the reviews/comments as being
positive or negative. An opinion is defined as a positive or
negative sentiment, view, attitude, emotion, or appraisal about
an entity or an aspect of the entity from an opinion holder. This
is a relevant problem in today’s world as the amount of user
generated text on the web is increasing and sentiment analysis
can be used to detect the mood of users on a forum or to detect
spam if the text is too negative. By building features to
categorize the content of a given text, we use rule-based
techniques to detect positive vs negative sentiment in the text.
The main challenges stem from the sheer rate and
volume of user generated social content, combined with the
contextual sparseness resulting from shortness of the text and a
tendency to use abbreviated language conventions to express
sentiments.
A comprehensive, high quality lexicon is often essential for
fast, accurate sentiment analysis on such large scales.
We use a combination of qualitative and quantitative
methods to produce, and then empirically validate, a gold-
standard sentiment lexicon that is especially attuned to
microblog-like contexts. We next combine these lexical
features with consideration for five generalizable rules that
embody grammatical and syntactical conventions that humans
use when expressing or emphasizing sentiment intensity.
II. EXISTING WORK
Sentiment analysis, or opinion mining, is an active area of
study in the field of natural language processing that analyzes
people's opinions, sentiments, evaluations, attitudes, and
emotions via the computational treatment of subjectivity in text.
It is not our intention to review the entire body of literature
concerning sentiment analysis. Indeed, such an endeavor would
not be possible within the limited space available (such
treatments are available in Liu (2012) and Pang & Lee (2008)).
We do provide a brief overview of canonical works and
techniques relevant to our study.
A. Sentiment Lexicons
A substantial number of sentiment analysis approaches rely
greatly on an underlying sentiment (or opinion) lexicon. A
sentiment lexicon is a list of lexical features (e.g., words) which
are generally labeled according to their semantic orientation as
either positive or negative (Liu, 2010). Manually creating and
validating such lists of opinion-bearing features, while being
among the most robust methods for generating reliable
sentiment lexicons, is also one of the most time-consuming.
For this reason, much of the applied research leveraging
sentiment analysis relies heavily on preexisting manually
constructed lexicons.
B. Sentiment Intensity (Valence-Based) Lexicons
Many applications would benefit from being able to
determine not just the binary polarity (positive versus negative),
but also the strength of the sentiment expressed in text. Just
how favorably or unfavorably do people feel about a new
product, movie, or legislation bill? Analysts and researchers
want (and need) to be able to recognize changes in sentiment
intensity over time in order to detect when rhetoric is heating
up or cooling down. It stands to reason that having a general
lexicon with strength valences would be beneficial.
C. Lexicons and Context-Awareness
Whether one is using binary polarity-based lexicons or more
nuanced valence-based lexicons, it is possible to improve
sentiment analysis performance by understanding deeper
lexical properties (e.g., parts-of-speech) for more context
awareness. Despite their ubiquity for evaluating sentiment in
social media contexts, there are generally three shortcomings of
lexicon-based sentiment analysis approaches: 1) they have
trouble with coverage, often ignoring important lexical
features which are especially relevant to social text in
microblogs, 2) some lexicons ignore general sentiment intensity
differentials for features within the lexicon, and 3) acquiring
a new set of (human validated gold standard) lexical features
– along with their associated sentiment valence scores – can
be a very time consuming and labor intensive process. We view
the current study as an opportunity not only to address this gap
by constructing just such a lexicon and providing it to the
broader research community, but also a chance to compare its
efficacy against other well-established lexicons with regards to
sentiment analysis of social media text and other domains.
III. WORKING OF THE PROPOSED SYSTEM
We use VADER, a lexicon- and rule-based sentiment analysis
tool. As described in Section II-A, much applied research
relies on preexisting, manually constructed sentiment lexicons.
The VADER sentiment lexicon draws on established lexicons such
as LIWC (Linguistic Inquiry and Word Count), ANEW (Affective
Norms for English Words) and GI (the General Inquirer). Words
with a mean sentiment of 0.0 were removed, resulting in a total
of about 7,500 lexical features with validated valence scores
that indicate both the sentiment polarity (positive/negative)
and the sentiment intensity on a scale from –4 to +4. For example,
the word “okay” has a positive valence of 0.9, “good” is 1.9,
and “great” is 3.1, whereas “horrible” is –2.5, the frowning
emoticon “:(” is –2.2, and “sucks” and “sux” are both –1.5.
The proposed system is fed a JSON file of the reviews
of any product from any website. Next, we provide it with
the attribute (field) that is to be analyzed. After cleaning and
processing the data, the system returns an output file with
four values for each review:
1). Negative Sentiment
2). Neutral Sentiment
3). Positive Sentiment
4). Compound Sentiment
The negative, neutral and positive values are proportions that
sum to 1, while the compound value is a separate normalized
score between –1 and +1. This output is fed to another Python
script to produce a CSV file, which we then process to calculate
the average negative, positive, neutral and compound scores.
We can then analyze the products on the basis of that result.
1) If the positive score dominates, most of the
people gave positive reviews of the product, which
suggests that it is well received.
2) If the negative score dominates, most of the
people gave negative reviews of the product, which
suggests that it is not a good product, however well it
might have been advertised.
3) If the neutral score dominates, most of the
review text did not express much satisfaction or
dissatisfaction with the product. Instead, the reviews
were more of a description of the product.
4) The compound score summarizes the overall sentiment
of a review as a single value between –1 (most negative)
and +1 (most positive). A mean compound score above zero
indicates that the reviews lean positive overall, and one
below zero indicates that they lean negative.
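To make the role of the compound value concrete, here is a minimal sketch of how a sum of word valences can be mapped to a bounded score. It assumes the normalization x / sqrt(x² + α) with α = 15, as we understand it from the public VADER implementation. The function name and the sample inputs are illustrative.

```python
import math

def normalize_compound(valence_sum: float, alpha: float = 15.0) -> float:
    """Map an unbounded sum of valences to a score between -1 and +1."""
    score = valence_sum / math.sqrt(valence_sum * valence_sum + alpha)
    # Clamp to guard against floating point drift at the extremes.
    return max(-1.0, min(1.0, score))

# Valence sums of a text whose only sentiment-bearing word is
# "horrible" (-2.5), none (0.0), "good" (1.9) or "great" (3.1).
for total in (-2.5, 0.0, 1.9, 3.1):
    print(total, round(normalize_compound(total), 4))
```

Under these assumptions, a review whose only sentiment-bearing word is “good” (1.9) gets a compound score of 0.4404, and one with only “great” (3.1) gets 0.6249. Both values recur in the sample output of this section.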

Here is a sample input given to the system:
{"reviewerID": "A3155NWLKXEY1I", "asin":
"B00009RAX7", "reviewerName": "AKO California",
"helpful": [2, 2], "reviewText": "I had a cracked air intake hose
that caused the "check engine" light to go on, but a couple days
after I fixed it. I wasn't sure if that was the reason, and if it was,
I didn't want to take it to a dealer just to clear that code. This
did the trick. Very easy to use, gave me the code so I could
check what it was. Once I found out, I knew it was the cracked
hose. Cleared the message using this unit, and it never came
back. Had I taken it to a mechanic or dealer, they'd probably
have told me it was some other problem and I wouldn't have
known if they were telling me the truth or not.", "overall": 5.0,
"summary": "Pretty easy to use", "unixReviewTime":
1206748800, "reviewTime": "03 29, 2008"}^
{"reviewerID": "A31Y28UEDXQ0HB", "asin":
"B00009RAX7", "reviewerName": "Jack W. Wolfe",
"helpful": [0, 0], "reviewText": "this scanner works just like
they say you have everything you need to scan your auto.
WORTH EVERY PENNY!!", "overall": 5.0, "summary":
"great buy", "unixReviewTime": 1217116800, "reviewTime":
"07 27, 2008"}^
{"reviewerID": "A9GPCR9WJQCCJ", "asin":
"B00009RAX7", "reviewerName": "Jim "gearhead4"",
"helpful": [0, 0], "reviewText": "I purchased this Acton scanner
2 years ago when our older vehicle was repeatedly showing its
MIL (Malfunction Indicator Lamp). I was able to clear the code
immediately. When the MIL illuminated again later, I was able
to track down the problem (O2 sensor heater trouble) and
correct it by reseating the connector. Later, I used the scanner
to diagnose which cylinder was misfiring. By tapping on that
cylinder's fuel injector, I was able to correct the misfiring.
Again, the Acton cleared the MIL display. My mechanic uses a
more sophisticated (and much more expensive) OBD scanner
in his day to day work, but for someone who does occasional
maintenance on his own car, this tool is worth the $89
investment. If you paid more, you paid too much.", "overall":
4.0, "summary": "A good tool for $89", "unixReviewTime":
1214784000, "reviewTime": "06 30, 2008"}^
{"reviewerID": "A2TT3U4U8NMWEL", "asin":
"B00009RAX7", "reviewerName": "JX", "helpful": [3, 3],
"reviewText": "Actron seems to be a company that makes high
quality diagnostic equipment. This auto-scanner tool is very
affordable and excellent value for money. Works very well and
is pretty rugged as I had hoped. The only problem is that fixing
the vehicle based on the diagnosed code is NOT always helpful,
sorry. Repairing your car based on these codes requires a good
understanding of auto repair. For example: code PM001 -
Cylinder Misfire. What can a NOVICE do with ALL that
"helpful" information (LOL). Relatively easy to use. I will
give full marks for the quality of the physical device
otherwise.", "overall": 5.0, "summary": "Actron Auto Scanner",
"unixReviewTime": 1190764800, "reviewTime": "09 26,
2007"}^
{"reviewerID": "A3REK3OFONWB1Q", "asin":
"B00009RAX7", "reviewerName": "Paul M. Provencher
"ppro"", "helpful": [10, 19], "reviewText": "Have you ever felt
like your auto mechanic was "Mr. Wizard"? All the way down
to the bad attitude? Only to find he has a doo-dad sitting
somewhere out of eyesight that tells all, like a crystal ball...Well
for a reasonable price you can get your own crystal ball. It
might be able to predict the future and track flying monkeys but
it can tell you very quickly why the "Check Engine" light has
come on. Big or small you know what might be waiting for you
when you go see Mr. Wizard. Sometimes you might even be
able to track down the problem yourself and cut the cost to fix
it. I would not go to Oz without mine!", "overall": 5.0,
"summary": "Pay no attention to that man behind the curtain...",
"unixReviewTime": 1174780800, "reviewTime": "03 25,
2007"}^
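Records like the ones above are stored back to back in a single file. The following is a minimal sketch of how such concatenated objects could be separated and one attribute extracted. It assumes that every record is valid JSON and that only whitespace separates the records. The helper name `iter_json_objects` and the shortened records are illustrative and not necessarily the code used in this project.

```python
import json

def iter_json_objects(text: str):
    """Yield each JSON object found in a string of concatenated objects."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not accept leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Two shortened records in the same shape as the sample input.
raw = (
    '{"asin": "B00009RAX7", "overall": 5.0, "summary": "Pretty easy to use"}\n'
    '{"asin": "B00009RAX7", "overall": 5.0, "summary": "great buy"}'
)

# Select the attribute to be analyzed (here: the summary field).
summaries = [record["summary"] for record in iter_json_objects(raw)]
print(summaries)
```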
After running this input through the system, we receive an
output which is of the form:
Negative  Neutral  Positive  Compound
0         0.526    0.474     0.2023
0         1        0         0
0         0.519    0.481     0.5719
0         0.779    0.221     0.1779
0         0.58     0.42      0.4404
0         0.548    0.452     0.5106
0.412     0.336    0.252     -0.2263
0         0.256    0.744     0.4404
0.219     0.781    0         -0.1027
0.239     0.761    0         -0.296
0.147     0.6      0.253     0.3818
0         0        1         0.6588
0         0.404    0.596     0.7096
0.524     0.476    0         -0.296
0         0.448    0.552     0.5719
0         0.439    0.561     0.7506
0         0.182    0.818     0.6696
0         0.455    0.545     0.5859
0         1        0         0
0         0.507    0.493     0.7783
0         1        0         0
0         1        0         0
0         1        0         0
0         1        0         0
0         0.196    0.804     0.6249
0         1        0         0
0         0.196    0.804     0.6249
0         1        0         0
0         0.256    0.744     0.4404
After this, we take the average of each of the four scores
across all reviews to calculate the final result.
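The averaging step can be sketched as follows. This is a minimal example that uses the first three rows of the output above. The column names follow the table, while the function name and the in-memory CSV buffer are assumptions made for illustration.

```python
import csv
import io
from statistics import mean

COLUMNS = ("Negative", "Neutral", "Positive", "Compound")

# First three rows of the sample output (one row per review).
rows = [
    (0.0, 0.526, 0.474, 0.2023),
    (0.0, 1.0, 0.0, 0.0),
    (0.0, 0.519, 0.481, 0.5719),
]

def column_means(score_rows):
    """Average each score column over all reviews."""
    return {name: mean(col) for name, col in zip(COLUMNS, zip(*score_rows))}

averages = column_means(rows)

# Write the per-review scores followed by the averages in CSV form.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(COLUMNS)
writer.writerows(rows)
writer.writerow([round(averages[name], 6) for name in COLUMNS])
print(buffer.getvalue())
```

To write to disk instead, the `io.StringIO` buffer can be replaced by a file opened with `newline=""`.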
IV. DATA COLLECTION AND PREPARATION
We used the Amazon reviews dataset provided by
https://snap.stanford.edu/data/web-Amazon.html
We applied several data cleaning and processing steps to
prepare the dataset for our use:
1) Separating multiple JSON objects.
2) Parsing the summary field of JSON data.
3) Parsing lexicon data to dictionary.
4) Separating words and emoticons from the data.
5) Filtering negation words.
6) Filtering words stressed by writing them in all upper case.
7) Filtering booster words like “very”, “greatly”, etc.
8) Filtering idioms and spam words.
9) Converting the result to CSV format and calculating
averages of all sentiments.
Everything was done using Python scripts.
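Steps 5 to 7 matter because negations, capitalization and booster words change the valence of the word they accompany. The following is a simplified sketch of that idea, not the full VADER rule set. The constants 0.293 (booster), 0.733 (capitalization) and –0.74 (negation) are assumptions recalled from the public VADER implementation, and the word lists and function name are illustrative. The lexicon values are the ones quoted in Section III.

```python
import re

LEXICON = {"okay": 0.9, "good": 1.9, "great": 3.1, "horrible": -2.5}
BOOSTERS = {"very", "greatly"}
NEGATIONS = {"not", "never", "no"}
BOOST, CAPS, NEGATE = 0.293, 0.733, -0.74  # assumed constants

def adjusted_valences(text):
    """Return (word, valence) pairs after applying three simple rules."""
    tokens = re.findall(r"[A-Za-z']+", text)
    # Capitalization only adds emphasis if the text is not all upper case.
    mixed_case = (any(t.isupper() for t in tokens)
                  and not all(t.isupper() for t in tokens))
    result = []
    for i, token in enumerate(tokens):
        valence = LEXICON.get(token.lower())
        if valence is None:
            continue  # word carries no sentiment in the lexicon
        sign = 1.0 if valence > 0 else -1.0
        if mixed_case and token.isupper():
            valence += sign * CAPS        # emphasis through capitalization
        if i > 0 and tokens[i - 1].lower() in BOOSTERS:
            valence += sign * BOOST       # preceding booster word
        if any(t.lower() in NEGATIONS for t in tokens[max(0, i - 3):i]):
            valence *= NEGATE             # negation in the three words before
        result.append((token, round(valence, 3)))
    return result

print(adjusted_valences("The product is very good"))
print(adjusted_valences("The product is not good"))
print(adjusted_valences("The product is GREAT"))
```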
V. TRAINING OF THE MODEL
As discussed above, we use the VADER lexicon file to set up
the system rather than a machine learning or data mining
algorithm, so no model is trained in the usual sense. It is a
rule-based system in which we use the rules already created and
tested by researchers. The “training” process involves reading
the lexicon file, which has 7,517 words and emoticons, and
putting them into a Python dictionary so that we can quickly
look up the sentiment of the words extracted from the reviews.
The structure of the lexicon file is as follows:
[Word] [Mean] [Standard Deviation] [A list of ratings based
on emotions varying from -4 to +4]
We create a hash map which uses the first two
fields:
[Word]: [Mean Sentiment]
and use it later to score the reviews.
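Reading the lexicon into a hash map can be sketched as below. The sketch assumes that the fields are separated by tabs. The mean values are the ones quoted in Section III, whereas the standard deviation and rating-list fields in the sample lines are placeholders rather than real lexicon data. The function name is illustrative.

```python
def load_lexicon(lines):
    """Build a {word: mean sentiment} dictionary from lexicon lines."""
    lexicon = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip empty lines
        # Only the first two fields (word and mean) are needed.
        word, mean_valence = line.split("\t")[:2]
        lexicon[word] = float(mean_valence)
    return lexicon

# Placeholder third and fourth fields; only word and mean are real values.
sample_lines = [
    "okay\t0.9\t0.0\t[]\n",
    "good\t1.9\t0.0\t[]\n",
    "horrible\t-2.5\t0.0\t[]\n",
    ":(\t-2.2\t0.0\t[]\n",
]

lexicon = load_lexicon(sample_lines)
print(lexicon["good"], lexicon[":("])
```

The same function accepts an open file object, since iterating over a text file yields its lines.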
VI. TESTING OF THE MODEL
Since the lexicon has been validated and is widely used by
researchers, we expect few errors from the lexicon itself. In
addition, we tested some sentences for their sentiments and
found the results to be quite satisfactory. Here is the table
with the scores:
The product was very Bad:
'positive': 0.0, 'neutral': 0.513, 'negative': 0.487
I hated the product:
I hated the product!:
I hated the product!!:
I hated the product!!!:
I really hate the product!!:
I hate the service:
I hated the product:
I like the product:
I love the product:
The product is good:
The product is great:
The product is awesome:
I am happy with the product :):
I am happy with the product.:
Really?? You don't deserve to be in the market!:
The product is awesome:
The product works:
The product is very beneficial:
I would never recommend it to anyone:
Based on these spot checks, we consider the proposed system
reliable enough to analyse the reviews.
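As a worked check, the scores reported for “The product was very Bad” can be reproduced by hand. The sketch below assumes a lexicon valence of –2.5 for “bad” and a booster increment of 0.293 for “very”, both recalled from the public VADER resources. It also assumes that the positive, neutral and negative values are shares of the per-word contributions, where each sentiment-bearing word contributes its absolute valence plus 1 and each neutral word contributes 1. The function name is illustrative.

```python
def proportions(valences, neutral_count):
    """Split a text into positive, neutral and negative shares."""
    pos_sum = sum(v + 1 for v in valences if v > 0)
    neg_sum = sum(v - 1 for v in valences if v < 0)
    total = pos_sum + abs(neg_sum) + neutral_count
    if total == 0:
        return {"positive": 0.0, "neutral": 0.0, "negative": 0.0}
    return {
        "positive": round(pos_sum / total, 3),
        "neutral": round(neutral_count / total, 3),
        "negative": round(abs(neg_sum) / total, 3),
    }

# "Bad" (assumed valence -2.5) is intensified by the booster "very".
bad_boosted = -2.5 - 0.293
# "The", "product", "was" and "very" count as neutral words.
scores = proportions([bad_boosted], neutral_count=4)
print(scores)
```

Under these assumptions the sketch yields 0.0, 0.513 and 0.487, which agrees with the scores listed for that sentence.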
VII. RESULTS AND DISCUSSIONS
We tested the system on the Amazon product reviews for
“Jumper cables” (automobile parts). The dataset chosen had 1259
records. The mean scores obtained were as follows:
Negative 0.035002
Neutral 0.595128
Positive 0.369875
Compound 0.296674
On average, almost 60% of the review text was scored as
neutral and 37% as positive, reflecting appreciation of the
product. Only a very small portion, about 3.5%, was scored as
negative. The mean compound score of about 0.30 is above
zero, which indicates that the reviews lean positive overall.

VIII. CONCLUSIONS AND FUTURE SCOPE
We can conclude that buying automobile parts from
Amazon appears to be a good deal, since most of the users were
either positive or neutral in their reviews of their purchases.
Scope: VADER served as an efficient tool to predict the
sentiments, and it can be extended to almost any type of product
reviews on any website whose primary language is English.
It can also be used for sentiment analysis of Facebook posts
as well as tweets related to a particular search term.
IX. REFERENCES
Hutto, C. J., & Gilbert, E. (2014). VADER: A Parsimonious
Rule-based Model for Sentiment Analysis of Social Media
Text. In Proceedings of the Eighth International AAAI
Conference on Weblogs and Social Media (ICWSM-14).
Liu, B. (2012). Sentiment Analysis and Opinion Mining. San
Rafael, CA: Morgan & Claypool.
Pang, B., & Lee, L. (2008). Opinion mining and sentiment
analysis. Foundations & Trends in Information Retrieval, 2(1),
1–135.
Liu, B. (2010). Sentiment Analysis and Subjectivity. In N.
Indurkhya & F. Damerau (Eds.), Handbook of Natural
Language Processing (2nd Ed). Boca Raton, FL: Chapman &
Hall.
