Social Media & Sentiment Analysis
How I learned to stop worrying and love the internets


                                                                     @michaelwilde | David Carasso
                                                                       Chief Mouth                 | Chief Mind
Copyright © 2012 Splunk Inc.

What is social data?
Data generated from human activity on social networks

Oh yeah, right, Twitter. But I work in IT… so, who cares, right?

Social Data Should be in Splunk!
•    easy to analyze with fields
•    easy to create realtime/historical dashboards and views
•    easy to translate many word problems into questions

{[-]
  checkin : {[-]
    badges : [],
    created : 1345093539,
    geolat : "41.7686007592",
    geolong : "-72.621648",
    mayor : {[-]
      type : "nochange"
    },
    primarycategory : {[-]
      fullpathname : "Food:Mexican Restaurants"
      iconurl : "https://foursquare.com/img/cat
mexican_32.png",
      id : "4bf58dd8d48988d1c1941735",
      nodename : "Mexican Restaurants"
    },
    timezone : "America/New_York",
    user : {[-]
      gender : "female"
    },
    venue : {[-]
Wilde, we just said we work in IT and don’t care about Twitter!
Except when we search on the words “site” AND “is down”


Except when I search on the words “site” AND “is down”

IT and the brand collide at times.
Getting Social Data
    Network
    Method
    Frequency: Real-time (Push) or Scheduled (Pull)
    3rd Parties
  
Best thing about Social Data?
It’s almost always structured JSON!
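
Because the events already arrive as JSON, pulling out fields takes very little work. The following is a minimal Python sketch, not anything shown in the talk. The `event` string is a trimmed, hypothetical copy of the Foursquare check-in sample from the earlier slide, and `flatten` is an illustrative helper name.

```python
import json

# Hypothetical, trimmed check-in event modeled on the sample shown earlier.
event = '''
{"checkin": {"created": 1345093539,
             "geolat": "41.7686007592",
             "geolong": "-72.621648",
             "primarycategory": {"nodename": "Mexican Restaurants"},
             "timezone": "America/New_York",
             "user": {"gender": "female"}}}
'''

def flatten(obj, prefix=""):
    """Flatten nested JSON objects into dotted field names."""
    fields = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            fields.update(flatten(value, name))
        else:
            fields[name] = value
    return fields

fields = flatten(json.loads(event))
print(fields["checkin.primarycategory.nodename"])
print(float(fields["checkin.geolat"]), float(fields["checkin.geolong"]))
```

Dotted names such as `checkin.user.gender` are one common way to address nested values once the structure is flattened.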




What can you do with it?

                          Map Conversations




Analyze People

What can you do with it?

                       Enrich it with
                       lookups



Track Olympians
Indexing the social mother lode
A single stream of big data

@itayNeeman’s curl splitter scripted input (TBR)

Multiple forwarders installed on a single server streaming to multiple indexers



Sir Bill, I believe the demos cometh..




                           …whoa.

The Double Rainbow
When it comes to “numbers”, the search language rocks!

In social, what people “mean” matters. For that you’ll need some new tools that understand words and language.

“…what does it mean?!”

Analyzing Sentiment

Extract linguistic, subjective information of opinions, attitudes, emotions, and perspectives



…and there are perspectives




Understanding brings…

Empathy with customers and prospects

Intelligent business and design decisions




Brand Perception Impacts Stock
In 2011, our friends at Netflix announced that it would be increasing its subscription prices. The feedback on its Facebook page was outrage, and the impact on its stock price was dramatic.




Sentiment complements and informs
 “We analyze several surveys on consumer
 confidence and political opinion over the
 2008 to 2009 period, and find they
 correlate to sentiment word frequencies in
 contemporaneous Twitter messages… …as
 high as 80%, and capture important large-
 scale trends.

 The results highlight the potential of text
 streams as a substitute and supplement for
 traditional polling.”


From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series (CMU: O'Connor, Balasubramanyan, Routledge, and Smith 2010)
Twitter vs. Traditional Polling




Box Office Revenue Forecasting

“We use the chatter from Twitter.com to forecast box-office
revenues for movies. We show that a simple model built from
the rate at which tweets are created about particular topics
can outperform market-based predictors. We further
demonstrate how sentiments extracted from Twitter can be
further utilized to improve the forecasting power of social
media.”
    Asur and Huberman 2010




Easy	
  
What’s in a word?
Terms have many context-dependent meanings.
•    They depend on the writer, the reader, and their relationship, history, goals and preferences.
•    “unpredictable” is bad in general, but good in movie reviews.
•    “jobs” data was affected by the iPhone release.
  
How are you feeling right now?
     Plutchik's Wheel of Emotions

                                    Ekman’s Six Basic Emotions




Sentiment analysis gone wrong
When Anne Hathaway is mentioned, it’s almost always in a positive context, and as a result some trading algorithms seem to purchase Berkshire Hathaway.

When she is mentioned in the news, the stock goes up.


Bags of Words and Phrases

Many sentiment words and expressions are not directly influenced by what is around them:
      That was fun :)

But certainly they can be!
      They said it would be wonderful, but they were wrong.
      This "wonderful" movie turned out to be boring.
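
A bag-of-words scorer makes both halves of this slide concrete. The sketch below is illustrative only: `LEXICON` is a tiny hand-picked word list (an assumption, not the talk's data), and the scorer deliberately ignores word order and context.

```python
import re

# Hypothetical hand-picked lexicon; a real word list would be far larger.
LEXICON = {"fun": 1, "wonderful": 1, ":)": 1, "wrong": -1, "boring": -1}

def tokenize(text):
    """Lowercase, then split into a few simple emoticons and words."""
    return re.findall(r"[:;][-']?[()/]|[a-z']+", text.lower())

def bag_of_words_score(text):
    """Sum the polarity of each token, ignoring order and context."""
    return sum(LEXICON.get(token, 0) for token in tokenize(text))

for sentence in ["That was fun :)",
                 "They said it would be wonderful, but they were wrong.",
                 'This "wonderful" movie turned out to be boring.']:
    print(bag_of_words_score(sentence), sentence)
```

The first sentence scores 2. The other two score 0, although a human reads both as negative, and that gap is exactly what context-free counting misses.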
Human Engineering vs. Machine Learning
Hand-built expert systems and parse rules. Similarly, human-engineered lists of good and bad words (e.g., “good”, “great”, “bad”, “awful”).

Natural Language Processing & Speech Understanding: statistical and data driven. Sentiment analysis generally uses statistics and training sets.
Machine Learning Choices
•    Learning Type
     –  Supervised: + straightforward. – lots of training data.
     –  Unsupervised: + no training data. – may not find what you want.
     –  Semi-Supervised: + small initial training data. – interactive feedback.

•    Algorithms
     –  Naïve Bayes: + simplest probabilistic classifier model. – assumes words are independent.
     –  EM: + performs better, doesn’t assume independence. – more complicated, over-fitting a problem.
  
Supervised Learning
Labeled Training Data → Learn Model
Labeled Test Data → Validate Model
New Unlabeled Data → Predict Labels → New Labeled Data
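
The learn, validate, and predict steps can be sketched with a small naïve Bayes classifier written from scratch. This is a toy illustration under stated assumptions, not the Splunk app's code: the function names, the add-one smoothing, and the four-document training set are all hypothetical.

```python
import math
from collections import Counter, defaultdict

def train(labeled_docs):
    """Learn: count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in labeled_docs:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def predict(model, text):
    """Predict: pick the label with the highest log-probability."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # ignore words never seen in training
                # Add-one smoothing so unseen (word, label) pairs are not zero.
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(model, labeled_docs):
    """Validate: fraction of held-out documents labeled correctly."""
    hits = sum(predict(model, text) == label for text, label in labeled_docs)
    return hits / len(labeled_docs)

# Tiny hypothetical data sets, for illustration only.
training = [("great fun movie", "pos"), ("loved it great", "pos"),
            ("awful boring movie", "neg"), ("bad awful plot", "neg")]
testing = [("great plot", "pos"), ("boring and bad", "neg")]

model = train(training)
print(accuracy(model, testing))
print(predict(model, "what a great movie"))
```

Keeping the test set separate from the training set is what makes the validation number meaningful. Scoring on the training data itself would overstate accuracy.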
  
The Effect of Negation
“The food was not good”
Strategies: Negate sentiment for all terms up to a breaking punctuation mark (i.e., comma or sentence end).
Negation effect is dependent on the term.

       • Mild words negate about the same: not bad ≈ good
       • Extreme words negate towards neutral: not horrible ≈ average
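
The strategy above can be sketched as follows. The lexicon scores, the negator list, and the cut-off between "mild" and "extreme" words are all assumptions chosen for illustration.

```python
import re

# Hypothetical polarity scores: mild words at +/-1, extreme words at +/-3.
LEXICON = {"good": 1.0, "bad": -1.0, "great": 3.0, "horrible": -3.0}
NEGATORS = {"not", "never", "no"}
BREAKS = set(".,;!?")

def score_with_negation(text):
    """Score text, negating every sentiment term between a negator and
    the next breaking punctuation mark."""
    score, negated = 0.0, False
    for token in re.findall(r"[a-z']+|[.,;!?]", text.lower()):
        if token in BREAKS:
            negated = False          # punctuation ends the negation scope
        elif token in NEGATORS:
            negated = True
        elif token in LEXICON:
            value = LEXICON[token]
            if negated:
                # Mild words flip sign; extreme words collapse to neutral.
                value = -value if abs(value) <= 1.0 else 0.0
            score += value
    return score

print(score_with_negation("The food was not good"))
print(score_with_negation("not bad"))
print(score_with_negation("not horrible"))
print(score_with_negation("not cheap, but good"))
```

In the last example the comma closes the negation scope, so "good" keeps its positive score.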
	
  
Learning Bias
A common feature of online user-supplied reviews is that the positive reviews vastly outnumber the negative ones. Movie reviews at IMDB:

More occurrences of “bad” in 10-star reviews than in 2-star ones.

Normalize by accounting for relative frequencies.
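
A small sketch shows why normalizing matters. The corpus below is entirely made up (five long 10-star reviews and one short 2-star review), and `word_stats` is an illustrative helper, not part of any library.

```python
from collections import Counter

def word_stats(reviews, word):
    """Return (raw counts, occurrences per 1,000 tokens) by star rating."""
    hits, totals = Counter(), Counter()
    for stars, text in reviews:
        tokens = text.lower().split()
        totals[stars] += len(tokens)
        hits[stars] += tokens.count(word)
    rates = {stars: 1000 * hits[stars] / totals[stars] for stars in totals}
    return dict(hits), rates

# Hypothetical corpus: many long 10-star reviews, one short 2-star review.
reviews = [(10, "not bad at all " + "great " * 96)] * 5
reviews += [(2, "bad " * 2 + "dull " * 8)]

raw, per_thousand = word_stats(reviews, "bad")
print(raw)
print(per_thousand)
```

By raw count, "bad" looks like a 10-star word (5 occurrences against 2). Per 1,000 tokens it is twenty times more frequent in the 2-star text, which is the signal a classifier should learn.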
  
Sentiment in Social Media
•    Emoticons: :-) ;( :/
     –  Reliable measure of sentiment
     –  Simple regex can cover more than 95% of emoticons on Twitter
     –  Ignores complex emotions
•    Lengthening
     –  This talk is greeeeeat! David is the beeeeeeest! Ahhhhhhhhh!
     –  In English, 3 or more of the same char in a row doesn’t exist, except for 7 obscure terms in the unix dict.
     –  Can indicate heightened emotion, but actual lengths are probably not meaningful.
     –  Useful to normalize because of how common they are (hiiii → hi)
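
Both ideas fit in a couple of regular expressions. The patterns below are assumptions for illustration and are not the regex referred to on the slide. In particular, the emoticon pattern is deliberately simple and would also match the ":/" inside a URL such as "http://".

```python
import re

# Simple emoticon pattern: eyes, optional nose, mouth (illustrative only).
EMOTICON = re.compile(r"[:;=][-o^']?[()\[\]/\\|DPp]")
# A letter followed by two or more repeats of itself.
LENGTHENED = re.compile(r"([A-Za-z])\1{2,}")

def find_emoticons(text):
    """Return every emoticon found in the text, in order."""
    return EMOTICON.findall(text)

def normalize_lengthening(text):
    """Collapse runs of 3+ identical letters down to a single letter."""
    return LENGTHENED.sub(r"\1", text)

print(find_emoticons("great talk :-) but the wifi ;( meh :/"))
print(normalize_lengthening("This talk is greeeeeat! David is the beeeeeeest! Ahhhhhhhhh!"))
print(normalize_lengthening("hiiii"))
```

Collapsing to a single letter follows the slide's "hiiii → hi" example. A word with a legitimate double letter, such as "cooool", would come out as "col", so a dictionary check after normalizing may be worth adding.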
  
Maybe it’s not so hard?
“We are only interested in aggregate sentiment. A high error rate merely implies the sentiment detector is a noisy measurement instrument. With a fairly large number of measurements, these errors will cancel out relative to the quantity we are interested in estimating…”

       From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series
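
One way to see why a noisy detector can still track the aggregate is a back-of-the-envelope model. It assumes (hypothetically) that the classifier is correct with the same probability on positive and negative messages. Under that assumption the observed positive share is a linear function of the true share, so trends survive even at a 30% error rate.

```python
def observed_positive_rate(true_rate, accuracy):
    """Share of messages a noisy classifier labels positive, assuming it is
    right with probability `accuracy` on both classes."""
    return true_rate * accuracy + (1 - true_rate) * (1 - accuracy)

def estimate_true_rate(observed_rate, accuracy):
    """Invert the formula above to recover the underlying positive share."""
    return (observed_rate - (1 - accuracy)) / (2 * accuracy - 1)

# Two hypothetical days: true positive share moves from 40% to 60%.
for true_rate in (0.40, 0.60):
    seen = observed_positive_rate(true_rate, accuracy=0.70)
    print(round(seen, 2), round(estimate_true_rate(seen, 0.70), 2))
```

With 70% accuracy the observed share moves from 0.46 to 0.54. The swing is compressed, but its direction is preserved, and the inverse formula recovers the original 0.40 and 0.60.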
  
Splunk Sentiment Analysis App	
  




Design Decisions
•    Use supervised learning. Why? Doesn’t require interactive feedback. Learners get almost as good as they are going to get with only a few hundred or perhaps a few thousand documents.
•    Use naïve Bayes. Why? Dirt simple and understandable. The difference between the best algorithms and a simple naïve Bayes is generally only a few percent.
  
Design Decisions
•    Handle lengthening. Greeeat!
•    Ignore negation. In the aggregate it won’t matter much.
•    Supply multiple trained models:
     •  Movie reviews (using IMDB ratings)
     •  Tweets (using emoticons to create training sets)
     •  Please suggest more
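
Using emoticons to create a training set could look roughly like the sketch below. It is an assumption about the general approach, not the app's actual code: the two patterns and the `label_by_emoticon` helper are hypothetical, and the sample tweets are invented.

```python
import re

POSITIVE = re.compile(r"[:;=]-?[)D]")
NEGATIVE = re.compile(r"[:;=]-?\(")

def label_by_emoticon(tweets):
    """Build (text, label) pairs from tweets that carry exactly one polarity
    of emoticon; the emoticon itself is stripped from the text."""
    labeled = []
    for tweet in tweets:
        has_pos = bool(POSITIVE.search(tweet))
        has_neg = bool(NEGATIVE.search(tweet))
        if has_pos == has_neg:
            continue  # no emoticon, or mixed signals: skip
        text = NEGATIVE.sub("", POSITIVE.sub("", tweet))
        text = " ".join(text.split())  # tidy leftover whitespace
        labeled.append((text, "pos" if has_pos else "neg"))
    return labeled

tweets = ["#splunk #sentiment niiice :)", "site is down again :(",
          "keep-it-simple sentiment works", "love it :) hate it :("]
print(label_by_emoticon(tweets))
```

Stripping the emoticon from the text matters. If it stayed in, a classifier trained on these pairs could simply memorize the emoticon instead of learning from the words.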




Summary
•    Sentiment analysis helps you understand your customers
     and marketplace.
•    True sentiment analysis is hard.
•    Aggregate sentiment analysis is easier but still very
     valuable.
•    The simplest algorithms work almost as well as the most
     complex, given a few thousand training points.
•    Splunk has a Sentiment App.
     •  Download it and give feedback.
     •  Integrate social data into your existing corporate data.
     •  Share your trained models with others.
“splunk now knows when you’ve been naughty or nice #sentiment”
“I actually learned something! Not.”
“#splunk #sentiment niiice.”
“keep-it-simple sentiment works #conf2012”
“Worst talk. Ever.”
Golf clapping at #sentiment_talk

Teh End
If you’re reading this, start clapping. The talk is over.

2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 

Recently uploaded (20)

08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 

Social media & sentiment analysis splunk conf2012

  • 1. Social Media & Sentiment Analysis: How I learned to stop worrying and love the internets. @michaelwilde (Chief Mouth) | David Carasso (Chief Mind). Copyright © 2012 Splunk Inc.
  • 2. What is social data?
  • 3. Data generated from human activity on social networks.
  • 4. Oh yeah, right, Twitter. But I work in IT… so, who cares, right?
  • 5. Social Data Should be in Splunk! • easy to analyze with fields • easy to create realtime/historical dashboards and views • easy to translate many word problems into questions. Example event (truncated on the slide): {[-] checkin : {[-] badges : [], created : 1345093539, geolat : "41.7686007592", geolong : "-72.621648", mayor : {[-] type : "nochange" }, primarycategory : {[-] fullpathname : "Food:Mexican Restaurants" iconurl : "https://foursquare.com/img/cat mexican_32.png", id : "4bf58dd8d48988d1c1941735", nodename : "Mexican Restaurants" }, timezone : "America/New_York", user : {[-] gender : "female" }, venue : {[-]
  • 6. Wilde, we just said we work in IT and don't care about Twitter!
  • 7. Except when we search on the words "site" AND "is down".
  • 9. Except when I search on the words "site" AND "is down". IT and the brand collide at times.
  • 10. Getting Social Data (diagram): Network, 3rd Parties; Method: Push, Pull; Frequency: Real-time, Scheduled.
  • 11. Best thing about Social Data? It's almost always structured JSON!
  • 12. What can you do with it? Map Conversations. Analyze People.
  • 13. What can you do with it? Enrich it with lookups. Track Olympians.
  • 14. Indexing the social mother lode: a single stream of big data. @itayNeeman's curl splitter scripted input (TBR). Multiple forwarders installed on a single server, streaming to multiple indexers.
  • 15. Sir Bill, I believe the demos cometh.. …whoa.
  • 16. The Double Rainbow: When it comes to "numbers", the search language rocks! In social, what people "mean" matters. For that you'll need some new tools that understand words and language. "…what does it mean?!"
  • 17. Analyzing Sentiment: Extract linguistic, subjective information about opinions, attitudes, emotions, and perspectives.
  • 18. …and there are perspectives
  • 19. …and there are perspectives
  • 20. Understanding brings… Empathy with customers and prospects. Intelligent business and design decisions.
  • 21. Brand Perception Impacts Stock: In 2011, our friends at Netflix announced that it would be increasing its subscription prices. The feedback on its Facebook page was outrage, and the impact on its stock price was dramatic.
  • 22. Sentiment complements and informs: "We analyze several surveys on consumer confidence and political opinion over the 2008 to 2009 period, and find they correlate to sentiment word frequencies in contemporaneous Twitter messages… …as high as 80%, and capture important large-scale trends. The results highlight the potential of text streams as a substitute and supplement for traditional polling." From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series (CMU: O'Connor, Balasubramanyan, Routledge, and Smith 2010)
  • 23. Twitter vs. Traditional Polling
  • 24. Box Office Revenue Forecasting: "We use the chatter from Twitter.com to forecast box-office revenues for movies. We show that a simple model built from the rate at which tweets are created about particular topics can outperform market-based predictors. We further demonstrate how sentiments extracted from Twitter can be further utilized to improve the forecasting power of social media." (Asur and Huberman 2010)
  • 26. What's in a word? Terms have many context-dependent meanings. • Meanings depend on the writer, the reader, and their relationship, history, goals and preferences. • "unpredictable" is bad in general, but good in movie reviews. • "jobs" data was affected by the iPhone release.
  • 27. How are you feeling right now? Plutchik's Wheel of Emotions. Ekman's Six Basic Emotions.
  • 28. Sentiment analysis gone wrong: When Anne Hathaway is mentioned, it's almost always in a positive context, and as a result some trading algorithms seem to purchase Berkshire Hathaway. When she is mentioned in the news, the stock goes up.
  • 30. Bags of Words and Phrases: Many sentiment words and expressions are not directly influenced by what is around them: "That was fun :)". But they certainly can be: "They said it would be wonderful, but they were wrong." "This 'wonderful' movie turned out to be boring."
  • 31. Human Engineering vs. Machine Learning: Hand-built expert systems and parse rules. Similarly, human-engineered lists of good and bad words (e.g., "good", "great", "bad", "awful"). Natural Language Processing & Speech Understanding: statistical and data driven. Sentiment analysis generally uses statistics and training sets.
  • 32. Machine Learning Choices. Learning type: • Supervised: + straightforward; - needs lots of training data. • Unsupervised: + no training data; - may not find what you want. • Semi-supervised: + small initial training data; - needs interactive feedback. Algorithms: • Naïve Bayes: + simplest probabilistic classifier model; - assumes words are independent. • EM: + performs better, doesn't assume independence; - more complicated, over-fitting is a problem.
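To make the Naïve Bayes option concrete, here is a minimal sketch of a word-count classifier. It is an illustration only, not the code behind the Splunk Sentiment App: the function names, the whitespace tokenizer, the add-one smoothing, and the four-document training set are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_docs):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word -> occurrences
    doc_counts = Counter()              # label -> number of documents
    vocab = set()
    for text, label in labeled_docs:
        words = text.lower().split()
        word_counts[label].update(words)
        doc_counts[label] += 1
        vocab.update(words)
    return {"word_counts": word_counts, "doc_counts": doc_counts, "vocab": vocab}


def classify(model, text):
    """Return the label with the highest log posterior for the text."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, float("-inf")
    for label, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior of the label
        for word in text.lower().split():
            # Each word contributes independently (the "naive" assumption);
            # add-one smoothing keeps unseen words from zeroing the product.
            score += math.log((counts[word] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set, far smaller than anything usable in practice.
TRAINING = [
    ("that was fun", "positive"),
    ("great talk really great", "positive"),
    ("worst talk ever", "negative"),
    ("the movie was awful and boring", "negative"),
]
model = train_naive_bayes(TRAINING)
print(classify(model, "great fun"))
print(classify(model, "awful boring movie"))
```

Because every word is scored on its own, word order and context are ignored, which is exactly the independence assumption listed as the drawback above.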
  • 33. Supervised Learning (diagram): Labeled Training Data → Learn Model; Labeled Test Data → Validate Model; Model + New Unlabeled Data → Predict Labels → New Labeled Data.
  • 34. The Effect of Negation: "The food was not good". Strategies: • Negate the sentiment of all terms up to a breaking punctuation mark (i.e., a comma or sentence end). • The effect of negation depends on the term. Mild words negate to about the opposite: not bad ≈ good. Extreme words negate towards neutral: not horrible ≈ average.
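The first strategy, negating everything up to the next breaking punctuation mark, can be sketched as a token-marking pass. This is a minimal illustration under assumptions: the negator list, the `NOT_` prefix convention, and the function name `mark_negation` are choices made here, not something specified in the talk.

```python
import re

# Assumed, deliberately short list of negation words.
NEGATORS = {"not", "no", "never"}
# Punctuation that ends the scope of a negation.
BREAKS = set(".,;:!?")


def mark_negation(text):
    """Prefix tokens with NOT_ from a negator up to the next punctuation break."""
    tokens = re.findall(r"[a-z']+|[.,;:!?]", text.lower())
    marked = []
    negating = False
    for token in tokens:
        if token in BREAKS:
            negating = False  # punctuation closes the negation scope
            continue
        if token in NEGATORS or token.endswith("n't"):
            negating = True
            marked.append(token)
            continue
        marked.append("NOT_" + token if negating else token)
    return marked


print(mark_negation("The food was not good, but the service was great."))
```

A classifier trained on the marked tokens sees `NOT_good` as a separate feature from `good`. This sketch does not model the second point on the slide, that the strength of the flip depends on the term.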
  • 35. Learning Bias: A common feature of online user-supplied reviews is that the positive reviews vastly outnumber the negative ones. Movie reviews at IMDB: more occurrences of "bad" in 10-star reviews than in 2-star ones. Normalize by accounting for relative frequencies.
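One way to read "normalize by accounting for relative frequencies" is to divide a word's count in each rating class by the total number of tokens in that class. The sketch below assumes that reading; the function name and the toy reviews are invented for illustration and are not IMDB data.

```python
from collections import Counter


def relative_frequency(word, reviews_by_rating):
    """Map each rating to (occurrences of word) / (total tokens in that rating)."""
    result = {}
    for rating, reviews in reviews_by_rating.items():
        tokens = [t for review in reviews for t in review.lower().split()]
        result[rating] = Counter(tokens)[word] / len(tokens) if tokens else 0.0
    return result


# Hypothetical data: many more 10-star reviews than 2-star reviews.
REVIEWS = {
    10: [
        "not bad at all",
        "my bad i doubted it",
        "so good it is bad for my sleep",
        "great great film",
    ],
    2: ["bad plot bad acting"],
}

raw_counts = {
    rating: sum(review.split().count("bad") for review in reviews)
    for rating, reviews in REVIEWS.items()
}
print(raw_counts)                          # raw occurrences per rating
print(relative_frequency("bad", REVIEWS))  # share of tokens per rating
```

In this toy data "bad" occurs more often in the 10-star reviews by raw count, yet it makes up a much larger share of the 2-star text, which is the signal a classifier should learn.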
  • 36. Sentiment in Social Media. Emoticons ( :-) ;( :/ ): • Reliable measure of sentiment. • A simple regex can cover more than 95% of emoticons on Twitter. • Ignores complex emotions. Lengthening: • This talk is greeeeeat! David is the beeeeeeest! Ahhhhhhhhh! • In English, 3 or more of the same character in a row doesn't exist, except for 7 obscure terms in the Unix dict. • Can indicate heightened emotion, but actual lengths are probably not meaningful. • Useful to normalize because of how common they are (hiiii → hi).
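Both ideas on this slide fit in a couple of regular expressions. The sketch below is an assumption-laden illustration: the emoticon pattern is a small hand-written one (not the regex the speakers measured at 95% coverage), and collapsing a run of three or more identical letters to a single letter follows the "hiiii → hi" example.

```python
import re

# Three or more of the same letter in a row, e.g. "greeeeeat".
LENGTHENED = re.compile(r"([A-Za-z])\1{2,}")
# Eyes, optional nose, mouth; must start a whitespace-delimited token so
# that the ":/" inside "http://" is not mistaken for an emoticon.
EMOTICON = re.compile(r"(?<!\S)[:;=][-']?([)\]DPp(\[/\\|])")
POSITIVE_MOUTHS = set(")]DPp")


def normalize_lengthening(text):
    """Collapse runs of 3+ identical letters to a single letter."""
    return LENGTHENED.sub(r"\1", text)


def emoticon_polarity(text):
    """Return 'positive', 'negative', or None if absent or mixed."""
    mouths = EMOTICON.findall(text)
    if not mouths:
        return None
    polarities = {"positive" if m in POSITIVE_MOUTHS else "negative" for m in mouths}
    return polarities.pop() if len(polarities) == 1 else None


print(normalize_lengthening("This talk is greeeeeat!"))
print(emoticon_polarity("That was fun :)"))
```

Collapsing to one letter is lossy for words whose correct spelling has a double letter ("goooood" becomes "god"), so a production version would likely check candidates against a dictionary.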
  • 37. Maybe it's not so hard? "We are only interested in aggregate sentiment. A high error rate merely implies the sentiment detector is a noisy measurement instrument. With a fairly large number of measurements, these errors will cancel out relative to the quantity we are interested in estimating…" (From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series)
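A small simulation can illustrate why a noisy detector is still useful in aggregate. It rests on a strong assumption that is not in the quoted paper: the detector flips positive and negative labels with the same probability, independently per message. The function names, the 30% error rate, and the message counts are all made up for the example.

```python
import random


def simulate_observed_fraction(true_positive_fraction, error_rate, n_messages, seed=0):
    """Fraction of messages a noisy detector labels positive."""
    rng = random.Random(seed)
    positives = 0
    for _ in range(n_messages):
        truly_positive = rng.random() < true_positive_fraction
        misread = rng.random() < error_rate
        # A misread flips the true label (symmetric-error assumption).
        detected_positive = truly_positive != misread
        positives += detected_positive
    return positives / n_messages


def correct_for_symmetric_error(observed_fraction, error_rate):
    """Invert observed = p * (1 - e) + (1 - p) * e to recover p."""
    return (observed_fraction - error_rate) / (1 - 2 * error_rate)


week_1 = simulate_observed_fraction(0.4, 0.3, 20000, seed=0)
week_2 = simulate_observed_fraction(0.6, 0.3, 20000, seed=1)
print(week_1, week_2)
print(correct_for_symmetric_error(week_1, 0.3))
```

Even with 30% of individual labels wrong, the observed fraction still rises when true sentiment rises, and the true level can be approximately recovered if the error rate is known. Asymmetric or correlated errors would not cancel this neatly.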
  • 39. Design Decisions • Use supervised learning. Why? It doesn't require interactive feedback. Learners get almost as good as they are going to get with only a few hundred, or perhaps a few thousand, documents. • Use naïve Bayes. Why? Dirt simple and understandable. The difference between the best algorithms and a simple naïve Bayes is generally only a few percent.
  • 40. Design Decisions • Handle lengthening. Greeeat! • Ignore negation. In the aggregate it won't matter much. • Supply multiple trained models: movie reviews (using IMDB ratings); tweets (using emoticons to create training sets). • Please suggest more.
  • 41. Summary • Sentiment analysis helps you understand your customers and marketplace. • True sentiment analysis is hard. • Aggregate sentiment analysis is easier but still very valuable. • The simplest algorithms work almost as well as the most complex, given a few thousand training points. • Splunk has a Sentiment App. • Download it and give feedback. • Integrate social data into your existing corporate data. • Share your trained models with others.
  • 42. Teh End. If you're reading this, start clapping. The talk is over. "splunk now knows when you've been naughty or nice #sentiment" "I actually learned something! Not." "#splunk #sentiment niiice." "keep-it-simple sentiment works #conf2012" "Worst talk. Ever." Golf clapping at #sentiment_talk