Feature Based Opinion Mining By Gourab Nath Core Faculty – Data Science at Praxis Business School

Suppose that a customer who has given a mobile phone a high rating writes the following review of the product: “The front camera of the phone is excellent! Truly speaking, this is the best front camera I have experienced so far.” From this review, we can understand two things. First, the customer holds a positive opinion about the phone. Second, the front camera is the targeted feature on which the opinion has been expressed. In this workshop, we will be particularly interested in discovering patterns of the second kind. We will discuss a framework that first discovers the targets on which opinions have been expressed in a review and then determines the polarity of those opinions. This kind of detailed analysis helps us discover the components or features of a product that customers have liked or disliked, and thus helps us better summarize the information.


  1. 1. Feature-Based Opinion Mining Gourab Nath Faculty Member, Data Science Praxis Business School, Bangalore gourab@praxis.ac.in | 9038333245
  2. 2. Need to purchase a cellular phone…
  3. 3. BUDGET
  4. 4. “The front camera is extremely clear. The phone comes with 8GB RAM. It has Gorilla glass 6 on front & Gorilla glass 5 on back with aluminium frame on side. With 48MP f/1.7 (Sony IMX 586 sensor) rear camera, it clicks amazing outdoor pics. The battery backup is great. The speaker is not good though.” REVIEW
  5. 5. “The front camera is extremely clear. The phone comes with 8GB RAM. It has Gorilla glass 6 on front & Gorilla glass 5 on back with aluminium frame on side. With 48MP f/1.7 (Sony IMX 586 sensor) rear camera, it clicks amazing outdoor pics. The battery backup is great. The speaker is not good though.” NON-OPINIONATED PASSAGES OBJECTIVE SENTENCE
  6. 6. “The front camera is extremely clear. The phone comes with 8GB RAM. It has Gorilla glass 6 on front & Gorilla glass 5 on back with aluminium frame on side. With 48MP f/1.7 (Sony IMX 586 sensor) rear camera, it clicks amazing outdoor pics. The battery backup is great. The speaker is not good though.” OPINIONATED PASSAGES ON FEATURES SUBJECTIVE SENTENCE
  7. 7. “The front camera is extremely clear. The phone comes with 8GB RAM. It has Gorilla glass 6 on front & Gorilla glass 5 on back with aluminium frame on side. With 48MP f/1.7 (Sony IMX 586 sensor) rear camera, it clicks amazing outdoor pics. The battery backup is great. The speaker is not good though.” OPINIONS ON FEATURES
  8. 8. FEATURE-BASED SENTIMENT SUMMARY: Front camera - extremely clear (+3); Rear camera - amazing (+4); Battery backup - great (+5); Speaker - not good (-2)
  9. 9. SUMMARY: Select a few reviews (a reasonable number) and design a summary like this: Camera = {awesome: 5, great: 15, extremely good: 5, …, bad: 8, poor: 5}; Display = {beautiful: 25, lovely: 18, wonderful: 12, …, clear: 7, poor: 11}; Battery = {fine: 8, good: 22, fast: 7, …, bad: 9}; Price = {high: 10, comfortable: 15, …}
  10. 10. Summary From a Capstone Project at Praxis Business School, Bangalore Instruction: Click on Oneplus 6 in the webpage
  11. 11. The Problem of Sentiment Analysis
  12. 12. Bing Liu Distinguished Professor Department of Computer Science University of Illinois Chicago (UIC) Minqing Hu Data Scientist at Signifyd PhD – Computer Science University of Illinois Chicago (UIC)
  13. 13. (1) M. Hu and B. Liu, Mining Opinion Features in Customer Reviews, Proceedings of AAAI, 2004. (2) M. Hu and B. Liu, Mining and Summarizing Customer Reviews, Proceedings of the ACM SIGKDD Conference on KDD, 2004. (3) B. Liu, M. Hu and J. Cheng, Opinion Observer: Analyzing and Comparing Opinions on the Web, Proceedings of WWW, 2005. (4) B. Liu, Sentiment Analysis and Subjectivity, Handbook of Natural Language Processing 2 (2010), 627-666.
  14. 14. OBJECT. Object (root): Cellular Phone. Components: Camera | Battery | Display. Sub-components / attributes: Front Camera | Back Camera | Rear Camera | Battery life | Battery Size | Battery performance | Size | Quality | Type. Features may be represented by their synonyms. Thus, an object can be represented as a tree, hierarchy or taxonomy.
  15. 15. FEATURES: Display | Front Camera | Rear Camera | Battery life | Display Size | Phone | Cellular Phone | Back Camera | Battery | Battery Size | Battery performance | Display Clarity | Camera
  16. 16. FEATURES: EXPLICIT VS IMPLICIT FEATURES. Explicit feature example: “The battery life of this phone is too short.” Implicit feature examples: “The phone doesn’t fit in a usual jeans pocket though.” (implied feature: size of the phone); “Don’t know why I had spent so much money for the phone” (implied feature: not value for money).
  17. 17. OPINIONS: EXPLICIT VS IMPLICIT OPINIONS. Explicit opinion example: “The display clarity of this phone is amazing!” Implicit opinion example: “The phone doesn’t fit in a usual jeans pocket though.” (a fact which expresses dissatisfaction / disappointment).
  18. 18. Feature Based Opinion Mining
  19. 19. THE PROCESS FLOW: (1) Identification of frequent features; (2) Identification of opinions on each feature; (3) Opinion orientation identification; (4) Infrequent feature identification; (5) Summary generation
  20. 20. Step 1: Frequent Feature Mining Frequent Features Mining Opinion Word Extraction Opinion Orientation Identification Infrequent Features Mining Summary Generation
  21. 21. POS TAGGING “The front camera is extremely clear. The phone comes with 8GB RAM. It has Gorilla glass 6 on front & Gorilla glass 5 on back with aluminium frame on side. With 48MP f/1.7 (Sony IMX 586 sensor) rear camera, it clicks amazing outdoor pics. The battery backup is great. The speaker is not good though.” (Nouns in the review are tagged N.)
  22. 22. EXTRACTING NOUNS Review: front | camera | phone | RAM | gorilla | glass | aluminium | frame | rear | camera | battery | backup | speaker
  23. 23. EXTRACTING NOUNS Review 1: front | camera | phone | gorilla | glass | aluminium | frame | rear | battery | backup | speaker; Review 2: price | descent | sound | battery | camera | body; Review 3: phone | battery | performance | camera; Review 4: phone | life | sound | quality | battery | picture; Review 5: phone | buy; Review 6: loudspeaker | time | sound | quality
  24. 24. BINARY REPRESENTATION
  25. 25. BINARY REPRESENTATION EXAMPLE
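The binary representation can be sketched as follows, assuming each review has already been reduced to its nouns. The six noun lists are copied from the EXTRACTING NOUNS slide; the names `reviews`, `vocabulary` and `matrix` are illustrative and do not come from the slides.

```python
# Noun lists per review, copied from the EXTRACTING NOUNS slide.
reviews = [
    ["front", "camera", "phone", "gorilla", "glass", "aluminium",
     "frame", "rear", "battery", "backup", "speaker"],
    ["price", "descent", "sound", "battery", "camera", "body"],
    ["phone", "battery", "performance", "camera"],
    ["phone", "life", "sound", "quality", "battery", "picture"],
    ["phone", "buy"],
    ["loudspeaker", "time", "sound", "quality"],
]

# One column per distinct noun; alphabetical order is an arbitrary choice.
vocabulary = sorted({noun for review in reviews for noun in review})

# matrix[i][j] is 1 if noun j occurs in review i, and 0 otherwise.
matrix = [[1 if noun in review else 0 for noun in vocabulary]
          for review in reviews]

for review_id, row in enumerate(matrix, start=1):
    print(f"Review {review_id}:", row)
```

Each row is one review (a transaction) and each column is one candidate feature, which is the input shape that association rule mining expects.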
  26. 26. ASSOCIATION RULES MINING EXAMPLE SUPPORT: 0.7 | 0.6 | 0.4 | 0.4 | 0.3 | 0.2 | 0.3
  27. 27. ASSOCIATION RULES MINING EXAMPLE SUPPORT: P(Camera, Front) = 0.4
  28. 28. ASSOCIATION RULES MINING EXAMPLE SUPPORT: P(Battery, Front) = 0.2
  29. 29. ASSOCIATION RULES MINING EXAMPLE SUPPORT: P(Battery, Life) = 0.4
  30. 30. ASSOCIATION RULES MINING EXAMPLE SUPPORT: P(Camera, Buy) = 0.2. Minimum Support Threshold = 0.4 (say)
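A minimal sketch of the support computation, assuming each review is treated as a set of nouns. It reuses the six noun lists from the EXTRACTING NOUNS slide, so the resulting numbers differ from the support values shown on the example slides, which come from a table that did not survive in this transcript. The helper names `support` and `frequent_itemsets` are hypothetical, and the brute-force enumeration stands in for a proper Apriori implementation.

```python
from itertools import combinations

# Each review as a set of nouns (from the EXTRACTING NOUNS slide).
reviews = [
    {"front", "camera", "phone", "gorilla", "glass", "aluminium",
     "frame", "rear", "battery", "backup", "speaker"},
    {"price", "descent", "sound", "battery", "camera", "body"},
    {"phone", "battery", "performance", "camera"},
    {"phone", "life", "sound", "quality", "battery", "picture"},
    {"phone", "buy"},
    {"loudspeaker", "time", "sound", "quality"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force search for itemsets meeting the support threshold."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s >= min_support:
                result[candidate] = s
    return result


# Minimum support threshold of 0.4, as suggested on the slide.
frequent = frequent_itemsets(reviews, min_support=0.4)
for itemset, s in frequent.items():
    print(itemset, round(s, 2))
```

Itemsets below the threshold, such as (camera, buy), are dropped; the survivors are the candidate frequent features passed on to pruning.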
  31. 31. ASSOCIATION RULES MINING EXPERIMENTAL RESULTS One Plus 6 Features – Extracted from the Reviews written in www.amazon.in
  32. 32. FEATURE PRUNING: COMPACTNESS PRUNING. The method checks features that contain at least two words and removes those that are likely to be meaningless.
  33. 33. FEATURE PRUNING: COMPACTNESS PRUNING. The method checks features that contain at least two words and removes those that are likely to be meaningless.
  34. 34. FEATURE PRUNING: COMPACTNESS PRUNING. The method checks features that contain at least two words and removes those that are likely to be meaningless. EXAMPLE: “The camera quality is really good” “I love the quality of the camera” “awesome camera and the phone comes with a quality display” Counter example: “The phone has an awesome front camera and a quality display” Compact | Compact | Not Compact | Compact | Compact but has no dependency
  35. 35. FEATURE PRUNING: COMPACTNESS PRUNING. The method checks features that contain at least two words and removes those that are likely to be meaningless. COUNTER-EXAMPLES: “Both the camera and the battery is good” (Compact); “Although good camera but not good battery” (Compact); “lovely camera quality and nice battery” (Compact)
  36. 36. FEATURE PRUNING: COMPACTNESS PRUNING. The method checks features that contain at least two words and removes those that are likely to be meaningless. COUNTER-EXAMPLES: “Both the camera and the battery is good” (Compact); “Although good camera but not good battery” (Compact); “lovely camera quality and nice battery” (Compact). However, note: the features here are separated by conjunctions (which is mostly the case).
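The compactness check on the slides above can be sketched as follows. This is an illustration under a stated assumption: a feature phrase counts as compact in a sentence when at most three other words sit between adjacent feature words, in either order. That limit is chosen because it reproduces the Compact / Not Compact labels on the slides; the names `MAX_GAP`, `tokenize` and `is_compact` are hypothetical.

```python
import re

# Assumed limit on the number of words allowed between adjacent feature words.
MAX_GAP = 3


def tokenize(sentence):
    """Lowercase the sentence and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def is_compact(feature_words, sentence, max_gap=MAX_GAP):
    """True if all feature words occur and sit close together in the sentence."""
    tokens = tokenize(sentence)
    try:
        # Position of the first occurrence of each feature word.
        positions = sorted(tokens.index(word) for word in feature_words)
    except ValueError:
        return False  # at least one feature word is missing
    return all(b - a - 1 <= max_gap for a, b in zip(positions, positions[1:]))


examples = [
    (("camera", "quality"), "The camera quality is really good"),
    (("camera", "quality"), "I love the quality of the camera"),
    (("camera", "quality"),
     "awesome camera and the phone comes with a quality display"),
    (("camera", "battery"), "Both the camera and the battery is good"),
]
for feature, sentence in examples:
    print(feature, "|", sentence, "|", is_compact(feature, sentence))
```

A purely distance-based rule also accepts pairs such as (camera, battery) that are merely joined by a conjunction, which is the weakness the counter-examples point at.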
  37. 37. FEATURE PRUNING: COMPACTNESS PRUNING. The method checks features that contain at least two words and removes those that are likely to be meaningless. MODIFICATION
  38. 38. FEATURE PRUNING EXPERIMENTAL RESULTS One Plus 6 Features – Extracted from the Reviews written in www.amazon.in
  39. 39. FEATURE PRUNING: REDUNDANCY PRUNING. The method checks features that contain a single word and removes those that are likely to be meaningless. p-support (pure support): the p-support of a feature f is the number of sentences in which f appears and which contain no feature phrase that is a superset of f. Example: Consider the feature camera, and the other features that contain the word camera: front camera | rear camera | back camera | camera quality. p-support of camera = number of reviews in which camera occurred alone and not with any of its supersets = 100 – (20 + 15 + 23 + 10) = 32. A feature is considered meaningful if it satisfies the minimum threshold for p-support.
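The p-support definition above can be sketched in a few lines. The example sentences are made up for illustration, the function name `p_support` is hypothetical, and a feature phrase is treated as present when all of its words occur in the sentence, which is a simplifying assumption.

```python
def p_support(feature_word, feature_phrases, sentences):
    """Count sentences that contain `feature_word` but none of its supersets."""
    # Supersets are the multi-word feature phrases that include the word.
    supersets = [p for p in feature_phrases
                 if feature_word in p and len(p) > 1]
    count = 0
    for tokens in sentences:
        token_set = set(tokens)
        if feature_word not in token_set:
            continue
        if any(all(word in token_set for word in phrase)
               for phrase in supersets):
            continue  # the word appears only as part of a longer feature
        count += 1
    return count


# Feature phrases from the slide's example.
phrases = [("front", "camera"), ("rear", "camera"),
           ("back", "camera"), ("camera", "quality")]

# Hypothetical tokenized sentences.
sentences = [
    "the camera is great".split(),
    "the front camera is sharp".split(),
    "rear camera struggles at night".split(),
    "camera quality is decent".split(),
    "battery lasts long".split(),
    "i like the camera".split(),
]

print(p_support("camera", phrases, sentences))
```

The single-word feature is kept only if this count meets the chosen minimum p-support threshold.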
  40. 40. FEATURE PRUNING EXPERIMENTAL RESULTS One Plus 6 Features – Extracted from the Reviews written in www.amazon.in
  41. 41. FREQUENT FEATURE MINING FLOWCHART: Review Database | POS Tagging | Frequent Feature Identification | Feature Pruning | Frequent Features
  42. 42. Step 2: Opinion Word Extraction
  43. 43. ADJECTIVES AS OPINION (Mining and Summarizing Customer Reviews, M. Hu and B. Liu, Proceedings of the ACM SIGKDD Conference on KDD, 2004). Examples: “The camera of the phone is good” “The display looks dull” “the sound quality of the speaker is fantastic” “The phone has some really cool features” (the opinion word in each sentence is an adjective: good, dull, fantastic, cool). This was based on previous research on subjectivity. The nearest adjective is considered the opinion.
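The nearest-adjective rule can be sketched as follows, assuming the sentence has already been POS tagged. The tags below are written by hand in Penn Treebank style (adjectives start with `JJ`) rather than produced by a tagger, and `nearest_adjective` is a hypothetical helper name.

```python
def nearest_adjective(tagged_tokens, feature_word):
    """Return the adjective closest to the first occurrence of the feature."""
    feature_positions = [i for i, (word, _) in enumerate(tagged_tokens)
                         if word.lower() == feature_word]
    adjective_positions = [i for i, (_, tag) in enumerate(tagged_tokens)
                           if tag.startswith("JJ")]
    if not feature_positions or not adjective_positions:
        return None
    feature_pos = feature_positions[0]
    # On a tie, min() keeps the adjective that appears first in the sentence.
    best = min(adjective_positions, key=lambda i: abs(i - feature_pos))
    return tagged_tokens[best][0]


# Hand-tagged versions of two example sentences from the slide.
camera_sentence = [("The", "DT"), ("camera", "NN"), ("of", "IN"),
                   ("the", "DT"), ("phone", "NN"), ("is", "VBZ"),
                   ("good", "JJ")]
features_sentence = [("The", "DT"), ("phone", "NN"), ("has", "VBZ"),
                     ("some", "DT"), ("really", "RB"), ("cool", "JJ"),
                     ("features", "NNS")]

print(nearest_adjective(camera_sentence, "camera"))
print(nearest_adjective(features_sentence, "features"))
```

Because only a single adjective is returned, adverbs, negations and stacked adjectives around it are ignored.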
  44. 44. ADJECTIVES AS OPINION: COUNTER-EXAMPLES. “The camera of the phone is extremely good” (Adverb + Adjective); “The headphone is not working” (Negation + Verb); “The speaker of the phone is doing great” (Verb + Adjective); “The phone has some nice cool features” (Adjective + Adjective); “The display is not bad” (Negation + Adjective)
  45. 45. OPINION EXTRACTION ALGORITHM Opinion Word/s Extraction:
  46. 46. Step 3: Opinion Orientation Identification
  47. 47. OPINION ORIENTATION IDENTIFICATION: ONLY ADJECTIVES. Adjective list: Seed list:
  48. 48. OPINION ORIENTATION IDENTIFICATION: WORDNET. In WordNet, adjectives are organized into bipolar clusters.
  49. 49. OPINION ORIENTATION IDENTIFICATION: WORDNET. Fast = +2 Seed list: In general, adjectives share the same orientation as their synonyms and the opposite orientation to their antonyms.
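The synonym and antonym rule can be sketched as a simple propagation from a seed list. The small `synonyms` and `antonyms` tables below are hypothetical stand-ins for WordNet lookups, the seed words are illustrative, and orientations are simplified to +1 and -1 rather than the graded scores hinted at on the slide.

```python
def propagate_orientation(adjectives, seeds, synonyms, antonyms):
    """Spread known orientations to adjectives via synonyms and antonyms."""
    known = dict(seeds)
    changed = True
    while changed:  # repeat until a full pass labels nothing new
        changed = False
        for adjective in adjectives:
            if adjective in known:
                continue
            for synonym in synonyms.get(adjective, []):
                if synonym in known:
                    known[adjective] = known[synonym]  # same orientation
                    changed = True
                    break
            else:
                for antonym in antonyms.get(adjective, []):
                    if antonym in known:
                        known[adjective] = -known[antonym]  # opposite
                        changed = True
                        break
    return known


# Hypothetical seed list and lexical relations (stand-ins for WordNet).
seeds = {"good": 1, "bad": -1}
synonyms = {"great": ["good"], "fantastic": ["great"], "poor": ["bad"]}
antonyms = {"terrible": ["fantastic"]}
adjectives = ["terrible", "fantastic", "great", "poor", "weird"]

orientation = propagate_orientation(adjectives, seeds, synonyms, antonyms)
print(orientation)
```

Adjectives with no path to a seed word, such as "weird" here, stay unlabeled, so the quality of the seed list limits coverage.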
  50. 50. OPINION ORIENTATION IDENTIFICATION: ALGORITHM
  51. 51. OPINION ORIENTATION IDENTIFICATION: LIMITATION. Examples: “The camera of the phone is extremely good” (Adverb + Adjective); “The headphone is not working” (Negation + Verb); “The speaker of the phone is doing great” (Verb + Adjective); “The phone has some nice cool features” (Adjective + Adjective); “The display is not bad” (Negation + Adjective)
  52. 52. FLOWCHART TILL NOW! Review Database | POS Tagging | Frequent Feature Identification | Feature Pruning | Frequent Features | Opinion Word Identification | Opinion Orientation Identification
  53. 53. Step 4: Infrequent Feature Mining
  54. 54. INFREQUENT FEATURE MINING. “The picture is absolutely amazing.” “The software that comes with it is amazing” Note: The above two sentences share the same opinion word ‘amazing’ yet describe different features.
  55. 55. INFREQUENT FEATURE MINING. “The picture is absolutely amazing.” “The software that comes with it is amazing” Note: The above two sentences share the same opinion word ‘amazing’ yet describe different features. COUNTER-EXAMPLE: “The delivery guy was amazingly patient” shares the same opinion but does not describe a relevant feature.
  56. 56. INFREQUENT FEATURE MINING: Algorithm
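One way to sketch infrequent feature mining, under stated assumptions: for a sentence that mentions no frequent feature but does contain a known opinion word, take the noun nearest to that opinion word as a candidate infrequent feature. The sentences are hand-tagged in Penn Treebank style, and `infrequent_feature`, `frequent_features` and `opinion_words` are hypothetical names with illustrative contents.

```python
def infrequent_feature(tagged_tokens, frequent_features, opinion_words):
    """Return the noun nearest to an opinion word, or None if not applicable."""
    words = [word.lower() for word, _ in tagged_tokens]
    if any(feature in words for feature in frequent_features):
        return None  # already covered by a frequent feature
    opinion_positions = [i for i, word in enumerate(words)
                         if word in opinion_words]
    noun_positions = [i for i, (_, tag) in enumerate(tagged_tokens)
                      if tag.startswith("NN")]
    if not opinion_positions or not noun_positions:
        return None
    opinion_pos = opinion_positions[0]
    nearest = min(noun_positions, key=lambda i: abs(i - opinion_pos))
    return words[nearest]


# Illustrative inputs: "picture" is assumed to be a frequent feature.
frequent_features = {"picture"}
opinion_words = {"amazing"}

picture_sentence = [("The", "DT"), ("picture", "NN"), ("is", "VBZ"),
                    ("absolutely", "RB"), ("amazing", "JJ")]
software_sentence = [("The", "DT"), ("software", "NN"), ("that", "WDT"),
                     ("comes", "VBZ"), ("with", "IN"), ("it", "PRP"),
                     ("is", "VBZ"), ("amazing", "JJ")]

print(infrequent_feature(picture_sentence, frequent_features, opinion_words))
print(infrequent_feature(software_sentence, frequent_features, opinion_words))
```

The same rule would also pick up nouns that are not product features at all, so candidates found this way need review or ranking by frequency.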
  57. 57. FLOWCHART TILL NOW! Review Database | POS Tagging | Frequent Feature Identification | Feature Pruning | Frequent Features | Opinion Word Identification | Opinion Orientation Identification | Opinion Words | Infrequent Features | Infrequent Feature Identification
  58. 58. Step 5: Summary Generation
  59. 59. $\text{Positive} = \sum \text{positive opinion orientations}$; $\text{Negative} = -\sum \text{negative opinion orientations}$; $\text{Positive \%} = \dfrac{\text{Positive}}{\text{Positive} + \text{Negative}}$; $\text{Negative \%} = \dfrac{\text{Negative}}{\text{Positive} + \text{Negative}}$
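A minimal sketch of the summary computation, assuming each opinion orientation is a signed score and that Positive and Negative are totals over those scores. The function name `summarize` is hypothetical, and the four scores from the FEATURE-BASED SENTIMENT SUMMARY slide (+3, +4, +5, -2) are pooled into one list purely for illustration.

```python
def summarize(orientations):
    """Compute positive and negative totals and their shares."""
    positive = sum(score for score in orientations if score > 0)
    negative = -sum(score for score in orientations if score < 0)
    total = positive + negative
    if total == 0:
        # No opinions recorded: report zero shares instead of dividing by 0.
        return {"positive": 0, "negative": 0,
                "positive_pct": 0.0, "negative_pct": 0.0}
    return {
        "positive": positive,
        "negative": negative,
        "positive_pct": positive / total,
        "negative_pct": negative / total,
    }


summary = summarize([3, 4, 5, -2])
print(summary)
```

In practice the same function would be applied once per feature, giving each feature its own positive and negative share.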
  60. 60. Review Database | POS Tagging | Frequent Feature Identification | Feature Pruning | Frequent Features | Opinion Word Identification | Opinion Orientation Identification | Opinion Words | Infrequent Features | Infrequent Feature Identification | Summary Generation
