Vietnamese Sentiment Analysis in a Big Data Scenario: The Deep Learning Approach
Quan Thanh Tho
Faculty of Computer Science and Engineering
Bach Khoa University

Agenda
Sentiment Analysis
Big Data
  
Aspect-based Vietnamese Sentiment Analysis
  
Ontological Taxonomy
  
Sentiment Dictionary

Attribute: Design
  Alias keywords: ["thiết kế", "ngoại hình", "kiểu dáng", …]
  Positive terms: ["bắt mắt", "nhỏ gọn", "trau chuốt", "mỏng", "đẹp mắt", "mềm mại", …]
  Negative terms: ["cục mịch", "phẳng lì", "to bè", "mỏng dính", "đơn điệu", …]

Attribute: Màn hình (Screen)
  Alias keywords: ["màn hình", "inch", "điểm ảnh", …]
  Positive terms: ["cực nét", "sắc nét", "khủng", …]
  Negative terms: ["ám vàng", "ngả xanh", "ngả tím", "chói", "chá", …]

Attribute: Camera
  Alias keywords: ["ống kính", "lấy nét", "camera", "hình ảnh", "điểm nét", …]
  Positive terms: ["góc rộng", "sáng", "độ phân giải cao", "siêu nét", "không bị out nét", "rõ nét", …]
  Negative terms: ["rung", "rung nhòe", "không trực quan", "giật", "màu cháy", "nhợt nhạt", "lag", "mờ đục", …]
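
The dictionary above can be read as a mapping from each attribute to its alias keywords and its positive and negative terms. The following is a minimal sketch of such a lookup, not the speaker's implementation: the names `SENTIMENT_DICTIONARY` and `find_aspect_terms` are hypothetical, only the Design and Camera entries shown on the slide are included (the slide's lists are truncated), and plain substring matching stands in for whatever Vietnamese word segmentation the real system uses.

```python
import unicodedata

# Partial entries copied from the slide; the original lists are truncated there.
SENTIMENT_DICTIONARY = {
    "Design": {
        "alias": ["thiết kế", "ngoại hình", "kiểu dáng"],
        "positive": ["bắt mắt", "nhỏ gọn", "trau chuốt", "mỏng", "đẹp mắt", "mềm mại"],
        "negative": ["cục mịch", "phẳng lì", "to bè", "mỏng dính", "đơn điệu"],
    },
    "Camera": {
        "alias": ["ống kính", "lấy nét", "camera", "hình ảnh", "điểm nét"],
        "positive": ["góc rộng", "sáng", "độ phân giải cao", "siêu nét",
                     "không bị out nét", "rõ nét"],
        "negative": ["rung", "rung nhòe", "không trực quan", "giật",
                     "màu cháy", "nhợt nhạt", "lag", "mờ đục"],
    },
}


def _norm(s):
    """Normalize to NFC and lowercase so accented forms compare equal."""
    return unicodedata.normalize("NFC", s).lower()


def find_aspect_terms(text, dictionary=SENTIMENT_DICTIONARY):
    """Hypothetical helper: return (attribute, polarity, term) triples.

    An attribute is considered only if one of its alias keywords occurs in
    the text. Longer terms are matched first and masked out, so that
    "mỏng dính" (negative) is not also counted as "mỏng" (positive).
    """
    remaining = _norm(text)
    hits = []
    for attribute, entry in dictionary.items():
        if not any(_norm(alias) in remaining for alias in entry["alias"]):
            continue
        terms = [(_norm(t), "positive") for t in entry["positive"]]
        terms += [(_norm(t), "negative") for t in entry["negative"]]
        terms.sort(key=lambda pair: len(pair[0]), reverse=True)
        for term, polarity in terms:
            if term in remaining:
                hits.append((attribute, polarity, term))
                remaining = remaining.replace(term, " " * len(term))
    return hits


if __name__ == "__main__":
    print(find_aspect_terms("Thiết kế mỏng dính, đơn điệu"))
```

Matching longest terms first is one simple way to handle entries where one term is a prefix of another; a production system would more likely match on segmented tokens.
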
  
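Analyzed Phrases

Subject: Galaxy A
  Mention: Galaxy A có mấy cái cạnh nhìn đã ghê, tròn tròn vuông vuông nhìn sướng cả mắt.
    (English: "The Galaxy A has edges that look really great, roundish and squarish, a pleasure to look at.")
  Analyzed phrases:
    nhìn/v đã/r | 3.75
    tròn/a tròn/a vuông/a vuông/a nhìn/v sướng/a | 0.375
  Sentiment: Positive

Subject: Galaxy A
  Mention: Galaxy A nhìn hình vây chứ ngoài xấu lắm
    (English: "The Galaxy A looks like that in pictures, but in real life it is very ugly.")
  Analyzed phrases:
    xấu/a lắm/r | -7.5
  Sentiment: Negative

Subject: Note 4
  Mention: Về phần cứng thì ko hề thua note4.. mấy cái râu ria thì ko bằng note4 đc
    (English: "On hardware it is not at all inferior to the Note 4.. on the extras it cannot match the Note 4.")
  Analyzed phrases:
    không/r hề/v thua/v | -1.5
    không/r bằng/a | 10
  Sentiment: Positive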
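
Each analyzed phrase above is a run of `word/tag` tokens followed by `|` and a numeric score. The sketch below parses that notation and derives a label per mention. The aggregation rule (sum the phrase scores and take the sign) is an assumption: it reproduces the three labels on the slide, but the slides do not state how the label is actually computed. The function names and the "Neutral" label for a zero total are likewise hypothetical.

```python
def parse_analyzed_phrase(line):
    """Split 'word/tag word/tag | score' into ([(word, tag), ...], score)."""
    tokens_part, score_part = line.rsplit("|", 1)
    tokens = []
    for item in tokens_part.split():
        word, tag = item.rsplit("/", 1)
        tokens.append((word, tag))
    return tokens, float(score_part)


def label_mention(phrase_lines):
    """Assumed rule: sum the phrase scores and label by the sign of the total."""
    total = sum(parse_analyzed_phrase(line)[1] for line in phrase_lines)
    if total > 0:
        return "Positive"
    if total < 0:
        return "Negative"
    return "Neutral"  # not shown on the slide; added for completeness


if __name__ == "__main__":
    note4 = ["không/r hề/v thua/v | -1.5", "không/r bằng/a | 10"]
    print(label_mention(note4))
```
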
  
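“Nhân viên Ngân Hàng A rất chảnh chọe và tư vấn không nhiệt tình như bên Ngân Hàng B, chắc mình sẽ chọn Ngân Hàng B”
(English: "The staff at Bank A are very arrogant and do not advise as enthusiastically as those at Bank B; I will probably choose Bank B.")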
  
#1: Volume
#2: Velocity
#3: Variety
#4: Veracity
  
Word2Vec
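
The slides name Word2Vec without saying which variant or settings were used. As an illustration only, the sketch below shows how the skip-gram variant turns a tokenized sentence into (center, context) training pairs. The function name `skipgram_pairs`, the window size, and the example tokens (multi-syllable words joined with underscores) are assumptions, not details taken from the talk.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


if __name__ == "__main__":
    # Hypothetical segmented review fragment.
    for pair in skipgram_pairs(["màn_hình", "sắc_nét", "đẹp_mắt"], window=1):
        print(pair)
```

A Word2Vec model is then trained to predict the context word from the center word, so that words appearing in similar contexts end up with similar vectors.
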
  
Long Short-Term Memory
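
The slide gives only the name of the model. For reference, here is a minimal sketch of one time step of a standard LSTM cell with a single hidden unit, written in plain Python. The parameter layout in `params` and the function name `lstm_step` are assumptions for illustration; the talk does not describe its network configuration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a single-unit LSTM.

    x: list of floats (e.g. a word vector); h_prev, c_prev: floats.
    params[gate] = (input_weights, recurrent_weight, bias) for the gates
    "forget", "input", "output" and the "candidate" cell update.
    """
    def pre_activation(gate):
        w, u, b = params[gate]
        return sum(wi * xi for wi, xi in zip(w, x)) + u * h_prev + b

    f = sigmoid(pre_activation("forget"))       # how much old memory to keep
    i = sigmoid(pre_activation("input"))        # how much new content to write
    o = sigmoid(pre_activation("output"))       # how much memory to expose
    g = math.tanh(pre_activation("candidate"))  # proposed new content
    c = f * c_prev + i * g                      # updated cell state
    h = o * math.tanh(c)                        # updated hidden state
    return h, c
```

For sentiment classification, a step like this would typically be applied to each word vector in turn, with the final hidden state passed to a classifier.
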
  
Some demo
  

Grokking TechTalk #18A: Vietnamese Sentiment Analysis in a Big Data Scenario: The Deep Learning Approach
