© 2018 KNIME AG. All rights reserved.
Sentiment Analysis with Deep Learning, Machine Learning or Lexicon Based
Sentiment Analysis – An Example
Sentiment Analysis
Task: Determine the opinion expressed in a document or text, e.g. positive or negative
Sentiment Analysis = Opinion Mining = Emotion AI
Three approaches: Lexicon Based, Machine Learning, Deep Learning
Philosophy
[Figure: the text processing pipeline, from Reading/Parsing Data through Enrichment, Preprocessing and Transformations / Frequencies to Classification/Clustering/Visualization. Enrichment is illustrated by a tagged text snippet: “… perhaps your name is Rumpelstiltskin[Person] ? …”]
Additional Data Types
• Document Cell
– Encapsulates a document
• Title, sentences, terms, words
• Authors, categories, sources
• Generic meta data (key, value pairs)
• Term Cell
– Encapsulates a term
• Words, tags
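The two cell types can be pictured as plain data containers. The following is a minimal Python sketch and not KNIME's actual implementation; the class and field names (`Term`, `Document`, `meta`) are hypothetical and only mirror the bullet points above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Term:
    """A term: one or more words plus the tags attached to them."""
    words: List[str]
    tags: List[str] = field(default_factory=list)


@dataclass
class Document:
    """A document: title, sentences of terms, and descriptive metadata."""
    title: str
    sentences: List[List[Term]]
    authors: List[str] = field(default_factory=list)
    categories: List[str] = field(default_factory=list)
    sources: List[str] = field(default_factory=list)
    meta: Dict[str, str] = field(default_factory=dict)  # generic key/value pairs

    def terms(self) -> List[Term]:
        # Flatten all sentences into one list of terms.
        return [term for sentence in self.sentences for term in sentence]


# Hypothetical example document with one tagged term.
doc = Document(
    title="Example",
    sentences=[[Term(["perhaps"]), Term(["Rumpelstiltskin"], tags=["Person"])]],
    meta={"language": "en"},
)
```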
What is KNIME Analytics Platform?
• A tool for data analysis, manipulation, visualization, and reporting
• Based on the graphical programming paradigm
• Provides a diverse array of extensions:
– Text Mining
– Network Mining
– Cheminformatics
• Many integrations, such as Java, R, Python, Weka, H2O, etc.
Part 1: Reading and Parsing Data
Read/Parse textual data
Other Reader nodes
Part 2: Enrichment
Enrich documents with semantic information
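One simple way to picture enrichment is dictionary-based tagging: every word found in a dictionary receives a tag. The sketch below is an illustration in plain Python, not the KNIME tagger nodes; `NAME_TAGS`, `tag_words` and `render` are hypothetical names, and the one-entry dictionary is made up for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical dictionary mapping known names to a tag.
NAME_TAGS: Dict[str, str] = {"Rumpelstiltskin": "Person"}


def tag_words(words: List[str], dictionary: Dict[str, str]) -> List[Tuple[str, str]]:
    """Pair each word with its tag from the dictionary, or "" if untagged."""
    return [(word, dictionary.get(word, "")) for word in words]


def render(tagged: List[Tuple[str, str]]) -> str:
    """Show tags inline in the form word[Tag]."""
    return " ".join(f"{w}[{t}]" if t else w for w, t in tagged)


tagged = tag_words("perhaps your name is Rumpelstiltskin ?".split(), NAME_TAGS)
print(render(tagged))  # perhaps your name is Rumpelstiltskin[Person] ?
```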
Part 3: Preprocessing
Preprocess documents and filter words
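The slide does not list the individual preprocessing steps, so the following sketch assumes a typical set: lowercasing, removing punctuation and digits, and filtering stop words and very short words. The stop word list, the `min_length` threshold and the function name `preprocess` are assumptions made for illustration.

```python
import re
from typing import List, Set

# Tiny illustrative stop word list; real lists are much longer.
STOP_WORDS: Set[str] = {"a", "an", "and", "is", "it", "of", "the", "this", "was"}


def preprocess(text: str, stop_words: Set[str] = STOP_WORDS, min_length: int = 3) -> List[str]:
    """Lowercase, drop punctuation and digits, then filter words."""
    # Keep only runs of letters; punctuation and digits act as separators.
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in stop_words and len(w) >= min_length]


print(preprocess("This movie was great, and the plot is 10/10!"))
# ['movie', 'great', 'plot']
```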
Bag of Words
• A so-called ‘Bag of Words’ represents each document as the bag (multiset) of its words
• Grammar and word order aren’t taken into account
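A multiset can be sketched in Python with `collections.Counter`. The two example sentences are made up; they contain the same words in a different order and therefore produce the same bag.

```python
from collections import Counter

doc_a = "the movie was good good fun"
doc_b = "fun was the good movie good"

# A bag of words keeps each word together with its number of occurrences.
bag_a = Counter(doc_a.split())
bag_b = Counter(doc_b.split())

print(bag_a["good"])   # 2
print(bag_a == bag_b)  # True: word order is not part of the representation
```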
Frequency Nodes
• Node Repository: Other Data Types / Text Processing / Frequencies
• Available Frequency Nodes
– TF
– IDF
– Ngram creator
– …
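To make the listed measures concrete, here is a plain Python sketch of term frequency, inverse document frequency and n-gram creation. It assumes relative term frequency (count divided by document length) and the unsmoothed IDF log(N / df); the KNIME nodes may offer other variants, and the function names and toy corpus are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List

# Toy corpus of already tokenized documents.
docs: List[List[str]] = [
    ["good", "movie", "good", "plot"],
    ["bad", "movie"],
    ["good", "acting"],
]


def relative_tf(words: List[str]) -> Dict[str, float]:
    """Term frequency: count of a term divided by the document length."""
    counts = Counter(words)
    return {term: n / len(words) for term, n in counts.items()}


def idf(corpus: List[List[str]]) -> Dict[str, float]:
    """Inverse document frequency: log(N / number of documents containing the term)."""
    n_docs = len(corpus)
    doc_freq = Counter(term for words in corpus for term in set(words))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def ngrams(words: List[str], n: int) -> List[tuple]:
    """All runs of n consecutive words."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


tf_doc0 = relative_tf(docs[0])
idf_all = idf(docs)
bigrams_doc0 = ngrams(docs[0], 2)
```

A term that occurs in many documents, such as "movie" here, gets a lower IDF than a term that occurs in only one.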
Document Vector
• Transforms a bag of words into a sequence
– of 0/1 (“one-hot encoding”)
– or frequency numbers
[Figure: a BoW with Frequency Column transformed into a Document Vector]
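The transformation can be sketched as follows: all documents share one vocabulary, and each document becomes a vector with one position per vocabulary term. This is an illustration in plain Python, not the Document Vector node itself; `document_vector` and the two toy bags are hypothetical.

```python
from collections import Counter
from typing import List

# Two toy bags of words with their frequencies.
bags = [
    Counter({"good": 2, "movie": 1}),
    Counter({"bad": 1, "movie": 1}),
]

# Shared vocabulary: one vector position per distinct term, in sorted order.
vocabulary: List[str] = sorted(set().union(*bags))


def document_vector(bag: Counter, vocab: List[str], binary: bool = True) -> List[int]:
    """0/1 presence vector if binary, otherwise the raw term counts."""
    if binary:
        return [1 if bag[term] > 0 else 0 for term in vocab]
    return [bag[term] for term in vocab]


print(vocabulary)                                         # ['bad', 'good', 'movie']
print(document_vector(bags[0], vocabulary))               # [0, 1, 1]
print(document_vector(bags[0], vocabulary, binary=False)) # [0, 2, 1]
```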
Part 4: Transformation and Frequency Computation
Preprocess documents
Part 5: Classification
Lexicon based
Machine learning
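The lexicon based approach can be sketched in a few lines: each word is looked up in a polarity lexicon and the document is labeled by the sign of the summed score. This is a minimal illustration, not the workflow shown on the slide; the four-word `LEXICON`, the function names and the extra "neutral" label for ties are assumptions.

```python
from typing import Dict, List

# Hypothetical miniature lexicon: +1 for positive words, -1 for negative ones.
LEXICON: Dict[str, int] = {"good": 1, "great": 1, "bad": -1, "boring": -1}


def lexicon_score(words: List[str], lexicon: Dict[str, int] = LEXICON) -> int:
    """Sum the polarity of every word found in the lexicon."""
    return sum(lexicon.get(word, 0) for word in words)


def lexicon_label(words: List[str]) -> str:
    """Label by the sign of the score; ties count as neutral."""
    score = lexicon_score(words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(lexicon_label(["great", "plot", "good", "acting"]))  # positive
print(lexicon_label(["boring", "bad", "movie"]))           # negative
```

No training data is needed for this approach, whereas the machine learning approach learns from labeled documents, for example from their document vectors.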
Transformation for Deep Learning
Expected input of a network:
• Numerical representation of each document encoding the words and their order
• Same input shape for every document
– Truncate documents that are too long
– Zero pad documents that are too short
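The two requirements can be sketched as index encoding followed by truncation or zero padding. The sketch assumes that index 0 is reserved for padding, that truncation and padding happen at the end of the document, and that words missing from the index are dropped; the slide does not specify these details, and `build_index` and `encode` are hypothetical names.

```python
from typing import Dict, List


def build_index(docs: List[List[str]]) -> Dict[str, int]:
    """Assign each distinct word an integer, starting at 1; 0 is reserved for padding."""
    index: Dict[str, int] = {}
    for words in docs:
        for word in words:
            index.setdefault(word, len(index) + 1)
    return index


def encode(words: List[str], index: Dict[str, int], length: int) -> List[int]:
    """Map words to indices in order, then truncate or zero pad to a fixed length."""
    ids = [index[w] for w in words if w in index][:length]
    return ids + [0] * (length - len(ids))


docs = [["good", "movie"], ["not", "a", "good", "movie", "at", "all"]]
index = build_index(docs)
encoded = [encode(words, index, length=4) for words in docs]
print(encoded)  # [[1, 2, 0, 0], [3, 4, 1, 2]]
```

Unlike a bag of words, this representation keeps word order, so "not a good movie" and "a good movie not" produce different sequences.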
Approach 3: Deep Learning
From Words to Wisdom Book
Free Copy of “From Words to Wisdom” Book from KNIME Press
https://www.knime.com/knimepress
with code: USMEETUPS-0918
The KNIME® trademark and logo and OPEN FOR INNOVATION® trademark are used by
KNIME.com AG under license from KNIME GmbH, and are registered in the United States.
KNIME® is also registered in Germany.