This document provides an introduction to natural language processing (NLP) and discusses various NLP techniques. It begins by introducing the author and their background in NLP. It then defines NLP and the types of text data commonly analysed. The document outlines a typical NLP pipeline that involves pre-processing text, feature engineering, and both low-level and high-level NLP tasks. Part-of-speech tagging and sentiment analysis are discussed as examples. Deep learning techniques for NLP are also introduced, including word embeddings and recurrent neural networks.
2. About Myself – Ashwin Ittoo
Associate Professor HEC Liège, ULiège
Research Associate, JAIST (Japan)
Associate Editor, Elsevier (Computers in Industry)
3. • 3 PhDs, ULiège, Belgium
• Finance
• Marketing
• Medicine
• 1 PhD, JAIST, Japan (Aug. 2018)
Team
4. • Natural Language Processing (NLP)
• In French: traitement automatique des langues naturelles (TAL)
• Methods for “analysing” language
• Expressed in written form, text data
• Text data common in NLP
• Tweets
• Amazon/Yelp reviews
• Wikipedia
• Domain-specific articles (finance, medicine, …)
Introduction
5. • Variety of Analysis
• Document classification, e.g.
• Sentiment analysis
• Information extraction, e.g.
• Extracting facts from legal texts
• Machine translation
• Methods Evolution
• From formal logics, linguistics
• To machine learning, deep learning
Introduction (cont)
7. • Clean the data
• Removing stopwords (“a”, “the”, …)
• Removing non-ASCII characters
• Straightforward
• No learning (machine/deep) involved
Low-Level: Pre-processing
[Pipeline diagram: Pre-processing → Feature Engineering]
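The cleaning step above needs no learning at all, as the slide notes. A minimal sketch, assuming a tiny hand-picked stopword list (real pipelines would use a larger list, e.g. NLTK's):

```python
# Minimal pre-processing sketch: stopword and non-ASCII removal.
# STOPWORDS is a tiny illustrative sample, not a real stopword list.
STOPWORDS = {"a", "an", "the", "is", "of"}

def preprocess(text):
    # Drop non-ASCII characters, lowercase, then split on whitespace.
    ascii_text = text.encode("ascii", errors="ignore").decode("ascii")
    tokens = ascii_text.lower().split()
    return [t for t in tokens if t not in STOPWORDS]

print(preprocess("The café is a nice place"))  # ['caf', 'nice', 'place']
```

Note how naive ASCII filtering mangles "café" to "caf": even "straightforward" cleaning involves design choices.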
8. • Text → number transformation
• Individual tokens from sentence
• Tokens: words, numbers, punctuation marks, …
• Tokens = features
• How to best represent features?
Low-Level: Feature Engineering
9. • As-is
• Each token = 1 feature
• Eat, ate, eaten: 3 tokens, 3 distinct features
• Huge number of features
• Curse of dimensionality
• Morphology
• Replace token with lemma (root)
• Eat, ate, eaten → eat: 3 tokens, 1 feature
• Demo
Feature Representation
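The collapse of "eat, ate, eaten" into one feature can be sketched as follows. The lemma dictionary here is a hand-made toy; in practice a lemmatiser such as NLTK's WordNetLemmatizer supplies this mapping:

```python
# Toy lemmatisation sketch: LEMMAS is an invented two-entry dictionary
# standing in for a real lemmatiser.
LEMMAS = {"ate": "eat", "eaten": "eat"}

def lemmatise(tokens):
    # Replace each token by its lemma if one is known, else keep it.
    return [LEMMAS.get(t, t) for t in tokens]

tokens = ["eat", "ate", "eaten"]
features = set(lemmatise(tokens))
print(features)  # {'eat'}  -> 3 tokens collapse to 1 feature
```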
10. • Grammatical Information
• Use Part-of-Speech (POS)/POS-tagging
• Defined in Penn Tree Bank (UPenn)
• E.g. “2 nice movies” → CD JJ NNS
• Several tools for POS-tagging
• Stanford NLP (Java)
• scikit-learn/NLTK (Python)
• Demo
Feature Representation (cont)
11. • Application of machine learning for NLP
• Large number of classes (each POS-tag)
• Temporal sequence of word occurrence
• Hidden Markov Model
• t_1..n = argmax over t_1..n of P(t_1..n | w_1..n)
  ≈ argmax over t_1..n of ∏_{i=1}^{n} P(w_i | t_i) · P(t_i | t_{i−1})
• P(w_i | t_i): prob. of word w_i given pos-tag t_i
• P(t_i | t_{i−1}): prob. of pos-tag t_i given preceding pos-tag t_{i−1}
Part-of-Speech Tagging
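The argmax above is usually computed with the Viterbi algorithm. A toy sketch over a two-tag set, where every probability is an invented illustrative number rather than a trained value:

```python
# Toy HMM tagger sketch; TRANS holds P(t_i | t_{i-1}) with "<s>" as the
# sentence start, EMIT holds P(w_i | t_i). All numbers are made up.
import math

TAGS = ["JJ", "NNS"]
TRANS = {("<s>", "JJ"): 0.6, ("<s>", "NNS"): 0.4,
         ("JJ", "JJ"): 0.3, ("JJ", "NNS"): 0.7,
         ("NNS", "JJ"): 0.5, ("NNS", "NNS"): 0.5}
EMIT = {("nice", "JJ"): 0.8, ("nice", "NNS"): 0.1,
        ("movies", "JJ"): 0.1, ("movies", "NNS"): 0.9}

def viterbi(words):
    # best[t] = (log-prob of best path ending in tag t, that path)
    best = {t: (math.log(TRANS[("<s>", t)] * EMIT[(words[0], t)]), [t])
            for t in TAGS}
    for w in words[1:]:
        best = {t: max((lp + math.log(TRANS[(prev, t)] * EMIT[(w, t)]),
                        path + [t])
                       for prev, (lp, path) in best.items())
                for t in TAGS}
    return max(best.values())[1]

print(viterbi(["nice", "movies"]))  # ['JJ', 'NNS']
```

Log-probabilities are summed instead of multiplying raw probabilities, which avoids numerical underflow on longer sentences.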
12. • How to select best features?
• Intuitively: some words are more important than others
• E.g. “doping” in sports documents
• Tf-Idf
• Term frequency-Inverse document frequency
• Standard statistical tests
• Chi-square
• Mutual Information
• Demo
Low-Level: Feature Engineering
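Tf-idf captures the intuition that "doping" is informative in a sports corpus while function words are not. A minimal sketch over an invented toy corpus:

```python
# Minimal tf-idf sketch; the three "documents" are invented toy examples.
import math

docs = [["doping", "scandal", "in", "cycling"],
        ["the", "match", "ended", "in", "a", "draw"],
        ["new", "doping", "tests", "in", "athletics"]]

def tf_idf(term, doc, corpus):
    tf = doc.count(term) / len(doc)                 # term frequency
    df = sum(1 for d in corpus if term in d)        # document frequency
    idf = math.log(len(corpus) / df)                # inverse doc. freq.
    return tf * idf

# "doping" occurs in 2 of 3 documents; "in" occurs in all 3, so its
# idf is log(1) = 0 and it is useless as a discriminative feature.
print(tf_idf("doping", docs[0], docs))
print(tf_idf("in", docs[0], docs))  # 0.0
```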
13. • High-level tasks
• Features (low-level task) as input
• Sentiment Analysis
• Determine sentiment in customer reviews
• E.g. movie reviews, Amazon product reviews
• Classification Problem
• 2 (3) classes/categories
• +, - (neutral)
• Supervised Learning
• Movie reviews, annotated with sentiment class, available
• Train classification algorithm
• Naïve-Bayes, SVM, Random Forests, Neural Networks
High-Level: Sentiment Analysis
[Pipeline diagram: Low-level NLP tasks → Features → High-level NLP tasks (Sentiment Analysis, Machine Translation, …)]
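Of the classifiers listed above, Naïve Bayes is the simplest to sketch. The four training reviews below are invented, and add-one smoothing keeps unseen words from zeroing the product:

```python
# Tiny Naive Bayes sentiment sketch on invented training reviews.
import math
from collections import Counter

train = [("great wonderful movie", "+"), ("loved this great film", "+"),
         ("terrible boring movie", "-"), ("awful boring plot", "-")]

counts = {"+": Counter(), "-": Counter()}
for text, label in train:
    counts[label].update(text.split())
vocab = {w for c in counts.values() for w in c}

def classify(text):
    # Equal class priors (2 reviews each), so only P(w | class) matters;
    # add-one smoothing handles words unseen in a class.
    def log_score(label):
        c, total = counts[label], sum(counts[label].values())
        return sum(math.log((c[w] + 1) / (total + len(vocab)))
                   for w in text.split())
    return max(("+", "-"), key=log_score)

print(classify("great plot"))   # '+'
print(classify("awful boring")) # '-'
```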
14. • Confusion matrix
• True positives, false positives
• True negatives, false negatives
• Precision
• Fraction of reviews predicted positive that are truly positive
• How precise is the model?
• Recall
• Fraction of truly positive reviews (from the gold-standard set) that the model finds
• What is the coverage of the model?
• F1-score
• Harmonic mean of precision and recall
High-Level: Evaluation Metrics
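The three metrics follow directly from the confusion-matrix counts; the counts used below are invented for illustration:

```python
# Precision, recall and F1 from confusion-matrix counts.
def metrics(tp, fp, fn):
    precision = tp / (tp + fp)   # of the predicted positives, how many are right
    recall = tp / (tp + fn)      # of the true positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# e.g. 40 true positives, 10 false positives, 10 false negatives
p, r, f1 = metrics(tp=40, fp=10, fn=10)
print(p, r, f1)  # all 0.8 (up to float rounding)
```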
15. • Feature Engineering
• Core of machine learning and NLP, but…
• Manual, time-consuming
• Bottleneck in machine learning and NLP
• Deep Learning
• Neural network with many hidden layers
• Supervised Learning Approach
• Trained on annotated data
• Movie reviews with sentiment class
• Input: word (vectors) from reviews
• Output: class label (+,-, neutral)
• Hidden layers learn feature representation
• No (minimal) feature engineering
Deep Learning in NLP
16. • Different Deep Learning Architectures
• E.g. CNN for image processing
• RNN (Recurrent Neural Network)
• State of the art for text
• Considers temporal nature of tokens in sentence
Deep Learning in NLP (cont)
17. RNN for Sentiment Analysis
• Sentiment Challenge
• Each clause can express a different sentiment
• Need to keep track of word sequences
• Need to compose individual sentiments for overall sentiment
- This movie doesn't care about cleverness, wit or any other kind of intelligent humor.
- Those who find ugly meanings in beautiful things are corrupt without being charming.
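The "keep track of word sequences" point is exactly what the recurrence provides. A bare-bones sketch of one recurrent unit, with fixed toy weights (not trained) and scalar inputs for brevity:

```python
# Bare-bones Elman-style RNN forward pass with invented scalar weights.
import math

def rnn_forward(xs, w_x=0.5, w_h=0.8, b=0.0):
    h = 0.0
    for x in xs:
        # The new state depends on the current input AND the previous
        # state, which is how the network tracks word order.
        h = math.tanh(w_x * x + w_h * h + b)
    return h

# The same inputs in a different order give a different final state,
# unlike a bag-of-words representation.
print(rnn_forward([1.0, -1.0]))
print(rnn_forward([-1.0, 1.0]))
```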
18. Language Processing/Sentiment Analysis (cont)
• Trained over sentiment treebank
• Phrases, clauses, sentences, e.g. “This isn’t a new idea”
• Annotated with respective sentiments (blue: +, red: -)
Java Demo (Stanford Libraries)
19. Unsupervised Learning/Word Embeddings
• Neural language models/word embeddings
• Word2Vec (shallow neural network, not deep learning)
• Predict context words given centre word (skip-gram)
• E.g. given “bankrupt”, predict its surrounding words in “the bank went bankrupt last year”
• Words/contexts from Google news
20. Towards Unsupervised Learning (cont)
• Word vector representations capture semantic properties
• Word meaning and geometry
• king − man + woman ≈ queen
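The analogy can be checked with vector arithmetic and cosine similarity. The 3-dimensional "embeddings" below are invented so that the arithmetic works out; real Word2Vec vectors have hundreds of dimensions learned from corpora like Google News:

```python
# Toy word-analogy sketch; the 3-d vectors are invented illustrations,
# not real embeddings.
import math

vecs = {"king": [0.9, 0.8, 0.1], "queen": [0.9, 0.1, 0.8],
        "man": [0.1, 0.9, 0.1], "woman": [0.1, 0.2, 0.8]}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# king - man + woman should land nearest to queen
target = [k - m + w
          for k, m, w in zip(vecs["king"], vecs["man"], vecs["woman"])]
nearest = max((w for w in vecs if w != "king"),
              key=lambda w: cosine(target, vecs[w]))
print(nearest)  # 'queen'
```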