Introduction to Natural Language Processing
in 3 Sessions
Dr. Alexandra M. Liguori
Incubio – The Big Data Academy
Barcelona, March - April, 2015
Dr. Alexandra M. Liguori Introduction to Natural Language Processing in 3 Sessions
Outline: Lecture 1
1 Introduction
2 Natural Language Processing
3 Linguistic Ambiguities
4 Definition of corpus
5 Typical NLP tasks
6 POS-tagging
Outline: Lecture 2
1 Recap: Typical NLP tasks → practical examples with GATE
2 Def. of semantics
3 Frames approach
1 FrameNet
2 GATE for semantic/content analysis
Outline: Lecture 3
1 Recap: Typical NLP tasks
2 Automatic Question Answering
3 Reference resolution
4 Named Entity Recognition (NER)
5 Keyword / topic / information extraction
Welcome!
Here we go...!!!
Main references:
Text book: Speech and Language Processing by D. Jurafsky
and J. H. Martin
English FrameNet: https://framenet.icsi.berkeley.edu/fndrupal/
Spanish FrameNet: http://sfn.uab.es:8080/SFN
GATE: https://gate.ac.uk/
Outline: Lecture 1
1 Introduction
2 Natural Language Processing
3 Linguistic Ambiguities
4 Definition of corpus
5 Typical NLP tasks
6 POS-tagging
Introduction: Intelligent machines?
Dave Bowman: Open the pod bay doors, HAL.
HAL: I’m sorry Dave, I’m afraid I can’t do that.
(Stanley Kubrick and Arthur C. Clarke,
screenplay of 2001: A Space Odyssey)
https://www.youtube.com/watch?v=dSIKBliboIo
Introduction: Intelligent machines?
1 Phonetics and phonology
2 Morphology → produce contractions I’m and can’t
3 Syntax → cf. Open the pod bay doors, HAL.
vs. HAL, the pod bay door is open.
vs. HAL, is the pod bay door open?
4 Lexical semantics → meaning of component words
5 Compositional semantics → knowledge of how
components combine to form larger meanings
6 Pragmatics → cf. I’m sorry ... , I’m afraid I can’t
vs. No, I won’t open the door.
vs. No.
7 Discourse conventions → engaging in structured
conversation, e.g. using the reference "that" in I’m sorry Dave, I’m
afraid I can’t do that
Natural Language Processing
NLP: techniques that process written human language as
language.
Applications
word counting
automatic hyphenation
automated question answering
named entity recognition (NER)
information/content extraction
semantic analysis
sentiment analysis
machine translation
Natural Language Processing
An ideal NLP team is very interdisciplinary, including:
Language experts (linguists)
Maths experts (mathematicians, physicists, statisticians)
Programmers (computer scientists)
NLP: Maths & Computer Science
NLP: Six categories of linguistic knowledge
1 Phonetics and phonology ↔ red - read - read;
sleigh - slay
2 Morphology ↔ I/you/we/you/they walk - he/she/it walks;
walked; walking
3 Syntax ↔ She ate a mammoth breakfast - She eating a
mammoth breakfast
4 Semantics ↔ book (verb) - book (noun);
duck (verb) - duck (noun)
5 Pragmatics ↔ open the door - can you open the door? -
could you open the door, please?
6 Discourse
Gracie: Oh yeah... And then Mr. and Mrs. Jones were
having matrimonial trouble, and my brother was hired to
watch Mrs. Jones.
George: Well, I imagine she was a very attractive woman.
Gracie: She was, and my brother watched her day and
night for six months.
George: Well, what happened?
Gracie: She finally got a divorce.
George: Mrs. Jones?
Gracie: No, my brother’s wife.
John went to Bill’s car dealership to check out an
Acura Integra. He looked at it for about an hour.
NLP: Ambiguities and Solutions
Linguistic Ambiguities
Example
I made her duck.
Five possible interpretations:
1 I cooked waterfowl for her.
2 I cooked waterfowl belonging to her.
3 I created the (plaster?) duck she owns.
4 I caused her to quickly lower her head or body.
5 I waved my magic wand and turned her into
undifferentiated waterfowl.
Linguistic Ambiguities
Morphological ambiguity
duck: verb or noun
her: dative pronoun or possessive pronoun
Syntactic ambiguity: make
transitive: taking a single direct object (case 2)
ditransitive: taking two objects, meaning that the first object
(her) got made into the second object (duck) (case 5)
taking a direct object and a verb, meaning that the object
(her) got caused to perform the verbal action (duck) (case 4)
Semantic ambiguity: make
cook
create
Corpus
Definition
Corpus = Large and structured set of texts.
NLP
Two types of corpora:
Training corpus ↔ to make the list of rules or to get the
statistical data
Test corpus ↔ to test the results found with the training
corpus
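The split into a training corpus and a test corpus can be sketched in a few lines of Python. This is only a minimal illustration: the function name `split_corpus`, the 80/20 ratio, and the toy sentences are assumptions, not part of the lecture.

```python
import random

def split_corpus(sentences, test_fraction=0.1, seed=0):
    """Shuffle a list of sentences and split it into (training, test) parts."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(sentences)             # copy, so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Toy corpus (hypothetical sentences, used only for illustration)
corpus = [f"sentence {i}" for i in range(20)]
training, test = split_corpus(corpus, test_fraction=0.2)
print(len(training), len(test))
```

The key point is that the two parts do not overlap: rules or statistics derived from the training part are evaluated on sentences that were not used to derive them.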
Typical NLP tasks: Basic and simpler tasks
Tokenization → RegEx
Sentence splitting → RegEx
POS-tagging → POS-tagging algorithms and tag sets
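As a rough illustration of the regular-expression approach to the first two tasks, the sketch below uses Python's standard `re` module. Both patterns are deliberately simple assumptions made for this example; real tokenizers and sentence splitters handle many more cases.

```python
import re

# A word (optionally with internal apostrophes, e.g. "can't"), or a single
# non-space punctuation character.
TOKEN_RE = re.compile(r"\w+(?:['’]\w+)*|[^\w\s]")

# A sentence boundary: whitespace that follows ., ! or ? and precedes an
# uppercase letter.
SENTENCE_BOUNDARY_RE = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")

def tokenize(text):
    """Return the list of word and punctuation tokens in text."""
    return TOKEN_RE.findall(text)

def split_sentences(text):
    """Split text into sentences at the boundaries defined above."""
    return [s for s in SENTENCE_BOUNDARY_RE.split(text.strip()) if s]

text = "A woman came from the back of the store. She appeared to be sleepy."
print(split_sentences(text))
print(tokenize("I'm afraid I can't do that."))
```

A known weakness of this boundary pattern is that abbreviations such as "Mr." followed by a capitalised name are wrongly treated as sentence ends, which is one reason practical splitters add abbreviation lists or trained models.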
Typical NLP tasks: Complex tasks
Lemmatization or Stemming → implementations of the Porter
Stemmer (e.g. in Java), Stanford NLP tools, GATE, ...
Syntactic parsing → Earley algorithm, CYK algorithm, GHR algorithm,
Stanford Parser (Java implementation of a probabilistic parsing algorithm)
Topic extraction, NER, Semantic analysis, ... → ad hoc tools, e.g.
dictionaries, ontologies, Frames, GATE, NLTK, ...
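To give an idea of what a chart-based parsing algorithm such as CYK does, here is a minimal recognizer sketch in Python. The grammar and lexicon are toy assumptions invented for this example (they must be in Chomsky normal form), and the function only answers whether a sentence is derivable; it does not build a parse tree.

```python
from itertools import product

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
BINARY_RULES = {          # (B, C) -> set of A such that A -> B C
    ("NP", "VP"): {"S"},
    ("DT", "NN"): {"NP"},
    ("VBD", "NP"): {"VP"},
}
LEXICAL_RULES = {         # word -> set of A such that A -> word
    "she": {"NP"},
    "ate": {"VBD"},
    "a": {"DT"},
    "breakfast": {"NN"},
}

def cyk_recognize(words, start="S"):
    """Return True if the toy grammar derives the word sequence from start."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the non-terminals that derive words[i:j]
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        table[i][i + 1] = set(LEXICAL_RULES.get(word, ()))
    for span in range(2, n + 1):            # length of the substring
        for i in range(n - span + 1):       # start of the substring
            j = i + span
            for k in range(i + 1, j):       # split point
                for left, right in product(table[i][k], table[k][j]):
                    table[i][j] |= BINARY_RULES.get((left, right), set())
    return start in table[0][n]

print(cyk_recognize("she ate a breakfast".split()))
print(cyk_recognize("she a ate breakfast".split()))
```

The table is filled from short spans to long ones, so every sub-analysis is computed once and reused, which is what keeps the algorithm polynomial in sentence length.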
POS-tagging
Example with Penn Treebank POS-tags:
A/DT woman/NN came/VBD from/IN the/DT back/NN of/IN
the/DT store/NN ./. She/PRP appeared/VBD to/TO be/VB
sleepy/JJ and/CC quite/RB a/DT bit/NN younger/JJR than/IN
Mr./NNP Dobbs/NNP and/CC to/TO be/VB wearing/VBG
too/RB much/RB makeup/NN ./.
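Text in this word/TAG notation is easy to load into a program. The helper below is a small sketch (the name `parse_tagged` is an assumption made for this example) that turns such a string into (word, tag) pairs.

```python
def parse_tagged(text):
    """Turn 'word/TAG word/TAG ...' into a list of (word, tag) pairs."""
    pairs = []
    for item in text.split():
        # rpartition splits on the LAST slash, so tokens such as './.'
        # or words that themselves contain '/' are handled correctly.
        word, _, tag = item.rpartition("/")
        pairs.append((word, tag))
    return pairs

tagged = "A/DT woman/NN came/VBD from/IN the/DT back/NN of/IN the/DT store/NN ./."
pairs = parse_tagged(tagged)
print(pairs[:3])
print([word for word, tag in pairs if tag == "NN"])
```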
POS-tagging
Example of ambiguity:
1 Secretariat/NNP is/VBZ expected/VBN to/TO race/VB
tomorrow/NN ./.
2 People/NNS continue/VBP to/TO inquire/VB the/DT
reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN
./.
POS-tagging
Three main tagging algorithms or methods:
1 rule-based tagging, e.g. ENGTWOL
2 stochastic tagging, e.g. HMM tagger
3 transformation-based tagging, e.g. Brill tagger
Rule-based POS-tagging
Example of ambiguity:
1 Secretariat/NNP is/VBZ expected/VBN to/TO race/VB
tomorrow/NN ./.
2 People/NNS continue/VBP to/TO inquire/VB the/DT
reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN
./.
Large database of hand-written disambiguation rules, e.g.:
TO + VB → YES
TO + NN → NO
DT + NN → YES
DT + VB → NO
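The four rules above can be read as constraints on pairs of adjacent tags. The sketch below applies them in that way: a small lexicon proposes candidate tags for each word, and the NO rules eliminate candidates. The lexicon and the function name are hypothetical, and this is not how ENGTWOL itself is implemented; it only illustrates the idea of hand-written elimination rules.

```python
# Candidate tags per word (a tiny hypothetical lexicon).
LEXICON = {
    "to": {"TO"}, "the": {"DT"}, "race": {"NN", "VB"},
    "expected": {"VBN"}, "for": {"IN"},
}

# Disallowed (previous tag, current tag) pairs, taken from the NO rules above.
FORBIDDEN_BIGRAMS = {("TO", "NN"), ("DT", "VB")}

def disambiguate(words):
    """Tag words, removing candidate tags that a rule forbids in context."""
    result = []
    previous = None
    for word in words:
        candidates = LEXICON[word.lower()]
        allowed = {t for t in candidates if (previous, t) not in FORBIDDEN_BIGRAMS}
        # If the rules eliminate everything, fall back to all candidates.
        remaining = allowed or candidates
        # Several surviving candidates are shown joined with '|' (still ambiguous).
        tag = "|".join(sorted(remaining))
        result.append((word, tag))
        previous = tag
    return result

print(disambiguate(["expected", "to", "race"]))
print(disambiguate(["for", "the", "race"]))
```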
Stochastic POS-tagging
Example of ambiguity:
1 Secretariat/NNP is/VBZ expected/VBN to/TO race/VB
tomorrow/NN ./.
2 People/NNS continue/VBP to/TO inquire/VB the/DT
reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN
./.
Training corpus to compute probability of given word having
given tag in given context, e.g.:
is/VBZ expected/VBN to/TO race/VB → 98%
is/VBZ expected/VBN to/TO race/NN → 2%
reason/NN for/IN the/DT race/NN → 97%
reason/NN for/IN the/DT race/VB → 3%
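Probabilities of this kind are estimated by counting in a tagged training corpus. The sketch below computes a relative-frequency estimate of P(tag | previous tag, word) from a four-sentence toy corpus; the corpus, the `<s>` start marker, and the function names are assumptions for illustration. A full HMM tagger would go further, combining tag-transition and word-emission probabilities and searching for the best tag sequence.

```python
from collections import Counter, defaultdict

# Toy training corpus of tagged sentences (hypothetical data).
TRAINING = [
    [("expected", "VBN"), ("to", "TO"), ("race", "VB")],
    [("want", "VBP"), ("to", "TO"), ("race", "VB")],
    [("for", "IN"), ("the", "DT"), ("race", "NN")],
    [("won", "VBD"), ("the", "DT"), ("race", "NN")],
]

def train(sentences):
    """Count how often each tag occurs for a given (previous tag, word)."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        previous = "<s>"                   # marker for the sentence start
        for word, tag in sentence:
            counts[(previous, word)][tag] += 1
            previous = tag
    return counts

def tag_probabilities(counts, previous_tag, word):
    """Relative-frequency estimate of P(tag | previous tag, word)."""
    seen = counts.get((previous_tag, word), Counter())
    total = sum(seen.values())
    return {tag: n / total for tag, n in seen.items()} if total else {}

counts = train(TRAINING)
print(tag_probabilities(counts, "TO", "race"))
print(tag_probabilities(counts, "DT", "race"))
```

With so little data the estimates are all 0 or 1; the percentages on the slide would come from a much larger corpus.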
Transformation-based POS-tagging
Example of ambiguity:
Secretariat/NNP is/VBZ expected/VBN to/TO race/VB
tomorrow/NN ./.
People/NNS continue/VBP to/TO inquire/VB the/DT
reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN
Rules automatically induced from data using Machine Learning
techniques, e.g.:
1 a priori, prob(race = NN)= 65%, prob(race = VB)= 35%
→ system would always take race = NN
2 Machine Learning to learn conditional probabilities:
3 is/VBZ expected/VBN to/TO race/VB → 98%
reason/NN for/IN the/DT race/NN → 97%
4 system takes race = NN or race = VB depending on
context.
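One common way to picture transformation-based tagging, as in descriptions of the Brill tagger, is in two steps: first give every word its most likely tag, then apply an ordered list of context-dependent correction rules. The sketch below follows that picture. The tag table and the single rule are hand-written assumptions for this example; a real transformation-based learner would induce such rules from a tagged corpus.

```python
# Most likely tag per word, as would be estimated from a training corpus
# (hypothetical values; "race" defaults to NN, as in the a priori step above).
MOST_LIKELY_TAG = {
    "is": "VBZ", "expected": "VBN", "to": "TO", "race": "NN",
    "the": "DT", "reason": "NN", "for": "IN",
}

# Transformation rules: (from_tag, to_tag, required_previous_tag).
# A Brill-style learner would induce these from data; this one is hand-written.
TRANSFORMATIONS = [("NN", "VB", "TO")]

def tbl_tag(words):
    """Tag words with their most likely tag, then apply the transformations."""
    # Step 1: label every word with its most likely tag.
    tags = [MOST_LIKELY_TAG[w] for w in words]
    # Step 2: apply each transformation, in order, wherever its context matches.
    for from_tag, to_tag, prev_tag in TRANSFORMATIONS:
        for i in range(1, len(tags)):
            if tags[i] == from_tag and tags[i - 1] == prev_tag:
                tags[i] = to_tag
    return list(zip(words, tags))

print(tbl_tag("is expected to race".split()))
print(tbl_tag("reason for the race".split()))
```

In the first phrase the rule "change NN to VB after TO" fires and corrects race; in the second it does not apply, so race keeps its default tag NN.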
POS-tagging
POS-tagging tools for English:
Brill tagger
Stanford Log-linear POS-tagger (Java)
POS-tagger integrated in GATE (Java)
POS-tagger with NLTK (Python)
Outline: Lecture 2
1 Recap: Typical NLP tasks → practical examples with GATE
2 Def. of semantics
3 Frames approach
1 FrameNet
2 GATE for semantic/content analysis
Typical NLP tasks: Basic and simpler tasks
Tokenization → RegEx
Sentence splitting → RegEx
POS-tagging → POS-tagging algorithms and tag sets
Typical NLP tasks: Complex tasks
Lemmatization or Stemming → implementations of the Porter
Stemmer (e.g. in Java), Stanford NLP tools, GATE, ...
Topic extraction, NER, Semantic analysis, ... → ad hoc tools, e.g.
dictionaries, ontologies, Frames, GATE, NLTK, ...
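To make the idea of stemming concrete, here is a deliberately crude suffix-stripping sketch in Python. It is not the Porter algorithm, which uses several ordered rule phases and conditions on the shape of the stem; the suffix list and the minimum stem length are assumptions chosen for this example.

```python
# Suffixes tried in order; this is NOT the Porter algorithm, only a crude
# illustration of suffix stripping.
SUFFIXES = ("ing", "ed", "es", "s")

def crude_stem(word, min_stem_length=3):
    """Strip the first matching suffix if enough of the word remains."""
    word = word.lower()
    for suffix in SUFFIXES:
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and len(stem) >= min_stem_length:
            return stem
    return word

print([crude_stem(w) for w in ["walk", "walks", "walked", "walking"]])
```

All four forms of walk are reduced to the same stem, which is the point of the task. Irregular forms and many regular ones are beyond such a simple rule list, which is why dictionary-based lemmatizers exist alongside stemmers.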
EX.: GATE
Concrete examples with GATE:
1 Tokenizer
2 Sentence-splitter
3 POS-tagger
4 Stemmer
GATE
https://gate.ac.uk/
developed at the University of Sheffield, UK.
Semantics
‘Then you should say what you mean,’
the March Hare went on.
‘I do,’ Alice hastily replied;
‘at least, I mean what I say –
that’s the same thing, you know.’
‘Not the same thing a bit!’ said the Hatter. ‘You might just as
well say that
“I see what I eat” is the same thing as “I eat what I see”!’
Lewis Carroll,
Alice in Wonderland
Frames and FrameNet
Frame
A schematic representation of a situation involving various
participants, and other conceptual roles.
E.g.:
Abby bought a car from Robin for $5,000.
Robin sold a car to Abby for $5,000.
English FrameNet
https://framenet.icsi.berkeley.edu/fndrupal/
developed at the International Computer Science Institute in
Berkeley, California.
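The two example sentences can be written down as frame instances to show what the frame captures. The sketch below is a hand-built illustration in Python: the class `FrameInstance` is hypothetical, and the frame and frame-element labels (Commerce-buy, Commerce-sell, Buyer, Seller, Goods, Money) are used as illustrative names rather than looked up in FrameNet.

```python
from dataclasses import dataclass, field

@dataclass
class FrameInstance:
    frame: str                      # name of the evoked frame
    lexical_unit: str               # word that evokes the frame
    elements: dict = field(default_factory=dict)  # frame element -> filler

# Hand-built annotations of the two example sentences. The frame and frame
# element names below are illustrative labels, not checked against FrameNet.
bought = FrameInstance(
    frame="Commerce-buy", lexical_unit="bought",
    elements={"Buyer": "Abby", "Seller": "Robin", "Goods": "a car", "Money": "$5,000"},
)
sold = FrameInstance(
    frame="Commerce-sell", lexical_unit="sold",
    elements={"Seller": "Robin", "Buyer": "Abby", "Goods": "a car", "Money": "$5,000"},
)

# Different verbs and perspectives, same participants in the same roles.
print(bought.elements == sold.elements)
```

Although the subject and the verb differ, both sentences fill the same roles with the same participants, which is the kind of equivalence that a frame-based analysis makes explicit.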
English FrameNet Example
Frame Relations
FrameNet additionally captures relationships between different
frames using relations. These include the following:
Inheritance: When one frame is a more specific version of
another, more abstract parent frame. Anything that is true
about the parent frame must also be true about the child
frame, and a mapping is specified between the frame
elements of the parent and the frame elements of the child.
Perspectivized-in: A neutral frame (like
Commerce-transfer-goods) is connected to a frame with a
specific perspective of the same scenario (e.g. the
Commerce-sell frame, which assumes the perspective of
the seller or the Commerce-buy frame, which assumes the
perspective of the buyer)
Subframe: Some frames like the Criminal-process frame
refer to complex scenarios that consist of several individual
states or events that can be described by separate frames
like Arrest, Trial, and so on.
Precedes: The Precedes relation captures a temporal
order that holds between subframes of a complex scenario.
Causative-of and Inchoative-of: There is a fairly systematic
relationship between stative descriptions (e.g. the
Position-on-a-scale frame, "She had a high salary") and
causative descriptions (Cause-change-of-scalar-position,
"She raised his salary") or inchoative descriptions
(Change-position-on-a-scale, e.g. "Her salary increased").
Using: A relationship in which one frame in some way involves
another frame. For instance, the
Judgment-communication frame uses both the Judgment
frame and the Statement frame, but does not inherit from
either of them because there is no clear correspondence of
the frame elements.
See-also: Connects frames that bear some resemblance
but need to be distinguished carefully.
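Frame relations of this kind form a labelled graph over frames. The sketch below stores a few of the relations mentioned above as (source, relation, target) triples and queries them in both directions. The triple layout and the function names are assumptions for this example and do not reflect how FrameNet distributes its data.

```python
# (source frame, relation, target frame) triples built from the examples in
# the text; the storage format itself is a hypothetical choice.
RELATIONS = [
    ("Commerce-sell", "Perspectivized-in", "Commerce-transfer-goods"),
    ("Commerce-buy", "Perspectivized-in", "Commerce-transfer-goods"),
    ("Arrest", "Subframe", "Criminal-process"),
    ("Trial", "Subframe", "Criminal-process"),
    ("Judgment-communication", "Using", "Judgment"),
    ("Judgment-communication", "Using", "Statement"),
]

def related(frame, relation):
    """Frames that the given frame points to via the relation."""
    return sorted(t for s, r, t in RELATIONS if s == frame and r == relation)

def inverse(frame, relation):
    """Frames that point to the given frame via the relation."""
    return sorted(s for s, r, t in RELATIONS if t == frame and r == relation)

print(related("Judgment-communication", "Using"))
print(inverse("Criminal-process", "Subframe"))
print(inverse("Commerce-transfer-goods", "Perspectivized-in"))
```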
Spanish FrameNet
Frame
A schematic representation of a situation involving various
participants, and other conceptual roles. E.g.:
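El rock influye en los artistas de hoy en día
para sus producciones artísticas.
(Rock influences today's artists in their artistic productions.)
Los artistas de hoy en día se inspiran en el rock
para sus producciones artísticas.
(Today's artists draw inspiration from rock for their artistic productions.)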
Spanish FrameNet
http://sfn.uab.es:8080/SFN
developed at the Universidad Autónoma de Barcelona and
International Computer Science Institute in Berkeley, California.
Spanish FrameNet Example
Dr. Alexandra M. Liguori Introduction to Natural Language Processing in 3 Sessions
Spanish FrameNet Example
Dr. Alexandra M. Liguori Introduction to Natural Language Processing in 3 Sessions
Spanish FrameNet Example
Dr. Alexandra M. Liguori Introduction to Natural Language Processing in 3 Sessions
Spanish FrameNet Example
Dr. Alexandra M. Liguori Introduction to Natural Language Processing in 3 Sessions
Spanish FrameNet Example
Dr. Alexandra M. Liguori Introduction to Natural Language Processing in 3 Sessions
Spanish FrameNet Example
Dr. Alexandra M. Liguori Introduction to Natural Language Processing in 3 Sessions
Spanish FrameNet Example
Dr. Alexandra M. Liguori Introduction to Natural Language Processing in 3 Sessions
Spanish FrameNet Example
Dr. Alexandra M. Liguori Introduction to Natural Language Processing in 3 Sessions
Spanish FrameNet Example
Dr. Alexandra M. Liguori Introduction to Natural Language Processing in 3 Sessions
Spanish FrameNet Example
Dr. Alexandra M. Liguori Introduction to Natural Language Processing in 3 Sessions
Frames and GATE
And now...
an example in English implementing Frames, lexical units (LUs),
and frame elements (FEs) with GATE!
GATE
https://gate.ac.uk/
Developed at the University of Sheffield, UK.
Outline: Lecture 3
1 Recap: Typical NLP tasks
2 Automatic Question Answering
3 Reference resolution
4 Named Entity Recognition (NER)
5 Keyword / topic / information extraction
Typical NLP tasks: Basic and simpler tasks
Tokenization → RegEx
Sentence splitting → RegEx
POS-tagging → POS-tagging algorithms and tag sets
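A minimal sketch of how tokenization and sentence splitting can be done with regular expressions. The two patterns below are assumptions chosen for illustration, not the ones used in the lectures, and they only cover simple English text.

```python
import re

# Assumed token pattern: a word (optionally with internal apostrophes or
# hyphens, e.g. "I'm") or a single punctuation mark.
TOKEN_RE = re.compile(r"\w+(?:['\u2019-]\w+)*|[^\w\s]")

# Assumed sentence boundary: whitespace that follows ".", "!" or "?"
# and precedes an uppercase letter.
SENTENCE_BOUNDARY_RE = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def split_sentences(text):
    """Split text into sentences at the assumed boundary pattern."""
    return [s for s in SENTENCE_BOUNDARY_RE.split(text.strip()) if s]


def tokenize(sentence):
    """Return the list of word and punctuation tokens in a sentence."""
    return TOKEN_RE.findall(sentence)


if __name__ == "__main__":
    for sentence in split_sentences("I saw a Honda Civic today. It was blue."):
        print(tokenize(sentence))
```

Note that this boundary rule would wrongly split after abbreviations such as "Mr." when a capitalised word follows, which is one reason real sentence splitters need more than a single pattern.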
Typical NLP tasks: Complex tasks
Lemmatization or Stemming → implementations of the Porter
Stemmer (e.g. in Java), Stanford NLP tool, GATE, ...
Syntactic parsing → Earley algorithm, CYK algorithm, GHR
algorithm, Stanford Parser (Java implementation of a
probabilistic algorithm)
Question answering, Topic extraction, NER, Semantic analysis, ...
→ ad hoc tools, e.g. dictionaries, ontologies, Frames, GATE,
NLTK, ...
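To give a feel for one of the parsing algorithms named above, here is a small sketch of a CYK recognizer. The grammar and lexicon are hypothetical toy data in Chomsky normal form, made up for this illustration; the function only answers whether a sentence is derivable and does not build a parse tree.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form (illustration only):
# each pair of right-hand-side symbols maps to the possible left-hand sides.
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}

# Hypothetical lexicon: word -> possible categories.
LEXICON = {
    "john": {"NP"},
    "bought": {"V"},
    "a": {"Det"},
    "car": {"N"},
}


def cyk_recognize(words, start="S"):
    """Return True if the toy grammar derives the word sequence from `start`."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the non-terminals that can derive words[i..j].
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i] = set(LEXICON.get(word.lower(), ()))
    # Fill in longer spans from shorter ones, trying every split point k.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):
                for left, right in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= BINARY_RULES.get((left, right), set())
    return start in table[0][n - 1]


if __name__ == "__main__":
    print(cyk_recognize("John bought a car".split()))
```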
NLP: Ambiguities and Solutions
Question Answering
Example from Who’s on first
Simple Question Answering
ELIZA
ELIZA simulated a Rogerian psychologist, and could carry on
conversations with the user like the following:
User-1: Men are all alike.
ELIZA-1: IN WHAT WAY
User-2: They’re always bugging us about something or other.
ELIZA-2: CAN YOU THINK OF A SPECIFIC EXAMPLE
User-3: Well, my boyfriend made me come here.
ELIZA-3: YOUR BOYFRIEND MADE YOU COME HERE
User-4: Yes, he says I’m depressed much of the time.
ELIZA-4: I AM SORRY TO HEAR YOU ARE DEPRESSED.
ELIZA
Regular expression substitutions
Eliza worked by having a cascade of regular expression substitutions
that each matched some part of the input lines and changed them:
change all instances of my to YOUR, and I’m to YOU ARE, etc.,
e.g.:
1 User-3: Well, my boyfriend made me come here.
ELIZA-3: YOUR BOYFRIEND MADE YOU COME HERE
2 User-4: ... I’m depressed ... .
ELIZA-4: ... YOU ARE DEPRESSED.
ELIZA
Regular expression substitutions
Eliza worked by having a cascade of regular expression substitutions
that each matched some part of the input lines and changed them:
relevant patterns in the input → create an appropriate output;
e.g.:
1 s/.* YOU ARE (depressed|sad) .*/I AM SORRY TO HEAR YOU ARE \1/
2 s/.* YOU ARE (depressed|sad) .*/WHY DO YOU THINK YOU ARE \1/
3 s/.* all .*/IN WHAT WAY/
4 s/.* always .*/CAN YOU THINK OF A SPECIFIC EXAMPLE/
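The cascade described above can be sketched in a few lines of Python. This is an illustrative approximation, not ELIZA's actual rule set: the rule lists, the function name eliza_reply, and the fallback behaviour are assumptions, and only the patterns shown on the slides are covered.

```python
import re

# Stage 1 (assumed subset): turn first-person forms into second-person ones.
PRONOUN_RULES = [
    (r"\bI'm\b", "YOU ARE"),
    (r"\bmy\b", "YOUR"),
    (r"\bme\b", "YOU"),
]

# Stage 2: look for a relevant pattern and build the reply from it.
# Rules are tried in order; the first match wins.
RESPONSE_RULES = [
    (r".* YOU ARE (depressed|sad)\b.*", r"I AM SORRY TO HEAR YOU ARE \1"),
    (r".* all\b.*", "IN WHAT WAY"),
    (r".* always\b.*", "CAN YOU THINK OF A SPECIFIC EXAMPLE"),
]


def eliza_reply(utterance):
    """Apply the pronoun substitutions, then the first matching response rule."""
    text = utterance.replace("\u2019", "'")  # normalise curly apostrophes
    for pattern, replacement in PRONOUN_RULES:
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    for pattern, template in RESPONSE_RULES:
        match = re.fullmatch(pattern, text, flags=re.IGNORECASE)
        if match:
            return match.expand(template).upper()
    # Assumed fallback: echo the transformed input without the final period.
    return text.upper().rstrip(".")


if __name__ == "__main__":
    print(eliza_reply("Men are all alike."))
    print(eliza_reply("Yes, he says I'm depressed much of the time."))
```

Unlike the dialogue on the slide, this sketch does not strip leading interjections such as "Well," from the echoed sentence.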
Reference resolution
Discourse
Gracie: Oh yeah... And then Mr. and Mrs. Jones were having
matrimonial trouble, and my brother was hired to watch Mrs. Jones.
George: Well, I imagine she was a very attractive woman.
Gracie: She was, and my brother watched her day and night for six
months.
George: Well, what happened?
Gracie: She finally got a divorce.
George: Mrs. Jones?
Gracie: No, my brother’s wife.
John went to Bill’s car dealership to check out an Acura Integra.
He looked at it for about an hour.
Reference resolution
1 Reference phenomena
2 Constraints on coreference
3 Preferences in pronoun interpretation
4 Example of algorithm for pronoun resolution
Reference resolution
Reference phenomena
1 Indefinite noun phrases ↔ I saw a Honda Civic today.
2 Definite noun phrases ↔ I saw a Honda Civic today.
The Honda Civic was blue.
3 Pronouns ↔ I saw a Honda Civic today. It was blue.
4 Demonstratives ↔ I bought a Honda Civic today. It’s
similar to the one I bought five years ago. That one was
really nice, but I like this one even better.
5 One-anaphora ↔ I saw no less than 6 Honda Civics
today. Now I want one.
Reference resolution
Constraints on coreference
1 number agreement ↔ John has a new car. It is red. /
John has a new car. They are red.
2 person and case agreement ↔ John and Mary have new
cars. They love them.
3 gender agreement ↔ John has a new car. It is attractive.
/ John has a new car. He is attractive.
4 syntactic constraints ↔ John bought himself a new car. /
John bought him a new car.
5 selectional restrictions ↔ John parked his car in the
garage. He had driven it around for hours.
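As a rough illustration of how the agreement constraints above can prune antecedent candidates, the sketch below filters mentions by number and gender only. The Mention class, the feature labels, and the pronoun table are hypothetical; syntactic constraints and selectional restrictions are not modelled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """A candidate antecedent with hand-assigned (hypothetical) features."""
    text: str
    number: str  # "sg" or "pl"
    gender: str  # "masc", "fem" or "neut"


# Assumed feature table for a few English pronouns; None means "unconstrained".
PRONOUN_FEATURES = {
    "he": ("sg", "masc"),
    "she": ("sg", "fem"),
    "it": ("sg", "neut"),
    "they": ("pl", None),
}


def compatible_antecedents(pronoun, candidates):
    """Keep only the candidates that agree with the pronoun in number and gender."""
    number, gender = PRONOUN_FEATURES[pronoun.lower()]
    return [
        m for m in candidates
        if m.number == number and (gender is None or m.gender == gender)
    ]


if __name__ == "__main__":
    # "John has a new car. It is red."
    mentions = [Mention("John", "sg", "masc"), Mention("a new car", "sg", "neut")]
    print([m.text for m in compatible_antecedents("It", mentions)])
```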
Reference resolution
Preferences in pronoun interpretation
1 recency ↔ Peter has an Audi. Bob has a Honda. Anne
likes to drive it.
2 grammatical role ↔ Peter went to the Honda dealership
with Bob. He bought a Civic. / Bob went to the Honda
dealership with Peter. He bought a Civic.
3 repeated mention ↔ Anne needed a car to drive to her
new job. She decided she wanted something roomy. Carol
went to the Honda dealership with her. She bought a Civic.
4 parallelism ↔ Anne went with Carol to the Honda
dealership. Sally went with her to the VW dealership.
5 verb semantics ↔ Peter seized the Honda pamphlet from
Bob. He loves reading about cars. / Peter passed the
Honda pamphlet to Bob. He loves reading about cars.
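Two of the preferences listed above, recency and grammatical role, can be combined into a simple score for ranking candidates. The weights and the function name rank_antecedents are assumptions made up for this sketch; it is not the pronoun resolution algorithm presented in the lecture, and repeated mention, parallelism, and verb semantics are left out. In practice, candidates would first be filtered by the agreement constraints.

```python
# Assumed illustrative weights; not taken from the slides.
ROLE_WEIGHT = {"subject": 3, "object": 2, "oblique": 1}
DISTANCE_PENALTY = 2  # points lost per sentence of distance from the latest mention


def rank_antecedents(candidates):
    """Rank (text, sentence_index, role) candidates, most preferred first."""
    if not candidates:
        return []
    latest = max(index for _, index, _ in candidates)

    def score(candidate):
        _, index, role = candidate
        recency = -(latest - index) * DISTANCE_PENALTY
        return recency + ROLE_WEIGHT.get(role, 0)

    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    # "Peter has an Audi. Bob has a Honda. Anne likes to drive it."
    print(rank_antecedents([("an Audi", 0, "object"), ("a Honda", 1, "object")]))
    # "Peter went to the Honda dealership with Bob. He bought a Civic."
    print(rank_antecedents([("Bob", 0, "oblique"), ("Peter", 0, "subject")]))
```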
NER
Named Entity Recognition
Can be broken down into two distinct problems, i.e.:
1 detection of names
2 classification of the names by the type of entity to which
they refer → 4 standard types:
1 person (e.g. "Carol", "Tom Hanks", etc.)
2 organization (e.g. "WWF", "IBM", "Bank of America", etc.)
3 location (e.g. "London", "Washington D.C.", "L.A.",
"Barcelona", etc.)
4 other (e.g. "Hotel Sunshine", etc.)
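The two sub-problems can be illustrated with a deliberately naive sketch: a regular expression detects candidate names, and a lookup table classifies them. The gazetteer is a hypothetical list built from the examples above, and the detection pattern is an assumption; the tools discussed in these lectures rely on much richer rules or statistical models.

```python
import re

# Hypothetical mini-gazetteer built from the examples on the slide.
GAZETTEER = {
    "Carol": "person",
    "Tom Hanks": "person",
    "WWF": "organization",
    "IBM": "organization",
    "Bank of America": "organization",
    "London": "location",
    "Barcelona": "location",
}

# Step 1 (detection), assumed pattern: runs of capitalised words,
# optionally linked by "of" (as in "Bank of America").
NAME_RE = re.compile(r"[A-Z]\w*(?:\s+(?:of\s+)?[A-Z]\w*)*")


def recognize_entities(text):
    """Return (name, type) pairs; names missing from the gazetteer are 'other'."""
    entities = []
    for match in NAME_RE.finditer(text):
        name = match.group()
        # Step 2 (classification): look the detected name up in the gazetteer.
        entities.append((name, GAZETTEER.get(name, "other")))
    return entities


if __name__ == "__main__":
    print(recognize_entities("actor Tom Hanks visited Bank of America in London."))
```

A capitalised word at the start of a sentence would also be detected as a name by this pattern, which shows why detection is a problem in its own right.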
NER
Tools for Named Entity Recognition
GATE: for English, Spanish, and many more, via graphical
interface and Java API (developed at the University of
Sheffield, UK)
https://gate.ac.uk/
NETagger: Java-based Illinois Named Entity Recognition
(developed by the Cognitive Computation Group at the University of
Illinois at Urbana-Champaign)
http://cogcomp.cs.illinois.edu/page/software_view/NETagger
OpenNLP: rule-based and statistical Named Entity Recognition
(developed by Apache)
http://opennlp.apache.org/index.html
Stanford CoreNLP: Java-based CRF Named Entity Recognition
(developed by the Stanford Natural Language Processing Group)
http://nlp.stanford.edu/software/CRF-NER.shtml
Keyword / topic / information extraction
Tools
Keyword extraction: e.g. GATE (ANNIE tool) for English,
Spanish, and many more, via graphical interface and Java
API
→ simply using JAPE files for the LUs
tool from Volker ?
Topic / information extraction: e.g. GATE (ANNIE tool)
for English, Spanish, and many more, via graphical
interface and Java API
→ using JAPE files for the LUs, FEs, and FRAMES
GATE
https://gate.ac.uk/
Developed at the University of Sheffield, UK
Thank you for your attention!
For more information:
Example text book: Speech and Language Processing
by D. Jurafsky and J. H. Martin
Web page: www.alexandramliguoriphd.com
Linkedin profile: Alexandra M. Liguori, Ph.D.
DPG_Talk_March2011_AlexandraM_Liguori
 
QuantumBiology_AlexandraM_Liguori
QuantumBiology_AlexandraM_LiguoriQuantumBiology_AlexandraM_Liguori
QuantumBiology_AlexandraM_Liguori
 
Benasque_Sept2010_AlexandraM_Liguori
Benasque_Sept2010_AlexandraM_LiguoriBenasque_Sept2010_AlexandraM_Liguori
Benasque_Sept2010_AlexandraM_Liguori
 
Quantum_Mechanics_Intro_AlexandraM_Liguori
Quantum_Mechanics_Intro_AlexandraM_LiguoriQuantum_Mechanics_Intro_AlexandraM_Liguori
Quantum_Mechanics_Intro_AlexandraM_Liguori
 

NLP_lectures_English

  • 1. Introduction to Natural Language Processing in 3 Sessions Dr. Alexandra M. Liguori Incubio – The Big Data Academy Barcelona, March - April, 2015 Dr. Alexandra M. Liguori Introduction to Natural Language Processing in 3 Sessions
  • 2. Outline: Lecture 1
      1. Introduction
      2. Natural Language Processing
      3. Linguistic Ambiguities
      4. Definition of corpus
      5. Typical NLP tasks
      6. POS-tagging
  • 3. Outline: Lecture 2
      1. Recap: Typical NLP tasks → practical examples with GATE
      2. Definition of semantics
      3. Frames approach
         1. FrameNet
         2. GATE for semantic/content analysis
  • 4. Outline: Lecture 3
      1. Recap: Typical NLP tasks
      2. Automatic Question Answering
      3. Reference resolution
      4. Named Entity Recognition (NER)
      5. Keyword / topic / information extraction
  • 5. Welcome! Here we go...!!!
      Main references:
      Textbook: Speech and Language Processing by D. Jurafsky and J. H. Martin
      English FrameNet: https://framenet.icsi.berkeley.edu/fndrupal/
      Spanish FrameNet: http://sfn.uab.es:8080/SFN
      GATE: https://gate.ac.uk/
  • 6. Outline: Lecture 1
      1. Introduction
      2. Natural Language Processing
      3. Linguistic Ambiguities
      4. Definition of corpus
      5. Typical NLP tasks
      6. POS-tagging
  • 7. Introduction: Intelligent machines?
      Video: https://www.youtube.com/watch?v=dSIKBliboIo
      (Stanley Kubrick and Arthur C. Clarke, screenplay of 2001: A Space Odyssey)
  • 8. Introduction: Intelligent machines?
      Dave Bowman: Open the pod bay doors, HAL.
      HAL: I’m sorry Dave, I’m afraid I can’t do that.
      (Stanley Kubrick and Arthur C. Clarke, screenplay of 2001: A Space Odyssey)
  • 16. Introduction: Intelligent machines?
      1. Phonetics and phonology
      2. Morphology → producing the contractions I’m and can’t
      3. Syntax → cf. "Open the pod bay doors, HAL." vs. "HAL, the pod bay door is open." vs. "HAL, is the pod bay door open?"
      4. Lexical semantics → meaning of the component words
      5. Compositional semantics → knowledge of how components combine to form larger meanings
      6. Pragmatics → cf. "I’m sorry ..., I’m afraid I can’t" vs. "No, I won’t open the door." vs. "No."
      7. Discourse conventions → engaging in structured conversation, e.g. using the reference "that" in "I’m sorry Dave, I’m afraid I can’t do that"
  • 18. Natural Language Processing
      NLP: techniques that process written human language as language.
      Applications:
        word counting
        automatic hyphenation
        automated question answering
        named entity extraction (NER)
        information/content extraction
        semantic analysis
        sentiment analysis
        machine translation
  • 21. Natural Language Processing
      An ideal NLP team is very interdisciplinary, including:
        Language experts (linguists)
        Maths experts (mathematicians, physicists, statisticians)
        Programmers (computer scientists)
  • 22. NLP: Maths & Computer Science
  • 28. NLP: Six categories of linguistic knowledge
      1. Phonetics and phonology ↔ red - read - read; sleigh - slay
      2. Morphology ↔ I/you/we/you/they walk - he/she/it walks; walked; walking
      3. Syntax ↔ She ate a mammoth breakfast - She eating a mammoth breakfast
      4. Semantics ↔ book (verb) - book (noun); duck (verb) - duck (noun)
      5. Pragmatics ↔ open the door - can you open the door? - could you open the door, please?
  • 31. NLP: Six categories of linguistic knowledge
      6. Discourse
        Gracie: Oh yeah... And then Mr. and Mrs. Jones were having matrimonial trouble, and my brother was hired to watch Mrs. Jones.
        George: Well, I imagine she was a very attractive woman.
        Gracie: She was, and my brother watched her day and night for six months.
        George: Well, what happened?
        Gracie: She finally got a divorce.
        George: Mrs. Jones?
        Gracie: No, my brother’s wife.

        John went to Bill’s car dealership to check out an Acura Integra. He looked at it for about an hour.
  • 32. NLP: Ambiguities and Solutions
  • 43. Linguistic Ambiguities
      Example: I made her duck.
      Five possible interpretations:
        1. I cooked waterfowl for her.
        2. I cooked waterfowl belonging to her.
        3. I created the (plaster?) duck she owns.
        4. I caused her to quickly lower her head or body.
        5. I waved my magic wand and turned her into undifferentiated waterfowl.
  • 47. Linguistic Ambiguities
      Morphological ambiguity:
        duck: verb or noun
        her: dative pronoun or possessive pronoun
      Syntactic ambiguity: make
        transitive: taking a single direct object (case 2)
        ditransitive: taking two objects, meaning that the first object (her) got made into the second object (duck)
        taking a direct object and a verb, meaning that the object (her) got caused to perform the verbal action (duck)
      Semantic ambiguity: make
        cook
        create
  • 49. Corpus
      Definition: Corpus = large and structured set of texts.
      NLP: two types of corpora:
        Training corpus ↔ to make the list of rules or to get the statistical data
        Test corpus ↔ to test the results found with the training corpus
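
The training/test distinction above can be sketched in a few lines of Python. This is only an illustration: the function name `split_corpus`, the 80/20 ratio and the fixed random seed are assumptions made for the example, not something the slides prescribe.

```python
import random


def split_corpus(documents, test_fraction=0.2, seed=0):
    """Shuffle a list of texts and return (training_corpus, test_corpus)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(documents)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical toy corpus of ten "documents".
corpus = [f"document {i}" for i in range(10)]
training_corpus, test_corpus = split_corpus(corpus)
```

The key design point is that the two parts never overlap: rules or statistics are derived from the training corpus only, and the test corpus stays unseen until evaluation.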
  • 56. Typical NLP tasks: Basic and simpler tasks
      Tokenization: RegEx
      Sentence splitting: RegEx
      POS-tagging: POS-tagging algorithms and tag sets
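
As a rough illustration of how regular expressions can handle the first two tasks, here is a minimal Python sketch. Both patterns are assumptions chosen for the example; real tokenizers and sentence splitters (for instance those in GATE) use far more elaborate rules.

```python
import re

# A token is either a word (optionally with internal apostrophes, as in "can’t")
# or a single punctuation character.
TOKEN_RE = re.compile(r"\w+(?:['’]\w+)*|[^\w\s]")

# A sentence boundary is whitespace that follows ., ! or ? and precedes a capital letter.
SENTENCE_BOUNDARY_RE = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def tokenize(text):
    """Return the list of word and punctuation tokens in text."""
    return TOKEN_RE.findall(text)


def split_sentences(text):
    """Split text into sentences at the naive boundary pattern."""
    return [s.strip() for s in SENTENCE_BOUNDARY_RE.split(text) if s.strip()]


text = "Open the pod bay doors, HAL. I’m sorry Dave, I’m afraid I can’t do that."
sentences = split_sentences(text)
tokens = tokenize(sentences[1])
```

Note that this naive boundary pattern would wrongly split after abbreviations such as "Mr." when a capitalized name follows, which is one reason practical splitters add abbreviation lists or learned models.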
  • 63. Typical NLP tasks: Complex tasks
      Lemmatization or Stemming: implementations of Porter Stemmer (e.g. in Java), Stanford NLP tool, GATE, ...
      Syntactic parsing: Earley algorithm, CYK algorithm, GHR algorithm, Stanford Parser (Java implementation of probabilistic algorithm)
      Topic extraction, NER, Semantic analysis, ...: ad hoc tools, e.g. dictionaries, ontologies, Frames, GATE, NLTK, ...
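
To give a feel for one of the parsing algorithms named above, the following is a minimal CYK recognizer in Python. The toy grammar and lexicon are hypothetical and exist only for this example; CYK requires the grammar to be in Chomsky normal form, and the sketch only answers whether a sentence is grammatical, without building a parse tree.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form: (left child, right child) -> parents.
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("DT", "NN"): {"NP"},
    ("DT", "NOM"): {"NP"},
    ("JJ", "NN"): {"NOM"},
    ("VBD", "NP"): {"VP"},
}
# Hypothetical lexicon: word -> possible non-terminals.
LEXICON = {
    "she": {"NP"},
    "ate": {"VBD"},
    "a": {"DT"},
    "mammoth": {"JJ"},
    "breakfast": {"NN"},
}


def cyk_recognize(words, start="S"):
    """Return True if the toy grammar derives the word sequence."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds every non-terminal that derives words[i..j] (inclusive).
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i] = set(LEXICON.get(word.lower(), ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):  # split point between the two children
                for left, right in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= BINARY_RULES.get((left, right), set())
    return start in table[0][n - 1]


grammatical = cyk_recognize("She ate a mammoth breakfast".split())
ungrammatical = cyk_recognize("She eating a mammoth breakfast".split())
```

With this toy grammar the first sentence is accepted and the second is rejected, simply because "eating" is not in the lexicon as a finite verb.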
  • 66. POS-tagging
      Example with Penn Treebank POS-tags:
        A/DT woman/NN came/VBD from/IN the/DT back/NN of/IN the/DT store/NN ./.
        She/PP appeared/VBD to/TO be/VB sleepy/JJ and/CC quite/RB a/DT bit/NN younger/JJR than/IN Mr./NNP Dobbs/NNP and/CC to/TO be/VB wearing/VBG too/RB much/RB makeup/NN ./.
  • 69. POS-tagging
      Example of ambiguity:
        1. Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NN ./.
        2. People/NNS continue/VBP to/TO inquire/VB the/DT reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN ./.
  • 70. POS-tagging
      Three main tagging algorithms or methods:
        1. rule-based tagging, e.g. ENGTWOL
        2. stochastic tagging, e.g. HMM tagger
        3. transformation-based tagging, e.g. Brill tagger
  • 74. Rule-based POS-tagging
      Example of ambiguity:
        1. Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NN ./.
        2. People/NNS continue/VBP to/TO inquire/VB the/DT reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN ./.
      Large database of hand-written disambiguation rules, e.g.:
        TO + VB → YES
        TO + NN → NO
        DT + NN → YES
        DT + VB → NO
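
The four example rules can be written down as a small lookup table. The sketch below is a toy in Python: the names `ALLOWED` and `disambiguate` are made up for the illustration, and a real rule-based tagger such as ENGTWOL relies on a much larger set of richer constraints.

```python
# The four hand-written rules from the slide: (previous tag, candidate tag) -> allowed?
ALLOWED = {
    ("TO", "VB"): True,
    ("TO", "NN"): False,
    ("DT", "NN"): True,
    ("DT", "VB"): False,
}


def disambiguate(prev_tag, candidates):
    """Drop candidate tags that a rule forbids after prev_tag.

    Pairs not covered by any rule are kept, so the rules only ever remove readings.
    """
    return [tag for tag in candidates if ALLOWED.get((prev_tag, tag), True)]


# "race" can be a noun or a verb; the preceding tag decides.
after_to = disambiguate("TO", ["NN", "VB"])   # ... to/TO race/?
after_dt = disambiguate("DT", ["NN", "VB"])   # ... the/DT race/?
```

After TO only the verb reading survives, and after DT only the noun reading survives, matching the two example sentences.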
  • 78. Stochastic POS-tagging
      Example of ambiguity:
        1. Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NN ./.
        2. People/NNS continue/VBP to/TO inquire/VB the/DT reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN ./.
      Training corpus to compute probability of given word having given tag in given context, e.g.:
        is/VBZ expected/VBN to/TO race/VB → 98%
        is/VBZ expected/VBN to/TO race/NN → 2%
        reason/NN for/IN the/DT race/NN → 97%
        reason/NN for/IN the/DT race/VB → 3%
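
A minimal sketch of how such probabilities could be estimated from a tagged training corpus is given below in Python. It is a simplification made for illustration: the context is reduced to the previous tag, the three training sentences are a hypothetical toy corpus, and an actual HMM tagger would instead combine tag-transition and word-emission probabilities and decode with the Viterbi algorithm.

```python
from collections import Counter, defaultdict


def parse_tagged(line):
    """Turn 'to/TO race/VB' into [('to', 'TO'), ('race', 'VB')]."""
    return [tuple(token.rsplit("/", 1)) for token in line.split()]


def train(tagged_sentences):
    """Count how often each word receives each tag after a given previous tag."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        prev_tag = "<s>"  # marker for the start of a sentence
        for word, tag in sentence:
            counts[(prev_tag, word.lower())][tag] += 1
            prev_tag = tag
    return counts


def tag_probabilities(counts, prev_tag, word):
    """Relative-frequency estimate of P(tag | previous tag, word)."""
    seen = counts.get((prev_tag, word.lower()), Counter())
    total = sum(seen.values())
    return {tag: n / total for tag, n in seen.items()} if total else {}


# Hypothetical toy training corpus in word/TAG format.
training_corpus = [
    parse_tagged("Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NN ./."),
    parse_tagged("the/DT race/NN for/IN outer/JJ space/NN ./."),
    parse_tagged("the/DT reason/NN for/IN the/DT race/NN ./."),
]
counts = train(training_corpus)
race_after_to = tag_probabilities(counts, "TO", "race")
race_after_dt = tag_probabilities(counts, "DT", "race")
```

With only three sentences the estimates are degenerate (each reading has probability 1 in its context); percentages such as those on the slide only emerge from a large training corpus.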
  • 85. Transformation-based POS-tagging
      Example of ambiguity:
        Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NN ./.
        People/NNS continue/VBP to/TO inquire/VB the/DT reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN ./.
      Rules automatically induced from data using Machine Learning techniques, e.g.:
        1. A priori, prob(race = NN) = 65%, prob(race = VB) = 35% → the system would always take race = NN
        2. Machine Learning is used to learn conditional probabilities:
        3. is/VBZ expected/VBN to/TO race/VB → 98%; reason/NN for/IN the/DT race/NN → 97%
        4. The system takes race = NN or race = VB depending on context.
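
The two-stage idea (start from the most likely tag, then correct it in context) can be sketched as follows in Python. The lexicon and the single transformation are written by hand here purely for illustration; a Brill tagger learns its transformations from a tagged training corpus, keeping at each step the rule that removes the most errors. The rule used, "change NN to VB when the previous tag is TO", is the standard textbook example.

```python
# Hypothetical lexicon of each word's most frequent tag (race defaults to NN).
MOST_LIKELY_TAG = {
    "is": "VBZ", "expected": "VBN", "to": "TO", "race": "NN", "tomorrow": "NN",
    "the": "DT", "reason": "NN", "for": "IN", "outer": "JJ", "space": "NN",
}

# Each transformation: (from_tag, to_tag, required previous tag).
TRANSFORMATIONS = [("NN", "VB", "TO")]


def initial_tags(words):
    """Stage 1: give every word its most frequent tag (unknown words default to NN)."""
    return [MOST_LIKELY_TAG.get(word.lower(), "NN") for word in words]


def apply_transformations(tags, transformations):
    """Stage 2: rewrite tags wherever a transformation's context matches."""
    tags = list(tags)
    for from_tag, to_tag, prev_required in transformations:
        for i in range(1, len(tags)):
            if tags[i] == from_tag and tags[i - 1] == prev_required:
                tags[i] = to_tag
    return tags


verb_context = "is expected to race tomorrow".split()
noun_context = "the reason for the race for outer space".split()
verb_tags = apply_transformations(initial_tags(verb_context), TRANSFORMATIONS)
noun_tags = apply_transformations(initial_tags(noun_context), TRANSFORMATIONS)
```

In the first phrase the initial NN for "race" is corrected to VB because it follows TO; in the second phrase the initial NN is left untouched.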
  • 86. POS-tagging
      POS-tagging tools for English:
        Brill tagger
        Stanford Log-linear POS-tagger (Java)
        POS-tagger integrated in GATE (Java)
        POS-tagger with NLTK (Python)
  • 87. Outline: Lecture 2
      1. Recap: Typical NLP tasks → practical examples with GATE
      2. Definition of semantics
      3. Frames approach
         1. FrameNet
         2. GATE for semantic/content analysis
  • 94. Typical NLP tasks: Basic and simpler tasks
      Tokenization: RegEx
      Sentence splitting: RegEx
      POS-tagging: POS-tagging algorithms and tag sets
  • 99. Typical NLP tasks: Complex tasks
      Lemmatization or Stemming: implementations of Porter Stemmer (e.g. in Java), Stanford NLP tool, GATE, ...
      Topic extraction, NER, Semantic analysis, ...: ad hoc tools, e.g. dictionaries, ontologies, Frames, GATE, NLTK, ...
  • 101. EX.: GATE
      Concrete examples with GATE:
        1. Tokenizer
        2. Sentence-splitter
        3. POS-tagger
        4. Stemmer
      GATE (https://gate.ac.uk/) is developed at the University of Sheffield, UK.
  • 102. Semantics
      ‘Then you should say what you mean,’ the March Hare went on.
      ‘I do,’ Alice hastily replied; ‘at least, I mean what I say – that’s the same thing, you know.’
      ‘Not the same thing a bit!’ said the Hatter. ‘You might just as well say that “I see what I eat” is the same thing as “I eat what I see”!’
      Lewis Carroll, Alice in Wonderland
  • 104. Frames and FrameNet
      Frame: a schematic representation of a situation involving various participants and other conceptual roles. E.g.:
        Abby bought a car from Robin for $5,000.
        Robin sold a car to Abby for $5,000.
      English FrameNet (https://framenet.icsi.berkeley.edu/fndrupal/) is developed at the International Computer Science Institute in Berkeley, California.
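
The point of the buy/sell example is that both sentences describe the same situation with the same participants. A minimal Python sketch of that idea follows; the class name, the role names and the two helper functions are assumptions made for the illustration and are not taken from the FrameNet data or any FrameNet API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommercialTransaction:
    """A hypothetical frame instance with four participant roles."""
    buyer: str
    seller: str
    goods: str
    money: str


def from_buy(subject, goods, source, price):
    """'<subject> bought <goods> from <source> for <price>': the subject is the buyer."""
    return CommercialTransaction(buyer=subject, seller=source, goods=goods, money=price)


def from_sell(subject, goods, recipient, price):
    """'<subject> sold <goods> to <recipient> for <price>': the subject is the seller."""
    return CommercialTransaction(buyer=recipient, seller=subject, goods=goods, money=price)


bought = from_buy("Abby", "a car", "Robin", "$5,000")
sold = from_sell("Robin", "a car", "Abby", "$5,000")
same_situation = bought == sold
```

The two verbs assign the grammatical subject to different roles, yet the resulting frame instances are identical, which is what makes frames useful for comparing sentences by meaning rather than by surface form.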
  • 105-114. English FrameNet Example
  • 115. Frame Relations
      FrameNet additionally captures relationships between different frames using relations. These include the following:
      • Inheritance: one frame is a more specific version of another, more abstract parent frame. Anything that is true of the parent frame must also be true of the child frame, and a mapping is specified between the frame elements of the parent and those of the child.
      • Perspectivized-in: a neutral frame (like Commerce-transfer-goods) is connected to a frame with a specific perspective on the same scenario (e.g. the Commerce-sell frame, which assumes the perspective of the seller, or the Commerce-buy frame, which assumes the perspective of the buyer).
      • Subframe: some frames, like the Criminal-process frame, refer to complex scenarios that consist of several individual states or events, which can be described by separate frames like Arrest, Trial, and so on.
  • 116. Frame Relations
      • Precedes: captures a temporal order that holds between subframes of a complex scenario.
      • Causative-of and Inchoative-of: there is a fairly systematic relationship between stative descriptions (e.g. the Position-on-a-scale frame, "She had a high salary"), causative descriptions (Cause-change-of-scalar-position, "She raised his salary"), and inchoative descriptions (Change-position-on-a-scale, "Her salary increased").
      • Using: holds between a frame and another frame that it in some way involves. For instance, the Judgment-communication frame uses both the Judgment frame and the Statement frame, but does not inherit from either of them because there is no clear correspondence between the frame elements.
      • See-also: connects frames that bear some resemblance but need to be distinguished carefully.
  • 118. Spanish FrameNet
      Frame: a schematic representation of a situation involving various participants and other conceptual roles. E.g.:
      El rock influye en los artistas de hoy en día para sus producciones artísticas. (Rock influences today's artists in their artistic productions.)
      Los artistas de hoy en día se inspiran al rock para sus producciones artísticas. (Today's artists draw inspiration from rock for their artistic productions.)
      Spanish FrameNet (http://sfn.uab.es:8080/SFN): developed at the Universidad Autónoma de Barcelona and the International Computer Science Institute in Berkeley, California.
  • 119-128. Spanish FrameNet Example [screenshots]
  • 130. Frames and GATE
      And now: an exercise in English implementing frames, lexical units (LUs), and frame elements (FEs) with GATE.
      GATE (https://gate.ac.uk/): developed at the University of Sheffield, UK.
  • 131. Outline: Lecture 3
      1 Recap: Typical NLP tasks
      2 Automatic Question Answering
      3 Reference resolution
      4 Named Entity Recognition (NER)
      5 Keyword / topic / information extraction
  • 138. Typical NLP tasks: Basic and simpler tasks
      Tokenization → RegEx
      Sentence splitting → RegEx
      POS-tagging → POS-tagging algorithms and tag sets
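As a rough illustration of how regular expressions can handle the first two tasks, here is a minimal Python sketch. The two patterns and the function names `tokenize` and `split_sentences` are assumptions made for this example; they are not the rules used by GATE or any other tool mentioned in the slides.

```python
import re

# A token is a run of word characters (optionally with an internal
# apostrophe, as in "can't") or a single punctuation character.
TOKEN_RE = re.compile(r"\w+(?:'\w+)?|[^\w\s]")

# A sentence boundary is whitespace that follows ., ! or ? and
# precedes an upper-case letter.
SENTENCE_BOUNDARY_RE = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def tokenize(text):
    """Return the list of word and punctuation tokens in text."""
    return TOKEN_RE.findall(text)


def split_sentences(text):
    """Split text into sentences at the boundaries defined above."""
    return [s for s in SENTENCE_BOUNDARY_RE.split(text.strip()) if s]
```

For example, `tokenize("Open the pod bay doors, HAL.")` separates the comma and the final period from the words. The sentence rule is deliberately naive: it would also split after an abbreviation such as "Mr." in "Mr. Jones", which is one reason practical sentence splitters add abbreviation lists or learned models on top of patterns like these.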
  • 145. Typical NLP tasks: Complex tasks
      Lemmatization or stemming → implementations of the Porter stemmer (e.g. in Java), Stanford NLP tools, GATE, ...
      Syntactic parsing → Earley algorithm, CYK algorithm, GHR algorithm, Stanford Parser (Java implementation of a probabilistic algorithm)
      Question answering, topic extraction, NER, semantic analysis, ... → ad hoc tools, e.g. dictionaries, ontologies, Frames, GATE, NLTK, ...
  • 146-147. NLP: Ambiguities and Solutions [figures]
  • 148-149. Question Answering
      Example from "Who's on First?" [figures]
  • 150. Simple Question Answering
      ELIZA
      ELIZA simulated a Rogerian psychologist and could carry on conversations with the user like the following:
      User-1: Men are all alike.
      ELIZA-1: IN WHAT WAY
      User-2: They're always bugging us about something or other.
      ELIZA-2: CAN YOU THINK OF A SPECIFIC EXAMPLE
      User-3: Well, my boyfriend made me come here.
      ELIZA-3: YOUR BOYFRIEND MADE YOU COME HERE
      User-4: Yes, he says I'm depressed much of the time.
      ELIZA-4: I AM SORRY TO HEAR YOU ARE DEPRESSED.
  • 153. ELIZA
      Regular expression substitutions
      ELIZA worked through a cascade of regular expression substitutions, each of which matched some part of the input line and changed it. First, change all instances of my to YOUR, I'm to YOU ARE, etc., e.g.:
      1 User-3: Well, my boyfriend made me come here.
        ELIZA-3: YOUR BOYFRIEND MADE YOU COME HERE
      2 User-4: ... I'm depressed ... .
        ELIZA-4: ... YOU ARE DEPRESSED.
  • 158. ELIZA
      Regular expression substitutions
      ELIZA worked through a cascade of regular expression substitutions, each of which matched some part of the input line and changed it. Next, look for relevant patterns in the input and create an appropriate output, e.g.:
      1 s/.* YOU ARE (depressed|sad) .*/I AM SORRY TO HEAR YOU ARE \1/
      2 s/.* YOU ARE (depressed|sad) .*/WHY DO YOU THINK YOU ARE \1/
      3 s/.* all .*/IN WHAT WAY/
      4 s/.* always .*/CAN YOU THINK OF A SPECIFIC EXAMPLE/
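The two-stage cascade described on these slides can be sketched in a few lines of Python. This is an illustrative reconstruction, not the original ELIZA program: the function name `eliza_reply`, the use of word-boundary anchors and case-insensitive matching, and the fallback of echoing the transformed input are all assumptions made for this example.

```python
import re

# Stage 1: swap first-person forms for second-person ones.
PRONOUN_SWAPS = [
    (re.compile(r"\bI'm\b", re.IGNORECASE), "YOU ARE"),
    (re.compile(r"\bmy\b", re.IGNORECASE), "YOUR"),
    (re.compile(r"\bme\b", re.IGNORECASE), "YOU"),
]

# Stage 2: the first pattern that matches the whole line decides the reply.
RESPONSE_RULES = [
    (re.compile(r".*\bYOU ARE (depressed|sad)\b.*", re.IGNORECASE),
     r"I AM SORRY TO HEAR YOU ARE \1"),
    (re.compile(r".*\ball\b.*", re.IGNORECASE), "IN WHAT WAY"),
    (re.compile(r".*\balways\b.*", re.IGNORECASE),
     "CAN YOU THINK OF A SPECIFIC EXAMPLE"),
]


def eliza_reply(utterance):
    """Apply the pronoun swaps, then the first matching response rule."""
    text = utterance.replace("\u2019", "'")  # normalize curly apostrophes
    for pattern, replacement in PRONOUN_SWAPS:
        text = pattern.sub(replacement, text)
    for pattern, template in RESPONSE_RULES:
        match = pattern.fullmatch(text)
        if match:
            # expand() fills \1 with the captured word, e.g. "depressed"
            return match.expand(template).upper()
    # Assumed fallback: echo the transformed input without final punctuation.
    return text.rstrip(".!? ").upper()
```

With these rules, `eliza_reply("Men are all alike.")` triggers the "all" rule, and "Yes, he says I'm depressed much of the time." is first rewritten to contain YOU ARE and then caught by the first response rule. The order of the rules matters, since only the first match is used.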
  • 161. Reference resolution
      Discourse
      Gracie: Oh yeah... And then Mr. and Mrs. Jones were having matrimonial trouble, and my brother was hired to watch Mrs. Jones.
      George: Well, I imagine she was a very attractive woman.
      Gracie: She was, and my brother watched her day and night for six months.
      George: Well, what happened?
      Gracie: She finally got a divorce.
      George: Mrs. Jones?
      Gracie: No, my brother's wife.

      John went to Bill's car dealership to check out an Acura Integra. He looked at it for about an hour.
  • 166. Reference resolution
      1 Reference phenomena
      2 Constraints on coreference
      3 Preferences in pronoun interpretation
      4 Example of algorithm for pronoun resolution
  • 172. Reference resolution
      Reference phenomena
      1 Indefinite noun phrases ↔ I saw a Honda Civic today.
      2 Definite noun phrases ↔ I saw a Honda Civic today. The Honda Civic was blue.
      3 Pronouns ↔ I saw a Honda Civic today. It was blue.
      4 Demonstratives ↔ I bought a Honda Civic today. It's similar to the one I bought five years ago. That one was really nice, but I like this one even better.
      5 One-anaphora ↔ I saw no less than 6 Honda Civics today. Now I want one.
  • 178. Reference resolution
      Constraints on coreference
      1 number agreement ↔ John has a new car. It is red. / John has a new car. They are red.
      2 person and case agreement ↔ John and Mary have new cars. They love them.
      3 gender agreement ↔ John has a new car. It is attractive. / John has a new car. He is attractive.
      4 syntactic constraints ↔ John bought himself a new car. / John bought him a new car.
      5 selectional restrictions ↔ John parked his car in the garage. He had driven it around for hours.
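The number and gender constraints above act as hard filters: a candidate antecedent that disagrees with the pronoun is ruled out. A minimal sketch of such a filter follows. The `Mention` class, the hand-assigned feature values, and the small pronoun table are assumptions for illustration; a real system would derive these features from a tagger or lexicon, and this table ignores singular "they".

```python
from dataclasses import dataclass


@dataclass
class Mention:
    text: str
    number: str   # "sg" or "pl"
    gender: str   # "m", "f", or "n" (neuter / non-person)


# Features required by each pronoun; None means "no gender restriction".
PRONOUN_FEATURES = {
    "he": ("sg", "m"), "him": ("sg", "m"),
    "she": ("sg", "f"), "her": ("sg", "f"),
    "it": ("sg", "n"),
    "they": ("pl", None), "them": ("pl", None),
}


def compatible_antecedents(pronoun, candidates):
    """Keep only the candidates that agree with the pronoun in number and gender."""
    number, gender = PRONOUN_FEATURES[pronoun.lower()]
    return [
        m for m in candidates
        if m.number == number and (gender is None or m.gender == gender)
    ]


# Candidates from "John has a new car."
john = Mention("John", "sg", "m")
car = Mention("a new car", "sg", "n")
```

For the slide's examples, "It" keeps only the car, "He" keeps only John, and "They" finds no compatible antecedent at all. Syntactic constraints and selectional restrictions need parse and lexical information and are not covered by this sketch.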
  • 184. Reference resolution
      Preferences in pronoun interpretation
      1 recency ↔ Peter has an Audi. Bob has a Honda. Anne likes to drive it.
      2 grammatical role ↔ Peter went to the Honda dealership with Bob. He bought a Civic. / Bob went to the Honda dealership with Peter. He bought a Civic.
      3 repeated mention ↔ Anne needed a car to drive to her new job. She decided she wanted something roomy. Carol went to the Honda dealership with her. She bought a Civic.
      4 parallelism ↔ Anne went with Carol to the Honda dealership. Sally went with her to the VW dealership.
      5 verb semantics ↔ Peter seized the Honda pamphlet from Bob. He loves reading about cars. / Peter passed the Honda pamphlet to Bob. He loves reading about cars.
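Unlike the hard constraints, these preferences only rank the candidates that remain. One common way to combine them is a salience score. The sketch below covers recency, grammatical role, and repeated mention only; the `Candidate` class, the scoring formula, and the three weights are arbitrary assumptions chosen for illustration, not values taken from the lecture or from a published algorithm.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    sentence_index: int   # sentence of the most recent mention
    is_subject: bool      # was that mention a grammatical subject?
    mention_count: int    # how often the entity has been mentioned so far


# Arbitrary illustrative weights.
RECENCY_WEIGHT = 3.0
SUBJECT_WEIGHT = 2.0
REPEAT_WEIGHT = 1.0


def salience(candidate, pronoun_sentence_index):
    """Score a candidate: recent, subject, and often-mentioned entities rank higher."""
    distance = pronoun_sentence_index - candidate.sentence_index
    recency = RECENCY_WEIGHT / (1 + distance)
    role = SUBJECT_WEIGHT if candidate.is_subject else 0.0
    repeats = REPEAT_WEIGHT * (candidate.mention_count - 1)
    return recency + role + repeats


def resolve(candidates, pronoun_sentence_index):
    """Pick the candidate with the highest salience score."""
    return max(candidates, key=lambda c: salience(c, pronoun_sentence_index))


# Recency: "Peter has an Audi. Bob has a Honda. Anne likes to drive it."
audi = Candidate("an Audi", 0, False, 1)
honda = Candidate("a Honda", 1, False, 1)

# Grammatical role: "Peter went to the Honda dealership with Bob. He bought a Civic."
peter = Candidate("Peter", 0, True, 1)
bob = Candidate("Bob", 0, False, 1)

# Repeated mention: Anne is mentioned several times before "She bought a Civic."
anne = Candidate("Anne", 2, False, 4)
carol = Candidate("Carol", 2, True, 1)
```

With these weights, "it" resolves to the Honda, "He" to Peter, and the final "She" to Anne, because her repeated mentions outweigh Carol's subject position. Parallelism and verb semantics are not modeled here.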
  • 192. NER
      Named Entity Recognition
      Can be broken down into two distinct problems:
      1 detection of names
      2 classification of the names by the type of entity to which they refer → 4 standard types:
        1 person (e.g. "Carol", "Tom Hanks", etc.)
        2 organization (e.g. "WWF", "IBM", "Bank of America", etc.)
        3 location (e.g. "London", "Washington D.C.", "L.A.", "Barcelona", etc.)
        4 other (e.g. "Hotel Sunshine", etc.)
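The two sub-problems can be illustrated with a toy recognizer: a regular expression detects candidate names, and a lookup table classifies them. Everything here is an assumption made for the example (the pattern, the tiny `GAZETTEER`, and the function `recognize`); the tools listed in this lecture use much richer rules or statistical models.

```python
import re

# Detection: one or more capitalized words, optionally joined by "of"
# (so that "Bank of America" is found as a single name).
NAME_RE = re.compile(r"\b[A-Z][A-Za-z]*(?: (?:of )?[A-Z][A-Za-z]*)*")

# Classification: a tiny hand-made lookup table (gazetteer).
GAZETTEER = {
    "Carol": "person",
    "Tom Hanks": "person",
    "WWF": "organization",
    "IBM": "organization",
    "Bank of America": "organization",
    "London": "location",
    "Barcelona": "location",
}


def recognize(text):
    """Return (name, type) pairs; names missing from the gazetteer are 'other'."""
    return [(name, GAZETTEER.get(name, "other")) for name in NAME_RE.findall(text)]
```

Calling `recognize("Carol met Tom Hanks at Hotel Sunshine near Bank of America in Barcelona.")` yields five names, with "Hotel Sunshine" falling into the "other" class. The detection step is deliberately crude: any capitalized sentence-initial word would also be reported as a name, and abbreviations with periods such as "L.A." are not handled.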
  • 193. NER
      Tools for Named Entity Recognition
      • GATE: for English, Spanish, and many more, via graphical interface and Java API (developed at the University of Sheffield, UK) https://gate.ac.uk/
      • NETagger: Java-based Illinois Named Entity Recognition (developed by the Cognitive Computation Group at the University of Illinois at Urbana-Champaign) http://cogcomp.cs.illinois.edu/page/software_view/NETagger
      • OpenNLP: rule-based and statistical Named Entity Recognition (developed by Apache) http://opennlp.apache.org/index.html
      • Stanford CoreNLP: Java-based CRF Named Entity Recognition (developed by the Stanford Natural Language Processing Group) http://nlp.stanford.edu/software/CRF-NER.shtml
  • 194. Keyword / topic / information extraction
      Tools
      • Keyword extraction: e.g. GATE (ANNIE tool) for English, Spanish, and many more, via graphical interface and Java API → simply using JAPE files for the LUs
      • tool from Volker ?
      • Topic / information extraction: e.g. GATE (ANNIE tool) for English, Spanish, and many more, via graphical interface and Java API → using JAPE files for the LUs, FEs, and FRAMES
      GATE (https://gate.ac.uk/): developed at the University of Sheffield, UK
  • 195. Thank you for your attention!
      For more information:
      • Example text book: Speech and Language Processing by D. Jurafsky and J. H. Martin
      • Web page: www.alexandramliguoriphd.com
      • LinkedIn profile: Alexandra M. Liguori, Ph.D.