1. Introduction to Natural Language Processing
in 3 Sessions
Dr. Alexandra M. Liguori
Incubio – The Big Data Academy
Barcelona, March - April, 2015
Dr. Alexandra M. Liguori Introduction to Natural Language Processing in 3 Sessions
2. Outline: Lecture 1
1 Introduction
2 Natural Language Processing
3 Linguistic Ambiguities
4 Definition of corpus
5 Typical NLP tasks
6 POS-tagging
3. Outline: Lecture 2
1 Recap: Typical NLP tasks → practical examples with GATE
2 Def. of semantics
3 Frames approach
1 FrameNet
2 GATE for semantic/content analysis
4. Outline: Lecture 3
1 Recap: Typical NLP tasks
2 Automatic Question Answering
3 Reference resolution
4 Named Entity Recognition (NER)
5 Keyword / topic / information extraction
5. Welcome!
Here we go...!!!
Main references:
Text book: Speech and Language Processing by D. Jurafsky
and J. H. Martin
English FrameNet: https://framenet.icsi.berkeley.edu/fndrupal/
Spanish FrameNet: http://sfn.uab.es:8080/SFN
GATE: https://gate.ac.uk/
6. Outline: Lecture 1
1 Introduction
2 Natural Language Processing
3 Linguistic Ambiguities
4 Definition of corpus
5 Typical NLP tasks
6 POS-tagging
8. Introduction: Intelligent machines?
Dave Bowman: Open the pod bay doors, HAL.
HAL: I’m sorry Dave, I’m afraid I can’t do that.
(Stanley Kubrick and Arthur C. Clarke,
screenplay of 2001: A Space Odyssey)
https://www.youtube.com/watch?v=dSIKBliboIo
16. Introduction: Intelligent machines?
1 Phonetics and phonology
2 Morphology → produce contractions I’m and can’t
3 Syntax → cf. Open the pod bay doors, HAL.
vs. HAL, the pod bay door is open.
vs. HAL, is the pod bay door open?
4 Lexical semantics → meaning of the component words
5 Compositional semantics → knowledge of how
components combine to form larger meanings
6 Pragmatics → cf. I’m sorry ... , I’m afraid I can’t
vs. No, I won’t open the door.
vs. No.
7 Discourse conventions → engaging in structured
conversation, e.g. using that in I’m sorry Dave, I’m
afraid I can’t do that to refer back to Dave’s request
18. Natural Language Processing
NLP: techniques that process written human language as
language.
Applications
word counting
automatic hyphenation
automated question answering
named entity recognition (NER)
information/content extraction
semantic analysis
sentiment analysis
machine translation
21. Natural Language Processing
An ideal NLP team is very interdisciplinary, including:
Language experts (linguists)
Maths experts (mathematicians, physicists, statisticians)
Programmers (computer scientists)
22. NLP: Maths & Computer Science
28. NLP: Six categories of linguistic knowledge
1 Phonetics and phonology ↔ red - read - read;
sleigh - slay
2 Morphology ↔ I/you/we/you/they walk - he/she/it walks;
walked; walking
3 Syntax ↔ She ate a mammoth breakfast (grammatical) -
She eating a mammoth breakfast (ungrammatical)
4 Semantics ↔ book (verb) - book (noun);
duck (verb) - duck (noun)
5 Pragmatics ↔ open the door - can you open the door? -
could you open the door, please?
31. NLP: Six categories of linguistic knowledge
6 Discourse
Gracie: Oh yeah... And then Mr. and Mrs. Jones were
having matrimonial trouble, and my brother was hired to
watch Mrs. Jones.
George: Well, I imagine she was a very attractive woman.
Gracie: She was, and my brother watched her day and
night for six months.
George: Well, what happened?
Gracie: She finally got a divorce.
George: Mrs. Jones?
Gracie: No, my brother’s wife.
John went to Bill’s car dealership to check out an
Acura Integra. He looked at it for about an hour.
32. NLP: Ambiguities and Solutions
43. Linguistic Ambiguities
Example
I made her duck.
Five possible interpretations:
1 I cooked waterfowl for her.
2 I cooked waterfowl belonging to her.
3 I created the (plaster?) duck she owns.
4 I caused her to quickly lower her head or body.
5 I waved my magic wand and turned her into
undifferentiated waterfowl.
47. Linguistic Ambiguities
Morphological ambiguity
duck: verb or noun
her: dative pronoun or possessive pronoun
Syntactic ambiguity: make
transitive: taking a single direct object (interpretation 2)
ditransitive: taking two objects, meaning that the first object
(her) got made into the second object (duck) (interpretation 5)
taking a direct object and a verb, meaning that the object
(her) got caused to perform the verbal action (duck)
(interpretation 4)
Semantic ambiguity: make
cook (interpretations 1 and 2)
create (interpretation 3)
49. Corpus
Definition
Corpus = Large and structured set of texts.
NLP
Two types of corpora:
Training corpus ↔ to make the list of rules or to get the
statistical data
Test corpus ↔ to test the results found with the training
corpus
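The training/test distinction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name split_corpus, the 80/20 ratio, and the tiny five-sentence corpus are all hypothetical choices, not something prescribed by the course.

```python
import random


def split_corpus(sentences, test_fraction=0.2, seed=0):
    """Shuffle a list of sentences and split it into (training, test) lists."""
    shuffled = list(sentences)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical toy corpus built from sentences used elsewhere in these slides.
corpus = [
    "I made her duck.",
    "She ate a mammoth breakfast.",
    "Open the pod bay doors, HAL.",
    "HAL, is the pod bay door open?",
    "Robin sold a car to Abby for $5,000.",
]
training, test = split_corpus(corpus)
print(len(training), "training sentences,", len(test), "test sentence(s)")
```

The key point is that the two parts do not overlap, so results measured on the test corpus are not inflated by sentences already seen during training.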
56. Typical NLP tasks: Basic and simpler tasks
Tokenization → RegEx
Sentence splitting → RegEx
POS-tagging → POS-tagging algorithms and tag sets
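A minimal sketch of how regular expressions can handle the first two tasks, using only Python's standard re module. The patterns and the names TOKEN_RE, SENTENCE_END_RE, tokenize, and split_sentences are assumptions made for illustration; production tokenizers (e.g. in GATE or NLTK) use considerably richer rules.

```python
import re

# A token is a run of word characters (optionally with an internal
# apostrophe, as in "can't"), or a single punctuation character.
TOKEN_RE = re.compile(r"\w+(?:['’]\w+)*|[^\w\s]")

# A sentence boundary is whitespace that follows ., ! or ? and
# precedes an uppercase letter.
SENTENCE_END_RE = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def tokenize(text):
    """Return the list of tokens found in text."""
    return TOKEN_RE.findall(text)


def split_sentences(text):
    """Return the list of sentences found in text."""
    return [s for s in SENTENCE_END_RE.split(text.strip()) if s]


print(tokenize("I'm afraid I can't do that."))
print(split_sentences("Open the pod bay doors, HAL. I'm sorry Dave. Is the door open?"))
```

The sentence pattern is deliberately naive: an abbreviation followed by a capitalised word, such as "Mrs. Jones", would be split in the wrong place, which is one reason real sentence splitters add abbreviation lists.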
63. Typical NLP tasks: Complex tasks
Lemmatization or Stemming → implementations of the Porter
Stemmer (e.g. in Java), Stanford NLP tool, GATE, ...
Syntactic parsing → Earley algorithm, CYK algorithm, GHR
algorithm, Stanford Parser (Java implementation of a
probabilistic algorithm)
Topic extraction, NER, Semantic analysis, ... → ad hoc tools,
e.g. dictionaries, ontologies, Frames, GATE, NLTK, ...
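To make syntactic parsing more concrete, here is a small sketch of a CYK recognizer, one of the algorithms listed above. The grammar (in Chomsky normal form) and the lexicon are hypothetical toy data covering only the breakfast example from the syntax slide; the function only answers whether a sentence is derivable and does not build a parse tree.

```python
from itertools import product

# Toy grammar in Chomsky normal form: (left child, right child) -> parents.
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("V", "NP"): {"VP"},
    ("Det", "Nom"): {"NP"},
    ("Adj", "N"): {"Nom"},
}

# Toy lexicon: word -> categories that can produce it.
LEXICON = {
    "she": {"NP"},
    "ate": {"V"},
    "a": {"Det"},
    "mammoth": {"Adj"},
    "breakfast": {"N", "Nom"},
}


def cyk_recognize(words, start="S"):
    """Return True if the toy grammar derives the given list of words."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the categories that derive words[i..j] inclusive.
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i] = set(LEXICON.get(word, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            # Try every split point between the left and right part.
            for k in range(i, j):
                for left, right in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= BINARY_RULES.get((left, right), set())
    return start in table[0][n - 1]


print(cyk_recognize("she ate a mammoth breakfast".split()))
print(cyk_recognize("she eating a mammoth breakfast".split()))
```

The second sentence is rejected simply because eating is missing from the toy lexicon; a realistic grammar would reject it on syntactic grounds instead.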
69. POS-tagging
Example of ambiguity:
1 Secretariat/NNP is/VBZ expected/VBN to/TO race/VB
tomorrow/NN ./.
2 People/NNS continue/VBP to/TO inquire/VB the/DT
reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN
./.
70. POS-tagging
Three main tagging algorithms or methods:
1 rule-based tagging, e.g. ENGTWOL
2 stochastic tagging, e.g. HMM tagger
3 transformation-based tagging, e.g. Brill tagger
74. Rule-based POS-tagging
Example of ambiguity:
1 Secretariat/NNP is/VBZ expected/VBN to/TO race/VB
tomorrow/NN ./.
2 People/NNS continue/VBP to/TO inquire/VB the/DT
reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN
./.
Large database of hand-written disambiguation rules, e.g.:
TO + VB → YES
TO + NN → NO
DT + NN → YES
DT + VB → NO
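The four rules above can be sketched as a filter over candidate tags. This is a toy illustration, not ENGTWOL: the dictionary CANDIDATE_TAGS, the set FORBIDDEN_BIGRAMS, and the function disambiguate are hypothetical names, and a real rule-based tagger uses a large lexicon and many more constraints.

```python
# Tag pairs (previous tag, current tag) ruled out by the hand-written rules:
# TO + NN -> NO and DT + VB -> NO.
FORBIDDEN_BIGRAMS = {("TO", "NN"), ("DT", "VB")}

# Hypothetical mini-lexicon listing every tag a word may take.
CANDIDATE_TAGS = {
    "to": ["TO"],
    "the": ["DT"],
    "race": ["NN", "VB"],
}


def disambiguate(words):
    """Tag words left to right, discarding candidates the rules forbid."""
    tagged = []
    previous = None
    for word in words:
        candidates = CANDIDATE_TAGS[word.lower()]
        allowed = [t for t in candidates if (previous, t) not in FORBIDDEN_BIGRAMS]
        # Fall back to the first candidate if the rules eliminate everything.
        tag = (allowed or candidates)[0]
        tagged.append((word, tag))
        previous = tag
    return tagged


print(disambiguate(["to", "race"]))
print(disambiguate(["the", "race"]))
```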
78. Stochastic POS-tagging
Example of ambiguity:
1 Secretariat/NNP is/VBZ expected/VBN to/TO race/VB
tomorrow/NN ./.
2 People/NNS continue/VBP to/TO inquire/VB the/DT
reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN
./.
Training corpus to compute probability of given word having
given tag in given context, e.g.:
is/VBZ expected/VBN to/TO race/VB → 98%
is/VBZ expected/VBN to/TO race/NN → 2%
reason/NN for/IN the/DT race/NN → 97%
reason/NN for/IN the/DT race/VB → 3%
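A minimal sketch of how such probabilities could be estimated by counting in a tagged training corpus. It is not a full HMM tagger: it only estimates the probability of a tag given the word and the previous tag. The two-sentence training corpus is an assumption made for brevity, so the estimates come out as 100% rather than the percentages shown above, which would require a large corpus.

```python
from collections import Counter, defaultdict

# Toy training corpus: the two example sentences in word/TAG format.
TRAINING_CORPUS = [
    "Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NN ./.",
    "People/NNS continue/VBP to/TO inquire/VB the/DT reason/NN for/IN "
    "the/DT race/NN for/IN outer/JJ space/NN ./.",
]


def parse(tagged_sentence):
    """Turn 'word/TAG word/TAG ...' into a list of (word, tag) pairs."""
    return [tuple(token.rsplit("/", 1)) for token in tagged_sentence.split()]


def train(corpus):
    """Count how often each tag follows (previous tag, word)."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        previous = "<s>"  # marker for the start of a sentence
        for word, tag in parse(sentence):
            counts[(previous, word.lower())][tag] += 1
            previous = tag
    return counts


def tag_probabilities(counts, previous_tag, word):
    """Relative frequency of each tag for word after previous_tag."""
    seen = counts.get((previous_tag, word.lower()), Counter())
    total = sum(seen.values())
    return {tag: n / total for tag, n in seen.items()} if total else {}


counts = train(TRAINING_CORPUS)
print(tag_probabilities(counts, "TO", "race"))
print(tag_probabilities(counts, "DT", "race"))
```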
85. Transformation-based POS-tagging
Example of ambiguity:
Secretariat/NNP is/VBZ expected/VBN to/TO race/VB
tomorrow/NN ./.
People/NNS continue/VBP to/TO inquire/VB the/DT
reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN ./.
Rules automatically induced from data using Machine Learning
techniques, e.g.:
1 A priori, prob(race = NN) = 65%, prob(race = VB) = 35%
→ without context, the system would always take race = NN
2 Machine Learning learns how the context changes these
probabilities, e.g.:
is/VBZ expected/VBN to/TO race/VB → 98%
reason/NN for/IN the/DT race/NN → 97%
3 The system then takes race = NN or race = VB depending on
context.
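The idea can be sketched in the style of the Brill tagger: first give every word its most frequent tag, then apply correction rules that look at the context. In the sketch below the single transformation (change NN to VB when the previous tag is TO) is written by hand for illustration, whereas a real transformation-based tagger would induce such rules from a training corpus. The names MOST_FREQUENT_TAG, TRANSFORMATIONS, and tag are hypothetical.

```python
# Step 1 data: the a priori most frequent tag of each word (toy lexicon).
MOST_FREQUENT_TAG = {
    "secretariat": "NNP", "is": "VBZ", "expected": "VBN", "to": "TO",
    "race": "NN", "tomorrow": "NN", "the": "DT", "reason": "NN",
    "for": "IN", ".": ".",
}

# Step 2 data: (from_tag, to_tag, required previous tag).
# Hand-written here; a learner would derive these rules from data.
TRANSFORMATIONS = [("NN", "VB", "TO")]


def tag(words):
    """Assign most-frequent tags, then correct them with the transformations."""
    tags = [MOST_FREQUENT_TAG.get(w.lower(), "NN") for w in words]
    for from_tag, to_tag, previous_tag in TRANSFORMATIONS:
        for i in range(1, len(tags)):
            if tags[i] == from_tag and tags[i - 1] == previous_tag:
                tags[i] = to_tag
    return list(zip(words, tags))


print(tag("Secretariat is expected to race tomorrow .".split()))
print(tag("the reason for the race".split()))
```

In the first sentence race starts out as NN and is corrected to VB because it follows to/TO; in the second it keeps its initial NN tag.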
86. POS-tagging
POS-tagging tools for English:
Brill tagger
Stanford Log-linear POS-tagger (Java)
POS-tagger integrated in GATE (Java)
POS-tagger with NLTK (Python)
87. Outline: Lecture 2
1 Recap: Typical NLP tasks → practical examples with GATE
2 Def. of semantics
3 Frames approach
1 FrameNet
2 GATE for semantic/content analysis
94. Typical NLP tasks: Basic and simpler tasks
Tokenization → RegEx
Sentence splitting → RegEx
POS-tagging → POS-tagging algorithms and tag sets
99. Typical NLP tasks: Complex tasks
Lemmatization or Stemming → implementations of the Porter
Stemmer (e.g. in Java), Stanford NLP tool, GATE, ...
Topic extraction, NER, Semantic analysis, ... → ad hoc tools,
e.g. dictionaries, ontologies, Frames, GATE, NLTK, ...
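As a rough illustration of what stemming does, the sketch below strips a few common English suffixes. It is explicitly not the Porter algorithm and not GATE's stemmer: the suffix list, the minimum stem length of three characters, and the name crude_stem are assumptions chosen to keep the example short.

```python
# Suffixes are tried in this order; longer ones come first.
SUFFIXES = ("ing", "ed", "s")


def crude_stem(word):
    """Strip one known suffix, keeping a stem of at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


for form in ["walk", "walks", "walked", "walking"]:
    print(form, "->", crude_stem(form))
```

All four forms of walk from the morphology slide map to the same stem, which is the behaviour a stemmer is meant to provide; the Porter Stemmer reaches it with a much more careful sequence of rewrite rules.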
101. EX.: GATE
Concrete examples with GATE:
1 Tokenizer
2 Sentence-splitter
3 POS-tagger
4 Stemmer
GATE
https://gate.ac.uk/
developed at the University of Sheffield, UK.
102. Semantics
‘Then you should say what you mean,’
the March Hare went on.
‘I do,’ Alice hastily replied;
‘at least, I mean what I say –
that’s the same thing, you know.’
‘Not the same thing a bit!’ said the Hatter. ‘You might just as
well say that
“I see what I eat” is the same thing as “I eat what I see”!’
Lewis Carroll,
Alice in Wonderland
104. Frames and FrameNet
Frame
A schematic representation of a situation involving various
participants, and other conceptual roles.
E.g.:
Abby bought a car from Robin for $5,000.
Robin sold a car to Abby for $5,000.
English FrameNet
https://framenet.icsi.berkeley.edu/fndrupal/
developed at the International Computer Science Institute in
Berkeley, California.
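One way to picture a frame is as a record with named roles. The sketch below encodes the two example sentences as instances of a buying and a selling frame and shows that both describe the same situation. The class FrameInstance and the helper same_situation are hypothetical; the frame names are spelled as on the Frame Relations slide, and the role names (Buyer, Seller, Goods, Money) and lexical unit labels should be checked against English FrameNet rather than taken from this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class FrameInstance:
    """A frame evoked by a lexical unit, with its frame elements filled in."""
    frame: str
    lexical_unit: str
    elements: dict = field(default_factory=dict)


# "Abby bought a car from Robin for $5,000."
bought = FrameInstance(
    frame="Commerce-buy",
    lexical_unit="buy.v",
    elements={"Buyer": "Abby", "Seller": "Robin", "Goods": "a car", "Money": "$5,000"},
)

# "Robin sold a car to Abby for $5,000."
sold = FrameInstance(
    frame="Commerce-sell",
    lexical_unit="sell.v",
    elements={"Buyer": "Abby", "Seller": "Robin", "Goods": "a car", "Money": "$5,000"},
)


def same_situation(a, b):
    """Two instances describe the same event if all their roles agree."""
    return a.elements == b.elements


print(same_situation(bought, sold))
```

The two sentences differ in syntax and in the verb used, yet the filled roles are identical, which is exactly the kind of regularity a frame-based analysis is meant to capture.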
115. Frame Relations
FrameNet additionally captures relationships between different
frames using relations. These include the following:
Inheritance: When one frame is a more specific version of
another, more abstract parent frame. Anything that is true
about the parent frame must also be true about the child
frame, and a mapping is specified between the frame
elements of the parent and the frame elements of the child.
Perspectivized-in: A neutral frame (like
Commerce-transfer-goods) is connected to a frame with a
specific perspective of the same scenario (e.g. the
Commerce-sell frame, which assumes the perspective of
the seller or the Commerce-buy frame, which assumes the
perspective of the buyer)
Subframe: Some frames like the Criminal-process frame
refer to complex scenarios that consist of several individual
states or events that can be described by separate frames
like Arrest, Trial, and so on.
116. Frame Relations
Precedes: The Precedes relation captures a temporal
order that holds between subframes of a complex scenario.
Causative-of and Inchoative-of: There is a fairly systematic
relationship between stative descriptions (e.g. the
Position-on-a-scale frame, "She had a high salary") and
causative descriptions (Cause-change-of-scalar-position,
"She raised his salary") or inchoative descriptions
(Change-position-on-a-scale, e.g. "Her salary increased").
Using: A relationship between a frame and another frame
that it involves in some way. For instance, the
Judgment-communication frame uses both the Judgment
frame and the Statement frame, but does not inherit from
either of them because there is no clear correspondence of
the frame elements.
See-also: Connects frames that bear some resemblance
but need to be distinguished carefully.
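The relations described above can be stored as labelled edges between frames. The sketch below is a hypothetical encoding: the triple layout, the direction of each edge, and the assumption that Arrest precedes Trial are illustrative choices, and the frame names are spelled as on these slides rather than looked up in FrameNet.

```python
# Each entry is (source frame, relation, target frame).
RELATIONS = [
    ("Commerce-sell", "Perspectivized-in", "Commerce-transfer-goods"),
    ("Commerce-buy", "Perspectivized-in", "Commerce-transfer-goods"),
    ("Arrest", "Subframe", "Criminal-process"),
    ("Trial", "Subframe", "Criminal-process"),
    ("Arrest", "Precedes", "Trial"),  # assumed temporal order
    ("Judgment-communication", "Using", "Judgment"),
    ("Judgment-communication", "Using", "Statement"),
]


def related(frame, relation):
    """Return the frames that frame points to through the given relation."""
    return sorted(
        target
        for source, rel, target in RELATIONS
        if source == frame and rel == relation
    )


print(related("Judgment-communication", "Using"))
print(related("Arrest", "Precedes"))
```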
118. Spanish FrameNet
Frame
A schematic representation of a situation involving various
participants, and other conceptual roles. E.g.:
El rock influye en los artistas de hoy en día
para sus producciones artísticas.
(Rock influences today’s artists in their artistic productions.)
Los artistas de hoy en día se inspiran en el rock
para sus producciones artísticas.
(Today’s artists draw inspiration from rock for their artistic
productions.)
Spanish FrameNet
http://sfn.uab.es:8080/SFN
developed at the Universidad Autónoma de Barcelona and the
International Computer Science Institute in Berkeley, California.
130. Frames and GATE
And now...
Example in English implementing frames, lexical units (LUs),
and frame elements (FEs) with GATE!
GATE
https://gate.ac.uk/
developed at the University of Sheffield, UK.
131. Outline: Lecture 3
1 Recap: Typical NLP tasks
2 Automatic Question Answering
3 Reference resolution
4 Named Entity Recognition (NER)
5 Keyword / topic / information extraction
138. Typical NLP tasks: Basic and simpler tasks
Tokenization → RegEx
Sentence splitting → RegEx
POS-tagging → POS-tagging algorithms and tag sets
145. Typical NLP tasks: Complex tasks
Lemmatization or Stemming → Implementations of Porter Stemmer (e.g. in Java), Stanford NLP tool, GATE, ...
Syntactic parsing → Earley algorithm, CYK algorithm, GHR algorithm, Stanford Parser (Java implementation of probabilistic algorithm)
Question answering, Topic extraction, NER, Semantic analysis, ... → Ad hoc tools, e.g. dictionaries, ontologies, Frames, GATE, NLTK, ...
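To give a feel for one of the parsing algorithms listed above, here is a minimal sketch of a CYK recognizer in Python. The toy grammar (in Chomsky normal form) and its vocabulary are hypothetical and were made up for this example. The function only decides whether a sentence is derivable and does not build a parse tree.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (not from the slides).
# Binary rules: (B, C) -> set of A such that A -> B C
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("V", "NP"): {"VP"},
    ("Det", "N"): {"NP"},
}
# Lexical rules: word -> set of A such that A -> word
LEXICAL_RULES = {
    "john": {"NP"},
    "bought": {"V"},
    "a": {"Det"},
    "car": {"N"},
}


def cyk_recognize(words, start="S"):
    """Return True if the toy grammar derives the word sequence."""
    n = len(words)
    if n == 0:
        return False
    # table[(i, j)] holds the non-terminals that derive words[i:j]
    table = defaultdict(set)
    for i, word in enumerate(words):
        table[(i, i + 1)] |= LEXICAL_RULES.get(word.lower(), set())
    # Fill in longer spans from shorter ones, trying every split point k
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for b in table[(i, k)]:
                    for c in table[(k, j)]:
                        table[(i, j)] |= BINARY_RULES.get((b, c), set())
    return start in table[(0, n)]


accepted = cyk_recognize("John bought a car".split())
rejected = cyk_recognize("bought John a car".split())
```

The three nested loops over span length, start position, and split point are what give CYK its cubic running time in the sentence length.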
146. NLP: Ambiguities and Solutions
148. Question Answering
Example from Who’s on first
150. Simple Question Answering
ELIZA
ELIZA simulated a Rogerian psychologist, and could carry on
conversations with the user like the following:
User-1: Men are all alike.
ELIZA-1: IN WHAT WAY
User-2: They’re always bugging us about something or other.
ELIZA-2: CAN YOU THINK OF A SPECIFIC EXAMPLE
User-3: Well, my boyfriend made me come here.
ELIZA-3: YOUR BOYFRIEND MADE YOU COME HERE
User-4: Yes, he says I’m depressed much of the time.
ELIZA-4: I AM SORRY TO HEAR YOU ARE DEPRESSED.
153. ELIZA
Regular expression substitutions
ELIZA worked by applying a cascade of regular expression substitutions, each of which matched some part of the input line and changed it:
change all instances of my to YOUR, and I’m to YOU ARE, etc., e.g.:
1 User-3: Well, my boyfriend made me come here.
ELIZA-3: YOUR BOYFRIEND MADE YOU COME HERE
2 User-4: ... I’m depressed ... .
ELIZA-4: ... YOU ARE DEPRESSED.
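The pronoun-swapping stage can be sketched in Python as follows. This is only an illustration under assumptions: the rules for my and I’m come from the slide, the rule for me is added so that the User-3 example works, and the final upper-casing mimics ELIZA's all-caps replies. Unlike the slide, the sketch does not drop the leading "Well,".

```python
import re

# First-stage substitution rules, applied in order. Only "my" and "I'm"
# are taken from the slide; the "me" rule is an assumption for this example.
PRONOUN_RULES = [
    (re.compile(r"\bI[’']m\b"), "YOU ARE"),
    (re.compile(r"\bmy\b", re.IGNORECASE), "YOUR"),
    (re.compile(r"\bme\b", re.IGNORECASE), "YOU"),
]


def swap_pronouns(line):
    """Apply each substitution in turn, then upper-case the result."""
    for pattern, replacement in PRONOUN_RULES:
        line = pattern.sub(replacement, line)
    return line.upper()


reply_3 = swap_pronouns("Well, my boyfriend made me come here.")
reply_4 = swap_pronouns("Yes, he says I'm depressed much of the time.")
```

Because the replacements are written in upper case and none of them contains a standalone "my" or "me", a later rule in the cascade does not undo an earlier one.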
158. ELIZA
Regular expression substitutions
ELIZA worked by applying a cascade of regular expression substitutions, each of which matched some part of the input line and changed it:
relevant patterns in the input → create an appropriate output; e.g.:
1 s/.* YOU ARE (depressed|sad) .*/I AM SORRY TO HEAR YOU ARE \1/
2 s/.* YOU ARE (depressed|sad) .*/WHY DO YOU THINK YOU ARE \1/
3 s/.* all .*/IN WHAT WAY/
4 s/.* always .*/CAN YOU THINK OF A SPECIFIC EXAMPLE/
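The pattern-to-response rules can be approximated in Python as below. This is a sketch under assumptions, not the original program: the input is assumed to have its pronouns already swapped, rule 2 is left out because it shares its pattern with rule 1 and only the first matching rule fires here, and the fallback reply PLEASE GO ON is a hypothetical default.

```python
import re

# Ordered (pattern, response template) pairs modelled on the slide's rules.
RESPONSE_RULES = [
    (re.compile(r".* YOU ARE (DEPRESSED|SAD) .*"), r"I AM SORRY TO HEAR YOU ARE \1"),
    (re.compile(r".* ALL .*"), "IN WHAT WAY"),
    (re.compile(r".* ALWAYS .*"), "CAN YOU THINK OF A SPECIFIC EXAMPLE"),
]


def respond(line, default="PLEASE GO ON"):
    """Return the response of the first rule whose pattern matches the line."""
    # Upper-case, replace punctuation with spaces, and pad both ends so the
    # space-delimited patterns can also match at the edges of the line.
    padded = " " + re.sub(r"[^\w\s]", " ", line.upper()) + " "
    for pattern, template in RESPONSE_RULES:
        match = pattern.fullmatch(padded)
        if match:
            return match.expand(template)
    return default  # hypothetical fallback, not from the slides


reply_1 = respond("Men are all alike.")
reply_2 = respond("They're always bugging us about something or other.")
reply_4 = respond("Yes, he says YOU ARE depressed much of the time.")
```

Rule order matters: an input containing both "all" and "always" would get the reply of whichever rule comes first in the list.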
161. Reference resolution
Discourse
Gracie: Oh yeah... And then Mr. and Mrs. Jones were having
matrimonial trouble, and my brother was hired to watch Mrs. Jones.
George: Well, I imagine she was a very attractive woman.
Gracie: She was, and my brother watched her day and night for six
months.
George: Well, what happened?
Gracie: She finally got a divorce.
George: Mrs. Jones?
Gracie: No, my brother’s wife.
John went to Bill’s car dealership to check out an Acura Integra.
He looked at it for about an hour.
166. Reference resolution
1 Reference phenomena
2 Constraints on coreference
3 Preferences in pronoun interpretation
4 Example of algorithm for pronoun resolution
172. Reference resolution
Reference phenomena
1 Indefinite noun phrases ↔ I saw a Honda Civic today.
2 Definite noun phrases ↔ I saw a Honda Civic today. The Honda Civic was blue.
3 Pronouns ↔ I saw a Honda Civic today. It was blue.
4 Demonstratives ↔ I bought a Honda Civic today. It’s similar to the one I bought five years ago. That one was really nice, but I like this one even better.
5 One-anaphora ↔ I saw no less than 6 Honda Civics today. Now I want one.
178. Reference resolution
Constraints on coreference
1 number agreement ↔ John has a new car. It is red. /
John has a new car. They are red.
2 person and case agreement ↔ John and Mary have new
cars. They love them.
3 gender agreement ↔ John has a new car. It is attractive.
/ John has a new car. He is attractive.
4 syntactic constraints ↔ John bought himself a new car. /
John bought him a new car.
5 selectional restrictions ↔ John parked his car in the
garage. He had driven it around for hours.
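The agreement constraints (number and gender) lend themselves to a simple filter over candidate antecedents. The Python sketch below is only an illustration: the Mention class, the pronoun feature table, and the hand-assigned features are all hypothetical, and the person, case, syntactic, and selectional constraints are not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mention:
    text: str
    number: str            # "sg" or "pl"
    gender: Optional[str]  # "masc", "fem", "neut", or None if unknown


# Hypothetical feature table for a few English pronouns: (number, gender).
PRONOUN_FEATURES = {
    "he": ("sg", "masc"), "him": ("sg", "masc"),
    "she": ("sg", "fem"), "her": ("sg", "fem"),
    "it": ("sg", "neut"),
    "they": ("pl", None), "them": ("pl", None),
}


def compatible_antecedents(pronoun: str, candidates: List[Mention]) -> List[str]:
    """Keep only the candidates that agree with the pronoun in number and gender."""
    number, gender = PRONOUN_FEATURES[pronoun.lower()]
    return [
        m.text for m in candidates
        if m.number == number and (gender is None or m.gender == gender)
    ]


# "John has a new car. It / He / They ..."
candidates = [Mention("John", "sg", "masc"), Mention("a new car", "sg", "neut")]
for_it = compatible_antecedents("It", candidates)
for_he = compatible_antecedents("He", candidates)
for_they = compatible_antecedents("They", candidates)
```

An empty result, as for They, corresponds to the unacceptable variant on the slide: no antecedent in the discourse satisfies the constraint.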
184. Reference resolution
Preferences in pronoun interpretation
1 recency ↔ Peter has an Audi. Bob has a Honda. Anne
likes to drive it.
2 grammatical role ↔ Peter went to the Honda dealership
with Bob. He bought a Civic. / Bob went to the Honda
dealership with Peter. He bought a Civic.
3 repeated mention ↔ Anne needed a car to drive to her
new job. She decided she wanted something roomy. Carol
went to the Honda dealership with her. She bought a Civic.
4 parallelism ↔ Anne went with Carol to the Honda
dealership. Sally went with her to the VW dealership.
5 verb semantics ↔ Peter seized the Honda pamphlet from
Bob. He loves reading about cars. / Peter passed the
Honda pamphlet to Bob. He loves reading about cars.
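Unlike the hard constraints, these preferences can be combined as weights. The Python sketch below is loosely in the spirit of salience-based pronoun resolution and is not the algorithm presented in the course. It models only recency and grammatical role, the weights are assumed values, and the candidates are assumed to have already passed the agreement constraints.

```python
from dataclasses import dataclass
from typing import List

# Assumed weights, chosen only for illustration.
SENTENCE_WEIGHT = 100
ROLE_WEIGHTS = {"subject": 80, "object": 50, "oblique": 40}


@dataclass(frozen=True)
class Candidate:
    text: str
    sentence: int  # index of the sentence containing the mention
    role: str      # "subject", "object" or "oblique"


def salience(candidate: Candidate, pronoun_sentence: int) -> float:
    """Score a candidate; the score is halved for each sentence of distance."""
    distance = pronoun_sentence - candidate.sentence
    return (SENTENCE_WEIGHT + ROLE_WEIGHTS[candidate.role]) / (2 ** distance)


def resolve(candidates: List[Candidate], pronoun_sentence: int) -> str:
    """Pick the most salient candidate as the antecedent."""
    return max(candidates, key=lambda c: salience(c, pronoun_sentence)).text


# Recency: "Peter has an Audi. Bob has a Honda. Anne likes to drive it."
recency = resolve(
    [Candidate("an Audi", 0, "object"), Candidate("a Honda", 1, "object")],
    pronoun_sentence=2,
)

# Grammatical role: "Peter went to the Honda dealership with Bob. He bought a Civic."
role = resolve(
    [Candidate("Peter", 0, "subject"), Candidate("Bob", 0, "oblique")],
    pronoun_sentence=1,
)
```

Repeated mention, parallelism, and verb semantics would need further features (mention counts, syntactic position of the pronoun, verb classes) and are left out of this sketch.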
192. NER
Named Entity Recognition
Can be broken down into two distinct problems, i.e.:
1 detection of names
2 classification of the names by the type of entity to which they refer → 4 standard types:
1 person (e.g. "Carol", "Tom Hanks", etc.)
2 organization (e.g. "WWF", "IBM", "Bank of America", etc.)
3 location (e.g. "London", "Washington D.C.", "L.A.", "Barcelona", etc.)
4 other (e.g. "Hotel Sunshine", etc.)
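The two sub-problems can be illustrated with a deliberately naive Python sketch: a regular expression for detection and a dictionary lookup for classification. Both the pattern and the mini-gazetteer (built from the examples above) are assumptions made for this illustration. Real NER tools rely on much richer rules or on statistical models.

```python
import re

# Hypothetical mini-gazetteer built from the examples on the slide.
GAZETTEER = {
    "Carol": "person", "Tom Hanks": "person",
    "WWF": "organization", "IBM": "organization", "Bank of America": "organization",
    "London": "location", "Barcelona": "location",
    "Hotel Sunshine": "other",
}

# Step 1 (detection): runs of capitalized words, optionally joined by "of".
NAME_RE = re.compile(r"\b[A-Z][A-Za-z]*(?:\s+(?:of\s+)?[A-Z][A-Za-z]*)*")


def recognize_entities(text):
    """Detect candidate names, then classify each one via the gazetteer."""
    names = NAME_RE.findall(text)
    # Step 2 (classification): unknown names fall back to "other".
    return [(name, GAZETTEER.get(name, "other")) for name in names]


sentence = (
    "Tom Hanks flew from London to Barcelona to meet IBM "
    "and the Bank of America at Hotel Sunshine."
)
entities = recognize_entities(sentence)
```

The detection step over-generates, since any capitalized sentence-initial word is treated as a name, and it misses abbreviations with periods such as "L.A."; this is why the detection problem is harder than it first looks.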
193. NER
Tools for Named Entity Recognition
GATE: for English, Spanish, and many more, via graphical interface and Java API (developed at the University of Sheffield, UK)
https://gate.ac.uk/
NETagger: Java-based Illinois Named Entity Recognition (developed by the Cognitive Computation Group at the University of Illinois at Urbana-Champaign)
http://cogcomp.cs.illinois.edu/page/software_view/NETagger
OpenNLP: rule-based and statistical Named Entity Recognition (developed by Apache)
http://opennlp.apache.org/index.html
Stanford CoreNLP: Java-based CRF Named Entity Recognition (developed by the Stanford Natural Language Processing Group)
http://nlp.stanford.edu/software/CRF-NER.shtml
194. Keyword / topic / information extraction
Tools
Keyword extraction: e.g. GATE (ANNIE tool) for English, Spanish, and many more, via graphical interface and Java API
→ simply using JAPE files for the lexical units (LUs)
tool from Volker ?
Topic / information extraction: e.g. GATE (ANNIE tool) for English, Spanish, and many more, via graphical interface and Java API
→ using JAPE files for the lexical units (LUs), frame elements (FEs), and frames
GATE: https://gate.ac.uk/ (developed at the University of Sheffield, UK)
195. Thank you for your attention!
For more information:
Example text book: Speech and Language Processing
by D. Jurafsky and J. H. Martin
Web page: www.alexandramliguoriphd.com
LinkedIn profile: Alexandra M. Liguori, Ph.D.