NLP_lectures_English

  1. 1. Introduction to Natural Language Processing in 3 Sessions. Dr. Alexandra M. Liguori, Incubio – The Big Data Academy. Barcelona, March - April, 2015.
  2. 2. Outline: Lecture 1
      1. Introduction
      2. Natural Language Processing
      3. Linguistic Ambiguities
      4. Definition of corpus
      5. Typical NLP tasks
      6. POS-tagging
  3. 3. Outline: Lecture 2
      1. Recap: Typical NLP tasks → practical examples with GATE
      2. Definition of semantics
      3. Frames approach: (a) FrameNet, (b) GATE for semantic/content analysis
  4. 4. Outline: Lecture 3
      1. Recap: Typical NLP tasks
      2. Automatic Question Answering
      3. Reference resolution
      4. Named Entity Recognition (NER)
      5. Keyword / topic / information extraction
  5. 5. Welcome! Here we go...!!! Main references:
      Textbook: Speech and Language Processing by D. Jurafsky and J. H. Martin
      English FrameNet: https://framenet.icsi.berkeley.edu/fndrupal/
      Spanish FrameNet: http://sfn.uab.es:8080/SFN
      GATE: https://gate.ac.uk/
  6. 6. Outline: Lecture 1
      1. Introduction
      2. Natural Language Processing
      3. Linguistic Ambiguities
      4. Definition of corpus
      5. Typical NLP tasks
      6. POS-tagging
  8. 8. Introduction: Intelligent machines?
      Dave Bowman: Open the pod bay doors, HAL.
      HAL: I’m sorry Dave, I’m afraid I can’t do that.
      (Stanley Kubrick and Arthur C. Clarke, screenplay of 2001: A Space Odyssey)
      Video: https://www.youtube.com/watch?v=dSIKBliboIo
  16. 16. Introduction: Intelligent machines? To hold this conversation, HAL needs knowledge of:
      1. Phonetics and phonology
      2. Morphology → producing the contractions I’m and can’t
      3. Syntax → cf. "Open the pod bay doors, HAL." vs. "HAL, the pod bay door is open." vs. "HAL, is the pod bay door open?"
      4. Lexical semantics → the meaning of the component words
      5. Compositional semantics → knowledge of how components combine to form larger meanings
      6. Pragmatics → cf. "I’m sorry ..., I’m afraid I can’t" vs. "No, I won’t open the door." vs. "No."
      7. Discourse conventions → engaging in structured conversation, e.g. using the reference "that" in "I’m sorry Dave, I’m afraid I can’t do that"
  18. 18. Natural Language Processing. NLP: techniques that process written human language as language. Applications: word counting, automatic hyphenation, automated question answering, named entity recognition (NER), information/content extraction, semantic analysis, sentiment analysis, machine translation.
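Word counting, the simplest application in the list above, can be sketched in a few lines. This is a minimal illustration only: the function name `count_words` and the regular expression are assumptions made for this example, not part of the lecture.

```python
import re
from collections import Counter

def count_words(text):
    """Lowercase the text, extract word forms (keeping contractions), and count them."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(words)

# Sentences taken from the HAL example earlier in the lecture.
counts = count_words("Open the pod bay doors, HAL. HAL, is the pod bay door open?")
print(counts.most_common(3))
```

Note that the counter treats "door" and "doors" as different words; merging them requires stemming or lemmatization, which the lecture introduces later.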
  21. 21. Natural Language Processing. An ideal NLP team is highly interdisciplinary, including: language experts (linguists), maths experts (mathematicians, physicists, statisticians), and programmers (computer scientists).
  22. 22. NLP: Maths & Computer Science
  28. 28. NLP: Six categories of linguistic knowledge
      1. Phonetics and phonology ↔ red - read - read; sleigh - slay
      2. Morphology ↔ I/you/we/you/they walk - he/she/it walks; walked; walking
      3. Syntax ↔ She ate a mammoth breakfast - She eating a mammoth breakfast (ungrammatical)
      4. Semantics ↔ book (verb) - book (noun); duck (verb) - duck (noun)
      5. Pragmatics ↔ open the door - can you open the door? - could you open the door, please?
  31. 31. NLP: Six categories of linguistic knowledge
      6. Discourse
      Gracie: Oh yeah... And then Mr. and Mrs. Jones were having matrimonial trouble, and my brother was hired to watch Mrs. Jones.
      George: Well, I imagine she was a very attractive woman.
      Gracie: She was, and my brother watched her day and night for six months.
      George: Well, what happened?
      Gracie: She finally got a divorce.
      George: Mrs. Jones?
      Gracie: No, my brother’s wife.
      John went to Bill’s car dealership to check out an Acura Integra. He looked at it for about an hour.
  32. 32. NLP: Ambiguities and Solutions
  34. 34. Linguistic Ambiguities
  43. 43. Linguistic Ambiguities. Example: I made her duck. Five possible interpretations:
      1. I cooked waterfowl for her.
      2. I cooked waterfowl belonging to her.
      3. I created the (plaster?) duck she owns.
      4. I caused her to quickly lower her head or body.
      5. I waved my magic wand and turned her into undifferentiated waterfowl.
  47. 47. Linguistic Ambiguities
      Morphological ambiguity:
        duck: verb or noun
        her: dative pronoun or possessive pronoun
      Syntactic ambiguity: make can be
        transitive: taking a single direct object (interpretation 2)
        ditransitive: taking two objects, meaning that the first object (her) got made into the second object (duck) (interpretation 5)
        or it can take a direct object and a verb, meaning that the object (her) got caused to perform the verbal action (duck) (interpretation 4)
      Semantic ambiguity: make can mean cook or create
  49. 49. Corpus. Definition: a corpus is a large and structured set of texts. NLP uses two types of corpora:
      Training corpus ↔ used to derive the list of rules or to collect the statistical data
      Test corpus ↔ used to test the results obtained with the training corpus
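The training/test distinction can be sketched as a simple random split. This is only an illustration: the 80/20 ratio, the fixed seed, and the name `split_corpus` are assumptions chosen for this example, not values given in the lecture.

```python
import random

def split_corpus(sentences, test_fraction=0.2, seed=0):
    """Shuffle a copy of the corpus and hold out a fraction of it as the test corpus."""
    shuffled = list(sentences)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    # The two parts never overlap, so test results are not inflated by seen data.
    return shuffled[n_test:], shuffled[:n_test]

corpus = [f"sentence {i}" for i in range(10)]   # stand-in for real text
training_corpus, test_corpus = split_corpus(corpus)
print(len(training_corpus), len(test_corpus))
```

Rules or statistics are then derived from `training_corpus` only, and their quality is measured on `test_corpus`.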
  56. 56. Typical NLP tasks: Basic and simpler tasks
      Tokenization → RegEx
      Sentence splitting → RegEx
      POS-tagging → POS-tagging algorithms and tag sets
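Tokenization and sentence splitting with regular expressions can be sketched as follows. The two patterns are assumptions written for this illustration; real tokenizers (for example the ones in GATE or NLTK) use far more elaborate rules.

```python
import re

# A token is a word (optionally with an apostrophe part, as in "can't"),
# a number, or a single punctuation character.
TOKEN_RE = re.compile(r"[A-Za-z]+(?:['’][A-Za-z]+)?|\d+(?:[.,]\d+)*|[^\w\s]")

# A sentence boundary is whitespace that follows . ! or ? and precedes a capital letter.
SENTENCE_END_RE = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")

def tokenize(text):
    """Return the list of tokens found in the text."""
    return TOKEN_RE.findall(text)

def split_sentences(text):
    """Split the text at the sentence boundaries defined above."""
    return [s for s in SENTENCE_END_RE.split(text.strip()) if s]

text = "Open the pod bay doors, HAL. I'm sorry Dave, I'm afraid I can't do that."
for sentence in split_sentences(text):
    print(tokenize(sentence))
```

A known weakness of this boundary pattern is abbreviations: "Mr. Dobbs" would be split after "Mr.", which is one reason practical sentence splitters add abbreviation lists on top of the regular expressions.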
  63. 63. Typical NLP tasks: Complex tasks
      Lemmatization or Stemming → implementations of the Porter Stemmer (e.g. in Java), Stanford NLP tools, GATE, ...
      Syntactic parsing → Earley algorithm, CYK algorithm, GHR algorithm, Stanford Parser (Java implementation of a probabilistic algorithm)
      Topic extraction, NER, Semantic analysis, ... → ad hoc tools, e.g. dictionaries, ontologies, Frames, GATE, NLTK, ...
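To give a feel for what a parsing algorithm such as CYK does, here is a minimal recognizer sketch. The toy grammar (in Chomsky normal form) and the lexicon are assumptions invented for this example; they cover only the syntax example from the "six categories" slide.

```python
from itertools import product

# Toy grammar: binary rules map a pair of child categories to parent categories.
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("V", "NP"): {"VP"},
    ("Det", "N"): {"NP"},
    ("Adj", "N"): {"N"},
}
# Toy lexicon: "eating" is known, but not as a finite verb (hypothetical tag "Ving").
LEXICAL_RULES = {
    "she": {"NP"}, "ate": {"V"}, "eating": {"Ving"},
    "a": {"Det"}, "mammoth": {"Adj"}, "breakfast": {"N"},
}

def cyk_recognize(words, start="S"):
    """Return True if the toy grammar derives the word sequence from `start`."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the categories that span words[i..j] inclusive.
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i] = set(LEXICAL_RULES.get(word.lower(), ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):  # split point between the two children
                for left, right in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= BINARY_RULES.get((left, right), set())
    return start in table[0][n - 1]

print(cyk_recognize("She ate a mammoth breakfast".split()))
print(cyk_recognize("She eating a mammoth breakfast".split()))
```

The first sentence is accepted and the second rejected, because no rule combines "Ving" with a noun phrase into a verb phrase. A full parser would also return the tree, and a probabilistic one would attach a score to each rule.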
  66. 66. POS-tagging. Example with Penn Treebank POS-tags:
      A/DT woman/NN came/VBD from/IN the/DT back/NN of/IN the/DT store/NN ./.
      She/PP appeared/VBD to/TO be/VB sleepy/JJ and/CC quite/RB a/DT bit/NN younger/JJR than/IN Mr./NNP Dobbs/NNP and/CC to/TO be/VB wearing/VBG too/RB much/RB makeup/NN ./.
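The word/TAG notation used in this example is easy to read into a program. A minimal sketch, assuming whitespace-separated items and a hypothetical helper name `parse_tagged`:

```python
def parse_tagged(text):
    """Turn 'word/TAG word/TAG ...' into a list of (word, tag) pairs."""
    pairs = []
    for item in text.split():
        # Split at the LAST slash so that the token "./." yields (".", ".").
        word, _, tag = item.rpartition("/")
        pairs.append((word, tag))
    return pairs

tagged = parse_tagged(
    "A/DT woman/NN came/VBD from/IN the/DT back/NN of/IN the/DT store/NN ./."
)
nouns = [word for word, tag in tagged if tag == "NN"]
print(nouns)
```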
  69. 69. POS-tagging. Example of ambiguity:
      1. Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NN ./.
      2. People/NNS continue/VBP to/TO inquire/VB the/DT reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN ./.
  70. 70. POS-tagging. Three main tagging algorithms or methods:
      1. rule-based tagging, e.g. ENGTWOL
      2. stochastic tagging, e.g. HMM tagger
      3. transformation-based tagging, e.g. Brill tagger
  74. 74. Rule-based POS-tagging. Example of ambiguity:
      1. Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NN ./.
      2. People/NNS continue/VBP to/TO inquire/VB the/DT reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN ./.
      Large database of hand-written disambiguation rules, e.g.:
        TO + VB → YES
        TO + NN → NO
        DT + NN → YES
        DT + VB → NO
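The four rules above can be turned into a toy disambiguator. This is a sketch of the idea only, not of ENGTWOL itself: the small lexicon, the default tag for unknown words, and the function name are assumptions made for this example.

```python
# Candidate tags per word (hypothetical mini-lexicon; first entry is the default).
LEXICON = {
    "to": ["TO"], "the": ["DT"], "race": ["NN", "VB"],
    "is": ["VBZ"], "expected": ["VBN"], "for": ["IN"], "reason": ["NN"],
}

# Hand-written constraints from the slide: TO + NN → NO, DT + VB → NO.
FORBIDDEN = {("TO", "NN"), ("DT", "VB")}

def tag_rule_based(words):
    """Tag left to right, discarding candidates that violate a constraint."""
    tagged = []
    previous_tag = None
    for word in words:
        candidates = LEXICON.get(word.lower(), ["NN"])  # assumption: unknown words are nouns
        allowed = [t for t in candidates if (previous_tag, t) not in FORBIDDEN]
        tag = (allowed or candidates)[0]
        tagged.append((word, tag))
        previous_tag = tag
    return tagged

print(tag_rule_based("is expected to race".split()))
print(tag_rule_based("the reason for the race".split()))
```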
  78. 78. Stochastic POS-tagging. Example of ambiguity:
      1. Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NN ./.
      2. People/NNS continue/VBP to/TO inquire/VB the/DT reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN ./.
      A training corpus is used to compute the probability of a given word having a given tag in a given context, e.g.:
        is/VBZ expected/VBN to/TO race/VB → 98%
        is/VBZ expected/VBN to/TO race/NN → 2%
        reason/NN for/IN the/DT race/NN → 97%
        reason/NN for/IN the/DT race/VB → 3%
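One common way to obtain such probabilities is to count tag transitions and word emissions in a tagged training corpus, as in a bigram HMM tagger. The sketch below scores one word at a time and is not a full Viterbi decoder. The four training sentences are invented for this example (only the first is adapted from the slide), so the resulting numbers differ from the percentages shown above.

```python
from collections import Counter

# Tiny hand-tagged training corpus (assumption: invented for illustration).
TRAINING = [
    [("Secretariat", "NNP"), ("is", "VBZ"), ("expected", "VBN"), ("to", "TO"), ("race", "VB"), ("tomorrow", "NN")],
    [("People", "NNS"), ("inquire", "VBP"), ("the", "DT"), ("reason", "NN"), ("for", "IN"), ("the", "DT"), ("race", "NN")],
    [("She", "PP"), ("went", "VBD"), ("to", "TO"), ("school", "NN")],
    [("They", "PP"), ("want", "VBP"), ("to", "TO"), ("win", "VB"), ("the", "DT"), ("race", "NN")],
]

transitions = Counter()   # (previous tag, tag) counts
prev_totals = Counter()   # how often each tag occurs as the previous tag
emissions = Counter()     # (tag, word) counts
tag_totals = Counter()    # how often each tag occurs

for sentence in TRAINING:
    previous = "<s>"      # sentence-start marker
    for word, tag in sentence:
        transitions[(previous, tag)] += 1
        prev_totals[previous] += 1
        emissions[(tag, word.lower())] += 1
        tag_totals[tag] += 1
        previous = tag

def score(word, tag, previous_tag):
    """P(tag | previous tag) * P(word | tag), estimated by relative frequency."""
    if prev_totals[previous_tag] == 0 or tag_totals[tag] == 0:
        return 0.0
    p_transition = transitions[(previous_tag, tag)] / prev_totals[previous_tag]
    p_emission = emissions[(tag, word.lower())] / tag_totals[tag]
    return p_transition * p_emission

def best_tag(word, previous_tag, candidates=("NN", "VB")):
    """Pick the candidate tag with the highest score in this context."""
    return max(candidates, key=lambda tag: score(word, tag, previous_tag))

print(best_tag("race", "TO"), best_tag("race", "DT"))
```

With so little data many probabilities are zero; real taggers train on large corpora and smooth the estimates.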
  85. 85. Transformation-based POS-tagging. Example of ambiguity:
      Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NN ./.
      People/NNS continue/VBP to/TO inquire/VB the/DT reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN ./.
      Rules are automatically induced from data using Machine Learning techniques, e.g.:
      1. A priori, prob(race = NN) = 65% and prob(race = VB) = 35% → the system would always take race = NN
      2. Machine Learning is used to learn conditional probabilities:
        is/VBZ expected/VBN to/TO race/VB → 98%
        reason/NN for/IN the/DT race/NN → 97%
      3. The system then takes race = NN or race = VB depending on the context.
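In a Brill-style tagger the learned knowledge takes the form of transformation rules applied after an initial most-frequent-tag pass, the classic example being "change NN to VB when the previous tag is TO". The sketch below hard-codes that one rule and a small tag dictionary; both are assumptions for illustration, whereas a real Brill tagger induces its rule list from a training corpus.

```python
# Step 1 resource: most frequent tag per word (hypothetical mini-dictionary).
MOST_FREQUENT_TAG = {
    "is": "VBZ", "expected": "VBN", "to": "TO", "race": "NN",
    "the": "DT", "reason": "NN", "for": "IN",
}

# Step 2 resource: transformations as (old tag, new tag, required previous tag).
TRANSFORMATIONS = [("NN", "VB", "TO")]

def tag_transformation_based(words):
    """Assign each word its most frequent tag, then repair tags with the rules."""
    tags = [MOST_FREQUENT_TAG.get(w.lower(), "NN") for w in words]  # unknown → NN (assumption)
    for old, new, required_previous in TRANSFORMATIONS:
        for i in range(1, len(tags)):
            if tags[i] == old and tags[i - 1] == required_previous:
                tags[i] = new
    return list(zip(words, tags))

print(tag_transformation_based("is expected to race".split()))
print(tag_transformation_based("the reason for the race".split()))
```

The first pass always tags race as NN, matching point 1 on the slide; the transformation then corrects it to VB only after TO.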
  86. 86. POS-tagging. POS-tagging tools for English: Brill tagger; Stanford Log-linear POS-tagger (Java); POS-tagger integrated in GATE (Java); POS-tagger with NLTK (Python).
  87. 87. Outline: Lecture 2
      1. Recap: Typical NLP tasks → practical examples with GATE
      2. Definition of semantics
      3. Frames approach: (a) FrameNet, (b) GATE for semantic/content analysis
  94. 94. Typical NLP tasks: Basic and simpler tasks
      Tokenization → RegEx
      Sentence splitting → RegEx
      POS-tagging → POS-tagging algorithms and tag sets
  99. 99. Typical NLP tasks: Complex tasks
      Lemmatization or Stemming → implementations of the Porter Stemmer (e.g. in Java), Stanford NLP tools, GATE, ...
      Topic extraction, NER, Semantic analysis, ... → ad hoc tools, e.g. dictionaries, ontologies, Frames, GATE, NLTK, ...
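The idea behind stemming can be shown with a deliberately naive suffix stripper. This is not the Porter algorithm, which applies several ordered rule phases with conditions on the remaining stem; the suffix list and minimum stem length below are assumptions chosen for this example.

```python
SUFFIXES = ("ing", "ed", "s")   # checked in this order (longest first)
MIN_STEM_LENGTH = 3             # avoid stripping very short words such as "is"

def naive_stem(word):
    """Strip one known suffix if enough of the word remains afterwards."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
            return word[: -len(suffix)]
    return word

# The morphology example from Lecture 1 collapses to a single stem.
print({form: naive_stem(form) for form in ["walk", "walks", "walked", "walking"]})
```

Irregular forms (ate, made) are out of reach for suffix stripping, which is where dictionary-based lemmatization comes in.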
  101. 101. EX.: GATE. Concrete examples with GATE: (1) Tokenizer, (2) Sentence-splitter, (3) POS-tagger, (4) Stemmer. GATE (https://gate.ac.uk/) is developed at the University of Sheffield, UK.
  102. 102. Semantics. ‘Then you should say what you mean,’ the March Hare went on. ‘I do,’ Alice hastily replied; ‘at least, I mean what I say – that’s the same thing, you know.’ ‘Not the same thing a bit!’ said the Hatter. ‘You might just as well say that “I see what I eat” is the same thing as “I eat what I see”!’ (Lewis Carroll, Alice in Wonderland)
  104. 104. Frames and FrameNet. Frame: a schematic representation of a situation involving various participants and other conceptual roles. E.g.:
      Abby bought a car from Robin for $5,000.
      Robin sold a car to Abby for $5,000.
      English FrameNet (https://framenet.icsi.berkeley.edu/fndrupal/) is developed at the International Computer Science Institute in Berkeley, California.
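The point of the two example sentences is that different verbs and word orders describe the same situation with the same participants. A minimal sketch of that idea as a data structure follows; the class, the hand-made annotations, and the role names Buyer, Seller, Goods, and Money are assumptions for this illustration, not output of FrameNet or GATE.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FrameInstance:
    """One annotated sentence: the evoked frame plus its role fillers."""
    frame: str
    buyer: str
    seller: str
    goods: str
    money: str

    def roles(self):
        """Return the role fillers independently of the evoking frame."""
        return {"Buyer": self.buyer, "Seller": self.seller,
                "Goods": self.goods, "Money": self.money}

# "Abby bought a car from Robin for $5,000."
bought = FrameInstance("Commerce-buy", buyer="Abby", seller="Robin", goods="a car", money="$5,000")
# "Robin sold a car to Abby for $5,000."
sold = FrameInstance("Commerce-sell", buyer="Abby", seller="Robin", goods="a car", money="$5,000")

print(bought.roles() == sold.roles())
```

The two instances differ only in the frame name, that is, in the perspective taken on the same commercial transaction.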
  105. 105. English FrameNet Example
  115. 115. Frame Relations FrameNet additionally captures relationships between different frames using relations. These include the following: Inheritance: When one frame is a more specific version of another, more abstract parent frame. Anything that is true about the parent frame must also be true about the child frame, and a mapping is specified between the frame elements of the parent and the frame elements of the child. Perspectivized-in: A neutral frame (like Commerce-transfer-goods) is connected to a frame with a specific perspective of the same scenario (e.g. the Commerce-sell frame, which assumes the perspective of the seller or the Commerce-buy frame, which assumes the perspective of the buyer) Subframe: Some frames like the Criminal-process frame refer to complex scenarios that consist of several individual states or events that can be described by separate frames like Arrest, Trial, and so on. Dr. Alexandra M. Liguori Introduction to Natural Language Processing in 3 Sessions
  116. 116. Frame Relations
      Precedes: Captures a temporal order that holds between subframes of a complex scenario.
      Causative-of and Inchoative-of: There is a fairly systematic relationship between stative descriptions (e.g. the Position-on-a-scale frame, "She had a high salary") and causative descriptions (Cause-change-of-scalar-position, "She raised his salary") or inchoative descriptions (Change-position-on-a-scale, "Her salary increased").
      Using: Holds between a frame and another frame that it involves in some way. For instance, the Judgment-communication frame uses both the Judgment frame and the Statement frame, but does not inherit from either of them because there is no clear correspondence between the frame elements.
      See-also: Connects frames that bear some resemblance but need to be distinguished carefully.
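The frame relations above can be pictured as labelled edges between frames. The following is only a minimal sketch under stated assumptions: the triple layout, the edge direction, and the helper names `targets` and `sources` are hypothetical and are not part of any FrameNet API. The frame names come from the slides; the edge "Arrest precedes Trial" is an illustrative guess, not a quoted FrameNet fact.

```python
# Each entry is (relation, source frame, target frame).
# Direction convention (an assumption): the more specific or dependent
# frame is the source, the frame it relates to is the target.
RELATIONS = [
    ("Perspectivized-in", "Commerce-sell", "Commerce-transfer-goods"),
    ("Perspectivized-in", "Commerce-buy", "Commerce-transfer-goods"),
    ("Subframe", "Arrest", "Criminal-process"),
    ("Subframe", "Trial", "Criminal-process"),
    ("Precedes", "Arrest", "Trial"),  # illustrative ordering only
    ("Using", "Judgment-communication", "Judgment"),
    ("Using", "Judgment-communication", "Statement"),
]


def targets(relation, source):
    """Frames that `source` points to via `relation`, sorted by name."""
    return sorted(t for r, s, t in RELATIONS if r == relation and s == source)


def sources(relation, target):
    """Frames that point to `target` via `relation`, sorted by name."""
    return sorted(s for r, s, t in RELATIONS if r == relation and t == target)


if __name__ == "__main__":
    print(sources("Subframe", "Criminal-process"))
    print(targets("Using", "Judgment-communication"))
```

Storing relations as plain triples keeps the sketch small; a real resource would also carry the frame-element mappings that Inheritance requires.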
  118. 118. Spanish FrameNet
      Frame: A schematic representation of a situation involving various participants and other conceptual roles. E.g.:
      El rock influye en los artistas de hoy en día para sus producciones artísticas. ("Rock influences today's artists in their artistic productions.")
      Los artistas de hoy en día se inspiran en el rock para sus producciones artísticas. ("Today's artists draw inspiration from rock for their artistic productions.")
      Spanish FrameNet: http://sfn.uab.es:8080/SFN (developed at the Universidad Autónoma de Barcelona and the International Computer Science Institute in Berkeley, California).
  119. 119. Spanish FrameNet Example (slides 119-128; figures not preserved in this transcript)
  130. 130. Frames and GATE
      And now: an example in English implementing Frames, LUs (lexical units), and FEs (frame elements) with GATE.
      GATE: https://gate.ac.uk/ (developed at the University of Sheffield, UK).
  131. 131. Outline: Lecture 3
      1. Recap: Typical NLP tasks
      2. Automatic Question Answering
      3. Reference resolution
      4. Named Entity Recognition (NER)
      5. Keyword / topic / information extraction
  138. 138. Typical NLP tasks: Basic and simpler tasks
      Tokenization: RegEx
      Sentence splitting: RegEx
      POS-tagging: POS-tagging algorithms and tag sets
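As a rough illustration of the "RegEx" entries above, tokenization and sentence splitting can each be approximated with a single regular expression. The patterns below are assumptions chosen for this sketch (they are not taken from the lecture or from GATE) and only cover simple English text.

```python
import re

# A token is a word (optionally with an internal apostrophe, e.g. "I'm")
# or a single punctuation character.
TOKEN_RE = re.compile(r"\w+(?:'\w+)?|[^\w\s]")

# A sentence boundary is whitespace that follows ., ! or ? and is
# followed by an uppercase letter.
SENTENCE_END_RE = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def tokenize(text):
    """Split text into word and punctuation tokens."""
    return TOKEN_RE.findall(text)


def split_sentences(text):
    """Split text into sentences at the boundaries described above."""
    return [s for s in SENTENCE_END_RE.split(text.strip()) if s]


if __name__ == "__main__":
    print(tokenize("I'm here, now."))
    print(split_sentences("John has a new car. It is red! Is it fast?"))
```

The sentence pattern is deliberately naive: an abbreviation such as "Mr." followed by a capitalized name is split in the wrong place, which is one reason practical splitters add abbreviation lists or trained models.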
  145. 145. Typical NLP tasks: Complex tasks
      Lemmatization or stemming: implementations of the Porter Stemmer (e.g. in Java), Stanford NLP tool, GATE, ...
      Syntactic parsing: Earley algorithm, CYK algorithm, GHR algorithm, Stanford Parser (Java implementation of a probabilistic algorithm)
      Question answering, topic extraction, NER, semantic analysis, ...: ad hoc tools, e.g. dictionaries, ontologies, Frames, GATE, NLTK, ...
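To make the CYK entry above concrete, here is a minimal sketch of a CYK recognizer. The toy grammar and lexicon are assumptions invented for this example (the lecture does not give a grammar), and the function only answers whether a sentence is derivable; it does not build a parse tree.

```python
from itertools import product

# Toy grammar in Chomsky normal form (an assumption for illustration).
BINARY = {  # (B, C) -> set of A such that A -> B C
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
LEXICAL = {  # word -> set of A such that A -> word
    "john": {"NP"},
    "has": {"V"},
    "a": {"Det"},
    "car": {"N"},
}


def cyk_recognize(words, start="S"):
    """Return True if the toy grammar derives the word sequence."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the nonterminals that derive words[i..j] inclusive.
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i] = set(LEXICAL.get(word.lower(), ()))
    for span in range(2, n + 1):          # length of the span
        for i in range(n - span + 1):     # start of the span
            j = i + span - 1              # end of the span
            for k in range(i, j):         # split point
                for b, c in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= BINARY.get((b, c), set())
    return start in table[0][n - 1]


if __name__ == "__main__":
    print(cyk_recognize("John has a car".split()))
    print(cyk_recognize("has John a car".split()))
```

The three nested loops over span length, start position, and split point give the cubic running time usually quoted for CYK.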
  146. 146. NLP: Ambiguities and Solutions (slides 146-147; figures not preserved in this transcript)
  148. 148. Question Answering: example from "Who's on First?"
  149. 149. Question Answering
  150. 150. Simple Question Answering: ELIZA
      ELIZA simulated a Rogerian psychologist and could carry on conversations with the user like the following:
      User-1: Men are all alike.
      ELIZA-1: IN WHAT WAY
      User-2: They're always bugging us about something or other.
      ELIZA-2: CAN YOU THINK OF A SPECIFIC EXAMPLE
      User-3: Well, my boyfriend made me come here.
      ELIZA-3: YOUR BOYFRIEND MADE YOU COME HERE
      User-4: Yes, he says I'm depressed much of the time.
      ELIZA-4: I AM SORRY TO HEAR YOU ARE DEPRESSED.
  153. 153. ELIZA: Regular expression substitutions
      ELIZA worked by applying a cascade of regular expression substitutions, each of which matched some part of the input line and changed it: change all instances of "my" to "YOUR", "I'm" to "YOU ARE", etc., e.g.:
      1. User-3: Well, my boyfriend made me come here. ELIZA-3: YOUR BOYFRIEND MADE YOU COME HERE
      2. User-4: ... I'm depressed ... ELIZA-4: ... YOU ARE DEPRESSED.
  158. 158. ELIZA: Regular expression substitutions
      ELIZA worked by applying a cascade of regular expression substitutions, each of which matched some part of the input line and changed it: relevant patterns in the input → create an appropriate output; e.g.:
      1. s/.* YOU ARE (depressed|sad) .*/I AM SORRY TO HEAR YOU ARE \1/
      2. s/.* YOU ARE (depressed|sad) .*/WHY DO YOU THINK YOU ARE \1/
      3. s/.* all .*/IN WHAT WAY/
      4. s/.* always .*/CAN YOU THINK OF A SPECIFIC EXAMPLE/
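The two-stage cascade described on the ELIZA slides can be sketched with Python's `re` module. This is a minimal sketch, not the original ELIZA program: the rule lists mirror the slide examples, while the function name `respond`, the use of word boundaries instead of literal spaces, and the `LEADING_FILLER` rule are assumptions added so that the sample dialogue comes out as shown.

```python
import re

# Stage 1: swap first-person forms for second-person ones.
PERSON_SWAPS = [
    (re.compile(r"\bI'm\b", re.IGNORECASE), "YOU ARE"),
    (re.compile(r"\bmy\b", re.IGNORECASE), "YOUR"),
    (re.compile(r"\bme\b", re.IGNORECASE), "YOU"),
]

# Stage 2: the first pattern that matches the whole line picks the reply.
RESPONSES = [
    (re.compile(r".*\bYOU ARE (depressed|sad)\b.*", re.IGNORECASE),
     r"I AM SORRY TO HEAR YOU ARE \1"),
    (re.compile(r".*\ball\b.*", re.IGNORECASE), "IN WHAT WAY"),
    (re.compile(r".*\balways\b.*", re.IGNORECASE),
     "CAN YOU THINK OF A SPECIFIC EXAMPLE"),
]

# Assumption (not on the slides): drop a leading filler word before echoing.
LEADING_FILLER = re.compile(r"^(?:well|yes|oh),\s*", re.IGNORECASE)


def respond(line):
    """Return an ELIZA-style reply to one line of user input."""
    text = line.strip()
    for pattern, replacement in PERSON_SWAPS:
        text = pattern.sub(replacement, text)
    for pattern, template in RESPONSES:
        match = pattern.fullmatch(text)
        if match:
            # expand() fills \1 with the captured word, e.g. "depressed".
            return match.expand(template).upper()
    # No response rule fired: echo the person-swapped input.
    return LEADING_FILLER.sub("", text).rstrip(".").upper()


if __name__ == "__main__":
    for utterance in [
        "Men are all alike.",
        "They're always bugging us about something or other.",
        "Well, my boyfriend made me come here.",
        "Yes, he says I'm depressed much of the time.",
    ]:
        print(respond(utterance))
```

Because the rules are tried in order and the first match wins, the second "depressed|sad" rule from the slide (WHY DO YOU THINK YOU ARE ...) would never fire if it were simply appended after the first; a fuller system would need some way to choose among rules that share a pattern.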
  161. 161. Reference resolution: Discourse
      Gracie: Oh yeah... And then Mr. and Mrs. Jones were having matrimonial trouble, and my brother was hired to watch Mrs. Jones.
      George: Well, I imagine she was a very attractive woman.
      Gracie: She was, and my brother watched her day and night for six months.
      George: Well, what happened?
      Gracie: She finally got a divorce.
      George: Mrs. Jones?
      Gracie: No, my brother's wife.
      John went to Bill's car dealership to check out an Acura Integra. He looked at it for about an hour.
  166. 166. Reference resolution
      1. Reference phenomena
      2. Constraints on coreference
      3. Preferences in pronoun interpretation
      4. Example of algorithm for pronoun resolution
  172. 172. Reference resolution: Reference phenomena
      1. Indefinite noun phrases ↔ I saw a Honda Civic today.
      2. Definite noun phrases ↔ I saw a Honda Civic today. The Honda Civic was blue.
      3. Pronouns ↔ I saw a Honda Civic today. It was blue.
      4. Demonstratives ↔ I bought a Honda Civic today. It's similar to the one I bought five years ago. That one was really nice, but I like this one even better.
      5. One-anaphora ↔ I saw no less than 6 Honda Civics today. Now I want one.
  178. 178. Reference resolution: Constraints on coreference
      1. number agreement ↔ John has a new car. It is red. / John has a new car. They are red. ("They" cannot refer to the singular "a new car".)
      2. person and case agreement ↔ John and Mary have new cars. They love them. ("They" = John and Mary; "them" = the cars.)
      3. gender agreement ↔ John has a new car. It is attractive. / John has a new car. He is attractive. ("It" = the car; "He" = John.)
      4. syntactic constraints ↔ John bought himself a new car. / John bought him a new car. ("himself" must be John; "him" cannot be John.)
      5. selectional restrictions ↔ John parked his car in the garage. He had driven it around for hours. ("it" = the car, since one drives a car, not a garage.)
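The agreement constraints above act as hard filters: a candidate antecedent that disagrees with the pronoun is discarded. The sketch below covers only number and gender agreement. The `Mention` class, the hand-assigned feature values, and the small pronoun table are assumptions made for illustration; a real system would derive these features from a tagger or a lexicon.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    text: str
    number: str  # "sg" or "pl"
    gender: str  # "masc", "fem", or "neut"


# Pronoun -> (number, gender); None means the pronoun does not constrain gender.
PRONOUNS = {
    "he": ("sg", "masc"),
    "him": ("sg", "masc"),
    "she": ("sg", "fem"),
    "her": ("sg", "fem"),
    "it": ("sg", "neut"),
    "they": ("pl", None),
    "them": ("pl", None),
}


def compatible_antecedents(pronoun, candidates):
    """Keep only candidates that agree with the pronoun in number and gender."""
    number, gender = PRONOUNS[pronoun.lower()]
    return [
        m for m in candidates
        if m.number == number and (gender is None or m.gender == gender)
    ]


if __name__ == "__main__":
    # "John has a new car. It is red." with hand-assigned features.
    john = Mention("John", "sg", "masc")
    car = Mention("a new car", "sg", "neut")
    print(compatible_antecedents("It", [john, car]))
    print(compatible_antecedents("They", [john, car]))
```

An empty result, as for "They" here, corresponds to the ruled-out variant in the slide's example pair.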
  184. 184. Reference resolution: Preferences in pronoun interpretation
      1. recency ↔ Peter has an Audi. Bob has a Honda. Anne likes to drive it. ("it" is preferably the Honda, the more recent mention.)
      2. grammatical role ↔ Peter went to the Honda dealership with Bob. He bought a Civic. / Bob went to the Honda dealership with Peter. He bought a Civic. ("He" is preferably the subject of the previous sentence.)
      3. repeated mention ↔ Anne needed a car to drive to her new job. She decided she wanted something roomy. Carol went to the Honda dealership with her. She bought a Civic. ("She" is preferably Anne, who has been mentioned repeatedly.)
      4. parallelism ↔ Anne went with Carol to the Honda dealership. Sally went with her to the VW dealership. ("her" is preferably Carol, who fills the same role in the first sentence.)
      5. verb semantics ↔ Peter seized the Honda pamphlet from Bob. He loves reading about cars. / Peter passed the Honda pamphlet to Bob. He loves reading about cars. (The verb shifts the preferred referent of "He": Peter with "seized", Bob with "passed".)
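Unlike the hard constraints, these preferences are soft and can be combined into a score. The sketch below is loosely in the spirit of salience-based pronoun resolution; it is not the algorithm presented in the lecture. All weights, the decay factor, and the tuple layout are illustrative assumptions, and only recency, grammatical role, and repeated mention are modelled.

```python
# Illustrative weights (assumptions, not values from the lecture).
RECENCY_WEIGHT = 100
ROLE_WEIGHT = {"subject": 80, "object": 50, "oblique": 40}
DECAY = 0.5  # salience is halved for each sentence of distance


def salience(role, sentence_distance, mention_count=1):
    """Score a candidate; higher means a more likely antecedent."""
    base = RECENCY_WEIGHT + ROLE_WEIGHT[role]
    return mention_count * base * DECAY ** sentence_distance


def resolve(candidates):
    """Pick the most salient candidate.

    Each candidate is (name, role, sentence_distance, mention_count),
    where sentence_distance counts sentences back from the pronoun.
    """
    return max(candidates, key=lambda c: salience(c[1], c[2], c[3]))[0]


if __name__ == "__main__":
    # "Peter went to the Honda dealership with Bob. He bought a Civic."
    print(resolve([("Peter", "subject", 1, 1), ("Bob", "oblique", 1, 1)]))
    # "Peter has an Audi. Bob has a Honda. Anne likes to drive it."
    print(resolve([("Audi", "object", 2, 1), ("Honda", "object", 1, 1)]))
```

In practice such a ranking would be applied only to candidates that already passed the agreement constraints, so the filter and the score complement each other.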
  192. 192. NER: Named Entity Recognition
      Can be broken down into two distinct problems:
      1. detection of names
      2. classification of the names by the type of entity to which they refer → 4 standard types:
          1. person (e.g. "Carol", "Tom Hanks", etc.)
          2. organization (e.g. "WWF", "IBM", "Bank of America", etc.)
          3. location (e.g. "London", "Washington D.C.", "L.A.", "Barcelona", etc.)
          4. other (e.g. "Hotel Sunshine", etc.)
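The two sub-problems above, detection and classification, can be illustrated with a toy recognizer. This is a minimal sketch under strong assumptions: names are detected as runs of capitalized words (optionally joined by "of"), and types come from a tiny hand-made gazetteer built from the slide's examples. None of this reflects how GATE or the other tools work internally.

```python
import re

# Tiny gazetteer built from the examples on the slide (an assumption).
GAZETTEER = {
    "person": {"Carol", "Tom Hanks"},
    "organization": {"WWF", "IBM", "Bank of America"},
    "location": {"London", "Washington D.C.", "L.A.", "Barcelona"},
}

# Detection: one or more capitalized words, optionally joined by "of".
NAME_RE = re.compile(r"\b[A-Z][\w.]*(?:\s+(?:of\s+)?[A-Z][\w.]*)*")


def classify(name):
    """Classification: look the name up, defaulting to 'other'."""
    for entity_type, names in GAZETTEER.items():
        if name in names:
            return entity_type
    return "other"


def recognize(text):
    """Return (name, type) pairs for every detected name in the text."""
    entities = []
    for match in NAME_RE.finditer(text):
        name = match.group()
        if classify(name) == "other":
            # Probably a sentence-final period rather than part of the name.
            name = name.rstrip(".")
        entities.append((name, classify(name)))
    return entities


if __name__ == "__main__":
    print(recognize("Tom Hanks flew from Barcelona to London "
                    "to meet IBM at Hotel Sunshine."))
```

The capitalization heuristic also fires on ordinary sentence-initial words such as "The", which is one reason practical systems rely on statistical models instead.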
  193. 193. NER: Tools for Named Entity Recognition
      GATE: for English, Spanish, and many more, via graphical interface and Java API (developed at the University of Sheffield, UK) https://gate.ac.uk/
      NETagger: Java-based Illinois Named Entity Recognition (developed by the Cognitive Computation Group at the University of Illinois at Urbana-Champaign) http://cogcomp.cs.illinois.edu/page/software_view/NETagger
      OpenNLP: rule-based and statistical Named Entity Recognition (developed by Apache) http://opennlp.apache.org/index.html
      Stanford CoreNLP: Java-based CRF Named Entity Recognition (developed by the Stanford Natural Language Processing Group) http://nlp.stanford.edu/software/CRF-NER.shtml
  194. 194. Keyword / topic / information extraction: Tools
      Keyword extraction: e.g. GATE (ANNIE tool) for English, Spanish, and many more, via graphical interface and Java API → simply using JAPE files for the LUs; tool from Volker ?
      Topic / information extraction: e.g. GATE (ANNIE tool) for English, Spanish, and many more, via graphical interface and Java API → using JAPE files for the LUs, FEs, and Frames
      GATE: https://gate.ac.uk/ (developed at the University of Sheffield, UK)
  195. 195. Thank you for your attention!
      For more information:
      Example textbook: Speech and Language Processing by D. Jurafsky and J. H. Martin
      Web page: www.alexandramliguoriphd.com
      LinkedIn profile: Alexandra M. Liguori, Ph.D.
