Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Saurabh Kaushik
Agenda
• Part 1:
• Why NLP?
• What is NLP?
• What is Word & Sentence Modelling in NLP?
• What is Word Representation in NLP?
• What is Language Modelling in NLP?
• Part 2:
• Why DL for NLP?
• What is DL?
• What is DL for NLP?
• How does RNN work for NLP?
• How does CNN work for NLP?
WHY NLP?
What are the Generally Known NLP Applications?
Search
Customer Support
Q & A
Summarization
Are there Deeper Applications of NLP?
Group 1
Cleanup, tokenization
Stemming
Lemmatization
Part-of-speech tagging
Query expansion
Parsing
Topic segmentation and recognition
Morphological segmentation (words/sentences)
Group 2
Information retrieval and extraction (IR)
Relationship Extraction
Named entity recognition (NER)
Sentiment analysis / Sentence boundary disambiguation
Word sense disambiguation
Text similarity
Coreference resolution
Discourse analysis
Group 3
Machine translation
Automatic summarization /
Paraphrasing
Natural language generation
Reasoning over Knowledge base
Question answering System
Dialog System
Image Captioning & other multimodal
tasks
WHAT IS NLP?
What is NLP?
• According to Wikipedia:
• Natural language processing (NLP) is a field of computer science and linguistics concerned with the
• interactions between computers and human (natural) languages.
So far, computing devices and their interactions with humans have been two separate things. But in a truly digital world, this gap needs to be bridged by integrating human conversational understanding into intelligent apps/systems/things, in order to achieve their true potential.
Ref: https://en.wikipedia.org/wiki/Natural_language_processing
Why is Language so Challenging for Computers?
• Language is ambiguous: every sentence has many possible interpretations.
• Language is productive: we will always encounter new words or new constructions.
• Language is culturally specific: the same word can have different meanings.
What is NLP Processing?
• Lexical Analysis − It involves identifying and analyzing the structure of words. The lexicon of a language is its collection of words and phrases. Lexical analysis divides the whole chunk of text into paragraphs, sentences, and words.
• Syntactic Analysis (Parsing) − It involves analyzing the words in a sentence for grammar, and arranging them in a manner that shows the relationships among the words. A sentence such as “The school goes to boy” is rejected by an English syntactic analyzer.
• Semantic Analysis − It draws the exact meaning, or the dictionary meaning, from the text. The text is checked for meaningfulness. This is done by mapping syntactic structures onto objects in the task domain. The semantic analyzer disregards sentences such as “hot ice-cream”. Also called compositional semantics.
• Discourse Integration − The meaning of any sentence depends upon the meaning of the sentence just before it. It also informs the meaning of the immediately succeeding sentence.
• Pragmatic Analysis − What was said is re-interpreted in terms of what it actually meant. It involves deriving those aspects of language which require real-world knowledge.
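To make the first two stages concrete, here is a minimal sketch of lexical analysis (tokenization) and syntactic analysis (dependency parsing) using spaCy. The library and model choice are assumptions for illustration; the slides do not prescribe a tool.

```python
# Assumes: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The bird pecks the grains.")
for token in doc:
    # word, part of speech, syntactic role, and the word it attaches to
    print(token.text, token.pos_, token.dep_, token.head.text)
```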
What are the Basic Components of NLP?
Example: “The bird pecks the grains”
• Grammar Parsing:
• Articles (DET) − a | an | the
• Nouns − bird | birds | grain | grains
• Noun Phrase (NP) − Article + Noun | Article + Adjective + Noun = DET N | DET ADJ N
• Verbs − pecks | pecking | pecked
• Verb Phrase (VP) − NP V | V NP
• Adjectives (ADJ) − beautiful | small | chirping
• POS Tagging:
• Parsing:
• S → NP VP
• NP → DET N | DET ADJ N
• VP → V NP
• Lexicon:
• DET → a | the
• ADJ → beautiful | perching
• N → bird | birds | grain | grains
• V → peck | pecks | pecking
Parse Tree: (shown as an image in the original slides; a runnable sketch follows below)
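This toy grammar can be run as-is. Below is a minimal sketch using NLTK's CFG tools (an assumed library choice) that parses the example sentence with the slide's production rules and lexicon.

```python
import nltk

# The slide's toy grammar, written in NLTK's CFG notation.
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> DET N | DET ADJ N
VP -> V NP
DET -> 'a' | 'the'
ADJ -> 'beautiful' | 'perching'
N -> 'bird' | 'birds' | 'grain' | 'grains'
V -> 'peck' | 'pecks' | 'pecking'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("the bird pecks the grains".split()):
    # Renders the parse tree: (S (NP the bird) (VP pecks (NP the grains)))
    tree.pretty_print()
```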
How does NLP Understand Syntax?
Part-of-Speech Tagging
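A minimal sketch of part-of-speech tagging with NLTK (again an assumed tool choice; the tagger model must be downloaded once):

```python
import nltk
# One-time model downloads:
# nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')

tokens = nltk.word_tokenize("The bird pecks the grains")
print(nltk.pos_tag(tokens))
# e.g. [('The', 'DT'), ('bird', 'NN'), ('pecks', 'VBZ'), ('the', 'DT'), ('grains', 'NNS')]
```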
HOW ARE WORDS &
SENTENCES MODELLED IN
NLP?
How does NLP get Word Meanings?
Word Meaning:
• What is the meaning of words?
• Most words have many different senses:
• E.g. dog = animal or sausage?
• Polysemy:
• A lexeme is polysemous if it has different related senses
• E.g. bank = financial institution or building
• Homonyms:
• Two lexemes are homonyms if their senses are
unrelated, but they happen to have the same spelling
and pronunciation
• E.g. bank = (financial) bank or (river) bank
How does NLP get Word Relationships?
• How are the meanings of different words related?
• Specific relations between senses:
• E.g. Animal is more general than dog.
• Semantic fields:
• E.g. money is related to bank.
Word Relationships:
• Symmetric relations:
• Synonyms: couch/sofa (two lemmas with the same sense)
• Antonyms: cold/hot, rise/fall, in/out (two lemmas with opposite senses)
• Hierarchical relations:
• Hypernyms and hyponyms: pet/dog (the hyponym, dog, is more specific than the hypernym, pet)
• Holonyms and meronyms: car/wheel (the meronym, wheel, is a part of the holonym, car)
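These relations can be explored programmatically. A minimal sketch using NLTK's WordNet interface (an assumed tool; requires a one-time nltk.download('wordnet')):

```python
from nltk.corpus import wordnet as wn

dog = wn.synset('dog.n.01')
print(dog.hypernyms())           # more general senses, e.g. canine.n.02
print(dog.hyponyms()[:3])        # more specific senses (kinds of dog)

car = wn.synset('car.n.01')
print(car.part_meronyms()[:3])   # parts of a car, e.g. accelerator, air bag
```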
How does NLP get Sentence Composability?
• Principle of compositionality:
• “The meaning (vector) of a complex expression (sentence) is determined by:
• the meanings of its constituent expressions (words), and
• the rules (grammar) used to combine them.”
• Scene parsing:
• The meaning of a scene image is likewise a function of its smaller regions,
• how they combine to form larger objects,
• and how those objects interact.
• Sentence parsing:
• The meaning of a sentence is a function of its words,
• how they combine to form larger sentences,
• and how the words interact in a given sentence.
WHAT IS WORD
REPRESENTATION IN
NLP?
What is the Basic Linear Representation of Words?
Bag of Words
Definition
• Documents are treated as a “bag” of words or terms.
• Any document can be represented as a vector: a list of terms and their associated weights.
Pros
• Simple model to start with
Cons
• Disregards grammar (term.baseform?)
• Disregards word order (term.position)
• Keeps only multiplicity (term.frequency)
• Less accurate
Technique: TF-IDF
• Term frequency – inverse document frequency.
• TF is the term frequency in a document, i.e. a measure of how much information the term brings within one document.
• IDF is the inverse document frequency of the term, i.e. an inverse measure of how much information the term brings across all documents (the corpus).
• Formula: tfidf(t, d, D) = tf(t, d) · idf(t, D), where idf(t, D) = log( |D| / |{d ∈ D : t ∈ d}| )
• t = term, d = one document, D = all documents (a runnable sketch follows below)
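A minimal sketch of the bag-of-words model with TF-IDF weighting, using scikit-learn (an assumed library; the three documents are made up for illustration):

```python
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "the bird pecks the grains",
    "the birds peck the grain",
    "a beautiful small bird",
]
vec = TfidfVectorizer()
X = vec.fit_transform(docs)          # sparse document-term matrix
print(vec.get_feature_names_out())   # the learned vocabulary (terms)
print(X.toarray().round(2))          # one TF-IDF weighted vector per document
```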
What is Distributed Representation?
Linguistic items with similar distributions have similar meanings. Generally, this is based on co-occurrence/context and on the distributional hypothesis: distributional meaning as a co-occurrence vector.
Statistical (Bag-of-Words) Modeling
• Word-ordering information is lost
• Data sparsity
• Words as atomic symbols
• Very hard to find higher-level features
• Features other than BOW
Neural Network Modeling
• Trained in a completely unsupervised way
• Reduces data sparsity
• Semantic hashing
• Vectors appear to carry semantic information about the words
• Freely available for out-of-the-box usage
What is One-Hot Encoding?
Definition:
• The vast majority of rule-based and statistical NLP work regards words as atomic symbols.
• Form a vocabulary that maps lemmatized words to a unique ID (the position of the word in the vocabulary).
• Typical vocabulary sizes vary between 10,000 and 250,000.
• The one-hot vector of an ID is a vector filled with 0s, except for a 1 at the position associated with the ID.
• E.g.: for vocabulary size D = 10, the one-hot vector of word ID w = 4 is e(w) = [ 0 0 0 1 0 0 0 0 0 0 ].
• A one-hot encoding makes no assumption about word similarity: all words are equally different from each other.
Pros
• Simplicity
Cons
• The notion of word similarity is undefined with one-hot encoding:
social [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
public [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
• Impossible to generalize to unseen words
• One-hot encoding can be memory inefficient
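A minimal sketch of one-hot encoding over a hypothetical toy vocabulary (the words and mapping are made up for illustration):

```python
import numpy as np

# Hypothetical vocabulary mapping words to IDs (positions).
vocab = {"the": 0, "bird": 1, "pecks": 2, "grain": 3, "social": 4, "public": 5}

def one_hot(word: str) -> np.ndarray:
    v = np.zeros(len(vocab))
    v[vocab[word]] = 1.0
    return v

print(one_hot("bird"))                        # [0. 1. 0. 0. 0. 0.]
print(one_hot("social") @ one_hot("public"))  # 0.0: every pair of distinct words
# has dot product 0, so similarity is undefined under one-hot encoding
```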
What is Word Embedding?
• One of the most successful ideas of modern statistical NLP!
• “You shall know a word by the company it keeps” (J. R. Firth, 1957).
(Figure in the original slides: a neighborhood of word vectors representing “banking”.)
Definition:
• Helps capture syntactic as well as semantic similarity.
Pros
• Simplicity
• Possible to generalize to unseen words
Cons
• All words are equal, but some words are more equal than others.
What is Word Embedding?
Vector Representation
• Maps each document in a corpus to an n-dimensional vector, where n is the size of the vocabulary.
• Represents each unique word as a dimension; the magnitude along this dimension is the count of that word in the document.
• Given such vectors a, b, …, we can compute the vector dot product and the cosine of the angle between them.
• Cosine similarity: cos(a, b) = (a · b) / (‖a‖ ‖b‖)
• The angle is a measure of alignment between two vectors and hence of their similarity.
• An example of its use in information retrieval: vectorize both the query string q and the documents, and compute similarity(q, di) for all i from 1 to n.
(Figure in the original slides: the Word2Vec vector for “Sweden”.)
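A minimal sketch of cosine similarity between two count vectors (the numbers are hypothetical):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 3-term count vectors for a query and a document.
q = np.array([1.0, 2.0, 0.0])
d = np.array([2.0, 3.0, 1.0])
print(cosine_similarity(q, d))  # close to 1.0 means well aligned, hence similar
```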
What is Word Embedding?
A classical example of how vectors can help a computer understand semantic relationships between the words of a language (canonically, king − man + woman ≈ queen).
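A minimal sketch of learning such vectors with gensim's Word2Vec (an assumed library choice; the two-sentence corpus is far too small to reproduce the analogy and only shows the API shape):

```python
from gensim.models import Word2Vec

# Toy corpus: real embeddings need millions of sentences.
sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
]
model = Word2Vec(sentences, vector_size=50, min_count=1, seed=1)

# The query behind the "king - man + woman ≈ queen" analogy:
print(model.wv.most_similar(positive=["king"], topn=3))
```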
WHAT IS LANGUAGE
MODELING IN NLP?
What is Language Modeling?
• A language model is a probabilistic model that assigns probabilities to any sequence of words p(w1, ..., wT).
• Language modeling is the task of learning a language model that assigns high probabilities to well-formed sentences.
• It plays a crucial role in speech recognition and machine translation systems.
• There are four types of language modelling:
• Linear language modelling: addressed by finding the probability of a word appearing in a corpus.
• Statistical language modelling: addressed by finding the probability of a word in a sequence / in the presence of other words.
• Neural language modelling: addressed by understanding the context of a word in its neighborhood.
• Recursive language modelling: addressed by understanding the sequence of words appearing one after another.
What is Linear Language Modelling? (N-Gram)
• An n-gram is a sequence of n words:
• unigrams (n=1): “is”, “a”, “sequence”, etc.
• bigrams (n=2): [“is”, “a”], [“a”, “sequence”], etc.
• trigrams (n=3): [“is”, “a”, “sequence”], [“a”, “sequence”, “of”], etc.
• n-gram models estimate the conditional probability of a word from n-gram counts (a counting sketch follows below).
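A minimal sketch of estimating a bigram conditional probability from counts (the sentence is made up for illustration):

```python
from collections import Counter

tokens = "this is a sequence of words and this is a sequence".split()

def ngrams(seq, n):
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

unigrams = Counter(ngrams(tokens, 1))
bigrams = Counter(ngrams(tokens, 2))

# Maximum-likelihood estimate: p(w2 | w1) = count(w1 w2) / count(w1)
p = bigrams[("is", "a")] / unigrams[("is",)]
print(p)  # 1.0: in this toy corpus, "is" is always followed by "a"
```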
What is Statistical Language Modelling?
• Problem:
• How can we handle co-occurrence of language in our models?
• Solution:
• Using probabilistic modeling, any co-occurrence of words can be modelled.
• A language model is a probabilistic model that assigns probabilities to any sequence of words p(w1, ..., wT).
• Language modeling is the task of learning a language model that assigns high probabilities to well-formed sentences.
• It plays a crucial role in speech recognition and machine translation systems.
• Language models define probability distributions over (natural-language) strings or sentences.
• Joint and conditional probability (chain rule): p(w1, ..., wT) = p(w1) · p(w2 | w1) · … · p(wT | w1, ..., wT−1)
Neural Language Modelling
• Problem:
• How can we handle the context of language in our models?
• Solution:
• Neural networks can theoretically (given enough units) approximate “any” function and fit “any” kind of data.
• Efficient for NLP: hidden layers can be used as word lookup tables.
• Dense distributed word vectors plus efficient NN training algorithms can scale to billions of words! (A tiny sketch follows below.)
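A minimal sketch of a fixed-window neural language model in PyTorch (a Bengio-style architecture; all sizes and names here are illustrative assumptions, not from the slides):

```python
import torch
import torch.nn as nn

# Hypothetical sizes for a tiny fixed-window model.
vocab_size, embed_dim, context = 100, 16, 2

class NeuralLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)  # the word lookup table
        self.ff = nn.Sequential(
            nn.Linear(context * embed_dim, 64),
            nn.Tanh(),
            nn.Linear(64, vocab_size),                    # scores for the next word
        )

    def forward(self, context_ids):             # (batch, context) word IDs
        e = self.embed(context_ids).flatten(1)  # concatenate the context embeddings
        return self.ff(e)                       # (batch, vocab_size) logits

model = NeuralLM()
logits = model(torch.tensor([[3, 7]]))          # IDs of the two previous words
probs = torch.softmax(logits, dim=-1)           # p(next word | context)
print(probs.shape)                              # torch.Size([1, 100])
```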
What is Recursive Language Modelling?
• Problem:
• How do we handle the compositionality of language in our models?
• Solution:
• Recursion: the same operator (the same parameters) is applied repeatedly on different components. Applied step by step along a word sequence, this is also called a Recurrent Neural Network (RNN).
Recursive Neural Networks (RNN)
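A minimal sketch of the recurrent case in PyTorch, where one cell with shared parameters is applied at every position (the sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

# The same cell (same parameters) is applied at every position in the sequence.
rnn = nn.RNN(input_size=16, hidden_size=32, batch_first=True)
word_vectors = torch.randn(1, 5, 16)  # a batch of one 5-word sentence
outputs, h_n = rnn(word_vectors)      # outputs: (1, 5, 32); h_n: (1, 1, 32) final state
print(outputs.shape, h_n.shape)
```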
Thank You
Saurabh Kaushik