Deep Learning for NLP
An Introduction to Neural Word Embeddings*
Roelof Pieters

PhD candidate KTH/CSC
CIO/CTO Feeda AB
Feeda KTH, December 4, 2014
roelof@kth.se www.csc.kth.se/~roelof/ @graphific
*and some more fun stuff…
1. DEEP LEARNING
2. NLP: WORD EMBEDDINGS
A couple of headlines… [all November ’14]
Deep Learning = Machine Learning
“Improving some task T based on experience E with respect to performance measure P.”
— T. Mitchell 1997
“Learning denotes changes in the system that are adaptive in the sense that they enable the system to do the same task (or tasks drawn from a population of similar tasks) more effectively the next time.”
— H. Simon 1983, “Why Should Machines Learn?” (reprinted in Mitchell 1997)
Representation learning: attempts to automatically learn good features or representations.
Deep learning: attempts to learn multiple levels of representation of increasing complexity/abstraction.
Deep Learning: What?
ML: Traditional Approach
For each new problem/question:
1. Gather as much LABELED data as you can get
2. Throw some algorithms at it (mostly an SVM, and leave it at that)
3. If you actually tried more algorithms: pick the best
4. Spend hours hand-engineering features / feature selection / dimensionality reduction (PCA, SVD, etc.)
5. Repeat…
History
• Perceptron (’57-69…)
• Multi-Layered Perceptrons (’86)
• SVMs (popularized 00s)
• RBM (‘92+)
• “2006”
Rosenblatt 1957 vs Minsky & Papert
History
• Perceptron (’57-69…)
• Multi-Layered Perceptrons (’86)
• SVMs (popularized 00s)
• RBM (‘92+)
• “2006”
(Rumelhart, Hinton & Williams, 1986)
Backprop Renaissance
• Multi-Layered Perceptrons (’86)
• Uses Backpropagation (Bryson & Ho 1969):
back-propagates the error signal computed at the
output layer to get derivatives for learning, in
order to update the weight vectors until
convergence is reached
Backprop Renaissance
Forward Propagation
• Sum inputs, produce activation, feed-forward
Backprop Renaissance
Back Propagation (of error)
• Calculate total error at the top
• Calculate contributions to error at each step going
backwards
• Compute the gradient of the example-wise loss w.r.t. the parameters
• Simply applying the derivative chain rule wisely (see the sketch below)
• If computing the loss(example, parameters) is O(n) computation, then so is computing the gradient
Backpropagation
Simple Chain Rule
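Not from the slides: a minimal numpy sketch of the forward pass and the chain-rule backward pass for a tiny one-hidden-layer net. The layer sizes, sigmoid activations and squared loss are illustrative assumptions.

# Minimal sketch (illustrative): forward propagation and backprop via the
# chain rule for a one-hidden-layer net with sigmoid units.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.RandomState(0)
x = rng.randn(3)                              # input (3 made-up features)
t = np.array([1.0])                           # target
W1, b1 = rng.randn(4, 3) * 0.1, np.zeros(4)   # hidden-layer parameters
W2, b2 = rng.randn(1, 4) * 0.1, np.zeros(1)   # output-layer parameters

# Forward propagation: sum inputs, produce activation, feed forward
a1 = sigmoid(W1 @ x + b1)
y  = sigmoid(W2 @ a1 + b2)
loss = 0.5 * np.sum((y - t) ** 2)             # example-wise squared loss

# Back propagation (of error): total error at the top, then contributions
# at each step going backwards, i.e. the derivative chain rule.
delta2 = (y - t) * y * (1 - y)                # dLoss / d(output pre-activation)
dW2, db2 = np.outer(delta2, a1), delta2
delta1 = (W2.T @ delta2) * a1 * (1 - a1)      # error propagated to hidden layer
dW1, db1 = np.outer(delta1, x), delta1

# Gradient-descent update of the weight vectors
lr = 0.1
W2 -= lr * dW2; b2 -= lr * db2
W1 -= lr * dW1; b1 -= lr * db1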
History
• Perceptron (’57-69…)
• Multi-Layered Perceptrons (’86)
• SVMs (popularized 00s)
• RBM (‘92+)
• “2006”
(Cortes & Vapnik 1995)
Kernel SVM
History
• Perceptron (’57-69…)
• Multi-Layered Perceptrons (’86)
• SVMs (popularized 00s)
• RBM (‘92+)
• “2006”
• Form of log-linear Markov Random Field (MRF)
• Bipartite graph, with no intra-layer connections
• Energy Function
RBM: Structure
• Training Function:
• often by contrastive divergence (CD) (Hinton 1999; Hinton 2000); a minimal CD-1 sketch follows below
• Gibbs sampling
• Gradient Descent
• Goal: compute weight updates
RBM: Training
more info: Geoffrey Hinton (2010). A Practical Guide to Training Restricted Boltzmann Machines
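As a rough illustration of the recipe above (not taken from the slides), a minimal numpy sketch of one CD-1 update for a binary RBM, using the standard energy E(v,h) = -a·v - b·h - v·W·h; sizes, data and learning rate are made up, and Hinton's practical guide covers the real details.

# Minimal sketch: one contrastive-divergence (CD-1) update for a binary RBM.
import numpy as np

rng = np.random.RandomState(0)
n_visible, n_hidden, lr = 6, 3, 0.1              # illustrative sizes
W = rng.randn(n_visible, n_hidden) * 0.01
a = np.zeros(n_visible)                          # visible biases
b = np.zeros(n_hidden)                           # hidden biases

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

v0 = (rng.rand(n_visible) < 0.5).astype(float)   # one (made-up) training vector

# Positive phase: sample hidden units given the data
p_h0 = sigmoid(v0 @ W + b)
h0 = (rng.rand(n_hidden) < p_h0).astype(float)

# Negative phase: one step of Gibbs sampling (reconstruct v, re-infer h)
p_v1 = sigmoid(h0 @ W.T + a)
v1 = (rng.rand(n_visible) < p_v1).astype(float)
p_h1 = sigmoid(v1 @ W + b)

# CD-1 weight update: <v h>_data minus <v h>_reconstruction
W += lr * (np.outer(v0, p_h0) - np.outer(v1, p_h1))
a += lr * (v0 - v1)
b += lr * (p_h0 - p_h1)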
History
• Perceptron (’57-69…)
• Multi-Layered Perceptrons (’86)
• SVMs (popularized 00s)
• RBM (‘92+)
• “2006”
1. More labeled data
(“Big Data”)
2. GPUs
3. “layer-wise
unsupervised
feature learning”
Stacking Single Layer Learners
One of the big ideas from 2006: layer-wise
unsupervised feature learning
- Stacking Restricted Boltzmann Machines (RBM) ->
Deep Belief Network (DBN)
- Stacking regularized auto-encoders -> deep neural
nets
• Stacked RBMs
• Introduced by Hinton et al. (2006)
• 1st RBM’s hidden layer == 2nd RBM’s input layer (greedy layer-wise stacking; see the sketch below)
• Feature Hierarchy
Deep Belief Network (DBN)
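A schematic sketch of the layer-wise idea: each RBM is trained unsupervised on the hidden activations of the one below it, so the first RBM's hidden layer becomes the second RBM's input. `train_rbm`, the layer sizes and the data are placeholders, not a real library API.

# Schematic sketch of a DBN built by stacking RBMs (after Hinton et al. 2006).
import numpy as np

def train_rbm(data, n_hidden):
    """Placeholder: train one RBM on `data` (e.g. with the CD-1 update sketched
    above) and return its (weights, hidden biases)."""
    rng = np.random.RandomState(0)
    W = rng.randn(data.shape[1], n_hidden) * 0.01
    b = np.zeros(n_hidden)
    # ... CD updates over `data` would go here ...
    return W, b

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.random.RandomState(1).rand(100, 784)   # made-up "raw" input data
layer_sizes = [500, 200, 50]                  # illustrative hidden-layer sizes

stack, inputs = [], X
for n_hidden in layer_sizes:
    W, b = train_rbm(inputs, n_hidden)        # train this layer unsupervised
    stack.append((W, b))
    inputs = sigmoid(inputs @ W + b)          # its hidden layer feeds the next RBM

# `stack` now holds a feature hierarchy; it can initialise a deep net that is
# then fine-tuned with backprop on a supervised task.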
Biological Justification
Deep Learning = Brain “inspired”
Audio/Visual Cortex has multiple stages == Hierarchical
“Brainiacs” vs “Pragmatists”
• Computational Biology • CVAP
• Jorge Dávila-Chacón • “that guy”
Different Levels of Abstraction
Hierarchical Learning
• Natural progression
from low level to high
level structure as seen
in natural complexity
• Easier to monitor what
is being learnt and to
guide the machine to
better subspaces
• A good lower level
representation can be
used for many distinct
tasks
Different Levels of Abstraction
Feature Representation
• Shared Low Level
Representations
• Multi-Task Learning
• Unsupervised Training
• Partial Feature Sharing
• Mixed Mode Learning
• Composition of
Functions
Generalizable Learning
Classic Deep Architecture
Input layer
Hidden layers
Output layer
Modern Deep Architecture
Input layer
Hidden layers
Output layer
movie time:
http://www.cs.toronto.edu/~hinton/adi/index.htm
Deep Learning
Hierarchies
Efficient
Generalization
Distributed
Sharing
Unsupervised*
Black Box
Training Time
Major PWNAGE!
Much Data
Why go Deep ?
No More Handcrafted Features !
“I’ve worked all my life in Machine Learning, and I’ve never seen one algorithm knock over benchmarks like Deep Learning”
— Andrew Ng
Deep Learning: Why?
Deep Learning: Why?
Beat state of the art in many areas:
• Language Modeling (2012, Mikolov et al)
• Image Recognition (Krizhevsky won
2012 ImageNet competition)
• Sentiment Classification (2011, Socher et
al)
• Speech Recognition (2010, Dahl et al)
• MNIST hand-written digit recognition
(Ciresan et al, 2010)
One Model rules them all?
DL approaches have been successfully applied to:
Deep Learning: Why for NLP ?
Automatic summarization
Coreference resolution
Discourse analysis
Machine translation
Morphological segmentation
Named entity recognition (NER)
Natural language generation
Natural language understanding
Optical character recognition (OCR)
Part-of-speech tagging
Parsing
Question answering
Relationship extraction
Sentence boundary disambiguation
Sentiment analysis
Speech recognition
Speech segmentation
Topic segmentation and recognition
Word segmentation
Word sense disambiguation
Information retrieval (IR)
Information extraction (IE)
Speech processing
1. DEEP LEARNING
2. NLP: WORD EMBEDDINGS
Word Representation
• NLP (rule-based/statistical approaches at least) treats words mainly as atomic symbols: Love, Candy, Store
• or in vector space, the “one-hot” representation:
Candy = [0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 …]
• Its problem? No notion of similarity between words (see the sketch below):
Candy [0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 …] AND
Store [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 …] = 0 !
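A tiny sketch of that problem: with one-hot vectors every pair of distinct words is orthogonal, so “candy” and “store” share nothing. The toy vocabulary, indices and vector size are made up.

# One-hot representation: every word is an atomic symbol, so any two different
# words have zero similarity (toy vocabulary with made-up indices).
import numpy as np

vocab = ["love", "candy", "store"]           # pretend this is a huge vocabulary

def one_hot(word, size=17):
    v = np.zeros(size)
    v[vocab.index(word)] = 1.0               # a single 1, everything else 0
    return v

candy, store = one_hot("candy"), one_hot("store")
print(candy @ store)    # 0.0 -> "candy" AND "store" share no structure
print(candy @ candy)    # 1.0 -> only identical words match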
Distributional representations
“You shall know a word by the company it keeps”

(J. R. Firth 1957)
One of the most successful ideas of modern
statistical NLP!
(figure: the highlighted context words in an example sentence represent “banking”)
• Hard (class-based) clustering models
• Soft clustering models
• Word Embeddings (Bengio et al, 2001;
Bengio et al, 2003) based on idea of
distributed representations for symbols
(Hinton 1986)
• Neural Word embeddings (Mnih and Hinton
2007, Collobert & Weston 2008, Turian et al
2010; Collobert et al. 2011, Mikolov et al.
2011)
Language Modeling
Neural distributional representations
• Neural word embeddings
• Combine vector space semantics with the prediction
of probabilistic models
• Words are represented as a dense vector:
Candy = (a dense, real-valued vector; figure omitted, see the sketch below)
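For contrast with the one-hot case, a small sketch of what a dense embedding buys: words become short real-valued vectors in a learned lookup table, and similarity becomes a dot product / cosine. The 5-dimensional vectors below are invented purely for illustration.

# Dense embeddings (illustrative numbers only): words live in a low-dimensional
# real-valued space, and related words end up close to each other.
import numpy as np

E = {                                    # a tiny, made-up "learned" lookup table
    "candy":  np.array([0.2, -0.1, 0.7, 0.3, -0.4]),
    "sweets": np.array([0.25, -0.05, 0.6, 0.2, -0.35]),
    "store":  np.array([-0.5, 0.8, 0.1, -0.2, 0.4]),
}

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

print(cosine(E["candy"], E["sweets"]))   # high: similar meanings, nearby vectors
print(cosine(E["candy"], E["store"]))    # lower: less related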
Word Embeddings: Socher
Vector Space Model
Figure (edited) from Bengio, “Representation Learning and Deep Learning”, July 2012, UCLA
In a perfect world:
input:
- the country of my birth
- the place where I was born
(figure: the two input phrases map to nearby points in the vector space)
• Recursive (Neural) Tensor Network (RNTN)
(Socher et al. 2011; Socher 2014)
• Top-down hierarchical net (vs feed-forward)
• NLP!
• Sequence-based classification, windows of several events, entire scenes (rather than images), entire sentences (rather than words)
• Features = Vectors
• A tensor = multi-dimensional matrix, or multiple matrices of the same size (composition is sketched below)
Recursive Neural (Tensor) Network
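A minimal sketch of the recursive idea, with the tensor term left out: the vector of a phrase is computed from its children's vectors by one shared composition function applied over the parse tree. The matrices, word vectors and tree below are illustrative assumptions, not Socher's trained model.

# Sketch of recursive composition over a parse tree (plain recursive-net version;
# the full RNTN adds a tensor term for multiplicative interactions).
import numpy as np

d = 4                                        # embedding dimension (illustrative)
rng = np.random.RandomState(0)
W = rng.randn(d, 2 * d) * 0.1                # shared composition matrix (made up)
b = np.zeros(d)
vec = {w: rng.randn(d) * 0.1 for w in ["the", "country", "of", "my", "birth"]}

def compose(left, right):
    """Parent vector from two child vectors: p = tanh(W [left; right] + b)."""
    return np.tanh(W @ np.concatenate([left, right]) + b)

# Tree: ((the country) (of (my birth)))
np1 = compose(vec["the"], vec["country"])
np2 = compose(vec["my"], vec["birth"])
pp  = compose(vec["of"], np2)
sentence = compose(np1, pp)                  # one vector for the whole phrase
print(sentence)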
Recursive Neural Tensor Network
Compositionality
Principle of compositionality: the “meaning (vector) of a complex expression (sentence) is determined by:
- the meanings of its constituent expressions (words), and
- the rules (grammar) used to combine them”
— Gottlob Frege (1848–1925)
Compositionality
(figure: a parse tree built bottom-up: DT NN and PRP$ NN combine into NPs, an IN heads a PP, and these compose into the top-level NP (S / ROOT); the tree structure supplies the “rules”, the node vectors the “meanings”)
Vector Space + Word Embeddings: Socher
code & info: http://metaoptimize.com/projects/wordreprs/
Word Embeddings: Turian
t-SNE visualizations of word embeddings. Left: Number Region; Right:
Jobs Region. From Turian et al. 2011
Word Embeddings: Turian
• Recurrent Neural Network (Mikolov et al. 2010;
Mikolov et al. 2013a)
W(“woman”) − W(“man”) ≃ W(“aunt”) − W(“uncle”)
W(“woman”) − W(“man”) ≃ W(“queen”) − W(“king”)
(checked numerically in the sketch below)
Figures from Mikolov, T., Yih, W., & Zweig, G. (2013). Linguistic
Regularities in Continuous Space Word Representations
Word Embeddings: Mikolov
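The regularity can be checked with plain vector arithmetic over learned embeddings: the offset W("woman") − W("man") added to W("king") should land near W("queen"). The 3-dimensional vectors below are fabricated just to show the computation; real ones would come from a trained model such as word2vec.

# Word-analogy arithmetic (after Mikolov et al. 2013): queen ≈ king - man + woman.
import numpy as np

W = {                                   # tiny fabricated embeddings, for illustration
    "man":   np.array([0.9, 0.1, 0.2]),
    "woman": np.array([0.9, 0.8, 0.2]),
    "king":  np.array([0.1, 0.1, 0.9]),
    "queen": np.array([0.1, 0.8, 0.9]),
    "apple": np.array([0.5, 0.5, 0.1]),
}

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

target = W["king"] - W["man"] + W["woman"]          # offset arithmetic
best = max((w for w in W if w not in {"king", "man", "woman"}),
           key=lambda w: cosine(W[w], target))
print(best)   # "queen" for these toy vectors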
• Mikolov et al. 2013b
Figures from Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013b).
Efficient Estimation of Word Representations in Vector Space
Word Embeddings: Mikolov
• cuda-convnet2 (Alex Krizhevsky, Toronto) (C++/CUDA, optimized for GTX 580)
https://code.google.com/p/cuda-convnet2/
• Caffe (Berkeley) (C++/CUDA, with Python and MATLAB bindings)
http://caffe.berkeleyvision.org/
• OverFeat (NYU)
http://cilvr.nyu.edu/doku.php?id=code:start
Wanna Play ?
• Theano - CPU/GPU symbolic expression compiler in Python (from the LISA lab at the University of Montreal; minimal example below) http://deeplearning.net/software/theano/
• Pylearn2 - library designed to make machine learning research easy http://deeplearning.net/software/pylearn2/
• Torch - Matlab-like environment for state-of-the-art machine learning algorithms in Lua (from Ronan Collobert, Clement Farabet and Koray Kavukcuoglu) http://torch.ch/
• more info: http://deeplearning.net/software_links/
Wanna Play ?
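To give a feel for the symbolic-expression style Theano uses, a minimal sketch (standard Theano API; the quadratic expression itself is just an example): define an expression, take its gradient symbolically, and compile both into callable functions.

# Minimal Theano sketch: symbolic expression + automatic differentiation.
import theano
import theano.tensor as T

x = T.dscalar('x')                 # symbolic scalar
y = x ** 2 + 3 * x                 # symbolic expression
dy_dx = T.grad(y, x)               # symbolic gradient via the chain rule

f  = theano.function([x], y)       # compile to a callable (CPU or GPU)
df = theano.function([x], dy_dx)

print(f(2.0))    # 10.0
print(df(2.0))   # 7.0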
as PhD candidate KTH/CSC:
“Always interested in discussion

Machine Learning, Deep
Architectures, 

Graphs, and NLP”
Wanna Play with Me ?
roelof@kth.se
www.csc.kth.se/~roelof/
Internship / Entrepreneurship | Academic/Research
as CIO/CTO Feeda:
“Always looking for additions to our 

brand new R&D team”



[Internships upcoming on 

KTH exjobb website…]
roelof@feeda.com
www.feeda.com
Feeda
We’re Hiring!
roelof@feeda.com
www.feeda.com
Feeda
• Software Developers
• Data Scientists
Addendum
Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Chris Manning, Andrew Ng and Chris Potts. 2013. Recursive
Deep Models for Semantic Compositionality Over a Sentiment Treebank. EMNLP 2013
code & demo: http://nlp.stanford.edu/sentiment/index.html
Addendum
Eric H. Huang, Richard Socher, Christopher D. Manning, Andrew Y. Ng 

Improving Word Representations via Global Context and Multiple Word Prototypes
