Neural Text Embeddings for NLP Tasks
Bhaskar Mitra, Microsoft (Bing Sciences)
http://research.microsoft.com/people/bmitra
Neural text embeddings are responsible for many recent performance improvements in Natural Language Processing tasks
Mikolov et al. "Distributed representations of words and phrases and their compositionality." NIPS (2013).
Mikolov et al. "Efficient estimation of word representations in vector space." arXiv preprint (2013).
Bansal, Gimpel, and Livescu. "Tailoring Continuous Word Representations for Dependency Parsing." ACL (2014).
Mikolov, Le, and Sutskever. "Exploiting similarities among languages for machine translation." arXiv preprint (2013).
There is also a long history of vector space models (both dense and sparse) in information retrieval
Salton, Wong, and Yang. "A vector space model for automatic indexing." Communications of the ACM (1975).
Deerwester et al. "Indexing by latent semantic analysis." JASIS (1990).
Salakhutdinov and Hinton. "Semantic hashing." SIGIR (2007).
What is an embedding?
A vector representation of items
Vectors are real-valued and dense
Vectors are small: the number of dimensions is much smaller than the number of items
Items can be…
Words, short text, long text, images, entities, audio, etc. – depends on the task
Think sparse, act dense
Mostly the same principles apply to both sparse and dense vector space models
Sparse vectors are easier to visualize and reason about
Learning embeddings is mostly about compression and generalization over their sparse counterparts
Learning word embeddings
Start with a paired items dataset
[source, target]
Train a neural network
The bottleneck layer gives you a dense vector representation
E.g., word2vec
Mikolov et al. "Efficient estimation of word representations in vector space." arXiv preprint (2013).
[Diagram: a source item and a target item are each mapped through the network to a source embedding and a target embedding, which are compared by a distance metric]
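A minimal sketch of this setup, assuming a word2vec-style skip-gram with negative sampling; the toy corpus, dimensions, and learning rate are illustrative. The IN matrix learned at the bottleneck is the dense word representation.

```python
import numpy as np

rng = np.random.default_rng(0)
corpus = "seattle seahawks jerseys seattle seahawks highlights".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

dim, lr, window = 8, 0.05, 1
IN = rng.normal(scale=0.1, size=(len(vocab), dim))   # source (IN) embeddings
OUT = rng.normal(scale=0.1, size=(len(vocab), dim))  # target (OUT) embeddings

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for epoch in range(200):
    for pos, word in enumerate(corpus):
        for j in range(max(0, pos - window), min(len(corpus), pos + window + 1)):
            if j == pos:
                continue  # skip the center position itself
            s, t = idx[word], idx[corpus[j]]
            neg = rng.integers(len(vocab))  # one random negative target
            for target, label in ((t, 1.0), (neg, 0.0)):
                score = sigmoid(IN[s] @ OUT[target])
                grad = lr * (label - score)
                # gradient step on both the source and target embeddings
                IN[s], OUT[target] = IN[s] + grad * OUT[target], OUT[target] + grad * IN[s]

# The IN matrix rows are the dense "bottleneck" representations of the source words.
print(IN[idx["seattle"]])
```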
Learning word embeddings
Start with a paired items dataset
[source, target]
Make a Source x Target matrix
Factorizing the matrix gives you a dense vector representation
E.g., LSA, GloVe
[Diagram: a Source x Target matrix with rows S0…S7 and columns T0…T8]
Pennington, Socher, and Manning. "Glove: Global Vectors for Word Representation." EMNLP (2014).
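A small sketch of the factorization route, assuming an LSA-style word-by-document count matrix; a truncated SVD of the matrix yields the dense word vectors (the corpus and rank are illustrative).

```python
import numpy as np

docs = ["seattle seahawks jerseys",
        "seattle seahawks highlights",
        "denver broncos jerseys",
        "denver broncos highlights"]
vocab = sorted({w for d in docs for w in d.split()})
idx = {w: i for i, w in enumerate(vocab)}

# Source x Target matrix: rows are words, columns are documents.
M = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        M[idx[w], j] += 1.0

# Truncated SVD: keep k singular vectors; U * S gives dense word embeddings.
U, S, Vt = np.linalg.svd(M, full_matrices=False)
k = 2
word_vecs = U[:, :k] * S[:k]

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cos(word_vecs[idx["seattle"]], word_vecs[idx["seahawks"]]))  # topically similar
```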
Learning word embeddings
Start with a paired items dataset
[source, target]
Make a bi-partite graph
PPMI over the edges gives you a sparse vector representation
E.g., explicit representations
Levy and Goldberg. "Linguistic regularities in sparse and explicit word representations." CoNLL (2014).
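A small sketch of computing PPMI weights from [source, target] pair counts, as in the explicit sparse representations above; the pair data is illustrative.

```python
import numpy as np

pairs = [("seattle", "seahawks"), ("seattle", "jerseys"),
         ("denver", "broncos"), ("denver", "jerseys")]
src = sorted({s for s, _ in pairs})
tgt = sorted({t for _, t in pairs})
C = np.zeros((len(src), len(tgt)))
for s, t in pairs:
    C[src.index(s), tgt.index(t)] += 1.0

P = C / C.sum()                      # joint probabilities p(s, t)
ps = P.sum(axis=1, keepdims=True)    # marginal p(s)
pt = P.sum(axis=0, keepdims=True)    # marginal p(t)
with np.errstate(divide="ignore"):
    pmi = np.log(P / (ps * pt))
ppmi = np.maximum(pmi, 0.0)          # positive PMI: clamp negatives to zero
print(ppmi)                          # each row is a sparse word vector
```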
Some examples of text embeddings
Latent Semantic Analysis (Deerwester et al., 1990): embedding for a single word; source: word (one-hot); target: document (one-hot); learning model: matrix factorization
Word2vec (Mikolov et al., 2013): embedding for a single word; source: word (one-hot); target: neighboring word (one-hot); learning model: neural network (shallow)
GloVe (Pennington et al., 2014): embedding for a single word; source: word (one-hot); target: neighboring word (one-hot); learning model: matrix factorization
Semantic Hashing, an auto-encoder (Salakhutdinov and Hinton, 2007): embedding for multi-word text; source: document (bag-of-words); target: same as source (bag-of-words); learning model: neural network (deep)
DSSM (Huang et al., 2013; Shen et al., 2014): embedding for multi-word text; source: query text (bag-of-trigrams); target: document title (bag-of-trigrams); learning model: neural network (deep)
Session DSSM (Mitra, 2015): embedding for multi-word text; source: query text (bag-of-trigrams); target: next query in session (bag-of-trigrams); learning model: neural network (deep)
Language Model DSSM (Mitra and Craswell, 2015): embedding for multi-word text; source: query prefix (bag-of-trigrams); target: query suffix (bag-of-trigrams); learning model: neural network (deep)
What notion of relatedness between words does your vector space model?
The vector can correspond to documents in which the word occurs:
banana → Doc7 Doc9 Doc2 Doc4 Doc11
The vector can correspond to neighboring word context
e.g., “yellow banana grows on trees in africa”
banana → (yellow, -1) (grows, +1) (on, +2) (trees, +3) (in, +4) (africa, +5)
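A small illustrative sketch of extracting these positional context features from the example sentence; the window size is an assumption.

```python
sentence = "yellow banana grows on trees in africa".split()

def context_features(tokens, word, window=5):
    feats = []
    for i, w in enumerate(tokens):
        if w != word:
            continue
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                feats.append((tokens[j], j - i))  # (neighbor, relative offset)
    return feats

print(context_features(sentence, "banana"))
# [('yellow', -1), ('grows', 1), ('on', 2), ('trees', 3), ('in', 4), ('africa', 5)]
```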
The vector can correspond to character trigrams in the word
banana → #ba ban ana nan na#
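A small sketch of extracting boundary-marked character trigrams; the # padding follows the slide's notation.

```python
def char_trigrams(word):
    padded = "#" + word + "#"  # mark word boundaries
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

print(char_trigrams("banana"))   # ['#ba', 'ban', 'ana', 'nan', 'ana', 'na#']
print(char_trigrams("seattle"))  # ['#se', 'sea', 'eat', 'att', 'ttl', 'tle', 'le#']
```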
Each of the previous vector spaces models a different notion of relatedness between words.
Let’s consider the following example…
We have four (tiny) documents,
Document 1 : “seattle seahawks jerseys”
Document 2 : “seattle seahawks highlights”
Document 3 : “denver broncos jerseys”
Document 4 : “denver broncos highlights”
If we use document occurrence vectors…
seattle → [Document 1, Document 2]
seahawks → [Document 1, Document 2]
denver → [Document 3, Document 4]
broncos → [Document 3, Document 4]
seattle is similar to seahawks, and denver is similar to broncos.
In the rest of this talk, we refer to this notion of relatedness as Topical similarity.
If we use word context vectors…
seattle → (seahawks, +1) (jerseys, +2) (highlights, +2)
denver → (broncos, +1) (jerseys, +2) (highlights, +2)
seahawks → (seattle, -1) (jerseys, +1) (highlights, +1)
broncos → (denver, -1) (jerseys, +1) (highlights, +1)
seattle is similar to denver, and seahawks is similar to broncos.
In the rest of this talk, we refer to this notion of relatedness as Typical (by-type) similarity.
If we use character trigram vectors…
This notion of relatedness is similar to string edit-distance.
seattle → #se sea eat att ttl tle le#
settle → #se set ett ttl tle le#
seattle is similar to settle.
What does word2vec do?
Word2vec uses word context vectors, but without the inter-word distances.
For example, let's consider the following "documents":
"seahawks jerseys"
"seahawks highlights"
"seattle seahawks wilson"
"seattle seahawks sherman"
"seattle seahawks browner"
"seattle seahawks lfedi"
"broncos jerseys"
"broncos highlights"
"denver broncos lynch"
"denver broncos sanchez"
"denver broncos miller"
"denver broncos marshall"
What does word2vec do?
[Diagram: word context vectors without inter-word distances; seahawks and broncos share context features such as jerseys and highlights, so they land near each other, and the learned space supports the analogy below]
[seahawks] - [seattle] + [denver] ≈ [broncos]
Mikolov et al. "Distributed representations of words and phrases and their compositionality." NIPS (2013).
Mikolov et al. "Efficient estimation of word representations in vector space." arXiv preprint (2013).
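A hedged sketch of the analogy above using gensim's word2vec implementation; the toy query corpus is illustrative, and on such a tiny corpus the analogy is not guaranteed to resolve to [broncos].

```python
from gensim.models import Word2Vec

queries = [q.split() for q in [
    "seattle seahawks wilson", "seattle seahawks sherman",
    "denver broncos lynch", "denver broncos sanchez",
    "seahawks jerseys", "broncos jerseys"]]
model = Word2Vec(queries, vector_size=32, window=2, min_count=1, epochs=200, seed=1)

# [seahawks] - [seattle] + [denver] should land near [broncos]
print(model.wv.most_similar(positive=["seahawks", "denver"], negative=["seattle"], topn=3))
```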
Text Embeddings for Session Modelling
How do you model that the intent shift from "london" to "things to do in london" is similar to the shift from "new york" to "new york tourist attractions"?
We can use vector algebra over queries!
Mitra. " Exploring Session Context using Distributed Representations of Queries and Reformulations." SIGIR (2015).
A brief introduction to DSSM
A DNN trained on clickthrough data to maximize the cosine similarity between query and document embeddings
Tri-gram hashing of terms for the input
P.-S. Huang, et al. "Learning deep structured semantic models for web search using clickthrough data." CIKM (2013).
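A minimal sketch of tri-gram hashing for the DSSM input layer: each term is broken into boundary-marked character trigrams and hashed into a fixed-size bag-of-trigrams vector. The hashing-space size and the use of Python's built-in hash are illustrative choices; a stable hash would be used in practice.

```python
import numpy as np

N_TRIGRAMS = 50_000  # fixed hashing space, much smaller than a word vocabulary

def trigram_hash(text, dim=N_TRIGRAMS):
    vec = np.zeros(dim)
    for term in text.lower().split():
        padded = "#" + term + "#"
        for i in range(len(padded) - 2):
            vec[hash(padded[i:i + 3]) % dim] += 1.0  # count each hashed trigram
    return vec

q = trigram_hash("seattle seahawks")
d = trigram_hash("seahawks official jerseys")
print(np.count_nonzero(q), np.count_nonzero(d))
```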
Learning query reformulation embeddings
Train a DSSM over session
query pairs
The embedding for q1→q2 is given by the difference of the two query embeddings: v(q1→q2) = v(q2) − v(q1)
Mitra. " Exploring Session Context using Distributed Representations of Queries and Reformulations." SIGIR (2015).
Using reformulation embeddings for
contextualizing query auto-completion
Mitra. " Exploring Session Context using Distributed Representations of Queries and Reformulations." SIGIR (2015).
Ideas I would love to discuss!
Modelling search trails as paths in the embedding space
Using embeddings to discover latent structure in information
seeking tasks
Embeddings for temporal modelling
Text Embeddings for Document Ranking
What if I told you that everyone who uses word2vec is throwing half the model away?
Word2vec optimizes the IN-OUT dot product, which captures the co-occurrence statistics of words from the training corpus
Mitra, et al. "A Dual Embedding Space Model for Document Ranking." arXiv preprint (2016).
Nalisnick, et al. "Improving Document Ranking with Dual Word Embeddings." WWW (2016).
Different notions of relatedness from
IN-IN and IN-OUT vector comparisons
using word2vec trained on Web queries
Mitra, et al. "A Dual Embedding Space Model for Document Ranking." arXiv preprint (2016).
Nalisnick, et al. "Improving Document Ranking with Dual Word Embeddings." WWW (2016).
Using IN-OUT similarity to model
document aboutness
Mitra, et al. "A Dual Embedding Space Model for Document Ranking." arXiv preprint (2016).
Nalisnick, et al. "Improving Document Ranking with Dual Word Embeddings." WWW (2016).
Dual Embedding Space Model (DESM)
Map query words to the IN space and document words to the OUT space, and compute the average of all-pairs cosine similarity
Mitra, et al. "A Dual Embedding Space Model for Document Ranking." arXiv preprint (2016).
Nalisnick, et al. "Improving Document Ranking with Dual Word Embeddings." WWW (2016).
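A hedged sketch of DESM scoring as described above, with random stand-ins for the trained IN and OUT embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
words = ["seattle", "seahawks", "jerseys", "denver", "broncos"]
IN = {w: rng.normal(size=16) for w in words}   # stand-ins for trained IN vectors
OUT = {w: rng.normal(size=16) for w in words}  # stand-ins for trained OUT vectors

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def desm(query, doc):
    # query words in IN space, document words in OUT space,
    # averaged over all query-document word pairs
    pairs = [cos(IN[q], OUT[d]) for q in query for d in doc]
    return sum(pairs) / len(pairs)

print(desm(["seattle", "seahawks"], ["seahawks", "jerseys"]))
```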
Ideas I would love to discuss!
Exploring traditional IR concepts (e.g., term frequency, term
importance, document length normalization, etc.) in the
context of dense vector representations of words
How can we formalize what relationship (typical, topical, etc.)
an embedding space models?
Get the data
IN+OUT Embeddings for 2.7M words
trained on 600M+ Bing queries
research.microsoft.com/projects/DESM
Text Embeddings for Query Auto-Completion
Typical and Topical similarities for
text (not just words!)
Mitra and Craswell. "Query Auto-Completion for Rare Prefixes." CIKM (2015).
The Typical-DSSM is trained on query prefix-suffix pairs, as opposed to the Topical-DSSM trained on query-document pairs
We can use the Typical-DSSM model for
query auto-completion for rare or unseen
prefixes!
Mitra and Craswell. "Query Auto-Completion for Rare Prefixes." CIKM (2015).
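A hedged sketch of generating prefix-suffix training pairs from queries for a Typical-DSSM; splitting at every word boundary is an illustrative assumption.

```python
def prefix_suffix_pairs(query):
    words = query.split()
    # split after each word boundary except the last
    return [(" ".join(words[:i]), " ".join(words[i:])) for i in range(1, len(words))]

print(prefix_suffix_pairs("cheap flights to new york"))
# [('cheap', 'flights to new york'), ('cheap flights', 'to new york'), ...]
```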
Query auto-completion for rare prefixes
Mitra and Craswell. "Query Auto-Completion for Rare Prefixes." CIKM (2015).
Ideas I would love to discuss!
Query auto-completion beyond just ranking “previously
seen” queries
Neural models for query completion (LSTMs/RNNs still
perform surprisingly poorly on metrics like MRR)
Neu-IR 2016
The SIGIR 2016 Workshop on
Neural Information Retrieval
Pisa, Tuscany, Italy
Workshop: July 21st, 2016
Submission deadline: May 30th, 2016
http://research.microsoft.com/neuir2016
(Call for Participation)
Organizers:
W. Bruce Croft, University of Massachusetts Amherst, US
Jiafeng Guo, Chinese Academy of Sciences, Beijing, China
Maarten de Rijke, University of Amsterdam, Amsterdam, The Netherlands
Bhaskar Mitra, Bing, Microsoft, Cambridge, UK
Nick Craswell, Bing, Microsoft, Bellevue, US