SlideShare a Scribd company logo
1 of 60
Download to read offline
Xavier Giro-i-Nieto
Associate Professor
Universitat Politecnica de Catalunya
@DocXavi
xavier.giro@upc.edu
Attention Mechanisms
Lecture 14
[course site]
2
Acknowledgments
Amaia Salvador
Doctor
Universitat Politècnica de Catalunya
Marta R. Costa-jussà
Associate Professor
Universitat Politècnica de Catalunya
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
6. Study case: Image captioning
3
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
6. Study case: Image captioning
4
Motivation: Vision
5
Image:
H x W x 3
bird
The whole input volume is used to predict the output...
Motivation: Vision
6
Image:
H x W x 3
bird
...despite the fact that not all pixels are equally important
The whole input volume is used to predict the output...
Motivation: Automatic Speech Recognition (ASR)
7
Figure: distill.pub
Chan, W., Jaitly, N., Le, Q., & Vinyals, O. Listen, attend and spell: A neural network for large vocabulary conversational
speech recognition. ICASSP 2016.
Transcribed letters correspond with just a few phonemes.
Motivation: Neural Machine Translation (NMT)
8
Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
The translated words are often related to a subset of the words from the source
sentence.
(Edge thicknesses represent the attention weights found by the attention model)
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
6. Study case: Image captioning
9
Seq2Seq with RNN
10
Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." NIPS 2014.
The Seq2Seq scheme can be implemented with RNNs:
● trigger the output generation with an input <go> symbol.
● the predicted word at timestep t, becomes the input at t+1.
11
Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." NIPS 2014.
Seq2Seq with RNN
Seq2Seq for NMT
12
Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
13
Seq2Seq for NMT
How does the length of the source sentence affect the size (d) of its encoding ?
Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
14
How does the length of the input sentence affect the size (d) of its encoded
representation ?
Ray Mooney, ACL 2014 Workshop on Semantic Parsing.
Seq2Seq for NMT
"You can't cram the meaning of
a whole %&!$# sentence into a
single $&!#* vector!"
15
Seq2Seq for NMT: Information bottleneck
Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
6. Study case: Image captioning
16
Query, Keys & Values
Databases store information as pair of keys and values (K,V).
17
Figure: Nikhil Shah, “Attention? An Other Perspective! [Part 1]” (2020)
Example:
<Key> <Value>
myfile.pdf
The (K,Q,V) terminology used to retrieve information from databases is adopted to
formulate attention.
18
Figure: Nikhil Shah, “Attention? An Other Perspective! [Part 1]” (2020)
Query, Keys & Values
Attention is a mechanism to compute a context vector (c) for a query (Q) as a
weighted sum of values (V).
19
Figure: Nikhil Shah, “Attention? An Other Perspective! [Part 1]” (2020)
Query, Keys & Values
20
Key (K) / Values (V) Query (Q)
Usually, both keys and values correspond to the encoder states.
Query, Keys & Values
Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
6. Study case: Image captioning
21
22
Chis Olah & Shan Cate, “Attention and Augmented Recurrent Neural Networks” distill.pub 2016.
Seq2Seq + Attention
23
Seq2Seq + Attention: NMT
Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
24
Seq2Seq + Attention: NMT
Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
25
Seq2Seq + Attention: NMT
Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
26
Seq2Seq + Attention: NMT
Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
27
Seq2Seq + Attention: NMT
Softmax
Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
28
Seq2Seq + Attention: NMT
29
Seq2Seq + Attention: NMT
Context (c)
Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
30
Seq2Seq + Attention: NMT
Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
31
Seq2Seq + Attention: NMT
Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
32
Seq2Seq + Attention: NMT
Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
33Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and
translate." ICLR 2015.
Seq2Seq + Attention: NMT
34
Chis Olah & Shan Cate, “Attention and Augmented Recurrent Neural Networks” distill.pub 2016.
Seq2Seq + Attention
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
6. Study case: Image captioning
35
36
Slide: Marta R. Costa-jussà
Attention Mechanisms
How to compute
these attention
scores ?
● Additive attention
● Multiplicative
● Scaled dot product
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
○ Additive
37
38
Attention Mechanisms: Additive Attention
Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and
translate." ICLR 2015.
The first attention mechanism used a one-hidden layer feed-forward network to
calculate the attention alignment:
39
Attention Mechanisms: Additive Attention
Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and
translate." ICLR 2015.
hj
zi
Attention
weight
(aj
)
Key
(k)
Query
(q)
The first attention mechanism used a one-hidden layer feed-forward network to
calculate the attention alignment:
40
Attention Mechanisms: Additive Attention
Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and
translate." ICLR 2015.
hj
zi
Attention
weight
(aj
)
The original attention mechanism uses a one-hidden layer feed-forward network
to calculate the attention alignment:
a(q,k) = va
T
tanh(W1
q + W2
k)
a(q,k) = va
T
tanh(Wa
[q;k])
It is often expressed as two separate
linear transformations W1
and W2
for zi
and hi
, respectively:
concatenation
Key
(k)
Query
(q)
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
○ Multiplicative
41
42
Attention Mechanisms: Multiplicative Attention
Luong, M.-T., Pham, H., & Manning, C. D. (2015). Effective Approaches to Attention-based Neural Machine Translation.
EMNLP 2015
A computational efficient approach is just learning matrix multplication:
a(q,k) = qT
W k
=
43
Attention Mechanisms: Multiplicative Attention
Luong, M.-T., Pham, H., & Manning, C. D. (2015). Effective Approaches to Attention-based Neural Machine Translation.
EMNLP 2015
If the dimension of q & k are equal (eg. in seq2seq, by learning a linear
projection), an even more computational efficient approach is a dot product
between query and key:
a(q,k) = qT
k
=
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
○ Scaled dot product
44
45
Attention Mechanisms: Scaled dot product
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I.. Attention is all you need.
NeurIPS 2017.
Multiplicative attention performs worse than additive attention for large
dimensionality of the decoder state (dk
) unless a scaling factor is added:
a(q,k) = ( qT
k ) / √ dk
=
46
Attention Mechanisms: Example
Key (K) / Values (V) Queries (Q)
Softmax
dh
Q KT =
=
C
VC At
Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
Attention Mechanisms: Example
Figure: Ilharco, Ilharco, Turc, Dettmers, Ferreira, Lee, “High Performance Natural Language Processing”. EMNLP 2020.
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
6. Study case: Image captioning
48
Image Captioning without Attention
LSTMLSTM LSTM
CNN LSTM
A bird flying
...
<EOS>
Features:
D
49
...
Vinyals, O., Toshev, A., Bengio, S., & Erhan, D. (2015). Show and tell: A neural image caption generator. CVPR 2015.
Limitation: All output predictions are based on the final and static output
of the encoder
CNN
Image:
H x W x 3
50
Image Captioning with Attention
Slide: Amaia Salvador
Attention for Image Captioning
CNN
Image:
H x W x 3
L x
visual
embeddings
h0
51
c0
The first context
vector is the
average of the L
embeddings
a1
L x Attention weights
y1
Predicted word
y0
First word (<start> token)
Slide: Amaia Salvador
Attention for Image Captioning
CNN
Image:
H x W x 3
h0
y1c1
Visual features weighted
with attention give the next
context
h1
a2 y2
52
a1 y1
c0 y0
Predicted word in
previous timestep
Slide: Amaia Salvador
Attention for Image Captioning
CNN
Image:
H x W x 3
h0
c1 y1
h1
a2 y2
h2
a3 y3
c2 y2
53
a1 y1
c0 y0
Slide: Amaia Salvador
Attention for Image Captioning
54Xu, Kelvin, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron C. Courville, Ruslan Salakhutdinov, Richard S. Zemel, and Yoshua
Bengio. "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention." ICML 2015
Attention for Image Captioning
55Xu, Kelvin, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron C. Courville, Ruslan Salakhutdinov, Richard S. Zemel, and Yoshua
Bengio. "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention." ICML 2015
56
Side-note: attention can be computed with previous or current hidden state
CNN
Image:
H x W x 3
h1
v y1
h2 h3
v y2
a1
y1
v y0average
c1
a2
y2
c2
a3
y3
c3
Attention for Image Captioning
57
Exercise
Given a query vector and two keys:
● q = [0.3, 0.2, 0.1]
● k1
= [0.1, 0.3, 0.1]
● k2
= [0.6, 0.4, 0.2]
a) What are the attention weights a1 and a2 computing the dot product ?
b) Considering dh
=3, what are the attention weights 1&2 when computing the
scaled dot product ?
c) Which key vector will receive more attention ?
58
Learn more
Alex Graves, UCL x Deepmind 2020 [slides]
59
Learn more
● Tutorials
○ Sebastian Ruder, Deep Learning for NLP Best Practices # Attention (2017).
○ Chris Olah, Shan Carter, “Attention and Augmented Recurrent Neural Networks”. Distill 2016.
○ Lilian Weng, “Attention? Attention!”. Lil’Log 2017.
● Demos
○ François Fleuret (EPFL)
● Scientific publications
○ Katharopoulos, A., Vyas, A., Pappas, N., & Fleuret, F. (2020). Transformers are rnns: Fast autoregressive
transformers with linear attention. ICML 2020.
○ Siddhant M. Jayakumar, Wojciech M. Czarnecki, Jacob Menick, Jonathan Schwarz, Jack Rae, Simon Osindero,
Yee Whye Teh, Tim Harley, Razvan Pascanu, “Multiplicative Interactions and Where to Find Them”. ICLR 2020.
[tweet]
○ Woo, Sanghyun, Jongchan Park, Joon-Young Lee, and In So Kweon. "Cbam: Convolutional block attention
module." ECCV 2018.
○ Lu, Xiankai, Wenguan Wang, Chao Ma, Jianbing Shen, Ling Shao, and Fatih Porikli. "See More, Know More:
Unsupervised Video Object Segmentation with Co-Attention Siamese Networks." CVPR 2019.
○ Wang, Ziqin, Jun Xu, Li Liu, Fan Zhu, and Ling Shao. "Ranet: Ranking attention network for fast video object
segmentation." ICCV 2019.
60
Questions ?

More Related Content

What's hot

Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Universitat Politècnica de Catalunya
 
Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018
Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018
Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018Universitat Politècnica de Catalunya
 
One Perceptron to Rule them All: Deep Learning for Multimedia #A2IC2018
One Perceptron  to Rule them All: Deep Learning for Multimedia #A2IC2018One Perceptron  to Rule them All: Deep Learning for Multimedia #A2IC2018
One Perceptron to Rule them All: Deep Learning for Multimedia #A2IC2018Universitat Politècnica de Catalunya
 
Deep Neural Networks for Multimodal Learning
Deep Neural Networks for Multimodal LearningDeep Neural Networks for Multimodal Learning
Deep Neural Networks for Multimodal LearningMarc Bolaños Solà
 
Unsupervised Learning (DLAI D9L1 2017 UPC Deep Learning for Artificial Intell...
Unsupervised Learning (DLAI D9L1 2017 UPC Deep Learning for Artificial Intell...Unsupervised Learning (DLAI D9L1 2017 UPC Deep Learning for Artificial Intell...
Unsupervised Learning (DLAI D9L1 2017 UPC Deep Learning for Artificial Intell...Universitat Politècnica de Catalunya
 
Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019
Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019
Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019Universitat Politècnica de Catalunya
 
Self-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC Barcelona
Self-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC BarcelonaSelf-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC Barcelona
Self-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC BarcelonaUniversitat Politècnica de Catalunya
 
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019Universitat Politècnica de Catalunya
 
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN BarcelonaDeep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN BarcelonaUniversitat Politècnica de Catalunya
 
Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)
Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)
Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)Universitat Politècnica de Catalunya
 
[PR12] PR-036 Learning to Remember Rare Events
[PR12] PR-036 Learning to Remember Rare Events[PR12] PR-036 Learning to Remember Rare Events
[PR12] PR-036 Learning to Remember Rare EventsTaegyun Jeon
 

What's hot (20)

Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Multimodal Deep Learning
Multimodal Deep LearningMultimodal Deep Learning
Multimodal Deep Learning
 
Deep Learning Representations for All (a.ka. the AI hype)
Deep Learning Representations for All (a.ka. the AI hype)Deep Learning Representations for All (a.ka. the AI hype)
Deep Learning Representations for All (a.ka. the AI hype)
 
Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018
Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018
Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018
 
One Perceptron to Rule them All: Deep Learning for Multimedia #A2IC2018
One Perceptron  to Rule them All: Deep Learning for Multimedia #A2IC2018One Perceptron  to Rule them All: Deep Learning for Multimedia #A2IC2018
One Perceptron to Rule them All: Deep Learning for Multimedia #A2IC2018
 
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
 
Deep Neural Networks for Multimodal Learning
Deep Neural Networks for Multimodal LearningDeep Neural Networks for Multimodal Learning
Deep Neural Networks for Multimodal Learning
 
Neural Architectures for Video Encoding
Neural Architectures for Video EncodingNeural Architectures for Video Encoding
Neural Architectures for Video Encoding
 
Unsupervised Learning (DLAI D9L1 2017 UPC Deep Learning for Artificial Intell...
Unsupervised Learning (DLAI D9L1 2017 UPC Deep Learning for Artificial Intell...Unsupervised Learning (DLAI D9L1 2017 UPC Deep Learning for Artificial Intell...
Unsupervised Learning (DLAI D9L1 2017 UPC Deep Learning for Artificial Intell...
 
Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019
Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019
Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019
 
Self-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC Barcelona
Self-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC BarcelonaSelf-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC Barcelona
Self-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC Barcelona
 
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
 
Backpropagation for Deep Learning
Backpropagation for Deep LearningBackpropagation for Deep Learning
Backpropagation for Deep Learning
 
Backpropagation for Neural Networks
Backpropagation for Neural NetworksBackpropagation for Neural Networks
Backpropagation for Neural Networks
 
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN BarcelonaDeep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
 
Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)
Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)
Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)
 
[PR12] PR-036 Learning to Remember Rare Events
[PR12] PR-036 Learning to Remember Rare Events[PR12] PR-036 Learning to Remember Rare Events
[PR12] PR-036 Learning to Remember Rare Events
 

Similar to Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020

Multimodal Residual Networks for Visual QA
Multimodal Residual Networks for Visual QAMultimodal Residual Networks for Visual QA
Multimodal Residual Networks for Visual QAJin-Hwa Kim
 
Dedalo, looking for Cluster Explanations in a labyrinth of Linked Data
Dedalo, looking for Cluster Explanations in a labyrinth of Linked DataDedalo, looking for Cluster Explanations in a labyrinth of Linked Data
Dedalo, looking for Cluster Explanations in a labyrinth of Linked DataVrije Universiteit Amsterdam
 
Fcv learn yu
Fcv learn yuFcv learn yu
Fcv learn yuzukun
 
Scalable Dynamic Graph Summarization
Scalable Dynamic Graph SummarizationScalable Dynamic Graph Summarization
Scalable Dynamic Graph SummarizationIoanna Tsalouchidou
 
Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...
Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...
Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...Universitat Politècnica de Catalunya
 
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...Tomoyuki Suzuki
 
Machine Learning and Artificial Neural Networks.ppt
Machine Learning and Artificial Neural Networks.pptMachine Learning and Artificial Neural Networks.ppt
Machine Learning and Artificial Neural Networks.pptAnshika865276
 
More investment in Research and Development for better Education in the future?
More investment in Research and Development for better Education in the future?More investment in Research and Development for better Education in the future?
More investment in Research and Development for better Education in the future?Dhafer Malouche
 
JAISTサマースクール2016「脳を知るための理論」講義01 Single neuron models
JAISTサマースクール2016「脳を知るための理論」講義01 Single neuron modelsJAISTサマースクール2016「脳を知るための理論」講義01 Single neuron models
JAISTサマースクール2016「脳を知るための理論」講義01 Single neuron modelshirokazutanaka
 
Mit15 082 jf10_lec01
Mit15 082 jf10_lec01Mit15 082 jf10_lec01
Mit15 082 jf10_lec01Saad Liaqat
 
Recurrent Neuronal Network tailored for Weather Radar Nowcasting
Recurrent Neuronal Network tailored for Weather Radar NowcastingRecurrent Neuronal Network tailored for Weather Radar Nowcasting
Recurrent Neuronal Network tailored for Weather Radar NowcastingAndreas Scheidegger
 
Deep Learning for Computer Vision (1/4): Image Analytics @ laSalle 2016
Deep Learning for Computer Vision (1/4): Image Analytics @ laSalle 2016Deep Learning for Computer Vision (1/4): Image Analytics @ laSalle 2016
Deep Learning for Computer Vision (1/4): Image Analytics @ laSalle 2016Universitat Politècnica de Catalunya
 
Generating Diverse and Consistent QA pairs from Contexts with Information-Max...
Generating Diverse and Consistent QA pairs from Contexts with Information-Max...Generating Diverse and Consistent QA pairs from Contexts with Information-Max...
Generating Diverse and Consistent QA pairs from Contexts with Information-Max...MLAI2
 
Data Con LA 2022 - Transformers for NLP
Data Con LA 2022 - Transformers for NLPData Con LA 2022 - Transformers for NLP
Data Con LA 2022 - Transformers for NLPData Con LA
 
Multimodal Learning Analytics
Multimodal Learning AnalyticsMultimodal Learning Analytics
Multimodal Learning AnalyticsXavier Ochoa
 
Accelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference CompilationAccelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference CompilationFeynman Liang
 

Similar to Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020 (20)

Multimodal Residual Networks for Visual QA
Multimodal Residual Networks for Visual QAMultimodal Residual Networks for Visual QA
Multimodal Residual Networks for Visual QA
 
Dedalo, looking for Cluster Explanations in a labyrinth of Linked Data
Dedalo, looking for Cluster Explanations in a labyrinth of Linked DataDedalo, looking for Cluster Explanations in a labyrinth of Linked Data
Dedalo, looking for Cluster Explanations in a labyrinth of Linked Data
 
Open-ended Visual Question-Answering
Open-ended  Visual Question-AnsweringOpen-ended  Visual Question-Answering
Open-ended Visual Question-Answering
 
Fcv learn yu
Fcv learn yuFcv learn yu
Fcv learn yu
 
Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)
Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)
Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)
 
Scalable Dynamic Graph Summarization
Scalable Dynamic Graph SummarizationScalable Dynamic Graph Summarization
Scalable Dynamic Graph Summarization
 
Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...
Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...
Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...
 
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...
 
Machine Learning and Artificial Neural Networks.ppt
Machine Learning and Artificial Neural Networks.pptMachine Learning and Artificial Neural Networks.ppt
Machine Learning and Artificial Neural Networks.ppt
 
More investment in Research and Development for better Education in the future?
More investment in Research and Development for better Education in the future?More investment in Research and Development for better Education in the future?
More investment in Research and Development for better Education in the future?
 
JAISTサマースクール2016「脳を知るための理論」講義01 Single neuron models
JAISTサマースクール2016「脳を知るための理論」講義01 Single neuron modelsJAISTサマースクール2016「脳を知るための理論」講義01 Single neuron models
JAISTサマースクール2016「脳を知るための理論」講義01 Single neuron models
 
Deep Learning for Computer Vision: Attention Models (UPC 2016)
Deep Learning for Computer Vision: Attention Models (UPC 2016)Deep Learning for Computer Vision: Attention Models (UPC 2016)
Deep Learning for Computer Vision: Attention Models (UPC 2016)
 
Mit15 082 jf10_lec01
Mit15 082 jf10_lec01Mit15 082 jf10_lec01
Mit15 082 jf10_lec01
 
Recurrent Neuronal Network tailored for Weather Radar Nowcasting
Recurrent Neuronal Network tailored for Weather Radar NowcastingRecurrent Neuronal Network tailored for Weather Radar Nowcasting
Recurrent Neuronal Network tailored for Weather Radar Nowcasting
 
Deep Learning for Computer Vision (1/4): Image Analytics @ laSalle 2016
Deep Learning for Computer Vision (1/4): Image Analytics @ laSalle 2016Deep Learning for Computer Vision (1/4): Image Analytics @ laSalle 2016
Deep Learning for Computer Vision (1/4): Image Analytics @ laSalle 2016
 
Generating Diverse and Consistent QA pairs from Contexts with Information-Max...
Generating Diverse and Consistent QA pairs from Contexts with Information-Max...Generating Diverse and Consistent QA pairs from Contexts with Information-Max...
Generating Diverse and Consistent QA pairs from Contexts with Information-Max...
 
Data Con LA 2022 - Transformers for NLP
Data Con LA 2022 - Transformers for NLPData Con LA 2022 - Transformers for NLP
Data Con LA 2022 - Transformers for NLP
 
nnml.ppt
nnml.pptnnml.ppt
nnml.ppt
 
Multimodal Learning Analytics
Multimodal Learning AnalyticsMultimodal Learning Analytics
Multimodal Learning Analytics
 
Accelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference CompilationAccelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference Compilation
 

More from Universitat Politècnica de Catalunya

The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...Universitat Politècnica de Catalunya
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoUniversitat Politècnica de Catalunya
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Universitat Politècnica de Catalunya
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosUniversitat Politècnica de Catalunya
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Universitat Politècnica de Catalunya
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Universitat Politècnica de Catalunya
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Universitat Politècnica de Catalunya
 
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...Universitat Politècnica de Catalunya
 
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...Universitat Politècnica de Catalunya
 
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...Universitat Politècnica de Catalunya
 
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic ModerationHate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic ModerationUniversitat Politècnica de Catalunya
 

More from Universitat Politècnica de Catalunya (16)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
 
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
 
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
 
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic ModerationHate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation
 

Recently uploaded

How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
Machine learning classification ppt.ppt
Machine learning classification  ppt.pptMachine learning classification  ppt.ppt
Machine learning classification ppt.pptamreenkhanum0307
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxBoston Institute of Analytics
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...ssuserf63bd7
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 

Recently uploaded (20)

How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
Machine learning classification ppt.ppt
Machine learning classification  ppt.pptMachine learning classification  ppt.ppt
Machine learning classification ppt.ppt
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 

Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020

  • 1. Xavier Giro-i-Nieto Associate Professor Universitat Politecnica de Catalunya @DocXavi xavier.giro@upc.edu Attention Mechanisms Lecture 14 [course site]
  • 2. 2 Acknowledgments Amaia Salvador Doctor Universitat Politècnica de Catalunya Marta R. Costa-jussà Associate Professor Universitat Politècnica de Catalunya
  • 3. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms 6. Study case: Image captioning 3
  • 4. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms 6. Study case: Image captioning 4
  • 5. Motivation: Vision 5 Image: H x W x 3 bird The whole input volume is used to predict the output...
  • 6. Motivation: Vision 6 Image: H x W x 3 bird ...despite the fact that not all pixels are equally important The whole input volume is used to predict the output...
  • 7. Motivation: Automatic Speech Recognition (ASR) 7 Figure: distill.pub Chan, W., Jaitly, N., Le, Q., & Vinyals, O. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition. ICASSP 2016. Transcribed letters correspond with just a few phonemes.
  • 8. Motivation: Neural Machine Translation (NMT) 8 Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015) The translated words are often related to a subset of the words from the source sentence. (Edge thicknesses represent the attention weights found by the attention model)
  • 9. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms 6. Study case: Image captioning 9
  • 10. Seq2Seq with RNN 10 Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." NIPS 2014. The Seq2Seq scheme can be implemented with RNNs: ● trigger the output generation with an input <go> symbol. ● the predicted word at timestep t, becomes the input at t+1.
  • 11. 11 Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." NIPS 2014. Seq2Seq with RNN
  • 12. Seq2Seq for NMT 12 Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 13. 13 Seq2Seq for NMT How does the length of the source sentence affect the size (d) of its encoding ? Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 14. 14 How does the length of the input sentence affect the size (d) of its encoded representation ? Ray Mooney, ACL 2014 Workshop on Semantic Parsing. Seq2Seq for NMT "You can't cram the meaning of a whole %&!$# sentence into a single $&!#* vector!"
  • 15. 15 Seq2Seq for NMT: Information bottleneck Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 16. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms 6. Study case: Image captioning 16
  • 17. Query, Keys & Values Databases store information as pair of keys and values (K,V). 17 Figure: Nikhil Shah, “Attention? An Other Perspective! [Part 1]” (2020) Example: <Key> <Value> myfile.pdf
  • 18. The (K,Q,V) terminology used to retrieve information from databases is adopted to formulate attention. 18 Figure: Nikhil Shah, “Attention? An Other Perspective! [Part 1]” (2020) Query, Keys & Values
  • 19. Attention is a mechanism to compute a context vector (c) for a query (Q) as a weighted sum of values (V). 19 Figure: Nikhil Shah, “Attention? An Other Perspective! [Part 1]” (2020) Query, Keys & Values
  • 20. 20 Key (K) / Values (V) Query (Q) Usually, both keys and values correspond to the encoder states. Query, Keys & Values Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 21. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms 6. Study case: Image captioning 21
  • 22. 22 Chis Olah & Shan Cate, “Attention and Augmented Recurrent Neural Networks” distill.pub 2016. Seq2Seq + Attention
  • 23. 23 Seq2Seq + Attention: NMT Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 24. 24 Seq2Seq + Attention: NMT Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 25. 25 Seq2Seq + Attention: NMT Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 26. 26 Seq2Seq + Attention: NMT Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 27. 27 Seq2Seq + Attention: NMT Softmax Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 29. 29 Seq2Seq + Attention: NMT Context (c) Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 30. 30 Seq2Seq + Attention: NMT Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 31. 31 Seq2Seq + Attention: NMT Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 32. 32 Seq2Seq + Attention: NMT Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 33. 33Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." ICLR 2015. Seq2Seq + Attention: NMT
  • 34. 34 Chis Olah & Shan Cate, “Attention and Augmented Recurrent Neural Networks” distill.pub 2016. Seq2Seq + Attention
  • 35. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms 6. Study case: Image captioning 35
  • 36. 36 Slide: Marta R. Costa-jussà Attention Mechanisms How to compute these attention scores ? ● Additive attention ● Multiplicative ● Scaled dot product
  • 37. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms ○ Additive 37
  • 38. 38 Attention Mechanisms: Additive Attention Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." ICLR 2015. The first attention mechanism used a one-hidden layer feed-forward network to calculate the attention alignment:
  • 39. 39 Attention Mechanisms: Additive Attention Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." ICLR 2015. hj zi Attention weight (aj ) Key (k) Query (q) The first attention mechanism used a one-hidden layer feed-forward network to calculate the attention alignment:
  • 40. 40 Attention Mechanisms: Additive Attention Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." ICLR 2015. hj zi Attention weight (aj ) The original attention mechanism uses a one-hidden layer feed-forward network to calculate the attention alignment: a(q,k) = va T tanh(W1 q + W2 k) a(q,k) = va T tanh(Wa [q;k]) It is often expressed as two separate linear transformations W1 and W2 for zi and hi , respectively: concatenation Key (k) Query (q)
  • 41. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms ○ Multiplicative 41
  • 42. 42 Attention Mechanisms: Multiplicative Attention Luong, M.-T., Pham, H., & Manning, C. D. (2015). Effective Approaches to Attention-based Neural Machine Translation. EMNLP 2015 A computational efficient approach is just learning matrix multplication: a(q,k) = qT W k =
  • 43. 43 Attention Mechanisms: Multiplicative Attention Luong, M.-T., Pham, H., & Manning, C. D. (2015). Effective Approaches to Attention-based Neural Machine Translation. EMNLP 2015 If the dimension of q & k are equal (eg. in seq2seq, by learning a linear projection), an even more computational efficient approach is a dot product between query and key: a(q,k) = qT k =
  • 44. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms ○ Scaled dot product 44
  • 45. 45 Attention Mechanisms: Scaled dot product Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I.. Attention is all you need. NeurIPS 2017. Multiplicative attention performs worse than additive attention for large dimensionality of the decoder state (dk ) unless a scaling factor is added: a(q,k) = ( qT k ) / √ dk =
  • 46. 46 Attention Mechanisms: Example Key (K) / Values (V) Queries (Q) Softmax dh Q KT = = C VC At Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 47. Attention Mechanisms: Example Figure: Ilharco, Ilharco, Turc, Dettmers, Ferreira, Lee, “High Performance Natural Language Processing”. EMNLP 2020.
  • 48. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms 6. Study case: Image captioning 48
  • 49. Image Captioning without Attention LSTMLSTM LSTM CNN LSTM A bird flying ... <EOS> Features: D 49 ... Vinyals, O., Toshev, A., Bengio, S., & Erhan, D. (2015). Show and tell: A neural image caption generator. CVPR 2015. Limitation: All output predictions are based on the final and static output of the encoder
  • 50. CNN Image: H x W x 3 50 Image Captioning with Attention Slide: Amaia Salvador
  • 51. Attention for Image Captioning CNN Image: H x W x 3 L x visual embeddings h0 51 c0 The first context vector is the average of the L embeddings a1 L x Attention weights y1 Predicted word y0 First word (<start> token) Slide: Amaia Salvador
  • 52. Attention for Image Captioning CNN Image: H x W x 3 h0 y1c1 Visual features weighted with attention give the next context h1 a2 y2 52 a1 y1 c0 y0 Predicted word in previous timestep Slide: Amaia Salvador
  • 53. Attention for Image Captioning CNN Image: H x W x 3 h0 c1 y1 h1 a2 y2 h2 a3 y3 c2 y2 53 a1 y1 c0 y0 Slide: Amaia Salvador
  • 54. Attention for Image Captioning 54Xu, Kelvin, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron C. Courville, Ruslan Salakhutdinov, Richard S. Zemel, and Yoshua Bengio. "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention." ICML 2015
  • 55. Attention for Image Captioning 55Xu, Kelvin, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron C. Courville, Ruslan Salakhutdinov, Richard S. Zemel, and Yoshua Bengio. "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention." ICML 2015
  • 56. 56 Side-note: attention can be computed with previous or current hidden state CNN Image: H x W x 3 h1 v y1 h2 h3 v y2 a1 y1 v y0average c1 a2 y2 c2 a3 y3 c3 Attention for Image Captioning
  • 57. 57 Exercise Given a query vector and two keys: ● q = [0.3, 0.2, 0.1] ● k1 = [0.1, 0.3, 0.1] ● k2 = [0.6, 0.4, 0.2] a) What are the attention weights a1 and a2 computing the dot product ? b) Considering dh =3, what are the attention weights 1&2 when computing the scaled dot product ? c) Which key vector will receive more attention ?
  • 58. 58 Learn more Alex Graves, UCL x Deepmind 2020 [slides]
  • 59. 59 Learn more ● Tutorials ○ Sebastian Ruder, Deep Learning for NLP Best Practices # Attention (2017). ○ Chris Olah, Shan Carter, “Attention and Augmented Recurrent Neural Networks”. Distill 2016. ○ Lilian Weng, “Attention? Attention!”. Lil’Log 2017. ● Demos ○ François Fleuret (EPFL) ● Scientific publications ○ Katharopoulos, A., Vyas, A., Pappas, N., & Fleuret, F. (2020). Transformers are rnns: Fast autoregressive transformers with linear attention. ICML 2020. ○ Siddhant M. Jayakumar, Wojciech M. Czarnecki, Jacob Menick, Jonathan Schwarz, Jack Rae, Simon Osindero, Yee Whye Teh, Tim Harley, Razvan Pascanu, “Multiplicative Interactions and Where to Find Them”. ICLR 2020. [tweet] ○ Woo, Sanghyun, Jongchan Park, Joon-Young Lee, and In So Kweon. "Cbam: Convolutional block attention module." ECCV 2018. ○ Lu, Xiankai, Wenguan Wang, Chao Ma, Jianbing Shen, Ling Shao, and Fatih Porikli. "See More, Know More: Unsupervised Video Object Segmentation with Co-Attention Siamese Networks." CVPR 2019. ○ Wang, Ziqin, Jun Xu, Li Liu, Fan Zhu, and Ling Shao. "Ranet: Ranking attention network for fast video object segmentation." ICCV 2019.