Attention is all you need
Whi Kwon
About
2008 ~ 2015: Chemical & Biological Engineering
2015 ~ 2017: Quality / customer support engineer
2017 ~ 2018: Independent deep learning study
2018 ~: Medical-field startup
Interests
~2017.12: Vision, NLP
~2018.06: RL, GAN
2018.06~: Relational, Imitation
Outline
Part.1: Attention
Part.2: Self-Attention
Part 1. Attention
Attention, also referred to as enthrallment, is the behavioral and cognitive process
of selectively concentrating on a discrete aspect of information, whether deemed
subjective or objective, while ignoring other perceivable information. It is a state of
arousal. It is the taking possession by the mind in clear and vivid form of one out
of what seem several simultaneous objects or trains of thought. Focalization, the
concentration of consciousness, is of its essence. Attention or enthrallment has
also been described as the allocation of limited cognitive processing resources.
Recurrent Neural Network
[Figure: the paragraph above fed token by token through an RNN]
Problem: non-parallel computation, no long-range dependencies
Convolution Neural Network
[Figure: a convolutional filter sliding over local fragments of the paragraph above]
Problem: no long-range dependencies, computationally inefficient
Attention mechanism
Parallel computation, long-range dependencies, explainable
Attention mechanism
Fig. from Vaswani et al. Attention is all you need. ArXiv. 2017
1. Compute the similarity between Q and K.
2. Normalize so that very large values do not dominate.
3. Similarity → weights (summing to 1).
4. Multiply the weights by V.
The information {K: V} will be related to some query Q. Compute the similarity between K and Q and apply it to V; this passes along more of the V information that is directly relevant to Q.
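The four steps above can be sketched in a few lines of numpy. This is a minimal illustration of scaled dot-product attention; the function name and shapes are illustrative, not the paper's exact implementation.

```python
import numpy as np

def attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T                       # 1. similarity between Q and K
    scores = scores / np.sqrt(d_k)         # 2. scale so large values do not dominate
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)  # 3. softmax: similarity -> weights (sum = 1)
    return w @ V, w                        # 4. multiply the weights by V

Q = np.random.randn(2, 4)   # 2 queries, dim 4
K = np.random.randn(3, 4)   # 3 keys, dim 4
V = np.random.randn(3, 5)   # 3 values, dim 5
out, w = attention(Q, K, V)
print(out.shape, np.allclose(w.sum(axis=-1), 1.0))  # (2, 5) True
```

Each output row is a convex combination of the value rows, weighted by how similar the query is to each key.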
e.g. Attention mechanism with Seq2Seq
(Machine translation, Encoder-Decoder, Attention)
[Figure: an encoder RNN feeding a decoder RNN]
The encoder's information flow depends on the previous step's hidden state and the current step's input.
The encoder's final state is passed to the decoder.
The decoder's information flow depends only on the information from the previous step.
e.g. Attention mechanism with Seq2Seq
(Machine translation, Encoder-Decoder, Attention)
Fig from Bahdanau et al. Neural Machine Translation by Jointly Learning to Align and Translate. ICLR. 2015
[Figure: attention connects every encoder state to each decoder step (⊕), giving a long-range dependency]
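A hedged sketch of the Bahdanau-style step in the figure: the decoder scores every encoder hidden state, then takes a weighted sum (the context vector). The names Wa, Ua, va and all shapes are illustrative assumptions, not the paper's exact code.

```python
import numpy as np

def context_vector(s_prev, H, Wa, Ua, va):
    # e_j = va^T tanh(Wa s_{t-1} + Ua h_j): alignment score per encoder step
    e = np.tanh(s_prev @ Wa + H @ Ua) @ va
    a = np.exp(e - e.max()); a = a / a.sum()   # softmax -> attention weights
    return a @ H, a                            # weighted sum over ALL encoder steps

T, d = 6, 8                     # encoder length, hidden size
H = np.random.randn(T, d)       # encoder hidden states h_1..h_T
s = np.random.randn(d)          # previous decoder state s_{t-1}
Wa, Ua = np.random.randn(d, d), np.random.randn(d, d)
va = np.random.randn(d)
ctx, a = context_vector(s, H, Wa, Ua, va)
print(ctx.shape, np.isclose(a.sum(), 1.0))  # (8,) True
```

Because the context is a sum over all encoder steps, the decoder can reach distant source words directly rather than through the encoder's final state.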
e.g. Style-token
(Text to speech, Encoder-Decoder, Style transfer, Attention)
Fig. from Wang et al. Style-tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis. ArXiv. 2018
[Figure: Encoder1 and Encoder2 attend over a bank of GSTs (randomly initialized tokens), whose weighted sum (⊕) conditions the Decoder]
Demo: https://google.github.io/tacotron/publications/global_style_tokens/
Part 2. Self-attention
Self-attention Layer
[Figure: a 3×3 feature map (values 1–9); each output position (1′, 2′, …) is a weighted sum of all nine input positions, using that position's similarity weights (e.g. 0.1, 0.3, …) over every other position]
Self-attention
Fig. from Wang et al. Non-local neural networks. ArXiv. 2017.
1. Compute the similarity between pixels i and j.
2. Multiply by the value of pixel j.
3. Normalization term.
The i-th and j-th pieces of information will be related to each other. Compute the similarity between every pair of positions and apply it as a weight; the model can then learn relationships between all positions (long-range dependency!).
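The three steps above can be sketched as a non-local operation (dot-product variant): every position i attends to every position j. The theta/phi/g projection matrices and shapes below are illustrative assumptions.

```python
import numpy as np

def non_local(x, Wt, Wp, Wg):
    N = x.shape[0]                 # number of positions (pixels, flattened)
    theta, phi, g = x @ Wt, x @ Wp, x @ Wg
    f = theta @ phi.T              # 1. similarity f(x_i, x_j) for every pair (i, j)
    y = (f / N) @ g                # 3. normalize by C(x) = N; 2. weight g(x_j) and sum
    return y                       # each y_i mixes information from ALL positions

N, c = 9, 4                        # e.g. a flattened 3x3 feature map, 4 channels
x = np.random.randn(N, c)
Wt, Wp, Wg = (np.random.randn(c, c) for _ in range(3))
y = non_local(x, Wt, Wp, Wg)
print(y.shape)  # (9, 4)
```

Unlike a convolution, each output position depends on the whole map in a single step, which is exactly the long-range dependency noted above.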
e.g. Self-Attention GAN
(Image generation, GAN, Self-attention)
Fig. from Zhang et al. Self-Attention Generative Adversarial Networks. ArXiv. 2018.
Generator: Latent (z) → Transpose Conv → Self-Attention → Image (x′)
Discriminator: Image (x) → Conv → Self-Attention → FC → Prob
Conclusion
Attention: parallel computation, long-range dependencies, explainable
Self-Attention: learns relationships between all positions within a single input (long-range dependency)
Next...?
Relational Network, Graphical Model...
Reference
- Bahdanau et al. Neural Machine Translation by Jointly Learning to Align and Translate. ICLR. 2015
- Wang et al. Non-local Neural Networks. ArXiv. 2017
- Vaswani et al. Attention Is All You Need. ArXiv. 2017
- Wang et al. Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis. ArXiv. 2018
- Zhang et al. Self-Attention Generative Adversarial Networks. ArXiv. 2018
- Blog post explaining Attention Is All You Need (https://mchromiak.github.io/articles/2017/Sep/12/Transformer-Attention-is-all-you-need/)
- Video explaining Attention Is All You Need (https://www.youtube.com/watch?v=iDulhoQ2pro)
More Related Content

What's hot

Self-Attention with Linear Complexity
Self-Attention with Linear ComplexitySelf-Attention with Linear Complexity
Self-Attention with Linear ComplexitySangwoo Mo
 
Sequence to sequence (encoder-decoder) learning
Sequence to sequence (encoder-decoder) learningSequence to sequence (encoder-decoder) learning
Sequence to sequence (encoder-decoder) learningRoberto Pereira Silveira
 
Attention is all you need (UPC Reading Group 2018, by Santi Pascual)
Attention is all you need (UPC Reading Group 2018, by Santi Pascual)Attention is all you need (UPC Reading Group 2018, by Santi Pascual)
Attention is all you need (UPC Reading Group 2018, by Santi Pascual)Universitat Politècnica de Catalunya
 
Survey of Attention mechanism
Survey of Attention mechanismSurvey of Attention mechanism
Survey of Attention mechanismSwatiNarkhede1
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaAlexey Grigorev
 
Attention is all you need
Attention is all you needAttention is all you need
Attention is all you needHoon Heo
 
Attention scores and mechanisms
Attention scores and mechanismsAttention scores and mechanisms
Attention scores and mechanismsJaeHo Jang
 
Human pose estimation with deep learning
Human pose estimation with deep learningHuman pose estimation with deep learning
Human pose estimation with deep learningengiyad95
 
[DL輪読会]Wav2CLIP: Learning Robust Audio Representations From CLIP
[DL輪読会]Wav2CLIP: Learning Robust Audio Representations From CLIP[DL輪読会]Wav2CLIP: Learning Robust Audio Representations From CLIP
[DL輪読会]Wav2CLIP: Learning Robust Audio Representations From CLIPDeep Learning JP
 
[기초개념] Recurrent Neural Network (RNN) 소개
[기초개념] Recurrent Neural Network (RNN) 소개[기초개념] Recurrent Neural Network (RNN) 소개
[기초개념] Recurrent Neural Network (RNN) 소개Donghyeon Kim
 
Introduction For seq2seq(sequence to sequence) and RNN
Introduction For seq2seq(sequence to sequence) and RNNIntroduction For seq2seq(sequence to sequence) and RNN
Introduction For seq2seq(sequence to sequence) and RNNHye-min Ahn
 
Recursive Neural Networks
Recursive Neural NetworksRecursive Neural Networks
Recursive Neural NetworksSangwoo Mo
 
ConvNetの歴史とResNet亜種、ベストプラクティス
ConvNetの歴史とResNet亜種、ベストプラクティスConvNetの歴史とResNet亜種、ベストプラクティス
ConvNetの歴史とResNet亜種、ベストプラクティスYusuke Uchida
 
RLCode와 A3C 쉽고 깊게 이해하기
RLCode와 A3C 쉽고 깊게 이해하기RLCode와 A3C 쉽고 깊게 이해하기
RLCode와 A3C 쉽고 깊게 이해하기Woong won Lee
 
Seq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) modelSeq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) model佳蓉 倪
 
モデルアーキテクチャ観点からの高速化2019
モデルアーキテクチャ観点からの高速化2019モデルアーキテクチャ観点からの高速化2019
モデルアーキテクチャ観点からの高速化2019Yusuke Uchida
 
Deep neural networks and tabular data
Deep neural networks and tabular dataDeep neural networks and tabular data
Deep neural networks and tabular dataJimmyLiang20
 
[223]기계독해 QA: 검색인가, NLP인가?
[223]기계독해 QA: 검색인가, NLP인가?[223]기계독해 QA: 검색인가, NLP인가?
[223]기계독해 QA: 검색인가, NLP인가?NAVER D2
 
Recurrent Neural Networks
Recurrent Neural NetworksRecurrent Neural Networks
Recurrent Neural NetworksSeiya Tokui
 

What's hot (20)

Self-Attention with Linear Complexity
Self-Attention with Linear ComplexitySelf-Attention with Linear Complexity
Self-Attention with Linear Complexity
 
Sequence to sequence (encoder-decoder) learning
Sequence to sequence (encoder-decoder) learningSequence to sequence (encoder-decoder) learning
Sequence to sequence (encoder-decoder) learning
 
Attention is all you need (UPC Reading Group 2018, by Santi Pascual)
Attention is all you need (UPC Reading Group 2018, by Santi Pascual)Attention is all you need (UPC Reading Group 2018, by Santi Pascual)
Attention is all you need (UPC Reading Group 2018, by Santi Pascual)
 
Survey of Attention mechanism
Survey of Attention mechanismSurvey of Attention mechanism
Survey of Attention mechanism
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga Petrova
 
Attention is all you need
Attention is all you needAttention is all you need
Attention is all you need
 
Attention scores and mechanisms
Attention scores and mechanismsAttention scores and mechanisms
Attention scores and mechanisms
 
Human pose estimation with deep learning
Human pose estimation with deep learningHuman pose estimation with deep learning
Human pose estimation with deep learning
 
[DL輪読会]Wav2CLIP: Learning Robust Audio Representations From CLIP
[DL輪読会]Wav2CLIP: Learning Robust Audio Representations From CLIP[DL輪読会]Wav2CLIP: Learning Robust Audio Representations From CLIP
[DL輪読会]Wav2CLIP: Learning Robust Audio Representations From CLIP
 
[기초개념] Recurrent Neural Network (RNN) 소개
[기초개념] Recurrent Neural Network (RNN) 소개[기초개념] Recurrent Neural Network (RNN) 소개
[기초개념] Recurrent Neural Network (RNN) 소개
 
Introduction For seq2seq(sequence to sequence) and RNN
Introduction For seq2seq(sequence to sequence) and RNNIntroduction For seq2seq(sequence to sequence) and RNN
Introduction For seq2seq(sequence to sequence) and RNN
 
Word2Vec
Word2VecWord2Vec
Word2Vec
 
Recursive Neural Networks
Recursive Neural NetworksRecursive Neural Networks
Recursive Neural Networks
 
ConvNetの歴史とResNet亜種、ベストプラクティス
ConvNetの歴史とResNet亜種、ベストプラクティスConvNetの歴史とResNet亜種、ベストプラクティス
ConvNetの歴史とResNet亜種、ベストプラクティス
 
RLCode와 A3C 쉽고 깊게 이해하기
RLCode와 A3C 쉽고 깊게 이해하기RLCode와 A3C 쉽고 깊게 이해하기
RLCode와 A3C 쉽고 깊게 이해하기
 
Seq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) modelSeq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) model
 
モデルアーキテクチャ観点からの高速化2019
モデルアーキテクチャ観点からの高速化2019モデルアーキテクチャ観点からの高速化2019
モデルアーキテクチャ観点からの高速化2019
 
Deep neural networks and tabular data
Deep neural networks and tabular dataDeep neural networks and tabular data
Deep neural networks and tabular data
 
[223]기계독해 QA: 검색인가, NLP인가?
[223]기계독해 QA: 검색인가, NLP인가?[223]기계독해 QA: 검색인가, NLP인가?
[223]기계독해 QA: 검색인가, NLP인가?
 
Recurrent Neural Networks
Recurrent Neural NetworksRecurrent Neural Networks
Recurrent Neural Networks
 

Similar to Attention mechanism 소개 자료

Ch. 5 Pt. A Guidelines Manual (November 1, 2018) .docx
Ch. 5 Pt. A    Guidelines Manual (November 1, 2018) .docxCh. 5 Pt. A    Guidelines Manual (November 1, 2018) .docx
Ch. 5 Pt. A Guidelines Manual (November 1, 2018) .docxbartholomeocoombs
 
Week 9 the neural basis of consciousness : dissociation of consciousness &amp...
Week 9 the neural basis of consciousness : dissociation of consciousness &amp...Week 9 the neural basis of consciousness : dissociation of consciousness &amp...
Week 9 the neural basis of consciousness : dissociation of consciousness &amp...Nao (Naotsugu) Tsuchiya
 
Paying and capturing attention - A decision/action model for soccer - pt.6
Paying and capturing attention - A decision/action model for soccer - pt.6Paying and capturing attention - A decision/action model for soccer - pt.6
Paying and capturing attention - A decision/action model for soccer - pt.6Larry Paul
 
Watzl "What is Attention?"
Watzl "What is Attention?"Watzl "What is Attention?"
Watzl "What is Attention?"sebastianwatzl
 
Behavioral analysis of cognition
Behavioral analysis of cognitionBehavioral analysis of cognition
Behavioral analysis of cognitionGheraldine Fillaro
 
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...Numenta
 
Growing evidence for separate neural mechanisms for attention and consciousne...
Growing evidence for separate neural mechanisms for attention and consciousne...Growing evidence for separate neural mechanisms for attention and consciousne...
Growing evidence for separate neural mechanisms for attention and consciousne...Nao (Naotsugu) Tsuchiya
 
1810.mid1043.07
1810.mid1043.071810.mid1043.07
1810.mid1043.07vizualizer
 
Week 8 : The neural basis of consciousness : consciousness vs. attention
Week 8 : The neural basis of consciousness : consciousness vs. attention Week 8 : The neural basis of consciousness : consciousness vs. attention
Week 8 : The neural basis of consciousness : consciousness vs. attention Nao (Naotsugu) Tsuchiya
 
Cognitive Science Unit 4
Cognitive Science Unit 4Cognitive Science Unit 4
Cognitive Science Unit 4CSITSansar
 
pending-1664760315-2 knowledge based agent student.pptx
pending-1664760315-2 knowledge based agent student.pptxpending-1664760315-2 knowledge based agent student.pptx
pending-1664760315-2 knowledge based agent student.pptxkumarkaushal17
 
pending-1664760315-2 knowledge based agent student.pptx
pending-1664760315-2 knowledge based agent student.pptxpending-1664760315-2 knowledge based agent student.pptx
pending-1664760315-2 knowledge based agent student.pptxkumarkaushal17
 
A Handbook of Cognition for UX Designers
A Handbook of Cognition for UX DesignersA Handbook of Cognition for UX Designers
A Handbook of Cognition for UX DesignersXinLei Guo
 
Can Marketers Get to Grips with the Human Condition?
Can Marketers Get to Grips with the Human Condition?Can Marketers Get to Grips with the Human Condition?
Can Marketers Get to Grips with the Human Condition?Klaxon
 
UP LBL880 - Article on Systemic Thinking
UP LBL880 - Article on Systemic ThinkingUP LBL880 - Article on Systemic Thinking
UP LBL880 - Article on Systemic ThinkingEducation Moving Up Cc.
 
NeuroscienceLaboratory__03_2016C
NeuroscienceLaboratory__03_2016CNeuroscienceLaboratory__03_2016C
NeuroscienceLaboratory__03_2016CValeria Trezzi
 
Human function and attention ppt
Human function and attention pptHuman function and attention ppt
Human function and attention pptHenry Mwanza
 
LearningAG.ppt
LearningAG.pptLearningAG.ppt
LearningAG.pptbutest
 

Similar to Attention mechanism 소개 자료 (20)

Ch. 5 Pt. A Guidelines Manual (November 1, 2018) .docx
Ch. 5 Pt. A    Guidelines Manual (November 1, 2018) .docxCh. 5 Pt. A    Guidelines Manual (November 1, 2018) .docx
Ch. 5 Pt. A Guidelines Manual (November 1, 2018) .docx
 
Week 9 the neural basis of consciousness : dissociation of consciousness &amp...
Week 9 the neural basis of consciousness : dissociation of consciousness &amp...Week 9 the neural basis of consciousness : dissociation of consciousness &amp...
Week 9 the neural basis of consciousness : dissociation of consciousness &amp...
 
Paying and capturing attention - A decision/action model for soccer - pt.6
Paying and capturing attention - A decision/action model for soccer - pt.6Paying and capturing attention - A decision/action model for soccer - pt.6
Paying and capturing attention - A decision/action model for soccer - pt.6
 
Watzl "What is Attention?"
Watzl "What is Attention?"Watzl "What is Attention?"
Watzl "What is Attention?"
 
Tvcg.12a
Tvcg.12aTvcg.12a
Tvcg.12a
 
Behavioral analysis of cognition
Behavioral analysis of cognitionBehavioral analysis of cognition
Behavioral analysis of cognition
 
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...
 
Growing evidence for separate neural mechanisms for attention and consciousne...
Growing evidence for separate neural mechanisms for attention and consciousne...Growing evidence for separate neural mechanisms for attention and consciousne...
Growing evidence for separate neural mechanisms for attention and consciousne...
 
1810.mid1043.07
1810.mid1043.071810.mid1043.07
1810.mid1043.07
 
Boost your strategic thinking
Boost your strategic thinkingBoost your strategic thinking
Boost your strategic thinking
 
Week 8 : The neural basis of consciousness : consciousness vs. attention
Week 8 : The neural basis of consciousness : consciousness vs. attention Week 8 : The neural basis of consciousness : consciousness vs. attention
Week 8 : The neural basis of consciousness : consciousness vs. attention
 
Cognitive Science Unit 4
Cognitive Science Unit 4Cognitive Science Unit 4
Cognitive Science Unit 4
 
pending-1664760315-2 knowledge based agent student.pptx
pending-1664760315-2 knowledge based agent student.pptxpending-1664760315-2 knowledge based agent student.pptx
pending-1664760315-2 knowledge based agent student.pptx
 
pending-1664760315-2 knowledge based agent student.pptx
pending-1664760315-2 knowledge based agent student.pptxpending-1664760315-2 knowledge based agent student.pptx
pending-1664760315-2 knowledge based agent student.pptx
 
A Handbook of Cognition for UX Designers
A Handbook of Cognition for UX DesignersA Handbook of Cognition for UX Designers
A Handbook of Cognition for UX Designers
 
Can Marketers Get to Grips with the Human Condition?
Can Marketers Get to Grips with the Human Condition?Can Marketers Get to Grips with the Human Condition?
Can Marketers Get to Grips with the Human Condition?
 
UP LBL880 - Article on Systemic Thinking
UP LBL880 - Article on Systemic ThinkingUP LBL880 - Article on Systemic Thinking
UP LBL880 - Article on Systemic Thinking
 
NeuroscienceLaboratory__03_2016C
NeuroscienceLaboratory__03_2016CNeuroscienceLaboratory__03_2016C
NeuroscienceLaboratory__03_2016C
 
Human function and attention ppt
Human function and attention pptHuman function and attention ppt
Human function and attention ppt
 
LearningAG.ppt
LearningAG.pptLearningAG.ppt
LearningAG.ppt
 

Recently uploaded

INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxdolaknnilon
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 

Recently uploaded (20)

INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptx
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 

Attention mechanism 소개 자료

  • 1. Attention is all you need Whi Kwon
  • 2. 소개 2008 ~ 2015: 화공생명공학과 2015 ~ 2017: 품질 / 고객지원 엔지니어 2017 ~ 2018: 딥러닝 자유롭게 공부 2018 ~: 의료 분야 스타트업
  • 3. 관심사 ~2017.12: Vision, NLP ~2018.06: RL, GAN 2018.06~: Relational, Imitation
  • 6. Attention, also referred to as enthrallment, is the behavioral and cognitive process of selectively concentrating on a discrete aspect of information, whether deemed subjective or objective, while ignoring other perceivable information. It is a state of arousal. . It is the taking possession by the mind in clear and vivid form of one out of what seem several simultaneous objects or trains of thought. Focalization, the concentration of consciousness, is of its essence. Attention or enthrallment or attention has also been described as the allocation of limited cognitive processing resources.
  • 7. Attention, also referred to as enthrallment, is the behavioral and cognitive process of selectively concentrating on a discrete aspect of information, whether deemed subjective or objective, while ignoring other perceivable information. It is a state of arousal. . It is the taking possession by the mind in clear and vivid form of one out of what seem several simultaneous objects or trains of thought. Focalization, the concentration of consciousness, is of its essence. Attention or enthrallment or attention has also been described as the allocation of limited cognitive processing resources.
  • 8. Attention, also referred to as enthrallment, is the behavioral and cognitive process of selectively concentrating on a discrete aspect of information, whether deemed subjective or objective, while ignoring other perceivable information. It is a state of arousal. It is the taking possession by the mind in clear and vivid form of one out of what seem several simultaneous objects or trains of thought. Focalization, the concentration of consciousness, is of its essence. Attention or enthrallment or attention has also been described as the allocation of limited cognitive processing resources. Recurrent Neural Network ... attention also referred resources 문제 : Non-parallel computation, not long-range dependencies
  • 9. Convolutional Neural Network: a filter slides over the passage above, seeing only a local window at a time ("attention also ...", "cognitive process of selectively ...", "whether deemed ..."). Problems: no long-range dependencies, computationally inefficient.
  • 10. Attention mechanism, applied to the same passage: parallel computation, long-range dependencies, explainable.
  • 11.–15. Attention mechanism (Fig. from Vaswani et al. Attention is all you need. ArXiv. 2017) 1. Compute the similarity between Q and K. 2. Normalize so that overly large values do not dominate. 3. Turn the similarities into weights (summing to 1). 4. Multiply the weights by V. The information {K:V} will be related to some query Q; by computing the similarity between K and Q and applying it to V, the values V most directly relevant to Q carry more information forward.
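The four steps above can be sketched as scaled dot-product attention; this is a minimal numpy version, assuming Q, K, V are plain arrays of shape (seq_len, d):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = K.shape[-1]
    # 1-2. similarity between Q and K, scaled so large values do not dominate
    scores = Q @ K.T / np.sqrt(d_k)
    # 3. softmax: similarities -> weights that sum to 1 per query
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # 4. weighted sum of the values V
    return weights @ V

Q = np.random.randn(4, 8)   # 4 queries
K = np.random.randn(6, 8)   # 6 keys
V = np.random.randn(6, 8)   # 6 values
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8): one attended value per query
```

The scaling by sqrt(d_k) is the "normalize" step from the slide: without it, dot products grow with the dimension and push the softmax into a regime where one weight dominates.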
  • 16. e.g. Attention mechanism with Seq2Seq (Machine translation, Encoder-Decoder, Attention) The encoder passes information forward based on the previous step's hidden state and the current step's input; only the encoder's final hidden state is handed to the decoder. The decoder's information flow, in turn, depends only on the previous time step.
  • 17. e.g. Attention mechanism with Seq2Seq (Machine translation, Encoder-Decoder, Attention) Adding an attention connection (⊕) from every encoder hidden state to the decoder gives the decoder a long-range dependency on the whole input.
  • 18. e.g. Attention mechanism with Seq2Seq (Machine translation, Encoder-Decoder, Attention) Fig. from Bahdanau et al. Neural Machine Translation by Jointly Learning to Align and Translate. ICLR. 2015
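One decoder step with attention over the encoder states can be sketched as below. This is a hypothetical minimal version: it scores with a dot product rather than the additive (learned) scoring Bahdanau et al. actually use, and `enc_states` / `dec_state` are assumed names, not from the paper.

```python
import numpy as np

def attention_context(dec_state, enc_states):
    # score each encoder hidden state against the current decoder state
    # (dot-product scoring; Bahdanau et al. use a small learned network here)
    scores = enc_states @ dec_state                 # shape (T,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                        # attention weights, sum to 1
    # context vector: weighted sum of all encoder states, fed to the decoder
    return weights @ enc_states, weights

enc_states = np.random.randn(5, 16)   # T=5 encoder steps, hidden size 16
dec_state = np.random.randn(16)       # current decoder hidden state
context, w = attention_context(dec_state, enc_states)
```

Because the context vector mixes all encoder states at every decoder step, the decoder is no longer limited to the encoder's final hidden state, which is exactly the long-range dependency the previous slide points at.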
  • 19. e.g. Style-token Fig. from Wang et al. Style-tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis. ArXiv. 2018 Decoder Encoder1 Encoder2 GST (Random init token) ⊕ Attention (Text to speech, Encoder-Decoder, Style transfer, Attention) Demo: https://google.github.io/tacotron/publications/global_style_tokens/
  • 21.–22. Self-attention: for each position in the sequence (position 1, then position 2, and so on), compute a similarity weight against every position (e.g. 0.1, 0.3, 0.1, ..., summing to 1), multiply each position's value by its weight, and sum (⊕) to produce the outputs 1', 2', ..., 9' of the self-attention layer. [diagram: weighted sums over positions 1–9]
  • 23.–26. Self-attention (Fig. from Wang et al. Non-local neural networks. ArXiv. 2017.) 1. Compute the similarity between pixels i and j. 2. Multiply by the value of pixel j. 3. Divide by a normalization term. Positions i and j will be related to each other; computing a similarity for every pair of positions and applying it as a weight lets the model learn the relation between all positions (long-range dependency!).
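The non-local operation above can be sketched in numpy. This is a minimal version under the simplifying assumption that the paper's embedding functions (theta, phi, g) are the identity; `x` is an (N, C) array of N flattened positions with C channels.

```python
import numpy as np

def non_local(x):
    # 1. pairwise similarity f(x_i, x_j) for every pair of positions
    s = x @ x.T
    f = np.exp(s - s.max(axis=-1, keepdims=True))
    # 3. divide by the normalization term C(x) so each row sums to 1
    f /= f.sum(axis=-1, keepdims=True)
    # 2. weighted sum over all positions j of the value g(x_j) (= x_j here)
    return f @ x

x = np.random.randn(9, 4)  # e.g. a 3x3 feature map flattened to 9 positions
y = non_local(x)
print(y.shape)  # (9, 4): every output position mixes all input positions
```

Every output position depends on every input position in a single layer, which is the long-range dependency a convolution would need many stacked layers to reach.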
  • 27. e.g. Self-Attention GAN (Image generation, GAN, Self-attention) Generator: latent z → transpose convolutions with a self-attention layer (⊕) → generated image x'. Discriminator: image x → convolutions with a self-attention layer (⊕) → FC → real/fake probability.
  • 28. Fig. from Zhang et al. Self-Attention Generative Adversarial Networks. ArXiv. 2018. e.g. Self-Attention GAN (Image generation, GAN, Self-attention)
  • 31. Reference - Bahdanau et al. Neural Machine Translation by Jointly Learning to Align and Translate. ICLR. 2015 - Wang et al. Non-local neural networks. ArXiv. 2017 - Vaswani et al. Attention is all you need. ArXiv. 2017 - Wang et al. Style-tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis. ArXiv. 2018 - Zhang et al. Self-Attention Generative Adversarial Networks. ArXiv. 2018 - Blog post explaining Attention is all you need (https://mchromiak.github.io/articles/2017/Sep/12/Transformer-Attention-is-all-you-need/) - Video explaining Attention is all you need (https://www.youtube.com/watch?v=iDulhoQ2pro)