Introduction to
Sequence to Sequence Model
2017.03.16 Seminar
Presenter : Hyemin Ahn
Recurrent Neural Networks : For what?
2017-03-28 CPSLAB (EECS) 2
▪ Humans remember and use the patterns in sequences.
• Try reciting ‘a b c d e f g…’
• But how about ‘z y x w v u t s…’?
▪ The idea behind RNNs is to make use of sequential information.
▪ Let’s learn the pattern of a sequence and use it (to estimate, generate, etc.)!
▪ But HOW?
Recurrent Neural Networks : Typical RNNs
[Figure: RNN diagram with labels INPUT, HIDDEN STATE (fed back through a ONE STEP DELAY), and OUTPUT.]
▪ RNNs are called “RECURRENT” because they perform the same task for every element of a sequence, with the output depending on the previous computations.
▪ RNNs have a “memory” which captures information about what has been calculated so far.
▪ The hidden state $h_t$ captures some information about a sequence.
▪ If we use $f = \tanh$, the vanishing/exploding gradient problem arises on long sequences.
▪ To overcome this, we use LSTM/GRU.
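$h_t = f(U x_t + W h_{t-1} + b)$
$y_t = V h_t + c$
(Figure labels: input $x_t$, hidden state $h_t$, output $y_t$; weights $U$ (input to hidden), $W$ (hidden to hidden), $V$ (hidden to output).)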
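The two update equations of the typical RNN can be sketched in a few lines of NumPy. This is only an illustration: the layer sizes, the random weights, the zero initial hidden state, and the function name `rnn_forward` are assumptions, not part of the slides.

```python
import numpy as np

def rnn_forward(xs, U, W, V, b, c, f=np.tanh):
    """Run a vanilla RNN over a sequence; return all outputs and hidden states."""
    h = np.zeros(W.shape[0])          # h_0: initial hidden state (assumed zero)
    hs, ys = [], []
    for x in xs:                      # the same computation for every element
        h = f(U @ x + W @ h + b)      # h_t = f(U x_t + W h_{t-1} + b)
        y = V @ h + c                 # y_t = V h_t + c
        hs.append(h)
        ys.append(y)
    return np.stack(ys), np.stack(hs)

# Illustrative sizes and random (untrained) parameters.
rng = np.random.default_rng(0)
input_dim, hidden_dim, output_dim, T = 3, 4, 2, 5
U = rng.normal(size=(hidden_dim, input_dim))
W = rng.normal(size=(hidden_dim, hidden_dim))
V = rng.normal(size=(output_dim, hidden_dim))
b = np.zeros(hidden_dim)
c = np.zeros(output_dim)
xs = rng.normal(size=(T, input_dim))

ys, hs = rnn_forward(xs, U, W, V, b, c)
print(ys.shape, hs.shape)   # (5, 2) (5, 4)
```

The hidden state `h` is the only thing carried from one step to the next, which is the "memory" described above.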
Recurrent Neural Networks : LSTM
▪ Let’s think about a machine that guesses the dinner menu from the things in a shopping bag.
[Illustration: the machine looks at the groceries and guesses, “Umm... Carbonara!”]
Recurrent Neural Networks : LSTM
[Figure: LSTM cell with input $x_t$, hidden state $h_t$, and cell state $C_t$. The cell state is the internal memory unit; it works like a conveyor belt!]
[Figure: the forget step of the LSTM cell. Forget some memories!]
LSTM learns (1) how to forget a memory, given the previous hidden state $h_{t-1}$ and the new input $x_t$, and (2) then how to add new memory, given $h_{t-1}$ and $x_t$.
[Figure: the input step of the LSTM cell. Insert some memories!]
Figures from http://colah.github.io/posts/2015-08-Understanding-LSTMs/
Recurrent Neural Networks : GRU
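LSTM
$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$
$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$
$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$
$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$
$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$
$h_t = o_t * \tanh(C_t)$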
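As a rough sketch, one LSTM step can be written directly from the six equations above. The parameter dictionary layout (`W_f`, `b_f`, and so on), the sizes, and the random untrained weights are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, C_prev, p):
    """One LSTM step; p is a dict of weights W_* and biases b_* (hypothetical layout)."""
    hx = np.concatenate([h_prev, x_t])            # [h_{t-1}, x_t]
    f_t = sigmoid(p["W_f"] @ hx + p["b_f"])       # forget gate: what to drop from C_{t-1}
    i_t = sigmoid(p["W_i"] @ hx + p["b_i"])       # input gate: how much new memory to insert
    o_t = sigmoid(p["W_o"] @ hx + p["b_o"])       # output gate
    C_tilde = np.tanh(p["W_C"] @ hx + p["b_C"])   # candidate memory
    C_t = f_t * C_prev + i_t * C_tilde            # forget some memories, insert some memories
    h_t = o_t * np.tanh(C_t)
    return h_t, C_t

# Illustrative sizes and random (untrained) parameters.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
lstm_params = {}
for g in "fioC":
    lstm_params[f"W_{g}"] = rng.normal(scale=0.1, size=(n_hid, n_hid + n_in))
    lstm_params[f"b_{g}"] = np.zeros(n_hid)

h, C = np.zeros(n_hid), np.zeros(n_hid)
for x_t in rng.normal(size=(6, n_in)):            # a sequence of 6 inputs
    h, C = lstm_step(x_t, h, C, lstm_params)
print(h.shape, C.shape)   # (4,) (4,)
```

The cell state `C` is only scaled by `f_t` and added to, which is why it behaves like a conveyor belt: when `f_t` is close to 1 and `i_t` is close to 0, the memory passes through a step almost unchanged.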
Maybe we can simplify this structure and make it more efficient!
GRU
$z_t = \sigma(W_z \cdot [h_{t-1}, x_t] + b_z)$
$r_t = \sigma(W_r \cdot [h_{t-1}, x_t] + b_r)$
$\tilde{h}_t = \tanh(W_h \cdot [r_t * h_{t-1}, x_t] + b_h)$
$h_t = (1 - z_t) * h_{t-1} + z_t * \tilde{h}_t$
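The GRU update can be sketched the same way. Again the parameter names, sizes, and random weights are illustrative assumptions; the bias of the candidate state is named `b_h` in this sketch.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, p):
    """One GRU step; p is a dict of weights W_* and biases b_* (hypothetical layout)."""
    hx = np.concatenate([h_prev, x_t])              # [h_{t-1}, x_t]
    z_t = sigmoid(p["W_z"] @ hx + p["b_z"])         # update gate
    r_t = sigmoid(p["W_r"] @ hx + p["b_r"])         # reset gate
    rhx = np.concatenate([r_t * h_prev, x_t])       # [r_t * h_{t-1}, x_t]
    h_tilde = np.tanh(p["W_h"] @ rhx + p["b_h"])    # candidate hidden state
    return (1.0 - z_t) * h_prev + z_t * h_tilde     # blend old state and candidate

# Illustrative sizes and random (untrained) parameters.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
gru_params = {}
for g in "zrh":
    gru_params[f"W_{g}"] = rng.normal(scale=0.1, size=(n_hid, n_hid + n_in))
    gru_params[f"b_{g}"] = np.zeros(n_hid)

h = np.zeros(n_hid)
for x_t in rng.normal(size=(6, n_in)):
    h = gru_step(x_t, h, gru_params)
print(h.shape)   # (4,)
```

Compared with the LSTM, there is no separate cell state, and there are three weight matrices instead of four, which is the simplification the slide refers to.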
Sequence to Sequence Model: What is it?
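[Figure: an LSTM/GRU encoder reads the input sequence through hidden states $h_e(1), h_e(2), h_e(3), h_e(4), h_e(5)$; an LSTM/GRU decoder generates the output sequence through hidden states $h_d(1), \ldots, h_d(T_e)$. Example task: Western food to Korean food transition.]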
Sequence to Sequence Model: Implementation
▪ The simplest way to implement a sequence to sequence model is to just pass the last hidden state of the encoder, $h_T$, to the first GRU cell of the decoder!
▪ However, this method gets weaker when the decoder needs to generate a longer sequence, since everything must pass through the single vector $h_T$.
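A minimal sketch of this simplest scheme follows: a GRU encoder summarises the source sequence into its last hidden state, and a GRU decoder starts from that state. Everything specific here is an assumption for illustration: the helper names (`init_gru`, `encode`, `decode`), the linear readout `V`, the all-zero start input, the sizes, and the untrained random weights.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def init_gru(rng, n_in, n_hid):
    """Random GRU parameters (illustrative initialisation, untrained)."""
    p = {}
    for g in "zrh":
        p[f"W_{g}"] = rng.normal(scale=0.1, size=(n_hid, n_hid + n_in))
        p[f"b_{g}"] = np.zeros(n_hid)
    return p

def gru_step(x_t, h_prev, p):
    hx = np.concatenate([h_prev, x_t])
    z = sigmoid(p["W_z"] @ hx + p["b_z"])
    r = sigmoid(p["W_r"] @ hx + p["b_r"])
    h_tilde = np.tanh(p["W_h"] @ np.concatenate([r * h_prev, x_t]) + p["b_h"])
    return (1.0 - z) * h_prev + z * h_tilde

def encode(xs, enc):
    """Read the whole source sequence; return the last hidden state h_T."""
    h = np.zeros(enc["b_z"].shape[0])
    for x_t in xs:
        h = gru_step(x_t, h, enc)
    return h

def decode(h_T, dec, V, n_steps):
    """Generate n_steps outputs, starting from the encoder summary h_T."""
    h = h_T                        # the only link between encoder and decoder
    y = np.zeros(V.shape[0])       # assumed all-zero start-of-sequence input
    ys = []
    for _ in range(n_steps):
        h = gru_step(y, h, dec)
        y = V @ h                  # linear readout, fed back as the next input
        ys.append(y)
    return np.stack(ys)

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 3, 8, 2
enc = init_gru(rng, n_in, n_hid)
dec = init_gru(rng, n_out, n_hid)
V = rng.normal(scale=0.1, size=(n_out, n_hid))

src = rng.normal(size=(5, n_in))           # source sequence of length 5
h_T = encode(src, enc)
out = decode(h_T, dec, V, n_steps=7)       # target sequence of length 7
print(h_T.shape, out.shape)   # (8,) (7, 2)
```

Note that `decode` never sees `src` itself, only `h_T`. That fixed-size bottleneck is the weakness the second bullet describes.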
Sequence to Sequence Model: Attention Decoder
[Figure: a bidirectional GRU encoder, an attention layer that produces the context vector $c_t$, and a GRU decoder.]
▪ For each GRU cell in the decoder, let’s pass the encoder’s information differently!
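$h_i = \begin{bmatrix} \overrightarrow{h}_i \\ \overleftarrow{h}_i \end{bmatrix}$
$c_i = \sum_{j=1}^{T_x} \alpha_{ij} h_j$
$s_i = f(s_{i-1}, y_{i-1}, c_i) = (1 - z_i) * s_{i-1} + z_i * \tilde{s}_i$
$z_i = \sigma(W_z y_{i-1} + U_z s_{i-1})$
$r_i = \sigma(W_r y_{i-1} + U_r s_{i-1})$
$\tilde{s}_i = \tanh(y_{i-1} + U [r_i * s_{i-1}] + C c_i)$
$\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}$
$e_{ij} = v_a^T \tanh(W_a s_{i-1} + U_a h_j)$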
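The attention part of these equations (the scores $e_{ij}$, the weights $\alpha_{ij}$, and the context $c_i$) can be sketched as below. The function name `attention_context`, the dimensions, and the random arrays standing in for the forward and backward encoder states are assumptions for illustration; no trained model is involved.

```python
import numpy as np

def attention_context(s_prev, H, W_a, U_a, v_a):
    """Context vector c_i and attention weights alpha_i for one decoder step.

    s_prev: previous decoder state s_{i-1}, shape (n_dec,)
    H:      encoder annotations h_1..h_{T_x}, shape (T_x, n_enc)
    """
    # e_ij = v_a^T tanh(W_a s_{i-1} + U_a h_j), computed for all j at once
    e = np.tanh(W_a @ s_prev + H @ U_a.T) @ v_a     # shape (T_x,)
    e = e - e.max()                                 # stabilises exp; softmax is unchanged
    alpha = np.exp(e) / np.exp(e).sum()             # alpha_ij: softmax over source positions
    c = alpha @ H                                   # c_i = sum_j alpha_ij h_j
    return c, alpha

rng = np.random.default_rng(0)
T_x, n_hid, n_dec, n_att = 6, 4, 5, 7

# Stand-ins for the forward and backward GRU states of a bidirectional encoder.
H_fwd = rng.normal(size=(T_x, n_hid))
H_bwd = rng.normal(size=(T_x, n_hid))
H = np.concatenate([H_fwd, H_bwd], axis=1)          # h_j = [forward; backward]

W_a = rng.normal(size=(n_att, n_dec))
U_a = rng.normal(size=(n_att, 2 * n_hid))
v_a = rng.normal(size=n_att)
s_prev = rng.normal(size=n_dec)

c, alpha = attention_context(s_prev, H, W_a, U_a, v_a)
print(c.shape, alpha.shape)   # (8,) (6,)
```

Because `alpha` depends on `s_prev`, every decoder step gets its own weighted mix of the encoder states instead of one fixed summary vector.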
Sequence to Sequence Model: Example codes
Codes Here @ Github

Introduction to seq2seq (sequence to sequence) and RNN

  • 1. Introduction to Sequence to Sequence Model. 2017.03.16 Seminar. Presenter: Hyemin Ahn
  • 2. Recurrent Neural Networks: For what? 2017-03-28 CPSLAB (EECS). Humans remember and use patterns in sequences: try reciting ‘a b c d e f g…’, but how about ‘z y x w v u t s…’? The idea behind RNNs is to make use of sequential information. Let’s learn the pattern of a sequence and utilize it (estimate, generate, etc.)! But HOW?
  • 3. Recurrent Neural Networks: Typical RNNs. [Figure: an RNN cell with input $x_t$, hidden state $h_t$, output $y_t$, weights $U$, $W$, $V$, and a one-step delay feeding the hidden state back.] RNNs are called “RECURRENT” because they perform the same task for every element of a sequence, with the output depending on the previous computations. RNNs have a “memory” which captures information about what has been calculated so far. The hidden state $h_t$ captures some information about the sequence: $h_t = f(U x_t + W h_{t-1} + b)$, $y_t = V h_t + c$. If we use $f = \tanh$, the vanishing/exploding gradient problem can occur on long sequences. To overcome this, we use LSTM/GRU.
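The recurrence on this slide can be sketched in a few lines of plain Python. This is only an illustrative sketch: the helper names (`matvec`, `rnn_step`, `run_rnn`) and the toy weight values are assumptions made for this example, not part of the original slides, and $f$ is taken to be tanh.

```python
import math

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def rnn_step(x_t, h_prev, U, W, V, b, c):
    """One step: h_t = tanh(U x_t + W h_{t-1} + b), y_t = V h_t + c."""
    pre = [u + w + bi for u, w, bi in zip(matvec(U, x_t), matvec(W, h_prev), b)]
    h_t = [math.tanh(p) for p in pre]
    y_t = [v + ci for v, ci in zip(matvec(V, h_t), c)]
    return h_t, y_t

def run_rnn(xs, h0, U, W, V, b, c):
    """Apply the same step to every element of the sequence."""
    h, ys = h0, []
    for x_t in xs:
        h, y = rnn_step(x_t, h, U, W, V, b, c)
        ys.append(y)
    return h, ys

# Toy sizes (assumed): 2-d input, 2-d hidden state, 1-d output.
U = [[0.5, 0.0], [0.0, 0.5]]
W = [[0.1, 0.0], [0.0, 0.1]]
V = [[1.0, 1.0]]
b = [0.0, 0.0]
c = [0.0]
h_final, outputs = run_rnn([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], U, W, V, b, c)
```

The same weights are reused at every time step, and the only thing carried from one step to the next is the hidden state, which is what the "one step delay" in the figure denotes.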
  • 4. Recurrent Neural Networks: LSTM. Let’s think about a machine that guesses the dinner menu from the things in a shopping bag. “Umm... Carbonara!”
  • 5. Recurrent Neural Networks: LSTM. [Figure: LSTM cell with input $x_t$, hidden state $h_t$, and cell state $C_t$.] Cell state $C_t$: the internal memory unit, like a conveyor belt!
  • 6. Recurrent Neural Networks: LSTM. [Figure: LSTM cell with input $x_t$, hidden state $h_t$, and cell state $C_t$.] Cell state $C_t$: the internal memory unit, like a conveyor belt! Forget some memories!
  • 7. Recurrent Neural Networks: LSTM. [Figure: LSTM cell with input $x_t$, hidden state $h_t$, and cell state $C_t$.] Cell state $C_t$: the internal memory unit, like a conveyor belt! Forget some memories! LSTM learns (1) how to forget a memory when $h_{t-1}$ and the new input $x_t$ are given, and (2) how to then add the new memory given $h_{t-1}$ and $x_t$.
  • 8. Recurrent Neural Networks: LSTM. [Figure: LSTM cell with input $x_t$, hidden state $h_t$, and cell state $C_t$.] Cell state $C_t$: the internal memory unit, like a conveyor belt! Insert some memories! LSTM learns (1) how to forget a memory when $h_{t-1}$ and the new input $x_t$ are given, and (2) how to then add the new memory given $h_{t-1}$ and $x_t$.
  • 9. Recurrent Neural Networks: LSTM. [Figure: LSTM cell with input $x_t$, hidden state $h_t$, and cell state $C_t$.] Cell state $C_t$: the internal memory unit, like a conveyor belt! LSTM learns (1) how to forget a memory when $h_{t-1}$ and the new input $x_t$ are given, and (2) how to then add the new memory given $h_{t-1}$ and $x_t$.
  • 10. Recurrent Neural Networks: LSTM. [Figure: LSTM cell with input $x_t$, hidden state $h_t$, and cell state $C_t$.] Cell state $C_t$: the internal memory unit, like a conveyor belt! LSTM learns (1) how to forget a memory when $h_{t-1}$ and the new input $x_t$ are given, and (2) how to then add the new memory given $h_{t-1}$ and $x_t$.
  • 11. Recurrent Neural Networks: LSTM. [Figure: LSTM cell with input $x_t$, hidden state $h_t$, output $y_t$, and cell state $C_t$.] Cell state $C_t$: the internal memory unit, like a conveyor belt! LSTM learns (1) how to forget a memory when $h_{t-1}$ and the new input $x_t$ are given, and (2) how to then add the new memory given $h_{t-1}$ and $x_t$.
  • 12. Recurrent Neural Networks: LSTM. LSTM learns (1) how to forget a memory when $h_{t-1}$ and the new input $x_t$ are given, and (2) how to then add the new memory given $h_{t-1}$ and $x_t$. Figures from http://colah.github.io/posts/2015-08-Understanding-LSTMs/
  • 13. Recurrent Neural Networks: LSTM. LSTM learns (1) how to forget a memory when $h_{t-1}$ and the new input $x_t$ are given, and (2) how to then add the new memory given $h_{t-1}$ and $x_t$. Figures from http://colah.github.io/posts/2015-08-Understanding-LSTMs/
  • 14. Recurrent Neural Networks: LSTM. LSTM learns (1) how to forget a memory when $h_{t-1}$ and the new input $x_t$ are given, and (2) how to then add the new memory given $h_{t-1}$ and $x_t$. Figures from http://colah.github.io/posts/2015-08-Understanding-LSTMs/
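The "forget, then add" behaviour described on these slides can be sketched as a single LSTM step. This is a minimal illustration that assumes the standard LSTM gate equations (forget, input and output gates plus a tanh candidate memory); the function names, the parameter dictionary `p`, and the toy values are hypothetical and chosen only for this example.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def affine(W, v, b):
    """Return W v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def lstm_step(x_t, h_prev, C_prev, p):
    """One LSTM step; p holds weights W_f, W_i, W_o, W_C and matching biases."""
    hx = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f = [sigmoid(a) for a in affine(p["W_f"], hx, p["b_f"])]  # forget gate
    i = [sigmoid(a) for a in affine(p["W_i"], hx, p["b_i"])]  # input gate
    o = [sigmoid(a) for a in affine(p["W_o"], hx, p["b_o"])]  # output gate
    # Candidate memory that may be written into the cell state.
    C_tilde = [math.tanh(a) for a in affine(p["W_C"], hx, p["b_C"])]
    # (1) forget part of the old memory, (2) add the new memory.
    C_t = [ft * cp + it * ct for ft, cp, it, ct in zip(f, C_prev, i, C_tilde)]
    h_t = [ot * math.tanh(ct) for ot, ct in zip(o, C_t)]
    return h_t, C_t

# Toy setup (assumed): 1-d hidden state, 1-d input, all weights and biases zero,
# so every gate evaluates to sigmoid(0) = 0.5 and the candidate memory to 0.
zero_W = [[0.0, 0.0]]
params = {"W_f": zero_W, "W_i": zero_W, "W_o": zero_W, "W_C": zero_W,
          "b_f": [0.0], "b_i": [0.0], "b_o": [0.0], "b_C": [0.0]}
h_t, C_t = lstm_step([1.0], [0.0], [2.0], params)
```

With these toy parameters half of the old cell state is forgotten and nothing new is added, so `C_t` is half of `C_prev`. In a trained network the gate values depend on $h_{t-1}$ and $x_t$ through the learned weights.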
  • 15. Recurrent Neural Networks: GRU. The LSTM equations are:
      $f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$
      $i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$
      $o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$
      $\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$
      $C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$
      $h_t = o_t * \tanh(C_t)$
    Maybe we can simplify this structure, efficiently! The GRU equations are:
      $z_t = \sigma(W_z \cdot [h_{t-1}, x_t] + b_z)$
      $r_t = \sigma(W_r \cdot [h_{t-1}, x_t] + b_r)$
      $\tilde{h}_t = \tanh(W_h \cdot [r_t * h_{t-1}, x_t] + b_h)$
      $h_t = (1 - z_t) * h_{t-1} + z_t * \tilde{h}_t$
    Figures from http://colah.github.io/posts/2015-08-Understanding-LSTMs/
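As a rough illustration of how the GRU simplifies the LSTM, the GRU equations on this slide can be written as one update function. The names below (`gru_step`, the parameter dictionary `p`, and the candidate bias key `b_h`) and the toy values are assumptions made for this sketch; the update and reset gates follow the slide's $z_t$ and $r_t$.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def affine(W, v, b):
    """Return W v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def gru_step(x_t, h_prev, p):
    """One GRU step; p holds W_z, W_r, W_h and biases b_z, b_r, b_h."""
    hx = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    z = [sigmoid(a) for a in affine(p["W_z"], hx, p["b_z"])]  # update gate
    r = [sigmoid(a) for a in affine(p["W_r"], hx, p["b_r"])]  # reset gate
    # Candidate state uses the reset-gated previous state: [r_t * h_{t-1}, x_t].
    rhx = [ri * hi for ri, hi in zip(r, h_prev)] + x_t
    h_tilde = [math.tanh(a) for a in affine(p["W_h"], rhx, p["b_h"])]
    # Interpolate between the old state and the candidate state.
    return [(1.0 - zi) * hp + zi * ht for zi, hp, ht in zip(z, h_prev, h_tilde)]

# Toy setup (assumed): 2-d hidden state, 1-d input, all parameters zero,
# so z = r = 0.5 and the candidate state is 0.
zero_W = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
params = {"W_z": zero_W, "W_r": zero_W, "W_h": zero_W,
          "b_z": [0.0, 0.0], "b_r": [0.0, 0.0], "b_h": [0.0, 0.0]}
h_new = gru_step([0.3], [1.0, -2.0], params)
```

Compared with the LSTM there is no separate cell state and no output gate: a single gate $z_t$ decides both how much of the old state to keep and how much of the candidate to write.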
  • 16. Sequence to Sequence Model: What is it? [Figure: an LSTM/GRU encoder with hidden states $h_e(1)$, $h_e(2)$, $h_e(3)$, $h_e(4)$, $h_e(5)$ feeding an LSTM/GRU decoder with hidden states labelled $h_d(1)$ and $h_d(T_e)$.] Example: Western food to Korean food transition.
  • 17. Sequence to Sequence Model: Implementation. The simplest way to implement a sequence to sequence model is to just pass the last hidden state of the encoder, $h_T$, to the first GRU cell of the decoder! However, this method becomes less effective when the decoder needs to generate a longer sequence.
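A minimal sketch of this simplest implementation, assuming the usual encoder-decoder arrangement in which the encoder's last hidden state $h_T$ becomes the decoder's initial state. To keep the example short, a plain tanh recurrent cell stands in for the GRU cell named on the slide; all function names, weights, and sizes here are hypothetical.

```python
import math

def cell(x, h, Wx, Wh):
    """Stand-in recurrent cell: h' = tanh(Wx x + Wh h). A GRU/LSTM would go here."""
    return [math.tanh(sum(a * xi for a, xi in zip(rx, x)) +
                      sum(b * hi for b, hi in zip(rh, h)))
            for rx, rh in zip(Wx, Wh)]

def encode(xs, Wx, Wh, hidden_size):
    """Run the encoder over the input sequence and return its last state h_T."""
    h = [0.0] * hidden_size
    for x in xs:
        h = cell(x, h, Wx, Wh)
    return h

def decode(h_T, steps, Wx, Wh, Wout, start):
    """Generate `steps` outputs, starting the decoder state from h_T."""
    h, y, ys = h_T, start, []
    for _ in range(steps):
        h = cell(y, h, Wx, Wh)  # previous output is fed back as input
        y = [sum(w * hi for w, hi in zip(row, h)) for row in Wout]
        ys.append(y)
    return ys

# Toy setup (assumed): 1-d inputs and outputs, 2-d hidden states.
enc_Wx, enc_Wh = [[1.0], [-1.0]], [[0.0, 0.0], [0.0, 0.0]]
dec_Wx, dec_Wh = [[0.5], [0.5]], [[1.0, 0.0], [0.0, 1.0]]
Wout = [[1.0, -1.0]]

h_T = encode([[0.2], [0.7]], enc_Wx, enc_Wh, hidden_size=2)
ys = decode(h_T, 3, dec_Wx, dec_Wh, Wout, start=[0.0])
```

Everything the decoder knows about the input has to fit into the fixed-size vector `h_T`, which is one way to see why this approach struggles as sequences get longer.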
  • 18. Sequence to Sequence Model: Attention Decoder. [Figure: a bidirectional GRU encoder feeding an attention GRU decoder through a context vector $c_t$.] For each GRU cell that makes up the decoder, let’s pass the encoder’s information differently!
      $h_i = [\overrightarrow{h}_i ; \overleftarrow{h}_i]$
      $c_i = \sum_{j=1}^{T_x} \alpha_{ij} h_j$
      $s_i = f(s_{i-1}, y_{i-1}, c_i) = (1 - z_i) * s_{i-1} + z_i * \tilde{s}_i$
      $z_i = \sigma(W_z y_{i-1} + U_z s_{i-1})$
      $r_i = \sigma(W_r y_{i-1} + U_r s_{i-1})$
      $\tilde{s}_i = \tanh(y_{i-1} + U [r_i * s_{i-1}] + C c_i)$
      $\alpha_{ij} = \dfrac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}$
      $e_{ij} = v_a^T \tanh(W_a s_{i-1} + U_a h_j)$
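The attention weights and context vector on this slide can be sketched as follows. The code computes the scores $e_{ij}$, normalises them with a softmax to get $\alpha_{ij}$, and forms the context vector $c_i$ as a weighted sum of encoder states. The function name `attention_context` and all toy values are assumptions for this example; subtracting the maximum score before exponentiating is a common numerical-stability trick that does not change the result.

```python
import math

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def attention_context(s_prev, hs, W_a, U_a, v_a):
    """Return (alphas, context) for decoder state s_{i-1} and encoder states hs."""
    Ws = matvec(W_a, s_prev)
    scores = []
    for h_j in hs:
        Uh = matvec(U_a, h_j)
        # e_ij = v_a^T tanh(W_a s_{i-1} + U_a h_j)
        scores.append(sum(v * math.tanh(a + b) for v, a, b in zip(v_a, Ws, Uh)))
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(e - m) for e in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]  # softmax over encoder positions
    # c_i = sum_j alpha_ij * h_j
    context = [sum(a * h_j[k] for a, h_j in zip(alphas, hs))
               for k in range(len(hs[0]))]
    return alphas, context

# Toy encoder states and decoder state (assumed values).
hs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
s_prev = [0.5, -0.5]
zeros = [[0.0, 0.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
v_a = [1.0, 1.0]

# With zero weights every score is equal, so attention is uniform.
alphas_uniform, context_uniform = attention_context(s_prev, hs, zeros, zeros, v_a)
# With U_a = I the third encoder state scores highest and gets the most weight.
alphas_peaked, context_peaked = attention_context(s_prev, hs, zeros, identity, v_a)
```

Because the weights are recomputed for every decoder step from $s_{i-1}$, each decoder cell receives its own mixture of encoder states rather than a single fixed summary.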
  • 19. Sequence to Sequence Model: Example codes. Codes here @ GitHub