Multi-Task Learning for NLP
2017/04/17 Parsing Group
Motoki Sato
What is Multi-task?
• Single task: one model per task
    Model 1: Input (sentence) -> POS (task 1)
    Model 2: Input (sentence) -> Chunking (task 2)
• Multi task: one model for all tasks
    Model: Input (sentence) -> POS (task 1) and Chunking (task 2)
Multi-task learning Paper (1)
• (Søgaard, 2016), ACL 2016 short paper.
• Tasks:
–  POS (low level task)
–  Chunking (high level task)
Multi-task learning Paper (2)
• (Hashimoto, 2016), arXiv preprint.
• Tasks (many tasks):
–  POS, Chunking, Dependency parsing,
–  Semantic relatedness, Textual entailment
Dataset
Task                   (Søgaard, 2016)   (Hashimoto, 2016)
POS                    Penn Treebank     Penn Treebank
Chunking               Penn Treebank     Penn Treebank
CCG                    Penn Treebank     -
Dependency parsing     -                 Penn Treebank
Semantic relatedness   -                 SICK
Textual entailment     -                 SICK
(Søgaard, 2016)
Task       Level
POS        Low-level task
Chunking   High-level task
CCG        High-level task
[Figure: examples of input words and predicted tags]
Multi-task for Vision?
• Cha Zhang, et al. “Improving Multiview Face Detection with Multi-Task Deep Convolutional Neural Networks”
[Figure: the tasks share hidden layers (shared representation)]
Multi-task for NLP?
• Collobert, et al. “Natural Language Processing (Almost) from Scratch”
[Figure: the hidden layers are shared; each task has its own individual layer]
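The sharing scheme on this slide (shared hidden layers, one individual layer per task) can be sketched roughly as below. This is a minimal illustration, not the architecture of any of the cited papers: the class name `SharedMultiTaskModel`, the layer sizes, and the input vector are all assumptions made for the example.

```python
import math
import random

def dense(weights, bias, x):
    """Affine map: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def init(rows, cols, rng):
    """Small random weight matrix of shape rows x cols."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

class SharedMultiTaskModel:
    """One shared hidden layer feeding a separate output layer per task (hypothetical)."""

    def __init__(self, n_in, n_hidden, task_sizes, seed=0):
        rng = random.Random(seed)
        # Shared parameters: every task's training signal would update these.
        self.w_shared = init(n_hidden, n_in, rng)
        self.b_shared = [0.0] * n_hidden
        # Task-specific parameters: one individual output layer per task.
        self.heads = {
            task: (init(n_out, n_hidden, rng), [0.0] * n_out)
            for task, n_out in task_sizes.items()
        }

    def hidden(self, x):
        """Shared representation used by all tasks."""
        return [math.tanh(v) for v in dense(self.w_shared, self.b_shared, x)]

    def scores(self, x, task):
        """Unnormalized tag scores for one task, computed from the shared representation."""
        w, b = self.heads[task]
        return dense(w, b, self.hidden(x))

model = SharedMultiTaskModel(n_in=4, n_hidden=3, task_sizes={"pos": 5, "chunk": 2})
x = [0.5, -1.0, 0.25, 0.0]  # hypothetical feature vector for one word
pos_scores = model.scores(x, "pos")
chunk_scores = model.scores(x, "chunk")
```

Both calls to `scores` go through the same `hidden` function, which is the point of the design: only the last layer differs per task.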
(Søgaard, 2016) Outermost version
[Figure: three stacked Bi-LSTM layers (1st to 3rd) over the input words w0 w1 w2 w3 …; both the POS tag and the Chunk tag are predicted from the 3rd (outermost) layer at each word]
Previous multi-task learning shared the hidden layers, with every task predicted from the outermost layer.
(Søgaard, 2016) Lower-layer version
[Figure: three stacked Bi-LSTM layers (1st to 3rd) over the input words w0 w1 w2 w3 …; the Chunk tag is predicted from the 3rd (outermost) layer and the POS tag from the 1st (innermost) layer]
Previous multi-task learning shared the hidden layers; here the low-level task (POS) is supervised at a lower layer.
Experiments
[Table: results on the low-level task and the high-level task, single task vs. multi task]
It is consistently better to have POS supervision at
the innermost rather than the outermost layer.
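The difference between the two variants is only where each task reads its features from. The sketch below makes that explicit under strong simplifications: `context_layer` is a toy stand-in for a Bi-LSTM layer (it just mixes each vector with its neighbours), and the word vectors are made-up values.

```python
import math

def context_layer(seq):
    """Toy stand-in for a Bi-LSTM layer: mixes each vector with its left and right neighbours."""
    out = []
    for t, vec in enumerate(seq):
        left = seq[t - 1] if t > 0 else [0.0] * len(vec)
        right = seq[t + 1] if t + 1 < len(seq) else [0.0] * len(vec)
        out.append([math.tanh(v + 0.5 * l + 0.5 * r) for v, l, r in zip(vec, left, right)])
    return out

def run_stack(seq, n_layers=3):
    """Return the output of every layer, innermost (1st) first."""
    outputs = []
    for _ in range(n_layers):
        seq = context_layer(seq)
        outputs.append(seq)
    return outputs

# Which layer each task's tag classifier reads from (1-based layer index).
OUTERMOST = {"pos": 3, "chunk": 3}    # both tasks supervised at the top layer
LOWER_LAYER = {"pos": 1, "chunk": 3}  # POS supervised at the innermost layer

def task_inputs(seq, supervision):
    """Pick, per task, the layer outputs that its tag classifier would consume."""
    layers = run_stack(seq)
    return {task: layers[idx - 1] for task, idx in supervision.items()}

words = [[0.1, 0.2], [0.0, -0.3], [0.4, 0.1], [-0.2, 0.5]]  # hypothetical embeddings for w0..w3
outer = task_inputs(words, OUTERMOST)
lower = task_inputs(words, LOWER_LAYER)
```

In the lower-layer setting the POS loss would only reach the 1st layer's parameters, while the chunking loss flows through all three layers.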
(Søgaard, 2016) Domain Adaptation
• What is domain adaptation?
[Figure: a model trained on a source domain (e.g., news) is adapted to a target domain (e.g., Twitter)]
(Søgaard, 2016) Source Training
[Figure: source domain (WSJ newswire). Three stacked Bi-LSTM layers over w0 w1 w2 w3 …; the Chunk tag is trained at the 3rd layer and the POS tag at the 1st layer]
(Søgaard, 2016) Target Training
[Figure: target domain (broadcast, weblogs). Same three-layer Bi-LSTM stack; POS is re-trained at the 1st layer on the target domain, and there is no Chunk training on the target domain]
Domain Adaptation Experiments
High-level task supervision in the source domain,
lower-level task supervision in the target domain.
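One way to read this setup is as a two-phase schedule: both tasks are trained on the source domain, then only the low-level task is re-trained on the target domain. The sketch below only enumerates the update steps; the batch names and the `training_schedule` helper are hypothetical, and the actual parameter updates are left out.

```python
# Which tasks have labelled data in each domain, following the slides:
# both tasks in the source domain, only POS in the target domain.
SUPERVISION = {
    "source": ("pos", "chunk"),
    "target": ("pos",),
}

def training_schedule(source_batches, target_batches):
    """Yield (domain, task, batch) update steps: source phase first, then target phase."""
    for domain, batches in (("source", source_batches), ("target", target_batches)):
        for batch in batches:
            for task in SUPERVISION[domain]:
                yield domain, task, batch

# Hypothetical batch identifiers standing in for real mini-batches.
steps = list(training_schedule(["wsj-0", "wsj-1"], ["weblog-0"]))
```

Because the chunking classifier sits on top of layers that the POS task also trains, re-training POS on target data can shift the shared lower layer toward the target domain without any target-domain chunk labels.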
(Hashimoto, 2016)
Training Loss for Multi Task Learning
• In (Hashimoto, 2016), the training loss of each task includes:
–  an L2-norm regularization term
–  a term that refers to the embedding parameters as they were after training the final task in the top-most layer at the previous training epoch.
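The exact formula is not preserved in this transcript, so the following is only a sketch of a loss with that shape: a task loss, plus L2 weight decay, plus a penalty on how far the embedding parameters have moved from their values at the previous epoch. The squared-L2 form of both penalties, the function name, and the coefficients `lam` and `delta` are assumptions.

```python
def squared_l2(values):
    """Sum of squares of a flat list of parameters."""
    return sum(v * v for v in values)

def regularized_loss(nll, task_weights, embed, embed_prev, lam=1e-4, delta=1e-2):
    """Task loss + L2 weight decay + penalty for drifting from the previous embeddings.

    nll          -- negative log-likelihood of the current task (already computed)
    task_weights -- flat list of the current task's weights
    embed        -- flat list of the current embedding parameters
    embed_prev   -- the same parameters as saved at the previous training epoch
    lam, delta   -- hypothetical regularization strengths
    """
    l2_term = lam * squared_l2(task_weights)
    drift = [a - b for a, b in zip(embed, embed_prev)]
    drift_term = delta * squared_l2(drift)
    return nll + l2_term + drift_term

loss = regularized_loss(
    nll=2.0,
    task_weights=[1.0, -2.0],
    embed=[0.5, 0.5],
    embed_prev=[0.5, 1.5],
    lam=0.1,
    delta=0.5,
)
```

The drift penalty is what ties this slide to catastrophic forgetting: it discourages the current task from overwriting what the shared parameters learned for the other tasks.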
Dataset
Task                   (Søgaard, 2016)   (Hashimoto, 2016)
POS                    Penn Treebank     Penn Treebank
Chunking               Penn Treebank     Penn Treebank
CCG                    Penn Treebank     -
Dependency parsing     -                 Penn Treebank
Semantic relatedness   -                 SICK
Textual entailment     -                 SICK
Since (Søgaard, 2016) uses the same dataset (same input) for all tasks, the multi-task loss can be the sum of the per-task losses.
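When every task is annotated on the same sentences, summing the losses is straightforward, as in the sketch below. The tag distributions and the two-word sentence are invented for illustration, and per-word cross-entropy is assumed as the per-task loss.

```python
import math

def cross_entropy(probs, gold):
    """Negative log-probability of the gold tag at each word, summed over the sentence."""
    return -sum(math.log(p[g]) for p, g in zip(probs, gold))

def joint_loss(predictions, gold_tags):
    """Sum of the per-task losses; needs gold tags for every task on the same sentence."""
    return sum(cross_entropy(predictions[task], gold_tags[task]) for task in gold_tags)

# Hypothetical two-word sentence with predicted tag distributions for each task.
predictions = {
    "pos": [{"DT": 0.5, "NN": 0.5}, {"DT": 0.25, "NN": 0.75}],
    "chunk": [{"B-NP": 0.5, "I-NP": 0.5}, {"B-NP": 0.5, "I-NP": 0.5}],
}
gold_tags = {"pos": ["DT", "NN"], "chunk": ["B-NP", "I-NP"]}
total = joint_loss(predictions, gold_tags)
```

With tasks drawn from different datasets, as in the (Hashimoto, 2016) column of the table, a sentence carries labels for only some tasks, so a single summed loss per sentence is not available in the same way.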
Catastrophic Forgetting
• “Overcoming Catastrophic Forgetting in Neural Networks”, James Kirkpatrick, Raia Hadsell, et al. https://arxiv.org/abs/1612.00796
• https://theneuralperspective.com/2017/04/01/overcoming-catastrophic-forgetting-in-neural-networks/
