Machine Translation
An introduction
Shu 2016.5
1
Parts of these slides are borrowed from the slides of Philipp Koehn
(www.statmt.org)
2
3
Agenda
• Overview of the history
• Statistical machine translation
• Recent developments in SMT
• Neural machine translation
• Some problems of NMT
• Future of MT
What to translate
4
The dream of machine translation
5
Approaches of MT
6
The history of machine translation
• 1629
- René Descartes proposed a universal language
- Different tongues share one set of symbols
• 1947
- The transistor is invented, later replacing vacuum tubes in computers
• 1949 ~
- Rule-based machine translation
• 1954
- First demo by IBM
• 1993 ~
- Statistical machine translation
• 2013 ~
- Neural machine translation
7
Rule-based translation systems
• Translation rules are written by linguists
• Hard to maintain or update
• Performance is still at (or near) the state of the art
8
Statistical machine translation
• Translation models are learned from a parallel corpus
• Language independent
9
10
Agenda
• Overview of the history
• Statistical machine translation
• Recent developments in SMT
• Neural machine translation
• Some problems of NMT
• Future of MT
Statistical machine translation
11
For people who don’t like equations
12
A common pipeline of SMT
13
Alignment
Neural re-ranking
Evaluation of SMT
• BLEU
- n-gram matching (usually up to 4-grams)
• NIST
- Content words are more important
• RIBES (Isozaki et al., 2010)
- Order is also important
- Better for SVO-to-SOV language pairs
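The n-gram matching behind BLEU can be sketched as follows. This is a minimal, unsmoothed, single-reference sentence-level version for illustration only; the function names are assumptions, and real evaluations use corpus-level BLEU from an established tool.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Unsmoothed, single-reference BLEU for one tokenized sentence pair."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        total = sum(cand.values())
        # Clipped counts: a candidate n-gram is credited at most as often
        # as it occurs in the reference.
        matched = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or matched == 0:
            return 0.0
        log_precisions.append(math.log(matched / total))
    # The brevity penalty punishes candidates shorter than the reference.
    if len(candidate) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    # Geometric mean of the 1-gram to max_n-gram precisions.
    return bp * math.exp(sum(log_precisions) / max_n)
```

Without smoothing, any sentence with no matching 4-gram scores zero, which is one reason BLEU is normally reported over a whole test set.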
14
BLEU score matrix
15
A brief history of the development of SMT
• 1990 ~ 2000
- Word-based models (IBM models)
- Brown, Och, Ney.
• 2003
- Phrase-based models
- Philipp Koehn
• 2005, 2007
- Hierarchical Phrase-based models
- David Chiang
• 2010 ~
- Tree models, factored models
16
Language model
• Modelling p(the dog is barking)
- In order to know which candidate is more natural
• Markov Assumption
• 5-gram models are the most common choice in SMT
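A minimal sketch of the Markov assumption, using a bigram model (each word depends only on the previous one) with maximum-likelihood estimates and no smoothing. The toy corpus and function names are assumptions made for illustration.

```python
from collections import Counter


def train_bigram(sentences):
    """Count contexts and bigrams over tokenized sentences with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence + ["</s>"]
        unigrams.update(tokens[:-1])            # every token that acts as a context
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def sentence_probability(sentence, unigrams, bigrams):
    """p(sentence) as a product of p(word | previous word)."""
    tokens = ["<s>"] + sentence + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        if unigrams[prev] == 0:
            return 0.0                          # unseen context, no smoothing
        prob *= bigrams[(prev, word)] / unigrams[prev]
    return prob


# Hypothetical three-sentence training corpus.
corpus = [
    "the dog is barking".split(),
    "the dog is sleeping".split(),
    "the cat is sleeping".split(),
]
unigrams, bigrams = train_bigram(corpus)
p_natural = sentence_probability("the dog is barking".split(), unigrams, bigrams)
p_scrambled = sentence_probability("barking dog the is".split(), unigrams, bigrams)
```

The natural word order receives a higher probability than the scrambled one, which is how the language model ranks translation candidates. Practical systems add smoothing so unseen n-grams do not zero out the whole sentence.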
17
Bi-gram example
18
Parallel corpus
19
Word alignments
20
Word alignments in the matrix
21
How to get word alignments
• In short
- Run GIZA++ on the parallel corpus
- Wait for 5 hours
• Technically
- 5 IBM models, HMM models, EM algorithm
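As a rough illustration of the EM idea, here is a sketch of IBM Model 1 only, the simplest of the five models. It omits the NULL word and is not what GIZA++ actually runs end to end; the toy sentence pairs and names are assumptions.

```python
from collections import defaultdict


def ibm_model1(pairs, iterations=10):
    """Estimate t(f | e) with EM from (source_tokens, target_tokens) pairs."""
    target_vocab = {f for _, fs in pairs for f in fs}
    # Start from a uniform translation table.
    t = defaultdict(lambda: 1.0 / len(target_vocab))
    for _ in range(iterations):
        count = defaultdict(float)   # expected count of (f, e) links
        total = defaultdict(float)   # expected count of e being linked
        for es, fs in pairs:
            for f in fs:
                # E-step: spread one unit of probability mass for f over
                # all source words in the same sentence pair.
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    delta = t[(f, e)] / norm
                    count[(f, e)] += delta
                    total[e] += delta
        # M-step: renormalize expected counts into probabilities.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t


# Hypothetical toy parallel corpus (English source, German target).
pairs = [
    ("the house".split(), "das Haus".split()),
    ("the book".split(), "das Buch".split()),
    ("a book".split(), "ein Buch".split()),
]
t = ibm_model1(pairs)
```

After a few iterations the table concentrates on the consistent pairs such as ("das", "the") and ("Buch", "book"), even though no alignment was ever given.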
22
Run the EM algorithm
23
Run the EM algorithm
24
Run the EM algorithm
25
Run the EM algorithm
26
10 years of the work
Phrase-based translation model
27
He goes to the curry restaurant
Group into phrases
[He] [goes] [to] [the curry restaurant]
Translate
彼は 行く に カレー屋
Reorder
彼は カレー屋 に 行く
Extract phrase table
28
Word alignments Phrase table
Decoding
• In short
- Run Moses
- Wait for 2 days
• Technically
- (1) Load all the translation rules
- (2) Search for the best hypothesis
29
Load all the translation rules
30
Search for the best hypothesis
• Beam search / cube pruning
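A minimal sketch of beam search, assuming monotone decoding (no reordering) and a toy scoring function. The phrase table, the bigram scores, and all names here are hypothetical; a real decoder such as Moses also tracks source coverage and future cost.

```python
import math


def beam_search(source_phrases, phrase_table, lm_score, beam_size=2):
    """Translate phrases left to right, keeping only the best hypotheses."""
    beam = [(0.0, [])]  # (log score, target words so far)
    for phrase in source_phrases:
        expanded = []
        for score, words in beam:
            for target, prob in phrase_table[phrase]:
                prev = words[-1] if words else "<s>"
                new_score = score + math.log(prob) + lm_score(prev, target)
                expanded.append((new_score, words + [target]))
        # Pruning: keep only the `beam_size` highest-scoring hypotheses.
        beam = sorted(expanded, key=lambda h: h[0], reverse=True)[:beam_size]
    return beam[0]


def toy_lm(prev, word):
    """Hypothetical bigram log-probabilities; unseen bigrams get a low score."""
    table = {("<s>", "er"): 0.5, ("er", "geht"): 0.6}
    return math.log(table.get((prev, word), 0.05))


# Hypothetical phrase table: source phrase -> [(target phrase, probability)].
phrase_table = {
    "he": [("er", 0.6), ("ihn", 0.4)],
    "goes": [("geht", 0.7), ("gehen", 0.3)],
}
best_score, best_words = beam_search(["he", "goes"], phrase_table, toy_lm)
```

The beam size trades speed for search quality: a beam of 1 is greedy search, while a larger beam keeps hypotheses that look worse now but may score better once the language model sees more context.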
31
Hierarchical phrase-based models
• Allow phrases to have gaps
32
Hard problems of MT
• Word order
• Word sense
• Pronouns
• Tense
• Idioms
33
Word order
34
Word sense ambiguity
35
Problem of pronouns
36
Different tenses
• Past tense vs. present tense
• Grammar discrepancy
37
Idioms
38
Resources of SMT
• Parallel corpus
- LDC data
  - www.ldc.upenn.edu
- Europarl corpus
  - Danish, Dutch, English, Finnish, French, German, Greek, Italian, Portuguese, Spanish, Swedish
- Japanese
  - NTCIR-8 (3M), ASPEC (3M)
• Word alignment software
- GIZA++, Berkeley aligner
• Language modelling
- SRILM, Berkeley LM, KenLM
• Decoder
- Moses (maintained by the group of Koehn)
- Travatar (Graham Neubig)
39
40
Agenda
• Overview of the history
• Statistical machine translation
• Recent developments in SMT
• Neural machine translation
• Some problems of NMT
• Future of MT
Recent developments of SMT
• Advances in decoders
• Super-large-scale language model
- language model compression
• Margin Infused Relaxed Algorithm (MIRA)
- tune the feature weights of the model in a smart way
• Tree models
- Tree-to-Tree translation
- String-to-Tree translation
- Tree-to-String translation
- Forest-to-String translation *
- Robust to parsing errors
• Factored models
• Pre-reordering
41
What is a parse tree
42
Context-free grammar Dependency grammar
Tree-to-string translation models
43
• Translate source code to comment
Pre-reordering phrase-based translation model
44
He goes to the curry restaurant
Pre-reordering
He the curry restaurant to goes
Group into phrases
[He] [the curry restaurant] [to] [goes]
Translate
彼は カレー屋 に 行く
Example of pre-reordering
45
Original input: the improvement of the life is a large problem of the practical application.
[Figure: restructured parse tree]
Reordered input: the life of the improvement va_nsubjpass the practical application of a large problem is .
Reference: 寿命 の 向上 が 実用 化 の 大きな 課題 で あ る 。
A summary of SMT
46
47
Agenda
• Overview of the history
• Statistical machine translation
• Recent developments in SMT
• Neural machine translation
• Some problems of NMT
• Future of MT
Problem of conventional SMT
• Under-fitting (non-parametric approach)
• Solution:
- Deep recurrent neural networks
48
Application of neural networks in MT
49
High computational complexity
50
High computational complexity
51
• Try AdaGrad, AdaDelta, Adam in the first place
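For reference, the Adam update rule can be sketched on a toy one-dimensional problem. The objective, the learning rate, and the step count below are assumptions chosen for illustration, not settings from the experiments.

```python
import math


def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a 1-D function from its gradient using the Adam update rule."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for early steps
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = (x - 3)^2, whose gradient is 2 (x - 3).
x_min = adam_minimize(lambda x: 2 * (x - 3), x0=0.0)
```

Dividing by the running gradient magnitude gives each parameter its own effective step size, which is why these methods usually need less learning-rate tuning than plain SGD.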
Neural machine translation
• encoder-decoder approach
52
Multi-layer encoder-decoder model
Performance drop
Soft-attention mechanism
‣ make a weighted summary
53
soft-attention model
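The "weighted summary" can be sketched as follows. This version scores each source position with a plain dot product, which is an assumption made for brevity; attention models for NMT often use a small learned network for the score instead.

```python
import math


def soft_attention(query, encoder_states):
    """Return (weights, context) for one decoder step.

    weights: softmax over source positions of the dot-product scores.
    context: the weighted sum of the encoder states.
    """
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    # Numerically stable softmax over source positions.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(encoder_states[0])
    context = [
        sum(w * state[d] for w, state in zip(weights, encoder_states))
        for d in range(dim)
    ]
    return weights, context
```

Because the weights are a softmax, every source word contributes a little to the summary, and the whole operation stays differentiable so it can be trained with the rest of the network.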
Visualization of learned representation
54
Experiments in WAT2015
55
Evaluation result: human evaluation scores
56
Evaluation result: evaluation scores
57
System                                                                 BLEU   RIBES  HUMAN  JPO
Baseline phrase-based SMT                                              29.80  0.691  -      -
Baseline hierarchical phrase-based SMT                                 32.56  0.746  -      -
Baseline Tree-to-string SMT                                            33.44  0.758  30.00  -
Submitted system 1 (NMT)                                               34.19  0.802  43.50  -
Submitted system 2 (NMT + System combination)                          36.21  0.809  53.75  3.81
Best competitor 1: NAIST (Travatar System with NeuralMT Reranking)     38.17  0.813  62.25  4.04
Best competitor 2: naver (SMT t2s + Spell correction + NMT reranking)  36.14  0.803  53.25  4.00
(Optional) Findings & insights
‣ Soft-attention models outperform multi-layer
encoder-decoder models
‣ Training models on pre-reordered data hurts
the performance
‣ NMT models tend to make grammatically
valid but incomplete translations
58
59
Agenda
• Overview of the history
• Statistical machine translation
• Recent developments in SMT
• Neural machine translation
• Some problems of NMT
• Future of MT
Can’t use monolingual data
• Deep fusion (Gulcehre et al., 2015)
• Integrate a neural language model trained on massive
monolingual corpus
60
The attention mechanism is not perfect
• Local search (Luong et al., 2015)
61
Local search model
Global search model
The attention mechanism is not perfect
• Input feeding
62
Translation does not cover all the words
• Coverage-based NMT model (Tu et al., 2016)
63
Objective function is bad
• Cross-entropy is too different from BLEU
• Solutions:
- (1) Data as demonstrator (Bengio et al., 2015)
64
Objective function is bad (cont.)
• Cross-entropy is too different from BLEU
• Solutions:
- (2) Mixed REINFORCE (Ranzato et al., 2016)
65
Objective function is bad (cont.)
• Cross-entropy is too different from BLEU
• Solutions:
- (3) Minimum Risk Training (Shen et al., 2015)
66
Objective of MRT
6 BLEU gain in Chinese-English task
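The idea behind the MRT objective can be sketched as an expected risk over a set of sampled candidate translations. The function name, the default sharpness value, and the use of 1 - BLEU as the risk are assumptions for illustration.

```python
import math


def expected_risk(log_probs, risks, alpha=1.0):
    """Expected risk over sampled candidates.

    log_probs: model log-probability of each candidate translation.
    risks:     loss of each candidate against the reference, e.g. 1 - BLEU.
    alpha:     sharpness of the renormalized distribution (assumed default).
    """
    # Q(y) is proportional to p(y)^alpha, renormalized over the candidate set.
    scaled = [alpha * lp for lp in log_probs]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    z = sum(exps)
    q = [e / z for e in exps]
    return sum(qi * ri for qi, ri in zip(q, risks))
```

Minimizing this quantity moves probability mass toward candidates with low risk, so training optimizes the evaluation metric directly instead of cross-entropy.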
Large vocabulary problem
• The problem
- English vocab. has 700K words
- So I set the size of output layer to 700K
- Then I get memory error
• Solutions
- I still want to use 700K vocab.
- Noise-contrastive estimation (Gutmann and Hyvarinen, 2010)
- Clustering (Mikolov et al., 2013)
- Approximate Learning Approach (Jean et al., 2015)
- I give up, cut it to 80K vocab. and recover <UNK> tokens
- Positional unknown model (Luong et al., 2015)
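The "give up" route can be sketched in two steps: truncate the vocabulary and map rare words to <UNK>, then repair <UNK> tokens in the output from the aligned source word. This is a simplified illustration of the idea, not the published positional unknown model; the alignment format, the dictionary, and all names are hypothetical.

```python
from collections import Counter


def build_vocab(sentences, max_size):
    """Keep the `max_size` most frequent words."""
    counts = Counter(word for sentence in sentences for word in sentence)
    return {word for word, _ in counts.most_common(max_size)}


def replace_rare(sentence, vocab):
    """Map every out-of-vocabulary word to the <UNK> token."""
    return [word if word in vocab else "<UNK>" for word in sentence]


def recover_unk(target, source, alignment, dictionary):
    """Replace each <UNK> using its aligned source word.

    alignment maps a target position to a source position (assumed given).
    The source word is translated with the dictionary if possible and
    copied verbatim otherwise, which works well for names and numbers.
    """
    output = []
    for i, word in enumerate(target):
        if word == "<UNK>" and i in alignment:
            src = source[alignment[i]]
            word = dictionary.get(src, src)
        output.append(word)
    return output
```

For example, with the source "he visits Tokyo", the output "er besucht <UNK>", and the last target word aligned to the last source word, the <UNK> is replaced by "Tokyo".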
67
68
Agenda
• Overview of the history
• Statistical machine translation
• Recent developments in SMT
• Neural machine translation
• Some problems of NMT
• Future of MT
Future of MT
• Semantic preserving translation
• Character/sub-word level models
• Translation in context
• Low-resource translation
- Knowledge transfer
- Multilingual translation
69
Multilingual seq-to-seq model
70
Modality agnostic space
71
Beyond translation: Image/Video Caption Generation
72
Beyond translation: Image/Video Caption Generation
73
Thanks.
74

Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 

Machine Translation Introduction

  • 2. Parts of these slides are stolen from Koehn's slides (www.statmt.org)
  • 3. Agenda • Overview of the history • Statistical machine translation • Recent developments in SMT • Neural machine translation • Some problems of NMT • Future of MT
  • 5. The dream of machine translation
  • 7. The history of machine translation • 1629 - Universal language proposed by René Descartes - Different tongues share one set of symbols • 1947 - First computer used transistors instead of vacuum tubes • 1949 ~ - Rule-based machine translation • 1954 - First demo by IBM • 1993 ~ - Statistical machine translation • 2013 ~ - Neural machine translation
  • 8. Rule-based translation systems • Translation rules created by linguistics experts • Hard to maintain or update • The performance is still (or almost) state-of-the-art
  • 9. Statistical machine translation • Translation models are learned from a parallel corpus • Language independent
  • 10. Agenda • Overview of the history • Statistical machine translation • Recent developments in SMT • Neural machine translation • Some problems of NMT • Future of MT
  • 12. For people who don’t like equations
  • 13. A common pipeline of SMT • Alignment • Neural re-ranking
  • 14. Evaluation of SMT • BLEU - n-gram matching (usually up to 4-grams) • NIST - Content words are more important • RIBES (Hideki Isozaki, 2010) - Order is also important - Better for SVO-to-SOV language pairs
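The n-gram matching behind BLEU can be sketched in a few lines. This is a minimal illustration that assumes a single reference, whitespace-tokenized input, and no smoothing; the function names `ngrams` and `sentence_bleu` are made up for this sketch and are not taken from any toolkit.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(candidate, reference, max_n=4):
    """Unsmoothed single-reference BLEU for one sentence (illustrative only)."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, one empty order zeroes the score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: punish candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / c)
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)

hyp = "the dog is barking loudly".split()
ref = "the dog is barking".split()
print(sentence_bleu(hyp, ref))
```

In practice BLEU is computed at the corpus level and usually with smoothing, so scores from this sketch are not comparable to published numbers.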
  • 16. A brief history of the development of SMT • 1990 ~ 2000 - Word-based models (IBM models) - Brown, Och, Ney • 2003 - Phrase-based models - Philipp Koehn • 2005; 2007 - Hierarchical phrase-based models - David Chiang • 2010 ~ - Tree models, Factor models
  • 17. Language model • Modelling p(the dog is barking) - In order to know which candidate is more natural • Markov assumption • A 5-gram model is most commonly used in SMT
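Under the Markov assumption, the probability of a sentence factors into a product of short-history terms. Below is a minimal bigram sketch with maximum-likelihood estimates and no smoothing; the three-sentence corpus and the function names are invented for illustration.

```python
from collections import Counter

def train_bigram(sentences):
    """Collect history counts and bigram counts, with sentence boundary markers."""
    histories, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        histories.update(tokens[:-1])            # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent token pairs
    return histories, bigrams

def sentence_prob(sentence, histories, bigrams):
    """p(w_1 .. w_n) as a product of p(w_i | w_{i-1}), maximum likelihood."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    p = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        if histories[prev] == 0:
            return 0.0  # unseen history: no estimate without smoothing
        p *= bigrams[(prev, word)] / histories[prev]
    return p

# Toy corpus, assumed for this example only.
corpus = ["the dog is barking", "the dog is sleeping", "the cat is sleeping"]
histories, bigrams = train_bigram(corpus)

natural = sentence_prob("the dog is barking", histories, bigrams)
scrambled = sentence_prob("dog the is barking", histories, bigrams)
print(natural, scrambled)
```

The natural word order gets a non-zero probability and the scrambled one gets zero, which is the signal a decoder uses to rank candidates. Real SMT language models use higher orders and smoothing so that unseen n-grams do not zero out the whole sentence.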
  • 21. Word alignments in a matrix
  • 22. How to get word alignments • In short - Run GIZA++ on the parallel corpus - Wait for 5 hours • Technically - 5 IBM models, HMM models, EM algorithm
  • 23. Run the EM algorithm
  • 24. Run the EM algorithm
  • 25. Run the EM algorithm
  • 26. Run the EM algorithm • 10 years of work
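The EM loop on these slides can be sketched with IBM Model 1, the simplest of the five IBM models. This is a rough sketch that ignores the NULL word and everything GIZA++ adds on top (later IBM models, HMM alignment); the function name and the three-sentence German-English toy corpus are assumptions made for illustration.

```python
from collections import defaultdict

def ibm_model1(pairs, iterations=10):
    """Estimate t[(target_word, source_word)] = p(target | source) with EM.

    pairs: list of (source_tokens, target_tokens) sentence pairs.
    """
    target_vocab = {e for _, tgt in pairs for e in tgt}
    # Start from a uniform translation table.
    t = defaultdict(lambda: 1.0 / len(target_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (target, source) links
        total = defaultdict(float)  # expected count of each source word
        # E-step: distribute each target word over the source words it may align to.
        for src, tgt in pairs:
            for e in tgt:
                z = sum(t[(e, f)] for f in src)
                for f in src:
                    c = t[(e, f)] / z
                    count[(e, f)] += c
                    total[f] += c
        # M-step: renormalize the expected counts into probabilities.
        t = defaultdict(float, {(e, f): c / total[f] for (e, f), c in count.items()})
    return t

pairs = [
    ("das Haus".split(), "the house".split()),
    ("das Buch".split(), "the book".split()),
    ("ein Buch".split(), "a book".split()),
]
t = ibm_model1(pairs)
for e in ("the", "house", "book"):
    print(e, "| das :", round(t[(e, "das")], 3))
```

Each iteration moves probability mass toward word pairs that co-occur consistently, so "das" ends up preferring "the" over "house" or "book" even though no alignments were given.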
  • 27. Phrase-based translation model • Input: He goes to the curry restaurant • Group into phrases: He goes to the curry restaurant • Translate: 彼は 行く に カレー屋 • Reorder: 彼は 行くにカレー屋
  • 28. Extract phrase table • Word alignments → Phrase table
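Going from word alignments to a phrase table relies on extracting phrase pairs that are consistent with the alignment: no word inside the pair may be aligned to a word outside it. The sketch below shows only that core check and leaves out the usual extension over unaligned words and the probability estimation; `extract_phrases` and the toy alignment are hypothetical.

```python
def extract_phrases(src, tgt, alignment, max_len=3):
    """Return all phrase pairs consistent with a word alignment.

    alignment: set of (source_index, target_index) links.
    """
    pairs = set()
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            # Target positions linked to the source span [i1, i2].
            linked = [j for i, j in alignment if i1 <= i <= i2]
            if not linked:
                continue
            j1, j2 = min(linked), max(linked)
            if j2 - j1 + 1 > max_len:
                continue
            # Consistency: nothing in the target span links outside the source span.
            if any(j1 <= j <= j2 and not (i1 <= i <= i2) for i, j in alignment):
                continue
            pairs.add((" ".join(src[i1:i2 + 1]), " ".join(tgt[j1:j2 + 1])))
    return pairs

src = "das Haus".split()
tgt = "the house".split()
print(sorted(extract_phrases(src, tgt, {(0, 0), (1, 1)})))
```

A real phrase table additionally stores scores for each pair, typically relative frequencies in both directions estimated over the whole corpus.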
  • 29. Decoding • In short - Run Moses - Wait for 2 days • Technically - (1) Load all the translation rules - (2) Search for the best hypothesis
  • 30. Load all the translation rules
  • 31. Search for the best hypothesis • Beam search / Cube search
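Beam search keeps only the few best partial hypotheses at each step instead of exploring every combination of translation rules. The sketch below is a toy monotone decoder: it has no reordering, no language model, and no future-cost estimate, all of which a real decoder such as Moses relies on. The phrase table, its probabilities, and the function names are invented for illustration.

```python
import heapq
import math

def beam_search(start, expand, is_final, beam_size=3, max_steps=10):
    """Generic beam search over hypotheses scored by summed log probabilities."""
    beam = [(0.0, start)]
    finished = []
    for _ in range(max_steps):
        candidates = []
        for score, hyp in beam:
            if is_final(hyp):
                finished.append((score, hyp))  # complete: set aside, do not expand
                continue
            for nxt, logp in expand(hyp):
                candidates.append((score + logp, nxt))
        if not candidates:
            break
        # Prune: keep only the beam_size highest-scoring hypotheses.
        beam = heapq.nlargest(beam_size, candidates, key=lambda c: c[0])
    finished.extend(c for c in beam if is_final(c[1]))
    return max(finished, key=lambda c: c[0]) if finished else None

source = ("he", "goes", "home")
# Hypothetical phrase table: source phrase -> [(translation, log probability)].
table = {
    ("he",): [("er", math.log(0.9)), ("es", math.log(0.1))],
    ("goes",): [("geht", math.log(0.8)), ("rennt", math.log(0.2))],
    ("home",): [("heim", math.log(0.4)), ("nach Hause", math.log(0.6))],
    ("goes", "home"): [("geht nach Hause", math.log(0.7))],
}

def expand(hyp):
    """Extend a hypothesis (covered_length, output) with every matching phrase."""
    pos, out = hyp
    for length in range(1, len(source) - pos + 1):
        phrase = source[pos:pos + length]
        for translation, logp in table.get(phrase, []):
            yield (pos + length, out + (translation,)), logp

def is_final(hyp):
    return hyp[0] == len(source)

best_score, (_, best_output) = beam_search((0, ()), expand, is_final, beam_size=2)
print(" ".join(best_output), math.exp(best_score))
```

Here hypotheses covering different numbers of source words compete in the same beam, which is only tolerable in a toy. Phrase-based decoders usually keep one stack per number of covered words so that comparisons stay fair.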
  • 32. Hierarchical phrase-based models • Allow phrases to have gaps
  • 33. Hard problems of MT • Word order • Word sense • Pronouns • Tense • Idioms
  • 37. Different tenses • Past tense vs. present tense • Grammar discrepancy
  • 39. Resources of SMT • Parallel corpus - LDC data - www.ldc.upenn.edu - Europarl corpus - Danish, Dutch, English, Finnish, French, German, Greek, Italian, Portuguese, Spanish, Swedish - Japanese - NTCIR-8 (3M), ASPEC (3M) • Word alignment software - GIZA++, Berkeley aligner • Language modelling - SRILM, Berkeley LM, KenLM • Decoder - Moses (maintained by Koehn's group) - Travatar (Graham Neubig)
  • 40. Agenda • Overview of the history • Statistical machine translation • Recent developments in SMT • Neural machine translation • Some problems of NMT • Future of MT
  • 41. Recent developments of SMT • Advances in decoders • Super-large-scale language model - language model compression • Margin Infused Relaxed Algorithm (MIRA) - train the hyperparameters in a smart way • Tree models - Tree-to-Tree translation - String-to-Tree translation - Tree-to-String translation - Forest-to-String translation (robust to parsing errors) • Factor models • Pre-reordering
  • 42. What is a parse tree • Context-free grammar • Dependency grammar
  • 43. Tree-to-string translation models • Example: translate source code to comments
  • 44. Pre-reordering phrase-based translation model • Input: He goes to the curry restaurant • Pre-reordering: He the curry restaurant to goes • Group into phrases: He the curry restaurant to goes • Translate: 彼は 行くにカレー屋
  • 45. Example of pre-reordering • Original input: the improvement of the life is a large problem of the practical application. • Reordered input (from the restructured parse tree): the life of the improvement va_nsubjpass the practical application of a large problem is . • Reference: 寿命 の 向上 が 実用 化 の 大きな 課題 で あ る 。
  • 46. A summary of SMT
  • 47. Agenda • Overview of the history • Statistical machine translation • Recent developments in SMT • Neural machine translation • Some problems of NMT • Future of MT
  • 48. Problem of conventional SMT • Under-fitting (non-parametric approach) • Solution: - Deep recurrent neural networks
  • 49. Application of neural networks in MT
  • 51. High computational complexity • Try AdaGrad, AdaDelta, Adam in the first place
  • 52. Neural machine translation • Encoder-decoder approach • Figures: multi-layer encoder-decoder model; performance drop
  • 53. Soft-attention mechanism ‣ Make a weighted summary ‣ Figure: soft-attention model
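The "weighted summary" can be made concrete with a small sketch: score each encoder state against the current decoder state, turn the scores into weights with a softmax, and average the encoder states with those weights. Dot-product scoring is an assumption chosen here for brevity; attention models often use a small feed-forward network for the score instead. All names and vectors below are illustrative.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def soft_attention(decoder_state, encoder_states):
    """Return (weights, context) where context is a weighted sum of encoder states."""
    # Assumed scoring function: dot product between decoder and encoder states.
    scores = [sum(d * h for d, h in zip(decoder_state, state))
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# One encoder state per source word (toy 2-dimensional vectors).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights, context = soft_attention([2.0, 0.0], encoder_states)
print(weights, context)
```

Because the weights are a soft distribution and not a hard choice, the whole operation stays differentiable and can be trained jointly with the encoder and decoder.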
  • 54. Visualization of learned representation
  • 56. Evaluation result: human evaluation scores
  • 57. Evaluation result: evaluation scores
      System | BLEU | RIBES | HUMAN | JPO
      Baseline phrase-based SMT | 29.80 | 0.691 | - | -
      Baseline hierarchical phrase-based SMT | 32.56 | 0.746 | - | -
      Baseline Tree-to-string SMT | 33.44 | 0.758 | 30.00 | -
      Submitted system 1 (NMT) | 34.19 | 0.802 | 43.50 | -
      Submitted system 2 (NMT + System combination) | 36.21 | 0.809 | 53.75 | 3.81
      Best competitor 1: NAIST (Travatar System with NeuralMT Reranking) | 38.17 | 0.813 | 62.25 | 4.04
      Best competitor 2: naver (SMT t2s + Spell correction + NMT reranking) | 36.14 | 0.803 | 53.25 | 4.00
  • 58. (Option) Findings & Insights ‣ Soft-attention models outperform multi-layer encoder-decoder models ‣ Training models on pre-reordered data hurts the performance ‣ NMT models tend to make grammatically valid but incomplete translations
  • 59. Agenda • Overview of the history • Statistical machine translation • Recent developments in SMT • Neural machine translation • Some problems of NMT • Future of MT
  • 60. Can’t use monolingual data • Deep fusion (Gulcehre et al., 2015) • Integrate a neural language model trained on a massive monolingual corpus
  • 61. The attention mechanism is not perfect • Local search (Minh-Thang Luong, 2015) • Figures: global search model; local search model
  • 62. The attention mechanism is not perfect • Input feeding
  • 63. Translation does not cover all the words • Coverage-based NMT model (Zhaopeng Tu et al., 2016)
  • 64. Objective function is bad • Cross-entropy differs too much from BLEU • Solutions: - (1) Data as demonstrator (Bengio et al., 2015)
  • 65. Objective function is bad (cont.) • Cross-entropy differs too much from BLEU • Solutions: - (2) Mixed REINFORCE (Ranzato et al., 2016)
  • 66. Objective function is bad (cont.) • Cross-entropy differs too much from BLEU • Solutions: - (3) Minimum Risk Training (Shen et al., 2015) • Objective of MRT • 6 BLEU gain in Chinese-English task
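The idea behind minimum risk training is to minimize the expected loss (for example 1 minus sentence BLEU) over a set of candidate translations, weighted by the model's own probabilities, so that the training signal follows the evaluation metric. The sketch below computes that expected loss for one source sentence; the candidate log probabilities and losses are made-up numbers, and `alpha`, the sharpness of the distribution, is set to an arbitrary value for illustration.

```python
import math

def expected_risk(log_probs, losses, alpha=1.0):
    """Expected loss under the model distribution renormalized over the candidates.

    log_probs: model log p(y | x) for each sampled candidate y.
    losses:    task loss of each candidate against the reference.
    alpha:     exponent applied to the probabilities before renormalizing.
    """
    scaled = [alpha * lp for lp in log_probs]
    m = max(scaled)                        # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    z = sum(weights)
    return sum(w / z * loss for w, loss in zip(weights, losses))

# Three hypothetical candidates for one source sentence.
log_probs = [math.log(0.5), math.log(0.25), math.log(0.25)]
losses = [0.0, 1.0, 0.5]                   # e.g. 1 - sentence-level BLEU
risk = expected_risk(log_probs, losses)
print(risk)
```

During training this quantity is differentiated with respect to the model parameters, which shifts probability mass toward candidates with low loss. The computation above only evaluates the objective and does not perform any gradient step.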
  • 67. Large vocabulary problem • The problem - English vocab. has 700K words - So I set the size of the output layer to 700K - Then I get a memory error • Solutions - I still want to use a 700K vocab. - Noise-contrastive estimation (Gutmann and Hyvarinen, 2010) - Clustering (Mikolov et al., 2013) - Approximate Learning Approach (Jean et al., 2015) - I give up, cut it to an 80K vocab. and recover <UNK> tokens - Positional unknown model (Minh-Thang Luong et al., 2015)
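The "cut the vocabulary and recover <UNK> tokens" route can be sketched as two steps: map rare words to <UNK> before training, then replace each <UNK> in the output using the source word it is aligned to, either through a dictionary or by copying. This is a simplified sketch of that post-processing idea; the alignment is passed in as a plain list here, and every name and data item is hypothetical.

```python
from collections import Counter

def build_vocab(sentences, size):
    """Keep the `size` most frequent words."""
    counts = Counter(word for sentence in sentences for word in sentence)
    return {word for word, _ in counts.most_common(size)}

def replace_rare(sentence, vocab, unk="<UNK>"):
    """Map every out-of-vocabulary word to the unknown token."""
    return [word if word in vocab else unk for word in sentence]

def recover_unk(output, source, alignment, dictionary, unk="<UNK>"):
    """Replace each unknown token using its aligned source word.

    alignment[i] is the source position aligned to output position i (assumed given).
    Falls back to copying the source word when the dictionary has no entry.
    """
    recovered = []
    for i, word in enumerate(output):
        if word == unk:
            src_word = source[alignment[i]]
            recovered.append(dictionary.get(src_word, src_word))
        else:
            recovered.append(word)
    return recovered

vocab = build_vocab([["a", "a", "a", "b", "b", "c"]], size=2)
print(replace_rare(["a", "c", "b"], vocab))

source = ["er", "besuchte", "Heidelberg"]
output = ["he", "visited", "<UNK>"]
print(recover_unk(output, source, alignment=[0, 1, 2], dictionary={}))
```

Copying works well for names and numbers, while the dictionary handles rare words that do have a translation.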
  • 68. Agenda • Overview of the history • Statistical machine translation • Recent developments in SMT • Neural machine translation • Some problems of NMT • Future of MT
  • 69. Future of MT • Semantic preserving translation • Character/sub-word level models • Translation in context • Low-resource translation - Knowledge transfer - Multilingual translation
  • 72. Beyond translation: Image/Video Caption Generation
  • 73. Beyond translation: Image/Video Caption Generation