Learning to Compose Domain-Specific Transformations for Data Augmentation
Tatsuya Shirakawa
tatsuya@abeja.asia
ABEJA, Inc. (Researcher)
- Deep Learning
- Computer Vision
- Natural Language Processing
- Graph Convolution / Graph Embedding
- Mathematical Optimization
- https://github.com/TatsuyaShiraka
tech blog → http://tech-blog.abeja.asia/
Poincaré Embeddings Graph Convolution
We are hiring! → https://www.abeja.asia/recruit/
→ https://six.abejainc.com/
A. J. Ratner, H. R. Ehrenberg, et al., “Learning to Compose Domain-Specific Transformations for Data Augmentation”, NIPS 2017
Today’s Paper
3
Problem to solve
• Learn how to compose predefined data transformations (TFs) to create naturally transformed data (data augmentation)
How to solve
• Formulate the problem as a sequence generation problem
• Train the sequence generator with a policy gradient method
1. Introduction
2. Proposed Method
3. Results
4. Summary
Agenda
4
Applying a sequence of transformation functions (TFs) to each data point to augment the dataset
Data Augmentation (DA)
6
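As a minimal sketch of the idea on this slide, augmentation can be read as function composition over a data point. The toy 2-D "image" and the TF names `hflip`, `vflip`, and `shift_right` are illustrative assumptions, not the TFs used in the paper.

```python
import random

# A tiny grayscale "image" as a list of rows (hypothetical toy data).
IMAGE = [
    [0, 1, 2],
    [3, 4, 5],
]

def hflip(img):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in img]

def vflip(img):
    """Reverse the order of the rows."""
    return img[::-1]

def shift_right(img):
    """Shift every row one pixel to the right, padding with 0."""
    return [[0] + row[:-1] for row in img]

TFS = [hflip, vflip, shift_right]

def apply_sequence(img, tfs):
    """Apply the TFs in order: tfs[0] first, tfs[-1] last."""
    for tf in tfs:
        img = tf(img)
    return img

def augment(img, length, rng):
    """Draw a random TF sequence of the given length and apply it."""
    seq = [rng.choice(TFS) for _ in range(length)]
    return apply_sequence(img, seq), [tf.__name__ for tf in seq]

if __name__ == "__main__":
    new_img, names = augment(IMAGE, length=3, rng=random.Random(0))
    print(names, new_img)
```

Drawing the sequence uniformly at random, as `augment` does here, corresponds to the heuristic baseline; the rest of the deck is about learning which sequences to draw instead.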
Common Assumption
Transformed data are natural, and essential information (e.g. classes) is kept unchanged
… But heavy DA can easily break the assumption
DA can break information
7
(CIFAR-10)
• Generator generates sequences of TFs
• Discriminator judges whether transformed data are realistic or not
• End model (trained afterward on the augmented data)
This Paper — Learning to Compose TFs
8
[Diagram: generator G, discriminator D, and end model Df]
Technical Remarks: all transformation sequences have the same length L
1. Introduction
2. Proposed Method
3. Results
4. Summary
Agenda
9
• Discriminator decides whether given data are realistic (1) or not (0)
• Relaxed Assumption

TFs preserve essential information or collapse it
Discriminator
10
Generator G is trained adversarially against D
This leads G to generate transformation sequences that don't collapse the data
Generative Adversarial Objective
11
Technical Remarks: the generator is not conditioned on the data
Generator should not learn null transformation sequences, so maximize how far the transformed data are from the originals (the diversity term $J_d$)
Examples of null transformation sequences
• Horizontal Flip x 2
• Rotate left 5° and rotate right 5°
Diversity Objective
12
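A small sketch of why null sequences score zero on a diversity measure. The distance `distance` (mean absolute pixel difference), the `brighten` TF, and the toy image are assumptions for illustration; the slide does not state which distance the paper uses.

```python
def hflip(img):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in img]

def brighten(img, delta=50):
    """Add a constant to every pixel, clipped at 255."""
    return [[min(255, p + delta) for p in row] for row in img]

def compose(*tfs):
    """Build one function that applies the given TFs in order."""
    def run(img):
        for tf in tfs:
            img = tf(img)
        return img
    return run

def distance(a, b):
    """Mean absolute pixel difference (an assumed choice of distance)."""
    diffs = [abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum(diffs) / len(diffs)

def diversity(sequences, images):
    """Average distance between each image and its transformed version."""
    total = sum(distance(seq(img), img) for seq in sequences for img in images)
    return total / (len(sequences) * len(images))

IMAGES = [[[10, 20, 30], [40, 50, 60]]]
null_seq = compose(hflip, hflip)       # horizontal flip x 2 returns the input
useful_seq = compose(hflip, brighten)  # actually changes the image

if __name__ == "__main__":
    print(diversity([null_seq], IMAGES), diversity([useful_seq], IMAGES))
```

A generator that only emitted `null_seq` would fool the discriminator perfectly while adding nothing to the dataset, which is the failure the diversity term is meant to penalize.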
Overall Objective
13
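$\min_{\theta} \max_{\phi} \; J = \tilde{J} + \alpha J_d^{-1}$
($\theta$: generator parameters, $\phi$: discriminator parameters, $\tilde{J}$: generative adversarial objective, $J_d$: diversity objective)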
• We can optimize the discriminator and the generator alternately
• The discriminator can be optimized by a simple gradient ascent method
• The generator requires optimizing a sequence generation process (sampling discrete TF sequences is not differentiable), so simple gradient descent cannot be applied directly
Optimization
14
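The discriminator step can be sketched as plain gradient ascent. Everything concrete here is an assumption for illustration: a one-dimensional logistic discriminator $D(x) = \sigma(wx + b)$, hand-picked feature values, and the step size. The paper's discriminator is a neural network.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def d_objective(w, b, real, fake):
    """Mean log D(x) over real data plus mean log(1 - D(x)) over transformed data."""
    real_term = sum(math.log(sigmoid(w * x + b)) for x in real) / len(real)
    fake_term = sum(math.log(1.0 - sigmoid(w * x + b)) for x in fake) / len(fake)
    return real_term + fake_term

def ascent_step(w, b, real, fake, lr):
    """One gradient ascent step on d_objective for D(x) = sigmoid(w*x + b)."""
    # d/dz log(sigmoid(z)) = 1 - sigmoid(z); d/dz log(1 - sigmoid(z)) = -sigmoid(z)
    gw = sum((1.0 - sigmoid(w * x + b)) * x for x in real) / len(real)
    gb = sum((1.0 - sigmoid(w * x + b)) for x in real) / len(real)
    gw -= sum(sigmoid(w * x + b) * x for x in fake) / len(fake)
    gb -= sum(sigmoid(w * x + b) for x in fake) / len(fake)
    return w + lr * gw, b + lr * gb

REAL = [0.9, 1.1, 1.0, 0.8]      # hypothetical 1-D features of realistic data
FAKE = [-1.0, -0.7, -1.2, -0.9]  # hypothetical features of collapsed data

def train(steps=200, lr=0.5):
    """Run gradient ascent from w = b = 0 and record the objective after each step."""
    w, b = 0.0, 0.0
    history = [d_objective(w, b, REAL, FAKE)]
    for _ in range(steps):
        w, b = ascent_step(w, b, REAL, FAKE, lr)
        history.append(d_objective(w, b, REAL, FAKE))
    return w, b, history

if __name__ == "__main__":
    w, b, history = train()
    print(history[0], history[-1])
```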
Reformulate the optimization problem for G as a
sequential decision making (RL) problem
Optimization of G — RL problem
15
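[Diagram: $x \xrightarrow{h_{\tau_1}} \tilde{x}_1 \xrightarrow{h_{\tau_2}} \tilde{x}_2 \rightarrow \cdots \xrightarrow{h_{\tau_L}} \tilde{x}_L$, with rewards $r_1, r_2, \dots, r_L$ received after each step]
Technical Remarks: loss is defined as $\mathrm{loss}(x) = \log(1 - D(x))$ in the paper
$r_t = \mathrm{loss}(\tilde{x}_t) - \mathrm{loss}(\tilde{x}_{t-1}), \qquad \sum_{t=1}^{L} r_t = \mathrm{loss}(\tilde{x}_L) - \mathrm{loss}(x)$ (with $\tilde{x}_0 = x$)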
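The telescoping sum of the per-step rewards can be checked numerically. The discriminator outputs below are made-up values for a sequence of length L = 3, used only to illustrate the identity.

```python
import math

def loss(d_value):
    """loss(x) = log(1 - D(x)), given the discriminator output D(x)."""
    return math.log(1.0 - d_value)

def incremental_rewards(d_values):
    """d_values[0] is D(x); d_values[t] is D of the data after t TFs."""
    losses = [loss(d) for d in d_values]
    return [losses[t] - losses[t - 1] for t in range(1, len(losses))]

# Hypothetical discriminator outputs for x, x~_1, x~_2, x~_3 (L = 3).
D_VALUES = [0.9, 0.8, 0.85, 0.4]
REWARDS = incremental_rewards(D_VALUES)
TOTAL = sum(REWARDS)

if __name__ == "__main__":
    # The intermediate terms cancel, leaving loss(x~_L) - loss(x).
    print(REWARDS, TOTAL, loss(D_VALUES[-1]) - loss(D_VALUES[0]))
```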
Final loss $U(\theta)$ can be minimized by the policy gradient method
Optimization of G — Policy Gradient
16
π … stochastic transition policy implicitly defined by G
Policy Gradient Method
1. Generate samples (run the policy)
2. Estimate return
3. Improve the policy: $\theta \leftarrow \theta - \eta \nabla_{\theta} U(\theta)$
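The three steps above can be sketched with a score-function (REINFORCE) estimator for a softmax policy over TF indices. This is a toy under stated assumptions: the per-TF `COST` table stands in for the discriminator-based return, and the batch size, learning rate, and mean baseline are illustrative choices rather than the paper's settings.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def sample_sequence(theta, length, rng):
    """Draw `length` TF indices independently from softmax(theta)."""
    return rng.choices(range(len(theta)), weights=softmax(theta), k=length)

def policy_gradient_step(theta, cost, length, n_samples, lr, rng):
    """One REINFORCE update: theta <- theta - lr * estimated grad of U(theta)."""
    probs = softmax(theta)
    # 1. Generate samples (run the policy).
    seqs = [sample_sequence(theta, length, rng) for _ in range(n_samples)]
    # 2. Estimate return: here the return of a sequence is its summed cost.
    returns = [sum(cost[k] for k in seq) for seq in seqs]
    baseline = sum(returns) / n_samples  # mean baseline for variance reduction
    # 3. Improve the policy with the score-function gradient estimate.
    grad = [0.0] * len(theta)
    for seq, ret in zip(seqs, returns):
        for k in seq:
            for j in range(len(theta)):
                # d log pi(k) / d theta_j for a softmax policy
                score = (1.0 if j == k else 0.0) - probs[j]
                grad[j] += (ret - baseline) * score / n_samples
    return [t - lr * g for t, g in zip(theta, grad)]

# Hypothetical per-TF costs: TF 2 is "lossy", TFs 0 and 1 are nearly harmless.
COST = [0.0, 0.1, 1.0]

def train(steps=300, seed=0):
    rng = random.Random(seed)
    theta = [0.0, 0.0, 0.0]
    for _ in range(steps):
        theta = policy_gradient_step(theta, COST, length=4, n_samples=32, lr=0.1, rng=rng)
    return softmax(theta)

if __name__ == "__main__":
    print(train())
```

No gradient flows through the TFs or the sampling step; only the log-probability of the sampled indices is differentiated, which is why this estimator fits a non-differentiable sequence generation process.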
Independent Model — Mean Field Model

Learns the task-specific “accuracy” and “frequency” of each TF
State-based Model — LSTM

Some combinations of TFs might be very lossy (e.g. blur -> zoom, brighten -> saturation), so each TF is chosen conditioned on the TFs applied before it
Generator (Policy) Model
17
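The difference between the two generator models can be illustrated by how they sample. Note the substitution: instead of an LSTM, the state-based sampler below uses a first-order transition table as a much simpler stand-in, and the TF names and weights are hypothetical.

```python
import random

TFS = ["blur", "zoom", "brighten", "rotate"]

def sample_mean_field(freq, length, rng):
    """Each step is drawn independently from the same TF frequencies."""
    return rng.choices(TFS, weights=[freq[t] for t in TFS], k=length)

def sample_state_based(start, transition, length, rng):
    """Each step is drawn conditioned on the previous TF (first-order stand-in for an LSTM)."""
    seq = [rng.choices(TFS, weights=[start[t] for t in TFS])[0]]
    while len(seq) < length:
        row = transition[seq[-1]]
        seq.append(rng.choices(TFS, weights=[row[t] for t in TFS])[0])
    return seq

UNIFORM = {t: 1.0 for t in TFS}
# Hypothetical learned transitions: never follow "blur" with "zoom".
TRANSITION = {t: dict(UNIFORM) for t in TFS}
TRANSITION["blur"]["zoom"] = 0.0

def has_pair(seq, first, second):
    """True if `second` appears immediately after `first` anywhere in seq."""
    return any(a == first and b == second for a, b in zip(seq, seq[1:]))

if __name__ == "__main__":
    rng = random.Random(1)
    print(sample_mean_field(UNIFORM, 5, rng))
    print(sample_state_based(UNIFORM, TRANSITION, 5, rng))
```

The mean field model can only lower the overall frequency of "blur" or "zoom" to avoid the lossy pair, whereas a state-based model can keep both TFs and rule out just that ordering.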
• D measures whether data are realistic or not
• G (mean field / LSTM) generates sequences of TFs of length L
• Adversarial training for G & D
• Standard gradient ascent method for D
• Policy gradient method for G
Summary of Proposed Method
18
1. Introduction
2. Proposed Method
3. Results
4. Summary
Agenda
19
• MNIST
• CIFAR-10
• ACE corpus
• Mammography Tumor-Classification Dataset (DDSM)
Datasets
20
Datasets — Image Datasets
21
MNIST
CIFAR-10
Datasets — ACE corpus
22
The goal is to identify mentions of employer-employee relations in news articles
Conditional word swap TF
1. Construct a trigram language model
2. Sample a word conditioned on the preceding words
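The two steps of the conditional word swap TF can be sketched with a count-based trigram model. The toy corpus, the `<s>` padding token, and the rule of leaving the sentence unchanged for unseen contexts are assumptions made for this example; the paper builds its model on the ACE corpus.

```python
import random
from collections import Counter, defaultdict

def build_trigram_model(sentences):
    """Count how often each word follows each pair of preceding words."""
    model = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>", "<s>"] + sentence.split()
        for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
            model[(a, b)][c] += 1
    return model

def word_swap(sentence, position, model, rng):
    """Replace the word at `position` with one sampled given the two preceding words."""
    words = sentence.split()
    tokens = ["<s>", "<s>"] + words
    # With two padding tokens, these are the two tokens just before `position`.
    context = (tokens[position], tokens[position + 1])
    candidates = model.get(context)
    if not candidates:
        return sentence  # unseen context: leave the sentence unchanged
    choices, weights = zip(*candidates.items())
    words[position] = rng.choices(choices, weights=weights)[0]
    return " ".join(words)

# Hypothetical toy corpus.
CORPUS = [
    "Alice works for Acme",
    "Alice works at Acme",
    "Bob works for Initech",
]
MODEL = build_trigram_model(CORPUS)

if __name__ == "__main__":
    print(word_swap("Alice works for Acme", 2, MODEL, random.Random(0)))
```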
Datasets — DDSM dataset
23
Standard image TFs
Subselected so as not to break class-invariance
Segmentation-based TFs
1. Segment the tumor mass
2. Perform TFs (e.g. rotation or shifting)
3. Stitch it into a randomly-sampled benign tissue image
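The segment, transform, stitch pipeline can be sketched on toy arrays. The threshold-based `segment`, the horizontal `shift`, and the 2x4 images are simplifying assumptions for illustration; real mammography segmentation is far more involved.

```python
def segment(image, threshold):
    """Step 1: a crude mask that marks pixels brighter than `threshold` as the mass."""
    return [[p > threshold for p in row] for row in image]

def shift(image, mask, dx):
    """Step 2: move the masked pixels dx columns to the right (dropping any that leave the frame)."""
    h, w = len(image), len(image[0])
    moved = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x] and 0 <= x + dx < w:
                moved[y][x + dx] = image[y][x]
    return moved

def stitch(moved, background):
    """Step 3: paste the moved mass pixels over a benign background image."""
    return [
        [bg if m is None else m for m, bg in zip(mrow, brow)]
        for mrow, brow in zip(moved, background)
    ]

# Hypothetical toy images: 9 marks the mass, 1 and 2 are background tissue.
TUMOR = [
    [1, 9, 1, 1],
    [1, 9, 9, 1],
]
BENIGN = [
    [2, 2, 2, 2],
    [2, 2, 2, 2],
]
RESULT = stitch(shift(TUMOR, segment(TUMOR, threshold=5), dx=1), BENIGN)

if __name__ == "__main__":
    print(RESULT)
```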
Results — CIFAR-10 Classification
24
Basic … random crop
Heur. … random composition of TFs
+ DS … allowing domain-specific TFs (semantic-segmentation-based)
Results — TF Freq. / Seq. Length
25
Results — Training Progress on MNIST
26
https://hazyresearch.github.io/snorkel/blog/tanda.html
• Adversarial Training for Data Augmentation
• Optimization with standard/policy gradient method
• Achieved better performance on several datasets
Summary
27

More Related Content

What's hot

Generative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their ApplicationsGenerative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their ApplicationsArtifacia
 
Transfer Learning -- The Next Frontier for Machine Learning
Transfer Learning -- The Next Frontier for Machine LearningTransfer Learning -- The Next Frontier for Machine Learning
Transfer Learning -- The Next Frontier for Machine LearningSebastian Ruder
 
APS GDS data science talk by Trevor Rhone
APS GDS data science talk by Trevor RhoneAPS GDS data science talk by Trevor Rhone
APS GDS data science talk by Trevor RhoneTrevorDavidRhone
 
Anima Anandkumar at AI Frontiers : Modern ML : Deep, distributed, Multi-dimen...
Anima Anandkumar at AI Frontiers : Modern ML : Deep, distributed, Multi-dimen...Anima Anandkumar at AI Frontiers : Modern ML : Deep, distributed, Multi-dimen...
Anima Anandkumar at AI Frontiers : Modern ML : Deep, distributed, Multi-dimen...AI Frontiers
 
Capitalico / Chart Pattern Matching in Financial Trading Using RNN
Capitalico / Chart Pattern Matching in Financial Trading Using RNNCapitalico / Chart Pattern Matching in Financial Trading Using RNN
Capitalico / Chart Pattern Matching in Financial Trading Using RNNAlpaca
 
InfoGAN and Generative Adversarial Networks
InfoGAN and Generative Adversarial NetworksInfoGAN and Generative Adversarial Networks
InfoGAN and Generative Adversarial NetworksZak Jost
 
MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016 MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016 Albert Bifet
 
Machine Learning Real Life Applications By Examples
Machine Learning Real Life Applications By ExamplesMachine Learning Real Life Applications By Examples
Machine Learning Real Life Applications By ExamplesMario Cartia
 
Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15
Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15
Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15MLconf
 
Pybcn machine learning for dummies with python
Pybcn machine learning for dummies with pythonPybcn machine learning for dummies with python
Pybcn machine learning for dummies with pythonJavier Arias Losada
 
Neural Networks and Deep Learning for Physicists
Neural Networks and Deep Learning for PhysicistsNeural Networks and Deep Learning for Physicists
Neural Networks and Deep Learning for PhysicistsHéloïse Nonne
 
Big Data Analytics for connected home
Big Data Analytics for connected homeBig Data Analytics for connected home
Big Data Analytics for connected homeHéloïse Nonne
 
Europython - Machine Learning for dummies with Python
Europython - Machine Learning for dummies with PythonEuropython - Machine Learning for dummies with Python
Europython - Machine Learning for dummies with PythonJavier Arias Losada
 
Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...
Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...
Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...Dongmin Choi
 

What's hot (20)

Startup Data Science
Startup Data ScienceStartup Data Science
Startup Data Science
 
Generative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their ApplicationsGenerative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their Applications
 
Transfer Learning -- The Next Frontier for Machine Learning
Transfer Learning -- The Next Frontier for Machine LearningTransfer Learning -- The Next Frontier for Machine Learning
Transfer Learning -- The Next Frontier for Machine Learning
 
APS GDS data science talk by Trevor Rhone
APS GDS data science talk by Trevor RhoneAPS GDS data science talk by Trevor Rhone
APS GDS data science talk by Trevor Rhone
 
Anima Anandkumar at AI Frontiers : Modern ML : Deep, distributed, Multi-dimen...
Anima Anandkumar at AI Frontiers : Modern ML : Deep, distributed, Multi-dimen...Anima Anandkumar at AI Frontiers : Modern ML : Deep, distributed, Multi-dimen...
Anima Anandkumar at AI Frontiers : Modern ML : Deep, distributed, Multi-dimen...
 
Capitalico / Chart Pattern Matching in Financial Trading Using RNN
Capitalico / Chart Pattern Matching in Financial Trading Using RNNCapitalico / Chart Pattern Matching in Financial Trading Using RNN
Capitalico / Chart Pattern Matching in Financial Trading Using RNN
 
InfoGAN and Generative Adversarial Networks
InfoGAN and Generative Adversarial NetworksInfoGAN and Generative Adversarial Networks
InfoGAN and Generative Adversarial Networks
 
Active learning
Active learningActive learning
Active learning
 
MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016 MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016
 
Machine Learning Real Life Applications By Examples
Machine Learning Real Life Applications By ExamplesMachine Learning Real Life Applications By Examples
Machine Learning Real Life Applications By Examples
 
Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15
Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15
Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15
 
Pybcn machine learning for dummies with python
Pybcn machine learning for dummies with pythonPybcn machine learning for dummies with python
Pybcn machine learning for dummies with python
 
Neural Networks and Deep Learning for Physicists
Neural Networks and Deep Learning for PhysicistsNeural Networks and Deep Learning for Physicists
Neural Networks and Deep Learning for Physicists
 
Big Data Analytics for connected home
Big Data Analytics for connected homeBig Data Analytics for connected home
Big Data Analytics for connected home
 
李育杰/The Growth of a Data Scientist
李育杰/The Growth of a Data Scientist李育杰/The Growth of a Data Scientist
李育杰/The Growth of a Data Scientist
 
SEGAN: Speech Enhancement Generative Adversarial Network
SEGAN: Speech Enhancement Generative Adversarial NetworkSEGAN: Speech Enhancement Generative Adversarial Network
SEGAN: Speech Enhancement Generative Adversarial Network
 
Bol.com
Bol.comBol.com
Bol.com
 
Europython - Machine Learning for dummies with Python
Europython - Machine Learning for dummies with PythonEuropython - Machine Learning for dummies with Python
Europython - Machine Learning for dummies with Python
 
Lent Matlab H Ss
Lent Matlab H SsLent Matlab H Ss
Lent Matlab H Ss
 
Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...
Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...
Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...
 

Similar to Learning to Compose Domain-Specific Transformations for Data Augmentation

Online advertising and large scale model fitting
Online advertising and large scale model fittingOnline advertising and large scale model fitting
Online advertising and large scale model fittingWush Wu
 
Elegant Graphics for Data Analysis with ggplot2
Elegant Graphics for Data Analysis with ggplot2Elegant Graphics for Data Analysis with ggplot2
Elegant Graphics for Data Analysis with ggplot2yannabraham
 
OPTEX MATHEMATICAL MODELING AND MANAGEMENT SYSTEM
OPTEX MATHEMATICAL MODELING AND MANAGEMENT SYSTEMOPTEX MATHEMATICAL MODELING AND MANAGEMENT SYSTEM
OPTEX MATHEMATICAL MODELING AND MANAGEMENT SYSTEMJesus Velasquez
 
Iteratively Learning Data Transformation Programs from Examples
Iteratively Learning Data Transformation Programs from ExamplesIteratively Learning Data Transformation Programs from Examples
Iteratively Learning Data Transformation Programs from ExamplesBo Wu
 
Updating PageRank for Streaming Graphs
Updating PageRank for Streaming GraphsUpdating PageRank for Streaming Graphs
Updating PageRank for Streaming GraphsJason Riedy
 
Deep Learning Introduction - WeCloudData
Deep Learning Introduction - WeCloudDataDeep Learning Introduction - WeCloudData
Deep Learning Introduction - WeCloudDataWeCloudData
 
OPTEX Mathematical Modeling and Management System
OPTEX Mathematical Modeling and Management SystemOPTEX Mathematical Modeling and Management System
OPTEX Mathematical Modeling and Management SystemJesus Velasquez
 
Applying Linear Optimization Using GLPK
Applying Linear Optimization Using GLPKApplying Linear Optimization Using GLPK
Applying Linear Optimization Using GLPKJeremy Chen
 
Planning for power systems
Planning for power systemsPlanning for power systems
Planning for power systemsOlivier Teytaud
 
OPTEX - Mathematical Modeling and Management System
OPTEX - Mathematical Modeling and Management System OPTEX - Mathematical Modeling and Management System
OPTEX - Mathematical Modeling and Management System Jesus Velasquez
 
Dimensionality Reduction
Dimensionality ReductionDimensionality Reduction
Dimensionality ReductionSaad Elbeleidy
 
Synthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-makingSynthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-makingAdam Doyle
 
How to easily find the optimal solution without exhaustive search using Genet...
How to easily find the optimal solution without exhaustive search using Genet...How to easily find the optimal solution without exhaustive search using Genet...
How to easily find the optimal solution without exhaustive search using Genet...Viach Kakovskyi
 
Big Data And Machine Learning Using MATLAB.pdf
Big Data And Machine Learning Using MATLAB.pdfBig Data And Machine Learning Using MATLAB.pdf
Big Data And Machine Learning Using MATLAB.pdfssuserb2837a
 
AutoML for user segmentation: how to match millions of users with hundreds of...
AutoML for user segmentation: how to match millions of users with hundreds of...AutoML for user segmentation: how to match millions of users with hundreds of...
AutoML for user segmentation: how to match millions of users with hundreds of...Institute of Contemporary Sciences
 
Marketing Analytics with R Lifting Campaign Success Rates
Marketing Analytics with R Lifting Campaign Success RatesMarketing Analytics with R Lifting Campaign Success Rates
Marketing Analytics with R Lifting Campaign Success RatesRevolution Analytics
 
쉽게 설명하는 GAN (What is this? Gum? It's GAN.)
쉽게 설명하는 GAN (What is this? Gum? It's GAN.)쉽게 설명하는 GAN (What is this? Gum? It's GAN.)
쉽게 설명하는 GAN (What is this? Gum? It's GAN.)Hansol Kang
 

Similar to Learning to Compose Domain-Specific Transformations for Data Augmentation (20)

Online advertising and large scale model fitting
Online advertising and large scale model fittingOnline advertising and large scale model fitting
Online advertising and large scale model fitting
 
Elegant Graphics for Data Analysis with ggplot2
Elegant Graphics for Data Analysis with ggplot2Elegant Graphics for Data Analysis with ggplot2
Elegant Graphics for Data Analysis with ggplot2
 
OPTEX MATHEMATICAL MODELING AND MANAGEMENT SYSTEM
OPTEX MATHEMATICAL MODELING AND MANAGEMENT SYSTEMOPTEX MATHEMATICAL MODELING AND MANAGEMENT SYSTEM
OPTEX MATHEMATICAL MODELING AND MANAGEMENT SYSTEM
 
Iteratively Learning Data Transformation Programs from Examples
Iteratively Learning Data Transformation Programs from ExamplesIteratively Learning Data Transformation Programs from Examples
Iteratively Learning Data Transformation Programs from Examples
 
Updating PageRank for Streaming Graphs
Updating PageRank for Streaming GraphsUpdating PageRank for Streaming Graphs
Updating PageRank for Streaming Graphs
 
Srikanta Mishra
Srikanta MishraSrikanta Mishra
Srikanta Mishra
 
Deep Learning Introduction - WeCloudData
Deep Learning Introduction - WeCloudDataDeep Learning Introduction - WeCloudData
Deep Learning Introduction - WeCloudData
 
OPTEX Mathematical Modeling and Management System
OPTEX Mathematical Modeling and Management SystemOPTEX Mathematical Modeling and Management System
OPTEX Mathematical Modeling and Management System
 
Applying Linear Optimization Using GLPK
Applying Linear Optimization Using GLPKApplying Linear Optimization Using GLPK
Applying Linear Optimization Using GLPK
 
Planning for power systems
Planning for power systemsPlanning for power systems
Planning for power systems
 
OPTEX - Mathematical Modeling and Management System
OPTEX - Mathematical Modeling and Management System OPTEX - Mathematical Modeling and Management System
OPTEX - Mathematical Modeling and Management System
 
Dimensionality Reduction
Dimensionality ReductionDimensionality Reduction
Dimensionality Reduction
 
Synthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-makingSynthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-making
 
How to easily find the optimal solution without exhaustive search using Genet...
How to easily find the optimal solution without exhaustive search using Genet...How to easily find the optimal solution without exhaustive search using Genet...
How to easily find the optimal solution without exhaustive search using Genet...
 
Planet
PlanetPlanet
Planet
 
Big Data And Machine Learning Using MATLAB.pdf
Big Data And Machine Learning Using MATLAB.pdfBig Data And Machine Learning Using MATLAB.pdf
Big Data And Machine Learning Using MATLAB.pdf
 
[系列活動] 資料探勘速遊
[系列活動] 資料探勘速遊[系列活動] 資料探勘速遊
[系列活動] 資料探勘速遊
 
AutoML for user segmentation: how to match millions of users with hundreds of...
AutoML for user segmentation: how to match millions of users with hundreds of...AutoML for user segmentation: how to match millions of users with hundreds of...
AutoML for user segmentation: how to match millions of users with hundreds of...
 
Marketing Analytics with R Lifting Campaign Success Rates
Marketing Analytics with R Lifting Campaign Success RatesMarketing Analytics with R Lifting Campaign Success Rates
Marketing Analytics with R Lifting Campaign Success Rates
 
쉽게 설명하는 GAN (What is this? Gum? It's GAN.)
쉽게 설명하는 GAN (What is this? Gum? It's GAN.)쉽게 설명하는 GAN (What is this? Gum? It's GAN.)
쉽게 설명하는 GAN (What is this? Gum? It's GAN.)
 

More from Tatsuya Shirakawa

NeurIPS2021読み会 Fairness in Ranking under Uncertainty
NeurIPS2021読み会 Fairness in Ranking under UncertaintyNeurIPS2021読み会 Fairness in Ranking under Uncertainty
NeurIPS2021読み会 Fairness in Ranking under UncertaintyTatsuya Shirakawa
 
2021 10-07 kdd2021読み会 uc phrase
2021 10-07 kdd2021読み会 uc phrase2021 10-07 kdd2021読み会 uc phrase
2021 10-07 kdd2021読み会 uc phraseTatsuya Shirakawa
 
医療ビッグデータの今後を見通すために知っておきたい機械学習の基礎〜最前線 agains COVID-19
医療ビッグデータの今後を見通すために知っておきたい機械学習の基礎〜最前線 agains COVID-19医療ビッグデータの今後を見通すために知っておきたい機械学習の基礎〜最前線 agains COVID-19
医療ビッグデータの今後を見通すために知っておきたい機械学習の基礎〜最前線 agains COVID-19Tatsuya Shirakawa
 
Retail Face Analysis Inside-Out
Retail Face Analysis Inside-OutRetail Face Analysis Inside-Out
Retail Face Analysis Inside-OutTatsuya Shirakawa
 
データに内在する構造をみるための埋め込み手法
データに内在する構造をみるための埋め込み手法データに内在する構造をみるための埋め込み手法
データに内在する構造をみるための埋め込み手法Tatsuya Shirakawa
 
Seeing Unseens with Machine Learning -- 
見えていないものを見出す機械学習
Seeing Unseens with Machine Learning -- 
見えていないものを見出す機械学習Seeing Unseens with Machine Learning -- 
見えていないものを見出す機械学習
Seeing Unseens with Machine Learning -- 
見えていないものを見出す機械学習Tatsuya Shirakawa
 
Taskonomy: Disentangling Task Transfer Learning -- Scouty Meetup 2018 Feb., ...
 Taskonomy: Disentangling Task Transfer Learning -- Scouty Meetup 2018 Feb., ... Taskonomy: Disentangling Task Transfer Learning -- Scouty Meetup 2018 Feb., ...
Taskonomy: Disentangling Task Transfer Learning -- Scouty Meetup 2018 Feb., ...Tatsuya Shirakawa
 
Poincare embeddings for Learning Hierarchical Representations
Poincare embeddings for Learning Hierarchical RepresentationsPoincare embeddings for Learning Hierarchical Representations
Poincare embeddings for Learning Hierarchical RepresentationsTatsuya Shirakawa
 
Improving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive FlowImproving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive FlowTatsuya Shirakawa
 

More from Tatsuya Shirakawa (13)

NeurIPS2021読み会 Fairness in Ranking under Uncertainty
NeurIPS2021読み会 Fairness in Ranking under UncertaintyNeurIPS2021読み会 Fairness in Ranking under Uncertainty
NeurIPS2021読み会 Fairness in Ranking under Uncertainty
 
2021 10-07 kdd2021読み会 uc phrase
2021 10-07 kdd2021読み会 uc phrase2021 10-07 kdd2021読み会 uc phrase
2021 10-07 kdd2021読み会 uc phrase
 
医療ビッグデータの今後を見通すために知っておきたい機械学習の基礎〜最前線 agains COVID-19
医療ビッグデータの今後を見通すために知っておきたい機械学習の基礎〜最前線 agains COVID-19医療ビッグデータの今後を見通すために知っておきたい機械学習の基礎〜最前線 agains COVID-19
医療ビッグデータの今後を見通すために知っておきたい機械学習の基礎〜最前線 agains COVID-19
 
ICCV2019 report
ICCV2019 reportICCV2019 report
ICCV2019 report
 
Retail Face Analysis Inside-Out
Retail Face Analysis Inside-OutRetail Face Analysis Inside-Out
Retail Face Analysis Inside-Out
 
データに内在する構造をみるための埋め込み手法
データに内在する構造をみるための埋め込み手法データに内在する構造をみるための埋め込み手法
データに内在する構造をみるための埋め込み手法
 
ヒトの機械学習
ヒトの機械学習ヒトの機械学習
ヒトの機械学習
 
Seeing Unseens with Machine Learning -- 
見えていないものを見出す機械学習
Seeing Unseens with Machine Learning -- 
見えていないものを見出す機械学習Seeing Unseens with Machine Learning -- 
見えていないものを見出す機械学習
Seeing Unseens with Machine Learning -- 
見えていないものを見出す機械学習
 
Taskonomy: Disentangling Task Transfer Learning -- Scouty Meetup 2018 Feb., ...
 Taskonomy: Disentangling Task Transfer Learning -- Scouty Meetup 2018 Feb., ... Taskonomy: Disentangling Task Transfer Learning -- Scouty Meetup 2018 Feb., ...
Taskonomy: Disentangling Task Transfer Learning -- Scouty Meetup 2018 Feb., ...
 
Hyperbolic Neural Networks
Hyperbolic Neural NetworksHyperbolic Neural Networks
Hyperbolic Neural Networks
 
Poincare embeddings for Learning Hierarchical Representations
Poincare embeddings for Learning Hierarchical RepresentationsPoincare embeddings for Learning Hierarchical Representations
Poincare embeddings for Learning Hierarchical Representations
 
Dynamic filter networks
Dynamic filter networksDynamic filter networks
Dynamic filter networks
 
Improving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive FlowImproving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive Flow
 

Recently uploaded

FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteedamy56318795
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Pooja Nehwal
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...only4webmaster01
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceDelhi Call girls
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...amitlee9823
 

Recently uploaded (20)

FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...

Learning to Compose Domain-Specific Transformations for Data Augmentation

  • 1. Learning to Compose Domain-Specific Transformations for Data Augmentation. Tatsuya Shirakawa, tatsuya@abeja.asia
  • 2. ABEJA, Inc. (Researcher) - Deep Learning - Computer Vision - Natural Language Processing - Graph Convolution / Graph Embedding - Mathematical Optimization - https://github.com/TatsuyaShiraka tech blog → http://tech-blog.abeja.asia/ Poincaré Embeddings Graph Convolution We are hiring! → https://www.abeja.asia/recruit/ → https://six.abejainc.com/
  • 3. Today’s Paper: A. J. Ratner, H. R. Ehrenberg, et al., “Learning to Compose Domain-Specific Transformations for Data Augmentation”, NIPS 2017. Problem to solve: learning how to compose predefined data transformations (TFs) to create naturally transformed data (data augmentation). How to solve: formulate the problem as a sequence generation problem, learned by the policy gradient method.
  • 4. Agenda: 1. Introduction 2. Proposed Method 3. Results 4. Summary
  • 5. Agenda: 1. Introduction 2. Proposed Method 3. Results 4. Summary
  • 6. Data Augmentation (DA): applying a sequence of transformation functions (TFs) to each data point to augment the dataset.
  • 7. DA can break information (CIFAR-10). Common assumption: transformed data are natural, and essential information (e.g. classes) is kept unchanged. But massive DA can easily break this assumption.
  • 8. This Paper: Learning to Compose TFs. The generator (G) generates sequences of TFs. The discriminator (D) judges whether transformed data are realistic or not. The end model (Df) is learned afterward. Technical remark: all transformation sequences have the same length L.
  • 9. Agenda: 1. Introduction 2. Proposed Method 3. Results 4. Summary
  • 10. Discriminator: the discriminator judges whether given data are realistic (1) or not (0). Relaxed assumption: TFs either preserve essential information or collapse it.
  • 11. Generative Adversarial Objective: the generator G is learned adversarially against D. This leads G to generate transformation sequences that don’t collapse data. Technical remark: the generator is not conditioned on data.
  • 12. Diversity Objective: the generator should not learn null transformation sequences, so a diversity term is maximized. Examples of null transformation sequences: horizontal flip × 2; rotate left 5° and then rotate right 5°.
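
The idea of a null transformation sequence can be illustrated with a minimal sketch. It assumes images are plain nested lists, uses 90° rotations as a stand-in for the slide's 5° rotations, and uses a mean squared pixel difference as a hypothetical distance; none of these choices come from the slides or the paper.

```python
def hflip(img):
    """Horizontal flip: reverse every row."""
    return [row[::-1] for row in img]

def rot_left(img):
    """Rotate 90 degrees counter-clockwise (stand-in for a small left rotation)."""
    return [list(col) for col in zip(*img)][::-1]

def rot_right(img):
    """Rotate 90 degrees clockwise (stand-in for a small right rotation)."""
    return [list(col) for col in zip(*img[::-1])]

def apply_sequence(img, tfs):
    """Apply the TFs in order, like h_tauL(...h_tau1(x))."""
    for tf in tfs:
        img = tf(img)
    return img

def distance(a, b):
    """Hypothetical diversity measure: mean squared pixel difference."""
    diffs = [(p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum(diffs) / len(diffs)

img = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

# Both sequences undo themselves, so the output equals the input.
null_seqs = [[hflip, hflip], [rot_left, rot_right]]
null_distances = [distance(apply_sequence(img, s), img) for s in null_seqs]

# A single flip actually changes the image.
non_null_distance = distance(apply_sequence(img, [hflip]), img)
```

A generator that only emitted sequences like those in `null_seqs` would satisfy the discriminator trivially while adding nothing to the dataset, which is why a term rewarding a non-zero distance is needed.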
  • 14. Optimization: the discriminator and the generator can be optimized alternately. The discriminator can be optimized by a simple gradient ascent method. Optimizing the generator requires optimizing a sequence generation process, so a simple gradient descent method cannot be applied directly.
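  • 15. Optimization of G as an RL problem: reformulate the optimization problem for G as a sequential decision making (RL) problem. Applying $h_{\tau_1}, h_{\tau_2}, \dots, h_{\tau_L}$ in turn produces $x \to \tilde{x}_1 \to \tilde{x}_2 \to \dots \to \tilde{x}_L$ with rewards $r_1, r_2, \dots, r_L$, where $r_t = \mathrm{loss}(\tilde{x}_t) - \mathrm{loss}(\tilde{x}_{t-1})$ and $\tilde{x}_0 = x$, so that $\sum_{t=1}^{L} r_t = \mathrm{loss}(\tilde{x}_L) - \mathrm{loss}(x)$. Technical remark: the loss is defined as $\mathrm{loss}(x) = \log(1 - D(x))$ in the paper.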
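
The telescoping of the per-step rewards can be checked numerically. This is a small sketch only: the discriminator scores below are made-up values, and `loss` takes the score D(x) directly instead of an image.

```python
import math

def loss(d_score):
    """loss(x) = log(1 - D(x)), with D(x) passed in as a score in (0, 1)."""
    return math.log(1.0 - d_score)

# Hypothetical discriminator scores D(x), D(x~_1), D(x~_2), D(x~_3) for L = 3.
d_scores = [0.9, 0.8, 0.6, 0.3]
losses = [loss(d) for d in d_scores]

# r_t = loss(x~_t) - loss(x~_{t-1}), with x~_0 = x.
rewards = [losses[t] - losses[t - 1] for t in range(1, len(losses))]

# The intermediate terms cancel, leaving loss(x~_L) - loss(x).
total = sum(rewards)
expected = losses[-1] - losses[0]
```

Because the intermediate terms cancel, the summed reward depends only on the original and the fully transformed data point, while each individual step still receives its own reward signal.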
  • 16. Optimization of G by Policy Gradient: the final loss can be minimized by the policy gradient method, where π is the stochastic transition policy implicitly defined by G. Policy gradient method: 1. Generate samples (run the policy). 2. Estimate the return. 3. Improve the policy: $\theta \leftarrow \theta - \eta \nabla_\theta U(\theta)$.
  • 17. Generator (Policy) Model. Independent model (mean field model): learns a task-specific “accuracy” and “frequency” of each TF. State-based model (LSTM): some combinations of TFs might be very lossy (e.g. blur -> zoom, brighten -> saturation).
  • 18. Summary of Proposed Method: D measures whether data are realistic or not. G (mean field / LSTM) generates sequences of TFs of length L. G and D are trained adversarially, with a standard gradient ascent method for D and a policy gradient method for G.
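
The three policy gradient steps can be sketched for a mean-field generator as follows. This is a toy under stated assumptions, not the paper's implementation: the TF names, the fixed per-TF cost that stands in for the discriminator-based loss, the batch-mean baseline, and all hyperparameters are hypothetical.

```python
import math
import random

TFS = ["blur", "rotate", "flip"]                  # hypothetical TF names
COST = {"blur": 1.0, "rotate": 0.0, "flip": 0.0}  # toy stand-in for log(1 - D(.))
L = 3        # fixed sequence length
ETA = 0.5    # learning rate (eta)
BATCH = 64   # sequences sampled per update

def softmax(theta):
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]

def sequence_loss(seq):
    """Toy loss of a TF sequence: sum of the per-TF costs."""
    return sum(COST[TFS[i]] for i in seq)

def train(steps=200, seed=0):
    rng = random.Random(seed)
    theta = [0.0] * len(TFS)  # mean-field policy: one logit per TF
    for _ in range(steps):
        probs = softmax(theta)
        # 1. Generate samples: each step is drawn independently of the others.
        seqs = [rng.choices(range(len(TFS)), weights=probs, k=L)
                for _ in range(BATCH)]
        # 2. Estimate the return (here a loss to be minimized).
        losses = [sequence_loss(s) for s in seqs]
        baseline = sum(losses) / BATCH  # assumed variance-reduction baseline
        grad = [0.0] * len(TFS)
        for seq, value in zip(seqs, losses):
            for k in range(len(TFS)):
                # d/d theta_k of log pi(seq) for independent categorical steps.
                dlogp = seq.count(k) - L * probs[k]
                grad[k] += (value - baseline) * dlogp / BATCH
        # 3. Improve the policy: theta <- theta - eta * grad.
        theta = [t - ETA * g for t, g in zip(theta, grad)]
    return softmax(theta)

final_probs = train()
```

In this toy setup the probability of the costly TF (`final_probs[0]`) falls below its initial value of 1/3. An LSTM generator would replace the single logit vector with a state that depends on the TFs already chosen.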
  • 19. Agenda: 1. Introduction 2. Proposed Method 3. Results 4. Summary
  • 20. Datasets: MNIST; CIFAR-10; ACE corpus; Mammography Tumor-Classification Dataset (DDSM).
  • 21. Datasets (image datasets): MNIST and CIFAR-10.
  • 22. Datasets (ACE corpus): the goal is to identify mentions of employer-employee relations in news articles. Conditional word swap TF: 1. Construct a trigram language model. 2. Sample a word conditioned on the preceding words.
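
The two steps of the conditional word swap TF might look like the sketch below. The toy corpus, the function names, and the choice to return the sentence unchanged when no candidate exists are illustrative assumptions; the slides do not say how positions are chosen or how the language model is smoothed.

```python
import random
from collections import defaultdict

def build_trigram_model(sentences):
    """Step 1: map each pair of preceding words to the words observed after it."""
    model = defaultdict(list)
    for sent in sentences:
        tokens = sent.split()
        for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
            model[(a, b)].append(c)
    return model

def swap_word(tokens, position, model, rng):
    """Step 2: replace tokens[position] with a word sampled given the two
    preceding words; return an unchanged copy if there is no candidate."""
    if position < 2:
        return list(tokens)
    candidates = model.get((tokens[position - 2], tokens[position - 1]))
    if not candidates:
        return list(tokens)
    out = list(tokens)
    out[position] = rng.choice(candidates)
    return out

corpus = [  # hypothetical toy corpus
    "alice works for acme corp",
    "bob works for globex inc",
    "carol works for initech ltd",
]
model = build_trigram_model(corpus)
rng = random.Random(0)
swapped = swap_word("alice works for acme corp".split(), 3, model, rng)
```

Sampling proportionally to observed counts keeps the swapped word plausible in context, which is the property the discriminator later checks.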
  • 23. Datasets (DDSM dataset): standard image TFs, subselected so as not to break class-invariance. Segmentation-based TFs: 1. Segment the tumor mass. 2. Perform TFs (e.g. rotation or shifting). 3. Stitch it into a randomly-sampled benign tissue image.
  • 24. Results (CIFAR-10 classification). Basic: random crop. Heur.: random composition of TFs. + DS: allowing domain-specific TFs (semantic-segmentation-based).
  • 25. Results: TF frequency / sequence length.
  • 26. Results: training progress on MNIST. https://hazyresearch.github.io/snorkel/blog/tanda.html
  • 27. Summary: adversarial training for data augmentation; optimization with standard and policy gradient methods; achieved better performance on several datasets.