SlideShare a Scribd company logo
1 of 20
InfoGAN: Interpretable Representation Learning by
Information Maximizing Generative Adversarial Nets
Xi Chen, Yan Duan, Rein Houthooft, John Schulman,
Ilya Sutskever, Pieter Abbeel (UC Berkeley, Open AI)
Presenter: Shuhei M. Yoshida (Dept. of Physics, UTokyo)
Unsupervised learning of disentangled representations
Goal
GANs + Maximizing Mutual Information
between generated images and input codes
Approach
Benefit
Interpretable representation obtained
without supervision and substantial additional costs
Reference
https://arxiv.org/abs/1606.03657 (with Appendix sections)
Implementations
https://github.com/openai/InfoGAN (by the authors, with TensorFlow)
https://github.com/yoshum/InfoGAN (by the presenter, with Chainer)
NIPS2016読み会
Motivation
How can we achieve
unsupervised learning of disentangled representation?
In general, learned representation is entangled,
i.e. encoded in a data space in a complicated manner
When a representation is disentangled, it would be
more interpretable and easier to apply to tasks
Related works
• Unsupervised learning of representation
(no mechanism to force disentanglement)
Stacked (often denoising) autoencoder, RBM
Many others, including semi-supervised approach
• Supervised learning of disentangled representation
Bilinear models, multi-view perceptron
VAEs, adversarial autoencoders
• Weakly supervised learning of disentangled representation
disBM, DC-IGN
• Unsupervised learning of disentangled representation
hossRBM, applicable only to discrete latent factors
which the presenter has almost no knowledge about.
This work:
Unsupervised learning of disentangled representation
applicable to both continuous and discrete latent factors
Generative Adversarial Nets(GANs)
Generative model trained by competition between
two neural nets:
Generator 𝑥 = 𝐺 𝑧 , 𝑧 ∼ 𝑝 𝑧 𝑍
𝑝 𝑧 𝑍 : an arbitrary noise distribution
Discriminator 𝐷 𝑥 ∈ 0,1 :
probability that 𝑥 is sampled from the data dist. 𝑝data(𝑋)
rather than generated by the generator 𝐺 𝑧
min
𝐺
max
𝐷
𝑉GAN 𝐺, 𝐷 , where
𝑉GAN 𝐺, 𝐷 ≡ 𝐸 𝑥∼𝑝data 𝑋 ln 𝐷 𝑥 + 𝐸 𝑧∼𝑝 𝑧 𝑍 ln 1 − 𝐷 𝐺 𝑧
Optimization problem to solve:
Problems with GANs
From the perspective of representation learning:
No restrictions on how 𝐺 𝑧 uses 𝑧
• 𝑧 can be used in a highly entangled way
• Each dimension of 𝑧 does not represent
any salient feature of the training data
𝑧1
𝑧2
Proposed Resolution: InfoGAN
-Maximizing Mutual Information -
Observation in conventional GANs:
a generated date 𝑥 does not have much information
on the noise 𝑧 from which 𝑥 is generated
because of heavily entangled use of 𝑧
Proposed resolution = InfoGAN:
the generator 𝐺 𝑧, 𝑐 trained so that
it maximize the mutual information 𝐼 𝐶 𝑋 between
the latent code 𝐶 and the generated data 𝑋
min
𝐺
max
𝐷
𝑉GAN 𝐺, 𝐷 − 𝜆𝐼 𝐶 𝑋 = 𝐺 𝑍, 𝐶
Mutual Information
𝐼 𝑋; 𝑌 = 𝐻 𝑋 − 𝐻 𝑋 𝑌 , where
• 𝐻 𝑋 = 𝐸 𝑥∼𝑝 𝑋 − ln 𝑝 𝑋 = 𝑥 :
Entropy of the prior distribution
• 𝐻 𝑋 𝑌 = 𝐸 𝑦∼𝑝 𝑌 ,𝑥∼𝑝 𝑋|𝑌=𝑦 − ln 𝑝 𝑋 = 𝑥 𝑌 = 𝑦 :
Entropy of the posterior distribution
𝑝 𝑋 = 𝑥
𝑥
𝑝 𝑋 = 𝑥|𝑌 = 𝑦
𝑥
𝑝 𝑋 = 𝑥|𝑌 = 𝑦
𝑥
𝐼 𝑋; 𝑌 = 0 𝐼 𝑋; 𝑌 > 0
Sampling 𝑦 ∼ 𝑝 𝑌
Avoiding increase of calculation costs
Major difficulty:
Evaluation of 𝐼 𝐶 𝑋 based on
evaluation and sampling from the posterior 𝑝 𝐶 𝑋
Two strategies:
Variational maximization of mutual information
Use an approximate function 𝑄 𝑐 𝑥 = 𝑝 𝐶 = 𝑐 𝑋 = 𝑥
Sharing the neural net
between 𝑄 𝑐 𝑥 and the discriminator 𝐷 𝑥
Variational Maximization of MI
For an arbitrary function 𝑄 𝑐, 𝑥 ,
𝐸 𝑥∼𝑝 𝐺 𝑋 ,𝑐∼𝑝 𝐶|𝑋=𝑥 ln 𝑝 𝐶 = 𝑐 𝑋 = 𝑥
(∵ positivity of KL divergence)
= 𝐸 𝑥∼𝑝 𝐺 𝑋 ,𝑐∼𝑝 𝐶|𝑋=𝑥 ln 𝑄(𝑐, 𝑥) + 𝐸 𝑥∼𝑝 𝐺 𝑋 ,𝑐∼𝑝 𝐶|𝑋=𝑥 ln
𝑝 𝐶 = 𝑐 𝑋 = 𝑥
𝑄 𝑐, 𝑥
= 𝐸 𝑥∼𝑝 𝐺 𝑋 ,𝑐∼𝑝 𝐶|𝑋=𝑥 ln 𝑄(𝑐, 𝑥) + 𝐸 𝑥∼𝑝 𝐺 𝑋 𝐷KL 𝑝 𝐶 𝑋 = 𝑥 ||𝑄 𝐶, 𝑥
≥ 𝐸 𝑥∼𝑝 𝐺 𝑋 ,𝑐∼𝑝 𝐶|𝑋=𝑥 ln 𝑄(𝑐, 𝑥)
Variational Maximization of MI
Maximizing 𝐿𝐼 𝐺, 𝑄 w.r.t. 𝐺 and 𝑄
 With 𝑄(𝑐, 𝑥) approximating 𝑝 𝐶 = 𝑐 𝑋 = 𝑥 , we obtain
an variational estimate of the mutual information:
𝐿𝐼 𝐺, 𝑄 ≡ 𝐸 𝑥∼𝑝 𝐺 𝑋 ,𝑐∼𝑝 𝐶|𝑋=𝑥 ln 𝑄 𝑐, 𝑥 + 𝐻 𝐶
≲ 𝐼 𝐶 𝑋 = 𝐺 𝑍, 𝐶
⇔
• Achieving the equality by setting 𝑄 𝑐, 𝑥 = 𝑝 𝐶 = 𝑐 𝑋 = 𝑥
• Maximizing the mutual information
min
𝐺,𝑄
max
𝐷
𝑉GAN 𝐺, 𝐷 − 𝜆𝐿𝐼 𝐺, 𝑄
Optimization problem to solve in InfoGAN:
Eliminate sampling from posterior
Lemma
𝐸 𝑥∼𝑝 𝑋 ,𝑦∼𝑝 𝑌 𝑋=𝑥) 𝑓 𝑥, 𝑦 = 𝐸 𝑥∼𝑝 𝑋 ,𝑦∼𝑝 𝑌 𝑋=𝑥),𝑥′∼𝑝 𝑋′ 𝑌=𝑦) 𝑓 𝑥′
, 𝑦 .
𝐸 𝑥∼𝑝 𝐺 𝑋 ,𝑐∼𝑝 𝐶|𝑋=𝑥 ln 𝑄 𝑐, 𝑥
= 𝐸 𝑐∼𝑝 𝐶 ,𝑧∼𝑝 𝑧 𝑍 ,𝑥=𝐺 𝑧,𝑐 ,𝑐∼𝑝 𝐶|𝑋=𝑥 ln 𝑄 𝑐, 𝑥 ,
By using this lemma and noting that
𝐸 𝑥∼𝑝 𝐺 𝑋 ,𝑐∼𝑝 𝐶|𝑋=𝑥 ln 𝑄 𝑐, 𝑥 = 𝑬 𝒄∼𝒑 𝑪 ,𝒛∼𝒑 𝒛 𝒁 ,𝒙=𝑮 𝒛,𝒄 𝐥𝐧 𝑸 𝒄, 𝒙
we can eliminate the sampling from 𝑝 𝐶|𝑋 = 𝑥 :
Easy to estimate!
Proof of lemma
Lemma
𝐸 𝑥∼𝑝 𝑋 ,𝑦∼𝑝 𝑌 𝑋=𝑥) 𝑓 𝑥, 𝑦 = 𝐸 𝑥∼𝑝 𝑋 ,𝑦∼𝑝 𝑌 𝑋=𝑥),𝑥′∼𝑝 𝑋′ 𝑌=𝑦) 𝑓 𝑥′
, 𝑦 .
∵ l. h. s. =
𝑥 𝑦
𝑝 𝑋 = 𝑥 𝑝 𝑌 = 𝑦 𝑋 = 𝑥 𝑓 𝑥, 𝑦
=
𝑥 𝑦
𝑝 𝑌 = 𝑦 𝑝 𝑋 = 𝑥 𝑌 = 𝑦 𝑓 𝑥, 𝑦
=
𝑥 𝑦 𝑥′
𝑝 𝑋 = 𝑥′, 𝑌 = 𝑦 𝑝 𝑋 = 𝑥 𝑌 = 𝑦 𝑓 𝑥, 𝑦
=
𝑥 𝑦 𝑥′
𝑝 𝑋 = 𝑥′ 𝑝 𝑌 = 𝑦 𝑋 = 𝑥′ 𝑝 𝑋 = 𝑥 𝑌 = 𝑦 𝑓 𝑥, 𝑦
= r. h. s.
∵ Bayes’ theorem
Sharing layers between 𝐷 and 𝑄
Model 𝑄 𝑐, 𝑥 using neural network
Reduce the calculation costs by
sharing all the convolution layers with 𝐷
Image from Odena, et al., arXiv:1610.09585.
Convolution layers of the discriminator
𝐷 𝑄
Given DCGANs,
InfoGAN comes for negligible additional costs!
Experiment – MI Maximization
• InfoGAN on MNIST dataset
• Latent code 𝑐
= 10-class categorical code
𝐿𝐼 quickly saturates to
𝐻 𝑐 = ln 10 ∼ 2.3 in InfoGAN
Figure 1 in the original paper
Experiment
– Disentangled Representation –
Figure 2 in the original paper
• InfoGAN on MNIST dataset
• Latent codes
 𝑐1: 10-class categorical code
 𝑐2, 𝑐3: continuous code
 𝑐1 can be used as a
classifier with 5% error
rate.
 𝑐2 and 𝑐3 captured the
rotation and width,
respectively
Experiment
– Disentangled Representation –
Dataset: P. Paysan, et al., AVSS, 2009, pp. 296–301.
Figure 3 in the original paper
Experiment
– Disentangled Representation –
Dataset: M. Aubry, et al., CVPR, 2014, pp. 3762–3769.
InfoGAN learned salient features without supervision
Figure 4 in the original paper
Experiment
– Disentangled Representation –
Dataset: Street View House Number
Figure 5 in the original paper
Experiment
– Disentangled Representation –
Dataset: CelebA
Figure 6 in the original paper
Future Prospect and Conclusion
Mutual information maximization can be applied to
other methods, e.g. VAE
Learning hierarchical latent representation
Improving semi-supervised learning
High-dimentional data discovery
Unsupervised learning of disentangled representations
Goal
GANs + Maximizing Mutual Information
between generated images and input codes
Approach
Benefit
Interpretable representation obtained
without supervision and substantial additional costs

More Related Content

What's hot

Artificial Neural Networks (ANNs) - XOR - Step-By-Step
Artificial Neural Networks (ANNs) - XOR - Step-By-StepArtificial Neural Networks (ANNs) - XOR - Step-By-Step
Artificial Neural Networks (ANNs) - XOR - Step-By-StepAhmed Gad
 
Restricted Boltzmann Machine | Neural Network Tutorial | Deep Learning Tutori...
Restricted Boltzmann Machine | Neural Network Tutorial | Deep Learning Tutori...Restricted Boltzmann Machine | Neural Network Tutorial | Deep Learning Tutori...
Restricted Boltzmann Machine | Neural Network Tutorial | Deep Learning Tutori...Edureka!
 
Logistic Regression in Python | Logistic Regression Example | Machine Learnin...
Logistic Regression in Python | Logistic Regression Example | Machine Learnin...Logistic Regression in Python | Logistic Regression Example | Machine Learnin...
Logistic Regression in Python | Logistic Regression Example | Machine Learnin...Edureka!
 
Optimization/Gradient Descent
Optimization/Gradient DescentOptimization/Gradient Descent
Optimization/Gradient Descentkandelin
 
Gradient descent method
Gradient descent methodGradient descent method
Gradient descent methodSanghyuk Chun
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkKnoldus Inc.
 
Linear regression with gradient descent
Linear regression with gradient descentLinear regression with gradient descent
Linear regression with gradient descentSuraj Parmar
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkYan Xu
 
Introduction to CNN
Introduction to CNNIntroduction to CNN
Introduction to CNNShuai Zhang
 
Introduction to Autoencoders
Introduction to AutoencodersIntroduction to Autoencoders
Introduction to AutoencodersYan Xu
 
Pixel RNN to Pixel CNN++
Pixel RNN to Pixel CNN++Pixel RNN to Pixel CNN++
Pixel RNN to Pixel CNN++Dongheon Lee
 
Dimensionality Reduction
Dimensionality ReductionDimensionality Reduction
Dimensionality Reductionmrizwan969
 
Data Science - Part XII - Ridge Regression, LASSO, and Elastic Nets
Data Science - Part XII - Ridge Regression, LASSO, and Elastic NetsData Science - Part XII - Ridge Regression, LASSO, and Elastic Nets
Data Science - Part XII - Ridge Regression, LASSO, and Elastic NetsDerek Kane
 
Exploring Simple Siamese Representation Learning
Exploring Simple Siamese Representation LearningExploring Simple Siamese Representation Learning
Exploring Simple Siamese Representation LearningSungchul Kim
 
InfoGAN: Interpretable Representation Learning by Information Maximizing Gene...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gene...InfoGAN: Interpretable Representation Learning by Information Maximizing Gene...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gene...홍배 김
 

What's hot (20)

Artificial Neural Networks (ANNs) - XOR - Step-By-Step
Artificial Neural Networks (ANNs) - XOR - Step-By-StepArtificial Neural Networks (ANNs) - XOR - Step-By-Step
Artificial Neural Networks (ANNs) - XOR - Step-By-Step
 
K Nearest Neighbors
K Nearest NeighborsK Nearest Neighbors
K Nearest Neighbors
 
Restricted Boltzmann Machine | Neural Network Tutorial | Deep Learning Tutori...
Restricted Boltzmann Machine | Neural Network Tutorial | Deep Learning Tutori...Restricted Boltzmann Machine | Neural Network Tutorial | Deep Learning Tutori...
Restricted Boltzmann Machine | Neural Network Tutorial | Deep Learning Tutori...
 
Logistic Regression in Python | Logistic Regression Example | Machine Learnin...
Logistic Regression in Python | Logistic Regression Example | Machine Learnin...Logistic Regression in Python | Logistic Regression Example | Machine Learnin...
Logistic Regression in Python | Logistic Regression Example | Machine Learnin...
 
Lstm
LstmLstm
Lstm
 
Optimization/Gradient Descent
Optimization/Gradient DescentOptimization/Gradient Descent
Optimization/Gradient Descent
 
Gradient descent method
Gradient descent methodGradient descent method
Gradient descent method
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural Network
 
Support Vector Machine
Support Vector MachineSupport Vector Machine
Support Vector Machine
 
Linear regression with gradient descent
Linear regression with gradient descentLinear regression with gradient descent
Linear regression with gradient descent
 
Gradient descent method
Gradient descent methodGradient descent method
Gradient descent method
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural Network
 
Introduction to CNN
Introduction to CNNIntroduction to CNN
Introduction to CNN
 
Introduction to Autoencoders
Introduction to AutoencodersIntroduction to Autoencoders
Introduction to Autoencoders
 
Pixel RNN to Pixel CNN++
Pixel RNN to Pixel CNN++Pixel RNN to Pixel CNN++
Pixel RNN to Pixel CNN++
 
Dimensionality Reduction
Dimensionality ReductionDimensionality Reduction
Dimensionality Reduction
 
Data Science - Part XII - Ridge Regression, LASSO, and Elastic Nets
Data Science - Part XII - Ridge Regression, LASSO, and Elastic NetsData Science - Part XII - Ridge Regression, LASSO, and Elastic Nets
Data Science - Part XII - Ridge Regression, LASSO, and Elastic Nets
 
Exploring Simple Siamese Representation Learning
Exploring Simple Siamese Representation LearningExploring Simple Siamese Representation Learning
Exploring Simple Siamese Representation Learning
 
StarGAN
StarGANStarGAN
StarGAN
 
InfoGAN: Interpretable Representation Learning by Information Maximizing Gene...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gene...InfoGAN: Interpretable Representation Learning by Information Maximizing Gene...
InfoGAN: Interpretable Representation Learning by Information Maximizing Gene...
 

Viewers also liked

Introduction of "TrailBlazer" algorithm
Introduction of "TrailBlazer" algorithmIntroduction of "TrailBlazer" algorithm
Introduction of "TrailBlazer" algorithmKatsuki Ohto
 
時系列データ3
時系列データ3時系列データ3
時系列データ3graySpace999
 
Fast and Probvably Seedings for k-Means
Fast and Probvably Seedings for k-MeansFast and Probvably Seedings for k-Means
Fast and Probvably Seedings for k-MeansKimikazu Kato
 
Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Toru Fujino
 
Interaction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and PhysicsInteraction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and PhysicsKen Kuroki
 
Safe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement LearningSafe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement Learningmooopan
 
Conditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN DecodersConditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN Decoderssuga93
 
Learning to learn by gradient descent by gradient descent
Learning to learn by gradient descent by gradient descentLearning to learn by gradient descent by gradient descent
Learning to learn by gradient descent by gradient descentHiroyuki Fukuda
 
Introduction of “Fairness in Learning: Classic and Contextual Bandits”
Introduction of “Fairness in Learning: Classic and Contextual Bandits”Introduction of “Fairness in Learning: Classic and Contextual Bandits”
Introduction of “Fairness in Learning: Classic and Contextual Bandits”Kazuto Fukuchi
 
Improving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive FlowImproving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive FlowTatsuya Shirakawa
 
[DL輪読会]Convolutional Sequence to Sequence Learning
[DL輪読会]Convolutional Sequence to Sequence Learning[DL輪読会]Convolutional Sequence to Sequence Learning
[DL輪読会]Convolutional Sequence to Sequence LearningDeep Learning JP
 
NIPS 2016 Overview and Deep Learning Topics
NIPS 2016 Overview and Deep Learning Topics  NIPS 2016 Overview and Deep Learning Topics
NIPS 2016 Overview and Deep Learning Topics Koichi Hamada
 
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...Kusano Hitoshi
 
Differential privacy without sensitivity [NIPS2016読み会資料]
Differential privacy without sensitivity [NIPS2016読み会資料]Differential privacy without sensitivity [NIPS2016読み会資料]
Differential privacy without sensitivity [NIPS2016読み会資料]Kentaro Minami
 
Matching networks for one shot learning
Matching networks for one shot learningMatching networks for one shot learning
Matching networks for one shot learningKazuki Fujikawa
 
ICML2016読み会 概要紹介
ICML2016読み会 概要紹介ICML2016読み会 概要紹介
ICML2016読み会 概要紹介Kohei Hayashi
 
論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural Networks論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural NetworksSeiya Tokui
 

Viewers also liked (18)

Introduction of "TrailBlazer" algorithm
Introduction of "TrailBlazer" algorithmIntroduction of "TrailBlazer" algorithm
Introduction of "TrailBlazer" algorithm
 
時系列データ3
時系列データ3時系列データ3
時系列データ3
 
Value iteration networks
Value iteration networksValue iteration networks
Value iteration networks
 
Fast and Probvably Seedings for k-Means
Fast and Probvably Seedings for k-MeansFast and Probvably Seedings for k-Means
Fast and Probvably Seedings for k-Means
 
Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)Dual Learning for Machine Translation (NIPS 2016)
Dual Learning for Machine Translation (NIPS 2016)
 
Interaction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and PhysicsInteraction Networks for Learning about Objects, Relations and Physics
Interaction Networks for Learning about Objects, Relations and Physics
 
Safe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement LearningSafe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement Learning
 
Conditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN DecodersConditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN Decoders
 
Learning to learn by gradient descent by gradient descent
Learning to learn by gradient descent by gradient descentLearning to learn by gradient descent by gradient descent
Learning to learn by gradient descent by gradient descent
 
Introduction of “Fairness in Learning: Classic and Contextual Bandits”
Introduction of “Fairness in Learning: Classic and Contextual Bandits”Introduction of “Fairness in Learning: Classic and Contextual Bandits”
Introduction of “Fairness in Learning: Classic and Contextual Bandits”
 
Improving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive FlowImproving Variational Inference with Inverse Autoregressive Flow
Improving Variational Inference with Inverse Autoregressive Flow
 
[DL輪読会]Convolutional Sequence to Sequence Learning
[DL輪読会]Convolutional Sequence to Sequence Learning[DL輪読会]Convolutional Sequence to Sequence Learning
[DL輪読会]Convolutional Sequence to Sequence Learning
 
NIPS 2016 Overview and Deep Learning Topics
NIPS 2016 Overview and Deep Learning Topics  NIPS 2016 Overview and Deep Learning Topics
NIPS 2016 Overview and Deep Learning Topics
 
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
論文紹介 Combining Model-Based and Model-Free Updates for Trajectory-Centric Rein...
 
Differential privacy without sensitivity [NIPS2016読み会資料]
Differential privacy without sensitivity [NIPS2016読み会資料]Differential privacy without sensitivity [NIPS2016読み会資料]
Differential privacy without sensitivity [NIPS2016読み会資料]
 
Matching networks for one shot learning
Matching networks for one shot learningMatching networks for one shot learning
Matching networks for one shot learning
 
ICML2016読み会 概要紹介
ICML2016読み会 概要紹介ICML2016読み会 概要紹介
ICML2016読み会 概要紹介
 
論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural Networks論文紹介 Pixel Recurrent Neural Networks
論文紹介 Pixel Recurrent Neural Networks
 

Similar to Unsupervised Learning of Disentangled Representations

20230213_ComputerVision_연구.pptx
20230213_ComputerVision_연구.pptx20230213_ComputerVision_연구.pptx
20230213_ComputerVision_연구.pptxssuser7807522
 
QTML2021 UAP Quantum Feature Map
QTML2021 UAP Quantum Feature MapQTML2021 UAP Quantum Feature Map
QTML2021 UAP Quantum Feature MapHa Phuong
 
Deep Residual Hashing Neural Network for Image Retrieval
Deep Residual Hashing Neural Network for Image RetrievalDeep Residual Hashing Neural Network for Image Retrieval
Deep Residual Hashing Neural Network for Image RetrievalEdwin Efraín Jiménez Lepe
 
Non parametric bayesian learning in discrete data
Non parametric bayesian learning in discrete dataNon parametric bayesian learning in discrete data
Non parametric bayesian learning in discrete dataYueshen Xu
 
[GAN by Hung-yi Lee]Part 1: General introduction of GAN
[GAN by Hung-yi Lee]Part 1: General introduction of GAN[GAN by Hung-yi Lee]Part 1: General introduction of GAN
[GAN by Hung-yi Lee]Part 1: General introduction of GANNAVER Engineering
 
Patch-wise Deep Metric Learning for Unsupervised Low-Dose CT Denoising
Patch-wise Deep Metric Learning for Unsupervised Low-Dose CT DenoisingPatch-wise Deep Metric Learning for Unsupervised Low-Dose CT Denoising
Patch-wise Deep Metric Learning for Unsupervised Low-Dose CT Denoisingssuser886075
 
Learning a nonlinear embedding by preserving class neibourhood structure 최종
Learning a nonlinear embedding by preserving class neibourhood structure   최종Learning a nonlinear embedding by preserving class neibourhood structure   최종
Learning a nonlinear embedding by preserving class neibourhood structure 최종WooSung Choi
 
Deep learning lecture - part 1 (basics, CNN)
Deep learning lecture - part 1 (basics, CNN)Deep learning lecture - part 1 (basics, CNN)
Deep learning lecture - part 1 (basics, CNN)SungminYou
 
Tutorial Equivariance in Imaging ICMS 23.pptx
Tutorial Equivariance in Imaging ICMS 23.pptxTutorial Equivariance in Imaging ICMS 23.pptx
Tutorial Equivariance in Imaging ICMS 23.pptxJulián Tachella
 
GDC2019 - SEED - Towards Deep Generative Models in Game Development
GDC2019 - SEED - Towards Deep Generative Models in Game DevelopmentGDC2019 - SEED - Towards Deep Generative Models in Game Development
GDC2019 - SEED - Towards Deep Generative Models in Game DevelopmentElectronic Arts / DICE
 
Compressed Sensing using Generative Model
Compressed Sensing using Generative ModelCompressed Sensing using Generative Model
Compressed Sensing using Generative Modelkenluck2001
 
Koh_Liang_ICML2017
Koh_Liang_ICML2017Koh_Liang_ICML2017
Koh_Liang_ICML2017Masa Kato
 
Histogram-Based Method for Effective Initialization of the K-Means Clustering...
Histogram-Based Method for Effective Initialization of the K-Means Clustering...Histogram-Based Method for Effective Initialization of the K-Means Clustering...
Histogram-Based Method for Effective Initialization of the K-Means Clustering...Gingles Caroline
 
Artificial Neural Networks-Supervised Learning Models
Artificial Neural Networks-Supervised Learning ModelsArtificial Neural Networks-Supervised Learning Models
Artificial Neural Networks-Supervised Learning ModelsDrBaljitSinghKhehra
 
Artificial Neural Networks-Supervised Learning Models
Artificial Neural Networks-Supervised Learning ModelsArtificial Neural Networks-Supervised Learning Models
Artificial Neural Networks-Supervised Learning ModelsDrBaljitSinghKhehra
 
Artificial Neural Networks-Supervised Learning Models
Artificial Neural Networks-Supervised Learning ModelsArtificial Neural Networks-Supervised Learning Models
Artificial Neural Networks-Supervised Learning ModelsDrBaljitSinghKhehra
 
Mncs 16-10-1주-변승규-introduction to the machine learning #2
Mncs 16-10-1주-변승규-introduction to the machine learning #2Mncs 16-10-1주-변승규-introduction to the machine learning #2
Mncs 16-10-1주-변승규-introduction to the machine learning #2Seung-gyu Byeon
 
Paper review: Measuring the Intrinsic Dimension of Objective Landscapes.
Paper review: Measuring the Intrinsic Dimension of Objective Landscapes.Paper review: Measuring the Intrinsic Dimension of Objective Landscapes.
Paper review: Measuring the Intrinsic Dimension of Objective Landscapes.Wuhyun Rico Shin
 

Similar to Unsupervised Learning of Disentangled Representations (20)

20230213_ComputerVision_연구.pptx
20230213_ComputerVision_연구.pptx20230213_ComputerVision_연구.pptx
20230213_ComputerVision_연구.pptx
 
QTML2021 UAP Quantum Feature Map
QTML2021 UAP Quantum Feature MapQTML2021 UAP Quantum Feature Map
QTML2021 UAP Quantum Feature Map
 
Deep Residual Hashing Neural Network for Image Retrieval
Deep Residual Hashing Neural Network for Image RetrievalDeep Residual Hashing Neural Network for Image Retrieval
Deep Residual Hashing Neural Network for Image Retrieval
 
Non parametric bayesian learning in discrete data
Non parametric bayesian learning in discrete dataNon parametric bayesian learning in discrete data
Non parametric bayesian learning in discrete data
 
[GAN by Hung-yi Lee]Part 1: General introduction of GAN
[GAN by Hung-yi Lee]Part 1: General introduction of GAN[GAN by Hung-yi Lee]Part 1: General introduction of GAN
[GAN by Hung-yi Lee]Part 1: General introduction of GAN
 
03 Data Mining Techniques
03 Data Mining Techniques03 Data Mining Techniques
03 Data Mining Techniques
 
Patch-wise Deep Metric Learning for Unsupervised Low-Dose CT Denoising
Patch-wise Deep Metric Learning for Unsupervised Low-Dose CT DenoisingPatch-wise Deep Metric Learning for Unsupervised Low-Dose CT Denoising
Patch-wise Deep Metric Learning for Unsupervised Low-Dose CT Denoising
 
Learning a nonlinear embedding by preserving class neibourhood structure 최종
Learning a nonlinear embedding by preserving class neibourhood structure   최종Learning a nonlinear embedding by preserving class neibourhood structure   최종
Learning a nonlinear embedding by preserving class neibourhood structure 최종
 
Deep learning lecture - part 1 (basics, CNN)
Deep learning lecture - part 1 (basics, CNN)Deep learning lecture - part 1 (basics, CNN)
Deep learning lecture - part 1 (basics, CNN)
 
Tutorial Equivariance in Imaging ICMS 23.pptx
Tutorial Equivariance in Imaging ICMS 23.pptxTutorial Equivariance in Imaging ICMS 23.pptx
Tutorial Equivariance in Imaging ICMS 23.pptx
 
GDC2019 - SEED - Towards Deep Generative Models in Game Development
GDC2019 - SEED - Towards Deep Generative Models in Game DevelopmentGDC2019 - SEED - Towards Deep Generative Models in Game Development
GDC2019 - SEED - Towards Deep Generative Models in Game Development
 
Compressed Sensing using Generative Model
Compressed Sensing using Generative ModelCompressed Sensing using Generative Model
Compressed Sensing using Generative Model
 
Koh_Liang_ICML2017
Koh_Liang_ICML2017Koh_Liang_ICML2017
Koh_Liang_ICML2017
 
Histogram-Based Method for Effective Initialization of the K-Means Clustering...
Histogram-Based Method for Effective Initialization of the K-Means Clustering...Histogram-Based Method for Effective Initialization of the K-Means Clustering...
Histogram-Based Method for Effective Initialization of the K-Means Clustering...
 
그림 그리는 AI
그림 그리는 AI그림 그리는 AI
그림 그리는 AI
 
Artificial Neural Networks-Supervised Learning Models
Artificial Neural Networks-Supervised Learning ModelsArtificial Neural Networks-Supervised Learning Models
Artificial Neural Networks-Supervised Learning Models
 
Artificial Neural Networks-Supervised Learning Models
Artificial Neural Networks-Supervised Learning ModelsArtificial Neural Networks-Supervised Learning Models
Artificial Neural Networks-Supervised Learning Models
 
Artificial Neural Networks-Supervised Learning Models
Artificial Neural Networks-Supervised Learning ModelsArtificial Neural Networks-Supervised Learning Models
Artificial Neural Networks-Supervised Learning Models
 
Mncs 16-10-1주-변승규-introduction to the machine learning #2
Mncs 16-10-1주-변승규-introduction to the machine learning #2Mncs 16-10-1주-변승규-introduction to the machine learning #2
Mncs 16-10-1주-변승규-introduction to the machine learning #2
 
Paper review: Measuring the Intrinsic Dimension of Objective Landscapes.
Paper review: Measuring the Intrinsic Dimension of Objective Landscapes.Paper review: Measuring the Intrinsic Dimension of Objective Landscapes.
Paper review: Measuring the Intrinsic Dimension of Objective Landscapes.
 

Recently uploaded

DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 

Recently uploaded (20)

DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 

Unsupervised Learning of Disentangled Representations

  • 1. InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, Pieter Abbeel (UC Berkeley, Open AI) Presenter: Shuhei M. Yoshida (Dept. of Physics, UTokyo) Unsupervised learning of disentangled representations Goal GANs + Maximizing Mutual Information between generated images and input codes Approach Benefit Interpretable representation obtained without supervision and substantial additional costs Reference https://arxiv.org/abs/1606.03657 (with Appendix sections) Implementations https://github.com/openai/InfoGAN (by the authors, with TensorFlow) https://github.com/yoshum/InfoGAN (by the presenter, with Chainer) NIPS2016読み会
  • 2. Motivation How can we achieve unsupervised learning of disentangled representation? In general, learned representation is entangled, i.e. encoded in a data space in a complicated manner When a representation is disentangled, it would be more interpretable and easier to apply to tasks
  • 3. Related works • Unsupervised learning of representation (no mechanism to force disentanglement) Stacked (often denoising) autoencoder, RBM Many others, including semi-supervised approach • Supervised learning of disentangled representation Bilinear models, multi-view perceptron VAEs, adversarial autoencoders • Weakly supervised learning of disentangled representation disBM, DC-IGN • Unsupervised learning of disentangled representation hossRBM, applicable only to discrete latent factors which the presenter has almost no knowledge about. This work: Unsupervised learning of disentangled representation applicable to both continuous and discrete latent factors
  • 4. Generative Adversarial Nets(GANs) Generative model trained by competition between two neural nets: Generator 𝑥 = 𝐺 𝑧 , 𝑧 ∼ 𝑝 𝑧 𝑍 𝑝 𝑧 𝑍 : an arbitrary noise distribution Discriminator 𝐷 𝑥 ∈ 0,1 : probability that 𝑥 is sampled from the data dist. 𝑝data(𝑋) rather than generated by the generator 𝐺 𝑧 min 𝐺 max 𝐷 𝑉GAN 𝐺, 𝐷 , where 𝑉GAN 𝐺, 𝐷 ≡ 𝐸 𝑥∼𝑝data 𝑋 ln 𝐷 𝑥 + 𝐸 𝑧∼𝑝 𝑧 𝑍 ln 1 − 𝐷 𝐺 𝑧 Optimization problem to solve:
  • 5. Problems with GANs From the perspective of representation learning: No restrictions on how 𝐺 𝑧 uses 𝑧 • 𝑧 can be used in a highly entangled way • Each dimension of 𝑧 does not represent any salient feature of the training data 𝑧1 𝑧2
  • 6. Proposed Resolution: InfoGAN -Maximizing Mutual Information - Observation in conventional GANs: a generated date 𝑥 does not have much information on the noise 𝑧 from which 𝑥 is generated because of heavily entangled use of 𝑧 Proposed resolution = InfoGAN: the generator 𝐺 𝑧, 𝑐 trained so that it maximize the mutual information 𝐼 𝐶 𝑋 between the latent code 𝐶 and the generated data 𝑋 min 𝐺 max 𝐷 𝑉GAN 𝐺, 𝐷 − 𝜆𝐼 𝐶 𝑋 = 𝐺 𝑍, 𝐶
  • 7. Mutual Information 𝐼 𝑋; 𝑌 = 𝐻 𝑋 − 𝐻 𝑋 𝑌 , where • 𝐻 𝑋 = 𝐸 𝑥∼𝑝 𝑋 − ln 𝑝 𝑋 = 𝑥 : Entropy of the prior distribution • 𝐻 𝑋 𝑌 = 𝐸 𝑦∼𝑝 𝑌 ,𝑥∼𝑝 𝑋|𝑌=𝑦 − ln 𝑝 𝑋 = 𝑥 𝑌 = 𝑦 : Entropy of the posterior distribution 𝑝 𝑋 = 𝑥 𝑥 𝑝 𝑋 = 𝑥|𝑌 = 𝑦 𝑥 𝑝 𝑋 = 𝑥|𝑌 = 𝑦 𝑥 𝐼 𝑋; 𝑌 = 0 𝐼 𝑋; 𝑌 > 0 Sampling 𝑦 ∼ 𝑝 𝑌
  • 8. Avoiding increase of calculation costs Major difficulty: Evaluation of 𝐼 𝐶 𝑋 based on evaluation and sampling from the posterior 𝑝 𝐶 𝑋 Two strategies: Variational maximization of mutual information Use an approximate function 𝑄 𝑐 𝑥 = 𝑝 𝐶 = 𝑐 𝑋 = 𝑥 Sharing the neural net between 𝑄 𝑐 𝑥 and the discriminator 𝐷 𝑥
  • 9. Variational Maximization of MI For an arbitrary function 𝑄 𝑐, 𝑥 , 𝐸 𝑥∼𝑝 𝐺 𝑋 ,𝑐∼𝑝 𝐶|𝑋=𝑥 ln 𝑝 𝐶 = 𝑐 𝑋 = 𝑥 (∵ positivity of KL divergence) = 𝐸 𝑥∼𝑝 𝐺 𝑋 ,𝑐∼𝑝 𝐶|𝑋=𝑥 ln 𝑄(𝑐, 𝑥) + 𝐸 𝑥∼𝑝 𝐺 𝑋 ,𝑐∼𝑝 𝐶|𝑋=𝑥 ln 𝑝 𝐶 = 𝑐 𝑋 = 𝑥 𝑄 𝑐, 𝑥 = 𝐸 𝑥∼𝑝 𝐺 𝑋 ,𝑐∼𝑝 𝐶|𝑋=𝑥 ln 𝑄(𝑐, 𝑥) + 𝐸 𝑥∼𝑝 𝐺 𝑋 𝐷KL 𝑝 𝐶 𝑋 = 𝑥 ||𝑄 𝐶, 𝑥 ≥ 𝐸 𝑥∼𝑝 𝐺 𝑋 ,𝑐∼𝑝 𝐶|𝑋=𝑥 ln 𝑄(𝑐, 𝑥)
  • 10. Variational Maximization of MI Maximizing 𝐿𝐼 𝐺, 𝑄 w.r.t. 𝐺 and 𝑄  With 𝑄(𝑐, 𝑥) approximating 𝑝 𝐶 = 𝑐 𝑋 = 𝑥 , we obtain an variational estimate of the mutual information: 𝐿𝐼 𝐺, 𝑄 ≡ 𝐸 𝑥∼𝑝 𝐺 𝑋 ,𝑐∼𝑝 𝐶|𝑋=𝑥 ln 𝑄 𝑐, 𝑥 + 𝐻 𝐶 ≲ 𝐼 𝐶 𝑋 = 𝐺 𝑍, 𝐶 ⇔ • Achieving the equality by setting 𝑄 𝑐, 𝑥 = 𝑝 𝐶 = 𝑐 𝑋 = 𝑥 • Maximizing the mutual information min 𝐺,𝑄 max 𝐷 𝑉GAN 𝐺, 𝐷 − 𝜆𝐿𝐼 𝐺, 𝑄 Optimization problem to solve in InfoGAN:
  • 11. Eliminate sampling from posterior Lemma 𝐸 𝑥∼𝑝 𝑋 ,𝑦∼𝑝 𝑌 𝑋=𝑥) 𝑓 𝑥, 𝑦 = 𝐸 𝑥∼𝑝 𝑋 ,𝑦∼𝑝 𝑌 𝑋=𝑥),𝑥′∼𝑝 𝑋′ 𝑌=𝑦) 𝑓 𝑥′ , 𝑦 . 𝐸 𝑥∼𝑝 𝐺 𝑋 ,𝑐∼𝑝 𝐶|𝑋=𝑥 ln 𝑄 𝑐, 𝑥 = 𝐸 𝑐∼𝑝 𝐶 ,𝑧∼𝑝 𝑧 𝑍 ,𝑥=𝐺 𝑧,𝑐 ,𝑐∼𝑝 𝐶|𝑋=𝑥 ln 𝑄 𝑐, 𝑥 , By using this lemma and noting that 𝐸 𝑥∼𝑝 𝐺 𝑋 ,𝑐∼𝑝 𝐶|𝑋=𝑥 ln 𝑄 𝑐, 𝑥 = 𝑬 𝒄∼𝒑 𝑪 ,𝒛∼𝒑 𝒛 𝒁 ,𝒙=𝑮 𝒛,𝒄 𝐥𝐧 𝑸 𝒄, 𝒙 we can eliminate the sampling from 𝑝 𝐶|𝑋 = 𝑥 : Easy to estimate!
  • 12. Proof of lemma Lemma 𝐸 𝑥∼𝑝 𝑋 ,𝑦∼𝑝 𝑌 𝑋=𝑥) 𝑓 𝑥, 𝑦 = 𝐸 𝑥∼𝑝 𝑋 ,𝑦∼𝑝 𝑌 𝑋=𝑥),𝑥′∼𝑝 𝑋′ 𝑌=𝑦) 𝑓 𝑥′ , 𝑦 . ∵ l. h. s. = 𝑥 𝑦 𝑝 𝑋 = 𝑥 𝑝 𝑌 = 𝑦 𝑋 = 𝑥 𝑓 𝑥, 𝑦 = 𝑥 𝑦 𝑝 𝑌 = 𝑦 𝑝 𝑋 = 𝑥 𝑌 = 𝑦 𝑓 𝑥, 𝑦 = 𝑥 𝑦 𝑥′ 𝑝 𝑋 = 𝑥′, 𝑌 = 𝑦 𝑝 𝑋 = 𝑥 𝑌 = 𝑦 𝑓 𝑥, 𝑦 = 𝑥 𝑦 𝑥′ 𝑝 𝑋 = 𝑥′ 𝑝 𝑌 = 𝑦 𝑋 = 𝑥′ 𝑝 𝑋 = 𝑥 𝑌 = 𝑦 𝑓 𝑥, 𝑦 = r. h. s. ∵ Bayes’ theorem
  • 13. Sharing layers between 𝐷 and 𝑄 Model 𝑄 𝑐, 𝑥 using neural network Reduce the calculation costs by sharing all the convolution layers with 𝐷 Image from Odena, et al., arXiv:1610.09585. Convolution layers of the discriminator 𝐷 𝑄 Given DCGANs, InfoGAN comes for negligible additional costs!
  • 14. Experiment – MI Maximization • InfoGAN on MNIST dataset • Latent code 𝑐 = 10-class categorical code 𝐿𝐼 quickly saturates to 𝐻 𝑐 = ln 10 ∼ 2.3 in InfoGAN Figure 1 in the original paper
  • 15. Experiment – Disentangled Representation – Figure 2 in the original paper • InfoGAN on MNIST dataset • Latent codes  𝑐1: 10-class categorical code  𝑐2, 𝑐3: continuous code  𝑐1 can be used as a classifier with 5% error rate.  𝑐2 and 𝑐3 captured the rotation and width, respectively
  • 16. Experiment – Disentangled Representation – Dataset: P. Paysan, et al., AVSS, 2009, pp. 296–301. Figure 3 in the original paper
  • 17. Experiment – Disentangled Representation – Dataset: M. Aubry, et al., CVPR, 2014, pp. 3762–3769. InfoGAN learned salient features without supervision Figure 4 in the original paper
  • 18. Experiment – Disentangled Representation – Dataset: Street View House Number Figure 5 in the original paper
  • 19. Experiment – Disentangled Representation – Dataset: CelebA Figure 6 in the original paper
  • 20. Future Prospect and Conclusion Mutual information maximization can be applied to other methods, e.g. VAE Learning hierarchical latent representation Improving semi-supervised learning High-dimentional data discovery Unsupervised learning of disentangled representations Goal GANs + Maximizing Mutual Information between generated images and input codes Approach Benefit Interpretable representation obtained without supervision and substantial additional costs