Google confidential | Do not distribute
DCGAN - How does it work?
Etsuji Nakai
Cloud Solutions Architect at Google
2016/09/26 ver1.1 GIF Animation
https://goo.gl/zXL1bV
$ who am i
▪Etsuji Nakai
Cloud Solutions Architect at Google
Twitter @enakai00
Now on Sale!
What is DCGAN?
▪ DCGAN: Deep Convolutional Generative Adversarial Networks
● It works in the opposite direction of the image classifier (CNN).
● CNN transforms an image to a class label (list of probabilities).
● DCGAN generates an image from random parameters.
[Figure: a CNN maps an image to a list of probabilities, one per class, e.g. (0.01, 0.05, 0.91, 0.02, ...) for (deer, dog, cat, human, ...). A DCGAN maps a list of random parameters, e.g. (0.01, 0.05, 0.91, 0.02, ...), to an image. What do these numbers mean?]
Examples of Convolutional Filters
▪ A convolutional filter is just an image filter of the kind you sometimes apply in Photoshop!
[Figure: a filter to blur images and a filter to extract vertical edges.]
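As a rough sketch of the two filters named above, the following pure-Python code slides a 3x3 filter over a small grayscale image. The function name `convolve2d`, the filter values, and the toy image are illustrative assumptions, not taken from the slides.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products.

    Like most CNN libraries, this does not flip the kernel.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]

# Assumed filter values: a 3x3 box blur and a simple vertical-edge detector.
BLUR = [[1 / 9 for _ in range(3)] for _ in range(3)]
VERTICAL_EDGE = [[-1, 0, 1] for _ in range(3)]

# Toy 4x6 image: dark (0) on the left, bright (1) on the right.
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

blurred = convolve2d(image, BLUR)          # the sharp step becomes a ramp
edges = convolve2d(image, VERTICAL_EDGE)   # non-zero only around the step

print(edges)
```

The edge filter responds only where the brightness changes from left to right, while the blur filter spreads the step over neighbouring pixels.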
Convolutional Filters in CNN
▪ A CNN applies many filters to extract various features from a single image.
▪ A CNN applies filters in multiple layers to a single image (to extract features of features).
▪ Each filtered image is made smaller to drop unnecessary details.
[Figure: extracting vertical and horizontal edges using two filters.]
Convolutional Filters in CNN
▪ This shows how filters are applied to a multi-layered image.
● Apply an independent filter to each layer of the input image.
● Sum up the resulting images from each layer.
[Figure: Filter A turns the input image into output image A; Filter B turns it into output image B.]
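The two steps on this slide (filter each layer independently, then sum the results) can be sketched as follows. The helper names and the toy 3x3 layers and 2x2 filters are assumptions chosen to keep the arithmetic easy to follow.

```python
def filter_layer(layer, kernel):
    """Apply one 2D filter to one layer (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(layer[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(len(layer[0]) - kw + 1)
        ]
        for y in range(len(layer) - kh + 1)
    ]

def filter_multilayer(layers, kernels):
    """Filter each layer with its own kernel, then sum the results pixel-wise."""
    filtered = [filter_layer(l, k) for l, k in zip(layers, kernels)]
    height, width = len(filtered[0]), len(filtered[0][0])
    return [[sum(f[y][x] for f in filtered) for x in range(width)]
            for y in range(height)]

# Toy two-layer input image (think of two colour channels).
layers = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],
]

# "Filter A" and "Filter B" each hold one 2x2 kernel per input layer.
filter_a = [[[1, 0], [0, 0]], [[1, 1], [1, 1]]]
filter_b = [[[0, 0], [0, 1]], [[-1, 0], [0, 0]]]

output_a = filter_multilayer(layers, filter_a)
output_b = filter_multilayer(layers, filter_b)
print(output_a, output_b)
```

Each filter produces exactly one output image, so a stage with N filters produces N output layers regardless of how many layers the input has.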
Typical CNN Filtering Layers
http://arxiv.org/abs/1511.06434
[Figure: RGB layers of a single 64x64 image → 128 layers of 32x32 images → 256 layers of 16x16 images → ... → a list of probabilities.]
▪ Starting from a single RGB image on the right, multiple filtering layers are applied
to produce smaller (and more) images.
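The halving of the image size at each stage follows from the usual convolution output-size formula. The sketch below assumes a 4x4 filter with stride 2 and padding 1; the slides do not state these values, and other settings (such as a 5x5 filter with padding 2) give the same halving.

```python
def conv_output_size(size, kernel=4, stride=2, padding=1):
    """Output width/height of a strided convolution (assumed hyperparameters)."""
    return (size + 2 * padding - kernel) // stride + 1

# Start from a 64x64 image and keep filtering until the image is 4x4.
sizes = [64]
while sizes[-1] > 4:
    sizes.append(conv_output_size(sizes[-1]))

print(sizes)
```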
Image Generation Flow of DCGAN
http://arxiv.org/abs/1511.06434
[Figure: a list of random numbers → 1024 layers of 4x4 images → 512 layers of 8x8 images → ... → RGB layers of a single 64x64 image.]
▪ Basically, it's just flipping the direction. No magic!
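Flipping the direction also flips the size arithmetic. As a sketch, the output size of a transposed convolution can be computed as below, again assuming a 4x4 filter with stride 2 and padding 1 (values not given in the slides).

```python
def transposed_conv_output_size(size, kernel=4, stride=2, padding=1):
    """Output width/height of a transposed convolution (assumed hyperparameters)."""
    return (size - 1) * stride - 2 * padding + kernel

# Start from 4x4 images and keep enlarging until the image is 64x64.
sizes = [4]
while sizes[-1] < 64:
    sizes.append(transposed_conv_output_size(sizes[-1]))

print(sizes)
```

With these settings each stage exactly doubles the width and height, mirroring the halving on the CNN side.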
Illustration of Convolution Operations
▪ Convolutional filters in a CNN and transposed-convolutional filters in a DCGAN work
in opposite directions. Here is a good illustration of how they work:
http://deeplearning.net/software/theano_versions/dev/tutorial/conv_arithmetic.html
Convolution:
(Up to) 3x3 blue pixels contribute to a single green pixel. Each of the 3x3 blue pixels is multiplied by the corresponding filter value, and the results from the different blue pixels are summed up into a single green pixel.
Transposed convolution:
A single green pixel contributes to (up to) 3x3 blue pixels. Each green pixel is multiplied by each of the 3x3 filter values, and the results from the different green pixels are summed up into a single blue pixel.
GIF Animation
https://goo.gl/tAY4BL
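The "one pixel spreads into a filter-sized patch" description of the transposed convolution can be written out directly. This is a minimal sketch with no padding; the function name and the toy inputs are assumptions for illustration.

```python
def transposed_convolve2d(image, kernel, stride=1):
    """Each input pixel adds (pixel value x kernel) to a patch of the output."""
    kh, kw = len(kernel), len(kernel[0])
    in_h, in_w = len(image), len(image[0])
    out_h = (in_h - 1) * stride + kh
    out_w = (in_w - 1) * stride + kw
    out = [[0] * out_w for _ in range(out_h)]
    for y in range(in_h):
        for x in range(in_w):
            for i in range(kh):
                for j in range(kw):
                    # Overlapping contributions from different pixels are summed.
                    out[y * stride + i][x * stride + j] += image[y][x] * kernel[i][j]
    return out

small = [[1, 2], [3, 4]]
ones_3x3 = [[1, 1, 1] for _ in range(3)]
ones_2x2 = [[1, 1], [1, 1]]

overlapping = transposed_convolve2d(small, ones_3x3)          # 2x2 -> 4x4
doubled = transposed_convolve2d(small, ones_2x2, stride=2)    # 2x2 -> 4x4
print(overlapping)
print(doubled)
```

With stride 1 the patches overlap and their contributions add up; with stride 2 and a 2x2 filter the patches tile the output without overlap, which is the simplest way to see how the image gets larger.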
Training Strategy of DCGAN
▪ We train two models simultaneously.
● CNN: Trained to classify images as authentic or fake.
● "Authentic" images are provided to the CNN as training data.
● DCGAN: Trained to generate images that the CNN classifies as authentic.
● By trying to fool the CNN, the DCGAN learns to generate images similar to the training data.
[Figure: the CNN examines images from the DCGAN and from the training data, and calls out a generated one: "It's a fake!"]
Training Loop of DCGAN
▪ By repeating this loop, the CNN becomes more accurate and the DCGAN becomes more crafty.
● The DCGAN turns random numbers into a generated image A.
● The CNN receives the generated image A and a training image B, and calculates P(A) and P(B).
● P(A): Probability that A is authentic.
● P(B): Probability that B is authentic.
● DCGAN: Modify parameters such that P(A) becomes large.
● CNN: Modify parameters such that P(A) becomes small and P(B) becomes large.
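One common way to turn these two goals into quantities to minimize is the cross-entropy loss used in many GAN implementations. The slides only state the direction of each update, so the exact formulas below are an assumption, and the function names are hypothetical.

```python
import math

def cnn_loss(p_a, p_b):
    """Small when P(A) is near 0 (fake detected) and P(B) is near 1."""
    return -(math.log(1.0 - p_a) + math.log(p_b))

def dcgan_loss(p_a):
    """Small when P(A) is near 1 (the CNN was fooled)."""
    return -math.log(p_a)

# An undecided CNN versus a CNN that separates A and B well.
print(cnn_loss(p_a=0.5, p_b=0.5), cnn_loss(p_a=0.1, p_b=0.9))
# A generated image that fails to fool the CNN versus one that succeeds.
print(dcgan_loss(p_a=0.1), dcgan_loss(p_a=0.9))
```

In each iteration of the loop, the CNN's parameters would be nudged to reduce `cnn_loss` and the DCGAN's parameters to reduce `dcgan_loss`, which pulls P(A) in opposite directions.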
Demo
https://goo.gl/D8RBGm
Model
▪ Training data: MNIST (28x28-pixel grayscale images)
▪ DCGAN: Generates a single 28x28 image from 64 parameters.
● 64 parameters → 128 x (7x7) → 64 x (14x14) → 1 x (28x28)
▪ CNN: Calculates the probability that a single 28x28 image is authentic.
● 1 x (28x28) → 64 x (14x14) → 128 x (7x7) → Probability of authentic image
▪ Batch size: 32
● Modify filter parameters using 32 generated images and 32 MNIST images at a time.
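To make the sizes concrete, the sketch below counts how many values each stage of the generator holds per image and builds the label list for one training step. The stage shapes and batch size come from the slide; the variable names and the 0/1 label convention are assumptions.

```python
# (layers, height, width) at each stage, as listed on the slide.
GENERATOR_STAGES = [(128, 7, 7), (64, 14, 14), (1, 28, 28)]
CNN_STAGES = list(reversed(GENERATOR_STAGES))  # the CNN runs the other way
BATCH_SIZE = 32

def values_per_image(stage):
    layers, height, width = stage
    return layers * height * width

generator_sizes = [values_per_image(stage) for stage in GENERATOR_STAGES]
print(generator_sizes)

# One training step sees 32 generated images and 32 MNIST images.
# Assumed convention: label 0 = generated, label 1 = authentic.
batch_labels = [0] * BATCH_SIZE + [1] * BATCH_SIZE
print(len(batch_labels))
```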
Learning Process
▪ This shows the evolution of images
generated from the same input parameters
during the training loop. (DCGAN's filters are
initialized with random values.)
Playing with Input Parameters
▪ If we change the input parameters, the shape of the generated image changes too. By
making small, continuous changes to the input, we can achieve a morphing effect.
▪ Since the input is a point in a 64-dimensional space, we can draw a
straight line between two points. The end points represent the images before and
after morphing.
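The straight line between two input points can be sketched as a linear interpolation. Feeding each intermediate point to the trained DCGAN would then give one frame of the morphing animation. Sampling the end points uniformly from [-1, 1] is an assumption; the slides do not say how the inputs are drawn.

```python
import random

DIMENSIONS = 64

def interpolate(z_start, z_end, steps):
    """Return `steps` points evenly spaced on the line from z_start to z_end."""
    path = []
    for k in range(steps):
        t = k / (steps - 1)
        path.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path

rng = random.Random(0)
z_start = [rng.uniform(-1, 1) for _ in range(DIMENSIONS)]
z_end = [rng.uniform(-1, 1) for _ in range(DIMENSIONS)]

path = interpolate(z_start, z_end, steps=10)
print(len(path), len(path[0]))
```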
Playing with Input Parameters
▪ Using a more complicated closed loop in the parameter space, we can even make a
dancing image :)
▪ The sample image on this page is generated from a trajectory over a sphere
(embedded in the 64-dimensional space).
GIF Animation
https://goo.gl/zXL1bV
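A simple closed loop on a sphere is a great circle, sketched below. The slides do not describe the exact trajectory used for the animation, so the great circle, the radius, and the helper names are all assumptions.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]

def orthonormal_pair(dimensions, rng):
    """Two random perpendicular unit vectors (one Gram-Schmidt step)."""
    u = normalize([rng.gauss(0, 1) for _ in range(dimensions)])
    w = [rng.gauss(0, 1) for _ in range(dimensions)]
    projection = dot(w, u)
    v = normalize([wi - projection * ui for wi, ui in zip(w, u)])
    return u, v

def closed_loop(u, v, radius, steps):
    """Points on the circle radius * (cos(angle) * u + sin(angle) * v)."""
    points = []
    for k in range(steps):
        angle = 2 * math.pi * k / steps
        points.append([radius * (math.cos(angle) * a + math.sin(angle) * b)
                       for a, b in zip(u, v)])
    return points

rng = random.Random(0)
u, v = orthonormal_pair(64, rng)
loop = closed_loop(u, v, radius=1.0, steps=36)
print(len(loop))
```

Because the last point is one step before the first, playing the generated frames in order and repeating gives a seamless loop.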
Interpretation of Input Parameters
▪ The DCGAN paper suggests that the input parameters could have a semantic structure, as in the following example.
[Figure: images arranged along two axes, Neutral to Smile and Woman to Man: Neutral Woman, Smiling Woman, Neutral Man, Smiling Man.]
http://arxiv.org/abs/1511.06434
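If the input space has such a structure, a "smile" direction can be found by subtraction and reused by addition. The vectors below are made-up 4-dimensional stand-ins for the real inputs, purely to illustrate the arithmetic.

```python
def add(a, b):
    return [x + y for x, y in zip(a, b)]

def subtract(a, b):
    return [x - y for x, y in zip(a, b)]

# Hypothetical input vectors; real ones would come from inputs whose
# generated images show the named attributes.
neutral_woman = [0.2, -0.5, 0.0, 0.1]
smiling_woman = [0.2, -0.5, 0.8, 0.1]
neutral_man = [-0.6, 0.3, 0.0, 0.1]

# Smiling Woman - Neutral Woman isolates the "smile" direction ...
smile_direction = subtract(smiling_woman, neutral_woman)
# ... and adding it to Neutral Man should give an input for a Smiling Man.
smiling_man = add(neutral_man, smile_direction)
print(smiling_man)
```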
Thank you!
