Progress in Super-Resolution with Deep Learning

Copyright © DeNA Co.,Ltd. All Rights Reserved.

About the speaker
Hiroto Honda
@hirotomusiker
• Manufacturer's research lab → joined DeNA in January 2017
• Visiting researcher at ETH Zurich CVL (2013-2014)
• CVPR NTIRE Workshop Program Committee
• AI R&D engineer at DeNA
• Current work: Object Detection
(OSS: https://github.com/DeNA/Chainer_Mask_R-CNN )
• Previous work: Low-Level Vision, Computational, Sensor LSI

Contents
• Super-resolution is easy to try
• Early SISR networks
⁃ SRCNN, ESPCN, VDSR
⁃ Upsampling methods: deconv or pixel shuffle
• Baseline method: SRResNet
⁃ SRResNet, SRGAN, and EDSR
• Super-resolution and perception
⁃ How the loss function affects the restored result
⁃ Perception-Distortion Tradeoff
• Summary

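What is super-resolution?
• Low-resolution image
• High-resolution image
[Figure: restoration from the low-resolution image to the high-resolution image]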

Super-resolution is easy to try!
[Figure: original (HR) → resize → LR; train on the resulting pairs]
A form of self-supervised learning that needs no annotation

Progress in super-resolution
https://github.com/jbhuang0604/SelfExSR
PSNR* gain over bicubic [dB], Set5 dataset, x4 (PSNR data from: 5))
Chart time axis: 2014 to 2017
bicubic   0.0
A+        +1.86
SRCNN     +2.06
ESPCN     +2.48
VDSR      +2.93
SRResNet  +3.63
EDSR      +4.20
Super-resolution accuracy has improved year after year
* PSNR = 10 log10(255^2 / MSE) when the max value is 255
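The PSNR definition in the footnote can be written as a short NumPy function. This is a minimal sketch: the function name `psnr` and the toy images are illustrative assumptions, not part of the slides.

```python
import numpy as np

def psnr(reference, restored, max_value=255.0):
    """PSNR in dB: 10 * log10(max_value^2 / MSE)."""
    reference = np.asarray(reference, dtype=np.float64)
    restored = np.asarray(restored, dtype=np.float64)
    mse = np.mean((reference - restored) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

# Toy example: a flat image and a copy that is off by 5 gray levels everywhere.
hr = np.full((4, 4), 100.0)
sr = hr + 5.0
print(round(psnr(hr, sr), 2))
```

Because the MSE sits inside a logarithm, halving the MSE raises PSNR by about 3 dB regardless of the starting point.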

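Training a super-resolution network
• Crop patches from the ground-truth images: HR
• Down-sample the patches: LR = g(HR), e.g. bicubic down-sampling
• Assemble batches: {LR}, {HR}
• Train the network f with the loss function MSE(HR, f(LR))
• ...and that's it!
[Figure: LR = g(HR) → f → f(LR), compared with HR by MSE]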
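The data pipeline on this slide can be sketched in a few lines of NumPy. The assumptions are loud ones: block averaging stands in for the bicubic down-sampling g, nearest-neighbor upsampling stands in for the network f, and all function names are hypothetical.

```python
import numpy as np

def downsample(hr, scale):
    """Stand-in for g: average each scale x scale block (not bicubic)."""
    h, w = hr.shape
    return hr.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))

def make_batch(image, patch_size, scale, batch_size, rng):
    """Crop random HR patches and build the matching LR patches."""
    hr_batch, lr_batch = [], []
    for _ in range(batch_size):
        top = rng.integers(0, image.shape[0] - patch_size + 1)
        left = rng.integers(0, image.shape[1] - patch_size + 1)
        hr = image[top:top + patch_size, left:left + patch_size]
        hr_batch.append(hr)
        lr_batch.append(downsample(hr, scale))
    return np.stack(lr_batch), np.stack(hr_batch)

def mse_loss(hr, prediction):
    return np.mean((hr - prediction) ** 2)

def nearest_upsample(lr, scale):
    """Placeholder for the network f: repeat each pixel scale times."""
    return lr.repeat(scale, axis=-2).repeat(scale, axis=-1)

rng = np.random.default_rng(0)
image = rng.random((64, 64))   # stands in for one ground-truth image
lr, hr = make_batch(image, patch_size=16, scale=2, batch_size=8, rng=rng)
loss = mse_loss(hr, nearest_upsample(lr, 2))
print(lr.shape, hr.shape, loss)
```

In a real setup the placeholder f would be a trainable CNN and the loss would be minimized by gradient descent; the point here is only that the LR/HR pairs come for free from the HR images.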

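Non-deep method: dictionary-based algorithm
Baseline: A+ (2014)
http://www.vision.ee.ethz.ch/~timofter/publications/Timofte-ACCV-2014.pdf
[Figure: the LR patch is written as a weighted sum of atoms (x 0 + x 0.8 + x 0.05 + ...) and the coefficients are optimized; the HR patch is the sum of the atoms of the learned dictionary with the same coefficients]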
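The idea in the figure, fitting coefficients on the LR side and reusing them on the HR side, can be illustrated with a toy coupled dictionary. This is a simplified sketch using ridge-regularized least squares on random data; it is not the A+ algorithm itself, and the dictionaries, patch sizes, and function name are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_atoms, lr_dim, hr_dim = 6, 9, 36   # e.g. flattened 3x3 LR and 6x6 HR patches

# Hypothetical learned dictionaries: column k of each matrix is a paired atom.
hr_dict = rng.standard_normal((hr_dim, n_atoms))
# Toy LR atoms: a fixed linear "down-sampling" of the HR atoms.
down = rng.standard_normal((lr_dim, hr_dim)) / hr_dim
lr_dict = down @ hr_dict

def solve_coefficients(lr_patch, lr_dict, ridge=1e-8):
    """Ridge least squares: min ||lr_patch - D a||^2 + ridge * ||a||^2."""
    gram = lr_dict.T @ lr_dict + ridge * np.eye(lr_dict.shape[1])
    return np.linalg.solve(gram, lr_dict.T @ lr_patch)

# Build a test patch from known coefficients, similar to the weights in the figure.
true_coeffs = np.array([0.0, 0.8, 0.05, 0.0, 0.0, 0.15])
hr_patch = hr_dict @ true_coeffs
lr_patch = lr_dict @ true_coeffs

coeffs = solve_coefficients(lr_patch, lr_dict)   # optimize on the LR side
hr_estimate = hr_dict @ coeffs                   # reuse on the HR side
print(np.round(coeffs, 3))
print(np.max(np.abs(hr_estimate - hr_patch)))
```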

• Early SISR networks
⁃ SRCNN, ESPCN, VDSR
⁃ Upsampling methods: deconv or pixel shuffle

The first deep super-resolution network: SRCNN
Kernel size: 9-1-5, 9-3-5, or 9-5-5
from: 1)
Very simple, with low computational cost
[Figure: the input is first upscaled with bicubic x2]

VDSR: a deep SRCNN
from: 3)
3x3 kernels, 64 ch, depth D = 5 to 20

Efficient sub-pixel CNN (ESPCN)
Unlike SRCNN, the convolutions run on the LR image, so it is efficient
Kernel size: 5-3-3
from: 2)

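Differences between SRCNN / VDSR and ESPCN
• Post-upsampling is more efficient, but it cannot upsample by a non-integer factor such as x1.6
[Figure: SRCNN, VDSR: input → bicubic x2 → conv layers → output. ESPCN: input → conv layers → pixel shuffle x2 → output. Feature-map axes: ch, h, w]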
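The efficiency gap between pre- and post-upsampling can be checked by counting multiplications. The sketch below uses a hypothetical three-layer 3x3 stack with 64 hidden channels (these widths are assumptions, not the published SRCNN or ESPCN configurations) and compares running it on the upscaled image against running it on the LR image.

```python
def conv_multiplications(kernel_sizes, channels, pixels):
    """Multiplications for a stack of conv layers evaluated at `pixels` positions.

    `channels` has one more entry than `kernel_sizes`: [in, hidden..., out].
    """
    total = 0
    for k, c_in, c_out in zip(kernel_sizes, channels[:-1], channels[1:]):
        total += k * k * c_in * c_out * pixels
    return total

scale = 2
lr_pixels = 100 * 100
hr_pixels = lr_pixels * scale ** 2
kernels = [3, 3, 3]

# Pre-upsampling (SRCNN / VDSR style): the stack runs on the upscaled image.
pre = conv_multiplications(kernels, [1, 64, 64, 1], hr_pixels)
# Post-upsampling (ESPCN style): the stack runs on the LR image, and the last
# layer outputs scale**2 channels that pixel shuffle rearranges into HR pixels.
post = conv_multiplications(kernels, [1, 64, 64, scale ** 2], lr_pixels)
print(pre, post, pre / post)
```

For x2 the ratio approaches scale**2 = 4, because almost all the work sits in layers whose cost is proportional to the number of spatial positions.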

Upscaling with a CNN: deconvolution or pixel shuffle?
• Deconvolution
https://distill.pub/2016/deconv-checkerboard/
The number of pixels contributing to each output position is not uniform,
so a checkerboard pattern appears
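The uneven contribution per position is easy to see by counting, in 1-D, how many kernel taps of a transposed convolution land on each output position. This is an illustrative sketch; the function name and the sizes are arbitrary choices.

```python
import numpy as np

def overlap_counts(input_len, kernel, stride):
    """Kernel taps landing on each output position of a 1-D transposed conv."""
    out_len = (input_len - 1) * stride + kernel
    counts = np.zeros(out_len, dtype=int)
    for i in range(input_len):
        counts[i * stride:i * stride + kernel] += 1
    return counts

uneven = overlap_counts(input_len=6, kernel=3, stride=2)
even = overlap_counts(input_len=6, kernel=4, stride=2)
print(uneven)   # interior alternates between 2 and 1
print(even)     # interior is uniformly 2
```

With kernel 3 and stride 2 the interior alternates between two and one contributions, which is the source of the pattern; a kernel size divisible by the stride evens out the count, although the learned weights can still produce artifacts.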

• What about resize followed by convolution?
The checkerboard pattern disappears
Resizing (a low-pass operation) may lose information, so another option is to fill in
with nearest neighbor
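A resize-convolution step can be sketched as nearest-neighbor upsampling followed by an ordinary stride-1 convolution. The single-channel convolution helper and the averaging kernel below are stand-ins for a learned layer, chosen only to keep the example self-contained.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def nearest_resize(x, scale):
    """Nearest-neighbor upsampling: repeat each pixel `scale` times per axis."""
    return x.repeat(scale, axis=0).repeat(scale, axis=1)

def conv2d_same(x, kernel):
    """Single-channel stride-1 cross-correlation with zero padding."""
    pad = kernel.shape[0] // 2
    windows = sliding_window_view(np.pad(x, pad), kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

lr = np.ones((4, 4))
kernel = np.full((3, 3), 1.0 / 9.0)   # stand-in for a learned kernel
up = conv2d_same(nearest_resize(lr, 2), kernel)
print(up.shape)
print(np.round(up[1:-1, 1:-1], 3))
```

Every interior output pixel receives the same number of kernel taps, so a constant input gives a constant interior output and no periodic pattern is built into the operation.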

• Sub-pixel convolution (a.k.a. pixel shuffle)
Tiles the channel information at each position
e.g. 9 channels -> 3x3 sub-pixels
Not free of checkerboard noise
from: 2)
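The channel-to-space tiling can be written with a reshape and a transpose. This NumPy sketch assumes a (channels, height, width) layout and one common channel ordering, where channel index i*r + j maps to sub-pixel row i and column j; other orderings are possible.

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange (C*r*r, H, W) into (C, H*r, W*r)."""
    c_r2, h, w = x.shape
    c = c_r2 // (r * r)
    x = x.reshape(c, r, r, h, w)        # (c, i, j, h, w)
    x = x.transpose(0, 3, 1, 4, 2)      # (c, h, i, w, j)
    return x.reshape(c, h * r, w * r)

# 9 channels on a 2x2 grid become one channel on a 6x6 grid.
x = np.arange(9 * 2 * 2).reshape(9, 2, 2)
y = pixel_shuffle(x, 3)
print(y.shape)
print(y[0, :3, :3])   # the 9 channel values at position (0, 0), tiled as 3x3
```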

• Baseline method: SRResNet
⁃ SRResNet, SRGAN, and EDSR

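SRResNet and SRGAN (Twitter, CVPR'17)
[Figure: generator with residual blocks, a skip connection, and two pixel shuffle x2 stages (ch, h, w); its output feeds an MSE loss, a perceptual loss computed with a trained VGG, and a discriminator loss. from: 4)]
・Three kinds of loss function
・When only MSE is used, the network is called SRResNet
24 residual blocks, 64 ch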

SRResNet* and SRGAN: network details
・resblocks and a skip connection
・pixel shuffle upsampling
from: 4)

Enhanced Deep Super Resolution (EDSR): specialized further for high accuracy
Seoul National University
32 residual blocks, 256 ch
[Figure: skip connection, two x2 upsampling stages, l1 loss. from: 5)]

PSNR and visual appearance
from: 5)
In the 20 dB range, a 1 dB difference clearly changes how the image looks

• Super-resolution and perception
⁃ How the loss function affects the restored result
⁃ Perception-Distortion Tradeoff

Subjective quality and PSNR
Method     PSNR
Original   -
SRResNet   25.53dB
SRGAN      21.15dB
bicubic    21.59dB
from: 4)

SRResNet and SRGAN: the loss alone makes this much difference
Losses used by the compared results:
MSE loss ● ●
Perceptual loss using VGG ●
Discriminator loss ● ●
Label on one result: highest PSNR
from: 4)

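Three types of loss function
① l1/l2 loss: the generated image is compared with the ground truth
② perceptual loss: multi-scale feature matching between the generated image and the ground truth through VGG
③ GAN loss: a discriminator judges the generated image and the ground truth as real / fake
[Figure axis: from Low Distortion (①) to Good Perception (③)]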
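The three loss types can be sketched side by side in NumPy. Everything except the l1/l2 formulas is a stand-in: image gradients replace VGG feature maps, the discriminator is reduced to a probability passed in by hand, and the adversarial term is the common generator loss -log D(G(LR)). The function names are hypothetical.

```python
import numpy as np

def l1_loss(sr, hr):
    return np.mean(np.abs(sr - hr))

def l2_loss(sr, hr):
    return np.mean((sr - hr) ** 2)

def toy_features(img):
    """Stand-in for VGG feature maps: horizontal and vertical gradients."""
    return [np.diff(img, axis=1), np.diff(img, axis=0)]

def perceptual_loss(sr, hr, features=toy_features):
    """Squared distance between feature maps, summed over feature levels."""
    return sum(np.mean((fs - fh) ** 2)
               for fs, fh in zip(features(sr), features(hr)))

def generator_gan_loss(d_prob_fake):
    """-log D(G(LR)); d_prob_fake is the discriminator's probability that
    the generated image is real."""
    return -np.mean(np.log(d_prob_fake))

hr = np.tile(np.array([0.0, 1.0]), (4, 2))   # sharp stripes, shape (4, 4)
blurry = np.full_like(hr, 0.5)               # texture-free average image

print(l1_loss(blurry, hr), l2_loss(blurry, hr))
print(perceptual_loss(blurry, hr))
print(generator_gan_loss(np.array([0.9])), generator_gan_loss(np.array([0.1])))
```

The adversarial term is small when the discriminator is fooled (probability near 1) and large when it is not, which pushes the generator toward outputs that look real rather than toward the pixel-wise average.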

Perception-Distortion Tradeoff
No method can achieve low distortion and good perceptual quality
at the same time → understanding the tradeoff is important
from: 8)

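What is the goal of super-resolution?
Accurate: a faithful restoration
Plausible: a natural-looking restoration
Which one to choose depends on the application!
from: 4)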

• Summary

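Progress on SISR: accuracy and speed
PSNR gain over bicubic [dB], Set5 dataset, x4 (PSNR data from: 5))
Chart time axis: 2014 to 2017
bicubic   0.0
A+        +1.86
SRCNN     +2.06
ESPCN     +2.48
VDSR      +2.93
SRResNet  +3.63
EDSR      +4.20
Mega-multiplications per input pixel for x2 restoration, as listed on the chart: 0.44, 0.04, 0.74, 1.33, 40.7
The computational cost varies greatly with
・the size of the image passing through the CNN
・the number of channels in the intermediate layers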

Benchmark details from the NTIRE 2017 super-resolution challenge
[Figure: benchmark plot with EDSR, SRResNet, VDSR, ESPCN, SRCNN, and A+ marked. from: 9)]

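Summary
• Deep methods dominate super-resolution: they are accurate but computationally expensive
• Stacked resblocks with a skip connection and pixel shuffle upsampling are important
• SRResNet-based methods are the baseline
• Whether to aim for 'Accurate' or 'Plausible' depends on the application.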

Appendix: Residual Dense Network for Super-Resolution
A DenseNet-based SRResNet
from: 6)

Appendix: Deep Back-Projection Networks For Super-Resolution
(best PSNR in NTIRE '18 x8 bicubic downsampling track)
from: 7)

Datasets
• DIV2K dataset (train, val)
https://data.vision.ee.ethz.ch/cvl/DIV2K/
• Set5 dataset (test)
http://people.rennes.inria.fr/Aline.Roumy/results/SR_BMVC12.html
• B100 dataset (test)
https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/
• Urban100 dataset (test)
https://sites.google.com/site/jbhuang0604/publications/struct_sr

Competitions
• NTIRE2017:
New Trends in Image Restoration and Enhancement workshop and challenge on image super-resolution, in conjunction with CVPR 2017
http://www.vision.ee.ethz.ch/ntire17/
report: http://www.vision.ee.ethz.ch/~timofter/publications/Timofte-CVPRW-2017.pdf
• NTIRE2018:
New Trends in Image Restoration and Enhancement workshop and challenge on super-resolution, dehazing, and spectral reconstruction, in conjunction with CVPR 2018
http://www.vision.ee.ethz.ch/ntire18/
report: http://openaccess.thecvf.com/content_cvpr_2018_workshops/papers/w13/Timofte_NTIRE_2018_Challenge_CVPR_2018_paper.pdf
• PIRM2018:
Workshop and Challenge on Perceptual Image Restoration and Manipulation, in conjunction with ECCV 2018
https://www.pirm2018.org/

References
1) Dong et al., Image Super-Resolution Using Deep Convolutional Networks, https://arxiv.org/abs/1501.00092
2) Shi et al., Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network, https://arxiv.org/abs/1609.05158
3) Kim et al., Accurate Image Super-Resolution Using Very Deep Convolutional Networks, https://arxiv.org/pdf/1511.04587
4) Ledig et al., Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
5) Lim et al., Enhanced Deep Residual Networks for Single Image Super-Resolution
6) Zhang et al., Residual Dense Network for Image Super-Resolution
7) Haris et al., Deep Back-Projection Networks For Super-Resolution, https://arxiv.org/pdf/1803.02735.pdf
8) Blau et al., The Perception-Distortion Tradeoff, https://arxiv.org/abs/1711.06077
9) Timofte et al., NTIRE 2017 Challenge on Single Image Super-Resolution: Methods and Results, http://www.vision.ee.ethz.ch/~timofter/publications/Timofte-CVPRW-2017.pdf
