This document summarizes recent conference presentations and research by Senior Researcher Masayuki Tanaka and his social intelligence research team. It discusses Tanaka's proposed new activation function called Weighted Sigmoid Gate (WiG) for deep neural networks. The document provides background on existing activation functions and describes how WiG improves upon these by incorporating a gated network structure. Experimental results show WiG can improve performance on tasks like object recognition and image denoising compared to traditional activations like ReLU. Code implementing WiG is available online.
2. Recent Conference Presentations
Gradient-Based Low-Light Image Enhancement
Masayuki Tanaka, Takashi Shibata and Masatoshi Okutomi
Proceedings of IEEE International Conference on Consumer Electronics (ICCE2019), January, 2019
Pixelwise JPEG Compression Detection and Quality Factor Estimation Based on Convolutional Neural Network
Kazutaka Uchida, Masayuki Tanaka, and Masatoshi Okutomi
Proceedings of IS&T International Symposium on Electronic Imaging (EI2019), January, 2019
Disparity Map Estimation from Cross-Modal Stereo
Thapanapong Rukkanchanunt, Takashi Shibata, Masayuki Tanaka and Masatoshi Okutomi
Proceedings of 6th IEEE Global Conference on Signal and Information Processing
(GlobalSIP2018), pp.988-992, November, 2018
Non-blind Image Restoration Based on Convolutional Neural Network
Kazutaka Uchida, Masayuki Tanaka and Masatoshi Okutomi
Proceedings of IEEE 7th Global Conference on Consumer Electronics (GCCE2018), pp.12-16, October, 2018
Remote Heart Rate Measurement from RGB-NIR Video Based on Spatial and Spectral Face Patch Selection
Shiika Kado, Yusuke Monno, Kenta Moriwaki, Kazunori Yoshizaki, Masayuki Tanaka and Masatoshi Okutomi
Proceedings of 40th International Conference of the IEEE Engineering in Medicine and Biology
Society (EMBC2018), pp.5676-5680, July, 2018
3. Activation Functions for DNNs
[Figure: a layer applies a weight (or a convolution) to the input x, followed by an activation function, to produce the output y.]
Typical activation functions: Sigmoid, tanh, ReLU
Sigmoid: σ(x) = 1 / (1 + e^(-x))
ReLU: max(x, 0)
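As a side note (not part of the original slides), these element-wise functions take only a few lines of NumPy; the sketch below simply restates the formulas above:

```python
import numpy as np

def sigmoid(x):
    # sigmoid(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # ReLU(x) = max(x, 0), applied element-wise
    return np.maximum(x, 0.0)

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x), np.tanh(x), relu(x))
```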
5. WiG: Weighted Sigmoid Gate (Proposed)
Existing activation functions are element-wise functions.
A sigmoid gated network can be used as an activation function.
[Figure: a conventional layer (weight followed by an element-wise activation function) compared with a layer in which the activation is replaced by a small activation network unit.]
Proposed WiG (weighted sigmoid gate unit)
[Figure: in the WiG activation unit, the weighted input Wx is multiplied element-wise (×) by a sigmoid gate driven by a second weight Wg, i.e., WiG(x) = (Wx) ⊙ σ(Wg x).]
It is compatible with existing activation functions, and it includes ReLU as a special case.
My recommendation is:
You can improve the network performance just by replacing ReLU with the proposed WiG.
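To make the structure concrete, here is a minimal, unofficial sketch of a convolutional WiG unit in Keras (the layer and variable names are my own; the official code is linked in Section 7):

```python
import tensorflow as tf
from tensorflow.keras import layers

class WiGConv2D(layers.Layer):
    """Convolutional WiG unit: output = conv(x) * sigmoid(gate_conv(x))."""
    def __init__(self, filters, kernel_size, **kwargs):
        super().__init__(**kwargs)
        # W: the ordinary weight (convolution) branch
        self.conv = layers.Conv2D(filters, kernel_size, padding='same')
        # Wg: the gating weight branch, followed by a sigmoid
        self.gate = layers.Conv2D(filters, kernel_size, padding='same')

    def call(self, x):
        # Element-wise product of the weighted input and the sigmoid gate
        return self.conv(x) * tf.sigmoid(self.gate(x))

# Example: a drop-in replacement for a Conv2D + ReLU pair
x = tf.random.normal((1, 32, 32, 3))
y = WiGConv2D(16, 3)(x)
print(y.shape)  # (1, 32, 32, 16)
```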
7. References
Masayuki Tanaka, Weighted Sigmoid Gate Unit for an Activation Function
of Deep Neural Network, arXiv preprint arXiv:1810.01829, 2018.
http://www.ok.sc.e.titech.ac.jp/~mtanaka/proj/WiG/
V. Nair and G. E. Hinton, “Rectified linear units improve restricted Boltzmann
machines,” in Proceedings of the 27th International Conference on Machine
Learning (ICML-10), 2010, pp. 807–814.
P. Ramachandran, B. Zoph, and Q. V. Le, “Searching for activation functions,” arXiv
preprint arXiv:1710.05941, 2017.
S. Elfwing, E. Uchibe, and K. Doya, “Sigmoid-weighted linear units for neural
network function approximation in reinforcement learning,” arXiv preprint
arXiv:1702.03118, 2017.
A. L. Maas, A. Y. Hannun, and A. Y. Ng, “Rectifier nonlinearities improve neural
network acoustic models,” in International Conference on Machine Learning (ICML),
vol. 30, no. 1, 2013, p. 3.
K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: Surpassing
human-level performance on ImageNet classification,” in IEEE International
Conference on Computer Vision (ICCV), 2015, pp. 1026–1034.
8. Promotion: Train1000 project
Deep learning usually requires a huge amount of training data to achieve good performance. However, training with such a large dataset demands high computational cost in terms of both computing power and time, which is especially tough for beginners. In practice, it is also hard to collect a huge number of annotated training samples. I think that 1,000 samples is the minimum number for training a network. Training with only 1,000 samples also involves technical challenges; one of them is improving generalization performance while avoiding overfitting.
Let's enjoy training with 1,000 samples.
http://www.ok.sc.e.titech.ac.jp/~mtanaka/proj/train1000/
Sample codes in MATLAB and Keras for MNIST and CIFAR are available.
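For illustration, a Train1000-style subset (100 images per class, 1,000 in total) can be drawn from CIFAR-10 in a few lines of Keras/NumPy; this simple selection is only a sketch and may differ from the official scripts on the project page:

```python
import numpy as np
from tensorflow.keras.datasets import cifar10

# Load the full CIFAR-10 training set.
(x_train, y_train), _ = cifar10.load_data()
y_train = y_train.flatten()

# Keep the first 100 samples of each of the 10 classes -> 1,000 samples total.
idx = np.concatenate([np.where(y_train == c)[0][:100] for c in range(10)])
x_small, y_small = x_train[idx], y_train[idx]
print(x_small.shape)  # (1000, 32, 32, 3)
```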