This document summarizes recent conference presentations and research by Senior Researcher Masayuki Tanaka and his social intelligence research team. It discusses Tanaka's proposed new activation function called Weighted Sigmoid Gate (WiG) for deep neural networks. The document provides background on existing activation functions and describes how WiG improves upon these by incorporating a gated network structure. Experimental results show WiG can improve performance on tasks like object recognition and image denoising compared to traditional activations like ReLU. Code implementing WiG is available online.
2. Recent Conference Presentations
Gradient-Based Low-Light Image Enhancement
Masayuki Tanaka, Takashi Shibata and Masatoshi Okutomi
Proceedings of IEEE International Conference on Consumer Electronics (ICCE2019), January, 2019
Pixelwise JPEG Compression Detection and Quality Factor Estimation Based on Convolutional Neural Network
Kazutaka Uchida, Masayuki Tanaka, and Masatoshi Okutomi
Proceedings of IS&T International Symposium on Electronic Imaging (EI2019), January, 2019
Disparity Map Estimation from Cross-Modal Stereo
Thapanapong Rukkanchanunt, Takashi Shibata, Masayuki Tanaka and Masatoshi Okutomi
Proceedings of 6th IEEE Global Conference on Signal and Information Processing
(GlobalSIP2018), pp.988-992, November, 2018
Non-blind Image Restoration Based on Convolutional Neural Network
Kazutaka Uchida, Masayuki Tanaka and Masatoshi Okutomi
Proceedings of IEEE 7th Global Conference on Consumer Electronics (GCCE2018), pp.12-16, October, 2018
Remote Heart Rate Measurement from RGB-NIR Video Based on Spatial and Spectral Face Patch Selection
Shiika Kado, Yusuke Monno, Kenta Moriwaki, Kazunori Yoshizaki, Masayuki Tanaka and Masatoshi Okutomi
Proceedings of 40th International Conference of the IEEE Engineering in Medicine and Biology
Society (EMBC2018), pp.5676-5680, July, 2018
3. Activation Functions for DNNs
[Figure: a layer applies a weight (or a convolution) to the input x, followed by an activation function, to produce the output y.]
Typical activation functions: Sigmoid, tanh, ReLU
Sigmoid: σ(x) = 1 / (1 + e^(-x))
ReLU: max(x, 0)
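As a side note (not part of the original slides), these element-wise functions take only a few lines of NumPy; the sketch below simply restates the formulas above:

```python
import numpy as np

def sigmoid(x):
    # sigmoid(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # ReLU(x) = max(x, 0), applied element-wise
    return np.maximum(x, 0.0)

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x), np.tanh(x), relu(x))
```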
5. WiG: Weighted Sigmoid Gate (Proposed)
Existing activation functions are element-wise functions.
A sigmoid gated network can be used as an activation function.
[Figure: a conventional layer (weight followed by an element-wise activation function) compared with a layer in which the activation is replaced by a small activation network unit.]
Proposed WiG (weighted sigmoid gate unit)
[Figure: in the WiG activation unit, the weighted input Wx is multiplied element-wise (×) by a sigmoid gate driven by a second weight Wg, i.e., WiG(x) = (Wx) ⊙ σ(Wg x).]
It is compatible with existing activation functions, and it includes ReLU as a special case.
My recommendation is:
You can improve the network performance just by replacing ReLU with the proposed WiG.
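To make the structure concrete, here is a minimal, unofficial sketch of a convolutional WiG unit in Keras (the layer and variable names are my own; the official code is linked in Section 7):

```python
import tensorflow as tf
from tensorflow.keras import layers

class WiGConv2D(layers.Layer):
    """Convolutional WiG unit: output = conv(x) * sigmoid(gate_conv(x))."""
    def __init__(self, filters, kernel_size, **kwargs):
        super().__init__(**kwargs)
        # W: the ordinary weight (convolution) branch
        self.conv = layers.Conv2D(filters, kernel_size, padding='same')
        # Wg: the gating weight branch, followed by a sigmoid
        self.gate = layers.Conv2D(filters, kernel_size, padding='same')

    def call(self, x):
        # Element-wise product of the weighted input and the sigmoid gate
        return self.conv(x) * tf.sigmoid(self.gate(x))

# Example: a drop-in replacement for a Conv2D + ReLU pair
x = tf.random.normal((1, 32, 32, 3))
y = WiGConv2D(16, 3)(x)
print(y.shape)  # (1, 32, 32, 16)
```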
7. References
Masayuki Tanaka, Weighted Sigmoid Gate Unit for an Activation Function
of Deep Neural Network, arXiv preprint arXiv:1810.01829, 2018.
http://www.ok.sc.e.titech.ac.jp/~mtanaka/proj/WiG/
V. Nair and G. E. Hinton, “Rectified linear units improve restricted Boltzmann
machines,” in Proceedings of the 27th International Conference on Machine
Learning (ICML-10), 2010, pp. 807–814.
P. Ramachandran, B. Zoph, and Q. V. Le, “Searching for activation functions,” arXiv
preprint arXiv:1710.05941, 2017.
S. Elfwing, E. Uchibe, and K. Doya, “Sigmoid-weighted linear units for neural
network function approximation in reinforcement learning,” arXiv preprint
arXiv:1702.03118, 2017.
A. L. Maas, A. Y. Hannun, and A. Y. Ng, “Rectifier nonlinearities improve neural
network acoustic models,” in International Conference on Machine Learning (ICML),
vol. 30, no. 1, 2013, p. 3.
K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: Surpassing
human-level performance on ImageNet classification,” in IEEE International
Conference on Computer Vision (ICCV), 2015, pp. 1026–1034.
8. Promotion: Train1000 project
Deep learning usually requires a huge amount of training data to achieve good performance. However, training with such a large dataset demands high computational cost in terms of both computing power and time, which is especially tough for beginners. In practice, it is also hard to collect a huge number of annotated training samples. I think that 1,000 samples is the minimum number for training a network. Training with only 1,000 samples also involves technical challenges; one of them is improving generalization performance while avoiding overfitting.
Let's enjoy training with 1,000 samples.
http://www.ok.sc.e.titech.ac.jp/~mtanaka/proj/train1000/
Sample codes in MATLAB and Keras for MNIST and CIFAR are available.
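For illustration, a Train1000-style subset (100 images per class, 1,000 in total) can be drawn from CIFAR-10 in a few lines of Keras/NumPy; this simple selection is only a sketch and may differ from the official scripts on the project page:

```python
import numpy as np
from tensorflow.keras.datasets import cifar10

# Load the full CIFAR-10 training set.
(x_train, y_train), _ = cifar10.load_data()
y_train = y_train.flatten()

# Keep the first 100 samples of each of the 10 classes -> 1,000 samples total.
idx = np.concatenate([np.where(y_train == c)[0][:100] for c in range(10)])
x_small, y_small = x_train[idx], y_train[idx]
print(x_small.shape)  # (1000, 32, 32, 3)
```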