GoogLeNet introduced several key insights for designing efficient deep convolutional networks:
1. Exploit local correlations in images by applying 1x1, 3x3, and 5x5 convolutions in parallel, together with a pooling path, and concatenating their outputs (see the module sketch after this list).
2. Reduce channel dimensionality before the expensive 3x3 and 5x5 convolutions by inserting cheap 1x1 convolutions.
3. Stack inception modules on top of one another, occasionally inserting max pooling layers to shrink the spatial resolution, so that each module can be tuned independently (sketched below).
4. Counter vanishing gradients when training deep networks by attaching auxiliary classifiers whose intermediate losses are added to the total loss (sketched below).
5. End with a global average pooling layer instead of a stack of fully connected layers, cutting parameters and reducing overfitting (sketched below).
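To make items 1 and 2 concrete, here is a minimal PyTorch sketch of an inception module (the framework choice and class name are ours; the channel widths in the usage example match the paper's 3a module). Each branch sees the same input, the 1x1 convolutions in front of the 3x3 and 5x5 branches shrink the channel count so the larger kernels stay cheap, and the four outputs are concatenated along the channel axis:

```python
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    """Parallel 1x1, 3x3, and 5x5 convolutions plus a pooling path,
    concatenated along the channel dimension."""

    def __init__(self, in_ch, c1, c3_red, c3, c5_red, c5, pool_proj):
        super().__init__()
        # Branch 1: plain 1x1 convolution.
        self.b1 = nn.Sequential(
            nn.Conv2d(in_ch, c1, kernel_size=1), nn.ReLU(inplace=True))
        # Branch 2: 1x1 channel reduction, then 3x3 convolution.
        self.b2 = nn.Sequential(
            nn.Conv2d(in_ch, c3_red, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(c3_red, c3, kernel_size=3, padding=1), nn.ReLU(inplace=True))
        # Branch 3: 1x1 channel reduction, then 5x5 convolution.
        self.b3 = nn.Sequential(
            nn.Conv2d(in_ch, c5_red, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(c5_red, c5, kernel_size=5, padding=2), nn.ReLU(inplace=True))
        # Branch 4: 3x3 max pooling, then 1x1 projection.
        self.b4 = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, pool_proj, kernel_size=1), nn.ReLU(inplace=True))

    def forward(self, x):
        # All branches preserve spatial size, so outputs concatenate on channels.
        return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)

# Example: the paper's 3a module (192 channels in, 64+128+32+32 = 256 out).
m = InceptionModule(192, 64, 96, 128, 16, 32, 32)
out = m(torch.randn(1, 192, 28, 28))  # -> torch.Size([1, 256, 28, 28])
```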
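Item 3 then amounts to composing such modules sequentially, with a stride-2 max pool between stages. A sketch reusing the module above, with per-module widths taken from the paper's table:

```python
import torch.nn as nn

# Stack inception modules, occasionally pooling to shrink the
# spatial resolution before the next, wider stage.
stage = nn.Sequential(
    InceptionModule(192, 64, 96, 128, 16, 32, 32),     # 3a: 192 -> 256 channels
    InceptionModule(256, 128, 128, 192, 32, 96, 64),   # 3b: 256 -> 480 channels
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),  # halve height and width
    InceptionModule(480, 192, 96, 208, 16, 48, 64),    # 4a: 480 -> 512 channels
)
```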
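For item 4, a sketch of the training objective; the function name and the aux_logits_list argument are illustrative, but adding discounted intermediate losses to the total loss (the paper weights them by 0.3) is GoogLeNet's scheme:

```python
import torch.nn.functional as F

def googlenet_loss(main_logits, aux_logits_list, labels, aux_weight=0.3):
    """Total loss = main classifier loss + 0.3 * each auxiliary loss.
    The auxiliary classifiers inject gradient into the middle of the
    network during training and are discarded at inference time."""
    loss = F.cross_entropy(main_logits, labels)
    for aux_logits in aux_logits_list:
        loss = loss + aux_weight * F.cross_entropy(aux_logits, labels)
    return loss
```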
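And for item 5, a sketch of a global-average-pooling head (the 1024 channels, 1000 classes, and 40% dropout match GoogLeNet's final stage):

```python
import torch.nn as nn

# Global average pooling collapses each 7x7 feature map to one value,
# so the classifier needs only a single small linear layer instead of
# the large fully connected stack used by earlier networks.
head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),  # (N, 1024, 7, 7) -> (N, 1024, 1, 1)
    nn.Flatten(),             # -> (N, 1024)
    nn.Dropout(p=0.4),        # dropout before the classifier, as in the paper
    nn.Linear(1024, 1000),    # class logits
)
```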
[Figure: Two additional loss layers for training to depth. GoogLeNet runs Input → traditional convolutions (Conv + MaxPool + Conv + MaxPool) → nine inception modules (3a, 3b, 4a–4e, 5a, 5b) with MaxPool between stages → average pooling → linear → SoftMax w/Loss 2. Two auxiliary branches (average pooling → 1x1 conv → fully connected → dropout → SoftMax w/Loss 0 and 1) tap intermediate modules, and all three losses are computed against the label.]