Sogang University Machine Learning and Data Mining Lab seminar: Neural Networks for Newbies, and Convolutional Neural Networks. This is prerequisite material for understanding deep convolutional architectures.
The Scientist and Engineer's Guide to Digital Signal Processing, Ch. 26, Steven W. Smith, ISBN 0-7506-7444-X
• Combining perceptrons
• Feed-forward: information flow
• Passive node: without a weighted-sum input
• Active node: with a weighted-sum structure (a minimal forward-pass sketch follows)
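The passive/active distinction maps directly onto code: input nodes just pass values along, while each active node computes a weighted sum of its inputs plus a bias and applies a nonlinearity. A minimal NumPy sketch (layer sizes, function names, and the sigmoid choice are illustrative, not from the slides):

```python
import numpy as np

def sigmoid(z):
    # Logistic activation applied at each active node.
    return 1.0 / (1.0 + np.exp(-z))

def feed_forward(x, weights, biases):
    """One forward pass through a fully connected network.

    Passive (input) nodes pass x through unchanged; each active node
    computes a weighted sum of its inputs plus a bias, then applies
    the activation function.
    """
    a = x  # passive input layer: no weighted sum
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)  # active nodes: weighted sum + nonlinearity
    return a

# Tiny example: 3 inputs -> 4 hidden -> 2 outputs
rng = np.random.default_rng(0)
weights = [rng.standard_normal((4, 3)), rng.standard_normal((2, 4))]
biases = [np.zeros(4), np.zeros(2)]
print(feed_forward(np.array([0.5, -1.0, 2.0]), weights, biases))
```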
Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations
Honglak Lee, Roger Grosse, Rajesh Ranganath, Andrew Y. Ng
Computer Science Department, Stanford University, Stanford, CA 94305, USA
ICML 2009
There has been much interest in unsupervised learning of hierarchical generative models such
as deep belief networks. Scaling such models to full-sized, high-dimensional images remains
a difficult problem. To address this problem, we present the convolutional deep belief
network, a hierarchical generative model which scales to realistic image sizes.
This model is translation-invariant and supports efficient bottom-up and top-down
probabilistic inference. Key to our approach is probabilistic max-pooling, a novel technique
which shrinks the representations of higher layers in a probabilistically sound way. Our
experiments show that the algorithm learns useful high-level visual features, such as object
parts, from unlabeled images of objects and natural scenes. We demonstrate excellent
performance on several visual recognition tasks and show that our model can perform
hierarchical (bottom-up and top-down) inference over full-sized images.
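The abstract's key device, probabilistic max-pooling, constrains each pooling block so that at most one detection unit turns on and the pooling unit is on exactly when one of them is. A rough sketch of the sampling step, reconstructed from the paper's description and assuming square blocks that tile the layer evenly (the function name and block handling are mine):

```python
import numpy as np

def prob_max_pool(z, region=2, rng=None):
    """Probabilistic max-pooling over non-overlapping blocks.

    Within each region x region block, at most one detection unit
    turns on, with P(h_k = 1) = exp(z_k) / (1 + sum_j exp(z_j));
    the leftover mass is the 'all off' state, and the pooling unit
    is on iff some detection unit in its block fires.
    """
    rng = rng or np.random.default_rng(0)
    H, W = z.shape                      # assumes H, W divisible by region
    hidden = np.zeros_like(z, dtype=int)
    pool = np.zeros((H // region, W // region), dtype=int)
    for i in range(0, H, region):
        for j in range(0, W, region):
            block = z[i:i+region, j:j+region].ravel()
            m = block.max()
            e = np.exp(block - m)       # shifted for numerical stability
            off = np.exp(-m)            # the 'all off' option, shifted too
            p = np.append(e, off) / (e.sum() + off)
            k = rng.choice(p.size, p=p)
            if k < block.size:          # one detection unit fires
                hidden[i + k // region, j + k % region] = 1
                pool[i // region, j // region] = 1
    return hidden, pool

h, p = prob_max_pool(np.random.default_rng(1).standard_normal((4, 4)))
print(h, p, sep="\n")
```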
– "First, we introduce the convolutional RBM (CRBM). Intuitively, the CRBM is similar to the RBM, but the weights between the hidden and visible layers are shared among all locations in an image." (Weight sharing is sketched below.)
– Probabilistic max-pooling
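Weight sharing "among all locations" is just convolution: one small filter is slid across the whole image instead of learning a separate weight matrix per location. A minimal illustration with a naive loop (not the authors' implementation; conv2d_valid is a hypothetical helper):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2-D filtering with one shared kernel: the same weights
    are applied at every image location, instead of a separate weight
    matrix per location as in a standard RBM. (This is cross-correlation,
    the usual CNN convention; a true convolution would flip the kernel.)"""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.random.default_rng(1).standard_normal((8, 8))
kernel = np.full((3, 3), 1.0 / 9.0)           # one shared 3x3 filter
print(conv2d_valid(image, kernel).shape)      # (6, 6): one hidden group
```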
“In general, higher-level feature detectors need information from progressively larger input
regions. Existing translation-invariant representations, such as convolutional networks, often
involve two kinds of layers in alternation: “detection” layers, whose responses are computed
by convolving a feature detector with the previous layer, and “pooling” layers, which shrink
the representation of the detection layers by a constant factor. More specifically, each unit in
a pooling layer computes the maximum activation of the units in a small region of the
detection layer. Shrinking the representation with max-pooling allows higher-layer
representations to be invariant to small translations of the input and reduces the
computational burden.”
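Plain (non-probabilistic) max-pooling as the quote describes it: each pooling unit takes the maximum over a small block of the detection layer, shrinking it by a constant factor. A short sketch assuming non-overlapping square regions (the helper name is mine):

```python
import numpy as np

def max_pool(detection, region=2):
    """Each pooling unit takes the maximum activation over one
    region x region block of the detection layer, shrinking the
    representation by a constant factor."""
    H, W = detection.shape
    H, W = H - H % region, W - W % region     # drop any ragged edge
    blocks = detection[:H, :W].reshape(H // region, region, W // region, region)
    return blocks.max(axis=(1, 3))

det = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool(det))   # [[ 5.  7.] [13. 15.]]
```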
– "To avoid the situation that there exist billions of parameters if all layers are fully connected, the idea of using a convolution operation on small regions" [is used], avoiding overfitting and computational overhead (see the parameter count below).
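A back-of-envelope count shows why full connectivity is untenable at image scale. The sizes below are illustrative only, not taken from any paper:

```python
# Back-of-envelope weight counts; sizes are illustrative only.
H = W = 200                     # a 200 x 200 input image
n_hidden = H * W                # one hidden unit per location

fully_connected = (H * W) * n_hidden    # every pixel to every hidden unit
convolutional = 10 * (11 * 11)          # 10 shared 11 x 11 filters

print(f"fully connected: {fully_connected:,} weights")  # 1,600,000,000
print(f"convolutional:   {convolutional:,} weights")    # 1,210
```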
LeNet-5, Convolutional Neural Networks, Yann LeCun, IEEE, Nov 1998, http://yann.lecun.com/exdb/lenet/
Scikit-learn, Clustering, http://scikit-learn.org/0.11/modules/clustering.html
• Features: more rationally differentiable on the 'Swiss-roll problem' (see the clustering sketch below)
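The Swiss-roll point is easy to reproduce with the cited scikit-learn clustering tools: plain k-means cuts straight through the roll, while a method constrained to a neighborhood graph follows the manifold. A sketch using the current scikit-learn API (the slide cites the 0.11 docs; all parameter choices here are illustrative):

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_swiss_roll
from sklearn.neighbors import kneighbors_graph

# Swiss-roll data: a 2-D manifold rolled up in 3-D space.
X, _ = make_swiss_roll(n_samples=1500, noise=0.05, random_state=0)

# k-means uses raw Euclidean distance, so its clusters cut across the roll.
km = KMeans(n_clusters=6, n_init=10, random_state=0).fit_predict(X)

# Ward linkage restricted to a k-nearest-neighbor graph follows the manifold.
connectivity = kneighbors_graph(X, n_neighbors=10, include_self=False)
ward = AgglomerativeClustering(
    n_clusters=6, connectivity=connectivity, linkage="ward"
).fit_predict(X)

print(np.bincount(km), np.bincount(ward))  # cluster sizes for each method
```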
Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations, Honglak Lee, Roger Grosse, Rajesh Ranganath, Andrew Y. Ng, ICML 2009