In deep learning, the RBM is the basic building block of each layer in the hierarchy. These slides cover the basic components of an RBM: the bipartite graph, Gibbs sampling, contrastive divergence (CD-1), and the energy function.


- 1. A Practical Guide to Training Restricted Boltzmann Machines, Aug 2010, Geoffrey Hinton (University of Toronto); Learning Multiple Layers of Representation, Science Direct 2007, Geoffrey Hinton (University of Toronto). Jaehyun Ahn, Nov. 27, 2015, Sogang University
- 2. A Practical Guide to Training Restricted Boltzmann Machines, Aug 2010, Geoffrey Hinton (University of Toronto); Learning Multiple Layers of Representation, Science Direct 2007, Geoffrey Hinton (University of Toronto)
- 3. Overview: an RBM requires 7 meta-parameters to learn: the learning rate, the momentum, the weight-cost, the sparsity target, the initial values of the weights, the number of hidden units, and the size of each mini-batch. But this does not explain why the decisions were made or how minor changes will affect performance.
- 4. Overview (repeats slide 3), plus a figure: a comparison of neural network architectures.
- 5. Hopfield energy function of the RBM, with $v_i, h_j \in \{0, 1\}$:
  $E(v, h) = -\sum_{i \in \mathrm{visible}} a_i v_i - \sum_{j \in \mathrm{hidden}} b_j h_j - \sum_{i,j} v_i h_j w_{ij}$
  This determines the probability distribution over visible and hidden vectors:
  $p(v, h) = \frac{1}{Z} e^{-E(v, h)}$, where $Z = \sum_{v, h} e^{-E(v, h)}$
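The energy and partition function on slide 5 translate into a few lines of NumPy. This is a minimal sketch, not part of the original deck: the function names `rbm_energy` and `joint_prob` are mine, while `a`, `b`, `W` (visible biases, hidden biases, weights) follow the slide's notation. Enumerating Z is only feasible for tiny models.

```python
import numpy as np
import itertools

def rbm_energy(v, h, a, b, W):
    """Hopfield-style RBM energy:
    E(v, h) = -sum_i a_i v_i - sum_j b_j h_j - sum_ij v_i h_j w_ij"""
    return -a @ v - b @ h - v @ W @ h

def joint_prob(v, h, a, b, W, all_v, all_h):
    """p(v, h) = exp(-E(v, h)) / Z, with Z summed over every
    (v, h) configuration -- exponential cost, tiny models only."""
    Z = sum(np.exp(-rbm_energy(vv, hh, a, b, W))
            for vv in all_v for hh in all_h)
    return np.exp(-rbm_energy(v, h, a, b, W)) / Z

# Tiny 2-visible / 1-hidden example: the probabilities sum to 1.
all_v = [np.array(t, float) for t in itertools.product([0, 1], repeat=2)]
all_h = [np.array(t, float) for t in itertools.product([0, 1], repeat=1)]
a, b = np.array([0.1, -0.2]), np.array([0.3])
W = np.array([[0.5], [-0.5]])
total = sum(joint_prob(vv, hh, a, b, W, all_v, all_h)
            for vv in all_v for hh in all_h)
```

Because Z normalizes over every configuration, `total` comes out to 1, which is a quick sanity check on the energy implementation.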
- 6. The probability that the network assigns to a visible vector is given by summing over all possible hidden vectors:
  $p(v) = \frac{1}{Z} \sum_h e^{-E(v, h)}$, with $Z = \sum_{v, h} e^{-E(v, h)}$
- 7. The probability that the network assigns to a training image can be raised by adjusting the weights and biases to lower the energy of that image:
  $p(v) = \frac{1}{Z} \sum_h e^{-E(v, h)}$, $Z = \sum_{v, h} e^{-E(v, h)}$
- 8. Maximizing the log-likelihood gives the gradient
  $\frac{\partial \log p(v)}{\partial w_{ij}} = \langle v_i h_j \rangle_{\mathrm{data}} - \langle v_i h_j \rangle_{\mathrm{model}}$
  so the update rule is
  $\Delta w_{ij} = \epsilon (\langle v_i h_j \rangle_{\mathrm{data}} - \langle v_i h_j \rangle_{\mathrm{model}})$, where $\epsilon$ is the learning rate.
  This is learned with contrastive divergence; CD-k denotes k-step CD.
- 9. How to get $\Delta w_{ij} = \epsilon (\langle v_i h_j \rangle_{\mathrm{data}} - \langle v_i h_j \rangle_{\mathrm{model}})$:
- 10. In $\Delta w_{ij} = \epsilon (\langle v_i h_j \rangle_{\mathrm{data}} - \langle v_i h_j \rangle_{\mathrm{model}})$, the first term is what we obtain from the training data v (the positive phase); the second term, the weight derivative under the probability distribution defined by the ideal weights, is the negative phase.
- 11. The negative phase $\langle v_i h_j \rangle_{\mathrm{model}}$ is the hard part to compute. Why? Because we do not know the weights that would produce the ideal distribution of hidden nodes (features).
- 12. How to get $\langle v_i h_j \rangle_{\mathrm{model}}$: with Gibbs sampling we can sample from the joint distribution $p(v, h)$, but to do so we need to know the conditional distributions.
- 13. Gibbs sampling alternates between $p(v \mid h)$ and $p(h \mid v)$. The sampling update rule is
  $p(h_j = 1 \mid v) = \sigma(b_j + \sum_i v_i w_{ij})$
- 14. The other sampling update rule is
  $p(v_i = 1 \mid h) = \sigma(a_i + \sum_j h_j w_{ij})$
  Alternating the two rules drives the chain toward energy equilibrium.
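The two conditional distributions on slides 13-14 translate directly into code. A minimal sketch (the function names are mine, not from the slides; `a`, `b`, `W` follow the slides' notation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_h_given_v(v, b, W, rng):
    """One Gibbs half-step: p(h_j = 1 | v) = sigmoid(b_j + sum_i v_i w_ij),
    then a Bernoulli draw per hidden unit."""
    p = sigmoid(b + v @ W)
    return (rng.random(p.shape) < p).astype(float)

def sample_v_given_h(h, a, W, rng):
    """The other half-step: p(v_i = 1 | h) = sigmoid(a_i + sum_j h_j w_ij)."""
    p = sigmoid(a + W @ h)
    return (rng.random(p.shape) < p).astype(float)

# With zero weights and biases, each hidden unit fires with probability 1/2.
rng = np.random.default_rng(0)
v = np.array([1.0, 0.0, 1.0])
h = sample_h_given_v(v, np.zeros(2), np.zeros((3, 2)), rng)
```

Alternating these two half-steps is exactly the Gibbs chain the slides describe; running it long enough approaches the equilibrium distribution.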
- 15. Recognition: the image becomes a binary input vector (e.g. 01010110…); each hidden unit is set to 1 if $p(h_j = 1 \mid v) = \sigma(b_j + \sum_i v_i w_{ij}) > \frac{1}{2}$, and to 0 otherwise.
- 16. Generation (= inference): reconstruct a visible vector (e.g. 11110110…) with $p(v_i = 1 \mid h) = \sigma(a_i + \sum_j h_j w_{ij})$, compare it with the input vector, and update the weights by $\Delta w_{ij}$.
- 17. Now we can get $\langle v_i h_j \rangle_{\mathrm{model}}$ by Gibbs sampling with $p(v \mid h)$ and $p(h \mid v)$: when the N-th Gibbs sampling step has reached energy equilibrium, $w_{ij}$ is decided.
- 18. Together with the update rule $\Delta w_{ij} = \epsilon (\langle v_i h_j \rangle_{\mathrm{data}} - \langle v_i h_j \rangle_{\mathrm{model}})$, this completes the learning procedure.
- 19. (figure only)
- 20. (figure only)
- 21. (figure, annotated with $\Delta w_{ij} = \epsilon (\langle v_i h_j \rangle_{\mathrm{data}} - \langle v_i h_j \rangle_{\mathrm{model}})$)
- 22. (figure, annotated with $p(v_i = 1 \mid h) = \sigma(a_i + \sum_j h_j w_{ij})$)
- 23. (figure, annotated with $p(h_j = 1 \mid v) = \sigma(b_j + \sum_i v_i w_{ij})$)
- 24. When the algorithm terminates, we obtain the CD-1 weights and the biases (b, c), and training is complete.
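Putting slides 13-24 together, one CD-1 update can be sketched as follows. This is a minimal NumPy illustration of the procedure the slides describe, not Hinton's reference code; the function name and the choice to use probabilities (rather than samples) in the outer products are my assumptions.

```python
import numpy as np

def cd1_step(v0, a, b, W, eps=0.1, rng=None):
    """One contrastive-divergence (CD-1) update: positive phase from
    the data vector v0, a single Gibbs step for the negative phase,
    then dW = eps * (<v h>_data - <v h>_model)."""
    rng = rng if rng is not None else np.random.default_rng()
    sig = lambda x: 1.0 / (1.0 + np.exp(-x))
    ph0 = sig(b + v0 @ W)                          # p(h|v0): positive phase
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sig(a + W @ h0)                          # reconstruction p(v|h0)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sig(b + v1 @ W)                          # negative-phase statistics
    W = W + eps * (np.outer(v0, ph0) - np.outer(v1, ph1))
    a = a + eps * (v0 - v1)                        # visible bias update
    b = b + eps * (ph0 - ph1)                      # hidden bias update
    return W, a, b

rng = np.random.default_rng(0)
W0 = rng.normal(scale=0.01, size=(4, 3))
W2, a2, b2 = cd1_step(np.array([1., 0., 1., 0.]),
                      np.zeros(4), np.zeros(3), W0, rng=rng)
```

Looping `cd1_step` over mini-batches of training vectors is the training loop whose termination slide 24 refers to.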
- 25. (figure only)
- 26. An Analysis of Single-Layer Networks in Unsupervised Feature Learning, 2011, Honglak Lee. Effective learning features of a 1-hidden-layer RBM: features (the number of hidden nodes), receptive fields (filters, field size), and whitening. Whitening tries to accomplish two things: (1) make the features less correlated with one another, and (2) give all of the features the same variance. It has two simple steps: (1) project the dataset onto the eigenvectors, which rotates it so that there is no correlation between the components; (2) normalize the dataset so every component has variance 1, by dividing each component by the square root of its eigenvalue.
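The two whitening steps on slide 26 can be sketched directly in NumPy. A minimal PCA-whitening illustration; the small `eps` regularizer is my addition to avoid dividing by near-zero eigenvalues, and the fixed mixing matrix below is just synthetic correlated test data.

```python
import numpy as np

def whiten(X, eps=1e-5):
    """PCA whitening. Step 1: project onto the eigenvectors of the
    covariance matrix (removes correlation between components).
    Step 2: divide each component by the square root of its
    eigenvalue (gives every component unit variance)."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / Xc.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)
    X_rot = Xc @ eigvecs                    # step 1: rotate / decorrelate
    return X_rot / np.sqrt(eigvals + eps)   # step 2: equalize variances

# Correlated synthetic data -> after whitening, covariance ~ identity.
rng = np.random.default_rng(0)
mix = np.array([[2., 1., 0.], [1., 2., 1.], [0., 1., 2.]])
X = rng.normal(size=(500, 3)) @ mix
Xw = whiten(X)
cov_w = Xw.T @ Xw / Xw.shape[0]
```

Checking `cov_w` against the identity matrix verifies both properties the slide lists: zero cross-correlation and unit variance per component.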
- 27. Example: Olivetti faces. 64x64-pixel grayscale images, 400 samples; 40 classes, with 10 faces of each person. Source: http://corpocrat.com/2014/10/17/machine-learning-using-restricted-boltzmann-machines/
- 28. Example: Olivetti faces. Step 1: {0-1} scaling; step 2: convolve (up, down, left, right).
  X = np.asarray(X, 'float32')
  X = (X - np.min(X, 0)) / (np.max(X, 0) + 0.0001)  # 0 < x < 1 scaling
  Convolution: http://juanreyero.com/article/python/python-convolution.html
- 29. Example: Olivetti faces. Step 2: convolve (up, down, left, right).
  def nudge_dataset(X, Y):
      """This produces a dataset 5 times bigger than the original one,
      by moving the 64x64 images in X around by 1px left, right, down, up."""
      direction_vectors = [
          [[0, 1, 0], [0, 0, 0], [0, 0, 0]],
          [[0, 0, 0], [1, 0, 0], [0, 0, 0]],
          [[0, 0, 0], [0, 0, 1], [0, 0, 0]],
          [[0, 0, 0], [0, 0, 0], [0, 1, 0]]]
      shift = lambda x, w: convolve(x.reshape((64, 64)), mode='constant',
                                    weights=w).ravel()
      X = np.concatenate([X] + [np.apply_along_axis(shift, 1, X, vector)
                                for vector in direction_vectors])
      Y = np.concatenate([Y for _ in range(5)], axis=0)
      return X, Y
  # Convert image array to binary with threshold
  X = X > 0.5  # True / False
- 30. Example: Olivetti faces. Step 3: training.
  logistic = linear_model.LogisticRegression(C=10)
  rbm = BernoulliRBM(n_components=180, learning_rate=0.01, batch_size=10,
                     n_iter=50, verbose=True, random_state=None)
  clf = Pipeline(steps=[('rbm', rbm), ('clf', logistic)])
  X_train, X_test, Y_train, Y_test = cross_validation.train_test_split(
      X, Y, test_size=0.2, random_state=0)
  clf.fit(X_train, Y_train)
  Y_pred = clf.predict(X_test)
  print 'Score: ', (metrics.classification_report(Y_test, Y_pred))
  *n_components: the number of binary hidden units
- 31. Example: Olivetti faces. Step 4: plot the RBM components of the first 16 faces.
  comp = rbm.components_
  image_shape = (64, 64)
  def plot_gallery(title, images, n_col, n_row):
      plt.figure(figsize=(2. * n_col, 2.26 * n_row))
      plt.suptitle(title, size=16)
      for i, comp in enumerate(images):
          plt.subplot(n_row, n_col, i + 1)
          vmax = max(comp.max(), -comp.min())
          plt.imshow(comp.reshape(image_shape), cmap=plt.cm.gray,
                     vmin=-vmax, vmax=vmax)
          plt.xticks(())
          plt.yticks(())
      plt.subplots_adjust(0.01, 0.05, 0.99, 0.93, 0.04, 0.)
      plt.show()
  plot_gallery('RBM components', comp[:16], 4, 4)
- 32. Learning Multiple Layers of Representation, Science Direct 2007, Geoffrey Hinton (University of Toronto)
- 33. (repeats slide 32)
- 34. Overview: multilayer generative models; approximate inference for multilayer generative models; learning many layers of features by composing RBMs.
- 35. Overview (repeats slide 34). We already covered the RBM material in slides 11 to 25.
- 36. What is a generative model? A generated sample (image) is optimized for recognition.
- 37. Multilayer generative model vs. single-layer generative model. Why do we use a multilayer generative model for complex recognition (= why deep learning)? "Generative models with only one hidden layer are much too simple for modeling the high-dimensional and richly structured sensory data that arrive at the cortex, but they have been pressed into service because, until recently, it was too difficult to perform inference…"
- 38. (repeats slide 37) Who? Yann LeCun!
- 39. A multilayer generative model takes advantage of high-dimensional, rich data for recognition.
- 40. Then why start from these linear components, when we could seemingly compute the result directly? "The role of the bottom-up connections is to enable the network to determine activations for the features in each layer that constitute a plausible explanation (…) some test images that the network classifies correctly even though it has never seen them before." (Figures: Yann LeCun's digit features.)
- 41. Then how can we obtain weights that detect features shaped like these linear / partial / whole structures? (Figure: Yann LeCun's digit features.)
- 42. (a) Two separate restricted Boltzmann machines (RBMs). The higher-level RBM is trained by using the hidden activities of the lower RBM as data. (b) Composing the two RBMs. Note that the connections in the lower level of the composite generative model are directed. The hidden states are still inferred by using bottom-up recognition connections, but these are no longer part of the generative model.
- 43. While updating weights: recognition uses $p(h_j = 1 \mid v) > \frac{1}{2}$, inference uses $p(v_i = 1 \mid h) > \frac{1}{2}$.
- 44. (repeats slide 43)
- 45. (repeats slide 43)
- 46. While updating weights, why use inference only? It gives quick, fast recognition with no repeated weight calculation, and by detecting and tolerating misclassifications it can secure local features.
