
Learning RBM (Restricted Boltzmann Machine) in Practice

In deep learning, the RBM is a basic building block of each layer in a hierarchical model. These slides cover the basic components of an RBM: the bipartite graph structure, Gibbs sampling, contrastive divergence (CD-1), and the energy function.



  1. A Practical Guide to Training Restricted Boltzmann Machines, Aug 2010, Geoffrey Hinton (University of Toronto). Learning Multiple Layers of Representation, Science Direct 2007, Geoffrey Hinton (University of Toronto). Jaehyun Ahn, Nov. 27, 2015, Sogang University
  2. A Practical Guide to Training Restricted Boltzmann Machines, Aug 2010, Geoffrey Hinton (University of Toronto). Learning Multiple Layers of Representation, Science Direct 2007, Geoffrey Hinton (University of Toronto)
  3. A Practical Guide to Training Restricted Boltzmann Machines • Overview • An RBM requires 7 meta-parameters to learn: – the learning rate – the momentum – the weight-cost – the sparsity target – the initial values of the weights – the number of hidden units – the size of each mini-batch • But this list does not explain why the decisions were made or how minor changes will affect performance
  4. A Practical Guide to Training Restricted Boltzmann Machines • Overview (same meta-parameter list as slide 3) [figure: A comparison of neural network architectures]
  5. A Practical Guide to Training Restricted Boltzmann Machines • Hopfield energy function of an RBM, with binary units $v_i, h_j \in \{0, 1\}$: $E(v, h) = -\sum_{i \in \text{visible}} a_i v_i - \sum_{j \in \text{hidden}} b_j h_j - \sum_{i,j} v_i h_j w_{ij}$ • This energy decides the probability distribution over visible and hidden vectors: $p(v, h) = \frac{1}{Z} e^{-E(v,h)}$, where $Z = \sum_{v,h} e^{-E(v,h)}$
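To make the energy function concrete, here is a minimal NumPy sketch (the variable names a, b, W for the visible biases, hidden biases, and weight matrix are mine, chosen to match the formulas above) that evaluates $E(v, h)$ and the unnormalized probability $e^{-E(v,h)}$ for one binary configuration:

      import numpy as np

      def energy(v, h, a, b, W):
          # E(v, h) = -a.v - b.h - v.W.h, with v and h binary vectors
          return -np.dot(a, v) - np.dot(b, h) - np.dot(v, W @ h)

      # toy RBM with 3 visible and 2 hidden units
      rng = np.random.default_rng(0)
      a, b = rng.normal(size=3), rng.normal(size=2)
      W = rng.normal(size=(3, 2))
      v, h = np.array([1, 0, 1]), np.array([0, 1])

      # unnormalized probability; dividing by Z (the sum of exp(-E) over all
      # 2^3 * 2^2 binary configurations) would give p(v, h)
      p_unnorm = np.exp(-energy(v, h, a, b, W))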
  6. A Practical Guide to Training Restricted Boltzmann Machines • The probability that the network assigns to a visible vector is given by summing over all possible hidden vectors: $p(v) = \frac{1}{Z} \sum_h e^{-E(v,h)}$, with $Z = \sum_{v,h} e^{-E(v,h)}$
  7. A Practical Guide to Training Restricted Boltzmann Machines • The probability that the network assigns to a training image can be raised by adjusting the weights and biases to lower that image's energy: $p(v) = \frac{1}{Z} \sum_h e^{-E(v,h)}$, with $Z = \sum_{v,h} e^{-E(v,h)}$
  8. A Practical Guide to Training Restricted Boltzmann Machines • Maximizing the log likelihood gives the gradient $\frac{\partial \log p(v)}{\partial w_{ij}} = \langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{model}}$ and the learning rule $\Delta w_{ij} = \epsilon (\langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{model}})$, where $\epsilon$ is the learning rate • Contrastive Divergence: CD-k = k-step CD
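The gradient on this slide follows directly from the two definitions above; as a quick check (a standard derivation, not spelled out in the slides), note that $-\partial E(v,h)/\partial w_{ij} = v_i h_j$, so

      \log p(v) = \log \sum_h e^{-E(v,h)} - \log Z
      \frac{\partial \log p(v)}{\partial w_{ij}}
        = \sum_h p(h \mid v)\, v_i h_j - \sum_{v',h} p(v',h)\, v'_i h_j
        = \langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{model}}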
  9. A Practical Guide to Training Restricted Boltzmann Machines • How to get: $\Delta w_{ij} = \epsilon (\langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{model}})$
  10. A Practical Guide to Training Restricted Boltzmann Machines • How to get $\Delta w_{ij} = \epsilon (\langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{model}})$ • $\langle v_i h_j \rangle_{\text{data}}$ is the result we obtain from the training data $v$ (the positive phase) • $\langle v_i h_j \rangle_{\text{model}}$ is the weight derivative of the probability distribution formed by the ideal weights (the negative phase)
  11. A Practical Guide to Training Restricted Boltzmann Machines • How to get $\Delta w_{ij} = \epsilon (\langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{model}})$ • $\langle v_i h_j \rangle_{\text{data}}$ is the result we obtain from the training data $v$ • $\langle v_i h_j \rangle_{\text{model}}$, the weight derivative of the probability distribution formed by the ideal weights, is the hard part to compute. Why? Because we do not know the weights that would produce the ideal distribution of the hidden nodes (features)
  12. A Practical Guide to Training Restricted Boltzmann Machines • How to get $\langle v_i h_j \rangle_{\text{model}}$: by using Gibbs sampling we can draw from the joint distribution $p(v, h)$, but we need to know $p(v, h)$
  13. A Practical Guide to Training Restricted Boltzmann Machines • How to get $\langle v_i h_j \rangle_{\text{model}}$: Gibbs sampling alternates between $p(v \mid h)$ and $p(h \mid v)$ • Sampling update rule: $p(h_j = 1 \mid v) = \sigma(b_j + \sum_i v_i w_{ij})$
  14. A Practical Guide to Training Restricted Boltzmann Machines • How to get $\langle v_i h_j \rangle_{\text{model}}$: Gibbs sampling alternates between $p(v \mid h)$ and $p(h \mid v)$ • Sampling update rules: $p(h_j = 1 \mid v) = \sigma(b_j + \sum_i v_i w_{ij})$ and $p(v_i = 1 \mid h) = \sigma(a_i + \sum_j h_j w_{ij})$ • Alternating these samples drives the chain toward energy equilibrium
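As a minimal sketch of one alternating Gibbs step under these two update rules (NumPy; the helper names sigmoid and gibbs_step are mine):

      import numpy as np

      def sigmoid(x):
          return 1.0 / (1.0 + np.exp(-x))

      def gibbs_step(v, a, b, W, rng):
          # Because the graph is bipartite, all hidden (resp. visible) units
          # are conditionally independent and can be sampled in parallel.
          p_h = sigmoid(b + v @ W)                        # p(h_j = 1 | v)
          h = (rng.random(p_h.shape) < p_h).astype(int)   # sample h ~ p(h | v)
          p_v = sigmoid(a + W @ h)                        # p(v_i = 1 | h)
          v_new = (rng.random(p_v.shape) < p_v).astype(int)
          return v_new, h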
  15. A Practical Guide to Training Restricted Boltzmann Machines • How to get $\langle v_i h_j \rangle_{\text{model}}$ via Gibbs sampling of $p(v \mid h)$ and $p(h \mid v)$ • Recognition: given an image input vector 01010110 …, compute $p(h_j = 1 \mid v) = \sigma(b_j + \sum_i v_i w_{ij})$ and set $h_j$ to 1 if $p(h_j = 1 \mid v) > 1/2$, else 0
  16. A Practical Guide to Training Restricted Boltzmann Machines • Generation (= inference): $p(v_i = 1 \mid h) = \sigma(a_i + \sum_j h_j w_{ij})$ produces a reconstruction 11110110 …, which is compared with the input 01010110 … for the weight update $\Delta w_{ij}$
  17. A Practical Guide to Training Restricted Boltzmann Machines • Now we can get $\langle v_i h_j \rangle_{\text{model}}$ by Gibbs sampling of $p(v \mid h)$ and $p(h \mid v)$: $w_{ij}$ is decided when the N-th Gibbs sampling step has reached energy equilibrium
  18. A Practical Guide to Training Restricted Boltzmann Machines • Now we can get $\langle v_i h_j \rangle_{\text{model}}$, so the update $\Delta w_{ij} = \epsilon (\langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{model}})$ can be computed
  19. A Practical Guide to Training Restricted Boltzmann Machines [figure]
  20. A Practical Guide to Training Restricted Boltzmann Machines [figure]
  21. A Practical Guide to Training Restricted Boltzmann Machines • $\Delta w_{ij} = \epsilon (\langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{model}})$
  22. A Practical Guide to Training Restricted Boltzmann Machines • $p(v_i = 1 \mid h) = \sigma(a_i + \sum_j h_j w_{ij})$
  23. A Practical Guide to Training Restricted Boltzmann Machines • $p(h_j = 1 \mid v) = \sigma(b_j + \sum_i v_i w_{ij})$
  24. A Practical Guide to Training Restricted Boltzmann Machines • When the algorithm terminates, CD-1 yields the weights and the biases (b, c), and training is complete
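Putting the pieces together, a minimal CD-1 step for one data vector might look like the sketch below (my own illustration in the notation above, not Hinton's reference code; following common practice the pairwise statistics use hidden probabilities rather than sampled states, which reduces sampling noise; it reuses numpy as np and sigmoid from the Gibbs-step sketch above):

      def cd1_update(v0, a, b, W, lr, rng):
          ph0 = sigmoid(b + v0 @ W)                        # positive phase: p(h | v0)
          h0 = (rng.random(ph0.shape) < ph0).astype(int)   # sampled hidden states
          pv1 = sigmoid(a + W @ h0)                        # reconstruct the visible units
          v1 = (rng.random(pv1.shape) < pv1).astype(int)
          ph1 = sigmoid(b + v1 @ W)                        # negative phase: p(h | v1)
          # eps * (<v h>_data - <v h>_recon), plus the matching bias updates
          W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
          a += lr * (v0 - v1)
          b += lr * (ph0 - ph1)
          return W, a, b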
  25. A Practical Guide to Training Restricted Boltzmann Machines [figure]
  26. An Analysis of Single-Layer Networks in Unsupervised Feature Learning • Effective learning features of a 1-hidden-layer RBM: – features (# of hidden nodes) – receptive fields (filters, field size) – whitening. 2011, Honglak Lee. There are two things we are trying to accomplish with whitening: 1. Make the features less correlated with one another. 2. Give all of the features the same variance. Whitening has two simple steps: 1. Project the dataset onto the eigenvectors. This rotates the dataset so that there is no correlation between the components. 2. Normalize the dataset to have a variance of 1 for all components, by simply dividing each component by the square root of its eigenvalue.
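A minimal sketch of those two whitening steps with NumPy (the small eps term is my addition, to avoid dividing by near-zero eigenvalues):

      import numpy as np

      def whiten(X, eps=1e-5):
          Xc = X - X.mean(axis=0)                  # center the data first
          cov = np.cov(Xc, rowvar=False)
          eigvals, eigvecs = np.linalg.eigh(cov)   # covariance is symmetric
          Xrot = Xc @ eigvecs                      # step 1: rotate / decorrelate
          return Xrot / np.sqrt(eigvals + eps)     # step 2: unit variance per component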
  27. Example: Olivetti faces • 64x64-pixel grayscale images, 400 samples • 40 classes, 10 faces of each person. Source: http://corpocrat.com/2014/10/17/machine-learning-using-restricted-boltzmann-machines/
  28. Example: Olivetti faces. 1. {0-1} scaling 2. Convolve (up, down, left, right). Convolve: http://juanreyero.com/article/python/python-convolution.html

      X = np.asarray(X, 'float32')
      X = (X - np.min(X, 0)) / (np.max(X, 0) + 0.0001)  # scale to 0 < x < 1
  29. Example: Olivetti faces. 2. Convolve (up, down, left, right). Convolve: http://juanreyero.com/article/python/python-convolution.html

      from scipy.ndimage import convolve
      import numpy as np

      def nudge_dataset(X, Y):
          """This produces a dataset 5 times bigger than the original one,
          by moving the 64x64 images in X around by 1px to left, right,
          down, up."""
          direction_vectors = [
              [[0, 1, 0], [0, 0, 0], [0, 0, 0]],
              [[0, 0, 0], [1, 0, 0], [0, 0, 0]],
              [[0, 0, 0], [0, 0, 1], [0, 0, 0]],
              [[0, 0, 0], [0, 0, 0], [0, 1, 0]]]
          shift = lambda x, w: convolve(x.reshape((64, 64)), mode='constant',
                                        weights=w).ravel()
          X = np.concatenate([X] + [np.apply_along_axis(shift, 1, X, vector)
                                    for vector in direction_vectors])
          Y = np.concatenate([Y for _ in range(5)], axis=0)
          return X, Y

      # Convert image array to binary with threshold
      X = X > 0.5  # True / False
  30. Example: Olivetti faces. 3. Training (*n_components: # of binary hidden units)

      from sklearn import linear_model, metrics, cross_validation
      from sklearn.neural_network import BernoulliRBM
      from sklearn.pipeline import Pipeline

      logistic = linear_model.LogisticRegression(C=10)
      rbm = BernoulliRBM(n_components=180, learning_rate=0.01, batch_size=10,
                         n_iter=50, verbose=True, random_state=None)
      clf = Pipeline(steps=[('rbm', rbm), ('clf', logistic)])
      X_train, X_test, Y_train, Y_test = cross_validation.train_test_split(
          X, Y, test_size=0.2, random_state=0)
      clf.fit(X_train, Y_train)
      Y_pred = clf.predict(X_test)
      print 'Score: ', (metrics.classification_report(Y_test, Y_pred))
  31. Example: Olivetti faces. 4. Plot RBM components of the first 16 faces

      import matplotlib.pyplot as plt

      comp = rbm.components_
      image_shape = (64, 64)

      def plot_gallery(title, images, n_col, n_row):
          plt.figure(figsize=(2. * n_col, 2.26 * n_row))
          plt.suptitle(title, size=16)
          for i, comp in enumerate(images):
              plt.subplot(n_row, n_col, i + 1)
              vmax = max(comp.max(), -comp.min())
              plt.imshow(comp.reshape(image_shape), cmap=plt.cm.gray,
                         vmin=-vmax, vmax=vmax)
              plt.xticks(())
              plt.yticks(())
          plt.subplots_adjust(0.01, 0.05, 0.99, 0.93, 0.04, 0.)
          plt.show()

      plot_gallery('RBM components', comp[:16], 4, 4)
  32. A Practical Guide to Training Restricted Boltzmann Machines, Aug 2010, Geoffrey Hinton (University of Toronto). Learning Multiple Layers of Representation, Science Direct 2007, Geoffrey Hinton (University of Toronto)
  33. A Practical Guide to Training Restricted Boltzmann Machines, Aug 2010, Geoffrey Hinton (University of Toronto). Learning Multiple Layers of Representation, Science Direct 2007, Geoffrey Hinton (University of Toronto)
  34. Learning Multiple Layers of Representation • Overview – Multilayer generative models – Approximate inference for multilayer generative models – Learning many layers of features by composing RBMs
  35. Learning Multiple Layers of Representation • Overview – Multilayer generative models – Approximate inference for multilayer generative models – Learning many layers of features by composing RBMs • We already covered this part, in slides 11 to 25
  36. Learning Multiple Layers of Representation • Overview – Multilayer generative models – Approximate inference for multilayer generative models – Learning many layers of features by composing RBMs • What is a generative model? A generated sample (image) is optimized for recognition
  37. Learning Multiple Layers of Representation • Multilayer generative model vs. generative model • Why do we use a multilayer generative model for complex recognition? (= Why deep learning?) "Generative models with only one hidden layer are much too simple for modeling the high-dimensional and richly structured sensory data that arrive at the cortex, but they have been pressed into service because, until recently, it was too difficult to perform inference…"
  38. Learning Multiple Layers of Representation • Why do we use a multilayer generative model for complex recognition? "…they have been pressed into service because, until recently, it was too difficult to perform inference…" Who? Yann LeCun!
  39. Learning Multiple Layers of Representation • Multilayer generative model: take advantage of high-dimensional, richly structured data for recognition
  40. Learning Multiple Layers of Representation • Then why does the network attend to linear elements like these first? We could just compute the result directly. "The role of the bottom-up connections is to enable the network to determine activations for the features in each layer that constitute a plausible explanation (…) some test images that the network classifies correctly even though it has never seen them before" (figure: Yann LeCun)
  41. Learning Multiple Layers of Representation • Then how can we obtain weights that pick out features at these levels: linear parts, sub-structures, and whole-object structures? (figure: Yann LeCun)
  42. Learning Multiple Layers of Representation • (a) Two separate restricted Boltzmann machines (RBMs). The higher-level RBM is trained by using the hidden activities of the lower RBM as data. (b) Composing the two RBMs. Note that the connections in the lower level of the composite generative model are directed. The hidden states are still inferred by using bottom-up recognition connections, but these are no longer part of the generative model.
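A minimal sketch of this greedy, layer-wise composition with scikit-learn's BernoulliRBM (the layer sizes and learning parameters here are illustrative, not from the paper): train the lower RBM on the data, then train the higher-level RBM on the lower RBM's hidden activities.

      from sklearn.neural_network import BernoulliRBM

      # assume X is a binary (n_samples, n_features) array,
      # e.g. the thresholded Olivetti faces from the example above
      rbm1 = BernoulliRBM(n_components=180, learning_rate=0.01,
                          batch_size=10, n_iter=20, random_state=0)
      H1 = rbm1.fit_transform(X)   # hidden activities p(h=1|v) of the lower RBM

      rbm2 = BernoulliRBM(n_components=100, learning_rate=0.01,
                          batch_size=10, n_iter=20, random_state=0)
      H2 = rbm2.fit_transform(H1)  # the higher RBM treats H1 as its data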
  43. Learning Multiple Layers of Representation • While updating weights: recognition when $p(h_j = 1 \mid v) > 1/2$; inference (generation) when $p(v_i = 1 \mid h) > 1/2$
  44. Learning Multiple Layers of Representation • While updating weights: recognition when $p(h_j = 1 \mid v) > 1/2$; inference (generation) when $p(v_i = 1 \mid h) > 1/2$
  45. Learning Multiple Layers of Representation • While updating weights: recognition when $p(h_j = 1 \mid v) > 1/2$; inference (generation) when $p(v_i = 1 \mid h) > 1/2$
  46. Learning Multiple Layers of Representation • While updating weights, inference only. Why inference only? – quick, fast recognition: no repeated weight calculation – misclassifications can be detected and tolerated, which makes it possible to secure local features
