One-shot learning by inverting a compositional causal process

Brenden M. Lake
Ruslan Salakhutdinov
Joshua B. Tenenbaum

能地宏 @NII

※ The figures in these slides are taken from the paper.
[Figure 1 from the paper: Can you learn a new concept from just one example? (a & b) Where are the other examples of the concept shown in red? Answers for b) are row 4, column 3 (left) and row 2, column 4 (right).]
[Figure from the paper: one-shot generation. Given a single example character, new exemplars produced by People, by HBPL, and by an Affine model are compared in a visual Turing test.]
Overview
‣ Humans can extract the characteristic features of a symbol from just one example
- Classification: they can pick out similar symbols
- Generation: they can produce new samples
‣ Machine learning typically requires large amounts of data per label
- e.g. MNIST: 6,000 training examples per class
‣ Task and contributions
- Can machine learning imitate this human ability?
- With a carefully designed generative model, the authors obtain results comparable to humans
- This suggests that humans may extract features through a similar mechanism
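To make the one-shot classification task concrete, here is a toy episode setup: one labeled example per class, and a held-out test item to classify. This is my own illustrative sketch (the dataset format and scoring function are assumptions, not the paper's code):

```python
import random

def make_episode(dataset, n_way=20, seed=0):
    """Build a one-shot episode: one training example per class, plus one
    held-out test example whose class must be guessed.
    `dataset` maps class name -> list of examples (hypothetical format)."""
    rng = random.Random(seed)
    classes = rng.sample(sorted(dataset), n_way)
    train = {c: rng.choice(dataset[c]) for c in classes}
    test_class = rng.choice(classes)
    # pick a test example different from the training one
    pool = [x for x in dataset[test_class] if x != train[test_class]]
    return train, test_class, rng.choice(pool)

def one_shot_classify(test_item, train, score):
    """Pick the class whose single example best explains the test item."""
    return max(train, key=lambda c: score(test_item, train[c]))

# tiny demo with integers standing in for images; score = negative distance
data = {f"char{i}": [10 * i, 10 * i + 1, 10 * i + 2] for i in range(30)}
train, true_class, test_item = make_episode(data, n_way=20)
pred = one_shot_classify(test_item, train, score=lambda a, b: -abs(a - b))
assert pred == true_class
```

The paper's 20-way setup corresponds to `n_way=20`; HBPL's contribution is in the `score` function (a posterior predictive probability), not in this episode plumbing.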
[Paper abstract, shown on the slide:] People can learn a new visual concept from just one example, yet machine learning algorithms typically require hundreds or thousands of examples to tackle the same problems. Here we present a Hierarchical Bayesian model based on compositionality and causality that can learn a wide range of natural (although simple) visual concepts, generalizing in human-like ways from just one example. We evaluated performance on a challenging one-shot classification task, where our model achieved a human-level error rate while substantially outperforming two deep learning models. We also tested the model on another conceptual task, generating new examples, by using a "visual Turing test" to show that our model produces human-like performance.

Data and Learning

‣ Omniglot dataset
- 50 alphabets; 1,600 characters; 20 examples / character
- The dataset was randomly split into a 30-alphabet "background" set and a 20-alphabet "evaluation" set, constrained so that the background set included the six most common alphabets (as determined by Google hits)
- Background images, paired with their motor data, were used to learn the hyperparameters of the HBPL model, including a set of 1,000 primitive motor elements and position models for a drawing's first, second, third stroke, etc.
- Where possible, cross-validation within the background set was used to decide issues of model complexity within the conditional probability distributions of HBPL

[Figure 2 from the paper: Four alphabets from Omniglot, each with five characters drawn by four different people.]
Results First

One-shot classification (error rate, %):
- Human: 4.5
- HBPL: 4.8
- Affine: 18.2
- Two deep models (DBM, HD): 34.8 and 38

‣ Better performance than deep learning
‣ Nearly the same error rate as humans

One-shot generation
Visual Turing test: judges are shown nine examples of the same symbol and asked which set was drawn by a human; they were correct only 56% of the time (chance is 50%).
The Model

[Figure 3 from the paper: An illustration of the HBPL model generating two character types (left and right), where the dotted line separates the type-level from the token-level variables. Legend: number of strokes κ, relations R, primitive id z (color-coded to highlight sharing), control points x (open circles), scale y, start locations L, trajectories T, transformation A, noise ε and σ_b, and image I.]
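The two-level structure of Figure 3 can be sketched as a toy generative program: a character *type* fixes the structure (number of strokes, primitive ids, control points), and each *token* re-perturbs it and applies a global affine transform. This is my own drastically simplified stand-in for HBPL, not the paper's model; the real distributions over primitives, relations, and trajectories are far richer:

```python
import random

rng = random.Random(0)
PRIMITIVES = list(range(1000))  # ids into a learned library (size from the paper)

def sample_type():
    """Type level: number of strokes kappa, primitive ids z, control points x,
    and start locations L."""
    kappa = rng.choice([1, 2, 3, 4])
    strokes = [{
        "z": rng.choice(PRIMITIVES),
        "x": [(rng.random(), rng.random()) for _ in range(5)],
        "L": (rng.random(), rng.random()),
    } for _ in range(kappa)]
    return {"kappa": kappa, "strokes": strokes}

def sample_token(char_type, jitter=0.02):
    """Token level: perturb the type's control points, then apply a global
    affine transform A = (scale_x, scale_y, shift_x, shift_y) near (1,1,0,0)."""
    A = (1 + rng.gauss(0, 0.05), 1 + rng.gauss(0, 0.05),
         rng.gauss(0, 0.02), rng.gauss(0, 0.02))
    strokes = []
    for s in char_type["strokes"]:
        pts = [((x + rng.gauss(0, jitter)) * A[0] + A[2],
                (y + rng.gauss(0, jitter)) * A[1] + A[3]) for x, y in s["x"]]
        strokes.append({"z": s["z"], "pts": pts})
    return {"A": A, "strokes": strokes}

t = sample_type()
tokens = [sample_token(t) for _ in range(3)]  # new exemplars share structure
assert all(len(tok["strokes"]) == t["kappa"] for tok in tokens)
```

Note how every token reuses the type's primitive ids and stroke count while varying continuously: this is the compositional, causal sharing that one-shot generalization exploits.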
Learning the Hyperparameters

‣ The model learns "common sense" about how symbols are drawn
‣ It uses motor data (recordings of the drawing process, i.e. pen trajectories over time)

[Figure 4 from the paper: learned hyperparameters — a) a library of motor primitives; b) the empirical distribution over the number of strokes; c) empirical stroke start positions for the first, second, third, and later strokes.]

Image model (excerpt from the paper): an image transformation A^(m) ∈ R^4 is sampled from P(A^(m)) = N([1, 1, 0, 0], Σ_A), where the first two elements control a global re-scaling and the second two control a global translation of the center of mass of T^(m). The transformed trajectories can then be rendered as a 105x105 grayscale image, using an ink model adapted from [10] (see Section SI-2). This grayscale image is then perturbed by two noise processes, which make the gradient more robust during optimization and encourage partial solutions during classification: convolution with a Gaussian filter with standard deviation σ_b^(m), and pixel flipping with probability ε^(m).
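The two noise processes in the image model (a Gaussian blur with standard deviation σ_b, then pixel flipping with probability ε) can be sketched on a small binary grid. This is a minimal stand-in for the paper's renderer, with the flip expressed as a deterministic mixing of on-probabilities:

```python
import math

def gaussian_kernel1d(sigma, radius=2):
    k = [math.exp(-i * i / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

def blur_and_flip(img, sigma=0.5, eps=0.01):
    """img: 2D list of 0/1 ink values. Returns P(pixel is on) after a
    separable Gaussian blur and Bernoulli pixel flipping with prob eps."""
    h, w = len(img), len(img[0])
    k = gaussian_kernel1d(sigma)
    r = len(k) // 2
    # horizontal then vertical pass (zero padding at the borders)
    tmp = [[sum(k[d + r] * (img[y][x + d] if 0 <= x + d < w else 0)
                for d in range(-r, r + 1)) for x in range(w)] for y in range(h)]
    blurred = [[sum(k[d + r] * (tmp[y + d][x] if 0 <= y + d < h else 0)
                    for d in range(-r, r + 1)) for x in range(w)] for y in range(h)]
    # flipping: with prob eps the pixel's on-probability is inverted
    return [[(1 - eps) * p + eps * (1 - p) for p in row] for row in blurred]

img = [[0] * 5 for _ in range(5)]
img[2][2] = 1
out = blur_and_flip(img)
assert 0 < out[2][2] < 1 and out[2][1] > out[0][0]
```

As the paper notes, the point of this smoothing is to make likelihood gradients informative even for partially correct parses, rather than to model real pen noise.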

One-shot Classification

‣ For each image, estimate the posterior over strokes and character types
- An algorithm finds K high-probability parses κ^[1], θ^(m)[1], ..., κ^[K], θ^(m)[K] — the most promising candidates proposed by a fast, bottom-up image analysis (detailed in Section SI-5). These parses approximate the posterior with a discrete distribution:

  P(κ, θ^(m) | I^(m)) ≈ Σ_{i=1}^{K} w_i δ(θ^(m) − θ^(m)[i]) δ(κ − κ^[i]),

  where each weight w_i is proportional to the parse score, w_i ∝ w̃_i = P(κ^[i], θ^(m)[i], I^(m)) — the score under the prior, normalized so that Σ_i w_i = 1
‣ Classification: for a test image I^(T) and training images I^(c), c = 1, ..., 20, use the Bayesian rule argmax_c log P(I^(T) | I^(c)) — the probability of regenerating the target image from the type estimated on each training image
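The discrete posterior approximation and the classification rule amount to normalizing parse scores with a log-sum-exp and picking the training class with the highest predictive score. A sketch, using actual parse log-scores from Figure 5 as sample inputs (the `log_predictive` function is a placeholder, not HBPL's likelihood):

```python
import math

def normalize_log_weights(log_scores):
    """w_i proportional to exp(log_score_i), normalized so sum(w) == 1,
    computed stably with the log-sum-exp trick."""
    m = max(log_scores)
    exps = [math.exp(s - m) for s in log_scores]
    z = sum(exps)
    return [e / z for e in exps]

def classify(test_img, train_imgs, log_predictive):
    """argmax_c log P(I_test | I_c): pick the training image whose inferred
    type best regenerates the test image."""
    return max(range(len(train_imgs)),
               key=lambda c: log_predictive(test_img, train_imgs[c]))

w = normalize_log_weights([-59.6, -88.9, -159.0, -168.0])  # scores from Fig. 5
assert abs(sum(w) - 1) < 1e-12
assert w[0] > 0.999  # the best parse dominates after normalization
```

The second assertion shows why a small K works in practice: parse scores differ by tens of log-units, so the normalized weight mass concentrates on the top few parses.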
Posterior Inference (parsing procedure)

1. Perform random walks over the symbol to obtain stroke samples (150 of them)
2. Compute each sample's score under the prior and keep only the top K
3. Using the estimated type variables, estimate the target's token-level variables via MCMC

Rather than using just a point estimate for each parse, the approximation can be improved by incorporating some of the local variance around the type-level variables: with the token level fixed, Metropolis-Hastings is run to produce N samples κ^[i1], ..., κ^[iN] for each parse θ^(m)[i], giving the improved approximation

  P(κ, θ^(m) | I^(m)) ≈ Q(κ, θ^(m), I^(m)) = Σ_{i=1}^{K} Σ_{j=1}^{N} (w_i / N) δ(θ^(m) − θ^(m)[i]) δ(κ − κ^[ij]).

It is inexpensive to draw conditional samples from the type level, and this does not require evaluating the likelihood of the image, just the local variance around the type.

[Figure 5 from the paper: Parsing a raw image. a) The raw image (i) is processed by a thinning algorithm [18] (ii) and represented as an undirected graph [20] (iii), where parses are guided random walks (Section SI-5). b) The best parses found for that image (top row) are shown with their log w_i (Eq. 5), where numbers inside circles denote stroke order and starting position, and smaller open circles denote sub-stroke breaks. These parses were re-fit to three different raw images of characters (left in image triplets), where the best parse (top right) and its associated image reconstruction (bottom right) are shown above its score (Eq. 9). Example parse scores: −59.6, −88.9, −159, −168.]

Given an approximate posterior for a particular image, the model can evaluate the posterior predictive score of a new image by re-fitting the token-level variables (bottom of Figure 5b).
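Steps 1–2 above — guided random walks over the thinned image's graph to propose strokes, then keeping the top K by prior score — can be sketched as follows. The graph format and the length-based stand-in prior are my own assumptions for illustration:

```python
import random

def random_walk_parse(graph, start, rng, max_steps=8):
    """One random walk over an undirected graph of ink pixels; returns the
    visited node sequence as one candidate stroke (no revisits)."""
    node, stroke = start, [start]
    for _ in range(max_steps):
        nbrs = [n for n in graph[node] if n not in stroke]
        if not nbrs:
            break
        node = rng.choice(nbrs)
        stroke.append(node)
    return stroke

def propose_parses(graph, n_samples=150, k=5, seed=0, prior_score=len):
    """Draw candidate strokes by random walks, score each under a (stand-in)
    prior, and keep the top K — mirroring the 150-sample / top-K scheme."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    samples = [random_walk_parse(graph, rng.choice(nodes), rng)
               for _ in range(n_samples)]
    return sorted(samples, key=prior_score, reverse=True)[:k]

# a tiny path graph standing in for a thinned character skeleton
graph = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 9] for i in range(10)}
top = propose_parses(graph, k=5)
assert len(top) == 5 and all(len(top[0]) >= len(s) for s in top)
```

In HBPL the walks are *guided* (biased toward smooth continuations) and the prior scores full parses under the learned stroke model, but the propose-then-rank structure is the same.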
Summary

‣ The model is heavily hand-crafted and somewhat ad hoc
‣ Still, it is interesting for showing that machine learning, like humans, can classify and generate from a single example
‣ It may contribute to understanding how humans extract features

 FAN search for image copy-move forgery-amalta 2014 FAN search for image copy-move forgery-amalta 2014
FAN search for image copy-move forgery-amalta 2014
 
Digital Image Processing - Image Restoration
Digital Image Processing - Image RestorationDigital Image Processing - Image Restoration
Digital Image Processing - Image Restoration
 
A comparative analysis of retrieval techniques in content based image retrieval
A comparative analysis of retrieval techniques in content based image retrievalA comparative analysis of retrieval techniques in content based image retrieval
A comparative analysis of retrieval techniques in content based image retrieval
 
A COMPARATIVE ANALYSIS OF RETRIEVAL TECHNIQUES IN CONTENT BASED IMAGE RETRIEVAL
A COMPARATIVE ANALYSIS OF RETRIEVAL TECHNIQUES IN CONTENT BASED IMAGE RETRIEVALA COMPARATIVE ANALYSIS OF RETRIEVAL TECHNIQUES IN CONTENT BASED IMAGE RETRIEVAL
A COMPARATIVE ANALYSIS OF RETRIEVAL TECHNIQUES IN CONTENT BASED IMAGE RETRIEVAL
 
3D Brain Image Segmentation Model using Deep Learning and Hidden Markov Rando...
3D Brain Image Segmentation Model using Deep Learning and Hidden Markov Rando...3D Brain Image Segmentation Model using Deep Learning and Hidden Markov Rando...
3D Brain Image Segmentation Model using Deep Learning and Hidden Markov Rando...
 


NIPS読み会2013 (NIPS Reading Group 2013): One-shot learning by inverting a compositional causal process

  • 1. One-shot learning by inverting a compositional causal process — Brenden M. Lake, Ruslan Salakhutdinov, Joshua B. Tenenbaum. Presented by 能地宏 @NII. (The figures in these slides are quoted from the paper.)
  • 2. One-shot classification. [Figure 1 from the paper: Can you learn a new concept from just one example? (a & b) Where are the other examples of the character shown in red? Answers for b): row 4 column 3 (left) and row 2 column 4 (right). While classifiers are generally trained on hundreds of images per class, using benchmark datasets such as ImageNet [4] and CIFAR-10/100 [14], people can classify new images of a foreign handwritten character from just one example [23, 16, 17].]
  • 7. Overview
  ‣ Humans can extract the characteristic features of a symbol from just one example:
  - Classification: they can retrieve similar instances
  - Generation: they can produce new samples
  ‣ Machine learning typically requires a large amount of data per label
  - e.g. MNIST: 6000 training examples / class
  ‣ Task and contributions
  - Can machine learning imitate this human ability?
  - A carefully defined generative model produces results similar to humans'
  - This suggests humans may extract features by a similar mechanism
  • 8. Data and learning: the Omniglot dataset
  ‣ Omniglot: 50 alphabets, 1600 characters, 20 examples per character. [Figure 2 from the paper: four alphabets from Omniglot, each with five characters drawn by four different people.]
  ‣ Background set: 30 alphabets. The background images, paired with their motor data, were used to learn the hyperparameters of HBPL, including a library of 1000 primitive motor elements, the frequency of the number of strokes, and position models for a drawing's first, second, third stroke, etc. (Figure 4). The background set was constrained to include the six most common alphabets, as determined by Google hits.
  ‣ Evaluation set: the remaining 20 alphabets; the posterior for each class is learned from only one example.
  ‣ Image model: an affine transformation A^(m) ∈ R^4 is sampled from P(A^(m)), where the first two elements control a global re-scaling and the second two control a global translation of the center of mass of the trajectories T^(m). The transformed trajectories are rendered as a grayscale image using an ink model adapted from [10] (Section SI-2), then perturbed by two noise processes, which make the gradient more robust during optimization and encourage partial solutions during classification: a Gaussian blur with standard deviation σ_b^(m) and pixel flipping with probability ε^(m). Together these parameterize the model of binary images P(I^(m) | T^(m), A^(m), σ_b^(m), ε^(m)).
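The background/evaluation protocol above can be sketched in a few lines of Python; here `alphabets` is a hypothetical mapping from alphabet name to character images, standing in for the actual Omniglot data:

```python
def split_omniglot(alphabets, background_names):
    """Split alphabets into a 'background' set (used, together with
    motor data, to learn HBPL's hyperparameters) and an 'evaluation'
    set (used for the one-shot tasks), mirroring the 30/20 alphabet
    split described on this slide."""
    background = {name: chars for name, chars in alphabets.items()
                  if name in background_names}
    evaluation = {name: chars for name, chars in alphabets.items()
                  if name not in background_names}
    return background, evaluation
```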
  • 9. Results first
  One-shot classification (error rate): Humans 4.5%, HBPL 4.8%, Affine model 18.2%, HD (Hierarchical Deep) 34.8%, DBM 38%
  ‣ Better performance than the deep learning models
  ‣ Nearly the same error rate as humans
  One-shot generation — visual Turing test: judges view nine examples of the same symbol and guess which set was drawn by a human; they were correct only 56% of the time (close to chance).
  • 10. The model. [Figure 3 from the paper: an illustration of the HBPL model generating two character types (left and right), where the dotted line separates the type-level from the token-level variables. Legend: number of strokes κ, relations R, primitive id z (color-coded to highlight sharing), control points x (open circles), scale y, start locations L, trajectories T, transformation A, noise ε and σ_b, and image I.]
  • 11. Learning the hyperparameters
  ‣ The model learns "common sense" about how symbols are drawn
  ‣ Motor data (recorded pen trajectories) are used for learning
  [Figure 4 from the paper: a) library of motor primitives, where the top row shows the most common ones; b) frequency of the number of strokes; c) stroke start positions for the first, second, third stroke, etc.]
  • 12. One-shot classification
  ‣ For each image, estimate the posterior over strokes (parses)
  ‣ For a test image I^(T) and training images I^(c), c = 1, ..., 20, use the Bayesian classification rule argmax_c log P(I^(T) | I^(c))
  ‣ The posterior is approximated by the K highest-probability parses found by the HBPL search algorithm, with normalized weights w_i (Σ_i w_i = 1). Rather than using just a point estimate for each parse, the approximation is improved by incorporating some of the local variance around the type-level variables; it is inexpensive to draw conditional samples from P(ψ | θ^(m)[i], I^(m)), since this does not require evaluating the likelihood of the image
  ‣ Given the inferred type, compute the probability of generating the target image, re-optimizing the token-level variables θ^(T):
  log P(I^(T) | I^(c)) ≈ log ∫ P(I^(T) | θ^(T)) P(θ^(T) | ψ) Q(θ^(c), ψ, I^(c)) dψ dθ^(c) dθ^(T)
  (Behavioral comparison: forty participants in the USA were tested on this very challenging one-shot classification task. On each trial, as in Figure 1b, a participant was shown an image of a novel character and asked to pick the image showing the same character among 20 distractors; characters never repeated across trials, and there were two practice trials with the Latin and Greek alphabets, with feedback.)
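The weight normalization and the classification rule on this slide can be sketched as follows. `normalize_log_weights` and `classify_one_shot` are illustrative helpers (not the authors' code), and the numbers are hypothetical log-scores of the kind shown in the parse figures:

```python
import math

def normalize_log_weights(log_scores):
    """Turn unnormalized log parse scores log w~_i into weights w_i
    that sum to 1, using the log-sum-exp trick for numerical stability."""
    m = max(log_scores)
    exps = [math.exp(s - m) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify_one_shot(log_lik_per_class):
    """Bayesian classification rule from the slide: pick the class c
    whose single training example best explains the test image,
    i.e. argmax_c log P(I_test | I_c)."""
    return max(log_lik_per_class, key=log_lik_per_class.get)
```

Because the scores live in log space and can differ by hundreds of nats, subtracting the maximum before exponentiating avoids underflow.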
  • 13. Posterior inference
  ‣ An algorithm finds K high-probability parses [ψ^[1], θ^(m)[1]], ..., [ψ^[K], θ^(m)[K]], the most promising candidates proposed by a fast, bottom-up image analysis (Figure 5, Section SI-5)
  ‣ These parses approximate the posterior with a discrete distribution:
  P(ψ, θ^(m) | I^(m)) ≈ Σ_{i=1}^{K} w_i δ(θ^(m) − θ^(m)[i]) δ(ψ − ψ^[i])
  where each weight w_i is proportional to the parse score, w_i ∝ w̃_i = P(ψ^[i], θ^(m)[i], I^(m)) (the score from the prior), normalized so that Σ_i w_i = 1
  ‣ Procedure: 1. run guided random walks over the image skeleton to sample candidate strokes (150 candidates); 2. score the candidates under the prior and keep the top K
  ‣ With the token level fixed, Metropolis-Hastings is run to produce N samples for each parse θ^(m)[i], denoted ψ^[i1], ..., ψ^[iN], giving an improved approximation Q(ψ, θ^(m), I^(m))
  [Figure 5 from the paper: parsing a raw image. a) The raw image (i) is processed by a thinning algorithm [18] (ii) and analyzed as an undirected graph [20] (iii), where parses are guided random walks (Section SI-5). b) The best parses found for that image are shown with their log w_j, where numbers inside circles denote stroke order and starting position, and smaller open circles denote sub-stroke breaks.]
  • 14. Computing the score
  ‣ Using the inferred type-level variables, infer the token-level variables of the target image by MCMC (Metropolis-Hastings), with the type held fixed
  ‣ The K best parses can be re-fit to different raw images of the same character; the best parse and its image reconstruction are shown above their score (Eq. 9) in the figure
  [Figure 5b from the paper, continued: the best parses, re-fit to three different raw images of characters, with example scores such as −831, −1273, −2041.]
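The token-level re-fitting step relies on standard random-walk Metropolis-Hastings. A minimal 1-D toy sketch (the real token variables are high-dimensional, and `log_post` here is a placeholder for log P(θ | ψ, I)):

```python
import math
import random

def metropolis_hastings(log_post, init, proposal_std=0.1, n_steps=1000, seed=0):
    """Minimal random-walk Metropolis-Hastings sampler, illustrating
    the token-level re-fitting step: starting from the type-level
    parse, draw samples of a (toy, 1-D) token variable from its
    local posterior `log_post`."""
    rng = random.Random(seed)
    x, lp = init, log_post(init)
    samples = []
    for _ in range(n_steps):
        cand = x + rng.gauss(0.0, proposal_std)   # symmetric proposal
        lp_cand = log_post(cand)
        # accept with probability min(1, exp(lp_cand - lp))
        if math.log(rng.random()) < lp_cand - lp:
            x, lp = cand, lp_cand
        samples.append(x)
    return samples
```

With a symmetric Gaussian proposal, the Hastings correction cancels and the acceptance test reduces to the posterior ratio alone.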