Semi-Supervised Multimodal Variational AutoEncoder (SS-MVAE) for Images and Text
2017/05/23
¤ Semi-supervised learning over multiple modalities: [Guillaumin+ 10], [Cheng+ 16]
¤ These approaches are not trained end-to-end
¤ Semi-supervised deep generative models: VAE-based [Kingma+ 14][Maaløe+ 16], GAN-based [Salimans+ 16]
¤ Multimodal deep generative model: JMVAE [Suzuki+ 16]
¤ Guillaumin et al. [Guillaumin+ 10]
¤ Semi-supervised image classification from images and tags
¤ Two-stage approach with multiple kernel learning (MKL)
¤ Cheng et al. [Cheng+ 16]
¤ Semi-supervised RGB-D object recognition
¤ Co-training
¤ -> Neither approach is trained end-to-end
¤ Semi-Supervised Learning with Deep Generative Models [Kingma+ 14]
¤ A deep generative model with latent variable z and label y, trained on both labeled and unlabeled data
¤ A classifier q_φ(y|x) is learned jointly as part of the inference model
¤ Objective: ℒ = ℒ(x) + ℒ(x, y) + α · E[−log q_φ(y|x)]
¤ Generative model: p_θ(x|y, z); inference model: q_φ(z|x, y); classifier: q_φ(y|x)
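For reference, the two per-example bounds that the objective above sums over, written in the standard form of the M2 model of [Kingma+ 14] (they are not shown explicitly on the slide):

-\mathcal{L}(x, y) = \mathbb{E}_{q_\phi(z|x,y)}\big[\log p_\theta(x|y,z) + \log p(y) + \log p(z) - \log q_\phi(z|x,y)\big]
-\mathcal{U}(x) = \sum_{y} q_\phi(y|x)\,\big(-\mathcal{L}(x,y)\big) + \mathcal{H}\big(q_\phi(y|x)\big)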
¤ Joint multimodal variational autoencoders (JMVAE) [Suzuki+ 16]
¤ Models the joint distribution p(x, w) of two modalities
¤ x and w share a single latent variable z (joint representation)
¤ Generative process:
z ∼ p_θ(z)   (1)
x, w ∼ p_θ(x, w|z)   (2)
The modalities are assumed conditionally independent given z:
p_θ(x, w|z) = p_θx(x|z) p_θw(w|z)   (3)
(Figure 1: graphical model when both domains are observed: x ← z → w)
¤ The variational lower bound of the model uses the inference model q_φ(z|x, w) and the factorized generative model p_θ(x, w|z) = p_θ(x|z) p_θ(w|z)
¤ Example of a multimodal pair: an image of the sea and its tags [blue, sky, sand, …]
SS-MVAE
¤ Semi-Supervised Multimodal Variational AutoEncoder (SS-MVAE)
¤ Extends JMVAE with a label variable y
¤ Labeled data: pairs of modalities x_i, w_i with labels y ∈ {0, 1}^C; a classifier q(y|x, w) is learned jointly
(Figure 1: graphical models of (a) SS-MVAE and (b) SS-HMVAE)
¤ Generative model: p_θ(x, w|y, z) = p_θ(x|y, z) p_θ(w|y, z)
¤ Inference models: q_φ(z|x, w, y) and q_φ(y|x, w)
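To make the generative side concrete, here is a toy ancestral-sampling sketch of the SS-MVAE factorization p(y) p(z) p_θ(x|y, z) p_θ(w|y, z). The linear "decoders", the Bernoulli(0.1) label prior, and all sizes below are made up for illustration; they are not the networks or priors used in the paper.

import numpy as np

rng = np.random.default_rng(0)
C, dz, dx, dw = 38, 64, 3857, 2000               # label, latent, image-feature, tag dims (as in the experiments)
Wx = rng.normal(scale=0.01, size=(dz + C, dx))   # toy linear decoder for x (stand-in for a neural net)
Ww = rng.normal(scale=0.01, size=(dz + C, dw))   # toy linear decoder for w

y = (rng.random(C) < 0.1).astype(float)          # y ~ p(y), here independent Bernoulli(0.1)
z = rng.normal(size=dz)                          # z ~ p(z) = N(0, I)
h = np.concatenate([z, y])
x = rng.normal(loc=h @ Wx, scale=1.0)            # x ~ p_theta(x|y,z), Gaussian
w = (rng.random(dw) < 1 / (1 + np.exp(-(h @ Ww)))).astype(int)   # w ~ p_theta(w|y,z), Bernoulli
print(x.shape, w.sum())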
SS-MVAE
¤ Objective and training of SS-MVAE

3.3 SS-MVAE
JMVAE is extended with a label variable y:
p(x, w, y) = ∫ p_θ(x|z, y) p_θ(w|z, y) p(z) p(y) dz   (Figure 1(a))
For labeled data, the lower bound is
log p(x, w, y) = log ∫ p_θ(x, w, z, y) dz
  ≥ E_{q_φ(z|x,w,y)}[log (p_θ(x|z, y) p_θ(w|z, y) p_θ(z)) / q_φ(z|x, w, y)]
  ≡ −L(x, w, y)   (2)
For unlabeled data, y is marginalized as well:
log p(x, w) = log ∫∫ p_θ(x, w, z, y) dz dy
  ≥ E_{q_φ(z,y|x,w)}[log (p_θ(x|z, y) p_θ(w|z, y) p_θ(z)) / q_φ(z, y|x, w)]
  ≡ −U(x, w)   (3)
where the inference model factorizes as q_φ(z, y|x, w) = q_φ(z|x, w, y) q_φ(y|x, w).
As in [Kingma 14a], a classification term for q_φ(y|x, w) is added to the labeled bound:
L_l(x, w, y) = L(x, w, y) − α · log q_φ(y|x, w)   (4)
where α = 0.5 · (M + N) / M.
The overall objective over the labeled set D_L and the unlabeled set D_U is
J = Σ_{(x_i, w_i, y_i) ∈ D_L} L_l(x_i, w_i, y_i) + Σ_{(x_j, w_j) ∈ D_U} U(x_j, w_j)   (5)
z is sampled with the reparameterization trick [Kingma 14a, Rezende 14], z = µ + σ ⊙ ε; since y is discrete, the Gumbel softmax [Jang 16, Maddison 16] is used. φ and θ are optimized jointly (Figure 1).

3.4 SS-HMVAE
SS-MVAE is extended with an auxiliary variable a:
p(x, w, y) = ∫∫ p_θ(x|a) p_θ(w|a) p_θ(a|z, y) p(z) p(y) da dz   (Figure 1(b))
The marginal posterior approximation
q(z|x, w, y) = ∫ q(a, z|x, w, y) da
becomes more flexible and can better approximate p(z|x, w, y) [Gulrajani 16].
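A minimal numeric sketch of how the objective combines the two bounds (eqs. (4)-(5)). The per-example bound values and classifier log-probabilities below are random stand-ins, not outputs of the actual model; only the weighting logic is illustrated.

import numpy as np

rng = np.random.default_rng(0)
M, N = 4, 16                                       # labeled / unlabeled examples in a toy minibatch
alpha = 0.5 * (M + N) / M                          # classifier weight, as in eq. (4)

L_labeled = rng.random(M)                          # stand-ins for L(x, w, y) per labeled example
U_unlabeled = rng.random(N)                        # stand-ins for U(x, w) per unlabeled example
log_q_y = np.log(rng.uniform(0.1, 1.0, size=M))    # stand-ins for log q_phi(y|x, w)

L_l = L_labeled - alpha * log_q_y                  # eq. (4): add the classification term
J = L_l.sum() + U_unlabeled.sum()                  # eq. (5): total objective to minimize
print(alpha, J)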
SS-HMVAE
¤ Semi-Supervised Hierarchical Multimodal Variational AutoEncoder (SS-HMVAE)
¤ Adds an auxiliary latent variable a between (z, y) and the observations (x, w)
¤ Generative model: p_θ(a|y, z), p_θ(x, w|a)
¤ Inference model: q_φ(a|x, w), q_φ(z|a, y), q_φ(y|x, w)
¤ SS-HMVAE introduces the auxiliary variable a
¤ Auxiliary variables make the approximate posterior more flexible [Maaløe+ 16]
¤ The marginal posterior is obtained by integrating out a:
q(z|x, w, y) = ∫ q(a, z|x, w, y) da
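A small sketch of how the marginal q(z|x, w, y) can be handled by Monte Carlo sampling (first a, then z), which is what the MC=1 / MC=10 settings in the results refer to. The Gaussian parameters and the a → z mapping below are arbitrary placeholders, not the outputs of the actual inference networks.

import numpy as np

rng = np.random.default_rng(0)
da, dz, K = 8, 4, 10                              # auxiliary dim, latent dim, number of MC samples

mu_a, sigma_a = np.zeros(da), np.ones(da)         # placeholder parameters for q_phi(a|x, w)
def z_params(a):                                  # toy mapping a -> parameters of q_phi(z|a, y)
    return np.tanh(a[:dz]), 0.5 * np.ones(dz)

z_samples = []
for _ in range(K):
    a = mu_a + sigma_a * rng.normal(size=da)      # a ~ q(a|x, w)  (reparameterized)
    mu_z, sigma_z = z_params(a)
    z = mu_z + sigma_z * rng.normal(size=dz)      # z ~ q(z|a, y)
    z_samples.append(z)

# Averaging bound terms over these K samples approximates the expectation
# under q(z|x, w, y) = ∫ q(a, z|x, w, y) da.
print(np.mean(z_samples, axis=0))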
¤ Since y is discrete, the Gumbel softmax [Jang+ 2016] is used for reparameterization
¤ 15,000 labeled training examples, 10,000 test examples, 975,000 unlabeled examples (M = 15,000, N = 975,000)

4.2 Experimental setup
Image features x ∈ R^3857 and tags w ∈ {0, 1}^2000, with
p_θ(x|z, y) = N(x|µ_θ(z, y), diag(σ²_θ(z, y)))   (8)
p_θ(w|z, y) = Ber(w|π_θ(z, y))   (9)
p_θ(x|a) = N(x|µ_θ(a), diag(σ²_θ(a)))   (10)
p_θ(w|a) = Ber(w|π_θ(a))   (11)
Labels y ∈ {0, 1}^38, with classifier
q_φ(y|x, w) = Ber(y|π_θ(x, w))   (12)
Priors and remaining distributions:
p(z) = N(z|0, I)   (13)
p(y) = Ber(y|π)   (14)
p_θ(a|z, y) = N(a|µ_θ(z, y), diag(σ²_θ(z, y)))   (15)
q_φ(a|x, w) = N(a|µ_θ(x, w), diag(σ²_θ(x, w)))   (16)
q_φ(z|a, y) = N(z|µ_θ(a, y), diag(σ²_θ(a, y)))   (17)
The labeled lower bound of SS-HMVAE takes the form
  ≥ E_{q_φ(a,z|x,w,y)}[log (p_θ(x|a) p_θ(w|a) p_θ(a|z, y) p(z) p(y)) / q_φ(a, z|x, w, y)]   (6)
All networks use rectified linear unit activations and are optimized with Adam [Kingma 14b].
SS-HMVAE with MC=10 Monte Carlo samples is compared against SS-MVAE, evaluated by mAP.
∗2 http://www.flickr.com
∗3 http://www.cs.toronto.edu/˜nitish/multimodal/index.html
∗4 https://github.com/Thean
∗5 https://github.com/Lasag
∗6 https://github.com/masa-
∗7 [ 16] LRAP MAP
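Since the slide only names the Gumbel softmax [Jang+ 2016], here is a generic sketch of the trick itself; the logits and temperature are arbitrary and this is not code from the paper.

import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax(logits, tau=0.5, rng=rng):
    # Draw a differentiable, approximately one-hot sample from a categorical
    # distribution parameterized by `logits` (Gumbel-softmax / Concrete).
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    g = -np.log(-np.log(u))              # Gumbel(0, 1) noise
    y = (logits + g) / tau
    y = np.exp(y - y.max())
    return y / y.sum()

print(gumbel_softmax(np.array([1.0, 0.5, -1.0])))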
¤ Tars: a library for deep generative models
¤ Tars lets you define probability distributions parameterized by neural networks and combine them
¤ GitHub: https://github.com/masa-su/Tars
¤ Example factorization: P(A,B,C,D) = P(A) P(B|A) P(C|A) P(D|A,B)
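As an illustration of what such a factorization buys computationally (independent of Tars' actual API), the log-joint decomposes into a sum of conditional log-probabilities. A toy discrete example with hand-picked probabilities:

import numpy as np

# log P(A,B,C,D) = log P(A) + log P(B|A) + log P(C|A) + log P(D|A,B)
p_A = 0.3
p_B_given_A = {0: 0.2, 1: 0.7}
p_C_given_A = {0: 0.5, 1: 0.9}
p_D_given_AB = {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.6, (1, 1): 0.8}

def bern_logp(p, x):
    return np.log(p if x == 1 else 1 - p)

A, B, C, D = 1, 0, 1, 1
log_joint = (bern_logp(p_A, A)
             + bern_logp(p_B_given_A[A], B)
             + bern_logp(p_C_given_A[A], C)
             + bern_logp(p_D_given_AB[(A, B)], D))
print(log_joint)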
Tars
¤ VAE
x = InputLayer((None,n_x))
q_0 = DenseLayer(x,num_units=512,nonlinearity=activation)
q_1 = DenseLayer(q_0,num_units=512,nonlinearity=activation)
q_mean = DenseLayer(q_1,num_units=n_z,nonlinearity=linear)
q_var = DenseLayer(q_1,num_units=n_z,nonlinearity=softplus)
q = Gauss(q_mean,q_var,given=[x])
q(z|x)
z = InputLayer((None,n_z))
p_0 = DenseLayer(z,num_units=512,nonlinearity=activation)
p_1 = DenseLayer(p_0,num_units=512,nonlinearity=activation)
p_mean = DenseLayer(p_1,num_units=n_x,nonlinearity=sigmoid)
p = Bernoulli(p_mean,given=[z])
p(x|z)
model = VAE(q, p, n_batch=n_batch, optimizer=adam)
lower_bound_train = model.train([train_x])
Tars
¤ Sampling from and evaluating defined distributions
z = q.sample_given_x(x) # sample z ~ q(z|x)
z = q.sample_mean_given_x(x) # mean of q(z|x)
log_likelihood = q.log_likelihood_given_x(x, z) # evaluate log q(z|x)
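Conceptually (independent of the Tars API), these operations correspond to reparameterized sampling from a Gaussian q(z|x) and evaluating its log-density. A plain-numpy sketch with arbitrary encoder outputs:

import numpy as np

rng = np.random.default_rng(0)
mu, var = np.array([0.2, -0.5]), np.array([0.3, 1.2])        # stand-ins for the encoder outputs

z = mu + np.sqrt(var) * rng.normal(size=mu.shape)            # ~ sample_given_x
z_mean = mu                                                  # ~ sample_mean_given_x
log_q = -0.5 * np.sum(np.log(2 * np.pi * var) + (z - mu) ** 2 / var)   # ~ log_likelihood_given_x
print(z, z_mean, log_q)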
¤ Flickr25k dataset
¤ 38 classes, used as multi-label (one-hot style) annotations
¤ Image features: 3,857 dimensions; tags: 2,000-dimensional binary vectors
¤ 100 2 5000
-> 97 5000
desert, nature, landscape, sky rose, pink
clouds, plant life, sky, tree flower, plant life
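For concreteness, building the 2,000-dimensional binary tag vector w from an image's tag list might look like this; the vocabulary below is a made-up stand-in for the actual 2,000-tag vocabulary.

import numpy as np

vocab = ["sky", "clouds", "sea", "plant life", "flower", "tree"]   # stand-in vocabulary
tag_to_idx = {t: i for i, t in enumerate(vocab)}

def tags_to_vector(tags, dim=len(vocab)):
    w = np.zeros(dim, dtype=np.int8)
    for t in tags:
        if t in tag_to_idx:              # tags outside the vocabulary are ignored
            w[tag_to_idx[t]] = 1
    return w

print(tags_to_vector(["clouds", "plant life", "sky", "tree"]))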
¤ Proposed models: SS-MVAE and SS-HMVAE
¤ Baselines: SVM, DBN, Autoencoder, DBM, JMVAE
¤ Evaluation metric: mean average precision (mAP); a small sketch of the metric follows this list
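A minimal sketch of mean average precision over multiple labels (plain numpy, averaged per label here; the toy ground truth and scores are random placeholders, not model outputs):

import numpy as np

def average_precision(y_true, scores):
    # AP for one label: mean of precision@k over the ranks k of the true positives.
    order = np.argsort(-scores)
    y = y_true[order]
    hits = np.cumsum(y)
    precision_at_k = hits / (np.arange(len(y)) + 1)
    return precision_at_k[y == 1].mean() if y.sum() > 0 else 0.0

rng = np.random.default_rng(0)
y_true = (rng.random((100, 38)) < 0.15).astype(int)   # toy multi-label ground truth
scores = rng.random((100, 38))                        # toy predicted scores q(y|x, w)
mAP = np.mean([average_precision(y_true[:, c], scores[:, c]) for c in range(38)])
print(mAP)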
3. Proposed method
3.1 Problem setting
Labeled data: D_L = {(x_1, w_1, y_1), ..., (x_M, w_M, y_M)}, where x_i and w_i are the two modalities and y ∈ {0, 1}^C (C is the number of classes).
Unlabeled data: D_U = {(x_1, w_1), ..., (x_N, w_N)}, with M << N.
The goal is to learn a classifier q(y|x, w) from both.

3.2 JMVAE
The joint multimodal variational autoencoder (JMVAE) [Suzuki 16] shares a latent variable z between x and w:
p(x, w) = ∫ p_θ(x|z) p_θ(w|z) p_θ(z) dz
As with the VAE, JMVAE is trained by maximizing a lower bound −U_JMVAE(x, w) on log p(x, w) with respect to θ (and φ):
log p(x, w) = log ∫ p_θ(x, w, z) dz
  ≥ E_{q_φ(z|x,w)}[log (p_θ(x|z) p_θ(w|z) p_θ(z)) / q_φ(z|x, w)]
  ≡ −U_JMVAE(x, w)   (1)
where q_φ(z|x, w) is the inference model with parameters φ.
These are the building blocks of SS-MVAE and SS-HMVAE.
Model                       mAP
SVM [Huiskes+]              0.475
DBN [Srivastava+]*          0.609
Autoencoder [Ngiam+]*       0.612
DBM [Srivastava+]*          0.622
JMVAE [Suzuki+]             0.618
SS-MVAE (MC=1)              0.612
SS-MVAE (MC=10)             0.626
SS-HMVAE (MC=1)             0.632
SS-HMVAE (MC=10)            0.628

• SS-HMVAE gives the best mAP (0.632)
• * : values reported in the respective prior work
• MC : number of Monte Carlo samples
¤ mAP validation curves during training
(Figure: validation mAP over training; Table 2 (mAP): JMVAE 0.618, SS-MVAE (MC=1) 0.612, SS-HMVAE (MC=1) 0.632, SS-MVAE (MC=10) 0.626, SS-HMVAE (MC=10) 0.628)
• SS-HMVAE reaches the highest mAP
• SS-MVAE (MC=10) outperforms JMVAE

References (from the excerpted paper):
[Huiskes 08] Huiskes, M. J. and Lew, M. S.: The MIR Flickr retrieval evaluation, Proc. of the international conference on Multimedia information retrieval.
[Ioffe 15] Ioffe, S. and Szegedy, C.: Batch normalization: Accelerating deep network training by reducing internal covariate shift (2015).
[Jang 16] Jang, E., Gu, S., and Poole, B.: Categorical Reparameterization with Gumbel-Softmax, arXiv preprint arXiv:1611.01144 (2016).
[Kingma 13] Kingma, D. P. and Welling, M.: Auto-encoding variational Bayes, arXiv:1312.6114 (2013).
[Kingma 14a] Kingma, D. P., Mohamed, S., Rezende, D. J., and Welling, M.: Semi-supervised learning with deep generative models, Advances in Neural Information Processing Systems (2014).
[Kingma 14b] Kingma, D. P. and Ba, J.: Adam: A method for stochastic optimization (2014).
[Maaløe 16] Maaløe, L., Sønderby, C. K., Sønderby, S. K., and Winther, O.: Auxiliary deep generative models (2016).
¤ Evaluated on MIR Flickr25k
¤ Applying the approach to other modality pairs (e.g., RGB-D) is left for future work
¤ Proposed SS-MVAE and SS-HMVAE, semi-supervised extensions of JMVAE
¤ Implemented with the Tars library
¤ SS-HMVAE achieved the best mAP among the compared models
¤ Future work: GAN- and VAT-based semi-supervised approaches
