6. Implicit generative model
• Likelihood not available
• Can generate samples
[Diagram] A low-dimensional random variable is passed through a deterministic transformation (e.g., a neural network) to produce a high-dimensional random variable: the typical example of an implicit model.
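A minimal sketch of this pipeline, with hypothetical fixed weights standing in for a trained network (numpy only, no claim about any particular architecture): we can draw samples from the model, but the density p(x) of the output is intractable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights standing in for a trained neural network.
W1 = rng.normal(size=(2, 64))    # low-dim (2) -> hidden (64)
W2 = rng.normal(size=(64, 784))  # hidden -> high-dim (784, e.g., a 28x28 image)

def generate(n):
    """Sample the implicit model: low-dim noise -> deterministic transform."""
    z = rng.normal(size=(n, 2))  # low-dimensional random variable
    h = np.tanh(z @ W1)          # deterministic nonlinearity
    return h @ W2                # high-dimensional random variable

samples = generate(5)
print(samples.shape)  # (5, 784): we can sample, but the likelihood is unavailable
```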
7. Fitting without likelihood function
[Diagram] Observations vs. samples from the generative model (likelihood not available)
• Fitting a generative model without a likelihood function (aka an implicit model)
• Alternative to the likelihood: a distance between these two sets of points (i.e., between the underlying distributions)
• To do this, a GAN uses a classifier (corresponding to the JS divergence)
8. Distance between distributions
[Diagram] Observations (rows i) vs. samples from the generative model (columns j; likelihood not available), connected by a cost matrix

C =
⎛ c11 ⋯ c14 ⎞
⎜  ⋮  cij  ⋮ ⎟
⎝ c41 ⋯ c44 ⎠

where cij is, e.g., the Euclidean distance between observation i and sample j.
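The cost matrix can be built from pairwise Euclidean distances, e.g. (a sketch with four points on each side, matching the 4×4 matrix above; the point clouds are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
obs = rng.normal(size=(4, 2))  # observations x_i
gen = rng.normal(size=(4, 2))  # model samples y_j

# c_ij = ||x_i - y_j||, the Euclidean distance between point i and point j
C = np.linalg.norm(obs[:, None, :] - gen[None, :, :], axis=-1)
print(C.shape)  # (4, 4)
```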
12. • Problem: Optimal transport
• Algorithm: Sinkhorn algorithm
• Python implementation: GeomLoss
• Another application: Jigsaw puzzle
13. Regularized OT problem
Optimal transport:    d = min_P ⟨C, P⟩
Entropy regularized:  d = min_P ⟨C, P⟩ − ϵH(P)
subject to P1 = r, Pᵀ1 = c
14. Regularized OT problem
Optimal transport:    d = min_P ⟨C, P⟩
Entropy regularized:  d = min_P ⟨C, P⟩ − ϵH(P)
subject to P1 = r, Pᵀ1 = c
Solution obtained with the Lagrangian (Cuturi 2013, Lemma 2):
P = diag(u) ⋅ K ⋅ diag(v),  Kij = exp(−cij/ϵ),  u ≥ 0, v ≥ 0
The same form of solution is obtained with the Sinkhorn algorithm.
15. Sinkhorn algorithm
u^(k+1) = r / (K v^(k))
v^(k+1) = c / (Kᵀ u^(k+1))
P = diag(u) ⋅ K ⋅ diag(v),  Kij = exp(−cij/ϵ),  u ≥ 0, v ≥ 0
How to get u and v: repeatedly apply these update rules (a matrix–vector product followed by an elementwise division) until convergence; in practice the iteration is stopped after a finite number of steps.
Sinkhorn and Knopp (1967) proposed this iterative algorithm and analyzed its convergence.
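The iteration above fits in a few lines of numpy (a sketch; `r` and `c` are the marginal weights, `eps` the regularization strength, and the fixed iteration count is a practical choice, not part of the algorithm's definition):

```python
import numpy as np

def sinkhorn(Cost, r, c, eps=0.1, n_iter=200):
    """Entropy-regularized OT plan via Sinkhorn iterations."""
    K = np.exp(-Cost / eps)             # Gibbs kernel K_ij = exp(-c_ij / eps)
    u = np.ones_like(r)
    v = np.ones_like(c)
    for _ in range(n_iter):             # stop after finitely many steps
        u = r / (K @ v)                 # u^(k+1) = r / (K v^(k))
        v = c / (K.T @ u)               # v^(k+1) = c / (K^T u^(k+1))
    return np.diag(u) @ K @ np.diag(v)  # P = diag(u) K diag(v)

rng = np.random.default_rng(2)
Cost = rng.random((4, 4))
r = np.full(4, 0.25)
c = np.full(4, 0.25)
P = sinkhorn(Cost, r, c)
print(P.sum(axis=1))  # ~r: rows approach the source marginal
print(P.sum(axis=0))  # ~c: columns approach the target marginal
```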
16. Sinkhorn algorithm
u^(k+1) = r / (K v^(k))
v^(k+1) = c / (Kᵀ u^(k+1))
P = diag(u) ⋅ K ⋅ diag(v),  Kij = exp(−cij/ϵ),  u ≥ 0, v ≥ 0
Backprop: every update is differentiable, so gradients can be backpropagated through the unrolled iterations.
How to get u and v: repeatedly apply these update rules until convergence (stopped after a finite number of iterations in practice).
https://arxiv.org/abs/1706.00292
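A framework-free sketch of that point: the loss ⟨C, P⟩, with P produced by the unrolled loop, is a smooth function of the cost. Here a finite-difference quotient stands in for what PyTorch's autograd would compute automatically (the 4×4 setup is illustrative):

```python
import numpy as np

def sinkhorn_loss(Cost, r, c, eps=0.1, n_iter=200):
    """<C, P> with P from unrolled Sinkhorn iterations (all ops differentiable)."""
    K = np.exp(-Cost / eps)
    u, v = np.ones_like(r), np.ones_like(c)
    for _ in range(n_iter):
        u = r / (K @ v)
        v = c / (K.T @ u)
    P = u[:, None] * K * v[None, :]  # diag(u) K diag(v)
    return np.sum(Cost * P)

rng = np.random.default_rng(3)
Cost = rng.random((4, 4))
r = c = np.full(4, 0.25)

# Finite-difference derivative w.r.t. Cost[0, 0]; in PyTorch, autograd would
# propagate gradients through the unrolled loop instead.
h = 1e-5
Cp, Cm = Cost.copy(), Cost.copy()
Cp[0, 0] += h
Cm[0, 0] -= h
grad = (sinkhorn_loss(Cp, r, c) - sinkhorn_loss(Cm, r, c)) / (2 * h)
print(grad)  # a finite, well-defined derivative
```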
17. • Problem: Optimal transport
• Algorithm: Sinkhorn algorithm
• Python implementation: GeomLoss
• Another application: Jigsaw puzzle
18. GeomLoss library (PyTorch)
https://github.com/jeanfeydy/geomloss
• Provides the following loss functions:
• Kernel norms (maximum mean discrepancies)
• Hausdorff divergences
• Unbiased Sinkhorn divergences
• Can be used in a wide variety of settings, from shape
analysis (LDDMM, optimal transport…) to machine
learning (kernel methods, GANs…) and image processing.
26. Summary
• Optimal transport measures the distance between distributions
• The Sinkhorn algorithm is just a sequential application of simple differentiable operations, so it is possible to backpropagate through it
• GeomLoss makes it easy to use the Sinkhorn algorithm with PyTorch
• The Sinkhorn algorithm can be used to recover a good latent permutation from observations
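A sketch of the last point (the score matrix and the true permutation are hypothetical, illustrating jigsaw-style assignment): build a cost matrix whose cheap entries encode a latent permutation, run Sinkhorn with uniform marginals, and read the assignment off the near-doubly-stochastic plan.

```python
import numpy as np

def sinkhorn_plan(Cost, eps, n_iter=300):
    """Sinkhorn plan with uniform marginals (n x n case)."""
    K = np.exp(-Cost / eps)
    n = Cost.shape[0]
    r = c = np.full(n, 1.0 / n)
    u, v = np.ones(n), np.ones(n)
    for _ in range(n_iter):
        u = r / (K @ v)
        v = c / (K.T @ u)
    return u[:, None] * K * v[None, :]

# Hypothetical example: the true latent permutation is (2, 0, 3, 1).
true_perm = np.array([2, 0, 3, 1])
Cost = np.ones((4, 4))
Cost[np.arange(4), true_perm] = 0.0  # matching pieces have low cost
Cost += 0.05 * np.random.default_rng(4).random((4, 4))  # observation noise

P = sinkhorn_plan(Cost, eps=0.05)
recovered = P.argmax(axis=1)  # hardest assignment per row
print(recovered)  # [2 0 3 1]
```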