2. Contents
• Inception Score (IS)
• Fréchet Inception Distance (FID)
• Precision & Recall
Heusel, Martin, et al. "GANs trained by a two time-scale update
rule converge to a local Nash equilibrium." Advances in Neural
Information Processing Systems. 2017.
Salimans, Tim, et al. "Improved techniques for training
GANs." Advances in Neural Information Processing
Systems. 2016.
Lucic, Mario, et al. "Are GANs created equal? A large-scale
study." Advances in Neural Information Processing Systems. 2018.
3. Inception Score (IS)
§ Semantic Predictor : Inception-v3
• p(y|x): the class distribution predicted for a generated image x, where y is one of the 1,000 ImageNet classes.
UC Berkeley 2020 -- Spring -- Deep Unsupervised Learning L5 & L6 Implicit Model / GAN
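• p(y): the marginal class distribution over all generated images.
• A good generator produces samples that are
1. semantically diverse (Diversity): p(y) spreads over many classes, the opposite of mode collapse
2. distinct images (Quality): each image clearly belongs to one class, so p(y|x) is sharply peaked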
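The two criteria above are combined in the usual definition IS = exp(E_x[KL(p(y|x) || p(y))]). The following is a minimal sketch of that formula, assuming the Inception-v3 softmax outputs are already available as an (N, K) array; the function name `inception_score` and the toy arrays are illustrative, not taken from the slides.

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """Inception Score from an (N, K) array of class probabilities p(y|x)."""
    probs = np.asarray(probs, dtype=np.float64)
    # Marginal p(y): average the per-image predictions over all N images.
    p_y = probs.mean(axis=0, keepdims=True)
    # KL(p(y|x) || p(y)) for each image; eps guards against log(0).
    kl = np.sum(probs * (np.log(probs + eps) - np.log(p_y + eps)), axis=1)
    # IS is the exponential of the average KL divergence.
    return float(np.exp(kl.mean()))

# Toy predictions over K = 4 classes (hypothetical, for illustration only).
collapsed = np.tile([1.0, 0.0, 0.0, 0.0], (8, 1))  # sharp, but a single class
diverse = np.eye(4)[np.arange(8) % 4]              # sharp and spread over classes
blurry = np.full((8, 4), 0.25)                     # no image has a clear class

for name, p in [("collapsed", collapsed), ("diverse", diverse), ("blurry", blurry)]:
    print(name, round(inception_score(p), 3))
```

Under these assumptions the score ranges from 1 (mode collapse or uninformative predictions) up to K (confident predictions spread evenly over all K classes).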
5. Fréchet Inception Distance (FID)
• Fréchet distance: measures the similarity between curves, taking into account the location and ordering of the points along the curves
http://www.kr.tuwien.ac.at/staff/eiter/et-archive/cdtr9464.pdf
• A man is walking a dog on a leash: the man can move on one curve, the dog on the other;
both may vary their speed, but backtracking is not allowed.
• What is the length of the shortest leash that is sufficient for traversing both curves?
• The Fréchet distance is a measure of similarity between curves that takes into account the
(1) location and (2) ordering of the points along the curves.
• Therefore it is often better than the well-known Hausdorff distance.
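• The varying speeds of the man and the dog correspond to reparameterizations of the two curves.
• The distance between two polygonal curves can be computed in time polynomial in p and q,
where p and q are the numbers of segments on the polygonal curves.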
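The leash picture can be made concrete with the discrete variant of the Fréchet distance, which only considers the vertices of the two polygonal curves. The sketch below is a dynamic-programming version written for illustration; the function names and the toy curves are assumptions, and a simple Hausdorff distance is included only to show why ordering matters.

```python
import math

def discrete_frechet(P, Q):
    """Discrete Fréchet distance between two point sequences P and Q."""
    n, m = len(P), len(Q)
    # ca[i][j]: shortest leash that lets the pair reach P[i] and Q[j]
    # when neither walker is allowed to step backwards.
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(P[i], Q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]

def hausdorff(P, Q):
    """Hausdorff distance between two point sets (ordering is ignored)."""
    d_pq = max(min(math.dist(p, q) for q in Q) for p in P)
    d_qp = max(min(math.dist(p, q) for p in P) for q in Q)
    return max(d_pq, d_qp)

# The same segment traversed in opposite directions (toy example).
forward = [(0.0, 0.0), (1.0, 0.0)]
backward = [(1.0, 0.0), (0.0, 0.0)]
print(hausdorff(forward, backward))         # the point sets coincide
print(discrete_frechet(forward, backward))  # the orderings do not
```

The table has one cell per pair of vertices, so the work grows with the product of the two curve lengths. In the toy example the Hausdorff distance sees identical point sets, while the Fréchet distance reflects that the curves run in opposite directions.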
6. Fréchet Inception Distance (FID)
• The distance between real data and fake data in feature space
• Extract features of the real and fake data with the Inception-v3 network,
then compute the distance from the mean and covariance of each feature set
Code : http://research.sualab.com/introduction/practice/2019/05/08/generative-adversarial-network.html
• Lower FID values mean better image quality and diversity.
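With a Gaussian fitted to each feature set, the distance is commonly written as FID = ||mu_r - mu_f||^2 + Tr(Sigma_r + Sigma_f - 2 (Sigma_r Sigma_f)^(1/2)). The sketch below evaluates that expression with NumPy only; it assumes the Inception-v3 features have already been extracted, and the random arrays merely stand in for them. The function name `frechet_distance` is illustrative.

```python
import numpy as np

def frechet_distance(feats_real, feats_fake):
    """Fréchet distance between Gaussians fitted to two (N, D) feature arrays."""
    feats_real = np.asarray(feats_real, dtype=np.float64)
    feats_fake = np.asarray(feats_fake, dtype=np.float64)
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    sigma_r = np.atleast_2d(np.cov(feats_real, rowvar=False))
    sigma_f = np.atleast_2d(np.cov(feats_fake, rowvar=False))
    # Tr((sigma_r sigma_f)^(1/2)) is the sum of the square roots of the
    # eigenvalues of sigma_r @ sigma_f, which are real and non-negative for
    # covariance matrices; clipping removes small negative round-off.
    eigvals = np.linalg.eigvals(sigma_r @ sigma_f)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(sigma_r) + np.trace(sigma_f) - 2.0 * tr_sqrt)

# Stand-in features (hypothetical): real Inception-v3 features would replace these.
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 3))
shifted = real + 2.0  # same covariance, mean moved by 2 in each of 3 dimensions

print(frechet_distance(real, real))
print(frechet_distance(real, shifted))
```

In the shifted case only the mean term contributes, so the value is the squared length of the shift; identical feature sets give a value of zero up to round-off.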
9. Precision, Recall and F1 Score
• High precision: the generated images look similar to the real images on
average. (Quality)
• High recall: the generator can generate any sample found in the training
dataset. (Diversity)
• F1 score: the harmonic mean of precision and recall.
https://medium.com/@jonathan_hui/gan-how-to-measure-gan-performance-64b988c47732
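A short sketch of the F1 definition above shows why the harmonic mean is used: it stays low unless both precision and recall are high. The function name and the sample values are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

# A generator with high quality but poor diversity scores far below
# the arithmetic mean of 0.5.
print(f1_score(0.9, 0.1))
print(f1_score(0.5, 0.5))
```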
11. SOTA
Naeem, Muhammad Ferjad, et al. "Reliable Fidelity and Diversity
Metrics for Generative Models." ICML, 2020. (CLOVA AI)
• Improved Precision & Recall
• Density and Coverage (D&C)
Kynkäänniemi, Tuomas, et al. "Improved precision and recall
metric for assessing generative models." Advances in Neural
Information Processing Systems. 2019.
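Both families of metrics can be sketched with k-nearest-neighbour balls in feature space: each sample gets a ball whose radius is the distance to its k-th nearest neighbour in its own set. The code below is a simplified illustration of that idea under stated assumptions, not the authors' reference implementation; the function names, the default k = 5, and the random stand-in features are all hypothetical.

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distance matrix of shape (len(a), len(b))."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)

def prdc(real, fake, k=5):
    """k-NN precision, recall, density and coverage for (N, D) feature arrays."""
    real = np.asarray(real, dtype=np.float64)
    fake = np.asarray(fake, dtype=np.float64)
    # Ball radius = distance to the k-th nearest neighbour within the same set
    # (column 0 of each sorted row is the sample itself, at distance 0).
    r_real = np.sort(pairwise_dist(real, real), axis=1)[:, k]
    r_fake = np.sort(pairwise_dist(fake, fake), axis=1)[:, k]
    d = pairwise_dist(real, fake)          # d[i, j]: real i to fake j
    in_real_ball = d <= r_real[:, None]    # fake j lies in the ball of real i
    in_fake_ball = d <= r_fake[None, :]    # real i lies in the ball of fake j
    return {
        # Fraction of fake samples inside at least one real ball.
        "precision": float(in_real_ball.any(axis=0).mean()),
        # Fraction of real samples inside at least one fake ball.
        "recall": float(in_fake_ball.any(axis=1).mean()),
        # Average number of real balls containing a fake sample, divided by k.
        "density": float(in_real_ball.sum(axis=0).mean() / k),
        # Fraction of real balls that contain at least one fake sample.
        "coverage": float(in_real_ball.any(axis=1).mean()),
    }

# Stand-in features (hypothetical); k must be smaller than the sample count.
rng = np.random.default_rng(0)
real_feats = rng.normal(size=(200, 8))
same = prdc(real_feats, real_feats.copy())
far = prdc(real_feats, real_feats + 100.0)
print(same)
print(far)
```

Density counts how many real balls contain each fake sample instead of asking only whether any does, and coverage is built on real-sample balls only, which is the sense in which D&C is meant to be less sensitive to outliers than precision and recall.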