GAN Evaluation

2. Contents
• Inception Score (IS)
• Fréchet Inception Distance (FID)
• Precision & Recall
References:
Heusel, Martin, et al. "GANs trained by a two time-scale update rule converge to a local Nash equilibrium." Advances in Neural Information Processing Systems. 2017.
Salimans, Tim, et al. "Improved techniques for training GANs." Advances in Neural Information Processing Systems. 2016.
Lucic, Mario, et al. "Are GANs created equal? A large-scale study." Advances in Neural Information Processing Systems. 2018.

3. Inception Score (IS)
• Semantic predictor: Inception-v3, which gives p(y|x), where y is one of the 1,000 ImageNet classes.
• p(y): the class probability distribution over all generated images.
• A good generator produces samples that are:
  1. semantically diverse (Diversity; low diversity indicates mode collapse)
  2. distinct images (Quality)
Source: UC Berkeley, Spring 2020, Deep Unsupervised Learning, L5 & L6 Implicit Model / GAN

4.
• (1) Entropy of P(y): the larger (closer to uniform), the better → as many classes as possible are generated.
• (2) Entropy of P(y|x): the smaller, the better → each image x should contain a distinctly recognizable object.
Source: https://medium.com/@jonathan_hui/gan-how-to-measure-gan-performance-64b988c47732
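
The two entropy criteria combine into a single number. The sketch below assumes the usual definition from Salimans et al., IS = exp(E_x[KL(p(y|x) || p(y))]), and uses the identity that this expected KL term equals H(p(y)) minus the mean of H(p(y|x)). The hand-written probability vectors are toy stand-ins for Inception-v3 softmax outputs, and the function names are illustrative only.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def inception_score(probs):
    """probs: list of per-image class distributions p(y|x), each summing to 1.

    Uses E_x[KL(p(y|x) || p(y))] = H(p(y)) - E_x[H(p(y|x))].
    """
    n = len(probs)
    num_classes = len(probs[0])
    # Marginal p(y): average the conditional distributions over all images.
    marginal = [sum(p[c] for p in probs) / n for c in range(num_classes)]
    mean_conditional_entropy = sum(entropy(p) for p in probs) / n
    return math.exp(entropy(marginal) - mean_conditional_entropy)

# Toy inputs (3 classes instead of 1,000); these are assumptions, not real outputs.
sharp_and_diverse = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
collapsed = [[1.0, 0.0, 0.0]] * 3   # confident, but always the same class
blurry = [[1 / 3, 1 / 3, 1 / 3]] * 3  # no image has a recognizable class

for name, probs in [("sharp_and_diverse", sharp_and_diverse),
                    ("collapsed", collapsed),
                    ("blurry", blurry)]:
    print(name, inception_score(probs))
```

Only the first case satisfies both criteria, so it is the only one that scores near the number of classes; the collapsed and blurry cases both score about 1, the minimum.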

5. Fréchet Inception Distance (FID)
• Fréchet distance: measures the similarity between curves, taking into account the location and ordering of the points along the curves. http://www.kr.tuwien.ac.at/staff/eiter/et-archive/cdtr9464.pdf
• Intuition: a man is walking a dog on a leash. The man moves along one curve and the dog along the other; both may vary their speed, but backtracking is not allowed.
• The Fréchet distance is the length of the shortest leash that is sufficient for traversing both curves.
• Because it accounts for (1) the location and (2) the ordering of the points along the curves, it is often a better measure than the well-known Hausdorff distance.
• Varying the speed corresponds to a reparameterization of the curves.
• The distance between two polygonal curves can be computed in time that depends on p and q, the numbers of segments on the two polygonal curves.
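
The leash picture can be made concrete with the discrete variant of the Fréchet distance, which only considers the vertices of the two polygonal curves. The following is a minimal sketch of the standard dynamic-programming recurrence, not code from the slides; the function name and the example curves are made up for illustration. The table has one cell per vertex pair, so the running time is proportional to the product of the two curve lengths.

```python
import math

def discrete_frechet(P, Q):
    """Discrete Fréchet (coupling) distance between point sequences P and Q."""
    p, q = len(P), len(Q)
    # ca[i][j] = shortest leash that suffices to walk P[0..i] and Q[0..j].
    ca = [[0.0] * q for _ in range(p)]
    for i in range(p):
        for j in range(q):
            d = math.dist(P[i], Q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # One walker, or both, advances by one vertex; no backtracking.
                best_prev = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[-1][-1]

A = [(0, 0), (1, 0), (2, 0), (3, 0)]
C = [(0, 1), (1, 1), (2, 1), (3, 1)]  # A shifted up by 1, same direction
B = list(reversed(C))                 # same points as C, opposite direction

print(discrete_frechet(A, C))
print(discrete_frechet(A, B))
```

B and C contain exactly the same points, so a measure that ignores ordering, such as the Hausdorff distance, cannot tell them apart. The Fréchet distance to A is much larger for B because the two walkers start at opposite ends.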

6. Fréchet Inception Distance (FID)
• FID is the distance between real data and fake data in feature space.
• Features of the real and fake data are extracted with the Inception-v3 network; the mean and covariance of each feature set are computed, and the distance is calculated from them.
• Code: http://research.sualab.com/introduction/practice/2019/05/08/generative-adversarial-network.html
• Lower FID values mean better image quality and diversity.
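
As a rough illustration of the mean-and-covariance step, the sketch below assumes the formula from Heusel et al., FID = ||mu_r - mu_g||^2 + Tr(C_r + C_g - 2 (C_r C_g)^(1/2)). It is not the code behind the link above. Random 4-dimensional vectors stand in for the Inception-v3 features, and the trace of the matrix square root is obtained from the eigenvalues of C_r C_g, a NumPy-only shortcut chosen here for brevity.

```python
import numpy as np

def frechet_distance(feats_real, feats_fake):
    """Fréchet distance between Gaussians fitted to two (n_samples, dim) arrays.

    Assumes dim >= 2 so that np.cov returns a matrix.
    """
    mu_r, mu_g = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_g = np.cov(feats_fake, rowvar=False)
    # Tr((cov_r cov_g)^(1/2)) is the sum of the square roots of the eigenvalues
    # of cov_r @ cov_g, which are real and non-negative for covariance matrices.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt)

# Toy features (assumption): real use would take Inception-v3 activations.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(5000, 4))
fake_good = rng.normal(0.0, 1.0, size=(5000, 4))     # same distribution as real
fake_shifted = rng.normal(2.0, 1.0, size=(5000, 4))  # mean off by 2 in every dim

print("real vs real        :", frechet_distance(real, real))
print("real vs fake_good   :", frechet_distance(real, fake_good))
print("real vs fake_shifted:", frechet_distance(real, fake_shifted))
```

The shifted set should score far higher than the matching set, since the mean term alone contributes about 4 per dimension.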

9. Precision, Recall and F1 Score
• High precision: the generated images look similar to the real images on average. (Quality)
• High recall: the generator can generate any sample found in the training dataset. (Diversity)
• F1 score: the harmonic mean of precision and recall.
Source: https://medium.com/@jonathan_hui/gan-how-to-measure-gan-performance-64b988c47732
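
A small sketch of why the harmonic mean is used: unlike the arithmetic mean, it stays low when either precision or recall is low. The helper name and the input values are illustrative only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

balanced = f1_score(0.9, 0.9)    # good quality and good diversity
imbalanced = f1_score(0.9, 0.1)  # good quality, poor diversity

print(balanced, imbalanced)
```

The imbalanced case has an arithmetic mean of 0.5 but an F1 score of about 0.18, so a generator cannot hide poor diversity behind high quality.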

10. https://medium.com/@jonathan_hui/gan-how-to-measure-gan-performance-64b988c47732

11. SOTA
• Improved Precision & Recall: Kynkäänniemi, Tuomas, et al. "Improved precision and recall metric for assessing generative models." Advances in Neural Information Processing Systems. 2019.
• Density and Coverage (D&C): Naeem, Muhammad Ferjad, et al. "Reliable Fidelity and Diversity Metrics for Generative Models." ICML, 2020. (CLOVA AI)
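
Both papers build on k-nearest-neighbour balls around samples in a feature space. The sketch below is a paraphrase of their definitions under simplifying assumptions, not the official implementations: plain Euclidean distance on toy 2-D points replaces embedding features, k = 2 is chosen only to suit the tiny example, and all function names are hypothetical.

```python
import math

def kth_nn_radius(points, k):
    """Distance from each point to its k-th nearest neighbour within `points`."""
    radii = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        radii.append(dists[k - 1])
    return radii

def in_manifold_fraction(queries, support, k):
    """Fraction of `queries` inside at least one k-NN ball around `support`."""
    radii = kth_nn_radius(support, k)
    hits = sum(
        any(math.dist(x, s) <= r for s, r in zip(support, radii)) for x in queries
    )
    return hits / len(queries)

def prdc(real, fake, k):
    """Return (precision, recall, density, coverage) for two point sets."""
    radii = kth_nn_radius(real, k)
    # inside[i][j] is True when fake[j] lies in the k-NN ball of real[i].
    inside = [[math.dist(f, r) <= rad for f in fake] for r, rad in zip(real, radii)]
    precision = in_manifold_fraction(fake, real, k)  # fakes near the real manifold
    recall = in_manifold_fraction(real, fake, k)     # reals near the fake manifold
    density = sum(map(sum, inside)) / (k * len(fake))
    coverage = sum(any(row) for row in inside) / len(real)
    return precision, recall, density, coverage

# Real data has two modes; the toy generator only produces the first one.
real = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 0), (11, 0), (10, 1), (11, 1)]
fake = [(0.5, 0.5), (0.2, 0.8), (0.9, 0.1), (0.4, 0.6)]

precision, recall, density, coverage = prdc(real, fake, k=2)
print(precision, recall, density, coverage)
```

In this toy case every fake point lies near real data, so precision is high, while recall and coverage stay low because the second real mode is never generated.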

12. Thank you
