
Fréchet inception distance

Metric used to assess image quality

The Fréchet inception distance (FID) is a metric used to assess the quality of images created by a generative model, such as a generative adversarial network (GAN).^[1] Unlike the earlier inception score (IS), which evaluates only the distribution of generated images, the FID compares the distribution of generated images with the distribution of a set of real images ("ground truth").^[1] The FID does not completely replace the IS. Generative models that achieve the best (lowest) FID tend to have greater sample variety, while models that achieve the best (highest) IS tend to produce better quality within individual images.^[2]

The FID was introduced in 2017^[1] and, as of 2020, was the standard metric for assessing the quality of generative models. It has been used to evaluate many models, including the high-resolution StyleGAN1^[3] and StyleGAN2^[4] networks and diffusion models sampled with classifier-free guidance.^[2]

Definition

For any two probability distributions $\mu ,\nu$ over $\mathbb {R} ^{n}$ with finite mean and variance, their Fréchet distance is^[5]

$$d_{F}(\mu ,\nu ):=\left(\inf _{\gamma \in \Gamma (\mu ,\nu )}\int _{\mathbb {R} ^{n}\times \mathbb {R} ^{n}}\|x-y\|^{2}\,\mathrm {d} \gamma (x,y)\right)^{1/2},$$

where $\Gamma (\mu ,\nu )$ is the set of all measures on $\mathbb {R} ^{n}\times \mathbb {R} ^{n}$ with marginals $\mu$ and $\nu$ on the first and second factors respectively (also called the set of all couplings of $\mu$ and $\nu$). In other words, it is the 2-Wasserstein distance on $\mathbb {R} ^{n}$. For two multivariate Gaussian distributions ${\mathcal {N}}(\mu ,\Sigma )$ and ${\mathcal {N}}(\mu ',\Sigma ')$, where $\mu ,\mu '$ now denote mean vectors and $\Sigma ,\Sigma '$ covariance matrices, the distance has the closed form^[6]

$$d_{F}({\mathcal {N}}(\mu ,\Sigma ),{\mathcal {N}}(\mu ',\Sigma '))^{2}=\lVert \mu -\mu '\rVert _{2}^{2}+\operatorname {tr} \left(\Sigma +\Sigma '-2\left(\Sigma \Sigma '\right)^{\frac {1}{2}}\right)$$
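The closed form for Gaussians can be evaluated numerically. The following is a minimal sketch, assuming NumPy is available; the function name `gaussian_frechet_distance_sq` is hypothetical and is not taken from any reference implementation. It computes $\operatorname {tr} (\Sigma \Sigma ')^{1/2}$ as the sum of the square roots of the eigenvalues of $\Sigma \Sigma '$, which is one of several possible ways to handle the matrix square root.

```python
import numpy as np


def gaussian_frechet_distance_sq(mu1, sigma1, mu2, sigma2):
    """Squared Fréchet distance between N(mu1, sigma1) and N(mu2, sigma2).

    Hypothetical helper for illustration; inputs are mean vectors and
    covariance matrices (scalars are accepted for the 1-D case).
    """
    mu1 = np.atleast_1d(np.asarray(mu1, dtype=float))
    mu2 = np.atleast_1d(np.asarray(mu2, dtype=float))
    sigma1 = np.atleast_2d(np.asarray(sigma1, dtype=float))
    sigma2 = np.atleast_2d(np.asarray(sigma2, dtype=float))

    # First term: squared Euclidean distance between the means.
    mean_term = float(np.sum((mu1 - mu2) ** 2))

    # tr((sigma1 sigma2)^{1/2}) is the sum of the square roots of the
    # eigenvalues of sigma1 @ sigma2. For covariance matrices these are
    # real and non-negative; rounding noise is removed by taking the real
    # part and clipping at zero.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    trace_sqrt = float(np.sum(np.sqrt(np.clip(eigvals.real, 0.0, None))))

    trace_term = float(np.trace(sigma1) + np.trace(sigma2)) - 2.0 * trace_sqrt
    return mean_term + trace_term
```

As a sanity check, in one dimension with variances $\sigma ^{2}$ and $\sigma '^{2}$ the formula reduces to $(\mu -\mu ')^{2}+(\sigma -\sigma ')^{2}$, so `gaussian_frechet_distance_sq(0.0, 1.0, 3.0, 4.0)` should be close to $9+1=10$.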

This allows us to define the FID in pseudocode form:

INPUT a function $f:\Omega _{X}\to \mathbb {R} ^{n}$ .

INPUT two datasets $S,S'\subset \Omega _{X}$ .

Compute $f(S),f(S')\subset \mathbb {R} ^{n}$ .

Fit two Gaussian distributions ${\mathcal {N}}(\mu ,\Sigma ),{\mathcal {N}}(\mu ',\Sigma ')$ to $f(S)$ and $f(S')$ respectively.

RETURN $d_{F}({\mathcal {N}}(\mu ,\Sigma ),{\mathcal {N}}(\mu ',\Sigma '))^{2}$ .
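The steps above can be sketched in Python. This is an illustrative sketch rather than a reference implementation: it assumes NumPy is available and that the feature vectors $f(S)$ and $f(S')$ have already been computed and stored as arrays of shape `(num_samples, n)`. The function name `fid_from_features` is hypothetical, and the use of the unbiased sample covariance (the `np.cov` default) is an assumption about how the Gaussians are fitted.

```python
import numpy as np


def fid_from_features(feats_a, feats_b):
    """FID-style score from two precomputed feature arrays.

    Hypothetical helper: feats_a and feats_b play the role of f(S) and
    f(S'), each with shape (num_samples, n).
    """
    feats_a = np.asarray(feats_a, dtype=float)
    feats_b = np.asarray(feats_b, dtype=float)

    # Fit a Gaussian to each feature set: sample mean and sample covariance
    # (np.cov uses the unbiased estimator by default; this is an assumption).
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    sigma_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    sigma_b = np.atleast_2d(np.cov(feats_b, rowvar=False))

    # tr((sigma_a sigma_b)^{1/2}) via the eigenvalues of the product, which
    # are real and non-negative up to rounding noise.
    eigvals = np.linalg.eigvals(sigma_a @ sigma_b)
    trace_sqrt = np.sum(np.sqrt(np.clip(eigvals.real, 0.0, None)))

    # Squared Fréchet distance between the two fitted Gaussians.
    mean_term = np.sum((mu_a - mu_b) ** 2)
    trace_term = np.trace(sigma_a) + np.trace(sigma_b) - 2.0 * trace_sqrt
    return float(mean_term + trace_term)
```

Comparing a feature set with itself should give a value near zero, and shifting every feature vector by a constant vector $c$ leaves the covariance unchanged, so the score should be close to $\lVert c\rVert _{2}^{2}$.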

In most practical uses of the FID, $\Omega _{X}$ is the space of images, and $f$ is an Inception v3 model trained on ImageNet, with its final classification layer removed. Concretely, $f$ maps an image to the 2048-dimensional activation vector of the model's last pooling layer. Of the two datasets $S,S'$, one is a reference dataset, which could be ImageNet itself, and the other is a set of images produced by a generative model, such as a GAN or a diffusion model.^[1]

References

[1]
Heusel, Martin; Ramsauer, Hubert; Unterthiner, Thomas; Nessler, Bernhard; Hochreiter, Sepp (2017). "GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium". Advances in Neural Information Processing Systems. 30. arXiv:1706.08500.
[2]
Ho, Jonathan; Salimans, Tim (2022). "Classifier-Free Diffusion Guidance". arXiv:2207.12598 [cs.LG].
[3]
Karras, Tero; Laine, Samuli; Aila, Timo (2020). "A Style-Based Generator Architecture for Generative Adversarial Networks". IEEE Transactions on Pattern Analysis and Machine Intelligence. PP (12): 4217–4228. arXiv:1812.04948. doi:10.1109/TPAMI.2020.2970919. PMID 32012000. S2CID 211022860.
[4]
Karras, Tero; Laine, Samuli; Aittala, Miika; Hellsten, Janne; Lehtinen, Jaakko; Aila, Timo (23 March 2020). "Analyzing and Improving the Image Quality of StyleGAN". arXiv:1912.04958 [cs.CV].
[5]
Fréchet, M. (1957). "Sur la distance de deux lois de probabilité". C. R. Acad. Sci. Paris. 244: 689–692.
[6]
Dowson, D. C; Landau, B. V (1 September 1982). "The Fréchet distance between multivariate normal distributions". Journal of Multivariate Analysis. 12 (3): 450–455. doi:10.1016/0047-259X(82)90077-X. ISSN 0047-259X.
[7]
Kilgour, Kevin; Zuluaga, Mauricio; Roblek, Dominik; Sharifi, Matthew (2019-09-15). "Fréchet Audio Distance: A Reference-Free Metric for Evaluating Music Enhancement Algorithms". Interspeech 2019: 2350–2354. doi:10.21437/Interspeech.2019-2219. S2CID 202725406.
[8]
Unterthiner, Thomas; Steenkiste, Sjoerd van; Kurach, Karol; Marinier, Raphaël; Michalski, Marcin; Gelly, Sylvain (2019-03-27). "FVD: A new Metric for Video Generation". Open Review.
[9]
Preuer, Kristina; Renz, Philipp; Unterthiner, Thomas; Hochreiter, Sepp; Klambauer, Günter (2018-09-24). "Fréchet ChemNet Distance: A Metric for Generative Models for Molecules in Drug Discovery". Journal of Chemical Information and Modeling. 58 (9): 1736–1741. arXiv:1803.09518. doi:10.1021/acs.jcim.8b00234. PMID 30118593. S2CID 51892387.
[10]
Chong, Min Jin; Forsyth, David (2020-06-15). "Effectively Unbiased FID and Inception Score and where to find them". arXiv:1911.07023 [cs.CV].
[11]
Liu, Shaohui; Wei, Yi; Lu, Jiwen; Zhou, Jie (2018-07-19). "An Improved Evaluation Framework for Generative Adversarial Networks". arXiv:1803.07474 [cs.CV].

This article uses material from the Wikipedia article Fréchet_inception_distance, and is written by contributors. Text is available under a CC BY-SA 4.0 International License; additional terms may apply. Images, videos and audio are available under their respective licenses.
