An error analysis of generative adversarial networks for learning distributions
This paper studies how well generative adversarial networks (GANs) learn
probability distributions from finite samples. Our main results establish the
convergence rates of GANs under a collection of integral probability metrics
defined through Hölder classes, including the Wasserstein distance as a
special case. We also show that GANs can adaptively learn data distributions
that have low-dimensional structures or Hölder densities, provided the network
architectures are chosen properly. In particular, for distributions
concentrated around a low-dimensional set, we show that the learning rates of
GANs do not depend on the high ambient dimension, but on the lower intrinsic
dimension. Our analysis is based on a new oracle inequality that decomposes the
estimation error into the generator and discriminator approximation errors and
the statistical error, which may be of independent interest.
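
For context, both results are stated in terms of integral probability metrics (IPMs): for a function class $\mathcal{F}$ (here a Hölder ball), $d_{\mathcal{F}}(\mu,\nu) = \sup_{f\in\mathcal{F}} |\mathbb{E}_{X\sim\mu} f(X) - \mathbb{E}_{Y\sim\nu} f(Y)|$, and taking $\mathcal{F}$ to be the 1-Lipschitz functions recovers the Wasserstein-1 distance. A schematic rendering of the oracle inequality described above, with symbols chosen here for illustration rather than taken from the paper, is
\[
d_{\mathcal{F}}(\hat{\mu}_n, \mu) \;\lesssim\; \underbrace{\epsilon_{\mathrm{gen}}}_{\text{generator approx.}} \;+\; \underbrace{\epsilon_{\mathrm{disc}}}_{\text{discriminator approx.}} \;+\; \underbrace{\epsilon_{\mathrm{stat}}(n)}_{\text{statistical error}},
\]
where $\hat{\mu}_n$ denotes the GAN estimate built from $n$ samples of $\mu$.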
Statistical Guarantees of Generative Adversarial Networks for Distribution Estimation
Generative Adversarial Networks (GANs) have achieved great success in
unsupervised learning. Despite their remarkable empirical performance,
theoretical understanding of the statistical properties of GANs remains
limited. This paper provides statistical guarantees for GANs estimating data
distributions whose densities lie in a Hölder space. Our main result shows
that, if the generator and discriminator network architectures are properly
chosen (universally for all distributions with Hölder densities), GANs are
consistent estimators of the data distributions under strong discrepancy
metrics, such as the Wasserstein distance. To the best of our knowledge, this
is the first statistical theory of GANs for Hölder densities. Compared with
existing works, our theory requires minimal assumptions on the data
distributions.
Our generator and discriminator networks utilize general weight matrices and
the non-invertible ReLU activation function, while many existing works only
apply to invertible weight matrices and invertible activation functions. In our
analysis, we decompose the error into a statistical error and an approximation
error via a new oracle inequality, which may be of independent interest.
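
As a concrete illustration of the architecture class these guarantees cover — fully connected networks with general weight matrices and non-invertible ReLU activations, trained against an IPM-type criterion — the following is a minimal PyTorch sketch. It is a toy under illustrative assumptions (layer sizes, optimizer, weight clipping as a crude function-class constraint, and a Gaussian stand-in target are all choices made here), not the authors' implementation:

import torch
import torch.nn as nn

def mlp(sizes):
    # Plain ReLU MLP: general (unconstrained) weight matrices with
    # non-invertible ReLU activations between layers.
    layers = []
    for i in range(len(sizes) - 1):
        layers.append(nn.Linear(sizes[i], sizes[i + 1]))
        if i < len(sizes) - 2:
            layers.append(nn.ReLU())
    return nn.Sequential(*layers)

latent_dim, data_dim = 4, 2
G = mlp([latent_dim, 64, 64, data_dim])  # generator: pushes noise forward
D = mlp([data_dim, 64, 64, 1])           # discriminator: the IPM "witness" f

opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)

def sample_data(n):
    # Toy target with a smooth (Hölder) density: a shifted 2-D Gaussian.
    return 0.5 * torch.randn(n, data_dim) + 1.0

for step in range(2000):
    # Discriminator ascent on the IPM criterion E_data[f] - E_model[f];
    # weight clipping is a crude surrogate for restricting f to a bounded
    # function class (as in the original WGAN recipe).
    x, z = sample_data(128), torch.randn(128, latent_dim)
    d_loss = D(G(z).detach()).mean() - D(x).mean()
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    with torch.no_grad():
        for p in D.parameters():
            p.clamp_(-0.05, 0.05)
    # Generator descent on the same criterion.
    z = torch.randn(128, latent_dim)
    g_loss = -D(G(z)).mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

Training alternates the two updates so that the discriminator approximates the supremum over the function class while the generator minimizes the resulting IPM estimate.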