Learning Probability Measures with respect to Optimal Transport Metrics

Authors: Guillermo D. Canas, Lorenzo Rosasco
Publication date: 01/01/2012

We study the problem of estimating, in the sense of optimal transport metrics, a measure which is assumed supported on a manifold embedded in a Hilbert space. By establishing a precise connection between optimal transport metrics, optimal quantization, and learning theory, we derive new probabilistic bounds for the performance of a classic algorithm in unsupervised learning (k-means), when used to produce a probability measure derived from the data. In the course of the analysis, we arrive at new lower bounds, as well as probabilistic upper bounds on the convergence rate of the empirical law of large numbers, which, unlike existing bounds, are applicable to a wide class of measures.

Comment: 13 pages, 2 figures. Advances in Neural Information Processing Systems, NIPS 201
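As a rough illustration of how k-means can be read as producing a probability measure, the sketch below runs Lloyd's iterations on one-dimensional data and returns a discrete measure: the centers as support points, weighted by the fraction of samples assigned to each. The function names, the 1-D setting, and the fixed iteration count are assumptions made for the example and are not taken from the paper. The returned cost is the mean squared distance to the nearest center, i.e. the quantization error, which equals the squared 2-Wasserstein cost of moving each sample to its nearest center.

```python
def nearest(x, centers):
    """Index of the center closest to x (squared Euclidean distance, 1-D)."""
    return min(range(len(centers)), key=lambda j: (x - centers[j]) ** 2)


def kmeans_measure(points, init_centers, n_iters=50):
    """Lloyd's algorithm in 1-D; returns (centers, weights, quantization cost).

    Hypothetical helper for illustration only. The pair (centers, weights)
    describes a discrete probability measure supported on the centers.
    """
    centers = list(init_centers)
    for _ in range(n_iters):
        # Assignment step: group each point with its nearest center.
        clusters = [[] for _ in centers]
        for x in points:
            clusters[nearest(x, centers)].append(x)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[j]
            for j, c in enumerate(clusters)
        ]
    # Weights: fraction of the sample assigned to each final center.
    counts = [0] * len(centers)
    for x in points:
        counts[nearest(x, centers)] += 1
    weights = [c / len(points) for c in counts]
    # Mean squared distance to the nearest center (quantization error).
    cost = sum((x - centers[nearest(x, centers)]) ** 2 for x in points) / len(points)
    return centers, weights, cost


centers, weights, cost = kmeans_measure([0.0, 0.2, 0.4, 5.0, 5.2], [0.0, 5.0])
```

Here the two centers settle at the cluster means, and the weights sum to one, so the output can be compared with the empirical measure under a transport metric.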

arXiv.org e-Print Archive

CiteSeerX

Archivio istituzionale della ricerca - Università di Genova

Learning Generative Models with Sinkhorn Divergences

Authors: Marco Cuturi, Aude Genevay, Gabriel Peyré
Publication date: 20/10/2017

The ability to compare two degenerate probability distributions (i.e. two probability distributions supported on two distinct low-dimensional manifolds living in a much higher-dimensional space) is a crucial problem arising in the estimation of generative models for high-dimensional observations such as those arising in computer vision or natural language. It is known that optimal transport metrics can represent a cure for this problem, since they were specifically designed as an alternative to information divergences to handle such problematic scenarios. Unfortunately, training generative machines using OT raises formidable computational and statistical challenges, because of (i) the computational burden of evaluating OT losses, (ii) the instability and lack of smoothness of these losses, and (iii) the difficulty of robustly estimating these losses and their gradients in high dimension. This paper presents the first tractable computational method to train large-scale generative models using an optimal transport loss, and tackles these three issues by relying on two key ideas: (a) entropic smoothing, which turns the original OT loss into one that can be computed using Sinkhorn fixed-point iterations; (b) algorithmic (automatic) differentiation of these iterations. These two approximations result in a robust and differentiable approximation of the OT loss with streamlined GPU execution. Entropic smoothing generates a family of losses interpolating between Wasserstein (OT) and Maximum Mean Discrepancy (MMD), thus allowing one to find a sweet spot leveraging the geometry of OT and the favorable high-dimensional sample complexity of MMD, which comes with unbiased gradient estimates. The resulting computational architecture nicely complements standard deep network generative models with a stack of extra layers implementing the loss function.
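The Sinkhorn fixed-point iterations mentioned in the abstract can be sketched for two small histograms as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `sinkhorn_cost`, the default `epsilon`, and the fixed iteration count are all choices made for the example, and the automatic differentiation step and the debiased Sinkhorn divergence are not shown.

```python
import math


def sinkhorn_cost(a, b, cost, epsilon=0.1, n_iters=200):
    """Entropy-regularized OT between histograms a and b (illustrative sketch).

    a, b  : lists of nonnegative weights, each summing to 1
    cost  : cost[i][j] is the cost of moving mass from bin i of a to bin j of b
    Returns (transport cost of the regularized plan, the plan itself).
    """
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / epsilon).
    K = [[math.exp(-cost[i][j] / epsilon) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Row scaling: u <- a / (K v).
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Column scaling: v <- b / (K^T u).
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Regularized plan P_ij = u_i K_ij v_j.
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    total = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return total, plan


# Two bins located at 0 and 1, with cost |x - y|.
C = [[0.0, 1.0], [1.0, 0.0]]
total, plan = sinkhorn_cost([0.5, 0.5], [0.25, 0.75], C)
```

For this toy problem the unregularized optimum moves 0.25 of the mass across a distance of 1, and with a small `epsilon` the regularized cost stays close to that value. Every operation in the loop is a smooth function of the inputs, which is what makes it possible to differentiate through the iterations.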

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Geometrical Insights for Implicit Generative Modeling

Author: A Auffinger
A Gretton
A Müller
AA Zinger
B Schölkopf
B Sriperumbudur
BK Sriperumbudur
C Villani
D Sejdinovic
GJ Székely
H Cramér
IJ Schoenberg
JM Hammersley
MA Aizerman
N Aronszajn
N Fournier
P Milgrom
R Mises von
RJ Serfling
RM Neal
ST Rachev
Steffen Dereich
T Hastie
VS Borkar
X Nguyen
Publication date: 21/08/2019

Learning algorithms for implicit generative models can optimize a variety of criteria that measure how the data distribution differs from the implicit model distribution, including the Wasserstein distance, the Energy distance, and the Maximum Mean Discrepancy criterion. A careful look at the geometries induced by these distances on the space of probability measures reveals interesting differences. In particular, we can establish surprising approximate global convergence guarantees for the 1-Wasserstein distance, even when the parametric generator has a nonconvex parametrization.

Comment: this version fixes a typo in a definition
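To make two of the criteria named in this abstract concrete, the sketch below computes the 1-Wasserstein distance and the energy distance between two one-dimensional samples. The function names and the restriction to equal-size 1-D samples are assumptions made for the example. In one dimension the 1-Wasserstein distance between equal-size samples reduces to the mean absolute difference of the sorted values. The energy distance is given here in its squared form, 2 E|X - Y| - E|X - X'| - E|Y - Y'|, estimated by averaging over all pairs (including identical indices).

```python
def wasserstein_1(xs, ys):
    """1-Wasserstein distance between two equal-size 1-D samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    # In 1-D the optimal coupling matches sorted values in order.
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)


def mean_abs_diff(xs, ys):
    """Average of |x - y| over all pairs drawn from xs and ys."""
    return sum(abs(x - y) for x in xs for y in ys) / (len(xs) * len(ys))


def energy_distance_sq(xs, ys):
    """Plug-in estimate of 2 E|X - Y| - E|X - X'| - E|Y - Y'|."""
    return 2 * mean_abs_diff(xs, ys) - mean_abs_diff(xs, xs) - mean_abs_diff(ys, ys)


x = [0.0, 1.0]
y = [2.0, 3.0]
w1 = wasserstein_1(x, y)
e2 = energy_distance_sq(x, y)
```

Both quantities vanish when the two samples coincide, but they weigh displacements differently, which is one way to see that they induce different geometries on the space of measures.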

arXiv.org e-Print Archive

Crossref