
    A Smoothed Dual Approach for Variational Wasserstein Problems

    Variational problems that involve Wasserstein distances have recently been proposed to summarize and learn from probability measures. Despite being conceptually simple, such problems are computationally challenging because they involve minimizing over quantities (Wasserstein distances) that are themselves hard to compute. We show that the dual formulation of Wasserstein variational problems recently introduced by Carlier et al. (2014) can be regularized using an entropic smoothing, which leads to smooth, differentiable, convex optimization problems that are simpler to implement and numerically more stable. We illustrate the versatility of this approach by applying it to the computation of Wasserstein barycenters and gradient flows of spatial regularization functionals.
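
    To make the entropic smoothing idea concrete, here is a minimal, hypothetical sketch (not the paper's code) of the smoothed transport cost obtained by Sinkhorn-style scaling iterations between two histograms on a shared grid; the regularization parameter eps, the iteration count, and the toy 1-D example are illustrative assumptions.

```python
import numpy as np

def sinkhorn_smoothed_ot(a, b, C, eps=0.05, n_iter=500):
    """Entropy-regularized OT between histograms a and b with cost matrix C.

    Returns the smoothed transport cost <P, C>, where P is the optimal
    coupling of the regularized problem found by Sinkhorn scaling updates."""
    K = np.exp(-C / eps)                # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)               # alternate scaling of the two marginals
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]     # regularized optimal coupling
    return np.sum(P * C)

# Toy example: two bumps on a 1-D grid, squared-distance ground cost.
x = np.linspace(0, 1, 50)
a = np.exp(-(x - 0.3) ** 2 / 0.01); a /= a.sum()
b = np.exp(-(x - 0.7) ** 2 / 0.01); b /= b.sum()
C = (x[:, None] - x[None, :]) ** 2
print(sinkhorn_smoothed_ot(a, b, C))
```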

    Fast Optimal Transport Averaging of Neuroimaging Data

    Knowing how the human brain is anatomically and functionally organized at the level of a group of healthy individuals or patients is the primary goal of neuroimaging research. Yet computing an average of brain imaging data defined over a voxel grid or a triangulation remains a challenge. Data are large, the geometry of the brain is complex, and between-subject variability leads to spatially or temporally non-overlapping effects of interest. To address the problem of variability, data are commonly smoothed before group linear averaging. In this work we build on ideas originally introduced by Kantorovich to propose a new algorithm that can efficiently average non-normalized data defined over arbitrary discrete domains using transportation metrics. We show how Kantorovich means can be linked to Wasserstein barycenters in order to take advantage of an entropic smoothing approach. This leads to a smooth convex optimization problem and an algorithm with strong convergence guarantees. We illustrate the versatility of this tool and its empirical behavior on functional neuroimaging data, functional MRI and magnetoencephalography (MEG) source estimates, defined on voxel grids and triangulations of the folded cortical surface. Comment: Information Processing in Medical Imaging (IPMI), Jun 2015, Isle of Skye, United Kingdom. Springer, 201
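
    A hedged sketch of how an entropy-smoothed Wasserstein barycenter of several histograms on a common discrete domain can be computed with Sinkhorn-like scaling updates (iterative Bregman projections). The paper targets non-normalized data on voxel grids and cortical triangulations; this toy assumes normalized histograms on a 1-D grid, and the parameters eps, weights, and function names are illustrative.

```python
import numpy as np

def entropic_barycenter(hists, C, weights=None, eps=0.05, n_iter=200):
    """Entropy-smoothed Wasserstein barycenter of the histograms in the rows
    of `hists`, all defined on a common discrete domain with cost matrix C,
    computed with Sinkhorn-like scaling updates (iterative Bregman projections)."""
    n_hists, n = hists.shape
    if weights is None:
        weights = np.full(n_hists, 1.0 / n_hists)
    K = np.exp(-C / eps)                        # Gibbs kernel
    v = np.ones((n_hists, n))
    for _ in range(n_iter):
        u = hists / (v @ K.T)                   # u_k = b_k / (K v_k)
        bary = np.exp(weights @ np.log(u @ K))  # geometric mean of K^T u_k
        v = bary[None, :] / (u @ K)             # v_k = bary / (K^T u_k)
    return bary

# Toy example: barycenter of two bumps on a 1-D grid.
x = np.linspace(0, 1, 60)
C = (x[:, None] - x[None, :]) ** 2
h1 = np.exp(-(x - 0.25) ** 2 / 0.005); h1 /= h1.sum()
h2 = np.exp(-(x - 0.75) ** 2 / 0.005); h2 /= h2.sum()
bary = entropic_barycenter(np.stack([h1, h2]), C)
```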

    Variational Approaches for Image Labeling on the Assignment Manifold

    The image labeling problem refers to the task of assigning to each pixel a single element from a finite, predefined set of labels. In classical approaches, the labeling task is formulated as a minimization problem of specifically structured objective functions. Assignment flows for contextual image labeling are a recently proposed alternative formulation via spatially coupled replicator equations. In this work, the classical and dynamical viewpoints of image labeling are combined into a variational formulation. This is accomplished by following the induced Riemannian gradient descent flow on an elementary statistical manifold with respect to the underlying information geometry. Convergence and stability behavior of this approach are investigated using the log-barrier method. A novel parameterization of the assignment flow by its dominant component is derived, revealing a Riemannian gradient flow structure that clearly identifies the two governing processes of the flow: spatial regularization of assignments and gradual enforcement of unambiguous label decisions. Also, a continuous-domain formulation of the corresponding potential is presented and well-posedness of the related optimization problem is established. Furthermore, an alternative smooth variational approach to maximum a posteriori inference based on discrete graphical models is derived by utilizing local Wasserstein distances. Following the resulting Riemannian gradient flow leads to an inference process which always satisfies the local marginalization constraints and incorporates a smooth rounding mechanism towards unambiguous assignments.
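
    The two governing processes named above (spatial regularization and gradual enforcement of unambiguous decisions) can be illustrated with a heavily simplified, hypothetical discretization that is not the authors' implementation: likelihood vectors from a data term, a geometric-mean neighborhood average, and a multiplicative update toward integral assignments. The 1-D chain with periodic boundary, the parameters rho and h, and all names are assumptions.

```python
import numpy as np

def assignment_flow_step(W, D, rho=0.1, h=0.5):
    """One explicit step of a heavily simplified assignment-flow-like update
    on a 1-D chain of pixels (periodic boundary for simplicity).

    W : (n, K) row-stochastic assignments, one simplex point per pixel
    D : (n, K) label distances (data term)"""
    L = W * np.exp(-D / rho)                   # lift the data term to the simplex
    L /= L.sum(axis=1, keepdims=True)
    logL = np.log(L)
    S = np.exp((np.roll(logL, 1, axis=0) + logL + np.roll(logL, -1, axis=0)) / 3.0)
    S /= S.sum(axis=1, keepdims=True)          # geometric-mean neighborhood average
    W_new = W * S ** h                         # multiplicative (lifted) update
    return W_new / W_new.sum(axis=1, keepdims=True)

# Toy 1-D labeling: 12 pixels, 2 labels, noisy data term.
rng = np.random.default_rng(0)
D = np.stack([np.r_[np.zeros(6), np.ones(6)],
              np.r_[np.ones(6), np.zeros(6)]], axis=1) + 0.3 * rng.random((12, 2))
W = np.full((12, 2), 0.5)
for _ in range(30):
    W = assignment_flow_step(W, D)
print(W.argmax(axis=1))   # spatially regularized, nearly integral assignments
```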

    Inference and Model Parameter Learning for Image Labeling by Geometric Assignment

    Image labeling is a fundamental problem in the area of low-level image analysis. In this work, we present novel approaches to maximum a posteriori (MAP) inference and model parameter learning for image labeling, respectively. Both approaches are formulated in a smooth geometric setting, whose respective solution space is a simple Riemannian manifold. Optimization consists of multiplicative updates that geometrically integrate the resulting Riemannian gradient flow. Our novel approach to MAP inference is based on discrete graphical models. By utilizing local Wasserstein distances for coupling assignment measures across edges of the underlying graph, we smoothly approximate a given discrete objective function and restrict it to the assignment manifold. A corresponding update scheme combines geometric integration of the resulting gradient flow and rounding to integral solutions that represent valid labelings. This formulation constitutes an inner relaxation of the discrete labeling problem, i.e. throughout this process the local marginalization constraints known from the established linear programming relaxation are satisfied. Furthermore, we study the inverse problem of model parameter learning using the linear assignment flow and training data with ground truth. This is accomplished by a Riemannian gradient flow on the manifold of parameters that determine the regularization properties of the assignment flow. This smooth formulation enables us to tackle the model parameter learning problem from the perspective of parameter estimation for dynamical systems. By using symplectic partitioned Runge-Kutta methods for numerical integration, we show that deriving the sensitivity conditions of the parameter learning problem commutes with its discretization. A favorable property of our approach is that learning is based on exact inference.
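
    The point that sensitivity analysis and discretization commute can be illustrated on a toy dynamical system. The sketch below is not the paper's symplectic partitioned Runge-Kutta scheme on the assignment manifold; it merely propagates the sensitivity dx/dtheta through an explicit-Euler discretization of a hypothetical linear flow A(theta) and checks it against a finite-difference estimate.

```python
import numpy as np

def flow(theta, x0, T=1.0, steps=100):
    """Explicit-Euler discretization of the toy linear flow dx/dt = A(theta) x."""
    A = np.array([[-1.0, theta], [-theta, -1.0]])
    h = T / steps
    x = x0.copy()
    for _ in range(steps):
        x = x + h * (A @ x)
    return x

def flow_with_sensitivity(theta, x0, T=1.0, steps=100):
    """Propagate x together with its sensitivity s = dx/dtheta by
    differentiating the same discrete scheme step by step."""
    A = np.array([[-1.0, theta], [-theta, -1.0]])
    dA = np.array([[0.0, 1.0], [-1.0, 0.0]])      # dA/dtheta
    h = T / steps
    x, s = x0.copy(), np.zeros_like(x0)
    for _ in range(steps):
        x, s = x + h * (A @ x), s + h * (dA @ x + A @ s)
    return x, s

x0, theta = np.array([1.0, 0.0]), 0.7
_, s = flow_with_sensitivity(theta, x0)
fd = (flow(theta + 1e-6, x0) - flow(theta - 1e-6, x0)) / 2e-6
print(s, fd)   # the propagated sensitivity matches the finite-difference check
```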

    Learning Generative Models with Sinkhorn Divergences

    The ability to compare two degenerate probability distributions (i.e. two probability distributions supported on two distinct low-dimensional manifolds living in a much higher-dimensional space) is a crucial problem arising in the estimation of generative models for high-dimensional observations such as those arising in computer vision or natural language. It is known that optimal transport metrics can represent a cure for this problem, since they were specifically designed as an alternative to information divergences to handle such problematic scenarios. Unfortunately, training generative machines using OT raises formidable computational and statistical challenges, because of (i) the computational burden of evaluating OT losses, (ii) the instability and lack of smoothness of these losses, and (iii) the difficulty of estimating these losses and their gradients robustly in high dimension. This paper presents the first tractable computational method to train large-scale generative models using an optimal transport loss, and tackles these three issues by relying on two key ideas: (a) entropic smoothing, which turns the original OT loss into one that can be computed using Sinkhorn fixed-point iterations; (b) algorithmic (automatic) differentiation of these iterations. These two approximations result in a robust and differentiable approximation of the OT loss with streamlined GPU execution. Entropic smoothing generates a family of losses interpolating between Wasserstein (OT) and Maximum Mean Discrepancy (MMD), thus allowing one to find a sweet spot that leverages the geometry of OT and the favorable high-dimensional sample complexity of MMD, which comes with unbiased gradient estimates. The resulting computational architecture nicely complements standard deep network generative models with a stack of extra layers implementing the loss function.
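
    A minimal numpy sketch of one common variant of the loss family described here: the entropic OT cost between empirical point clouds, debiased into a Sinkhorn divergence. In actual training one would, as the paper proposes, differentiate through the Sinkhorn iterations with an automatic-differentiation framework; this sketch only evaluates the loss, and eps, the iteration count, and the squared-Euclidean cost are illustrative choices.

```python
import numpy as np

def sinkhorn_cost(X, Y, eps=0.1, n_iter=200):
    """Entropic OT cost <P, C> between the uniform empirical measures on the
    point clouds X (n, d) and Y (m, d), with squared-Euclidean ground cost."""
    n, m = len(X), len(Y)
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    K = np.exp(-C / eps)
    u = np.ones(n)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]
    return np.sum(P * C)

def sinkhorn_divergence(X, Y, eps=0.1):
    """Debiased Sinkhorn divergence: a symmetric loss interpolating between an
    OT-like regime (small eps) and an MMD-like regime (large eps)."""
    return (sinkhorn_cost(X, Y, eps)
            - 0.5 * sinkhorn_cost(X, X, eps)
            - 0.5 * sinkhorn_cost(Y, Y, eps))

# Toy example: two small 2-D point clouds.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(64, 2))
Y = rng.normal(0.5, 1.0, size=(64, 2))
print(sinkhorn_divergence(X, Y))
```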

    Practical bounds on the error of Bayesian posterior approximations: A nonasymptotic approach

    Bayesian inference typically requires the computation of an approximation to the posterior distribution. An important requirement for an approximate Bayesian inference algorithm is to output high-accuracy posterior mean and uncertainty estimates. Classical Monte Carlo methods, particularly Markov Chain Monte Carlo, remain the gold standard for approximate Bayesian inference because they have a robust finite-sample theory and reliable convergence diagnostics. However, alternative methods, which are more scalable or apply to problems where Markov Chain Monte Carlo cannot be used, lack the same finite-data approximation theory and tools for evaluating their accuracy. In this work, we develop a flexible new approach to bounding the error of mean and uncertainty estimates of scalable inference algorithms. Our strategy is to control the estimation errors in terms of Wasserstein distance, then bound the Wasserstein distance via a generalized notion of Fisher distance. Unlike computing the Wasserstein distance, which requires access to the normalized posterior distribution, the Fisher distance is tractable to compute because it requires access only to the gradient of the log posterior density. We demonstrate the usefulness of our Fisher distance approach by deriving bounds on the Wasserstein error of the Laplace approximation and Hilbert coresets. We anticipate that our approach will be applicable to many other approximate inference methods such as the integrated Laplace approximation, variational inference, and approximate Bayesian computation. Comment: 22 pages, 2 figures
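
    The key computational point is that a Fisher-type distance needs only score functions (gradients of log densities), never normalizing constants. Below is a hedged Monte Carlo sketch of such an estimate; the paper's generalized Fisher distance and the constants linking it to Wasserstein error are defined precisely there and may differ in detail, and all function names here are illustrative.

```python
import numpy as np

def fisher_distance_estimate(grad_log_p, grad_log_q, sample_q, n_samples=10_000, seed=0):
    """Monte Carlo estimate of a Fisher-type distance between a target p and a
    tractable approximation q, using only score functions (gradients of log
    densities), so no normalizing constant of p is needed."""
    rng = np.random.default_rng(seed)
    X = sample_q(n_samples, rng)
    diff = np.array([grad_log_p(x) - grad_log_q(x) for x in X])
    return np.sqrt(np.mean(np.sum(diff ** 2, axis=1)))

# Toy check: p = N(0, 1), q = N(0.1, 1); their scores differ by the constant 0.1.
grad_log_p = lambda x: -x
grad_log_q = lambda x: -(x - 0.1)
sample_q = lambda n, rng: rng.normal(0.1, 1.0, size=(n, 1))
print(fisher_distance_estimate(grad_log_p, grad_log_q, sample_q))   # approx 0.1
```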