Search CORE

24,774 research outputs found

Sinkhorn Barycenters with Free Support via Frank-Wolfe Algorithm

Author: Ciliberto Carlo
Luise Giulia
Pontil Massimiliano
Salzo Saverio
Publication venue
Publication date: 30/05/2019
Field of study

We present a novel algorithm to estimate the barycenter of arbitrary probability distributions with respect to the Sinkhorn divergence. Based on a Frank-Wolfe optimization strategy, our approach proceeds by populating the support of the barycenter incrementally, without requiring any pre-allocation. We consider discrete as well as continuous distributions, proving convergence rates of the proposed algorithm in both settings. Key elements of our analysis are a new result showing that the Sinkhorn divergence on compact domains has Lipschitz continuous gradient with respect to the Total Variation and a characterization of the sample complexity of Sinkhorn potentials. Experiments validate the effectiveness of our method in practice.Comment: 46 pages, 8 figure

arXiv.org e-Print Archive

UCL Discovery

Playing with Duality: An Overview of Recent Primal-Dual Approaches for Solving Large-Scale Optimization Problems

Author: Komodakis Nikos
Pesquet Jean-Christophe
Publication venue
Publication date: 03/12/2014
Field of study

Optimization methods are at the core of many problems in signal/image processing, computer vision, and machine learning. For a long time, it has been recognized that looking at the dual of an optimization problem may drastically simplify its solution. Deriving efficient strategies which jointly brings into play the primal and the dual problems is however a more recent idea which has generated many important new contributions in the last years. These novel developments are grounded on recent advances in convex analysis, discrete optimization, parallel processing, and non-smooth optimization with emphasis on sparsity issues. In this paper, we aim at presenting the principles of primal-dual approaches, while giving an overview of numerical methods which have been proposed in different contexts. We show the benefits which can be drawn from primal-dual algorithms both for solving large-scale convex optimization problems and discrete ones, and we provide various application examples to illustrate their usefulness

arXiv.org e-Print Archive

HAL-CentraleSupelec

Crossref

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

HAL-Rennes 1

Regularized Optimal Transport and the Rot Mover's Distance

Author: Dessein Arnaud
Papadakis Nicolas
Rouas Jean-Luc
Publication venue
Publication date: 01/01/2018
Field of study

This paper presents a unified framework for smooth convex regularization of discrete optimal transport problems. In this context, the regularized optimal transport turns out to be equivalent to a matrix nearness problem with respect to Bregman divergences. Our framework thus naturally generalizes a previously proposed regularization based on the Boltzmann-Shannon entropy related to the Kullback-Leibler divergence, and solved with the Sinkhorn-Knopp algorithm. We call the regularized optimal transport distance the rot mover's distance in reference to the classical earth mover's distance. We develop two generic schemes that we respectively call the alternate scaling algorithm and the non-negative alternate scaling algorithm, to compute efficiently the regularized optimal plans depending on whether the domain of the regularizer lies within the non-negative orthant or not. These schemes are based on Dykstra's algorithm with alternate Bregman projections, and further exploit the Newton-Raphson method when applied to separable divergences. We enhance the separable case with a sparse extension to deal with high data dimensions. We also instantiate our proposed framework and discuss the inherent specificities for well-known regularizers and statistical divergences in the machine learning and information geometry communities. Finally, we demonstrate the merits of our methods with experiments using synthetic data to illustrate the effect of different regularizers and penalties on the solutions, as well as real-world data for a pattern recognition application to audio scene classification

arXiv.org e-Print Archive

Oskar Bordeaux

Computation of Ground States of the Gross-Pitaevskii Functional via Riemannian Optimization

Author: Danaila Ionut
Protas Bartosz
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2017
Field of study

In this paper we combine concepts from Riemannian Optimization and the theory of Sobolev gradients to derive a new conjugate gradient method for direct minimization of the Gross-Pitaevskii energy functional with rotation. The conservation of the number of particles constrains the minimizers to lie on a manifold corresponding to the unit

L^2

norm. The idea developed here is to transform the original constrained optimization problem to an unconstrained problem on this (spherical) Riemannian manifold, so that fast minimization algorithms can be applied as alternatives to more standard constrained formulations. First, we obtain Sobolev gradients using an equivalent definition of an

H^1

inner product which takes into account rotation. Then, the Riemannian gradient (RG) steepest descent method is derived based on projected gradients and retraction of an intermediate solution back to the constraint manifold. Finally, we use the concept of the Riemannian vector transport to propose a Riemannian conjugate gradient (RCG) method for this problem. It is derived at the continuous level based on the "optimize-then-discretize" paradigm instead of the usual "discretize-then-optimize" approach, as this ensures robustness of the method when adaptive mesh refinement is performed in computations. We evaluate various design choices inherent in the formulation of the method and conclude with recommendations concerning selection of the best options. Numerical tests demonstrate that the proposed RCG method outperforms the simple gradient descent (RG) method in terms of rate of convergence. While on simple problems a Newton-type method implemented in the {\tt Ipopt} library exhibits a faster convergence than the (RCG) approach, the two methods perform similarly on more complex problems requiring the use of mesh adaptation. At the same time the (RCG) approach has far fewer tunable parameters.Comment: 28 pages, 13 figure

arXiv.org e-Print Archive

HAL - Normandie Université

HAL Descartes

Hal-Diderot

Entropic Wasserstein Gradient Flows

Author: Peyré Gabriel
Publication venue
Publication date: 01/01/2015
Field of study

This article details a novel numerical scheme to approximate gradient flows for optimal transport (i.e. Wasserstein) metrics. These flows have proved useful to tackle theoretically and numerically non-linear diffusion equations that model for instance porous media or crowd evolutions. These gradient flows define a suitable notion of weak solutions for these evolutions and they can be approximated in a stable way using discrete flows. These discrete flows are implicit Euler time stepping according to the Wasserstein metric. A bottleneck of these approaches is the high computational load induced by the resolution of each step. Indeed, this corresponds to the resolution of a convex optimization problem involving a Wasserstein distance to the previous iterate. Following several recent works on the approximation of Wasserstein distances, we consider a discrete flow induced by an entropic regularization of the transportation coupling. This entropic regularization allows one to trade the initial Wasserstein fidelity term for a Kulback-Leibler divergence, which is easier to deal with numerically. We show how KL proximal schemes, and in particular Dykstra's algorithm, can be used to compute each step of the regularized flow. The resulting algorithm is both fast, parallelizable and versatile, because it only requires multiplications by a Gibbs kernel. On Euclidean domains discretized on an uniform grid, this corresponds to a linear filtering (for instance a Gaussian filtering when

c

is the squared Euclidean distance) which can be computed in nearly linear time. On more general domains, such as (possibly non-convex) shapes or on manifolds discretized by a triangular mesh, following a recently proposed numerical scheme for optimal transport, this Gibbs kernel multiplication is approximated by a short-time heat diffusion

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server