Sinkhorn Barycenters with Free Support via Frank-Wolfe Algorithm
We present a novel algorithm to estimate the barycenter of arbitrary
probability distributions with respect to the Sinkhorn divergence. Based on a
Frank-Wolfe optimization strategy, our approach proceeds by populating the
support of the barycenter incrementally, without requiring any pre-allocation.
We consider discrete as well as continuous distributions, proving convergence
rates of the proposed algorithm in both settings. Key elements of our analysis
are a new result showing that the Sinkhorn divergence on compact domains has
Lipschitz continuous gradient with respect to the Total Variation and a
characterization of the sample complexity of Sinkhorn potentials. Experiments
validate the effectiveness of our method in practice.
Comment: 46 pages, 8 figures
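As a rough illustration of how a Frank-Wolfe strategy populates a support incrementally, the sketch below runs the classical Frank-Wolfe iteration on a toy problem: minimizing a squared distance to a target weight vector over the probability simplex. This is not the algorithm of the paper above; the objective, the function name `frank_wolfe_simplex`, and the step size 2/(k+2) are assumptions chosen for illustration. Each iteration moves toward a single vertex of the simplex, so at most one new atom enters the support per step.

```python
def frank_wolfe_simplex(target, n_iter=2000):
    """Toy Frank-Wolfe: minimize 0.5 * ||w - target||^2 over the simplex.

    Hypothetical example objective, not the Sinkhorn barycenter functional.
    """
    n = len(target)
    # Start from a single atom (a vertex of the simplex).
    w = [0.0] * n
    w[0] = 1.0
    for k in range(n_iter):
        # Gradient of the quadratic objective at the current iterate.
        grad = [wi - ti for wi, ti in zip(w, target)]
        # Linear minimization oracle: the best vertex is the coordinate
        # with the smallest gradient entry.
        s = min(range(n), key=lambda i: grad[i])
        # Standard open-loop step size.
        gamma = 2.0 / (k + 2.0)
        # Convex combination of the iterate and the chosen vertex; the
        # support grows by at most one atom per iteration.
        w = [(1.0 - gamma) * wi for wi in w]
        w[s] += gamma
    return w


weights = frank_wolfe_simplex([0.1, 0.2, 0.3, 0.4, 0.0, 0.0])
```

The iterate stays a probability vector by construction, since every update is a convex combination of two points of the simplex.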
Playing with Duality: An Overview of Recent Primal-Dual Approaches for Solving Large-Scale Optimization Problems
Optimization methods are at the core of many problems in signal/image
processing, computer vision, and machine learning. For a long time, it has been
recognized that looking at the dual of an optimization problem may drastically
simplify its solution. Deriving efficient strategies which jointly bring into
play the primal and the dual problems is however a more recent idea which has
generated many important new contributions in recent years. These novel
developments are grounded on recent advances in convex analysis, discrete
optimization, parallel processing, and non-smooth optimization with emphasis on
sparsity issues. In this paper, we aim at presenting the principles of
primal-dual approaches, while giving an overview of numerical methods which
have been proposed in different contexts. We show the benefits which can be
drawn from primal-dual algorithms both for solving large-scale convex
optimization problems and discrete ones, and we provide various application
examples to illustrate their usefulness.
Regularized Optimal Transport and the Rot Mover's Distance
This paper presents a unified framework for smooth convex regularization of
discrete optimal transport problems. In this context, the regularized optimal
transport turns out to be equivalent to a matrix nearness problem with respect
to Bregman divergences. Our framework thus naturally generalizes a previously
proposed regularization based on the Boltzmann-Shannon entropy related to the
Kullback-Leibler divergence, and solved with the Sinkhorn-Knopp algorithm. We
call the regularized optimal transport distance the rot mover's distance in
reference to the classical earth mover's distance. We develop two generic
schemes that we respectively call the alternate scaling algorithm and the
non-negative alternate scaling algorithm, to compute efficiently the
regularized optimal plans depending on whether the domain of the regularizer
lies within the non-negative orthant or not. These schemes are based on
Dykstra's algorithm with alternate Bregman projections, and further exploit the
Newton-Raphson method when applied to separable divergences. We enhance the
separable case with a sparse extension to deal with high data dimensions. We
also instantiate our proposed framework and discuss the inherent specificities
for well-known regularizers and statistical divergences in the machine learning
and information geometry communities. Finally, we demonstrate the merits of our
methods with experiments using synthetic data to illustrate the effect of
different regularizers and penalties on the solutions, as well as real-world
data for a pattern recognition application to audio scene classification.
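For readers unfamiliar with the Sinkhorn-Knopp algorithm mentioned above as the solver for the entropic (Boltzmann-Shannon) special case, here is a minimal sketch of its alternate scaling iteration. It covers only that classical case, not the general Bregman framework of the paper. The function name `sinkhorn`, the parameter names `eps` and `n_iter`, and the toy cost matrix are assumptions made for illustration.

```python
import math


def sinkhorn(a, b, cost, eps=0.1, n_iter=500):
    """Entropy-regularized transport plan between histograms a and b.

    Illustrative sketch of Sinkhorn-Knopp scaling; names are hypothetical.
    """
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-C / eps).
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Rescale rows so the plan's row sums match a.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Rescale columns so the plan's column sums match b.
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # The plan is diag(u) K diag(v).
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


plan = sinkhorn([0.5, 0.5], [0.25, 0.75], [[0.0, 1.0], [1.0, 0.0]], eps=0.5)
```

Because the column scaling is applied last, the column marginals of the returned plan are matched up to rounding, while the row marginals are matched only up to the convergence error of the iteration.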
Computation of Ground States of the Gross-Pitaevskii Functional via Riemannian Optimization
In this paper we combine concepts from Riemannian Optimization and the theory
of Sobolev gradients to derive a new conjugate gradient method for direct
minimization of the Gross-Pitaevskii energy functional with rotation. The
conservation of the number of particles constrains the minimizers to lie on a
manifold corresponding to the unit norm. The idea developed here is to
transform the original constrained optimization problem to an unconstrained
problem on this (spherical) Riemannian manifold, so that fast minimization
algorithms can be applied as alternatives to more standard constrained
formulations. First, we obtain Sobolev gradients using an equivalent definition
of an inner product which takes into account rotation. Then, the
Riemannian gradient (RG) steepest descent method is derived based on projected
gradients and retraction of an intermediate solution back to the constraint
manifold. Finally, we use the concept of the Riemannian vector transport to
propose a Riemannian conjugate gradient (RCG) method for this problem. It is
derived at the continuous level based on the "optimize-then-discretize"
paradigm instead of the usual "discretize-then-optimize" approach, as this
ensures robustness of the method when adaptive mesh refinement is performed in
computations. We evaluate various design choices inherent in the formulation of
the method and conclude with recommendations concerning selection of the best
options. Numerical tests demonstrate that the proposed RCG method outperforms
the simple gradient descent (RG) method in terms of rate of convergence. While
on simple problems a Newton-type method implemented in the Ipopt library
exhibits faster convergence than the RCG approach, the two methods perform
similarly on more complex problems requiring the use of mesh adaptation. At the
same time, the RCG approach has far fewer tunable parameters.
Comment: 28 pages, 13 figures
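The projected-gradient-plus-retraction idea described above can be sketched in finite dimensions. The example below is a simplified stand-in, not the method of the paper: it replaces the Gross-Pitaevskii energy with a quadratic form x^T A x on the unit sphere, uses the plain Euclidean inner product instead of a Sobolev one, and performs steepest descent rather than conjugate gradients. The matrix `A`, the step size, and all function names are assumptions for illustration.

```python
import math


def matvec(A, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def normalize(x):
    norm = math.sqrt(dot(x, x))
    return [xi / norm for xi in x]


def riemannian_gd(A, x0, step=0.1, n_iter=500):
    """Minimize x^T A x subject to ||x|| = 1 (toy stand-in energy)."""
    x = normalize(x0)
    for _ in range(n_iter):
        Ax = matvec(A, x)
        energy = dot(x, Ax)
        # Project the Euclidean gradient 2*A*x onto the tangent space of
        # the sphere at x by removing its component along x.
        rgrad = [2.0 * (axi - energy * xi) for axi, xi in zip(Ax, x)]
        # Take a descent step, then retract back onto the unit sphere
        # by normalization.
        x = normalize([xi - step * gi for xi, gi in zip(x, rgrad)])
    return x, dot(x, matvec(A, x))


state, energy = riemannian_gd([[2.0, 1.0], [1.0, 2.0]], [1.0, 0.3])
```

For this symmetric matrix the minimum of the constrained problem is the smallest eigenvalue, so the returned energy can be checked against a direct eigenvalue computation.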
Entropic Wasserstein Gradient Flows
This article details a novel numerical scheme to approximate gradient flows
for optimal transport (i.e. Wasserstein) metrics. These flows have proved
useful to tackle theoretically and numerically non-linear diffusion equations
that model for instance porous media or crowd evolutions. These gradient flows
define a suitable notion of weak solutions for these evolutions and they can be
approximated in a stable way using discrete flows. These discrete flows are
implicit Euler time-stepping schemes with respect to the Wasserstein metric. A bottleneck
of these approaches is the high computational load induced by the resolution of
each step. Indeed, this corresponds to the resolution of a convex optimization
problem involving a Wasserstein distance to the previous iterate. Following
several recent works on the approximation of Wasserstein distances, we consider
a discrete flow induced by an entropic regularization of the transportation
coupling. This entropic regularization allows one to trade the initial
Wasserstein fidelity term for a Kullback-Leibler divergence, which is easier to
deal with numerically. We show how KL proximal schemes, and in particular
Dykstra's algorithm, can be used to compute each step of the regularized flow.
The resulting algorithm is fast, parallelizable and versatile, because it
only requires multiplications by a Gibbs kernel. On Euclidean domains
discretized on a uniform grid, this corresponds to a linear filtering (for
instance a Gaussian filtering when the ground cost is the squared Euclidean distance) which
can be computed in nearly linear time. On more general domains, such as
(possibly non-convex) shapes or on manifolds discretized by a triangular mesh,
following a recently proposed numerical scheme for optimal transport, this
Gibbs kernel multiplication is approximated by a short-time heat diffusion
- …