Regularized Wasserstein Means for Aligning Distributional Data
We propose to align distributional data from the perspective of Wasserstein
means. We raise the problem of regularizing Wasserstein means and propose
several regularization terms, each tailored to a different problem. Our
formulation is based on variational transportation, which distributes a sparse
discrete measure into the target domain. The resulting sparse representation
captures the desired properties of the domain well while reducing the mapping
cost. We demonstrate the scalability and robustness of our method with examples
in domain adaptation, point set registration, and skeleton layout.
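Concretely, the fixed point underlying Wasserstein means alternates between transporting a sparse discrete measure onto the target samples and moving its support to reduce the mapping cost. Below is a minimal Python sketch of that unregularized core, assuming the POT library and squared-Euclidean cost; the function name and parameters are illustrative, and the paper's regularization terms are omitted.

```python
import numpy as np
import ot  # POT: Python Optimal Transport


def wasserstein_means(Y, k=10, iters=20, seed=0):
    """Lloyd-style fixed-point sketch of (unregularized) Wasserstein means.

    Y: (n, d) array of samples from the target domain.
    Returns the (k, d) support of a sparse discrete measure.
    """
    rng = np.random.default_rng(seed)
    n, _ = Y.shape
    X = Y[rng.choice(n, size=k, replace=False)].copy()  # init atoms from data
    a = np.full(k, 1.0 / k)   # uniform weights on the sparse measure
    b = np.full(n, 1.0 / n)   # empirical target measure
    for _ in range(iters):
        M = ot.dist(X, Y)     # squared-Euclidean cost matrix, shape (k, n)
        P = ot.emd(a, b, M)   # optimal transport plan between the measures
        X = (P @ Y) / P.sum(axis=1, keepdims=True)  # barycentric support update
    return X
```

The paper's regularization terms would enter as penalties on the support update; this sketch keeps only the transport-cost objective.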
Sliced Wasserstein Distance for Learning Gaussian Mixture Models
Gaussian mixture models (GMMs) are powerful parametric tools with many
applications in machine learning and computer vision. Expectation maximization
(EM) is the most popular algorithm for estimating the GMM parameters. However,
EM guarantees only convergence to a stationary point of the log-likelihood
function, which could be arbitrarily worse than the optimal solution. Inspired
by the relationship between the negative log-likelihood function and the
Kullback-Leibler (KL) divergence, we propose an alternative formulation for
estimating the GMM parameters using the sliced Wasserstein distance, which
gives rise to a new algorithm. Specifically, we propose minimizing the sliced
Wasserstein distance between the mixture model and the data distribution with
respect to the GMM parameters. In contrast to the KL divergence, the energy
landscape of the sliced Wasserstein distance is better behaved and therefore
more suitable for a stochastic gradient descent scheme for obtaining the
optimal GMM parameters. We show that our formulation results in parameter
estimates that are more robust to random initializations, and we demonstrate
that it can estimate high-dimensional data distributions more faithfully than
the EM algorithm.
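For intuition about why this objective admits stochastic gradients: the sliced Wasserstein distance projects both sample sets onto random directions, where the 2-Wasserstein distance reduces to comparing sorted projections, and the resulting Monte Carlo estimate is differentiable. Below is a minimal PyTorch sketch under simplifying assumptions not made in the paper (uniform mixture weights, diagonal covariances, reparameterized sampling); all names and hyperparameters are illustrative.

```python
import torch


def sliced_wasserstein(x, y, n_proj=64):
    """Monte Carlo sliced 2-Wasserstein distance between equally sized
    point clouds x, y of shape (n, d)."""
    theta = torch.randn(n_proj, x.shape[1])
    theta = theta / theta.norm(dim=1, keepdim=True)  # random unit directions
    px = torch.sort(x @ theta.T, dim=0).values       # sorted 1-D projections
    py = torch.sort(y @ theta.T, dim=0).values
    return ((px - py) ** 2).mean()


def fit_gmm_sw(data, k=3, steps=500, batch=256, lr=0.05):
    """Fit means/scales of a uniform-weight, diagonal GMM by SGD on the
    sliced Wasserstein distance (a simplified stand-in for the paper's
    full parameter estimation). data: (n, d) tensor."""
    d = data.shape[1]
    mu = data[torch.randperm(len(data))[:k]].clone().requires_grad_(True)
    log_sigma = torch.zeros(k, d, requires_grad=True)
    opt = torch.optim.Adam([mu, log_sigma], lr=lr)
    for _ in range(steps):
        idx = torch.randint(0, k, (batch,))          # uniform component choice
        eps = torch.randn(batch, d)
        fake = mu[idx] + eps * log_sigma.exp()[idx]  # reparameterized samples
        real = data[torch.randint(0, len(data), (batch,))]
        loss = sliced_wasserstein(fake, real)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return mu.detach(), log_sigma.exp().detach()
```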
Probabilistic Multilevel Clustering via Composite Transportation Distance
We propose a novel probabilistic approach to multilevel clustering problems
based on composite transportation distance, which is a variant of
transportation distance where the underlying metric is Kullback-Leibler
divergence. Our method involves solving a joint optimization problem over
spaces of probability measures to simultaneously discover grouping structures
within groups and among groups. By exploiting the connection of our method to
the problem of finding composite transportation barycenters, we develop fast
and efficient optimization algorithms even for potentially large-scale
multilevel datasets. Finally, we present experimental results with both
synthetic and real data to demonstrate the efficiency and scalability of the
proposed approach.
Comment: 25 pages, 3 figures
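As a concrete reading of the distance itself: the composite transportation distance couples the mixture weights of two models under a ground cost given by KL divergences between components. Here is a minimal Python sketch of that single building block for two Gaussian mixtures, assuming the POT solver and the closed-form Gaussian KL; the function names are illustrative, and this is not the paper's multilevel clustering algorithm.

```python
import numpy as np
import ot  # POT: Python Optimal Transport


def kl_gauss(m0, S0, m1, S1):
    """Closed-form KL( N(m0, S0) || N(m1, S1) ) for full covariances."""
    d = len(m0)
    S1_inv = np.linalg.inv(S1)
    diff = m1 - m0
    return 0.5 * (np.trace(S1_inv @ S0) + diff @ S1_inv @ diff - d
                  + np.log(np.linalg.det(S1) / np.linalg.det(S0)))


def composite_transport(w0, comps0, w1, comps1):
    """Transportation distance between two Gaussian mixtures with the KL
    divergence between components as the ground metric.

    w0, w1: mixture weights; comps0, comps1: lists of (mean, cov) pairs."""
    M = np.array([[kl_gauss(m0, S0, m1, S1) for (m1, S1) in comps1]
                  for (m0, S0) in comps0])          # KL ground-cost matrix
    P = ot.emd(np.asarray(w0), np.asarray(w1), M)   # optimal coupling of weights
    return float((P * M).sum())
```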
Exponential Convergence of Sinkhorn Under Regularization Scheduling
In 2013, Cuturi [Cut13] introduced the Sinkhorn algorithm for matrix scaling
as a method to compute solutions to regularized optimal transport problems. In
this paper, aiming at a better convergence rate for high-accuracy solutions, we
study the Sinkhorn algorithm under regularization scheduling and modify it with
a mechanism that adaptively doubles the regularization parameter periodically.
We prove that this modified version of Sinkhorn converges exponentially, with
an iteration complexity depending on log(1/ε) instead of poly(1/ε) from
previous analyses [Cut13][ANWR17], for optimal transport problems with
integral supply and demand. Furthermore, with cost and capacity scaling
procedures, the general optimal transport problem can be solved with a
logarithmic dependence on 1/ε as well.
Comment: ACDA23, 13 pages
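The scheduling mechanism is straightforward to sketch: run Sinkhorn scalings at a fixed regularization level, then double the regularization parameter and warm-start from rescaled potentials. Below is a minimal log-domain Python sketch; the fixed inner iteration count and the warm-start rescaling are simplifying assumptions, not the paper's adaptive criterion.

```python
import numpy as np
from scipy.special import logsumexp


def sinkhorn_scheduled(a, b, C, eta0=1.0, doublings=10, inner=50):
    """Log-domain Sinkhorn with a periodically doubled regularization
    parameter eta; the transport plan is P_ij = exp(f_i + g_j - eta * C_ij)."""
    f, g = np.zeros(len(a)), np.zeros(len(b))
    eta = eta0
    for level in range(doublings):
        if level > 0:
            # Double eta; scaling f and g by 2 keeps the dual potentials
            # f/eta and g/eta fixed, warm-starting the new level.
            eta *= 2.0
            f *= 2.0
            g *= 2.0
        for _ in range(inner):  # standard Sinkhorn scalings at fixed eta
            f = np.log(a) - logsumexp(g[None, :] - eta * C, axis=1)
            g = np.log(b) - logsumexp(f[:, None] - eta * C, axis=0)
    return np.exp(f[:, None] + g[None, :] - eta * C)
```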