Applications of optimal transport have recently gained remarkable attention
thanks to the computational advantages of entropic regularization. However, in
most situations the Sinkhorn approximation of the Wasserstein distance is
replaced by a regularized version that is less accurate but easy to
differentiate. In this work we characterize the differential properties of the
original Sinkhorn distance, proving that it enjoys the same smoothness as its
regularized version, and we explicitly provide an efficient algorithm to compute
its gradient. We show that this result benefits both theory and applications:
on the one hand, high-order smoothness confers statistical guarantees on learning
with Wasserstein approximations; on the other hand, the gradient formula allows
us to efficiently solve learning and optimization problems in practice.
Promising preliminary experiments complement our analysis.

Comment: 26 pages, 4 figures
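
To make the distinction drawn in the abstract concrete, the following is a minimal sketch of the two quantities being contrasted: the transport cost of the entropic plan (the "original" Sinkhorn distance) and the same cost with the entropy term kept (the regularized version). It is not the gradient algorithm of the paper. The function names, the toy histograms, the value of `lam`, and the use of plain Shannon entropy are all assumptions made for illustration.

```python
import numpy as np


def sinkhorn_plan(a, b, C, lam=0.5, n_iters=500):
    """Approximate the entropic optimal transport plan between histograms a and b."""
    K = np.exp(-C / lam)              # Gibbs kernel of the cost matrix
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)             # rescale to match column marginals
        u = a / (K @ v)               # rescale to match row marginals
    return u[:, None] * K * v[None, :]


def sharp_sinkhorn(T, C):
    """Transport cost <T, C> of the entropic plan, without the entropy term."""
    return float(np.sum(T * C))


def regularized_sinkhorn(T, C, lam):
    """<T, C> - lam * H(T); the Shannon entropy H is an assumed convention."""
    entropy = -np.sum(T * np.log(T))
    return float(np.sum(T * C) - lam * entropy)


# Toy example (hypothetical data): two histograms on a 5-point grid in [0, 1].
x = np.linspace(0.0, 1.0, 5)
C = (x[:, None] - x[None, :]) ** 2    # squared-distance ground cost
a = np.array([0.4, 0.3, 0.1, 0.1, 0.1])
b = np.array([0.1, 0.1, 0.1, 0.3, 0.4])
lam = 0.5

T = sinkhorn_plan(a, b, C, lam)
print("sharp:", sharp_sinkhorn(T, C))
print("regularized:", regularized_sinkhorn(T, C, lam))
```

Both values are computed from the same plan `T`; they differ only by the entropy term, which is why the regularized value is never larger than the sharp one under this convention.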

Ciliberto, Carlo

Luise, Giulia

Pontil, Massimiliano

Rudi, Alessandro

English

arXiv

UCL Discovery

Differential Properties of Sinkhorn Approximation for Learning with Wasserstein Distance

International audience

INRIA a CCSD electronic archive server

http://discovery.ucl.ac.uk/10073325/1/pre-published.pdf
