2 research outputs found
Statistical Windows in Testing for the Initial Distribution of a Reversible Markov Chain
We study the problem of hypothesis testing between two discrete
distributions, where we only have access to samples after the action of a known
reversible Markov chain, playing the role of noise. We derive
instance-dependent minimax rates for the sample complexity of this problem, and
show how its dependence on time is related to the spectral properties of the
Markov chain. We show that there exists a wide statistical window in terms of
sample complexity for hypothesis testing between different pairs of initial
distributions. We illustrate these results in several concrete examples.
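
A small numerical sketch can make the idea of a "statistical window" concrete. It is not taken from the paper: the chain (a lazy random walk on an 8-cycle), the two pairs of initial distributions, and the helper names `lazy_cycle_walk` and `tv_after` are all assumptions chosen for illustration, and total variation distance is used only as a rough proxy for how hard the testing problem is.

```python
import numpy as np

def lazy_cycle_walk(n):
    """Lazy simple random walk on an n-cycle (symmetric, hence reversible
    with respect to the uniform distribution). Illustrative choice only."""
    P = np.zeros((n, n))
    for i in range(n):
        P[i, i] = 0.5
        P[i, (i + 1) % n] += 0.25
        P[i, (i - 1) % n] += 0.25
    return P

def tv_after(mu, nu, P, t):
    """Total variation distance between mu P^t and nu P^t."""
    Pt = np.linalg.matrix_power(P, t)
    return 0.5 * np.abs((mu - nu) @ Pt).sum()

n = 8
P = lazy_cycle_walk(n)

# Pair A: point masses at two antipodal sites of the cycle.
mu_a = np.eye(n)[0]
nu_a = np.eye(n)[n // 2]

# Pair B: uniform on even sites versus uniform on odd sites.
mu_b = np.zeros(n); mu_b[0::2] = 2.0 / n
nu_b = np.zeros(n); nu_b[1::2] = 2.0 / n

for t in (0, 1, 5, 10):
    print(t, tv_after(mu_a, nu_a, P, t), tv_after(mu_b, nu_b, P, t))
```

Both pairs start at total variation distance 1. After a single step pair A is still at distance 1 (the supports remain disjoint), while pair B has collapsed to distance 0, because its difference lies along the eigenvector of this particular chain with eigenvalue 0. Pair A instead decays at a rate governed by the second-largest eigenvalue.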
Stochastic Optimization for Regularized Wasserstein Estimators
Optimal transport is a foundational problem in optimization that allows one to
compare probability distributions while taking into account geometric aspects.
Its optimal objective value, the Wasserstein distance, provides an important
loss between distributions that has been used in many applications throughout
machine learning and statistics. Recent algorithmic progress on this problem
and its regularized versions has made these tools increasingly popular.
However, existing techniques require solving an optimization problem to obtain
a single gradient of the loss, which slows down first-order methods for
minimizing a sum of losses, since these require many such gradient computations. In
this work, we introduce an algorithm to solve a regularized version of this
problem of Wasserstein estimators, with a time per step which is sublinear in
the natural dimensions of the problem. We introduce a dual formulation, and
optimize it with stochastic gradient steps that can be computed directly from
samples, without solving additional optimization problems at each step. Doing
so, the estimation and computation tasks are performed jointly. We show that
this algorithm can be extended to other tasks, including estimation of
Wasserstein barycenters. We provide theoretical guarantees and illustrate the
performance of our algorithm with experiments on synthetic data.
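
The abstract does not spell out the dual formulation or the regularizer, so the following is only a sketch of the general idea of sample-based dual steps, not the paper's algorithm. It substitutes the standard entropy-regularized optimal transport dual between two fixed discrete distributions, and every name and value in it (`a`, `b`, `C`, `eps`, `lr`, `dual_objective`) is an assumption made for illustration. Each step draws one index from each distribution and updates a single pair of dual variables, so its cost does not grow with the support sizes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: two uniform discrete distributions and a random cost.
n, m = 5, 6
a = np.full(n, 1.0 / n)
b = np.full(m, 1.0 / m)
C = rng.uniform(0.0, 1.0, size=(n, m))
eps = 0.5   # entropic regularization strength (assumed)
lr = 0.05   # constant step size (assumed)

def dual_objective(u, v):
    """Entropic OT dual: <a,u> + <b,v> - eps * sum_ij a_i b_j exp((u_i+v_j-C_ij)/eps)."""
    K = np.exp((u[:, None] + v[None, :] - C) / eps)
    return a @ u + b @ v - eps * (a @ K @ b)

u = np.zeros(n)
v = np.zeros(m)
start = dual_objective(u, v)

for _ in range(5000):
    # Draw one sample from each distribution.
    i = rng.choice(n, p=a)
    j = rng.choice(m, p=b)
    # Unbiased stochastic gradient of the dual with respect to (u_i, v_j).
    g = 1.0 - np.exp((u[i] + v[j] - C[i, j]) / eps)
    u[i] += lr * g
    v[j] += lr * g

end = dual_objective(u, v)
print(start, end)
```

The dual objective is concave, so stochastic gradient ascent with a small constant step size raises it from its starting value toward a neighborhood of the maximum. No inner optimization problem is solved at any step.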