363 research outputs found
Quantitative contraction rates for Markov chains on general state spaces
We investigate the problem of quantifying contraction coefficients of Markov
transition kernels in Kantorovich (Wasserstein) distances. For diffusion
processes, relatively precise quantitative bounds on contraction rates have
recently been derived by combining appropriate couplings with carefully
designed Kantorovich distances. In this paper, we partially carry over this
approach from diffusions to Markov chains. We derive quantitative lower bounds
on contraction rates for Markov chains on general state spaces that are
powerful if the dynamics is dominated by small local moves. For Markov chains
on $\mathbb{R}^d$ with isotropic transition kernels, the general bounds can be
used efficiently together with a coupling that combines maximal and reflection
coupling. The results are applied to Euler discretizations of stochastic
differential equations with non-globally contractive drifts, and to the
Metropolis adjusted Langevin algorithm for sampling from a class of probability
measures on high-dimensional state spaces that are not globally log-concave.
Comment: 39 pages
Practical bounds on the error of Bayesian posterior approximations: A nonasymptotic approach
Bayesian inference typically requires the computation of an approximation to
the posterior distribution. An important requirement for an approximate
Bayesian inference algorithm is to output high-accuracy posterior mean and
uncertainty estimates. Classical Monte Carlo methods, particularly Markov Chain
Monte Carlo, remain the gold standard for approximate Bayesian inference
because they have a robust finite-sample theory and reliable convergence
diagnostics. However, alternative methods, which are more scalable or apply to
problems where Markov Chain Monte Carlo cannot be used, lack the same
finite-data approximation theory and tools for evaluating their accuracy. In
this work, we develop a flexible new approach to bounding the error of mean and
uncertainty estimates of scalable inference algorithms. Our strategy is to
control the estimation errors in terms of Wasserstein distance, then bound the
Wasserstein distance via a generalized notion of Fisher distance. Unlike
computing the Wasserstein distance, which requires access to the normalized
posterior distribution, the Fisher distance is tractable to compute because it
requires access only to the gradient of the log posterior density. We
demonstrate the usefulness of our Fisher distance approach by deriving bounds
on the Wasserstein error of the Laplace approximation and Hilbert coresets. We
anticipate that our approach will be applicable to many other approximate
inference methods such as the integrated Laplace approximation, variational
inference, and approximate Bayesian computation.
Comment: 22 pages, 2 figures
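The abstract's point that a Fisher-type distance needs only gradients of log densities can be illustrated with a small sketch. This is not the paper's generalized Fisher distance, which the abstract does not define. It is the plain root-mean-square difference of score functions, with the expectation taken over samples from the approximation (an assumption). The function names `gaussian_score` and `fisher_distance` are hypothetical.

```python
import math
import random


def gaussian_score(x, mean, std):
    """Gradient in x of log N(x; mean, std^2).

    The normalizing constant of the density drops out of the gradient,
    which is why only an unnormalized log density is needed.
    """
    return -(x - mean) / std ** 2


def fisher_distance(score_p, score_q, samples):
    """Monte Carlo estimate of sqrt(E[(score_p(X) - score_q(X))^2]).

    `samples` are assumed to be draws from the approximating distribution.
    """
    squared = [(score_p(x) - score_q(x)) ** 2 for x in samples]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical example: target N(0, 1), approximation N(0.3, 1).
    draws = [random.gauss(0.3, 1.0) for _ in range(1000)]
    estimate = fisher_distance(
        lambda x: gaussian_score(x, 0.0, 1.0),
        lambda x: gaussian_score(x, 0.3, 1.0),
        draws,
    )
    print(estimate)
```

For two Gaussians with equal variance the score difference is constant, so the estimate equals the mean shift divided by the variance regardless of which samples are drawn.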
MEXIT: Maximal un-coupling times for stochastic processes
Classical coupling constructions arrange for copies of the same Markov
process started at two different initial states to become equal as soon
as possible. In this paper, we consider an alternative coupling framework in
which one seeks to arrange for two different Markov (or other
stochastic) processes to remain equal for as long as possible, when started in
the same state. We refer to this "un-coupling" or "maximal agreement"
construction as MEXIT, standing for "maximal exit". After highlighting
the importance of un-coupling arguments in a few key statistical and
probabilistic settings, we develop an explicit MEXIT construction for
stochastic processes in discrete time with countable state-space. This
construction is generalized to random processes on general state-space running
in continuous time, and then exemplified by discussion of MEXIT for Brownian
motions with two different constant drifts.
Comment: 28 pages
PASS-GLM: polynomial approximate sufficient statistics for scalable Bayesian GLM inference
Generalized linear models (GLMs) -- such as logistic regression, Poisson
regression, and robust regression -- provide interpretable models for diverse
data types. Probabilistic approaches, particularly Bayesian ones, allow
coherent estimates of uncertainty, incorporation of prior information, and
sharing of power across experiments via hierarchical models. In practice,
however, the approximate Bayesian methods necessary for inference have either
failed to scale to large data sets or failed to provide theoretical guarantees
on the quality of inference. We propose a new approach based on constructing
polynomial approximate sufficient statistics for GLMs (PASS-GLM). We
demonstrate that our method admits a simple algorithm as well as trivial
streaming and distributed extensions that do not compound error across
computations. We provide theoretical guarantees on the quality of point (MAP)
estimates, the approximate posterior, and posterior mean and uncertainty
estimates. We validate our approach empirically in the case of logistic
regression using a quadratic approximation and show competitive performance
with stochastic gradient descent, MCMC, and the Laplace approximation in terms
of speed and multiple measures of accuracy -- including on an advertising data
set with 40 million data points and 20,000 covariates.
Comment: In Proceedings of the 31st Annual Conference on Neural Information
Processing Systems (NIPS 2017). v3: corrected typos in Appendix
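The streaming and distributed claim in the PASS-GLM abstract can be made concrete with a small sketch for logistic regression with labels in {-1, +1}. It replaces the per-point log-likelihood phi(s) = -log(1 + exp(-s)) by a quadratic, so the data enter only through sums that combine by addition. The quadratic here is a least-squares fit on a grid over [-4, 4], which is an assumption made for simplicity and may differ from the paper's construction. All function names are hypothetical.

```python
import math


def logistic_phi(s):
    """Numerically stable phi(s) = -log(1 + exp(-s))."""
    if s >= 0:
        return -math.log1p(math.exp(-s))
    return s - math.log1p(math.exp(s))


def fit_quadratic(radius=4.0, num=201):
    """Least-squares fit of b0 + b1*s + b2*s^2 to phi on a symmetric grid.

    On a symmetric grid the odd moments vanish (up to rounding), so b1
    decouples from (b0, b2) in the normal equations.
    """
    grid = [-radius + 2.0 * radius * k / (num - 1) for k in range(num)]
    vals = [logistic_phi(s) for s in grid]
    n = len(grid)
    s2 = sum(s ** 2 for s in grid)
    s4 = sum(s ** 4 for s in grid)
    f0 = sum(vals)
    f1 = sum(s * v for s, v in zip(grid, vals))
    f2 = sum(s * s * v for s, v in zip(grid, vals))
    det = n * s4 - s2 * s2
    return (f0 * s4 - f2 * s2) / det, f1 / s2, (n * f2 - s2 * f0) / det


def summarize(xs, ys):
    """Approximate sufficient statistics: count, sum of y*x, sum of x x^T."""
    d = len(xs[0])
    t1 = [0.0] * d
    t2 = [[0.0] * d for _ in range(d)]
    for x, y in zip(xs, ys):
        for i in range(d):
            t1[i] += y * x[i]
            for j in range(d):
                t2[i][j] += x[i] * x[j]
    return len(xs), t1, t2


def merge(a, b):
    """Combine summaries of two batches by plain addition."""
    t1 = [u + v for u, v in zip(a[1], b[1])]
    t2 = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a[2], b[2])]
    return a[0] + b[0], t1, t2


def approx_loglik(theta, stats, coeffs):
    """Quadratic surrogate of the log-likelihood; uses y^2 = 1."""
    n, t1, t2 = stats
    b0, b1, b2 = coeffs
    d = len(theta)
    lin = sum(t1[i] * theta[i] for i in range(d))
    quad = sum(theta[i] * t2[i][j] * theta[j] for i in range(d) for j in range(d))
    return n * b0 + b1 * lin + b2 * quad


def exact_loglik(theta, xs, ys):
    """Exact logistic log-likelihood, for comparison with the surrogate."""
    return sum(
        logistic_phi(y * sum(a * b for a, b in zip(x, theta)))
        for x, y in zip(xs, ys)
    )
```

Because `merge` only adds summaries, processing the data in batches does not accumulate additional approximation error beyond that of the quadratic fit itself. The surrogate is only reasonable while the inner products stay within the fitted interval.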
Tensor approximation of generalized correlated diffusions and applications
This thesis documents my research activity conducted over the past three years at the Department of Statistical Science at University College London. My investigation is focused on functional-analytic methods applied to the characterization of generalized correlated Markov processes. The main objective of the research is to formalize the properties of such a class of stochastic processes when approximated in a tensor space. This led to the development of a new interpretation of the correlation among processes, which is exploited for the analysis of copula functions and their statistical properties.
- …