438 research outputs found

Efficient Bayesian Inference for Generalized Bradley-Terry Models

Author: Caron Francois
Doucet Arnaud
Publication date: 01/01/2010

The Bradley-Terry model is a popular approach to describe probabilities of the possible outcomes when elements of a set are repeatedly compared with one another in pairs. It has found many applications including animal behaviour, chess ranking and multiclass classification. Numerous extensions of the basic model have also been proposed in the literature, including models with ties, multiple comparisons, group comparisons and random graphs. From a computational point of view, Hunter (2004) has proposed efficient iterative MM (minorization-maximization) algorithms to perform maximum likelihood estimation for these generalized Bradley-Terry models, whereas Bayesian inference is typically performed using MCMC (Markov chain Monte Carlo) algorithms based on tailored Metropolis-Hastings (M-H) proposals. We show here that these MM algorithms can be reinterpreted as special instances of Expectation-Maximization (EM) algorithms associated with suitable sets of latent variables and propose some original extensions. These latent variables allow us to derive simple Gibbs samplers for Bayesian inference. We demonstrate experimentally the efficiency of these algorithms on a variety of applications.
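As a rough illustration of the MM iteration the abstract attributes to Hunter (2004), here is a minimal sketch for the basic pairwise model only (no ties, groups, or priors). The function name `bradley_terry_mm`, the `wins` matrix layout, and the iteration count are assumptions made for this example; it is not the authors' EM or Gibbs sampler.

```python
def bradley_terry_mm(wins, iters=500):
    """MM iteration for the basic Bradley-Terry model (illustrative sketch).

    wins[i][j] is the number of times item i beat item j (hypothetical layout).
    Returns skill parameters normalized to sum to 1.
    """
    k = len(wins)
    skill = [1.0 / k] * k
    for _ in range(iters):
        updated = []
        for i in range(k):
            # Total wins of item i against all opponents.
            w_i = sum(wins[i][j] for j in range(k) if j != i)
            # Comparisons between i and j, weighted by the current skills.
            denom = sum(
                (wins[i][j] + wins[j][i]) / (skill[i] + skill[j])
                for j in range(k)
                if j != i
            )
            updated.append(w_i / denom)
        # The likelihood is scale-invariant, so renormalizing is harmless.
        total = sum(updated)
        skill = [s / total for s in updated]
    return skill


if __name__ == "__main__":
    # Item 0 beat item 1 three times and lost once.
    print(bradley_terry_mm([[0, 3], [1, 0]]))
```

With two items the estimate reduces to the empirical win fraction, which gives a quick sanity check. The sketch assumes every item has at least one win and one loss, so that the maximum likelihood estimate exists.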

arXiv.org e-Print Archive

Oxford University Research Archive

Autoregressive Kernels For Time Series

Author: Cuturi Marco
Doucet Arnaud
Publication date: 01/01/2011

We propose in this work a new family of kernels for variable-length time series. Our work builds upon the vector autoregressive (VAR) model for multivariate stochastic processes: given a multivariate time series x, we consider the likelihood function p_{\theta}(x) of different parameters \theta in the VAR model as features to describe x. To compare two time series x and x', we form the product of their features p_{\theta}(x) p_{\theta}(x'), which is integrated out with respect to \theta using a matrix normal-inverse Wishart prior. Among other properties, this kernel can be easily computed when the dimension d of the time series is much larger than the lengths of the considered time series x and x'. It can also be generalized to time series taking values in arbitrary state spaces, as long as the state space itself is endowed with a kernel \kappa. In that case, the kernel between x and x' is a function of the Gram matrices produced by \kappa on observations and subsequences of observations enumerated in x and x'. We describe a computationally efficient implementation of this generalization that uses low-rank matrix factorization techniques. These kernels are compared to other known kernels using a set of benchmark classification tasks carried out with support vector machines.

arXiv.org e-Print Archive

CiteSeerX

Asymptotic Bias of Stochastic Gradient Search

Author: Doucet Arnaud
Tadic Vladislav B.
Publication date: 01/01/2017

The asymptotic behavior of the stochastic gradient algorithm with a biased gradient estimator is analyzed. Relying on arguments based on dynamical systems theory (chain-recurrence) and differential geometry (Yomdin theorem and Lojasiewicz inequality), tight bounds on the asymptotic bias of the iterates generated by such an algorithm are derived. The obtained results hold under mild conditions and cover a broad class of high-dimensional nonlinear algorithms. Using these results, the asymptotic properties of policy-gradient (reinforcement) learning and adaptive population Monte Carlo sampling are studied. Relying on the same results, the asymptotic behavior of recursive maximum split-likelihood estimation in hidden Markov models is analyzed, too. Comment: arXiv admin note: text overlap with arXiv:0907.102
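To make the notion of asymptotic bias concrete, the following toy sketch runs stochastic gradient descent on f(x) = x^2 / 2 with a gradient estimator that carries a constant offset. The objective, the step sizes 1/(n+1), the noise level, and the name `biased_sgd` are all assumptions chosen for illustration; the setting is far simpler than the one analyzed in the paper.

```python
import random


def biased_sgd(bias, steps=100_000, seed=0):
    """Run SGD on f(x) = x**2 / 2 with a biased, noisy gradient estimate.

    The exact gradient is x; the estimator returns x + bias + noise.
    With step sizes 1 / (n + 1) the iterate settles near -bias rather than
    at the true minimizer 0, so the asymptotic bias has magnitude |bias|.
    """
    rng = random.Random(seed)
    x = 5.0  # arbitrary starting point
    for n in range(steps):
        grad_estimate = x + bias + rng.gauss(0.0, 1.0)
        x -= grad_estimate / (n + 1)
    return x


if __name__ == "__main__":
    print(biased_sgd(bias=0.3))  # close to -0.3
    print(biased_sgd(bias=0.0))  # close to the true minimizer 0
```

In this one-dimensional quadratic case the limit point can be computed by hand, which is what makes it useful as a check; the paper's bounds address settings where no such closed form is available.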

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

Explore Bristol Research

Replica Conditional Sequential Monte Carlo

Author: Doucet Arnaud
Shestopaloff Alexander Y.
Publication date: 13/05/2019

We propose a Markov chain Monte Carlo (MCMC) scheme to perform state inference in non-linear non-Gaussian state-space models. Current state-of-the-art methods to address this problem rely on particle MCMC techniques and their variants, such as the iterated conditional Sequential Monte Carlo (cSMC) scheme, which uses a Sequential Monte Carlo (SMC) type proposal within MCMC. A deficiency of standard SMC proposals is that they only use observations up to time t to propose states at time t when an entire observation sequence is available. More sophisticated SMC based on lookahead techniques could be used, but they can be difficult to put into practice. We propose here replica cSMC, where we build SMC proposals for one replica using information from the entire observation sequence by conditioning on the states of the other replicas. This approach is easily parallelizable and we demonstrate its excellent empirical performance when compared to the standard iterated cSMC scheme at fixed computational complexity. Comment: To appear in Proceedings of ICML '1

arXiv.org e-Print Archive

Edinburgh Research Explorer

Interacting Markov chain Monte Carlo methods for solving nonlinear measure-valued equations

Author: Del Moral Pierre
Doucet Arnaud
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2010

We present a new class of interacting Markov chain Monte Carlo algorithms for solving numerically discrete-time measure-valued equations. The associated stochastic processes belong to the class of self-interacting Markov chains. In contrast to traditional Markov chains, their time evolutions depend on the occupation measure of their past values. This general methodology allows us to provide a natural way to sample from a sequence of target probability measures of increasing complexity. We develop an original theoretical analysis to analyze the behavior of these iterative algorithms which relies on measure-valued processes and semigroup techniques. We establish a variety of convergence results including exponential estimates and a uniform convergence theorem with respect to the number of target distributions. We also illustrate these algorithms in the context of Feynman-Kac distribution flows. Comment: Published in at http://dx.doi.org/10.1214/09-AAP628 the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Oxford University Research Archive

Oskar Bordeaux

Analyticity of Entropy Rates of Continuous-State Hidden Markov Models

Author: Doucet Arnaud
Tadic Vladislav Z. B.
Publication date: 01/01/2019

The analyticity of the entropy and relative entropy rates of continuous-state hidden Markov models is studied here. Using the analytic continuation principle and the stability properties of the optimal filter, the analyticity of these rates is shown for analytically parameterized models. The obtained results hold under relatively mild conditions and cover several classes of hidden Markov models met in practice. These results are relevant for several (theoretically and practically) important problems arising in statistical inference, system identification and information theory.

arXiv.org e-Print Archive

Oxford University Research Archive

Explore Bristol Research

Gibbs flow for approximate transport with applications to Bayesian computation

Author: Doucet Arnaud
Heng Jeremy
Pokern Yvo
Publication date: 21/01/2020

Let \pi_{0} and \pi_{1} be two distributions on the Borel space (\mathbb{R}^{d},\mathcal{B}(\mathbb{R}^{d})). Any measurable function T:\mathbb{R}^{d}\rightarrow\mathbb{R}^{d} such that Y=T(X)\sim\pi_{1} if X\sim\pi_{0} is called a transport map from \pi_{0} to \pi_{1}. For any \pi_{0} and \pi_{1}, if one could obtain an analytical expression for a transport map from \pi_{0} to \pi_{1}, then this could be straightforwardly applied to sample from any distribution. One would map draws from an easy-to-sample distribution \pi_{0} to the target distribution \pi_{1} using this transport map. Although it is usually impossible to obtain an explicit transport map for complex target distributions, we show here how to build a tractable approximation of a novel transport map. This is achieved by moving samples from \pi_{0} using an ordinary differential equation with a velocity field that depends on the full conditional distributions of the target. Even when this ordinary differential equation is time-discretized and the full conditional distributions are numerically approximated, the resulting distribution of mapped samples can be efficiently evaluated and used as a proposal within sequential Monte Carlo samplers. We demonstrate significant gains over state-of-the-art sequential Monte Carlo samplers at a fixed computational complexity on a variety of applications. Comment: Significantly revised with new methodology and numerical example
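The definition of a transport map can be illustrated with a one-dimensional case where the map is known in closed form: the inverse CDF of an exponential distribution transports Uniform(0, 1) draws to Exponential(rate) draws. This is a textbook example chosen for illustration, with hypothetical function names; it is not the Gibbs flow construction described in the abstract.

```python
import math
import random


def transport_uniform_to_exponential(u, rate):
    """Map u ~ Uniform(0, 1) to an Exponential(rate) draw via the inverse CDF."""
    return -math.log(1.0 - u) / rate


def sample_target(n, rate, seed=0):
    """Draw from the easy distribution pi_0 and push each draw through T."""
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so 1 - u stays strictly positive.
    return [transport_uniform_to_exponential(rng.random(), rate) for _ in range(n)]


if __name__ == "__main__":
    draws = sample_target(100_000, rate=2.0)
    print(sum(draws) / len(draws))  # should be near the target mean 1 / rate
```

In higher dimensions such explicit maps are rarely available, which is the gap the abstract's approximate, ODE-based construction is meant to address.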

arXiv.org e-Print Archive

Crossref

UCL Discovery

Oxford University Research Archive