Gradient-free Hamiltonian Monte Carlo with Efficient Kernel Exponential Families
We propose Kernel Hamiltonian Monte Carlo (KMC), a gradient-free adaptive
MCMC algorithm based on Hamiltonian Monte Carlo (HMC). On target densities
where classical HMC is not an option due to intractable gradients, KMC
adaptively learns the target's gradient structure by fitting an exponential
family model in a Reproducing Kernel Hilbert Space. Computational costs are
reduced by two novel efficient approximations to this gradient. While being
asymptotically exact, KMC mimics HMC in terms of sampling efficiency, and
offers substantial mixing improvements over state-of-the-art gradient-free
samplers. We support our claims with experimental studies on both toy and
real-world applications, including Approximate Bayesian Computation and
exact-approximate MCMC.
Comment: 20 pages, 7 figures
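The core idea of driving Hamiltonian dynamics with a learned gradient surrogate can be illustrated with a much simpler stand-in than the paper's kernel exponential family fit: below, the gradient of the log of a Gaussian kernel density estimate fitted to past chain samples plays the role of the intractable target gradient inside a leapfrog integrator. All names, the bandwidth, and the KDE surrogate itself are illustrative assumptions, not the KMC estimator.

```python
import numpy as np

def kde_log_density_grad(x, samples, bandwidth=0.5):
    """Gradient of the log of a Gaussian KDE fitted to `samples`.

    Illustrative stand-in for the kernel exponential family surrogate:
    any smooth density estimate yields a gradient field that can drive
    Hamiltonian dynamics when the true target gradient is intractable.
    """
    diffs = samples - x                                   # (n, d)
    w = np.exp(-0.5 * np.sum(diffs**2, axis=1) / bandwidth**2)
    # grad log sum_i exp(-|x - x_i|^2 / 2h^2) = sum_i w_i (x_i - x) / (h^2 sum_i w_i)
    return (w @ diffs) / (bandwidth**2 * np.sum(w))

def leapfrog(x, p, grad_logp, step=0.1, n_steps=10):
    """One leapfrog trajectory using the surrogate gradient field."""
    p = p + 0.5 * step * grad_logp(x)
    for _ in range(n_steps - 1):
        x = x + step * p
        p = p + step * grad_logp(x)
    x = x + step * p
    p = p + 0.5 * step * grad_logp(x)
    return x, p

rng = np.random.default_rng(0)
samples = rng.normal(size=(500, 2))       # stand-in for the chain's history
grad = lambda x: kde_log_density_grad(x, samples)
x_new, p_new = leapfrog(np.array([1.0, -1.0]), rng.normal(size=2), grad)
```

A full KMC sampler would wrap such a trajectory in a Metropolis-Hastings accept step against the exact target, which is what keeps the method asymptotically exact despite the approximate gradient.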
Nonparametric Uncertainty Quantification for Stochastic Gradient Flows
This paper presents a nonparametric statistical modeling method for
quantifying uncertainty in stochastic gradient systems with isotropic
diffusion. The central idea is to apply the diffusion maps algorithm to a
training data set to produce a stochastic matrix whose generator is a discrete
approximation to the backward Kolmogorov operator of the underlying dynamics.
The eigenvectors of this stochastic matrix, which we will refer to as the
diffusion coordinates, are discrete approximations to the eigenfunctions of the
Kolmogorov operator and form an orthonormal basis for functions defined on the
data set. Using this basis, we consider the projection of three uncertainty
quantification (UQ) problems (prediction, filtering, and response) into the
diffusion coordinates. In these coordinates, the nonlinear prediction and
response problems reduce to solving systems of infinite-dimensional linear
ordinary differential equations. Similarly, the continuous-time nonlinear
filtering problem reduces to solving a system of infinite-dimensional linear
stochastic differential equations. Solving the UQ problems then reduces to
solving the corresponding truncated linear systems in finitely many diffusion
coordinates. By solving these systems we give a model-free algorithm for UQ on
gradient flow systems with isotropic diffusion. We numerically verify these
algorithms on a 1-dimensional linear gradient flow system where the analytic
solutions of the UQ problems are known. We also apply the algorithm to a
chaotically forced nonlinear gradient flow system which is known to be well
approximated as a stochastically forced gradient flow.
Comment: Find the associated videos at: http://personal.psu.edu/thb11
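The construction at the heart of this abstract, turning a training data set into a stochastic matrix whose eigenvectors approximate the Kolmogorov eigenfunctions, can be sketched in its most basic (unnormalised) form as follows. The bandwidth, the plain row-normalisation, and the toy data are assumptions; the paper uses refinements suited to gradient flows with isotropic diffusion.

```python
import numpy as np

def diffusion_coordinates(data, eps=0.5, n_coords=4):
    """Diffusion-maps sketch: Gaussian kernel on the data, normalised
    into a row-stochastic (Markov) matrix, whose leading eigenvectors
    are discrete approximations to Kolmogorov eigenfunctions.
    """
    d2 = np.sum((data[:, None, :] - data[None, :, :])**2, axis=-1)
    K = np.exp(-d2 / eps)
    P = K / K.sum(axis=1, keepdims=True)      # row-stochastic matrix
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)
    return vals.real[order][:n_coords], vecs.real[:, order][:, :n_coords]

rng = np.random.default_rng(1)
data = rng.normal(size=(200, 1))              # samples from a 1-d gradient flow
vals, coords = diffusion_coordinates(data)
```

Once the UQ problems are projected onto these coordinates, prediction and response reduce to linear ODE systems in the (truncated) eigenbasis, as the abstract describes.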
Scalable iterative methods for sampling from massive Gaussian random vectors
Sampling from Gaussian Markov random fields (GMRFs), that is, multivariate
Gaussian random vectors that are parameterised by the inverse of their
covariance matrix, is a fundamental problem in computational statistics. In
this paper, we show how we can exploit arbitrarily accurate approximations to
a GMRF to speed up Krylov subspace sampling methods. We also show that these
methods can be used when computing the normalising constant of a large
multivariate Gaussian distribution, which is needed for any likelihood-based
inference method. The method we derive is also applicable to other structured
Gaussian random vectors and, in particular, we show that when the precision
matrix is a perturbation of a (block) circulant matrix, it is still possible
to derive O(n log n) sampling schemes.
Comment: 17 pages, 4 figures
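For context, the direct baseline that Krylov methods aim to accelerate is Cholesky-based GMRF sampling: with precision Q = L Lᵀ and z ~ N(0, I), solving Lᵀx = z gives x with covariance Q⁻¹. The sketch below uses a toy dense tridiagonal precision; the paper's iterative methods replace the factorisation with matrix-vector products in a Krylov subspace.

```python
import numpy as np

def sample_gmrf(Q, rng):
    """Draw x ~ N(0, Q^{-1}) for a GMRF parameterised by precision Q.

    Direct Cholesky sampler: Q = L L^T, z ~ N(0, I), solve L^T x = z,
    so Cov(x) = L^{-T} L^{-1} = Q^{-1}. Krylov subspace samplers avoid
    this factorisation and need only products Q @ v.
    """
    L = np.linalg.cholesky(Q)
    z = rng.standard_normal(Q.shape[0])
    return np.linalg.solve(L.T, z)

# toy 1-d precision matrix: tridiagonal, hence Markov structure
n = 50
Q = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1) + 0.1 * np.eye(n)
rng = np.random.default_rng(2)
x = sample_gmrf(Q, rng)
```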
The Onsager--Machlup functional for data assimilation
When taking the model error into account in data assimilation, one needs to
evaluate the prior distribution represented by the Onsager--Machlup functional.
Through numerical experiments, this study clarifies how the prior distribution
should be incorporated into cost functions for discrete-time estimation
problems. Consistent with previous theoretical studies, the divergence of the
drift term is essential in weak-constraint 4D-Var (w4D-Var), but it is not
necessary in Markov chain Monte Carlo with the Euler scheme. Although the former
property may cause difficulties when implementing w4D-Var in large systems,
this paper proposes a new technique for estimating the divergence term and its
derivative.
Comment: Reprint from Nonlin. Processes Geophys. (ver. 5). 12 pages, 5 figures
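A minimal sketch of the discrete Onsager-Machlup cost for a scalar SDE dx = f(x) dt + σ dW makes the two ingredients concrete: the Euler-scheme misfit term used in w4D-Var, plus the divergence-of-drift correction the abstract highlights. The drift, noise level, and path below are illustrative assumptions, and real w4D-Var would add observation terms.

```python
import numpy as np

def om_cost(path, drift, div_drift, dt, sigma):
    """Discrete Onsager--Machlup cost for a 1-d SDE path.

    First term: weak-constraint 4D-Var model-error misfit under the
    Euler scheme. Second term: the divergence-of-drift correction
    that the paper shows is essential in w4D-Var.
    """
    x, x_next = path[:-1], path[1:]
    residual = x_next - x - drift(x) * dt
    misfit = np.sum(residual**2) / (2.0 * sigma**2 * dt)
    divergence = 0.5 * dt * np.sum(div_drift(x))
    return misfit + divergence

f = lambda x: -x                    # Ornstein-Uhlenbeck drift (illustrative)
divf = lambda x: -np.ones_like(x)   # d f / d x = -1
path = np.linspace(1.0, 0.0, 11)    # a candidate trajectory
cost = om_cost(path, f, divf, dt=0.1, sigma=0.5)
```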
Bayesian data assimilation in shape registration
In this paper we apply a Bayesian framework to the problem of geodesic curve matching. Given a template curve, the geodesic equations provide a mapping from initial conditions
for the conjugate momentum onto topologically equivalent shapes. Here, we aim to recover the well-defined posterior distribution on the initial momentum which gives rise to observed points on the target curve; this is achieved by explicitly including a reparameterisation in the formulation. Appropriate priors, informed by regularity results about the forward model, are chosen for the functions which together determine this field and the positions of the observation points: the initial momentum p0 and the reparameterisation vector field v. Having done this, we illustrate how Maximum Likelihood Estimators (MLEs) can be used to find regions of high posterior density, but also how recently developed MCMC methods on function spaces can be applied to characterise the whole of the posterior density. These illustrative examples also include scenarios where the posterior distribution is multimodal and irregular, leading us to the conclusion that knowledge of a state of globally maximal posterior density does not always give us the whole picture, and that full posterior sampling can give better quantification of likely states and of the overall uncertainty inherent in the problem.
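The function-space MCMC methods mentioned here are typically of the preconditioned Crank-Nicolson (pCN) type, whose proposal preserves the Gaussian prior so that the accept ratio involves only the likelihood. Below is a generic pCN sketch on a discretised field with a made-up likelihood; it is not the paper's posterior over initial momentum and reparameterisation, and all names and parameters are illustrative.

```python
import numpy as np

def pcn_step(u, log_lik, beta, rng):
    """One preconditioned Crank--Nicolson (pCN) move under a N(0, I) prior.

    Proposal u' = sqrt(1 - beta^2) u + beta xi, xi ~ prior, is
    prior-reversible, so the Metropolis ratio uses only the likelihood.
    """
    prop = np.sqrt(1.0 - beta**2) * u + beta * rng.standard_normal(u.shape)
    log_alpha = log_lik(prop) - log_lik(u)
    if np.log(rng.uniform()) < log_alpha:
        return prop, True
    return u, False

# toy setup: discretised field with a likelihood pulling it toward 1
rng = np.random.default_rng(3)
log_lik = lambda u: -0.5 * np.sum((u - 1.0)**2)
u = np.zeros(20)
accepts = 0
for _ in range(2000):
    u, ok = pcn_step(u, log_lik, beta=0.3, rng=rng)
    accepts += ok
```

A key property, relevant to the infinite-dimensional setting of shape registration, is that the pCN acceptance rate does not degenerate as the discretisation of the field is refined.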
Master of Science thesis
Recent developments have shown that restricted Boltzmann machines (RBMs) are useful for learning the features of a given dataset in an unsupervised manner. In the case of digital images, RBMs treat the image pixels as a set of real-valued random variables, disregarding their spatial layout. However, each image pixel is correlated with its neighboring pixels, and directly modeling this correlation might help learning. This thesis therefore proposes placing a Markov random field prior on the weights of the RBM model, designed to capture these correlations between neighboring pixels. We compared the test classification error of our model with that of a traditional RBM with no prior on the weights, and with RBMs with L1 and L2 regularization priors on the weights. We used the NIST dataset, which consists of images of handwritten digits, for our experiments.
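Structurally, adding a prior on the RBM weights just adds the prior's gradient to the contrastive-divergence update. The sketch below shows one CD-1 step with an L2 weight prior (one of the thesis's baselines); the proposed MRF prior would replace the decay term with a gradient coupling neighbouring-pixel weights. Biases are omitted for brevity, and all sizes and names are illustrative.

```python
import numpy as np

def cd1_update(W, v0, lr=0.1, weight_decay=0.01, rng=None):
    """One CD-1 step for a Bernoulli RBM with an L2 prior on the weights.

    The thesis's MRF prior slots in where `weight_decay * W` appears:
    the prior's gradient is simply added to the CD gradient.
    """
    rng = rng or np.random.default_rng()
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    h0 = sigmoid(v0 @ W)                           # hidden probabilities
    h_sample = (rng.uniform(size=h0.shape) < h0).astype(float)
    v1 = sigmoid(h_sample @ W.T)                   # reconstruction
    h1 = sigmoid(v1 @ W)
    grad = v0.T @ h0 - v1.T @ h1                   # positive - negative phase
    return W + lr * (grad / v0.shape[0] - weight_decay * W)

rng = np.random.default_rng(4)
v = (rng.uniform(size=(32, 64)) < 0.5).astype(float)  # toy binary "images"
W = 0.01 * rng.standard_normal((64, 16))
W = cd1_update(W, v, rng=rng)
```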