Gradient-free Hamiltonian Monte Carlo with Efficient Kernel Exponential Families
We propose Kernel Hamiltonian Monte Carlo (KMC), a gradient-free adaptive
MCMC algorithm based on Hamiltonian Monte Carlo (HMC). On target densities
where classical HMC is not an option due to intractable gradients, KMC
adaptively learns the target's gradient structure by fitting an exponential
family model in a Reproducing Kernel Hilbert Space. Computational costs are
reduced by two novel efficient approximations to this gradient. While being
asymptotically exact, KMC mimics HMC in terms of sampling efficiency, and
offers substantial mixing improvements over state-of-the-art gradient-free
samplers. We support our claims with experimental studies on both toy and
real-world applications, including Approximate Bayesian Computation and
exact-approximate MCMC.
Comment: 20 pages, 7 figures
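The core idea of driving Hamiltonian dynamics with a learned gradient surrogate can be illustrated with a much simpler stand-in than the paper's kernel exponential family fit: below, the gradient of the log of a Gaussian kernel density estimate fitted to past chain samples plays the role of the intractable target gradient inside a leapfrog integrator. All names, the bandwidth, and the KDE surrogate itself are illustrative assumptions, not the KMC estimator.

```python
import numpy as np

def kde_log_density_grad(x, samples, bandwidth=0.5):
    """Gradient of the log of a Gaussian KDE fitted to `samples`.

    Illustrative stand-in for the kernel exponential family surrogate:
    any smooth density estimate yields a gradient field that can drive
    Hamiltonian dynamics when the true target gradient is intractable.
    """
    diffs = samples - x                                   # (n, d)
    w = np.exp(-0.5 * np.sum(diffs**2, axis=1) / bandwidth**2)
    # grad log sum_i exp(-|x - x_i|^2 / 2h^2) = sum_i w_i (x_i - x) / (h^2 sum_i w_i)
    return (w @ diffs) / (bandwidth**2 * np.sum(w))

def leapfrog(x, p, grad_logp, step=0.1, n_steps=10):
    """One leapfrog trajectory using the surrogate gradient field."""
    p = p + 0.5 * step * grad_logp(x)
    for _ in range(n_steps - 1):
        x = x + step * p
        p = p + step * grad_logp(x)
    x = x + step * p
    p = p + 0.5 * step * grad_logp(x)
    return x, p

rng = np.random.default_rng(0)
samples = rng.normal(size=(500, 2))       # stand-in for the chain's history
grad = lambda x: kde_log_density_grad(x, samples)
x_new, p_new = leapfrog(np.array([1.0, -1.0]), rng.normal(size=2), grad)
```

A full KMC sampler would wrap such a trajectory in a Metropolis-Hastings accept step against the exact target, which is what keeps the method asymptotically exact despite the approximate gradient.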
Nonparametric Uncertainty Quantification for Stochastic Gradient Flows
This paper presents a nonparametric statistical modeling method for
quantifying uncertainty in stochastic gradient systems with isotropic
diffusion. The central idea is to apply the diffusion maps algorithm to a
training data set to produce a stochastic matrix whose generator is a discrete
approximation to the backward Kolmogorov operator of the underlying dynamics.
The eigenvectors of this stochastic matrix, which we will refer to as the
diffusion coordinates, are discrete approximations to the eigenfunctions of the
Kolmogorov operator and form an orthonormal basis for functions defined on the
data set. Using this basis, we consider the projection of three uncertainty
quantification (UQ) problems (prediction, filtering, and response) into the
diffusion coordinates. In these coordinates, the nonlinear prediction and
response problems reduce to solving systems of infinite-dimensional linear
ordinary differential equations. Similarly, the continuous-time nonlinear
filtering problem reduces to solving a system of infinite-dimensional linear
stochastic differential equations. Solving the UQ problems then reduces to
solving the corresponding truncated linear systems in finitely many diffusion
coordinates. By solving these systems we give a model-free algorithm for UQ on
gradient flow systems with isotropic diffusion. We numerically verify these
algorithms on a 1-dimensional linear gradient flow system where the analytic
solutions of the UQ problems are known. We also apply the algorithm to a
chaotically forced nonlinear gradient flow system which is known to be well
approximated as a stochastically forced gradient flow.
Comment: Find the associated videos at: http://personal.psu.edu/thb11
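The construction at the heart of this abstract, turning a training data set into a stochastic matrix whose eigenvectors approximate the Kolmogorov eigenfunctions, can be sketched in its most basic (unnormalised) form as follows. The bandwidth, the plain row-normalisation, and the toy data are assumptions; the paper uses refinements suited to gradient flows with isotropic diffusion.

```python
import numpy as np

def diffusion_coordinates(data, eps=0.5, n_coords=4):
    """Diffusion-maps sketch: Gaussian kernel on the data, normalised
    into a row-stochastic (Markov) matrix, whose leading eigenvectors
    are discrete approximations to Kolmogorov eigenfunctions.
    """
    d2 = np.sum((data[:, None, :] - data[None, :, :])**2, axis=-1)
    K = np.exp(-d2 / eps)
    P = K / K.sum(axis=1, keepdims=True)      # row-stochastic matrix
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)
    return vals.real[order][:n_coords], vecs.real[:, order][:, :n_coords]

rng = np.random.default_rng(1)
data = rng.normal(size=(200, 1))              # samples from a 1-d gradient flow
vals, coords = diffusion_coordinates(data)
```

Once the UQ problems are projected onto these coordinates, prediction and response reduce to linear ODE systems in the (truncated) eigenbasis, as the abstract describes.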
Scalable iterative methods for sampling from massive Gaussian random vectors
Sampling from Gaussian Markov random fields (GMRFs), that is, multivariate
Gaussian random vectors that are parameterised by the inverse of their
covariance matrix, is a fundamental problem in computational statistics. In
this paper, we show how we can exploit arbitrarily accurate approximations to
a GMRF to speed up Krylov subspace sampling methods. We also show that these
methods can be used when computing the normalising constant of a large
multivariate Gaussian distribution, which is needed for any likelihood-based
inference method. The method we derive is also applicable to other structured
Gaussian random vectors and, in particular, we show that when the precision
matrix is a perturbation of a (block) circulant matrix, it is still possible
to derive O(n log n) sampling schemes.
Comment: 17 pages, 4 figures
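For context, the direct baseline that Krylov methods aim to accelerate is Cholesky-based GMRF sampling: with precision Q = L Lᵀ and z ~ N(0, I), solving Lᵀx = z gives x with covariance Q⁻¹. The sketch below uses a toy dense tridiagonal precision; the paper's iterative methods replace the factorisation with matrix-vector products in a Krylov subspace.

```python
import numpy as np

def sample_gmrf(Q, rng):
    """Draw x ~ N(0, Q^{-1}) for a GMRF parameterised by precision Q.

    Direct Cholesky sampler: Q = L L^T, z ~ N(0, I), solve L^T x = z,
    so Cov(x) = L^{-T} L^{-1} = Q^{-1}. Krylov subspace samplers avoid
    this factorisation and need only products Q @ v.
    """
    L = np.linalg.cholesky(Q)
    z = rng.standard_normal(Q.shape[0])
    return np.linalg.solve(L.T, z)

# toy 1-d precision matrix: tridiagonal, hence Markov structure
n = 50
Q = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1) + 0.1 * np.eye(n)
rng = np.random.default_rng(2)
x = sample_gmrf(Q, rng)
```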
The Onsager--Machlup functional for data assimilation
When taking the model error into account in data assimilation, one needs to
evaluate the prior distribution represented by the Onsager--Machlup functional.
Through numerical experiments, this study clarifies how the prior distribution
should be incorporated into cost functions for discrete-time estimation
problems. Consistent with previous theoretical studies, the divergence of the
drift term is essential in weak-constraint 4D-Var (w4D-Var), but it is not
necessary in Markov chain Monte Carlo with the Euler scheme. Although the former
property may cause difficulties when implementing w4D-Var in large systems,
this paper proposes a new technique for estimating the divergence term and its
derivative.
Comment: Reprint from Nonlin. Processes Geophys. (ver. 5). 12 pages, 5 figures
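A minimal sketch of the discrete Onsager-Machlup cost for a scalar SDE dx = f(x) dt + σ dW makes the two ingredients concrete: the Euler-scheme misfit term used in w4D-Var, plus the divergence-of-drift correction the abstract highlights. The drift, noise level, and path below are illustrative assumptions, and real w4D-Var would add observation terms.

```python
import numpy as np

def om_cost(path, drift, div_drift, dt, sigma):
    """Discrete Onsager--Machlup cost for a 1-d SDE path.

    First term: weak-constraint 4D-Var model-error misfit under the
    Euler scheme. Second term: the divergence-of-drift correction
    that the paper shows is essential in w4D-Var.
    """
    x, x_next = path[:-1], path[1:]
    residual = x_next - x - drift(x) * dt
    misfit = np.sum(residual**2) / (2.0 * sigma**2 * dt)
    divergence = 0.5 * dt * np.sum(div_drift(x))
    return misfit + divergence

f = lambda x: -x                    # Ornstein-Uhlenbeck drift (illustrative)
divf = lambda x: -np.ones_like(x)   # d f / d x = -1
path = np.linspace(1.0, 0.0, 11)    # a candidate trajectory
cost = om_cost(path, f, divf, dt=0.1, sigma=0.5)
```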
Bayesian data assimilation in shape registration
In this paper we apply a Bayesian framework to the problem of geodesic curve matching. Given a template curve, the geodesic equations provide a mapping from initial conditions
for the conjugate momentum onto topologically equivalent shapes. Here, we aim to recover the well-defined posterior distribution on the initial momentum which gives rise to observed points on the target curve; this is achieved by explicitly including a reparameterisation in the formulation. Appropriate priors, informed by regularity results about the forward model, are chosen for the functions which together determine this field and the positions of the observation points: the initial momentum p0 and the reparameterisation vector field v. Having done this, we illustrate how Maximum Likelihood Estimators (MLEs) can be used to find regions of high posterior density, but also how recently developed MCMC methods on function spaces can be applied to characterise the whole of the posterior density. These illustrative examples also include scenarios where the posterior distribution is multimodal and irregular, leading us to the conclusion that knowledge of a state of globally maximal posterior density does not always give us the whole picture, and that full posterior sampling can give better quantification of likely states and of the overall uncertainty inherent in the problem.
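The function-space MCMC methods mentioned here are typically of the preconditioned Crank-Nicolson (pCN) type, whose proposal preserves the Gaussian prior so that the accept ratio involves only the likelihood. Below is a generic pCN sketch on a discretised field with a made-up likelihood; it is not the paper's posterior over initial momentum and reparameterisation, and all names and parameters are illustrative.

```python
import numpy as np

def pcn_step(u, log_lik, beta, rng):
    """One preconditioned Crank--Nicolson (pCN) move under a N(0, I) prior.

    Proposal u' = sqrt(1 - beta^2) u + beta xi, xi ~ prior, is
    prior-reversible, so the Metropolis ratio uses only the likelihood.
    """
    prop = np.sqrt(1.0 - beta**2) * u + beta * rng.standard_normal(u.shape)
    log_alpha = log_lik(prop) - log_lik(u)
    if np.log(rng.uniform()) < log_alpha:
        return prop, True
    return u, False

# toy setup: discretised field with a likelihood pulling it toward 1
rng = np.random.default_rng(3)
log_lik = lambda u: -0.5 * np.sum((u - 1.0)**2)
u = np.zeros(20)
accepts = 0
for _ in range(2000):
    u, ok = pcn_step(u, log_lik, beta=0.3, rng=rng)
    accepts += ok
```

A key property, relevant to the infinite-dimensional setting of shape registration, is that the pCN acceptance rate does not degenerate as the discretisation of the field is refined.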
Master of Science thesis
Recent developments have shown that restricted Boltzmann machines (RBMs) are useful for learning the features of a given dataset in an unsupervised manner. In the case of digital images, RBMs treat the image pixels as a set of real-valued random variables, disregarding their spatial layout. However, each image pixel is correlated with its neighboring pixels, and directly modeling this correlation might help learning. This thesis therefore proposes placing a Markov random field prior on the weights of the RBM model, designed to capture these correlations between neighboring pixels. We compared the test classification error of our model with that of a traditional RBM with no prior on the weights, and with RBMs with L1 and L2 regularization priors on the weights. We used the NIST dataset, which consists of images of handwritten digits, for our experiments.
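Structurally, adding a prior on the RBM weights just adds the prior's gradient to the contrastive-divergence update. The sketch below shows one CD-1 step with an L2 weight prior (one of the thesis's baselines); the proposed MRF prior would replace the decay term with a gradient coupling neighbouring-pixel weights. Biases are omitted for brevity, and all sizes and names are illustrative.

```python
import numpy as np

def cd1_update(W, v0, lr=0.1, weight_decay=0.01, rng=None):
    """One CD-1 step for a Bernoulli RBM with an L2 prior on the weights.

    The thesis's MRF prior slots in where `weight_decay * W` appears:
    the prior's gradient is simply added to the CD gradient.
    """
    rng = rng or np.random.default_rng()
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    h0 = sigmoid(v0 @ W)                           # hidden probabilities
    h_sample = (rng.uniform(size=h0.shape) < h0).astype(float)
    v1 = sigmoid(h_sample @ W.T)                   # reconstruction
    h1 = sigmoid(v1 @ W)
    grad = v0.T @ h0 - v1.T @ h1                   # positive - negative phase
    return W + lr * (grad / v0.shape[0] - weight_decay * W)

rng = np.random.default_rng(4)
v = (rng.uniform(size=(32, 64)) < 0.5).astype(float)  # toy binary "images"
W = 0.01 * rng.standard_normal((64, 16))
W = cd1_update(W, v, rng=rng)
```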