Modelling Spatial Compositional Data: Reconstructions of past land cover and uncertainties
In this paper, we construct a hierarchical model for spatial compositional
data, which is used to reconstruct past land-cover compositions (in terms of
coniferous forest, broadleaved forest, and unforested/open land) for five time
periods during the past years over Europe. The model consists of a
Gaussian Markov Random Field (GMRF) with Dirichlet observations. A block
updated Markov chain Monte Carlo (MCMC), including an adaptive Metropolis
adjusted Langevin step, is used to estimate model parameters. The sparse
precision matrix in the GMRF provides computational advantages leading to a
fast MCMC algorithm. Reconstructions are obtained by combining pollen-based
estimates of vegetation cover at a limited number of locations with scenarios
of past deforestation and output from a dynamic vegetation model. To evaluate
uncertainties in the predictions a novel way of constructing joint confidence
regions for the entire composition at each prediction location is proposed. The
hierarchical model's ability to reconstruct past land cover is evaluated
through cross validation for all time periods, and by comparing reconstructions
for the recent past to a present day European forest map. The evaluation
results are promising and the model is able to capture known structures in past
land-cover compositions.
Bayesian Spatial Binary Regression for Label Fusion in Structural Neuroimaging
Many analyses of neuroimaging data involve studying one or more regions of
interest (ROIs) in a brain image. In order to do so, each ROI must first be
identified. Since every brain is unique, the location, size, and shape of each
ROI varies across subjects. Thus, each ROI in a brain image must either be
manually identified or (semi-) automatically delineated, a task referred to as
segmentation. Automatic segmentation often involves mapping a previously
manually segmented image to a new brain image and propagating the labels to
obtain an estimate of where each ROI is located in the new image. A more recent
approach to this problem is to propagate labels from multiple manually
segmented atlases and combine the results using a process known as label
fusion. To date, most label fusion algorithms either employ voting procedures
or impose prior structure and subsequently find the maximum a posteriori
estimator (i.e., the posterior mode) through optimization. We propose using a
fully Bayesian spatial regression model for label fusion that facilitates
direct incorporation of covariate information while making accessible the
entire posterior distribution. We discuss the implementation of our model via
Markov chain Monte Carlo and illustrate the procedure through both simulation
and application to segmentation of the hippocampus, an anatomical structure
known to be associated with Alzheimer's disease. Comment: 24 pages, 10 figures
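The voting procedures mentioned in the abstract above can be illustrated with a minimal sketch of majority-vote label fusion. This is not the Bayesian spatial regression model the authors propose; it is only the simple baseline they contrast against. The function name `majority_vote_fusion`, the flattened per-voxel label lists, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote_fusion(atlas_labels):
    """Fuse per-voxel labels propagated from several atlases by majority vote.

    atlas_labels: list of equally long label sequences, one per atlas
    (a hypothetical flattened representation of the propagated segmentations).
    Ties are broken in favour of the smallest label (an arbitrary choice).
    """
    if not atlas_labels:
        raise ValueError("need at least one atlas")
    n_voxels = len(atlas_labels[0])
    if any(len(labels) != n_voxels for labels in atlas_labels):
        raise ValueError("all atlases must label the same voxels")

    fused = []
    for votes in zip(*atlas_labels):  # one tuple of votes per voxel
        counts = Counter(votes)
        best = max(counts.values())
        fused.append(min(label for label, c in counts.items() if c == best))
    return fused


# Three hypothetical atlases labelling five voxels (1 = inside the ROI).
atlases = [
    [0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 1, 0, 0],
]
fused = majority_vote_fusion(atlases)
print(fused)
```

A vote of this kind returns a single label per voxel and no measure of uncertainty, which is the limitation a fully Bayesian approach with an accessible posterior is meant to address.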
Arriving on time: estimating travel time distributions on large-scale road networks
Most optimal routing problems focus on minimizing travel time or distance
traveled. Oftentimes, a more useful objective is to maximize the probability of
on-time arrival, which requires statistical distributions of travel times,
rather than just mean values. We propose a method to estimate travel time
distributions on large-scale road networks, using probe vehicle data collected
from GPS. We present a framework that handles large volumes of input data and
scales linearly with the size of the network. Leveraging the planar topology of
the graph, the method efficiently computes the time correlations between
neighboring streets. First, raw probe vehicle traces are compressed into pairs
of travel times and number of stops for each traversed road segment using a
"stop-and-go" algorithm developed for this work. The compressed data is then
used as input for training a path travel time model, which couples a Markov
model along with a Gaussian Markov random field. Finally, scalable inference
algorithms are developed for obtaining path travel time distributions from the
composite MM-GMRF model. We illustrate the accuracy and scalability of our
model on a 505,000 road link network spanning the San Francisco Bay Area.
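The distinction drawn in the abstract above, between minimizing mean travel time and maximizing the probability of on-time arrival, can be made concrete with a small sketch. The travel-time samples, the route names, and the helper `on_time_probability` are hypothetical and are not taken from the paper; the sketch uses plain empirical frequencies rather than the MM-GMRF model.

```python
from statistics import mean


def on_time_probability(samples, deadline):
    """Empirical probability that a travel time does not exceed the deadline."""
    return sum(t <= deadline for t in samples) / len(samples)


# Hypothetical travel-time samples in minutes (illustrative only).
routes = {
    "A": [18, 19, 20, 21, 45, 19, 20, 46, 18, 20],  # usually fast, sometimes jammed
    "B": [26, 27, 25, 28, 26, 27, 29, 26, 27, 28],  # slower but consistent
}
deadline = 30  # minutes available for the trip

best_by_mean = min(routes, key=lambda r: mean(routes[r]))
best_on_time = max(routes, key=lambda r: on_time_probability(routes[r], deadline))

for name, samples in routes.items():
    print(name, mean(samples), on_time_probability(samples, deadline))
print("lowest mean:", best_by_mean, "| most reliable:", best_on_time)
```

With these made-up numbers the two criteria disagree: route A has the lower mean, but route B is the one more likely to meet the deadline, which is why a full travel-time distribution is needed rather than a mean alone.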
On dimension reduction in Gaussian filters
A priori dimension reduction is a widely adopted technique for reducing the
computational complexity of stationary inverse problems. In this setting, the
solution of an inverse problem is parameterized by a low-dimensional basis that
is often obtained from the truncated Karhunen-Loève expansion of the prior
distribution. For high-dimensional inverse problems equipped with smoothing
priors, this technique can lead to drastic reductions in parameter dimension
and significant computational savings.
In this paper, we extend the concept of a priori dimension reduction to
non-stationary inverse problems, in which the goal is to sequentially infer the
state of a dynamical system. Our approach proceeds in an offline-online
fashion. We first identify a low-dimensional subspace in the state space before
solving the inverse problem (the offline phase), using either the method of
"snapshots" or regularized covariance estimation. Then this subspace is used to
reduce the computational complexity of various filtering algorithms - including
the Kalman filter, extended Kalman filter, and ensemble Kalman filter - within
a novel subspace-constrained Bayesian prediction-and-update procedure (the
online phase). We demonstrate the performance of our new dimension reduction
approach on various numerical examples. In some test cases, our approach
reduces the dimensionality of the original problem by orders of magnitude and
yields up to two orders of magnitude in computational savings.
spam: A Sparse Matrix R Package with Emphasis on MCMC Methods for Gaussian Markov Random Fields
spam is an R package for sparse matrix algebra with emphasis on a Cholesky factorization of sparse positive definite matrices. The implementation of spam is based on the competing philosophical maxims to be competitively fast compared to existing tools and to be easy to use, modify and extend. The first is addressed by using fast Fortran routines and the second by assuring S3 and S4 compatibility. One of the features of spam is to exploit the algorithmic steps of the Cholesky factorization and hence to perform only a fraction of the workload when factorizing matrices with the same sparsity structure. Simulations show that exploiting this break-down of the factorization results in a speed-up of about a factor of 5 and memory savings of about a factor of 10 for large matrices and slightly smaller factors for huge matrices. The article is motivated by Markov chain Monte Carlo methods for Gaussian Markov random fields, but many other statistical applications are mentioned that profit from an efficient Cholesky factorization as well.
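To show why a Cholesky factorization of a sparse precision matrix matters for Gaussian Markov random fields, here is a minimal sketch in plain Python. It does not use spam or any of its routines. It assumes the simplest sparse case, a tridiagonal precision matrix Q, and draws x ~ N(0, Q^{-1}) by factorizing Q = L L^T and solving L^T x = z with z ~ N(0, I). The function names and the example precision matrix are assumptions chosen for illustration.

```python
import math
import random


def cholesky_tridiagonal(diag, off):
    """Cholesky factor of a symmetric positive definite tridiagonal matrix Q.

    diag: main diagonal of Q; off: sub-diagonal of Q (length len(diag) - 1).
    Returns (l_diag, l_off), the diagonal and sub-diagonal of the lower
    bidiagonal factor L with Q = L L^T.
    """
    n = len(diag)
    l_diag = [0.0] * n
    l_off = [0.0] * (n - 1)
    for i in range(n):
        prev = l_off[i - 1] ** 2 if i > 0 else 0.0
        pivot = diag[i] - prev
        if pivot <= 0.0:
            raise ValueError("matrix is not positive definite")
        l_diag[i] = math.sqrt(pivot)
        if i < n - 1:
            l_off[i] = off[i] / l_diag[i]
    return l_diag, l_off


def sample_gmrf(l_diag, l_off, rng):
    """Draw x ~ N(0, Q^{-1}) by back-substitution in L^T x = z, z ~ N(0, I)."""
    n = len(l_diag)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [0.0] * n
    x[n - 1] = z[n - 1] / l_diag[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = (z[i] - l_off[i] * x[i + 1]) / l_diag[i]
    return x


# Illustrative precision: first-order random walk on 6 nodes plus a small
# diagonal term that makes Q strictly positive definite (assumed values).
n, kappa, nugget = 6, 2.0, 0.1
diag = [kappa * (1 if i in (0, n - 1) else 2) + nugget for i in range(n)]
off = [-kappa] * (n - 1)

l_diag, l_off = cholesky_tridiagonal(diag, off)
x = sample_gmrf(l_diag, l_off, random.Random(0))
print(x)
```

Because the sparsity pattern is fixed, both the factorization and the solve cost O(n) here. In an MCMC loop where only the numerical values of Q change between iterations (for example, a new `kappa`), the structure of L stays the same, which is loosely the situation in which reusing the structural part of a factorization pays off.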