High-Dimensional Bayesian Geostatistics
With the growing capabilities of Geographic Information Systems (GIS) and
user-friendly software, statisticians today routinely encounter geographically
referenced data containing observations from a large number of spatial
locations and time points. Over the last decade, hierarchical spatiotemporal
process models have become widely deployed statistical tools for researchers to
better understand the complex nature of spatial and temporal variability.
However, fitting hierarchical spatiotemporal models often involves expensive
matrix computations with complexity increasing in cubic order for the number of
spatial locations and temporal points. This renders such models infeasible for
large data sets. This article offers a focused review of two methods for
constructing well-defined highly scalable spatiotemporal stochastic processes.
Both these processes can be used as "priors" for spatiotemporal random fields.
The first approach constructs a low-rank process operating on a
lower-dimensional subspace. The second approach constructs a Nearest-Neighbor
Gaussian Process (NNGP) that ensures sparse precision matrices for its finite
realizations. Both processes can be exploited as a scalable prior embedded
within a rich hierarchical modeling framework to deliver full Bayesian
inference. These approaches can be described as model-based solutions for big
spatiotemporal datasets. The models ensure that the algorithmic complexity has
~n floating point operations (flops), where n is the number of spatial
locations (per iteration). We compare these methods and provide some insight
into their methodological underpinnings.
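The first approach, a low-rank process on a lower-dimensional subspace, can be sketched numerically. The snippet below is an illustrative sketch only, not the paper's implementation: the exponential covariance, the knot locations, and all parameter values are assumptions. It approximates an n x n covariance matrix by a rank-m factor built from m << n knots, so downstream work costs O(n m^2) flops rather than O(n^3).

```python
import numpy as np

def exp_cov(a, b, phi=1.0, sigma2=1.0):
    """Exponential covariance between location sets a (n x d) and b (m x d)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * d)

rng = np.random.default_rng(0)
n, m = 500, 25                       # n data locations, m << n knots (assumed values)
locs = rng.uniform(0, 1, (n, 2))
knots = rng.uniform(0, 1, (m, 2))

C_nm = exp_cov(locs, knots)          # n x m cross-covariance
C_mm = exp_cov(knots, knots)         # m x m knot covariance
# Low-rank approximation: C ~ C_nm C_mm^{-1} C_mn, which has rank at most m.
L = np.linalg.cholesky(C_mm + 1e-10 * np.eye(m))   # jitter for numerical stability
B = np.linalg.solve(L, C_nm.T).T     # n x m factor, so C_lr = B @ B.T
C_lr = B @ B.T
# Working with B instead of the full n x n matrix costs O(n m^2) flops.
print(C_lr.shape)
```

Keeping only the factor B (never forming C_lr) is what makes the construction scale; the full product is formed here only to inspect its rank.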
Estimating Spatial Econometrics Models with Integrated Nested Laplace Approximation
Integrated Nested Laplace Approximation provides a fast and effective method
for marginal inference on Bayesian hierarchical models. This methodology has
been implemented in the R-INLA package which permits INLA to be used from
within R statistical software. Although INLA is implemented as a general
methodology, its use in practice is limited to the models implemented in the
R-INLA package.
Spatial autoregressive models are widely used in spatial econometrics but
have until now been missing from the R-INLA package. In this paper, we describe
the implementation and application of a new class of latent models in INLA made
available through R-INLA. This new latent class implements a standard spatial
lag model, which is widely used and that can be used to build more complex
models in spatial econometrics.
The implementation of this latent model in R-INLA also means that all the
other features of INLA can be used for model fitting, model selection and
inference in spatial econometrics, as will be shown in this paper. Finally, we
will illustrate the use of this new latent model and its applications with two
datasets based on Gaussian and binary outcomes.
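For readers unfamiliar with the spatial lag model the abstract refers to, a minimal numerical sketch may help. This illustrates the model itself, not INLA or the R-INLA latent class: the circular row-standardized weight matrix and all parameter values are invented for illustration. The model is y = rho W y + X beta + eps, with reduced form y = (I - rho W)^{-1} (X beta + eps).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
# Hypothetical row-standardized spatial weights: each unit's two ring neighbors.
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

rho, beta = 0.6, np.array([1.0, -2.0])          # assumed parameter values
X = np.column_stack([np.ones(n), rng.normal(size=n)])
eps = rng.normal(scale=0.1, size=n)

# Spatial lag (SAR) model: y = rho W y + X beta + eps.
# Solving (I - rho W) y = X beta + eps gives the reduced form.
A = np.eye(n) - rho * W
y = np.linalg.solve(A, X @ beta + eps)
print(y.shape)
```

Because y appears on both sides of the model equation, the outcome at each location depends on outcomes at neighboring locations, which is what distinguishes the lag model from a standard regression with spatially correlated errors.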
Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets
Spatial process models for analyzing geostatistical data entail computations
that become prohibitive as the number of spatial locations becomes large. This
manuscript develops a class of highly scalable Nearest Neighbor Gaussian
Process (NNGP) models to provide fully model-based inference for large
geostatistical datasets. We establish that the NNGP is a well-defined spatial
process providing legitimate finite-dimensional Gaussian densities with sparse
precision matrices. We embed the NNGP as a sparsity-inducing prior within a
rich hierarchical modeling framework and outline how computationally efficient
Markov chain Monte Carlo (MCMC) algorithms can be executed without storing or
decomposing large matrices. The number of floating point operations (flops) per
iteration of this algorithm is linear in the number of spatial locations, thereby
rendering substantial scalability. We illustrate the computational and
inferential benefits of the NNGP over competing methods using simulation
studies and also analyze forest biomass from a massive United States Forest
Inventory dataset at a scale that precludes alternative dimension-reducing
methods.
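The sparse-precision construction behind the NNGP can be sketched as a Vecchia-type approximation: each location is conditioned on at most m nearest previously ordered neighbors, yielding a sparse factor I - A and diagonal D with precision (I - A)^T D^{-1} (I - A). This is a simplified sketch, not the authors' MCMC implementation; the covariance function, the ordering, and the neighbor count m are assumptions, and the dense covariance matrix is formed here only for clarity (a scalable code would compute just the needed entries).

```python
import numpy as np

def exp_cov(a, b, phi=3.0):
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.exp(-phi * d)

rng = np.random.default_rng(2)
n, m = 200, 10                        # n locations, at most m neighbors each
locs = rng.uniform(0, 1, (n, 2))
C = exp_cov(locs, locs)               # dense only for this illustration

A = np.zeros((n, n))
D = np.zeros(n)
D[0] = C[0, 0]
for i in range(1, n):
    # m nearest neighbors among the previously ordered points 0..i-1
    dists = np.linalg.norm(locs[:i] - locs[i], axis=1)
    nb = np.argsort(dists)[:m]
    c = C[np.ix_(nb, nb)] + 1e-10 * np.eye(len(nb))   # jitter for stability
    b = C[nb, i]
    w = np.linalg.solve(c, b)         # kriging weights on the neighbor set
    A[i, nb] = w
    D[i] = C[i, i] - b @ w            # conditional variance (Schur complement)

# Each row of A has at most m nonzeros, so each conditional costs O(m^3)
# and the whole pass is linear in n.
print((A != 0).sum(axis=1).max())
```

The sparsity of A is exactly what lets NNGP-based MCMC avoid storing or decomposing any large dense matrix.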
Bayesian Structure Learning for Markov Random Fields with a Spike and Slab Prior
In recent years a number of methods have been developed for automatically
learning the (sparse) connectivity structure of Markov Random Fields. These
methods are mostly based on L1-regularized optimization which has a number of
disadvantages such as the inability to assess model uncertainty and expensive
cross-validation to find the optimal regularization parameter. Moreover, the
model's predictive performance may degrade dramatically with a suboptimal value
of the regularization parameter (which is sometimes desirable to induce
sparseness). We propose a fully Bayesian approach based on a "spike and slab"
prior (similar to L0 regularization) that does not suffer from these
shortcomings. We develop an approximate MCMC method combining Langevin dynamics
and reversible jump MCMC to conduct inference in this model. Experiments show
that the proposed model learns a good combination of the structure and
parameter values without the need for separate hyper-parameter tuning.
Moreover, the model's predictive performance is much more robust than L1-based
methods with hyper-parameter settings that induce highly sparse model
structures.
Comment: Accepted in the Conference on Uncertainty in Artificial Intelligence
(UAI), 201
Spike-and-Slab Priors for Function Selection in Structured Additive Regression Models
Structured additive regression provides a general framework for complex
Gaussian and non-Gaussian regression models, with predictors comprising
arbitrary combinations of nonlinear functions and surfaces, spatial effects,
varying coefficients, random effects and further regression terms. The large
flexibility of structured additive regression makes function selection a
challenging and important task, aiming at (1) selecting the relevant
covariates, (2) choosing an appropriate and parsimonious representation of the
impact of covariates on the predictor and (3) determining the required
interactions. We propose a spike-and-slab prior structure for function
selection that allows the inclusion or exclusion of single coefficients as well as
blocks of coefficients representing specific model terms. A novel
multiplicative parameter expansion is required to obtain good mixing and
convergence properties in a Markov chain Monte Carlo simulation approach and is
shown to induce desirable shrinkage properties. In simulation studies and with
(real) benchmark classification data, we investigate sensitivity to
hyperparameter settings and compare performance to competitors. The flexibility
and applicability of our approach are demonstrated in an additive piecewise
exponential model with time-varying effects for right-censored survival times
of intensive care patients with sepsis. Geoadditive and additive mixed logit
model applications are discussed in an extensive appendix.
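The basic spike-and-slab mechanism underlying both of the preceding abstracts can be illustrated with a toy Gibbs sampler for variable selection in a linear model. This is a minimal sketch under a point-mass spike at zero and fixed hyperparameters; it is not the parameter-expanded sampler the paper develops, and all data-generating values are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 200, 5
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0.0, -1.5, 0.0, 0.0])   # only two relevant covariates
y = X @ beta_true + rng.normal(scale=0.5, size=n)

sigma2, tau2, prior_inc = 0.25, 10.0, 0.5   # hyperparameters assumed known
gamma = np.ones(p, dtype=bool)              # inclusion indicators
beta = np.zeros(p)
incl = np.zeros(p)
iters, burn = 2000, 500

for t in range(iters):
    for j in range(p):
        beta[j] = 0.0
        r = y - X @ beta                    # residual excluding coefficient j
        v = X[:, j] @ X[:, j]
        s = X[:, j] @ r
        # Log Bayes factor of the slab N(0, tau2) versus the point-mass spike,
        # from marginalizing beta_j out of the Gaussian likelihood.
        lbf = (-0.5 * np.log1p(tau2 * v / sigma2)
               + 0.5 * tau2 * s**2 / (sigma2 * (sigma2 + tau2 * v)))
        p1 = 1.0 / (1.0 + (1.0 - prior_inc) * np.exp(-lbf) / prior_inc)
        gamma[j] = rng.uniform() < p1
        if gamma[j]:
            # Conditional posterior of beta_j under the slab.
            V = 1.0 / (v / sigma2 + 1.0 / tau2)
            beta[j] = rng.normal(V * s / sigma2, np.sqrt(V))
    if t >= burn:
        incl += gamma

pip = incl / (iters - burn)                 # posterior inclusion probabilities
print(np.round(pip, 2))
```

The posterior inclusion probabilities concentrate near 1 for the relevant covariates and near 0 for the rest, which is the structure/parameter co-learning both abstracts describe; no cross-validated regularization parameter is tuned.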