1,536 research outputs found

    High-Dimensional Bayesian Geostatistics

    With the growing capabilities of Geographic Information Systems (GIS) and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points. Over the last decade, hierarchical spatiotemporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability. However, fitting hierarchical spatiotemporal models often involves expensive matrix computations whose complexity grows cubically in the number of spatial locations and temporal points. This renders such models infeasible for large data sets. This article offers a focused review of two methods for constructing well-defined, highly scalable spatiotemporal stochastic processes. Both processes can be used as "priors" for spatiotemporal random fields. The first approach constructs a low-rank process operating on a lower-dimensional subspace. The second approach constructs a Nearest-Neighbor Gaussian Process (NNGP) that ensures sparse precision matrices for its finite realizations. Both processes can be exploited as a scalable prior embedded within a rich hierarchical modeling framework to deliver full Bayesian inference. These approaches can be described as model-based solutions for big spatiotemporal datasets. The models ensure that the algorithmic complexity is $\sim n$ floating point operations (flops) per iteration, where $n$ is the number of spatial locations. We compare these methods and provide some insight into their methodological underpinnings.
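
The sparse-precision idea behind the NNGP can be illustrated with a minimal sketch. It is not the reviewed article's code: the exponential kernel, the parameter values, the ordering of locations, and the names `exp_cov` and `nngp_precision` are all assumptions made for illustration. Each location is regressed on at most `m` nearest neighbors among the locations that precede it, which gives a precision matrix of the form (I - A)^T D^{-1} (I - A) with at most `m` nonzero entries per row of A.

```python
import numpy as np

def exp_cov(a, b, sigma2=1.0, phi=3.0):
    """Exponential covariance between two sets of locations (assumed kernel)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * d)

def nngp_precision(coords, m=5, sigma2=1.0, phi=3.0):
    """Return (Q, A, D) with Q = (I - A)^T D^{-1} (I - A).

    Location i is conditioned on its (at most) m nearest neighbors among
    locations 0..i-1 in the given ordering. A is stored densely here for
    readability; a sparse format would be used for large n.
    """
    n = coords.shape[0]
    A = np.zeros((n, n))
    D = np.zeros(n)
    D[0] = sigma2  # the first location has no neighbors to condition on
    for i in range(1, n):
        dist = np.linalg.norm(coords[:i] - coords[i], axis=1)
        nbrs = np.argsort(dist)[:m]
        C_nn = exp_cov(coords[nbrs], coords[nbrs], sigma2, phi)
        c_in = exp_cov(coords[i:i + 1], coords[nbrs], sigma2, phi).ravel()
        w = np.linalg.solve(C_nn, c_in)   # kriging weights on the neighbors
        A[i, nbrs] = w
        D[i] = sigma2 - c_in @ w          # conditional variance given neighbors
    IA = np.eye(n) - A
    Q = IA.T @ (IA / D[:, None])          # (I - A)^T D^{-1} (I - A)
    return Q, A, D
```

In this sketch each location requires one `m`-by-`m` solve, so the construction costs on the order of `n * m**3` operations, which is linear in `n` for fixed `m`. When `m` is at least `n - 1` the neighbor sets include every preceding location and `Q` reproduces the inverse of the full covariance matrix.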

    Latent Gaussian modeling and INLA: A review with focus on space-time applications

    Bayesian hierarchical models with latent Gaussian layers have proven very flexible in capturing complex stochastic behavior and hierarchical structures in high-dimensional spatial and spatio-temporal data. Whereas simulation-based Bayesian inference through Markov Chain Monte Carlo may be hampered by slow convergence and numerical instabilities, the inferential framework of Integrated Nested Laplace Approximation (INLA) is capable of providing accurate and relatively fast analytical approximations to posterior quantities of interest. It relies heavily on the use of Gauss-Markov dependence structures to avoid the numerical bottleneck of high-dimensional nonsparse matrix computations. With a view towards space-time applications, we here review the principal theoretical concepts, model classes and inference tools within the INLA framework. Important elements for constructing space-time models are certain spatial Matérn-like Gauss-Markov random fields, obtained as approximate solutions to a stochastic partial differential equation. Efficient implementation of statistical inference tools for a large variety of models is available through the INLA package of the R software. To showcase the practical use of R-INLA and to illustrate its principal commands and syntax, a comprehensive simulation experiment is presented using simulated non-Gaussian space-time count data with a first-order autoregressive dependence structure in time.
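
As a small illustration of why Gauss-Markov structures avoid nonsparse matrix computations, the following sketch builds the precision matrix of a stationary first-order autoregressive process. It is written in Python with NumPy and does not use R-INLA; the function names and parameter values are hypothetical. The covariance matrix is fully dense, while its inverse is tridiagonal.

```python
import numpy as np

def ar1_precision(n, rho, sigma2=1.0):
    """Precision of a stationary AR(1) process x_t = rho * x_{t-1} + e_t,
    with e_t ~ N(0, sigma2) and n >= 2. Only the three central diagonals
    are nonzero."""
    Q = np.zeros((n, n))
    idx = np.arange(n)
    Q[idx, idx] = 1.0 + rho**2
    Q[0, 0] = Q[-1, -1] = 1.0        # boundary time points have one neighbor
    Q[idx[:-1], idx[1:]] = -rho
    Q[idx[1:], idx[:-1]] = -rho
    return Q / sigma2

def ar1_covariance(n, rho, sigma2=1.0):
    """Dense covariance of the same process: sigma2 / (1 - rho^2) * rho^|s - t|."""
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return sigma2 / (1.0 - rho**2) * rho**lags
```

Multiplying `ar1_precision(n, rho)` by `ar1_covariance(n, rho)` gives the identity up to rounding, so the sparse matrix carries the same information as the dense one.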

    Bayesian Nonstationary Spatial Modeling for Very Large Datasets

    With the proliferation of modern high-resolution measuring instruments mounted on satellites, planes, ground-based vehicles and monitoring stations, a need has arisen for statistical methods suitable for the analysis of large spatial datasets observed on large spatial domains. Statistical analyses of such datasets pose two main challenges: First, traditional spatial-statistical techniques are often unable to handle large numbers of observations in a computationally feasible way. Second, for large and heterogeneous spatial domains, it is often not appropriate to assume that a process of interest is stationary over the entire domain. We address the first challenge by using a model combining a low-rank component, which allows for flexible modeling of medium-to-long-range dependence via a set of spatial basis functions, with a tapered remainder component, which allows for modeling of local dependence using a compactly supported covariance function. Addressing the second challenge, we propose two extensions to this model that result in increased flexibility: First, the model is parameterized based on a nonstationary Matérn covariance, where the parameters vary smoothly across space. Second, in our fully Bayesian model, all components and parameters are considered random, including the number, locations, and shapes of the basis functions used in the low-rank component. Using simulated data and a real-world dataset of high-resolution soil measurements, we show that both extensions can result in substantial improvements over the current state-of-the-art. Comment: 16 pages, 2 color figures
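
The combination of a low-rank component with a tapered remainder can be sketched as follows. This is a simplified stationary illustration, not the paper's model: it assumes an exponential covariance, a knot-based low-rank part, and a spherical taper, and the names `exp_cov`, `spherical_taper`, and `lowrank_plus_taper` are hypothetical.

```python
import numpy as np

def exp_cov(a, b, sigma2=1.0, phi=3.0):
    """Exponential covariance between two sets of locations (assumed kernel)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * d)

def spherical_taper(a, b, radius):
    """Compactly supported spherical correlation: exactly zero beyond `radius`."""
    h = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1) / radius
    return np.where(h < 1.0, 1.0 - 1.5 * h + 0.5 * h**3, 0.0)

def lowrank_plus_taper(coords, knots, radius, sigma2=1.0, phi=3.0):
    """Split a covariance into a low-rank part and a tapered remainder.

    low_rank  = C_nk C_kk^{-1} C_kn      (rank at most the number of knots)
    remainder = (C - low_rank) * taper   (zero for pairs farther apart than radius)
    """
    C = exp_cov(coords, coords, sigma2, phi)
    C_nk = exp_cov(coords, knots, sigma2, phi)
    C_kk = exp_cov(knots, knots, sigma2, phi)
    low_rank = C_nk @ np.linalg.solve(C_kk, C_nk.T)
    remainder = (C - low_rank) * spherical_taper(coords, coords, radius)
    return low_rank, remainder
```

Because the taper equals one at distance zero, the sum of the two parts matches the original variances on the diagonal, while the remainder vanishes for all pairs of locations separated by more than `radius`.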

    High-dimensional modeling of spatial and spatio-temporal conditional extremes using INLA and the SPDE approach

    The conditional extremes framework allows for event-based stochastic modeling of dependent extremes, and has recently been extended to spatial and spatio-temporal settings. After standardizing the marginal distributions and applying an appropriate linear normalization, certain non-stationary Gaussian processes can be used as asymptotically-motivated models for the process conditioned on threshold exceedances at a fixed reference location and time. In this work, we adopt a Bayesian perspective by implementing estimation through the integrated nested Laplace approximation (INLA), allowing for novel and flexible semi-parametric specifications of the Gaussian mean function. By using Gauss-Markov approximations of the Matérn covariance function (known as the Stochastic Partial Differential Equation approach) at a latent stage of the model, likelihood-based inference becomes feasible even with thousands of observed locations. We explain how constraints on the spatial and spatio-temporal Gaussian processes, arising from the conditioning mechanism, can be implemented through the latent variable approach without losing the computationally convenient Markov property. We discuss tools for the comparison of models via their posterior distributions, and illustrate the flexibility of the approach with gridded Red Sea surface temperature data at over 6,000 observed locations. Posterior sampling is exploited to study the probability distribution of cluster functionals of spatial and spatio-temporal extreme episodes.