16,306 research outputs found
Model-Based Geostatistics for Prevalence Mapping in Low-Resource Settings
In low-resource settings, prevalence mapping relies on empirical prevalence
data from a finite, often spatially sparse, set of surveys of communities
within the region of interest, possibly supplemented by remotely sensed images
that can act as proxies for environmental risk factors. A standard
geostatistical model for data of this kind is a generalized linear mixed model
with binomial error distribution, logistic link and a combination of
explanatory variables and a Gaussian spatial stochastic process in the linear
predictor. In this paper, we first review statistical methods and software
associated with this standard model, then consider several methodological
extensions whose development has been motivated by the requirements of specific
applications. These include: methods for combining randomised survey data with
data from non-randomised, and therefore potentially biased, surveys;
spatio-temporal extensions; spatially structured zero-inflation. Throughout, we
illustrate the methods with disease mapping applications that have arisen
through our involvement with a range of African public health programmes.
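As an illustration of the standard model described in this abstract, the following is a minimal simulation sketch: a binomial response, a logistic link, and a linear predictor made of one explanatory variable plus a Gaussian spatial process. The exponential covariance function, the parameter values, and all names (`simulate_binomial_geostat`, `beta0`, `beta1`, `sigma2`, `phi`) are assumptions chosen for the example, not taken from the paper.

```python
import numpy as np

def simulate_binomial_geostat(n_sites=50, n_sampled=100, beta0=-1.0, beta1=0.5,
                              sigma2=1.0, phi=0.2, seed=0):
    """Simulate survey counts from a binomial geostatistical model (illustrative)."""
    rng = np.random.default_rng(seed)
    # Hypothetical survey locations in the unit square and one covariate per site.
    coords = rng.uniform(0.0, 1.0, size=(n_sites, 2))
    covariate = rng.normal(size=n_sites)
    # Gaussian spatial process with an (assumed) exponential covariance.
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    cov = sigma2 * np.exp(-dists / phi)
    # A small jitter keeps the Cholesky factorisation numerically stable.
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(n_sites))
    spatial_effect = chol @ rng.normal(size=n_sites)
    # Linear predictor: explanatory variable plus spatial random effect.
    linear_predictor = beta0 + beta1 * covariate + spatial_effect
    # Inverse logistic link gives the prevalence at each site.
    prevalence = 1.0 / (1.0 + np.exp(-linear_predictor))
    # Binomial error distribution: positives out of n_sampled people per site.
    positives = rng.binomial(n_sampled, prevalence)
    return coords, prevalence, positives
```

Fitting such a model (rather than simulating from it) requires integrating over the spatial effect, which is where the statistical methods and software reviewed in the paper come in.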
Model-Based Geostatistics the Easy Way
This paper briefly describes geostatistical models for Gaussian and non-Gaussian data and demonstrates the geostatsp and diseasemapping packages for performing inference using these models. Making use of R’s spatial data types, and raster objects in particular, makes spatial analyses using geostatistical models simple and convenient. Examples using real data are shown for Gaussian spatial data, binomially distributed spatial data, a log-Gaussian Cox process, and an area-level model for case counts.
Identification of high-permeability subsurface structures with multiple point geostatistics and normal score ensemble Kalman filter
Alluvial aquifers are often characterized by the presence of braided high-permeable paleo-riverbeds, which constitute an interconnected preferential flow network whose localization is of fundamental importance to predict flow and transport dynamics. Classic geostatistical approaches based on two-point correlation (i.e., the variogram) cannot describe such particular shapes. In contrast, multiple point geostatistics can describe almost any kind of shape using the empirical probability distribution derived from a training image. However, even with a correct training image the exact positions of the channels are uncertain. State information like groundwater levels can constrain the channel positions using inverse modeling or data assimilation, but the method should be able to handle non-Gaussianity of the parameter distribution. Here the normal score ensemble Kalman filter (NS-EnKF) was chosen as the inverse conditioning algorithm to tackle this issue. Multiple point geostatistics and NS-EnKF have already been tested in synthetic examples, but in this study they are used for the first time in a real-world case study. The test site is an alluvial unconfined aquifer in northeastern Italy with an extension of approximately 3 km². A satellite training image showing the braid shapes of the nearby river and electrical resistivity tomography (ERT) images were used as conditioning data to provide information on channel shape, size, and position. Measured groundwater levels were assimilated with the NS-EnKF to update the spatially distributed groundwater parameters (hydraulic conductivity and storage coefficients). Results from the study show that the inversion based on multiple point geostatistics does not outperform the one with a multiGaussian model and that the information from the ERT images did not improve site characterization. These results were further evaluated with a synthetic study that mimics the experimental site.
The synthetic results showed that only for a much larger number of conditioning piezometric heads, multiple point geostatistics and ERT could improve aquifer characterization. This shows that state of the art stochastic methods need to be supported by abundant and high-quality subsurface data
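The normal score step that gives the NS-EnKF its name can be sketched as a rank-based mapping of a non-Gaussian sample onto standard normal quantiles. This is only an illustration of the transform itself, not the filter used in the study; the function name `normal_score_transform` and the plotting position (rank + 0.5) / n are assumptions made for the example.

```python
from statistics import NormalDist

import numpy as np

def normal_score_transform(values):
    """Map a sample to standard normal scores while preserving its ordering."""
    values = np.asarray(values, dtype=float)
    n = values.size
    # Rank of each value (0 .. n-1); ties are broken by position in the sample.
    ranks = np.argsort(np.argsort(values))
    # Plotting positions strictly inside (0, 1), so the quantile is finite.
    probs = (ranks + 0.5) / n
    std_normal = NormalDist()
    return np.array([std_normal.inv_cdf(p) for p in probs])
```

In an ensemble setting, a transform of this kind would be applied to the ensemble of parameter values before the Kalman update and inverted afterwards; the back-transformation is omitted here.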
High-Dimensional Bayesian Geostatistics
With the growing capabilities of Geographic Information Systems (GIS) and
user-friendly software, statisticians today routinely encounter geographically
referenced data containing observations from a large number of spatial
locations and time points. Over the last decade, hierarchical spatiotemporal
process models have become widely deployed statistical tools for researchers to
better understand the complex nature of spatial and temporal variability.
However, fitting hierarchical spatiotemporal models often involves expensive
matrix computations with complexity increasing in cubic order for the number of
spatial locations and temporal points. This renders such models unfeasible for
large data sets. This article offers a focused review of two methods for
constructing well-defined highly scalable spatiotemporal stochastic processes.
Both these processes can be used as "priors" for spatiotemporal random fields.
The first approach constructs a low-rank process operating on a
lower-dimensional subspace. The second approach constructs a Nearest-Neighbor
Gaussian Process (NNGP) that ensures sparse precision matrices for its finite
realizations. Both processes can be exploited as a scalable prior embedded
within a rich hierarchical modeling framework to deliver full Bayesian
inference. These approaches can be described as model-based solutions for big
spatiotemporal datasets. The models ensure that the algorithmic complexity has
∼n floating point operations (flops), where n is the number of spatial
locations (per iteration). We compare these methods and provide some insight
into their methodological underpinnings.
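The sparse-precision idea behind the NNGP can be sketched as follows: each location is conditioned on at most m of its nearest previously ordered neighbours, and the resulting precision matrix is (I - A)^T D^{-1} (I - A), with A strictly lower triangular. The exponential covariance, the ordering by array index, and the names `exp_cov` and `nngp_precision` are assumptions for this example. Dense arrays are used for readability, so the sketch shows the sparsity pattern but not the computational savings of a real implementation.

```python
import numpy as np

def exp_cov(a, b, sigma2=1.0, phi=0.3):
    """Assumed exponential covariance between two sets of 2-D locations."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / phi)

def nngp_precision(coords, m):
    """Precision matrix of a nearest-neighbour approximation (dense, illustrative)."""
    n = coords.shape[0]
    A = np.zeros((n, n))   # regression weights on earlier neighbours
    D = np.zeros(n)        # conditional variances
    D[0] = exp_cov(coords[:1], coords[:1])[0, 0]
    for i in range(1, n):
        # Up to m nearest neighbours among the locations ordered before i.
        d = np.linalg.norm(coords[:i] - coords[i], axis=1)
        nbrs = np.argsort(d)[:m]
        C_nn = exp_cov(coords[nbrs], coords[nbrs])
        c_in = exp_cov(coords[i:i + 1], coords[nbrs])[0]
        w = np.linalg.solve(C_nn, c_in)
        A[i, nbrs] = w
        D[i] = exp_cov(coords[i:i + 1], coords[i:i + 1])[0, 0] - w @ c_in
    I_minus_A = np.eye(n) - A
    # (I - A)^T D^{-1} (I - A): row i of (I - A) has at most m + 1 non-zeros.
    return I_minus_A.T @ (I_minus_A / D[:, None])
```

With m = n - 1 the construction reproduces the exact inverse of the full covariance matrix; with small m the precision matrix has far fewer non-zero entries, which is the property that sparse solvers exploit.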
Model-based geostatistics: some issues in modelling and model diagnostics
Spatial modelling is examined in a model-based geostatistical context using the Gaussian linear mixed model in a likelihood framework. Complex spatial models developed provide practitioners with a practical and best-practice guide for spatial analysis. Adequate modelling theory and matrix algebra are provided to ground the methods demonstrated. A multivariate model over two time points and three-dimensional space is developed which is novel to the field of soil science. Soil organic carbon measurements at three soil depths and two time points from a cropping field with four soil classes are used. The spatial process is assessed for second-order stationarity and anisotropic correlation. Univariate spatial modelling is used to inform bivariate spatial modelling of pre- and post-harvest soil organic carbon at each soil depth. Bivariate modelling is extended to the multivariate level, where both time points and the three soil depths are incorporated in a single model to pool maximum information. A common correlation structure is tested and is supported for the response variable at each of the six time-depth combinations. Separable correlation structures are used for computational efficiency. The difficulty of estimating nugget effects suggests a sub-optimal sampling design. Preferred fitted models are all isotropic. Equations for predictions and the variance of prediction errors are extended from well-known results and maps of predicted values and variance of prediction errors are produced and show close correspondence with observed values. Finally, univariate models for spatially referenced seed counts from small sampling plots are examined within a Gaussian framework using Box-Cox transformations. The discrete nature of the data, small sample size and computational problems hamper model fitting. Anisotropy is examined using a variogram envelope diagnostic technique. ASReml-R software is shown to be a powerful analytical tool for spatial processes
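The computational benefit of a separable correlation structure, mentioned in this abstract, can be sketched with a Kronecker product: if the full correlation matrix is the Kronecker product of a small matrix (for example over the six time-depth combinations) and a spatial matrix, its inverse is the Kronecker product of the two smaller inverses. The matrices, parameter values, and names below (`ar1_corr`, `exp_corr`, `separable_inverse`) are hypothetical and are not the structures fitted in the thesis.

```python
import numpy as np

def ar1_corr(n, rho):
    """Assumed AR(1)-type correlation between n ordered levels."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def exp_corr(coords, phi):
    """Assumed exponential spatial correlation between 2-D locations."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return np.exp(-d / phi)

def separable_inverse(R_small, R_space):
    """Invert kron(R_small, R_space) using only the two small inverses."""
    return np.kron(np.linalg.inv(R_small), np.linalg.inv(R_space))
```

Inverting the two factors costs far less than inverting the full matrix directly, since the cubic cost applies to each factor's dimension rather than to their product.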