Comment on Article by Ferreira and Gamerman
A utility-function approach to optimal spatial sampling design is a powerful
way to quantify what "optimality" means. The emphasis then should be to capture
all possible contributions to utility, including scientific impact and the cost
of sampling. The resulting sampling plan should contain a component of designed
randomness that would allow for a non-parametric design-based analysis if
model-based assumptions were in doubt. [arXiv:1509.03410]
Comment: Published at http://dx.doi.org/10.1214/15-BA944B in the Bayesian
Analysis (http://projecteuclid.org/euclid.ba) by the International Society of
Bayesian Analysis (http://bayesian.org/)
An Empirical Bayes Approach for Distributed Estimation of Spatial Fields
In this paper we consider a network of spatially distributed sensors which
collect measurement samples of a spatial field, and aim at estimating in a
distributed way (without any central coordinator) the entire field by suitably
fusing all network data. We propose a general probabilistic model that can
handle both partial knowledge of the physics generating the spatial field as
well as a purely data-driven inference. Specifically, we adopt an Empirical
Bayes approach in which the spatial field is modeled as a Gaussian Process,
whose mean function is described by means of parametrized equations. We
characterize the Empirical Bayes estimator when nodes are heterogeneous, i.e.,
perform a different number of measurements. Moreover, by exploiting the
sparsity of both the covariance and the (parametrized) mean function of the
Gaussian Process, we are able to design a distributed spatial field estimator.
We corroborate the theoretical results with two numerical simulations: a
stationary temperature field estimation in which the field is described by a
partial differential (heat) equation, and a data driven inference in which the
mean is parametrized by a cubic spline.
A spatial analysis of multivariate output from regional climate models
Climate models have become an important tool in the study of climate and
climate change, and ensemble experiments consisting of multiple climate-model
runs are used in studying and quantifying the uncertainty in climate-model
output. However, there are often only a limited number of model runs available
for a particular experiment, and one of the statistical challenges is to
characterize the distribution of the model output. To that end, we have
developed a multivariate hierarchical approach, at the heart of which is a new
representation of a multivariate Markov random field. This approach allows for
flexible modeling of the multivariate spatial dependencies, including the
cross-dependencies between variables. We demonstrate this statistical model on
an ensemble arising from a regional-climate-model experiment over the western
United States, and we focus on the projected change in seasonal temperature and
precipitation over the next 50 years.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS369 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
On statistical approaches to generate Level 3 products from satellite remote sensing retrievals
Satellite remote sensing of trace gases such as carbon dioxide (CO2) has
increased our ability to observe and understand Earth's climate. However, these
remote sensing data, specifically Level 2 retrievals, tend to be irregular in
space and time, and hence, spatio-temporal prediction is required to infer
values at any location and time point. Such inferences are not only required to
answer important questions about our climate, but they are also needed for
validating the satellite instrument, since Level 2 retrievals are generally not
co-located with ground-based remote sensing instruments. Here, we discuss
statistical approaches to construct Level 3 products from Level 2 retrievals,
placing particular emphasis on the strengths and potential pitfalls when using
statistical prediction in this context. Following this discussion, we use a
spatio-temporal statistical modelling framework known as fixed rank kriging
(FRK) to obtain global predictions and prediction standard errors of
column-averaged carbon dioxide based on Version 7r and Version 8r retrievals
from the Orbiting Carbon Observatory-2 (OCO-2) satellite. The FRK predictions
allow us to validate statistically the Level 2 retrievals globally even though
the data are at locations and at time points that do not coincide with
validation data. Importantly, the validation takes into account the prediction
uncertainty, which is dependent both on the temporally-varying density of
observations around the ground-based measurement sites and on the
spatio-temporal high-frequency components of the trace gas field that are not
explicitly modelled. Here, for validation of remotely-sensed CO2 data, we
use observations from the Total Carbon Column Observing Network. We demonstrate
that the resulting FRK product based on Version 8r compares better with TCCON
data than that based on Version 7r.
Comment: 28 pages, 10 figures, 4 tables
Flexible regression models over river networks
Many statistical models are available for spatial data but the vast majority of these assume that spatial separation can be measured by Euclidean distance. Data which are collected over river networks constitute a notable and commonly occurring exception, where distance must be measured along complex paths and, in addition, account must be taken of the relative flows of water into and out of confluences. Suitable models for this type of data have been constructed based on covariance functions. The aim of the paper is to place the focus on underlying spatial trends by adopting a regression formulation and using methods which allow smooth but flexible patterns. Specifically, kernel methods and penalized splines are investigated, with the latter proving more suitable from both computational and modelling perspectives. In addition to their use in a purely spatial setting, penalized splines also offer a convenient route to the construction of spatiotemporal models, where data are available over time as well as over space. Models which include main effects and spatiotemporal interactions, as well as seasonal terms and interactions, are constructed for data on nitrate pollution in the River Tweed. The results give valuable insight into the changes in water quality in both space and time.
Bayesian Nonstationary Spatial Modeling for Very Large Datasets
With the proliferation of modern high-resolution measuring instruments
mounted on satellites, planes, ground-based vehicles and monitoring stations, a
need has arisen for statistical methods suitable for the analysis of large
spatial datasets observed on large spatial domains. Statistical analyses of
such datasets provide two main challenges: First, traditional
spatial-statistical techniques are often unable to handle large numbers of
observations in a computationally feasible way. Second, for large and
heterogeneous spatial domains, it is often not appropriate to assume that a
process of interest is stationary over the entire domain.
We address the first challenge by using a model combining a low-rank
component, which allows for flexible modeling of medium-to-long-range
dependence via a set of spatial basis functions, with a tapered remainder
component, which allows for modeling of local dependence using a compactly
supported covariance function. Addressing the second challenge, we propose two
extensions to this model that result in increased flexibility: First, the model
is parameterized based on a nonstationary Matérn covariance, where the
parameters vary smoothly across space. Second, in our fully Bayesian model, all
components and parameters are considered random, including the number,
locations, and shapes of the basis functions used in the low-rank component.
Using simulated data and a real-world dataset of high-resolution soil
measurements, we show that both extensions can result in substantial
improvements over the current state-of-the-art.
Comment: 16 pages, 2 color figures
Non-Gaussian bivariate modelling with application to atmospheric trace-gas inversion
Atmospheric trace-gas inversion is the procedure by which the sources and
sinks of a trace gas are identified from observations of its mole fraction at
isolated locations in space and time. This is inherently a spatio-temporal
bivariate inversion problem, since the mole-fraction field evolves in space and
time and the flux is also spatio-temporally distributed. Further, the bivariate
model is likely to be non-Gaussian since the flux field is rarely Gaussian.
Here, we use conditioning to construct a non-Gaussian bivariate model, and we
describe some of its properties through auto- and cross-cumulant functions. A
bivariate non-Gaussian, specifically trans-Gaussian, model is then achieved
through the use of Box-Cox transformations, and we facilitate Bayesian
inference by approximating the likelihood in a hierarchical framework.
Trace-gas inversion, especially at high spatial resolution, is frequently
highly sensitive to prior specification. Therefore, unlike conventional
approaches, we assimilate trace-gas inventory information with the
observational data at the parameter layer, thus shifting prior sensitivity from
the inventory itself to its spatial characteristics (e.g., its spatial length
scale). We demonstrate the approach in controlled-experiment studies of methane
inversion, using fluxes extracted from inventories of the UK and Ireland and of
Northern Australia.
Comment: 45 pages, 7 figures
Hierarchical Bayesian auto-regressive models for large space time data with applications to ozone concentration modelling
Increasingly large volumes of space-time data are collected everywhere by mobile computing applications, and in many of these cases temporal data are obtained by registering events, for example telecommunication or web traffic data. Having both the spatial and temporal dimensions adds substantial complexity to data analysis and inference tasks. The computational complexity increases rapidly for fitting Bayesian hierarchical models, as such a task involves repeated inversion of large matrices. The primary focus of this paper is on developing space-time auto-regressive models under the hierarchical Bayesian setup. To handle large data sets, a recently developed Gaussian predictive process approximation method (Banerjee et al. [1]) is extended to include auto-regressive terms of latent space-time processes. Specifically, a space-time auto-regressive process, supported on a set of a smaller number of knot locations, is spatially interpolated to approximate the original space-time process. The resulting model is specified within a hierarchical Bayesian framework and Markov chain Monte Carlo techniques are used to make inference. The proposed model is applied for analysing the daily maximum 8-hour average ground level ozone concentration data from 1997 to 2006 from a large study region in the eastern United States. The developed methods allow accurate spatial prediction of a temporally aggregated ozone summary, known as the primary ozone standard, along with its uncertainty, at any unmonitored location during the study period. Trends in spatial patterns of many features of the posterior predictive distribution of the primary standard, such as the probability of non-compliance with respect to the standard, are obtained and illustrated.