
    Comment on Article by Ferreira and Gamerman

    A utility-function approach to optimal spatial sampling design is a powerful way to quantify what "optimality" means. The emphasis then should be to capture all possible contributions to utility, including scientific impact and the cost of sampling. The resulting sampling plan should contain a component of designed randomness that would allow for a non-parametric design-based analysis if model-based assumptions were in doubt. [arXiv:1509.03410] Comment: Published at http://dx.doi.org/10.1214/15-BA944B in Bayesian Analysis (http://projecteuclid.org/euclid.ba) by the International Society of Bayesian Analysis (http://bayesian.org/).

    An Empirical Bayes Approach for Distributed Estimation of Spatial Fields

    In this paper we consider a network of spatially distributed sensors that collect measurement samples of a spatial field and aim to estimate the entire field in a distributed way (without any central coordinator) by suitably fusing all network data. We propose a general probabilistic model that can handle both partial knowledge of the physics generating the spatial field and purely data-driven inference. Specifically, we adopt an Empirical Bayes approach in which the spatial field is modeled as a Gaussian Process whose mean function is described by parametrized equations. We characterize the Empirical Bayes estimator when nodes are heterogeneous, i.e., perform different numbers of measurements. Moreover, by exploiting the sparsity of both the covariance and the (parametrized) mean function of the Gaussian Process, we are able to design a distributed spatial field estimator. We corroborate the theoretical results with two numerical simulations: a stationary temperature field estimation in which the field is described by a partial differential (heat) equation, and a data-driven inference in which the mean is parametrized by a cubic spline.
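
As a rough illustration of the Empirical Bayes idea in the abstract above, the following sketch estimates the parameters of a Gaussian Process mean by maximizing the marginal likelihood and then plugs them into the usual predictor. It is a centralized toy version, not the paper's distributed estimator. The linear mean, the squared-exponential kernel, the fixed hyperparameters, and the name `empirical_bayes_gp` are all assumptions made for the example.

```python
import numpy as np

def empirical_bayes_gp(x, y, x_new, length_scale=1.0, noise_var=0.01):
    """Plug-in GP prediction with a linear mean m(x) = b0 + b1 * x (assumed form)."""
    def kernel(a, b):
        # Squared-exponential covariance (an assumption, not taken from the paper).
        return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length_scale) ** 2)

    H = np.column_stack([np.ones_like(x), x])         # mean regressors at the data
    Ky = kernel(x, x) + noise_var * np.eye(len(x))    # covariance of noisy observations

    # For fixed covariance parameters, the marginal-likelihood maximizer of the
    # mean parameters is the generalized least-squares estimate.
    Ky_inv_H = np.linalg.solve(Ky, H)
    beta = np.linalg.solve(H.T @ Ky_inv_H, Ky_inv_H.T @ y)

    # Plug the estimate into the GP posterior mean at the new locations.
    resid = y - H @ beta
    H_new = np.column_stack([np.ones_like(x_new), x_new])
    pred = H_new @ beta + kernel(x_new, x) @ np.linalg.solve(Ky, resid)
    return beta, pred

x = np.linspace(0.0, 5.0, 20)
y = 1.0 + 2.0 * x                                     # noise-free linear toy field
beta, pred = empirical_bayes_gp(x, y, np.array([2.5, 6.0]))
```

With exactly linear data the residual vanishes, so the prediction reduces to the fitted mean; with real data the kernel term corrects the mean locally.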

    A spatial analysis of multivariate output from regional climate models

    Climate models have become an important tool in the study of climate and climate change, and ensemble experiments consisting of multiple climate-model runs are used in studying and quantifying the uncertainty in climate-model output. However, there are often only a limited number of model runs available for a particular experiment, and one of the statistical challenges is to characterize the distribution of the model output. To that end, we have developed a multivariate hierarchical approach, at the heart of which is a new representation of a multivariate Markov random field. This approach allows for flexible modeling of the multivariate spatial dependencies, including the cross-dependencies between variables. We demonstrate this statistical model on an ensemble arising from a regional-climate-model experiment over the western United States, and we focus on the projected change in seasonal temperature and precipitation over the next 50 years. Comment: Published at http://dx.doi.org/10.1214/10-AOAS369 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).

    Flexible regression models over river networks

    Many statistical models are available for spatial data, but the vast majority of these assume that spatial separation can be measured by Euclidean distance. Data collected over river networks constitute a notable and commonly occurring exception, where distance must be measured along complex paths and, in addition, account must be taken of the relative flows of water into and out of confluences. Suitable models for this type of data have been constructed based on covariance functions. The aim of the paper is to place the focus on underlying spatial trends by adopting a regression formulation and using methods which allow smooth but flexible patterns. Specifically, kernel methods and penalized splines are investigated, with the latter proving more suitable from both computational and modelling perspectives. In addition to their use in a purely spatial setting, penalized splines also offer a convenient route to the construction of spatiotemporal models, where data are available over time as well as over space. Models which include main effects and spatiotemporal interactions, as well as seasonal terms and interactions, are constructed for data on nitrate pollution in the River Tweed. The results give valuable insight into the changes in water quality in both space and time.

    On statistical approaches to generate Level 3 products from satellite remote sensing retrievals

    Satellite remote sensing of trace gases such as carbon dioxide (CO2) has increased our ability to observe and understand Earth's climate. However, these remote sensing data, specifically Level 2 retrievals, tend to be irregular in space and time, and hence spatio-temporal prediction is required to infer values at any location and time point. Such inferences are not only required to answer important questions about our climate, but they are also needed for validating the satellite instrument, since Level 2 retrievals are generally not co-located with ground-based remote sensing instruments. Here, we discuss statistical approaches to construct Level 3 products from Level 2 retrievals, placing particular emphasis on the strengths and potential pitfalls when using statistical prediction in this context. Following this discussion, we use a spatio-temporal statistical modelling framework known as fixed rank kriging (FRK) to obtain global predictions and prediction standard errors of column-averaged carbon dioxide based on Version 7r and Version 8r retrievals from the Orbiting Carbon Observatory-2 (OCO-2) satellite. The FRK predictions allow us to validate statistically the Level 2 retrievals globally even though the data are at locations and at time points that do not coincide with validation data. Importantly, the validation takes into account the prediction uncertainty, which is dependent both on the temporally-varying density of observations around the ground-based measurement sites and on the spatio-temporal high-frequency components of the trace gas field that are not explicitly modelled. Here, for validation of remotely-sensed CO2 data, we use observations from the Total Carbon Column Observing Network (TCCON). We demonstrate that the resulting FRK product based on Version 8r compares better with TCCON data than that based on Version 7r. Comment: 28 pages, 10 figures, 4 tables.

    Bayesian Nonstationary Spatial Modeling for Very Large Datasets

    With the proliferation of modern high-resolution measuring instruments mounted on satellites, planes, ground-based vehicles and monitoring stations, a need has arisen for statistical methods suitable for the analysis of large spatial datasets observed on large spatial domains. Statistical analyses of such datasets pose two main challenges: First, traditional spatial-statistical techniques are often unable to handle large numbers of observations in a computationally feasible way. Second, for large and heterogeneous spatial domains, it is often not appropriate to assume that a process of interest is stationary over the entire domain. We address the first challenge by using a model combining a low-rank component, which allows for flexible modeling of medium-to-long-range dependence via a set of spatial basis functions, with a tapered remainder component, which allows for modeling of local dependence using a compactly supported covariance function. Addressing the second challenge, we propose two extensions to this model that result in increased flexibility: First, the model is parameterized based on a nonstationary Matérn covariance, where the parameters vary smoothly across space. Second, in our fully Bayesian model, all components and parameters are considered random, including the number, locations, and shapes of the basis functions used in the low-rank component. Using simulated data and a real-world dataset of high-resolution soil measurements, we show that both extensions can result in substantial improvements over the current state-of-the-art. Comment: 16 pages, 2 color figures.
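
The covariance structure described in the abstract above (a low-rank basis-function part plus a tapered local part) can be sketched as follows. This is a minimal stationary, one-dimensional illustration and does not reproduce the paper's nonstationary Bayesian model. The Gaussian basis functions, the exponential local covariance, the spherical taper, all parameter values, and the function names are assumptions chosen for the example.

```python
import numpy as np

def spherical_taper(d, radius):
    """Spherical correlation: compactly supported, exactly zero at and beyond `radius`."""
    h = np.minimum(d / radius, 1.0)
    return 1.0 - 1.5 * h + 0.5 * h**3

def low_rank_plus_taper(locs, knots, basis_scale, range_local, taper_radius):
    """Return the low-rank and tapered covariance parts for 1-D locations."""
    # Low-rank part S K S^T: captures medium-to-long-range dependence.
    d_knots = np.abs(locs[:, None] - knots[None, :])
    S = np.exp(-0.5 * (d_knots / basis_scale) ** 2)   # n x r basis matrix
    K = np.eye(len(knots))                            # r x r coefficient covariance
    low_rank = S @ K @ S.T

    # Tapered part: an exponential covariance multiplied elementwise by the taper,
    # so entries vanish for pairs of points at least `taper_radius` apart.
    d = np.abs(locs[:, None] - locs[None, :])
    local = np.exp(-d / range_local) * spherical_taper(d, taper_radius)
    return low_rank, local

locs = np.linspace(0.0, 10.0, 200)
knots = np.linspace(0.0, 10.0, 8)
low_rank, local = low_rank_plus_taper(locs, knots, 1.5, 0.5, 1.0)
cov = low_rank + local
```

The low-rank part has rank at most the number of knots, and the tapered part is mostly zeros, which is what makes computation with large numbers of observations feasible in this kind of model.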

    Non-Gaussian bivariate modelling with application to atmospheric trace-gas inversion

    Atmospheric trace-gas inversion is the procedure by which the sources and sinks of a trace gas are identified from observations of its mole fraction at isolated locations in space and time. This is inherently a spatio-temporal bivariate inversion problem, since the mole-fraction field evolves in space and time and the flux is also spatio-temporally distributed. Further, the bivariate model is likely to be non-Gaussian since the flux field is rarely Gaussian. Here, we use conditioning to construct a non-Gaussian bivariate model, and we describe some of its properties through auto- and cross-cumulant functions. A bivariate non-Gaussian, specifically trans-Gaussian, model is then achieved through the use of Box-Cox transformations, and we facilitate Bayesian inference by approximating the likelihood in a hierarchical framework. Trace-gas inversion, especially at high spatial resolution, is frequently highly sensitive to prior specification. Therefore, unlike conventional approaches, we assimilate trace-gas inventory information with the observational data at the parameter layer, thus shifting prior sensitivity from the inventory itself to its spatial characteristics (e.g., its spatial length scale). We demonstrate the approach in controlled-experiment studies of methane inversion, using fluxes extracted from inventories of the UK and Ireland and of Northern Australia. Comment: 45 pages, 7 figures.
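
To make the trans-Gaussian idea in the abstract above concrete, the sketch below shows the Box-Cox transformation and its inverse: a Gaussian variable on the transformed scale maps back to a positive, skewed variable on the original scale. The choice of transformation parameter, the simulated values, and the helper names `box_cox` and `inv_box_cox` are assumptions for illustration only, and the example is univariate rather than bivariate.

```python
import numpy as np

def box_cox(y, lam):
    """Box-Cox transform of positive values; reduces to the log when lam == 0."""
    y = np.asarray(y, dtype=float)
    return np.log(y) if lam == 0 else (y**lam - 1.0) / lam

def inv_box_cox(z, lam):
    """Inverse Box-Cox transform; requires lam * z + 1 > 0 when lam != 0."""
    z = np.asarray(z, dtype=float)
    return np.exp(z) if lam == 0 else (lam * z + 1.0) ** (1.0 / lam)

rng = np.random.default_rng(0)
lam = 0.5                                   # assumed transformation parameter
z = rng.normal(2.0, 0.3, size=1000)         # Gaussian on the transformed scale
flux = inv_box_cox(z, lam)                  # positive, right-skewed on the original scale
```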

    Hierarchical Bayesian auto-regressive models for large space time data with applications to ozone concentration modelling

    Increasingly large volumes of space-time data are collected everywhere by mobile computing applications, and in many of these cases temporal data are obtained by registering events, for example telecommunication or web traffic data. Having both the spatial and temporal dimensions adds substantial complexity to data analysis and inference tasks. The computational complexity increases rapidly for fitting Bayesian hierarchical models, as such a task involves repeated inversion of large matrices. The primary focus of this paper is on developing space-time auto-regressive models under the hierarchical Bayesian setup. To handle large data sets, a recently developed Gaussian predictive process approximation method (Banerjee et al. [1]) is extended to include auto-regressive terms of latent space-time processes. Specifically, a space-time auto-regressive process, supported on a smaller number of knot locations, is spatially interpolated to approximate the original space-time process. The resulting model is specified within a hierarchical Bayesian framework, and Markov chain Monte Carlo techniques are used to make inference. The proposed model is applied to analyse the daily maximum 8-hour average ground level ozone concentration data from 1997 to 2006 from a large study region in the eastern United States. The developed methods allow accurate spatial prediction of a temporally aggregated ozone summary, known as the primary ozone standard, along with its uncertainty, at any unmonitored location during the study period. Trends in spatial patterns of many features of the posterior predictive distribution of the primary standard, such as the probability of non-compliance with respect to the standard, are obtained and illustrated.
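
The knot-based approximation in the abstract above can be sketched in a few lines: an auto-regressive process is simulated at a small set of knots and then spatially interpolated to all sites with kriging-type weights. This is only a simulation of the latent process, not the paper's hierarchical model or its MCMC fitting. The one-dimensional locations, the exponential covariance, the parameter values, and the function names are assumptions for illustration.

```python
import numpy as np

def exp_cov(a, b, range_par):
    """Exponential covariance between two sets of 1-D locations."""
    return np.exp(-np.abs(a[:, None] - b[None, :]) / range_par)

def interp_weights(sites, knots, range_par):
    """Weights C_sk C_kk^{-1} that map knot values to site values."""
    C_kk = exp_cov(knots, knots, range_par)
    C_sk = exp_cov(sites, knots, range_par)
    return np.linalg.solve(C_kk, C_sk.T).T

def simulate_ar_knot_process(knots, n_times, rho, range_par, rng):
    """AR(1) in time at the knots, with spatially correlated innovations."""
    L = np.linalg.cholesky(exp_cov(knots, knots, range_par))
    w = np.zeros((n_times, len(knots)))
    # The initial state is drawn from the innovation distribution for simplicity.
    w[0] = L @ rng.standard_normal(len(knots))
    for t in range(1, n_times):
        w[t] = rho * w[t - 1] + L @ rng.standard_normal(len(knots))
    return w

rng = np.random.default_rng(1)
knots = np.linspace(0.0, 10.0, 5)
sites = np.linspace(0.0, 10.0, 50)
w_knots = simulate_ar_knot_process(knots, 30, 0.7, 2.0, rng)
w_sites = w_knots @ interp_weights(sites, knots, 2.0).T   # n_times x n_sites
```

Only the knot-by-knot covariance matrix is ever factorized or solved against, which is the source of the computational savings when the number of sites is large.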