Compression and Conditional Emulation of Climate Model Output
Numerical climate model simulations run at high spatial and temporal
resolutions generate massive quantities of data. As our computing capabilities
continue to increase, storing all of the data is not sustainable, and thus it
is important to develop methods for representing the full datasets by smaller
compressed versions. We propose a statistical compression and decompression
algorithm based on storing a set of summary statistics as well as a statistical
model describing the conditional distribution of the full dataset given the
summary statistics. The statistical model can be used to generate realizations
representing the full dataset, along with characterizations of the
uncertainties in the generated data. Thus, the methods are capable of both
compression and conditional emulation of the climate models. Considerable
attention is paid to accurately modeling the original dataset (one year of
daily mean temperature data), particularly with regard to the inherent spatial
nonstationarity in global fields, and to determining the statistics to be
stored, so that the variation in the original data can be closely captured
while allowing for fast decompression and conditional emulation on modest
computers.
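As a rough illustration of the compress-then-conditionally-emulate idea (a minimal sketch, not the authors' algorithm), the code below stores linear summary statistics of a Gaussian field and regenerates realizations from the conditional distribution given those statistics; the grid size, exponential covariance, and block-mean summary operator are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative "full dataset": a Gaussian field on a 1-D grid (assumed setup).
n = 200
x = np.linspace(0.0, 1.0, n)
Sigma = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.1)  # exponential covariance
mu = np.zeros(n)
y = rng.multivariate_normal(mu, Sigma)                   # field to be compressed

# Compression: store only block means over k consecutive points (s = A @ y).
k = 10
A = np.kron(np.eye(n // k), np.full((1, k), 1.0 / k))    # summary operator
s = A @ y                                                # statistics kept on disk

# Decompression: sample from the conditional distribution y | s using
# standard Gaussian conditioning with the stored model (mu, Sigma).
C = Sigma @ A.T
K = np.linalg.inv(A @ C)                                 # (A Sigma A^T)^{-1}
cond_mean = mu + C @ K @ (s - A @ mu)
cond_cov = Sigma - C @ K @ C.T + 1e-9 * np.eye(n)        # jitter for stability

# Each draw is a plausible reconstruction; their spread quantifies uncertainty.
emulations = rng.multivariate_normal(cond_mean, cond_cov, size=5)
print("stored values:", s.size, "emulated fields:", emulations.shape)
```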
High-Dimensional Bayesian Geostatistics
With the growing capabilities of Geographic Information Systems (GIS) and
user-friendly software, statisticians today routinely encounter geographically
referenced data containing observations from a large number of spatial
locations and time points. Over the last decade, hierarchical spatiotemporal
process models have become widely deployed statistical tools for researchers to
better understand the complex nature of spatial and temporal variability.
However, fitting hierarchical spatiotemporal models often involves expensive
matrix computations whose complexity increases cubically in the number of
spatial locations and temporal points. This renders such models infeasible for
large data sets. This article offers a focused review of two methods for
constructing well-defined highly scalable spatiotemporal stochastic processes.
Both these processes can be used as "priors" for spatiotemporal random fields.
The first approach constructs a low-rank process operating on a
lower-dimensional subspace. The second approach constructs a Nearest-Neighbor
Gaussian Process (NNGP) that ensures sparse precision matrices for its finite
realizations. Both processes can be exploited as a scalable prior embedded
within a rich hierarchical modeling framework to deliver full Bayesian
inference. These approaches can be described as model-based solutions for big
spatiotemporal datasets. The models ensure that the algorithmic complexity has
~n floating point operations (flops), where n is the number of spatial
locations (per iteration). We compare these methods and provide some insight
into their methodological underpinnings.
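To make the NNGP construction concrete, here is a minimal one-dimensional sketch of the standard nearest-neighbor conditioning idea (not the paper's implementation): each ordered location conditions on its m nearest predecessors, which yields a sparse factor of the precision matrix. The covariance function, number of neighbors, and dense linear algebra are simplifying assumptions.

```python
import numpy as np

def exp_cov(d, range_=0.2):
    """Exponential covariance as a function of distance (assumed kernel)."""
    return np.exp(-d / range_)

rng = np.random.default_rng(1)
n, m = 500, 10                                   # locations and neighbors (assumed)
locs = np.sort(rng.uniform(0.0, 1.0, n))         # ordered 1-D locations

# Build the sparse factor: y_i = B[i, :] @ y + sqrt(F[i]) * z_i, with B[i, :]
# supported only on the m nearest previously ordered neighbors of location i.
B = np.zeros((n, n))
F = np.zeros(n)
F[0] = exp_cov(0.0)
for i in range(1, n):
    nb = np.argsort(np.abs(locs[:i] - locs[i]))[:m]  # nearest earlier neighbors
    C_nn = exp_cov(np.abs(locs[nb][:, None] - locs[nb][None, :]))
    c_in = exp_cov(np.abs(locs[nb] - locs[i]))
    w = np.linalg.solve(C_nn, c_in)                  # kriging weights
    B[i, nb] = w
    F[i] = exp_cov(0.0) - c_in @ w                   # conditional variance

# The implied precision matrix Q = (I - B)^T diag(1/F) (I - B) is sparse,
# which is what makes full Bayesian inference scale to large n.
IB = np.eye(n) - B
Q = IB.T @ (IB / F[:, None])
print("average nonzeros per row of Q:", (np.abs(Q) > 1e-12).sum() / n)

# Simulating a realization only needs a (sparse) triangular solve.
y = np.linalg.solve(IB, np.sqrt(F) * rng.standard_normal(n))
```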
Covariance approximation for large multivariate spatial data sets with an application to multiple climate model errors
This paper investigates the cross-correlations across multiple climate model
errors. We build a Bayesian hierarchical model that accounts for the spatial
dependence of individual models as well as cross-covariances across different
climate models. Our method allows for a nonseparable and nonstationary
cross-covariance structure. We also present a covariance approximation approach
to facilitate the computation in the modeling and analysis of very large
multivariate spatial data sets. The covariance approximation consists of two
parts: a reduced-rank part to capture the large-scale spatial dependence, and a
sparse covariance matrix to correct the small-scale dependence error induced by
the reduced rank approximation. We pay special attention to the case that the
second part of the approximation has a block-diagonal structure. Simulation
results of model fitting and prediction show substantial improvement of the
proposed approximation over the predictive process approximation and the
independent blocks analysis. We then apply our computational approach to the
joint statistical modeling of multiple climate model errors.
Comment: Published at http://dx.doi.org/10.1214/11-AOAS478 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
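A minimal sketch in the spirit of the approximation described above, assuming a one-dimensional domain, an exponential covariance, and arbitrary knot and block choices: the reduced-rank part captures large-scale dependence, and the block-diagonal residual corrects the small-scale error it induces.

```python
import numpy as np

def exp_cov(x1, x2, range_=0.1):
    """Exponential covariance between two sets of 1-D locations (assumed kernel)."""
    return np.exp(-np.abs(x1[:, None] - x2[None, :]) / range_)

n, n_knots, block = 400, 20, 40                    # sizes are illustrative
x = np.linspace(0.0, 1.0, n)
knots = np.linspace(0.0, 1.0, n_knots)

Sigma = exp_cov(x, x)                              # exact covariance, for reference

# Reduced-rank part (predictive-process style): captures large-scale dependence.
C_xk = exp_cov(x, knots)
low_rank = C_xk @ np.linalg.solve(exp_cov(knots, knots), C_xk.T)

# Block-diagonal sparse part: corrects the small-scale error of the low-rank term.
resid = Sigma - low_rank
mask = np.zeros((n, n), dtype=bool)
for s in range(0, n, block):
    mask[s:s + block, s:s + block] = True
approx = low_rank + np.where(mask, resid, 0.0)

# The correction recovers most of what the reduced-rank part alone misses.
err_lr = np.linalg.norm(Sigma - low_rank) / np.linalg.norm(Sigma)
err_full = np.linalg.norm(Sigma - approx) / np.linalg.norm(Sigma)
print(f"relative error: low-rank only {err_lr:.3f}, low-rank + blocks {err_full:.3f}")
```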
A high resolution coupled hydrologic–hydraulic model (HiResFlood-UCI) for flash flood modeling
HiResFlood-UCI was developed by coupling the NWS's hydrologic model (HL-RDHM) with the hydraulic model (BreZo) for flash flood modeling at decameter resolutions. The coupled model uses HL-RDHM as a rainfall-runoff generator and replaces the routing scheme of HL-RDHM with the 2D hydraulic model (BreZo) in order to predict localized flood depths and velocities. A semi-automated technique of unstructured mesh generation was developed to cluster an adequate density of computational cells along river channels, so that numerical errors are negligible compared with other sources of error while the computational cost of the hydraulic model is kept to a bare minimum. HiResFlood-UCI was implemented for a watershed (ELDO2) in the DMIP2 experiment domain in Oklahoma. Using synthetic precipitation input, the model was tested for various components, including HL-RDHM parameters (a priori versus calibrated), channel and floodplain Manning n values, DEM resolution (10 m versus 30 m), and computational mesh resolution (10 m+ versus 30 m+). Simulations with calibrated versus a priori parameters of HL-RDHM show that HiResFlood-UCI produces reasonable results with the a priori parameters from the NWS. Sensitivities to hydraulic model resistance parameters, mesh resolution, and DEM resolution are also identified, pointing to the importance of model calibration and validation for accurate prediction of localized flood intensities. HiResFlood-UCI performance was further examined using six measured precipitation events as model input, with calibration and validation of the streamflow at the outlet; the Nash–Sutcliffe Efficiency (NSE) obtained ranges from 0.588 to 0.905. The model was also validated against the observed flood map using a USGS water level record at an interior point: the predicted flood stage error is 0.82 m or less, based on a comparison to the measured stage. Validation of stage and discharge predictions builds confidence in model predictions of flood extent and localized velocities, which are fundamental to reliable flash flood warning.
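The NSE score quoted above has a standard closed form; the following is a minimal sketch of how it is computed, with purely illustrative observed and simulated discharge series (the values are not from the paper).

```python
import numpy as np

def nse(sim, obs):
    """Nash-Sutcliffe Efficiency: 1 - SSE / variance of observations; 1 is perfect."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

# Hypothetical observed and simulated discharge series (m^3/s), not from the paper.
obs = np.array([1.2, 3.4, 8.9, 6.1, 2.5, 1.8])
sim = np.array([1.0, 3.9, 8.1, 6.8, 2.2, 1.5])
print(f"NSE = {nse(sim, obs):.3f}")
```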