Statistical Analyses of Ozone Temporal Trends in Calgary, Alberta: an Application of Multivariate Geostatistics
The prediction of tropospheric (surface) ozone episodes is a challenging task that requires
the integration of physicochemical and statistical techniques. Governmental agencies
such as the U.S. Environmental Protection Agency (EPA) and Alberta Environment favor
physicochemical modeling in order to capture the complexity of the underlying physical
processes. Unlike physicochemical models, statistical techniques are usually based on
spatial and/or temporal correlations between relevant variates. The statistical models also
require less exhaustive data sets for accurate predictions; this major advantage is perhaps
more obvious when ozone prediction is performed for a longer period of interest.
The primary objective of this research is to investigate statistical techniques for
modeling ozone and/or other pollutant concentrations given only sparse environmental
records at the monitoring stations. Straightforward linear regression based techniques are
implemented initially but the inadequacy of these approaches for predicting detailed
temporal ozone variations is verified by the results. Then geostatistical paradigms of
kriging and sequential stochastic simulation are implemented to incorporate temporal
correlation in the form of a variogram. Secondary variables (covariates) can also be useful
for providing extra information and their influence is accounted for in cokriging and cosimulation.
The positive-definiteness of auto and cross-covariances is ensured via a
linear model of coregionalization (LMC). The "two-point" statistic (variogram) is found
to be insufficient and hence this thesis strives to explore methodologies for modeling the
highly fluctuating temporal profiles with a view to providing a sound framework for
subsequent extensions to spatiotemporal modeling.
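
As a minimal sketch of the "two-point" statistic discussed in this abstract, the empirical temporal variogram of a regularly sampled series can be computed as half the mean squared difference between values separated by a given lag. The function name `empirical_variogram` and the synthetic hourly series are illustrative assumptions, not data or code from the thesis.

```python
import numpy as np

def empirical_variogram(z, max_lag):
    """Empirical semivariogram gamma(h), h = 1..max_lag, of a regularly sampled series.

    gamma(h) = 0.5 * mean((z[t + h] - z[t])**2)
    """
    z = np.asarray(z, dtype=float)
    gammas = []
    for h in range(1, max_lag + 1):
        diffs = z[h:] - z[:-h]          # all pairs separated by lag h
        gammas.append(0.5 * np.mean(diffs ** 2))
    return np.array(gammas)

# Hypothetical hourly "ozone-like" series: a diurnal cycle plus noise (synthetic).
rng = np.random.default_rng(0)
t = np.arange(24 * 14)
ozone = 30 + 15 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 3, t.size)

gamma = empirical_variogram(ozone, max_lag=48)
# gamma[h - 1] holds the semivariance at lag h (in hours).
```

For a series with a strong daily cycle, the semivariance at a 24-hour lag is expected to fall well below that at a 12-hour lag, which is one way periodic structure shows up in a temporal variogram.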
High-dimensional Bayesian optimization with intrinsically low-dimensional response surfaces
Bayesian optimization is a powerful technique for the optimization of expensive black-box functions. It is used in a wide range of applications, such as drug and material design and the training of machine learning models, e.g., large deep networks. We propose to extend this approach to high-dimensional settings, that is, where the number of parameters to be optimized exceeds 10 to 20. In this thesis, we scale Bayesian optimization by exploiting different types of projections and the assumption that the objective function is intrinsically low-dimensional. We reformulate the problem in a low-dimensional subspace, learn a response surface, and maximize an acquisition function in this low-dimensional projection. Contributions include i) a probabilistic model for axis-aligned projections, such as the quantile-Gaussian process, and ii) a probabilistic model for learning a feature space by means of manifold Gaussian processes. In the latter contribution, we propose to learn a low-dimensional feature space jointly with (a) the response surface and (b) a reconstruction mapping. Finally, we present empirical results against well-known baselines in high-dimensional Bayesian optimization and provide possible directions for future research in this field.
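
To illustrate the idea of reformulating a high-dimensional problem in a low-dimensional subspace, the sketch below uses a fixed random linear embedding. This is a swapped-in, simpler technique than the axis-aligned and manifold Gaussian process models the thesis proposes, and plain random search stands in for the surrogate model and acquisition step. The toy objective `f`, the helper `make_embedded_objective`, and all dimensions are hypothetical.

```python
import numpy as np

def make_embedded_objective(f, A, lower, upper):
    """Wrap a D-dimensional objective f as a d-dimensional one via x = clip(A @ z)."""
    def g(z):
        x = np.clip(A @ z, lower, upper)   # map back into the original box
        return f(x)
    return g

rng = np.random.default_rng(0)
D, d = 50, 2                               # ambient and embedding dimensions (assumed)

def f(x):
    # Toy objective: only coordinates 3 and 17 affect the value.
    return (x[3] - 0.5) ** 2 + (x[17] + 0.25) ** 2

A = rng.normal(size=(D, d))                # random linear embedding (assumption)
g = make_embedded_objective(f, A, -1.0, 1.0)

# Random search in the low-dimensional space, standing in for the
# response-surface model and acquisition-function maximization.
candidates = rng.uniform(-1, 1, size=(2000, d))
candidates[0] = 0.0                        # always include the origin
values = np.array([g(z) for z in candidates])
best_z = candidates[np.argmin(values)]
best_x = np.clip(A @ best_z, -1.0, 1.0)    # corresponding point in the original space
```

The search only ever touches `d` coordinates, yet each candidate is evaluated on the full `D`-dimensional objective, which is the property a low-dimensional reformulation relies on.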
MAPPING AND DECOMPOSING SCALE-DEPENDENT SOIL MOISTURE VARIABILITY WITHIN AN INNER BLUEGRASS LANDSCAPE
There is a shared desire among public and private sectors to make more reliable predictions, accurate mapping, and appropriate scaling of soil moisture and associated parameters across landscapes. A discrepancy often exists between the scale at which soil hydrologic properties are measured and the scale at which they are modeled for management purposes. Moreover, little is known about the relative importance of hydrologic modeling parameters as soil moisture fluctuates with time. More research is needed to establish which observation scales in space and time are optimal for managing soil moisture variation over large spatial extents and how these scales are affected by fluctuations in soil moisture content with time. This research fuses high-resolution geoelectric and light detection and ranging (LiDAR) data as auxiliary measures to support sparse direct soil sampling over a 40 hectare Inner Bluegrass, Kentucky (USA) landscape. A Veris 3100 was used to measure shallow and deep apparent electrical conductivity (aEC) in tandem with soil moisture sampling on three separate dates with ascending soil moisture contents ranging from plant wilting point to near field capacity. Terrain attributes were produced from 2010 LiDAR ground returns collected at ≤1 m nominal pulse spacing. Exploratory statistics revealed the variables best associated with soil moisture, including terrain features (slope, profile curvature, and elevation), soil physical and chemical properties (calcium, cation exchange capacity, organic matter, clay, and sand), and aEC for each date. Multivariate geostatistics, time stability analyses, and spatial regression were performed to characterize scale-dependent soil moisture patterns in space with time and to determine which soil-terrain parameters influence soil moisture distribution. Results showed that soil moisture variation was time stable across the landscape and primarily associated with long-range (~250 m) soil physicochemical properties.
When the soils approached field capacity, however, there was a shift in relative importance from long-range soil physicochemical properties to short-range (~70 m) terrain attributes, although this shift did not cause time instability. The results suggest that soil moisture's interaction with soil-terrain parameters is time dependent and that this dependence influences which observation scale is optimal for sampling and managing soil moisture variation.
On Simplified Bayesian Modeling for Massive Geostatistical Datasets: Conjugacy and Beyond
With continued advances in Geographic Information Systems and related computational technologies, researchers in diverse fields such as forestry, environmental health, and climate sciences have growing interest in analyzing large-scale data sets measured at a substantial number of geographic locations. Geostatistical models used to capture the space-varying relationships in such data are often accompanied by onerous computations that prohibit the analysis of large-scale spatial data sets. Less burdensome alternatives proposed recently for analyzing massive spatial data sets often lead to inaccurate inference or require slow sampling processes. Bayesian inference, while attractive for accommodating uncertainties through its hierarchical structures, can become computationally onerous for modeling massive spatial data sets because of its reliance on iterative estimation algorithms. My dissertation research aims at developing computationally scalable Bayesian geostatistical models that provide valid inference through a highly accelerated sampling process. We also study the asymptotic properties of estimators in spatial analysis. In Chapters 2 and 3, we develop conjugate Bayesian frameworks for analyzing univariate and multivariate spatial data. We propose a conjugate latent Nearest-Neighbor Gaussian Process (NNGP) model in Chapter 2, which uses analytically tractable posterior distributions to obtain posterior inferences, including for the large-dimensional latent process. In Chapter 3, we focus on building conjugate Bayesian frameworks for analyzing multivariate spatial data. We utilize a Matrix-Normal Inverse-Wishart (MNIW) prior to propose conjugate Bayesian frameworks and algorithms that can incorporate a family of scalable spatial modeling methodologies. In Chapter 4, we pursue general Bayesian modeling methodologies beyond conjugate Bayesian hierarchical modeling.
We build scalable versions of a hierarchical linear model of coregionalization (LMC) and spatial factor models, and propose a highly accelerated block update MCMC algorithm. Using the proposed Bayesian LMC model, we extend scalable modeling strategies for a single process to the multivariate process case. All proposed frameworks are tested on simulated data and fit to real data sets with observed locations numbering in the millions. Our contribution is to offer practicing scientists and spatial analysts practical and flexible scalable hierarchical models for analyzing massive spatial data sets. In Chapter 5, we investigate the asymptotic properties of the estimators in spatial analysis. We formally establish results on the identifiability and consistency of the nugget in spatial models based upon the Gaussian process within the framework of in-fill asymptotics, i.e., the sample size increases within a sampling domain that is bounded. We establish the identifiability of parameters in the Matérn covariance function and the consistency of their maximum likelihood estimators in the presence of discontinuities due to the nugget.
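
The linear model of coregionalization named in this abstract builds a multivariate covariance as a sum of positive semi-definite coregionalization matrices, each multiplied by a valid correlation function. The sketch below is a small dense illustration under assumed values (one-dimensional coordinates, exponential correlations, two variables); it is not the scalable Bayesian implementation the dissertation describes, and the names `exp_corr` and `lmc_covariance` are hypothetical.

```python
import numpy as np

def exp_corr(coords, range_):
    """Exponential correlation matrix for 1-D coordinates."""
    d = np.abs(coords[:, None] - coords[None, :])
    return np.exp(-d / range_)

def lmc_covariance(coords, loadings, ranges):
    """Joint covariance of q variables at n locations under an LMC.

    loadings: (q, K) matrix; column k gives B_k = a_k a_k^T (positive semi-definite).
    Returns a (q*n, q*n) matrix ordered variable by variable.
    """
    q, K = loadings.shape
    n = coords.size
    C = np.zeros((q * n, q * n))
    for k in range(K):
        a = loadings[:, [k]]
        B_k = a @ a.T                      # coregionalization matrix for structure k
        C += np.kron(B_k, exp_corr(coords, ranges[k]))
    return C

# Assumed illustrative values, not taken from the source.
coords = np.linspace(0.0, 10.0, 25)
loadings = np.array([[1.0, 0.5],
                     [0.8, -0.3]])
ranges = [1.0, 4.0]

C = lmc_covariance(coords, loadings, ranges)
n = coords.size
auto_cov_var1 = C[:n, :n]                  # auto-covariance of variable 1
cross_cov_12 = C[:n, n:]                   # cross-covariance between variables 1 and 2
min_eig = np.linalg.eigvalsh(C).min()      # non-negative up to rounding error
```

Because each term is a Kronecker product of a positive semi-definite matrix with a valid correlation matrix, the sum is positive semi-definite by construction, so the auto- and cross-covariances never need to be checked separately.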
Scaling of sorption isotherms to quantify the field-scale variability of heavy metal retardation in soil
Taking two agricultural sites (loess, Haplic Luvisols; loamy to sandy soil, Eutric Cambisols) that are representative of the soils of northern Germany as study areas, this dissertation studies the upscaling of the adsorptive binding of heavy metals in soils and its variability by calculating scale factors. The adsorptive binding of heavy metals in soils is mostly quantified by sorption isotherms, which show large variability at the field scale. The aim of this work is to find correlations, by means of scale factors, between the sorption isotherms of different heavy metals and with physicochemical soil properties, so that only a few measurements are necessary to make sufficient statements about heavy metal binding and mobility at the field scale. At both study sites, upscaling captures the linear part of the sorption variability well. A scenario study yielded satisfactory simulations of heavy metal transport, with the scale factors treated as the measure of variability. However, in the statistical and geostatistical studies, no significant correlations were found between the scale factors of different heavy metals or with physicochemical soil properties. Depending on the location and soil horizon, the correlation of scale factors between different heavy metals varied so widely that it was not transferable. In addition, the reference isotherm calculated directly from measurements did not match the sorption isotherm from a composite sample, which indicates that scaling is better suited to homogeneous sites. Thus, the main finding of this dissertation is that the application of scale factors for heavy metal sorption isotherms, for example in statistical or geostatistical evaluation, is limited to specific case studies or scenario modeling.