    SciKit-GStat 1.0: a SciPy-flavored geostatistical variogram estimation toolbox written in Python

    Geostatistical methods are widely used in almost all geoscientific disciplines, e.g., for interpolation, rescaling, data assimilation or modeling. At its core, geostatistics aims to detect, quantify, describe, analyze and model the spatial covariance of observations. The variogram, a tool to describe this spatial covariance in a formalized way, is at the heart of every such method. Unfortunately, many applications of geostatistics focus on the interpolation method or the result rather than on the quality of the estimated variogram, not least because estimating a variogram is commonly left to the computer, and some software implementations do not even show the variogram to the user. This is a missed opportunity, because the quality of the variogram largely determines whether the application of geostatistics makes sense at all. Furthermore, until a few years ago the Python programming language lacked a mature, well-established and tested package for variogram estimation. Here I present SciKit-GStat, an open-source Python package for variogram estimation that fits well into established frameworks for scientific computing and puts the focus on the variogram before more sophisticated methods are applied. SciKit-GStat is written in a mutable, object-oriented way that mimics the typical geostatistical analysis workflow. Its main strength is its ease of use and interactivity, and it is therefore usable with little or even no knowledge of Python. Over the last few years, other Python libraries covering geostatistics have developed alongside SciKit-GStat, and the most important of these can now be interfaced from SciKit-GStat. Additionally, established data structures for scientific computing are reused internally, sparing the user from having to learn complex data models just to use SciKit-GStat. Common data structures and powerful interfaces enable the user to combine SciKit-GStat with other packages in established workflows rather than forcing the user to adopt the author's programming paradigms. SciKit-GStat ships with a large number of predefined procedures, algorithms and models, such as variogram estimators, theoretical spatial models and binning algorithms. Common approaches to estimating variograms are covered and can be used out of the box. At the same time, the base class is very flexible and can be adapted to less common problems as well. Last but not least, care was taken to aid the user as much as possible in implementing new procedures or even extending the core functionality, so that SciKit-GStat can be extended to uncovered use cases. With broad documentation, a user guide, tutorials and good unit-test coverage, SciKit-GStat enables the user to focus on variogram estimation rather than implementation details.
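
    As a sketch of the object-oriented workflow the abstract describes, the central entry point is the Variogram class; the synthetic data and parameter choices below are illustrative and not taken from the paper:

    ```python
    import numpy as np
    from skgstat import Variogram

    # Synthetic observations: 100 random 2D locations with a dummy field.
    rng = np.random.default_rng(42)
    coords = rng.uniform(0, 100, size=(100, 2))
    values = rng.normal(10, 2, size=100)

    # One object carries the whole analysis: lag binning, the empirical
    # estimator and the fitted theoretical model.
    V = Variogram(coords, values,
                  estimator='matheron',  # classical Matheron semi-variance estimator
                  model='spherical',     # theoretical model fitted to the bins
                  n_lags=15)

    print(V.describe())  # fitted nugget, sill and effective range
    V.plot()             # experimental variogram with the fitted model
    ```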

    Capturing Multivariate Spatial Dependence: Model, Estimate and then Predict

    Physical processes rarely occur in isolation; rather, they influence and interact with one another. Thus, there is great benefit in modeling potential dependence both between spatial locations and between different processes. It is the interaction between these two dependencies that is the focus of Genton and Kleiber's paper under discussion. We see the problem of ensuring that any multivariate spatial covariance matrix is nonnegative definite as important, but we also see it as a means to an end. That "end" is solving the scientific problem of predicting a multivariate field. [arXiv:1507.08017] Comment: published at http://dx.doi.org/10.1214/15-STS517 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
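
    For context, one standard construction that guarantees a nonnegative definite multivariate spatial covariance (a textbook device, not the specific proposal under discussion) is the linear model of coregionalization:

    $$C_{ij}(\mathbf{h}) = \sum_{k=1}^{K} b_{ij}^{(k)} \rho_k(\mathbf{h}), \qquad i, j = 1, \dots, p,$$

    where each $\rho_k$ is a valid univariate correlation function and each coregionalization matrix $\mathbf{B}_k = [b_{ij}^{(k)}]$ is nonnegative definite; the resulting matrix-valued function is then a valid cross-covariance for the $p$-variate field.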

    A disposition of interpolation techniques

    A large collection of interpolation techniques is available for application in environmental research. To help environmental scientists choose an appropriate technique, a disposition is made based on 1) applicability in space, time and space-time, 2) quantification of the accuracy of interpolated values, 3) incorporation of ancillary information, and 4) incorporation of process knowledge. The described methods include inverse distance weighting, nearest neighbour methods, geostatistical interpolation methods, Kalman filter methods, Bayesian Maximum Entropy methods, etc. The applicability of the methods to aggregation (upscaling) and disaggregation (downscaling) is discussed. Software for interpolation is described. The application of interpolation techniques is illustrated in two case studies: temporal interpolation of indicators of ecological water quality, and spatio-temporal interpolation and aggregation of pesticide concentrations in Dutch surface waters. A valuable next step will be to construct a decision tree or decision support system that guides the environmental scientist to easy-to-use software implementations appropriate for their interpolation problem. Validation studies are needed to assess the quality of interpolated values and the quality of the information on uncertainty provided by the interpolation method.
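
    To illustrate the simplest of the listed methods, here is a minimal inverse distance weighting sketch; it is a generic illustration, not code from the paper:

    ```python
    import numpy as np

    def idw_interpolate(xy_known, z_known, xy_query, power=2.0):
        """Inverse distance weighting: weight w_i = 1 / d_i**power."""
        # Pairwise distances between query points and known points.
        d = np.linalg.norm(xy_query[:, None, :] - xy_known[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # avoid division by zero at known locations
        w = 1.0 / d**power
        return (w @ z_known) / w.sum(axis=1)

    # Toy usage: interpolate from five stations onto two query points.
    stations = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5]], dtype=float)
    obs = np.array([1.0, 2.0, 3.0, 4.0, 2.5])
    print(idw_interpolate(stations, obs, np.array([[0.25, 0.25], [0.9, 0.9]])))
    ```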

    Geostatistical spatiotemporal modelling with application to the western king prawn of the Shark Bay managed prawn fishery

    Geostatistical methodology has been employed in the modelling of spatiotemporal data from various scientific fields by viewing the data as realisations of space-time random functions. Traditional geostatistics aims to model the spatial variability of a process, so, in order to incorporate a time dimension into a geostatistical model, the fundamental differences between the space and time dimensions must be acknowledged and addressed. The main conceptual viewpoint of geostatistical spatiotemporal modelling identified within the literature treats the process as a single random function model utilising a joint space-time covariance function to model the spatiotemporal continuity.

    Geostatistical space-time modelling has been primarily data driven, resulting in models that are suited to the data under investigation, usually survey data involving fixed locations. Space-time geostatistical modelling of fish stocks within the fishing season is limited, as the collection of fishery-independent survey data for the spatiotemporal sampling design is often costly or impractical. However, fishery-dependent commercial catch and effort data, collected throughout each season, are available for many fisheries as part of the ongoing monitoring programs that support their stock assessment and fishery management. An example of such data is the prawn catch and effort data from the Shark Bay managed prawn fishery in Western Australia. The data are densely informed in both the spatial and temporal dimensions and cover a range of locations at each time instant. Both the catch and effort variables display an obvious spatiotemporal continuity across the fishing region and throughout the season. There is detailed spatial and temporal resolution, as skippers record their daily fishing shots with associated latitudinal and longitudinal positions.

    In order to facilitate the ongoing management of the fishery, an understanding of the within-season spatiotemporal dynamics of the various prawn species is necessary, and a suitable spatiotemporal model is required to effectively capture the joint space-time dependence of the prawn data. An exhaustive literature search suggests that this is the first application of geostatistical space-time modelling to commercial fishery data, with the development and evaluation of an integrated space-time geostatistical model that caters for the commercial logbook prawn catch and effort data from the Shark Bay fishery. The model developed in this study utilises the global temporal trend observed in the data to standardise the catch rates. Geostatistical spatiotemporal variogram modelling was shown to accurately represent the spatiotemporal continuity of the catch data and was used to predict and simulate catch rates at unsampled locations and future time instants within a season. In addition, fishery-independent survey data were used to help improve the performance of the catch rate estimates.
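
    The abstract does not name the joint covariance family that was fitted; as an illustration of what a valid non-separable space-time model can look like, the widely used product-sum construction is

    $$C_{st}(h_s, h_t) = k_1 C_s(h_s) C_t(h_t) + k_2 C_s(h_s) + k_3 C_t(h_t),$$

    where $C_s$ and $C_t$ are valid purely spatial and purely temporal covariance functions and $k_1 > 0$, $k_2 \geq 0$, $k_3 \geq 0$, which guarantees a valid spatiotemporal covariance without assuming space-time separability.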

    Road distance and travel time for an improved house price Kriging predictor

    The paper designs an automated valuation model to predict the price of residential property in Coventry, United Kingdom, and achieves this by means of geostatistical Kriging, a popularly employed distance-based learning method. Unlike traditional applications of distance-based learning, this paper implements non-Euclidean distance metrics by approximating road distance, travel time and a linear combination of both, which the paper hypothesizes to be more strongly related to house prices than straight-line (Euclidean) distance. Because a valid variogram must be produced to undertake Kriging, the paper exploits the conforming properties of the Minkowski distance function to approximate road distance and travel time metrics. A least squares approach is put forth for variogram parameter selection, and an ordinary Kriging predictor is implemented for interpolation. The predictor is then validated with 10-fold cross-validation and a spatially aware checkerboard hold-out method against the almost exclusively employed Euclidean metric. Comparing results across the distance metrics, the paper reports a goodness of fit (r²) of 0.6901 ± 0.18 SD for real estate price prediction, whereas the traditional (Euclidean) approach obtains a suboptimal r² value of 0.66 ± 0.21 SD.
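
    As a minimal sketch of the Minkowski distance family the paper exploits (the coordinates and the interpretation of p = 1 as a road-distance proxy are illustrative assumptions, not values from the paper):

    ```python
    import numpy as np
    from scipy.spatial.distance import minkowski

    # Minkowski distance of order p: d(x, y) = (sum_i |x_i - y_i|**p) ** (1/p).
    # p = 2 recovers straight-line (Euclidean) distance; p = 1 gives the
    # Manhattan metric, a crude grid-like proxy for road distance.
    x = np.array([0.0, 0.0])  # hypothetical property locations in projected
    y = np.array([3.0, 4.0])  # coordinates (e.g. metres), not real data

    print(minkowski(x, y, p=2))  # 5.0 (Euclidean)
    print(minkowski(x, y, p=1))  # 7.0 (Manhattan)
    ```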