
    Quality measures for soil surveys by lognormal kriging

    If we know the variogram of a random variable then we can compute the prediction error variances (kriging variances) for kriged estimates of the variable at unsampled sites from sampling grids of different design and density. In this way the kriging variance is a useful pre-survey measure of the quality of statistical predictions, which can be used to design sampling schemes to achieve target quality requirements at minimal cost. However, many soil properties are lognormally distributed, and must be transformed to logarithms before geostatistical analysis. The predicted values on the log scale are then back-transformed. It is possible to compute the prediction error variance for a prediction by this lognormal kriging procedure. However, it does not depend only on the variogram of the variable and the sampling configuration, but also on the conditional mean of the prediction. We therefore cannot use the kriging variance directly as a pre-survey measure of quality for geostatistical surveys of lognormal variables. In this paper we present an alternative. First, we show how the limits of a prediction interval for a variable predicted by lognormal kriging can be expressed as dimensionless quantities, proportions of the unknown median of the conditional distribution. This scaled prediction interval can be used as a pre-survey quality measure since it depends only on the sampling configuration and the variogram of the log-transformed variable. Second, we show how a similar scaled prediction interval can be computed for the median value of a lognormal variable across a block, in the case of block kriging. This approach is then illustrated using variograms of lognormally distributed data on concentration of elements in the soils of a part of eastern England.
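    The scaled interval can be illustrated directly: if the conditional distribution on the log scale is Gaussian with kriging variance sigma^2_K, the 95% limits for the back-transformed value are exp(+/-1.96 sigma_K) times the conditional median, so they depend only on the sampling configuration and the log-scale variogram. The sketch below computes an ordinary kriging variance for an illustrative square grid and exponential variogram (all parameter values are assumptions, not those of the paper) and reports the corresponding scaled interval.

```python
import numpy as np

def exp_variogram(h, nugget=0.05, sill=0.25, rng_par=200.0):
    """Exponential semivariance model on the log scale (illustrative parameters)."""
    return np.where(h > 0.0, nugget + sill * (1.0 - np.exp(-h / rng_par)), 0.0)

def ok_variance(sample_xy, target_xy, variogram):
    """Ordinary kriging variance at one target point for a given sample configuration."""
    n = len(sample_xy)
    d = np.linalg.norm(sample_xy[:, None, :] - sample_xy[None, :, :], axis=-1)
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = variogram(d)          # semivariances between sample points
    A[:n, n] = 1.0
    A[n, :n] = 1.0
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(sample_xy - target_xy, axis=-1))
    sol = np.linalg.solve(A, b)
    return float(sol[:n] @ b[:n] + sol[n])   # lambda' gamma_0 + Lagrange multiplier

# Illustrative 5 x 5 square sampling grid (spacing in metres) and a target point
spacing = 100.0
grid = np.array([(i * spacing, j * spacing) for i in range(5) for j in range(5)])
target = np.array([250.0, 250.0])               # centre of a grid cell
sk2 = ok_variance(grid, target, exp_variogram)  # kriging variance on the log scale

# Scaled 95% prediction interval: limits expressed as proportions of the median
z = 1.96
lower, upper = np.exp(-z * np.sqrt(sk2)), np.exp(z * np.sqrt(sk2))
print(f"log-scale kriging variance: {sk2:.3f}")
print(f"scaled 95% interval: [{lower:.2f}, {upper:.2f}] x conditional median")
```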

    Mapping seabed sediments of the Fulmar rMCZ

    This report is on work undertaken for the JNCC under an Addendum to the Memorandum of Agreement dated 20 February 2014 between The Scottish Ministers, Natural Environment Research Council (NERC) and JNCC Support co. (JNCC). Under the terms of this Addendum, JNCC requested that BGS carry out geostatistical analysis of sediment sample data from the CEND 8/12 survey of the Fulmar rMCZ in order to produce maps of sediment distribution in the site. A geostatistical analysis of the data is reported, leading to the selection of a linear model of coregionalization for the composition of the sediment, based on the additive log-ratio transformation of data on mud, sand and gravel content. This model is then used for spatial prediction on a 250-m grid. At each grid node a prediction distribution is obtained, conditional on neighbouring data and the selected model. By sampling from this distribution, and back-transforming onto the original compositional simplex of the data, we obtain a conditional expectation for the proportions of sand, gravel and mud at each location, a 95% confidence interval for the value at each node, and the probability that each of the four sediment texture classes that underlie the EUNIS habitat classification is found at the node.
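    The back-transformation step can be sketched as follows, assuming the alr coordinates use gravel as the denominator part and that the kriging output at a node is a bivariate Gaussian prediction distribution. The mean vector, covariance matrix and texture-class thresholds below are illustrative stand-ins, not values from the report; the actual classes follow the Folk scheme that underlies EUNIS.

```python
import numpy as np

def alr(mud, sand, gravel):
    """Additive log-ratio transform, taking gravel as the denominator part."""
    return np.array([np.log(mud / gravel), np.log(sand / gravel)])

def inv_alr(y):
    """Back-transform alr coordinates onto the simplex as (mud, sand, gravel)."""
    e = np.exp(np.append(y, 0.0))      # gravel corresponds to the zero coordinate
    return e / e.sum()

# Illustrative kriging output at one grid node: mean and covariance of the
# prediction distribution on the alr scale (made-up values, not from the report)
mu = np.array([1.2, 0.8])
cov = np.array([[0.30, 0.10],
                [0.10, 0.25]])

rng = np.random.default_rng(42)
draws = rng.multivariate_normal(mu, cov, size=10000)
comps = np.apply_along_axis(inv_alr, 1, draws)        # columns: mud, sand, gravel

expectation = comps.mean(axis=0)                      # conditional expectation
mud_lo, mud_hi = np.percentile(comps[:, 0], [2.5, 97.5])   # 95% interval for mud

def texture_class(c):
    """Crude stand-in for the Folk-based classes underlying EUNIS; the
    thresholds here are simplified assumptions, not the official limits."""
    mud, sand, gravel = c
    if gravel >= 0.3:
        return "gravelly"
    frac_mud = mud / (mud + sand)
    return "sand" if frac_mud < 0.2 else ("mixed" if frac_mud < 0.8 else "mud")

labels, counts = np.unique([texture_class(c) for c in comps], return_counts=True)
print(expectation, (mud_lo, mud_hi), dict(zip(labels, counts / len(comps))))
```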

    Mapping seabed sediments of the Swallow Sand and South-West Deeps (West) MCZs

    This report is on work undertaken for the JNCC under an Addendum to the Memorandum of Agreement dated 20 February 2014 between The Scottish Ministers, Natural Environment Research Council (NERC) and JNCC Support co. (JNCC). Under the terms of this Addendum, JNCC requested that BGS carry out geostatistical analysis of sediment sample data from the CEND 8/12 survey of the Swallow Sand MCZ and the CEND 6/13 survey of the South-West Deeps (West) MCZ in order to produce maps of sediment distribution in the sites. For each MCZ, a geostatistical analysis of the data is reported, leading to the selection of a robust linear model of coregionalization for the composition of the sediment, based on the additive log-ratio transformation of data on mud, sand and gravel content. This model is then used for spatial prediction on a 250-m grid. At each grid node a prediction distribution is obtained, conditional on neighbouring data and the selected model. By sampling from this distribution, and back-transforming onto the original compositional simplex of the data, we obtain a conditional expectation for the proportions of sand, gravel and mud at each location, a 95% confidence interval for the value at each node, and the probability that each of the four sediment texture classes that underlie the EUNIS habitat classification is found at the node.

    Using third-order cumulants to investigate spatial variation: a case study on the porosity of the Bunter Sandstone

    The multivariate cumulants characterize aspects of the spatial variability of a regionalized variable. A centred multivariate Gaussian random variable, for example, has zero third-order cumulants. In this paper, it is shown how the third-order cumulants can be used to test the plausibility of the assumption of multivariate normality for the porosity of an important formation, the Bunter Sandstone in the North Sea. The results suggest that the spatial variability of this variable deviates from multivariate normality, and that this assumption may lead to misleading inferences about, for example, the uncertainty attached to kriging predictions.
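    A moment estimator of a third-order cumulant for a regularly spaced transect is simple to compute, and near-zero values at all lag pairs are what a multivariate Gaussian model predicts. The sketch below is a generic illustration with synthetic autocorrelated series, not the Bunter Sandstone data: the Gaussian series gives estimates close to zero, while the lognormal series does not.

```python
import numpy as np

def third_order_cumulant(z, h1, h2):
    """Moment estimate of the third-order cumulant k3(h1, h2) for a regularly
    spaced transect; lags h1, h2 are numbers of sampling intervals."""
    z = np.asarray(z, dtype=float) - np.mean(z)
    n = len(z) - max(h1, h2)
    return np.mean(z[:n] * z[h1:h1 + n] * z[h2:h2 + n])

def ar1(n, phi, rng):
    """Synthetic autocorrelated Gaussian series (first-order autoregression)."""
    x = np.zeros(n)
    for i in range(1, n):
        x[i] = phi * x[i - 1] + rng.normal()
    return x

rng = np.random.default_rng(1)
gauss = ar1(5000, 0.8, rng)     # multivariate Gaussian: third cumulants near zero
skewed = np.exp(gauss)          # lognormal transform: clearly non-zero cumulants

for h1, h2 in [(0, 0), (1, 2), (2, 4)]:
    print((h1, h2),
          round(third_order_cumulant(gauss, h1, h2), 3),
          round(third_order_cumulant(skewed, h1, h2), 3))
```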

    Block correlation and the spatial resolution of soil property maps made by kriging

    The block correlation is the correlation between the block kriging prediction of a variable and the true spatial mean which it estimates, computed for a particular sampling configuration and block size over the stochastic model which underlies the kriging prediction. This correlation can be computed if the variogram and disposition of sample points are known. It is also possible to compute the concordance correlation, a modified correlation which measures the extent to which the block kriging prediction and true block spatial mean conform to the 1:1 line, and so is sensitive to the tendency of the kriging predictor to over-smooth. It is proposed that block concordance correlation has two particular advantages over kriging variance for communicating uncertainty in predicted values. First, as a measure on a bounded scale, it is more intuitively understood by the non-specialist data user, particularly one who is interested in a synoptic overview of soil variation across a region. Second, because it accounts for the variability of the spatial means and their kriged estimates, as well as the uncertainty of the latter, it can be more readily compared between blocks of different sizes than can a kriging variance. Using the block correlation and concordance correlation, it is shown that the uncertainty of block kriged predictions depends on block size, but this effect depends on the interaction of the autocorrelation of the random variable and the sampling intensity. In some circumstances (where the dominant component of variation is at a long range relative to sample spacing) the block correlation and concordance correlation are insensitive to block size, but if the grid spacing is closer to the range of correlation of a significant component then block size can have a substantial effect on block correlation. It is proposed that (i) block concordance correlation be used to communicate the uncertainty in kriged predictions to a range of audiences, (ii) it be used to explore sensitivity to block size when planning mapping, and (iii) as a general operational rule, a block size be selected to give a block concordance correlation of 0.8 or larger where this can be achieved without extra sampling.
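    Given pairs of block kriging predictions and the corresponding true (or simulated) block means, the concordance correlation described here is commonly computed with Lin's formula. The sketch below uses synthetic values (the shrinkage and noise parameters are assumptions, not results from the paper) to show how over-smoothed predictions retain a high Pearson correlation but a lower concordance correlation.

```python
import numpy as np

def concordance_correlation(pred, true):
    """Lin's concordance correlation: agreement of paired values with the 1:1
    line. It equals the Pearson correlation scaled by a factor that penalizes
    differences in mean and variance between the two sets."""
    pred, true = np.asarray(pred, float), np.asarray(true, float)
    mp, mt = pred.mean(), true.mean()
    vp, vt = pred.var(), true.var()
    cov = np.mean((pred - mp) * (true - mt))
    return 2.0 * cov / (vp + vt + (mp - mt) ** 2)

# Illustrative use: predictions shrunk towards the mean (over-smoothed) keep a
# high Pearson correlation but score lower on the concordance correlation.
rng = np.random.default_rng(0)
true = rng.normal(6.0, 1.0, size=200)                     # e.g. block-mean pH
pred = 6.0 + 0.5 * (true - 6.0) + rng.normal(0, 0.2, 200)
print(np.corrcoef(pred, true)[0, 1], concordance_correlation(pred, true))
```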

    Changes in the variance of a soil property along a transect, a comparison of a non-stationary linear mixed model and a wavelet transform

    The wavelet transform and the linear mixed model with spectral tempering are two methods which have been used to analyse soil data without assumptions of stationarity in the variance. In this paper, both methods are compared on a single data set on soil pH where marked changes in parent material are expected to result in non-stationary variability. Both methods identified the dominant feature of the data, a reduction in the variance of pH over Chalk parent material, and also identified less pronounced effects of other parent material contrasts. However, there were differences between the results, which can be attributed to (i) the wavelet transform's analysis on discrete scales, for which local features are resolved with scale-dependent resolution; (ii) differences between the partition of variation into, respectively, smooth or detail components of the wavelet analysis and fixed or random effects of the linear mixed model; and (iii) the fact that the identification of changes in the variance is done sequentially for the wavelet transform and simultaneously in the linear mixed model.
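    The wavelet side of such an analysis can be caricatured with Haar detail coefficients computed directly, without a wavelet library: at each dyadic level, the variance of the detail coefficients over a segment of the transect estimates the local contribution to variance at that scale. The transect below is synthetic (the pH means and variances are assumptions), with smaller variance in its second half standing in for the Chalk section.

```python
import numpy as np

def haar_details(z, level):
    """Haar detail coefficients at a given dyadic level: scaled differences of
    means of adjacent blocks of length 2**(level - 1)."""
    half = 2 ** (level - 1)
    n = (len(z) // (2 * half)) * 2 * half
    blocks = np.asarray(z[:n], float).reshape(-1, half).mean(axis=1)
    return (blocks[1::2] - blocks[0::2]) / np.sqrt(2.0)

# Synthetic transect: the variance of soil pH drops over one parent material
rng = np.random.default_rng(3)
pH = np.concatenate([rng.normal(6.5, 0.6, 256),    # e.g. clay parent material
                     rng.normal(7.8, 0.2, 256)])   # e.g. Chalk: smaller variance

for level in (1, 2, 3):
    d = haar_details(pH, level)
    m = len(d) // 2          # first half of the coefficients = first half of transect
    print(level, round(np.var(d[:m]), 3), round(np.var(d[m:]), 3))
```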

    Multi-objective optimization of spatial sampling

    The optimization of spatial sampling by simulated annealing has been demonstrated and applied for a range of objective functions. In practice, more than one objective function may be important for sampling, and there may be complex trade-offs between them. In this paper it is shown how a multi-objective optimization algorithm can be applied to a spatial sampling problem. This generates a non-dominated set of solutions (no solution in the set is outperformed by another on all objective functions). These solutions represent different feasible trade-offs between the objective functions, and a subset might be practically acceptable. The algorithm is applied to a hypothetical example of sampling for a regional mean, with the variance of the mean and the total distance travelled between sample points as the two objective functions. The solutions represent a transition of sample arrays from a loose grid to a tight loop. The potential to develop this approach and apply it to other spatial sampling problems is discussed.
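    The non-dominated set itself is easy to extract once each candidate sampling scheme has been scored on the objective functions. The sketch below is a generic Pareto filter for two minimized objectives; the candidate values are invented for illustration and are not results from the paper.

```python
import numpy as np

def non_dominated(objectives):
    """Boolean mask of non-dominated solutions, assuming all objectives are to
    be minimized (rows = candidate sampling schemes, columns = objectives)."""
    obj = np.asarray(objectives, float)
    keep = np.ones(len(obj), dtype=bool)
    for i, o in enumerate(obj):
        if keep[i]:
            # Solutions no better than o anywhere and strictly worse somewhere
            dominated = np.all(obj >= o, axis=1) & np.any(obj > o, axis=1)
            keep &= ~dominated
            keep[i] = True
    return keep

# Illustrative candidates: (kriging variance of the regional mean, travel distance)
cands = np.array([[0.10, 40.0],
                  [0.12, 25.0],
                  [0.09, 55.0],
                  [0.15, 24.0],
                  [0.11, 45.0]])
print(cands[non_dominated(cands)])   # the feasible trade-off (Pareto) set
```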

    Controlling the marginal false discovery rate in inferences from a soil dataset with α-investment

    Large datasets on soil provide a temptation to search for relations between variables and then to model and make inferences about them with statistical methods more properly used to test preplanned hypotheses on data from designed experiments or sample surveys. The control of family-wise error rate (FWER) is one way to improve the robustness of inferences from tests of multiple hypotheses. In its simplest form, hypothesis testing with FWER control lacks statistical power. The α-investment approach to controlling the marginal false discovery rate is one method proposed to improve statistical power. In this paper, I outline the α-investment approach and then demonstrate it in the analysis of a dataset on the rate of CO2 emission from incubated intact cores of soil from a transect over Cretaceous rocks in eastern England. Hypotheses are advanced after considering the literature and examining relations among the available soil variables that might be proposed as explanatory factors for the variation of CO2 emissions. They are then tested in sequence with α-investment, such that the rejection of null hypotheses increases the power to reject later ones, while controlling the overall marginal false discovery rate at a specified value. This paper illustrates the use of α-investment to test multiple hypotheses on a soil dataset; statistical power is improved by ordering the sequence of hypotheses on the basis of process knowledge. The approach could be useful in other areas of soil science where covariates must be selected for predictive statistical models, notably in the development of pedotransfer functions and in digital soil mapping. Highlights: α-investment controls the marginal false discovery rate in statistical inference. Hypotheses were advanced about soil factors that affect CO2 emission from soil. These hypotheses were tested in sequence with control of the marginal false discovery rate. Soil properties, land use and parent material were significant predictors.
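    One common form of the α-investing rule (after Foster and Stine, 2008) can be sketched as below: a share of the current α-wealth is spent on each test in the agreed order, a rejection earns the pay-out back and a non-rejection reduces the wealth. The p-values, pay-out and spending strategy here are illustrative only and do not reproduce the analysis in the paper.

```python
def alpha_investing(p_values, wealth=0.05, payout=0.05, spend_fraction=0.5):
    """Sequential testing with alpha-investing (one common form of the rule of
    Foster & Stine, 2008). At each step part of the current alpha-wealth is
    spent; a rejection earns the pay-out, a non-rejection costs
    alpha_j / (1 - alpha_j)."""
    decisions = []
    for p in p_values:
        alpha_j = wealth * spend_fraction      # simple spending strategy
        reject = p <= alpha_j
        wealth += payout if reject else -alpha_j / (1.0 - alpha_j)
        decisions.append(reject)
        if wealth <= 0:                        # wealth exhausted: stop testing
            break
    return decisions

# Hypotheses ordered by prior plausibility (p-values are illustrative only),
# e.g. organic carbon, land use, parent material, then weaker candidates.
p_vals = [0.001, 0.004, 0.02, 0.30, 0.15]
print(alpha_investing(p_vals))
```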

    The implicit loss function for errors in soil information

    The loss function expresses the costs to an organization that result from decisions made using erroneous information. In closely constrained circumstances, such as remediation of soil on contaminated land prior to development, it has proved possible to compute loss functions and to use these to guide rational decision making on the amount of resource to spend on sampling to collect soil information. In many circumstances it may not be possible to define loss functions prior to decision making on soil sampling. This may be the case when multiple decisions may be based on the soil information and the costs of errors are hard to predict. We propose the implicit loss function as a tool to aid decision making in these circumstances. Conditional on a logistical model which expresses costs of soil sampling as a function of effort, and statistical information from which the error of estimates can be modelled as a function of effort, the implicit loss function is the loss function which makes a particular decision on effort rational. After defining the implicit loss function, we compute it for a number of arbitrary decisions on sampling effort for a hypothetical soil monitoring problem. This is based on a logistical model of sampling cost parameterized from a recent survey of soil in County Donegal, Ireland, and on statistical parameters estimated with the aid of a process model for change in soil organic carbon. We show how the implicit loss function might provide a basis for reflection on a particular choice of sampling regime, specifically the simple random sample size, by comparing it with the values attributed to soil properties and functions. In a recent study, rules were agreed to deal with uncertainty in soil carbon stocks for purposes of carbon trading by treating a percentile of the estimation distribution as the estimated value. We show that this is equivalent to setting a parameter of the implicit loss function, its asymmetry. We then discuss scope for further research to develop and apply the implicit loss function to help decision making by policy makers and regulators.
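    The equivalence between reporting a percentile and fixing the asymmetry of the loss can be seen with a generic asymmetric linear loss: the expected loss is minimized by the tau-quantile of the estimation distribution when the ratio of under-estimation to over-estimation costs is tau / (1 - tau). The snippet below states that relation; it is a general result about asymmetric linear loss, not the paper's specific parameterization.

```python
def implied_asymmetry(tau):
    """For an asymmetric linear loss with unit costs c_u (under-estimation) and
    c_o (over-estimation), expected loss is minimized by the tau-quantile of the
    estimation distribution where tau = c_u / (c_u + c_o). Reporting the
    tau-quantile therefore implies the cost ratio c_u / c_o returned here."""
    return tau / (1.0 - tau)

# E.g. reporting the 25th percentile of an estimated carbon stock implies that
# over-estimation is treated as three times as costly as under-estimation.
for tau in (0.25, 0.5, 0.75):
    print(tau, round(implied_asymmetry(tau), 2))
```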

    Modelling complex geological circular data with the projected normal distribution and mixtures of von Mises distributions

    Circular data are commonly encountered in the earth sciences, and statistical descriptions and inferences about such data are necessary in structural geology. In this paper we compare two statistical distributions appropriate for complex circular data sets: the mixture of von Mises distributions and the projected normal distribution. We show how the number of components in a mixture of von Mises distributions may be chosen, and how one may choose between the projected normal distribution and the mixture of von Mises distributions for a particular data set. We illustrate these methods with several structural geological data sets, showing how the fitted models can complement geological interpretation and permit statistical inference. One of our data sets suggests a special case of the projected normal distribution, which we discuss briefly.
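    Fitting a mixture of von Mises distributions and choosing the number of components can be sketched with a short EM algorithm scored by BIC. Everything below (the synthetic orientation data, the initialization, the kappa update via the approximation of Banerjee et al.) is an illustrative assumption rather than the authors' implementation.

```python
import numpy as np
from scipy.special import i0

def vm_pdf(theta, mu, kappa):
    """von Mises density."""
    return np.exp(kappa * np.cos(theta - mu)) / (2.0 * np.pi * i0(kappa))

def fit_vm_mixture(theta, k, n_iter=200, seed=0):
    """EM fit of a k-component von Mises mixture (simple sketch; kappa is
    updated with the approximation of Banerjee et al.)."""
    rng = np.random.default_rng(seed)
    n = len(theta)
    w = np.full(k, 1.0 / k)
    mu = rng.uniform(-np.pi, np.pi, k)
    kappa = np.full(k, 1.0)
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each observation
        dens = np.array([w[j] * vm_pdf(theta, mu[j], kappa[j]) for j in range(k)])
        resp = dens / dens.sum(axis=0)
        # M-step: weights, mean directions, concentrations
        nj = resp.sum(axis=1)
        w = nj / n
        C, S = resp @ np.cos(theta), resp @ np.sin(theta)
        mu = np.arctan2(S, C)
        rbar = np.minimum(np.sqrt(C ** 2 + S ** 2) / nj, 0.999)
        kappa = rbar * (2.0 - rbar ** 2) / (1.0 - rbar ** 2)
    loglik = np.sum(np.log(np.sum(
        [w[j] * vm_pdf(theta, mu[j], kappa[j]) for j in range(k)], axis=0)))
    bic = -2.0 * loglik + (3 * k - 1) * np.log(n)   # 3k - 1 free parameters
    return w, mu, kappa, bic

# Synthetic data with two preferred orientations, as for a bimodal set of
# fracture strikes (angles in radians)
rng = np.random.default_rng(1)
theta = np.concatenate([rng.vonmises(0.5, 8.0, 300), rng.vonmises(2.5, 4.0, 200)])
for k in (1, 2, 3):
    print(k, round(fit_vm_mixture(theta, k)[3], 1))   # choose k with smallest BIC
```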