
    Recent Developments in Complex and Spatially Correlated Functional Data

    As high-dimensional and high-frequency data are collected on a large scale, the development of new statistical models is being pushed forward. Functional data analysis provides the statistical methods required to deal with large-scale and complex data by assuming that data are continuous functions, e.g., realizations of a continuous process (curves) or of continuous random fields (surfaces), and that each curve or surface is considered as a single observation. Here, we provide an overview of functional data analysis when data are complex and spatially correlated. We provide definitions and estimators of the first and second moments of the corresponding functional random variable. We present two main approaches. The first assumes that data are realizations of a functional random field, i.e., each observation is a curve with a spatial component. We call these 'spatial functional data'. The second approach assumes that data are continuous deterministic fields observed over time. In this case, one observation is a surface or manifold, and we call these 'surface time series'. For the two approaches, we describe software available for the statistical analysis. We also present a data illustration, using a simulated high-resolution wind speed dataset, as an example of the two approaches. The functional data approach offers a new paradigm of data analysis, where the continuous processes or random fields are considered as a single entity. We consider this approach to be very valuable in the context of big data. Comment: Some typos fixed and new references added.
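As a rough illustration of the "first and second moments" mentioned in this abstract, the sketch below computes a pointwise sample mean curve and a sample covariance surface for curves observed on a common grid. It is only a minimal sketch: the function names are hypothetical, and it treats the curves as independent, whereas the estimators described in the paper are meant for spatially correlated curves.

```python
def mean_curve(curves):
    """Pointwise sample mean of curves observed on a common grid."""
    n = len(curves)
    return [sum(values) / n for values in zip(*curves)]


def covariance_surface(curves):
    """Sample covariance between every pair of grid points (divisor n - 1).

    Assumption: curves are treated as independent replicates, which ignores
    the spatial correlation the paper is concerned with.
    """
    n = len(curves)
    mu = mean_curve(curves)
    centred = [[v - m for v, m in zip(curve, mu)] for curve in curves]
    size = len(mu)
    return [
        [sum(c[s] * c[t] for c in centred) / (n - 1) for t in range(size)]
        for s in range(size)
    ]


# Two toy curves sampled at three grid points.
curves = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
mu = mean_curve(curves)
cov = covariance_surface(curves)
```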

    Positive definite nonparametric regression using an evolutionary algorithm with application to covariance function estimation

    We propose a novel nonparametric regression framework subject to the positive definiteness constraint. It offers a highly modular approach for estimating covariance functions of stationary processes. Our method can impose positive definiteness, as well as isotropy and monotonicity, on the estimators, and its hyperparameters can be chosen using cross-validation. We define our estimators by taking integral transforms of kernel-based distribution surrogates. We then use the iterated density estimation evolutionary algorithm, a variant of estimation of distribution algorithms, to fit the estimators. We also extend our method to estimate covariance functions for point-referenced data. Compared to alternative approaches, our method provides more reliable estimates for long-range dependence. Several numerical studies are performed to demonstrate the efficacy and performance of our method. We also illustrate our method using precipitation data from the Spatial Interpolation Comparison 97 project. Comment: Accepted at the 2023 Genetic and Evolutionary Computation Conference (GECCO) as a full paper. 14 pages with references and appendices, 11 figures.
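To see why the positive definiteness constraint matters, the sketch below builds a covariance matrix from a candidate covariance function at a few one-dimensional locations and tests whether a Cholesky factorization exists. This is not the authors' estimator: the function names, the locations, and the two candidate functions are assumptions chosen for illustration. A failed factorization shows the candidate is not a valid covariance function; a successful one only shows validity for that particular set of locations.

```python
import math


def cholesky_succeeds(matrix, tol=1e-12):
    """Return True if the symmetric matrix has a Cholesky factor (all pivots > tol)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - s
                if pivot <= tol:
                    return False  # not positive definite
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return True


def covariance_matrix(cov_fn, locations):
    """Evaluate a stationary covariance function at all pairwise lags."""
    return [[cov_fn(abs(a - b)) for b in locations] for a in locations]


def exponential(h):
    """Exponential covariance, a standard valid model."""
    return math.exp(-h)


def boxcar(h):
    """Cut-off function that is NOT positive definite."""
    return 1.0 if h < 1.0 else 0.0


locations = [0.0, 0.6, 1.2]  # hypothetical 1-D sites
valid = cholesky_succeeds(covariance_matrix(exponential, locations))
invalid = cholesky_succeeds(covariance_matrix(boxcar, locations))
```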

    Bootstrap based uncertainty bands for prediction in functional kriging

    The increasing interest in spatially correlated functional data has led to the development of geostatistical techniques for predicting a curve at an unmonitored location using a functional kriging with external drift model, which takes into account the effect of exogenous variables (either scalar or functional). Nevertheless, uncertainty evaluation for functional spatial prediction remains an open issue. We propose a semi-parametric bootstrap for spatially correlated functional data that evaluates the uncertainty of a predicted curve while ensuring that the spatial dependence structure is maintained in the bootstrap samples. The performance of the proposed methodology is assessed via a simulation study. Moreover, the approach is illustrated on a well-known data set of Canadian temperature and on a real data set of PM10 concentration in the Piemonte region, Italy. Based on the results, it can be concluded that the method is computationally feasible and suitable for quantifying the uncertainty around a predicted curve. Supplementary material including R code is available upon request.

    Linear and non-linear resource estimation techniques applied in the Kanzi Phosphate project in the Democratic Republic of Congo

    The main aim of this project was to conduct a comparative analysis of the linear and non-linear estimation techniques used for the Kanzi Phosphate Project in the Democratic Republic of Congo. The Kanzi phosphate is an elongated sedimentary unit with a north-south strike direction and a fairly flat dip angle. It was deposited between two graben structures. The Kanzi phosphate was divided into North and South areas, which were treated as different domains because they are far apart. The geology and assay results of the intersected phosphate mineralization were used to define the layers. Layering was noted in the South Geo-Zone, which was therefore sub-divided vertically into three layers: Top, Middle and Bottom. The Top and Bottom layers had lower P2O5 grades and higher SiO2 than the Middle layer. The Middle layer was the most laterally extensive of the three. Drillholes were drilled using the Aircore technique and samples were taken at 1 m intervals. No compositing was done, as all samples carried equal statistical weight in terms of length and density measurements. Declustering was not done because the drillholes were well spread. The statistical evaluation of the domains showed that P2O5 is correlated with all other major variables (CaO, Al2O3, TiO2 and SiO2). A decision was taken to conduct mineral resource estimation on P2O5 only; other block variables were estimated from P2O5 using a linear regression relationship. A 3-dimensional geological model was constructed for each domain and filled with blocks. The block sizes were based on the neighbourhood analysis, drillhole spacing and mining requirements. Half the drillhole spacing was used for the X (125 m) and Y (125 m) dimensions, and a 5 m thickness was used for the Z dimension. Traditional variograms were created for all the domains. Downhole variograms were used to determine the nugget effect. All variograms were omni-directional and fitted with spherical models. The variogram ranges were used to guide the search volumes for both Ordinary Kriging (OK) and Inverse Distance Weighting (IDW). The estimation results from the OK and IDW techniques were comparable. The data were pre-processed for Indicator Kriging (IK). The median cut-offs were selected and median variograms were calculated. It was assumed that all other indicators have variograms similar to the median indicator variogram. For estimation purposes, the cut-offs selected were 7.5%, 12.5%, 17.5%, 22.5% and 27.5%. These cut-offs were guided by the processing characteristics of the Kanzi phosphate. The results of the three estimation techniques (IDW, OK and IK) were analysed. The OK and IDW methods produced smoothed estimates and defined the global resources well. The measure of uncertainty for OK was not clearly defined, partly due to widely spaced data. Median Indicator Kriging produced more useful results than the OK and IDW methods, and smoothing was minimized. As a probabilistic method, Median Indicator Kriging defined the proportion of tonnages above the defined processing cut-offs. The estimation methods were compared and ranked. Median Indicator Kriging was the preferred estimation technique and was ranked high. OK and IDW produced identical results and were ranked low. OK performed like IDW because there were moderately mixed sample populations that were spatially integrated. Recommendations were given to conduct conditional simulation, drill additional boreholes, estimate other variables using co-kriging and perform further processing studies. These steps will help reduce risk and increase the geostatistical understanding of the phosphate resources.
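Two of the building blocks named in this abstract can be sketched briefly: an inverse distance weighted estimate at a block centre, and the indicator coding of a grade against the quoted cut-offs, which is the pre-processing step before indicator kriging. This is only a sketch under stated assumptions: the function names, the sample coordinates and grades, and the distance power of 2 are hypothetical and are not taken from the study, and the kriging step itself is not shown.

```python
import math

# Cut-offs (% P2O5) quoted in the abstract.
CUTOFFS = [7.5, 12.5, 17.5, 22.5, 27.5]


def idw_estimate(samples, target, power=2.0):
    """Inverse distance weighted estimate at `target`.

    samples: list of ((x, y), grade) pairs; target: (x, y).
    Assumption: power = 2 is a common default, not a value from the study.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), grade in samples:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return grade  # target coincides with a sample location
        weight = 1.0 / distance ** power
        weighted_sum += weight * grade
        weight_total += weight
    return weighted_sum / weight_total


def indicator_transform(grade, cutoffs=CUTOFFS):
    """Code a grade as 1 where it is at or below a cut-off, else 0."""
    return [1 if grade <= cutoff else 0 for cutoff in cutoffs]


# Hypothetical samples: two drillhole grades equidistant from the target.
samples = [((0.0, 0.0), 10.0), ((250.0, 0.0), 20.0)]
estimate = idw_estimate(samples, (125.0, 0.0))
indicators = indicator_transform(estimate)
```

Because the two hypothetical samples are equidistant from the target, they receive equal weights and the estimate is their average; the indicator vector then records which cut-offs that grade falls below.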