18 research outputs found

Recent Developments in Complex and Spatially Correlated Functional Data

Authors: Genton Marc G.; Martínez-Hernández Israel
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 24/03/2020

As high-dimensional and high-frequency data are being collected on a large scale, the development of new statistical models is being pushed forward. Functional data analysis provides the required statistical methods to deal with large-scale and complex data by assuming that data are continuous functions, e.g., a realization of a continuous process (curves) or continuous random fields (surfaces), and that each curve or surface is considered as a single observation. Here, we provide an overview of functional data analysis when data are complex and spatially correlated. We provide definitions and estimators of the first and second moments of the corresponding functional random variable. We present two main approaches. The first assumes that data are realizations of a functional random field, i.e., each observation is a curve with a spatial component. We call them 'spatial functional data'. The second approach assumes that data are continuous deterministic fields observed over time. In this case, one observation is a surface or manifold, and we call them 'surface time series'. For the two approaches, we describe software available for the statistical analysis. We also present a data illustration, using a high-resolution wind speed simulated dataset, as an example of the two approaches. The functional data approach offers a new paradigm of data analysis, where the continuous processes or random fields are considered as a single entity. We consider this approach to be very valuable in the context of big data. Comment: Some typos fixed and new references added

arXiv.org e-Print Archive

Lancaster E-Prints

Positive definite nonparametric regression using an evolutionary algorithm with application to covariance function estimation

Author: Kang Myeongjong
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 25/04/2023

We propose a novel nonparametric regression framework subject to the positive definiteness constraint. It offers a highly modular approach for estimating covariance functions of stationary processes. Our method can impose positive definiteness, as well as isotropy and monotonicity, on the estimators, and its hyperparameters can be chosen using cross-validation. We define our estimators by taking integral transforms of kernel-based distribution surrogates. We then use the iterated density estimation evolutionary algorithm, a variant of estimation of distribution algorithms, to fit the estimators. We also extend our method to estimate covariance functions for point-referenced data. Compared to alternative approaches, our method provides more reliable estimates for long-range dependence. Several numerical studies are performed to demonstrate the efficacy and performance of our method. We also illustrate our method using precipitation data from the Spatial Interpolation Comparison 97 project. Comment: Accepted at the 2023 Genetic and Evolutionary Computation Conference (GECCO) as a full paper. 14 pages with references and appendices, 11 figures

arXiv.org e-Print Archive

Contributions to spectral spatial statistics

Author: Crujeiras Casais Rosa María
Publication venue: Universidade de Santiago de Compostela. Servizo de Publicacións e Intercambio Científico
Publication date: 01/01/2007

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Institucional da Universidade de Santiago de Compostela

Bootstrap based uncertainty bands for prediction in functional kriging

Authors: Franco-Villoria Maria; Ignaccolo Rosaria
Publication date: 01/01/2016

The increasing interest in spatially correlated functional data has led to the development of appropriate geostatistical techniques that allow prediction of a curve at an unmonitored location using a functional kriging with external drift model that takes into account the effect of exogenous variables (either scalar or functional). Nevertheless, uncertainty evaluation for functional spatial prediction remains an open issue. We propose a semi-parametric bootstrap for spatially correlated functional data that allows evaluation of the uncertainty of a predicted curve, ensuring that the spatial dependence structure is maintained in the bootstrap samples. The performance of the proposed methodology is assessed via a simulation study. Moreover, the approach is illustrated on a well-known data set of Canadian temperature and on a real data set of PM$_{10}$ concentration in the Piemonte region, Italy. Based on the results, it can be concluded that the method is computationally feasible and suitable for quantifying the uncertainty around a predicted curve. Supplementary material including R code is available upon request.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Institutional Research Information System University of Turin

On Simplified Bayesian Modeling for Massive Geostatistical Datasets: Conjugacy and Beyond

Author: Zhang Lu
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

With continued advances in Geographic Information Systems and related computational technologies, researchers in diverse fields such as forestry, environmental health, and climate sciences have a growing interest in analyzing large-scale data sets measured at a substantial number of geographic locations. Geostatistical models used to capture the space-varying relationships in such data are often accompanied by onerous computations which prohibit the analysis of large-scale spatial data sets. Less burdensome alternatives proposed recently for analyzing massive spatial datasets often lead to inaccurate inference or require slow sampling processes. Bayesian inference, while attractive for accommodating uncertainties through its hierarchical structures, can become computationally onerous for modeling massive spatial data sets because of its reliance on iterative estimation algorithms. My dissertation research aims at developing computationally scalable Bayesian geostatistical models that provide valid inference through highly accelerated sampling processes. We also study the asymptotic properties of estimators in spatial analysis. In Chapters 2 and 3, we develop conjugate Bayesian frameworks for analyzing univariate and multivariate spatial data. We propose a conjugate latent Nearest-Neighbor Gaussian Process (NNGP) model in Chapter 2, which uses analytically tractable posterior distributions to obtain posterior inferences, including the large dimensional latent process. In Chapter 3, we focus on building conjugate Bayesian frameworks for analyzing multivariate spatial data. We utilize the Matrix-Normal Inverse-Wishart (MNIW) prior to propose conjugate Bayesian frameworks and algorithms that can incorporate a family of scalable spatial modeling methodologies. In Chapter 4, we pursue general Bayesian modeling methodologies beyond conjugate Bayesian hierarchical modeling. We build scalable versions of a hierarchical linear model of coregionalization (LMC) and spatial factor models, and propose a highly accelerated block update MCMC algorithm. Using the proposed Bayesian LMC model, we extend scalable modeling strategies for a single process into multivariate process cases. All proposed frameworks are tested on simulated data and fit to real data sets with observed locations numbering in the millions. Our contribution is to offer practicing scientists and spatial analysts practical and flexible scalable hierarchical models for analyzing massive spatial data sets. In Chapter 5, we investigate the asymptotic properties of the estimators in spatial analysis. We formally establish results on the identifiability and consistency of the nugget in spatial models based upon the Gaussian process within the framework of in-fill asymptotics, i.e., the sample size increases within a sampling domain that is bounded. We establish the identifiability of parameters in the Matérn covariance function and the consistency of their maximum likelihood estimators in the presence of discontinuities due to the nugget.
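
To make the "discontinuities due to the nugget" concrete, here is a minimal sketch of a Matérn covariance with a nugget. It is not taken from the dissertation: the smoothness is fixed at 3/2 so that a closed form exists, and the function name, the parameter names (`sigma2`, `phi`, `tau2`), and the `sqrt(3) * h / phi` scaling are illustrative assumptions (other parameterizations are common).

```python
import math

def matern32_covariance(h, sigma2, phi, tau2=0.0):
    """Matérn covariance with smoothness 3/2 at lag h, plus a nugget tau2.

    Assumed parameterization (one of several in use):
        C(h) = sigma2 * (1 + sqrt(3) h / phi) * exp(-sqrt(3) h / phi)  for h > 0
        C(0) = sigma2 + tau2
    """
    if h < 0:
        raise ValueError("lag distance h must be non-negative")
    scaled = math.sqrt(3.0) * h / phi
    cov = sigma2 * (1.0 + scaled) * math.exp(-scaled)
    # The nugget contributes only at exactly zero lag, so the covariance
    # jumps from sigma2 + tau2 at h = 0 down to about sigma2 just beyond it.
    return cov + tau2 if h == 0 else cov

at_zero = matern32_covariance(0.0, sigma2=2.0, phi=1.0, tau2=0.5)
near_zero = matern32_covariance(1e-9, sigma2=2.0, phi=1.0, tau2=0.5)
```

Comparing `at_zero` with `near_zero` shows the jump of size `tau2` at the origin, which is the feature that complicates identifiability under in-fill asymptotics.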

eScholarship - University of California

Assessing spatial dependency under non-standard sampling

Author: Leite Raquel Menezes da Mota
Publication date: 01/01/2005

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Institucional da Universidade de Santiago de Compostela

Linear and non-linear resource estimation techniques applied in the Kanzi Phosphate project in the Democratic Republic of Congo

Author: Mudau Mpfariseni
Publication date: 07/05/2015

The main aim of this project was to conduct a comparative analysis of the linear and non-linear estimation techniques used for the Kanzi Phosphate Project in the Democratic Republic of Congo. Kanzi phosphate is an elongated sedimentary unit with a north-south strike direction and a fairly flat dip angle. It was deposited between two graben structures. The Kanzi phosphate was divided into the North and South areas, which were treated as different domains because they are far apart. The geology and assay results of the intersected phosphate mineralization were used in defining the layers. Layering was noted in the South Geo-Zone, which was therefore sub-divided vertically into three layers, namely the Top, Middle and Bottom layers. The Top and Bottom layers had lower P2O5 grades and higher SiO2 than the Middle layer. The Middle layer was more laterally extensive than the other layers. Drillholes were drilled using the Aircore technique and samples were taken at 1m intervals. No compositing was done, as all samples contributed equal statistical weights in terms of length and density measurements. Declustering was not done because the drillholes were well-spread. The statistical evaluation of the domains showed that P2O5 is correlated with all other major variables (CaO, Al2O3, TiO2 and SiO2). A decision was taken to conduct mineral resource estimation on P2O5 only. Other block variables were estimated from the P2O5 using a linear regression relationship. A 3-dimensional geological model was constructed for each domain and filled with blocks. The block sizes were defined based on the neighbourhood analysis, drillhole spacing and mining requirements. Half the drillhole spacing was used for the X (125m) and Y (125m) dimensions, and a 5m thickness was used for the Z dimension. Traditional variograms were created for all the domains. Downhole variograms were used to determine the nugget effect. All variograms were omni-directional and have spherical models. The variogram ranges were used to guide the search volumes for both Ordinary Kriging (OK) and Inverse Distance Weighting (IDW). The estimation results from the OK and IDW techniques were comparable. The data was pre-processed for Indicator Kriging (IK). The median cut-offs were selected and median variograms were calculated. It was assumed that all other indicators have similar variograms to that of the median indicator variogram. For estimation purposes, the cut-offs selected were 7.5%, 12.5%, 17.5%, 22.5% and 27.5%. These cut-offs were guided by the processing characteristics of the Kanzi phosphate. The results of the three estimation techniques (IDW, OK and IK) were analysed. The OK and IDW methods produced smoothed estimates and defined the global resources well. The measure of uncertainty for OK was not clearly defined, partly due to widely spaced data. The Median Indicator Kriging produced more useful results than the OK and IDW methods, and smoothing was minimized. As a probabilistic method, the Median Indicator Kriging defined the proportion of tonnages above the defined processing cut-offs. The estimation methods were compared and ranked. The Median Indicator Kriging was the preferred estimation technique and was ranked high. The OK and IDW produced identical results and they were ranked low. OK performed like IDW as there were moderately mixed sample populations that were spatially integrated. Recommendations were given to conduct conditional simulation, drill additional boreholes, estimate other variables using co-kriging and perform further processing studies. These steps will help reduce risks and increase the geostatistical understanding of the phosphate resources.
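
As a rough illustration of three building blocks named in this abstract, the sketch below implements a spherical variogram model, an inverse distance weighting (IDW) estimate, and the indicator transform at the quoted cut-offs. None of it comes from the thesis: the function names, the IDW power of 2, the convention that `sill` includes the nugget, and the "at or below cut-off" indicator convention are all assumptions made for illustration.

```python
import math

def spherical_variogram(h, nugget, sill, range_):
    """Spherical variogram at lag h; `sill` is the total sill (nugget included)."""
    if h == 0:
        return 0.0
    if h >= range_:
        return sill  # beyond the range the variogram stays flat at the sill
    r = h / range_
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

def idw_estimate(target, samples, power=2.0):
    """Inverse distance weighted mean of (x, y, value) samples at target (x, y)."""
    num = den = 0.0
    for x, y, value in samples:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0:
            return value  # target coincides with a sample: return it exactly
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

def indicator_transform(grade, cutoffs):
    """One flag per cut-off: 1 if grade is at or below the cut-off, else 0."""
    return [1 if grade <= c else 0 for c in cutoffs]

# P2O5 cut-offs (in %) quoted in the abstract
CUTOFFS = [7.5, 12.5, 17.5, 22.5, 27.5]

gamma_mid = spherical_variogram(50.0, nugget=1.0, sill=5.0, range_=100.0)
estimate = idw_estimate((0.0, 0.0), [(1.0, 0.0, 10.0), (2.0, 0.0, 20.0)])
flags = indicator_transform(15.0, CUTOFFS)
```

Indicator kriging would then krige each column of flags separately (here assumed to share the median indicator variogram, as the abstract states) to estimate the proportion of material relative to each cut-off.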

Wits Institutional Repository on DSPACE