118 research outputs found

    Large scale geostatistics with locally varying anisotropy

    Classical geostatistical methods are based on the hypothesis of stationarity, which allows repeated sampling at different locations of the spatial domain in order to obtain enough information to infer cumulative distributions. In the case of non-stationarity, anisotropy is observed in the underlying physical phenomena. This feature manifests itself as preferential directions of continuity in the phenomena, i.e. properties are more continuous in one orientation than in another. In the case of local anisotropy, each location of the domain under study presents different preferential directions of continuity. The locally varying anisotropy (LVA) approach in geostatistics makes it possible to incorporate a field of local anisotropy parameters defined for each domain point. With this additional input, more realistic spatial simulations can be generated, adding geological features such as folds, veins and faults, among others, to the computational model. Since the seminal article published by Boisvert and Deutsch (2011), to the best of the author's knowledge, no further analysis or public code improvements have been developed. This is in part because acceleration and parallelization techniques must be applied to the inner kernels of the baseline LVA codes. Long execution times are needed to generate small-scale domain simulations, making large-scale domain simulations a prohibitive task. The contributions of this thesis are the acceleration and parallelization of classical and LVA-based geostatistical simulation methods, particularly sequential simulation, which is one of the most common and computationally intensive methods in the field. This fact was recently remarked on by some of the main authors in the field, Gómez-Hernández and Srivastava (2021), which shows the relevance of this work today. Two main parallel algorithms and an optimized version of a kd-tree search implementation are presented, all of them applied to both classical and LVA-based sequential simulation implementations.
The first parallel algorithm concerns the parallel simulation of different domain points, after rearranging the order of simulation but preserving the exact results of a single-thread execution. The second parallel algorithm concerns the parallel search of neighbour points in the domain, which is used to build data dependencies for the parallel simulation of points. The optimized kd-tree search was used in each test case in order to reduce the computational complexity of neighbour search tasks. Its modified implementation reduces the number of branching instructions and introduces specialized code sections to accelerate the execution. The main focus is on multi-core architectures using OpenMP and optimization techniques applied to Fortran and C++ codes. Additionally, acceleration and parallelization techniques were also applied to auxiliary applications, such as shortest path and variogram calculation on hybrid CPU/GPU architectures using Fortran, C++ and CUDA codes. In the last application, an analytical and heuristic model was developed to estimate the optimal workload distribution between CPU and GPU in the hybrid context. The overall result of this work is a set of applications that will allow researchers and practitioners to dramatically accelerate the execution of their experiments and simulations; the accelerated codes presented are sgsim, sisim, sgs-lva and sisim-lva. Final speedup results of 11x and 50x are obtained for non-LVA codes using 16 threads, and 56x and 1822x are obtained for LVA codes using 20 threads. These tools can be combined with other geostatistical tools in order to improve the existing landscape of open source codes that can be used in practical scenarios. Postprint (published version).
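The idea of building data dependencies from a neighbour search, and then simulating independent points in parallel without changing the sequential result, can be illustrated with a small sketch. This is not the thesis code: the function names `earlier_neighbours` and `dependency_levels`, the brute-force search (standing in for the kd-tree), and the level-based grouping are all assumptions made for illustration.

```python
import random

def earlier_neighbours(points, path, radius, max_neigh):
    """For each node, list up to max_neigh closest nodes that precede it in the path.

    Brute-force search used for brevity; it depends only on coordinates and
    path order, so each node's search could run independently in parallel.
    """
    rank = {node: i for i, node in enumerate(path)}
    deps = {}
    for node in path:
        x, y = points[node]
        candidates = []
        for other in path[:rank[node]]:
            ox, oy = points[other]
            d2 = (x - ox) ** 2 + (y - oy) ** 2
            if d2 <= radius ** 2:
                candidates.append((d2, other))
        candidates.sort()
        deps[node] = [other for _, other in candidates[:max_neigh]]
    return deps

def dependency_levels(path, deps):
    """Group nodes into levels; nodes in one level do not depend on each other."""
    level = {}
    for node in path:  # path order guarantees dependencies are already levelled
        level[node] = 1 + max((level[d] for d in deps[node]), default=-1)
    groups = {}
    for node in path:
        groups.setdefault(level[node], []).append(node)
    return [groups[k] for k in sorted(groups)]

# Hypothetical 4 x 4 grid visited along a random path.
random.seed(0)
points = [(i, j) for i in range(4) for j in range(4)]
path = list(range(len(points)))
random.shuffle(path)
deps = earlier_neighbours(points, path, radius=1.5, max_neigh=4)
levels = dependency_levels(path, deps)
print(len(levels), "levels for", len(path), "nodes")
```

Because every node only reads values from nodes in strictly earlier levels, all nodes within a level could be simulated concurrently while using the same conditioning data as a single-thread run over the original path.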

    Acceleration strategies for large-scale sequential simulations using parallel neighbour search: Non-LVA and LVA scenarios

    This paper describes the application of acceleration techniques to existing implementations of Sequential Gaussian Simulation and Sequential Indicator Simulation. These implementations may incorporate Locally Varying Anisotropy (LVA) to capture non-linear features of the underlying physical phenomena. The implementation focuses on a novel parallel neighbour search algorithm, which can be used in both non-LVA and LVA codes. Additionally, parallel shortest path executions and optimized linear algebra libraries are applied, with a focus on LVA codes. Execution time, speedup and accuracy results are presented. Non-LVA codes are benchmarked using two scenarios with approximately 50 million domain points each. Speedup results of 2× and 4× were obtained on SGS and SISIM respectively, where each scenario is compared against a baseline code published in Peredo et al. (2018). The aggregated contribution to speedup of both works results in 12× and 50× respectively. LVA codes are benchmarked using two scenarios with approximately 1.7 million domain points each. Speedup results of 56× and 1822× were obtained on SGS and SISIM respectively, where each scenario is compared against the original baseline sequential codes. The authors acknowledge the donated resources from project PID2019-107255GB of the Spanish Ministerio de Economía y Competitividad, and project 2017-SGR-1414 from the Generalitat de Catalunya, Spain. Peer reviewed. Postprint (published version).

    Machine learning-based estimation and clustering of statistics within stratigraphic models as exemplified in Denmark

    Estimating a covariance model for kriging purposes is traditionally done using semivariogram analyses, where an empirical semivariogram is calculated, and a chosen semivariogram model, usually defined by a sill and a range, is fitted. We demonstrate that a convolutional neural network can estimate such a semivariogram model with comparable accuracy and precision by training it to recognise the relationship between realisations of Gaussian random fields and the sill and range values that define them, for a Gaussian-type semivariance model. We do this by training the network with synthetic data consisting of many such realisations with the sill and range as the target variables. Because training takes time, the method is best suited for cases where many models need to be estimated, since the actual estimation itself is about 70 times faster with the neural network than with the traditional approach. We demonstrate the viability of the method in three ways: (1) we test the model’s performance on the validation data, (2) we do a test where we compare the model to the traditional approach and (3) we show an example of an actual application of the method using the Danish national hydrostratigraphic model.
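The traditional approach mentioned in this abstract can be sketched as follows: compute an empirical semivariogram from the data, then compare it with a model defined by a sill and a range. The transect values, the function names and the factor 3 in the exponent (a "practical range" convention) are assumptions for illustration; the paper's own parameterization may differ, and no fitting step is shown.

```python
import math

def gaussian_model(h, sill, rng):
    """Gaussian-type semivariogram; reaches about 95% of the sill at h = rng."""
    return sill * (1.0 - math.exp(-3.0 * (h / rng) ** 2))

def empirical_semivariogram(z, max_lag):
    """Semivariance per integer lag for values on a regular 1D transect."""
    gamma = {}
    for lag in range(1, max_lag + 1):
        sq_diffs = [(z[i + lag] - z[i]) ** 2 for i in range(len(z) - lag)]
        gamma[lag] = sum(sq_diffs) / (2 * len(sq_diffs))
    return gamma

# Hypothetical observations along a transect with unit spacing.
z = [1.0, 2.0, 4.0, 3.0, 5.0]
gamma = empirical_semivariogram(z, max_lag=2)
for lag, value in gamma.items():
    print(lag, round(value, 4), round(gaussian_model(lag, sill=2.0, rng=3.0), 4))
```

In a traditional workflow, the sill and range would be adjusted until the model curve matches the empirical values, which is the step the neural network in the abstract is trained to replace.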

    Geodesic gaussian processes for the parametric reconstruction of a free-form surface

    Reconstructing a free-form surface from 3-dimensional (3D) noisy measurements is a central problem in inspection, statistical quality control, and reverse engineering. We present a new method for the statistical reconstruction of a free-form surface patch based on 3D point cloud data. The surface is represented parametrically, with each of the three Cartesian coordinates (x, y, z) a function of surface coordinates (u, v), a model form compatible with computer-aided-design (CAD) models. This model form also avoids having to choose one Euclidean coordinate (say, z) as a “response” function of the other two coordinate “locations” (say, x and y), as commonly used in previous Euclidean kriging models of manufacturing data. The (u, v) surface coordinates are computed using parameterization algorithms from the manifold learning and computer graphics literature. These are then used as locations in a spatial Gaussian process model that considers correlations between two points on the surface a function of their geodesic distance on the surface, rather than a function of their Euclidean distances over the xy plane. We show how the proposed geodesic Gaussian process (GGP) approach better reconstructs the true surface, filtering the measurement noise, than when using a standard Euclidean kriging model of the “heights”, that is, z(x, y). The methodology is applied to simulated surface data and to a real dataset obtained with a noncontact laser scanner. Supplementary materials are available online
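The difference between geodesic distance on the surface and Euclidean distance over the xy plane can be seen on a simple developable surface. The sketch below uses a half-cylinder, where geodesic distance equals the straight-line distance in unrolled (u, v) coordinates; the surface, the function names and the squared-exponential covariance are illustrative assumptions, not the parameterization or model used in the paper.

```python
import math

def surface_point(r, theta, v):
    """Point on a half-cylinder of radius r; theta in [0, pi], v along the axis."""
    return (r * math.cos(theta), v, r * math.sin(theta))

def geodesic_distance(r, p, q):
    """Distance along the surface; exact here because a cylinder unrolls flat."""
    (t1, v1), (t2, v2) = p, q
    return math.hypot(r * (t1 - t2), v1 - v2)

def xy_distance(r, p, q):
    """Distance between the xy-plane projections, as in kriging of z(x, y)."""
    x1, y1, _ = surface_point(r, *p)
    x2, y2, _ = surface_point(r, *q)
    return math.hypot(x1 - x2, y1 - y2)

def sq_exp_cov(d, variance=1.0, length=0.1):
    """Squared-exponential covariance as a function of distance d."""
    return variance * math.exp(-(d / length) ** 2)

# Two points on the steep side of the surface, given as (theta, v).
r = 1.0
p, q = (0.0, 0.0), (0.2, 0.0)
d_geo = geodesic_distance(r, p, q)
d_xy = xy_distance(r, p, q)
print("geodesic:", d_geo, "covariance:", sq_exp_cov(d_geo))
print("xy-plane:", d_xy, "covariance:", sq_exp_cov(d_xy))
```

On steep parts of the surface the xy-plane distance is much shorter than the path along the surface, so a covariance based on it treats the two points as far more correlated than the geodesic version does.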

    Lithofacies uncertainty modeling in a siliciclastic reservoir setting by incorporating geological contacts and seismic information

    Deterministic modeling provides only a single boundary layout, which depends on the geological interpretation or on interpolation from the available hard data. Changing the interpreter’s judgment or the interpolation parameters displaces these boundaries. In contrast, probabilistic modeling of geological domains such as lithofacies quantifies the uncertainty along the boundaries, and is therefore critical for informing decisions when evaluating oil reservoir parameters. Such stochastic modeling is particularly valuable in this setting. Conventional probabilistic approaches (object- and pixel-based) mostly fail to account for contact relationships between the simulated domains. The plurigaussian simulation algorithm, in contrast, reproduces the complex transitions among the lithofacies domains and has found wide acceptance for modeling petroleum reservoirs. The stationarity assumption in this framework implies a homogeneous characterization of the lithofacies: the proportions are assumed constant, and the covariance function, which describes spatial continuity, depends only on the Euclidean distance between two points. Where the region is heterogeneous, however, this assumption prevents the model from generating the desired variability of the underlying facies proportions over the domain. Geophysical attributes used as a secondary variable then play an important role in generating realistic contact relationships between the simulated categories.
In this paper, a hierarchical plurigaussian simulation approach is used to construct multiple realizations of lithofacies by incorporating acoustic impedance as soft data for an oil reservoir in Iran. This research was funded by the National Elites Foundation of Iran in collaboration with the Research Institute of Petroleum Industry in Iran under project number 9265005.

    SciKit-GStat 1.0: a SciPy-flavored geostatistical variogram estimation toolbox written in Python

    Geostatistical methods are widely used in almost all geoscientific disciplines, i.e., for interpolation, rescaling, data assimilation or modeling. At its core, geostatistics aims to detect, quantify, describe, analyze and model spatial covariance of observations. The variogram, a tool to describe this spatial covariance in a formalized way, is at the heart of every such method. Unfortunately, many applications of geostatistics focus on the interpolation method or the result rather than the quality of the estimated variogram. Not least because estimating a variogram is commonly left as a task for computers, and some software implementations do not even show a variogram to the user. This is a miss, because the quality of the variogram largely determines whether the application of geostatistics makes sense at all. Furthermore, the Python programming language was missing a mature, well-established and tested package for variogram estimation a couple of years ago. Here I present SciKit-GStat, an open-source Python package for variogram estimation that fits well into established frameworks for scientific computing and puts the focus on the variogram before more sophisticated methods are about to be applied. SciKit-GStat is written in a mutable, object-oriented way that mimics the typical geostatistical analysis workflow. Its main strength is the ease of use and interactivity, and it is therefore usable with only a little or even no knowledge of Python. During the last few years, other libraries covering geostatistics for Python developed along with SciKit-GStat. Today, the most important ones can be interfaced by SciKit-GStat. Additionally, established data structures for scientific computing are reused internally, to keep the user from learning complex data models, just for using SciKit-GStat. 
Common data structures along with powerful interfaces enable the user to use SciKit-GStat along with other packages in established workflows rather than forcing the user to stick to the author's programming paradigms. SciKit-GStat ships with a large number of predefined procedures, algorithms and models, such as variogram estimators, theoretical spatial models or binning algorithms. Common approaches to estimate variograms are covered and can be used out of the box. At the same time, the base class is very flexible and can be adjusted to less common problems, as well. Last but not least, it was made sure that a user is aided in implementing new procedures or even extending the core functionality as much as possible, to extend SciKit-GStat to uncovered use cases. With broad documentation, a user guide, tutorials and good unit-test coverage, SciKit-GStat enables the user to focus on variogram estimation rather than implementation details.

    New Developments in Covariance Modeling and Coregionalization for the Study and Simulation of Natural Phenomena

    Geostatistics focuses on modeling natural phenomena by univariate or multivariate spatial random fields. Most applications rely on the choice of a stationary model to represent the studied phenomenon. It is now acknowledged that this model is not flexible enough to adequately represent a natural phenomenon showing behaviors that vary substantially in space (a simple example of such heterogeneity is the problem of estimating overburden thickness in the presence of outcrops). For the univariate case, a few non-stationary models were developed recently. However, these models do not have compact support, which limits their range of application in practice. There is a definite need to enlarge the class of univariate non-stationary models, which is the first goal pursued by this thesis.

    Handbook of Mathematical Geosciences

    This Open Access handbook, published at the IAMG's 50th anniversary, presents a compilation of invited path-breaking research contributions by award-winning geoscientists who have been instrumental in shaping the IAMG. It contains 45 chapters that are broadly categorized into five parts: (i) theory, (ii) general applications, (iii) exploration and resource estimation, (iv) reviews, and (v) reminiscences, covering related topics such as mathematical geosciences, mathematical morphology, geostatistics, fractals and multifractals, spatial statistics, multipoint geostatistics, compositional data analysis, informatics, geocomputation, numerical methods, and chaos theory in the geosciences.

    Investigating 'optimal' kriging variance estimation: analytic and bootstrap estimators

    Kriging is a widely used group of techniques for predicting unobserved responses at specified locations using a set of observations obtained from known locations. Kriging predictors are best linear unbiased predictors (BLUPs), and the precision of the predictions obtained from them is assessed by the mean squared prediction error (MSPE), commonly termed the kriging variance.
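For reference, the relation between the predictor and the kriging variance can be written out for one common case. The equations below assume ordinary kriging with a known semivariogram $\gamma$ under intrinsic stationarity; the notation ($s_i$, $\lambda_i$, $m$) is chosen here for illustration and the work listed may treat other kriging variants.

```latex
% Ordinary kriging predictor at an unobserved location s_0
\hat{Z}(s_0) = \sum_{i=1}^{n} \lambda_i Z(s_i),
\qquad \sum_{i=1}^{n} \lambda_i = 1 .

% Weights and Lagrange multiplier m solve the kriging system
\sum_{j=1}^{n} \lambda_j \, \gamma(s_i - s_j) + m = \gamma(s_i - s_0),
\qquad i = 1, \dots, n .

% Minimized mean squared prediction error (kriging variance)
\sigma_K^2(s_0)
  = \mathrm{E}\!\left[ \bigl( Z(s_0) - \hat{Z}(s_0) \bigr)^2 \right]
  = \sum_{i=1}^{n} \lambda_i \, \gamma(s_i - s_0) + m .
```

The last expression depends on the data locations and on $\gamma$ but not on the observed values, which is why the quality of the semivariogram estimate matters for how well the kriging variance reflects actual prediction error.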

    ROCK PROPERTIES MODEL ANALYSIS MODEL REPORT
