52 research outputs found

    Surrogate regression modelling for fast seismogram generation and detection of microseismic events in heterogeneous velocity models

    This is the author accepted manuscript. The final version is available from Oxford University Press (OUP) via the DOI in this record. Given a 3D heterogeneous velocity model with a few million voxels, fast generation of accurate seismic responses at specified receiver positions from known microseismic event locations is a well-known challenge in geophysics, since it typically involves numerical solution of the computationally expensive elastic wave equation. Thousands of such forward simulations are often a routine requirement for parameter estimation of microseismic events via a suitable source inversion process. Parameter estimation based on forward modelling is often advantageous over a direct regression-based inversion approach when there is an unknown number of parameters to be estimated and the seismic data have complicated noise characteristics that may not allow a stable and unique solution in a direct inversion process. In this paper, starting from Graphics Processing Unit (GPU)-based synthetic simulations of a few thousand forward seismic shots due to microseismic events, obtained via a pseudo-spectral solution of the elastic wave equation, we develop a step-by-step process for building a surrogate regression modelling framework, using machine learning techniques, that can produce accurate seismograms at specified receiver locations. The trained surrogate models can then be used as a high-speed meta-model/emulator or proxy for the original full elastic wave propagator to generate seismic responses for other microseismic event locations as well. The accuracies of the surrogate models have been evaluated using two independent sets of training and testing Latin hypercube (LH) quasi-random samples drawn from a heterogeneous marine velocity model. The predicted seismograms have then been used to calculate batch likelihood functions with specified noise characteristics.
Finally, the trained models on 23 receivers placed at the sea bed in a marine velocity model are used to determine the maximum likelihood estimate (MLE) of the event locations, which can in future be used in a Bayesian analysis for microseismic event detection. This work has been supported by Shell Projects and Technology. The Wilkes high-performance GPU computing service at the University of Cambridge was used in this work.
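    The forward-modelling-plus-likelihood workflow described above can be sketched in a few lines. The snippet below is a toy illustration, not the paper's GPU pipeline: `surrogate_seismogram` is a hypothetical stand-in for a trained surrogate, the candidate locations are plain uniform draws standing in for the Latin hypercube design, and the Gaussian log-likelihood and argmax step mirror the batch-likelihood and MLE calculation mentioned in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Time axis for a single-receiver seismogram.
t = np.linspace(0.0, 1.0, 200)

def surrogate_seismogram(loc):
    # Hypothetical stand-in for a trained surrogate: a Ricker-like wavelet
    # whose arrival time grows with source-receiver distance (illustrative
    # only; the paper trains regression models on elastic-wave simulations).
    delay = 0.2 + 0.05 * np.linalg.norm(loc)
    tau = (t - delay) / 0.03
    return (1.0 - 2.0 * tau**2) * np.exp(-(tau**2))

def log_likelihood(observed, predicted, sigma):
    # Gaussian log-likelihood with i.i.d. noise of standard deviation sigma.
    r = observed - predicted
    return -0.5 * np.sum(r**2) / sigma**2 - r.size * np.log(sigma)

# A "true" event and a noisy observation at the receiver.
true_loc = np.array([1.0, 2.0, 3.0])
sigma = 0.05
observed = surrogate_seismogram(true_loc) + sigma * rng.normal(size=t.size)

# Candidate event locations (plain uniform draws standing in for the
# Latin hypercube design used in the paper).
candidates = rng.uniform(0.0, 4.0, size=(500, 3))
ll = np.array([log_likelihood(observed, surrogate_seismogram(c), sigma)
               for c in candidates])
mle = candidates[np.argmax(ll)]
print("MLE event location:", mle)
```

    Because the toy surrogate depends only on source-receiver distance, the MLE recovers the event's distance rather than its full position; with several receivers, as in the paper's 23-receiver setup, the location itself becomes identifiable.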

    Hybrid ACO and SVM algorithm for pattern classification

    Ant Colony Optimization (ACO) is a metaheuristic algorithm that can be used to solve a variety of combinatorial optimization problems. A new direction for ACO is to optimize continuous and mixed (discrete and continuous) variables. Support Vector Machine (SVM) is a pattern classification approach that originated in statistical learning. However, SVM suffers from two main problems: feature subset selection and parameter tuning. Most approaches to tuning SVM parameters discretize their continuous values, which can degrade classification performance. This study presents four algorithms for tuning the SVM parameters and selecting the feature subset, improving SVM classification accuracy with a smaller feature subset. This is achieved by performing SVM parameter tuning and feature subset selection simultaneously. Hybrid algorithms combining ACO and SVM techniques were proposed. The first two algorithms, ACOR-SVM and IACOR-SVM, tune the SVM parameters, while the other two, ACOMV-R-SVM and IACOMV-R-SVM, tune the SVM parameters and select the feature subset simultaneously. Ten benchmark datasets from the University of California, Irvine, were used in the experiments to validate the performance of the proposed algorithms. The results obtained with the proposed algorithms compare favourably with other approaches in terms of classification accuracy and feature subset size. The average classification accuracies for the ACOR-SVM, IACOR-SVM, ACOMV-R-SVM and IACOMV-R-SVM algorithms are 94.73%, 95.86%, 97.37% and 98.1%, respectively. The average feature subset size is eight for the ACOR-SVM and IACOR-SVM algorithms and four for the ACOMV-R-SVM and IACOMV-R-SVM algorithms. This study contributes a new direction for ACO: handling continuous and mixed-variable problems.
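    The simultaneous tuning-and-selection idea can be illustrated with an archive-based (ACOR-style) search over one continuous kernel parameter and a binary feature mask. Everything below is a hedged toy: the dataset is synthetic, the fitness function uses a simple leave-one-out RBF similarity classifier as a stand-in for the SVM, and the sampling rule is only a loose analogue of pheromone-guided construction, not the paper's ACOMV-R algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: 200 samples, 6 features, only features 0 and 1 informative.
n = 200
X = rng.normal(size=(n, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

def fitness(log_gamma, mask):
    # Stand-in for SVM cross-validation accuracy: a leave-one-out RBF
    # similarity classifier on the selected features, penalised by
    # feature subset size.
    if not mask.any():
        return 0.0
    Xs = X[:, mask]
    gamma = np.exp(log_gamma)
    d2 = ((Xs[:, None, :] - Xs[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * d2)
    np.fill_diagonal(K, 0.0)  # exclude self-similarity (leave-one-out)
    s0 = K[:, y == 0].sum(axis=1)
    s1 = K[:, y == 1].sum(axis=1)
    acc = np.mean((s1 > s0) == (y == 1))
    return acc - 0.01 * mask.sum()

# ACOR-style archive: each entry is (fitness, log_gamma, feature_mask).
k, n_ants, n_iters = 10, 20, 30
archive = []
for _ in range(k):
    g, m = rng.normal(), rng.random(6) > 0.5
    archive.append((fitness(g, m), g, m))
archive.sort(key=lambda s: -s[0])

for _ in range(n_iters):
    for _ in range(n_ants):
        # New ant: perturb the continuous parameter of a good archive
        # member and flip mask bits with small probability (a loose
        # analogue of pheromone-guided sampling).
        _, g0, m0 = archive[rng.integers(0, k // 2)]
        g = g0 + 0.3 * rng.normal()
        m = m0 ^ (rng.random(6) < 0.15)
        f = fitness(g, m)
        if f > archive[-1][0]:
            archive[-1] = (f, g, m)
            archive.sort(key=lambda s: -s[0])

best_f, best_g, best_m = archive[0]
print("selected features:", np.flatnonzero(best_m), "fitness:", round(best_f, 3))
```

    The key design point the abstract argues for is visible here: the continuous parameter and the discrete mask are sampled and scored together in one search, rather than tuning parameters on a pre-fixed feature subset.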

    Data Science-Based Full-Lifespan Management of Lithium-Ion Battery

    This open access book comprehensively consolidates studies in the rapidly emerging field of battery management. Its primary focus is to survey new and emerging data science technologies for full-lifespan management of Li-ion batteries, categorized into three groups: (i) battery manufacturing management, (ii) battery operation management, and (iii) battery reutilization management. The key challenges, future trends, and promising data-science technologies for further advancing this research field are discussed. As battery full-lifespan (manufacturing, operation, and reutilization) management is an active research topic in both the energy and AI fields, and no previous book has systematically described it from a data science perspective, this book should appeal to academics, scientists, engineers, and practitioners. It is useful as a reference book for students and graduates working in related fields. Readers will gain not only the basics of battery manufacturing, operation, and reutilization but also knowledge of the related data-science technologies. The step-by-step guidance, comprehensive introduction, and case studies make the topic accessible to audiences of different levels, from graduates to experienced engineers.

    Integrated ACOR/IACOMV-R-SVM Algorithm

    A direction for ACO is to optimize continuous and mixed (discrete and continuous) variables in solving problems with various types of data. Support Vector Machine (SVM), which originates from the statistical approach, is a widely used present-day classification technique. The main problems of SVM are selecting the feature subset and tuning the parameters. Discretizing the continuous values of the parameters is the most common approach to tuning SVM parameters, but this process loses information and affects classification accuracy. This paper presents two algorithms that can simultaneously tune SVM parameters and select the feature subset. The first algorithm, ACOR-SVM, tunes the SVM parameters, while the second, IACOMV-R-SVM, simultaneously tunes the SVM parameters and selects the feature subset. Three benchmark UCI datasets were used in the experiments to validate the performance of the proposed algorithms. The results show that the proposed algorithms perform well compared with other approaches.

    Large-scale Machine Learning in High-dimensional Datasets


    Multiscale Modeling and Gaussian Process Regression for Applications in Composite Materials

    An ongoing challenge in advanced materials design is the development of accurate multiscale models that consider uncertainty while establishing a link between knowledge about constituent materials and overall composite properties. Successful models can accurately predict composite properties, reducing the high financial and labor costs associated with experimental determination and accelerating material innovation. Whereas early pioneers in micromechanics developed simplistic theoretical models to map these relationships, modern advances in computer technology have enabled detailed simulators capable of accurately predicting complex and multiscale phenomena. This work advances domain knowledge in two ways: firstly, through the development of high-fidelity, physics-based finite element (FE) models of composite microstructures that incorporate uncertainty in their predictions, and secondly, through the development of a novel inverse analysis framework that enables the discovery of unknown or obscure constituent properties using literature data and Gaussian process (GP) surrogate models trained on FE model predictions. This work presents a generalizable approach to modeling a diverse array of composite subtypes, from a simple particulate system to a complex commercial composite. The inverse analysis framework was demonstrated for a thermoplastic composite reinforced by spherical fillers with unknown interphase properties. The framework leverages computer model simulations with easily obtainable macroscale elastic property measurements to infer interphase properties that are otherwise challenging to measure. The interphase modulus and thickness were determined for six different thermoplastic composites: four reinforced by micron-scale particles and two by nano-scale particles.
An alginate fiber embedded with a helically symmetric arrangement of cellulose nanocrystals (CNCs) was investigated using multiscale FE analysis to quantify microstructural uncertainty and its subsequent effect on macroscopic behavior. The macroscale uniaxial tensile simulation revealed that the microstructure induces internal stresses sufficient to rotate or twist the fiber about its axis. The reduction in axial elastic modulus with increasing CNC spiral angle was quantified in a sensitivity analysis using a GP surrogate modeling approach. A predictive model using GP regression was employed to investigate the link between input features and the mechanical properties of fiberglass-reinforced magnesium oxychloride (MOC) cement boards produced by a commercial process. The model evaluated the effect of formulation, crystalline phase compositions, and process control parameters on various mechanical performance metrics.
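    The surrogate-plus-inverse-analysis loop described above can be sketched with a from-scratch Gaussian process: train the GP on a handful of "simulator" runs, then invert a measured macroscale property by searching the GP mean. The `fe_model` function and all numbers below are illustrative stand-ins, not the actual FE models or material properties from this work.

```python
import numpy as np

# Toy stand-in for a finite element model: effective modulus as a function
# of one hypothetical interphase property (all numbers illustrative).
def fe_model(x):
    return 2.0 + 1.5 * np.tanh(2.0 * (x - 1.0))

# Train a Gaussian process surrogate on a handful of "simulations".
X_train = np.linspace(0.0, 2.0, 8)
y_train = fe_model(X_train)

def rbf(a, b, ell=0.4, var=1.0):
    # Squared-exponential covariance between 1-D input arrays a and b.
    return var * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

jitter = 1e-6  # small diagonal term for numerical stability
K = rbf(X_train, X_train) + jitter * np.eye(X_train.size)
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))

def gp_predict(x_star):
    # GP posterior mean and standard deviation at new inputs.
    Ks = rbf(x_star, X_train)
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = rbf(x_star, x_star).diagonal() - (v**2).sum(axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

# Inverse analysis: given a measured macroscale modulus, find the
# interphase property whose surrogate prediction matches it best.
measured = fe_model(1.3)          # pretend this came from an experiment
grid = np.linspace(0.0, 2.0, 401)
mean, sd = gp_predict(grid)
x_inferred = grid[np.argmin(np.abs(mean - measured))]
print("inferred interphase property:", x_inferred)
```

    The returned standard deviation is what makes the GP useful here: where `sd` is large the inferred property is poorly constrained, which is the uncertainty information the thesis propagates through its framework.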

    Generalising history matching for enhanced calibration of computer models

    Get PDF
    History matching using Gaussian process emulators is a well-known methodology for the calibration of computer models. It uses emulators to identify the parts of input parameter space that are likely to produce mismatches between simulator outputs and physical observations; these parts are then ruled out. The remaining "Not Ruled Out Yet" (NROY) input space is then searched for good matches by repeating the history matching process. The first section of this thesis illustrates an easily neglected limitation of standard history matching: the emulator must simulate the target NROY space well, or good parameter choices can be ruled out. We show that this can happen even when an emulator passes standard diagnostic checks on the whole parameter space. We present novel methods for detecting these cases, together with a Local Voronoi Tessellation method that provides a robust approach to calibration, ensuring that the true NROY space is retained and parameter inference is not biased. The remainder of this thesis develops a generalised history matching for calibrating computer models with high-dimensional output. We address another limitation of standard (PCA-based) history matching, which works well only when the parameters control the strength of various patterns. We show that when the parameters control the position of patterns, e.g. shifting currents, current approaches will not generally be able to calibrate these models. To overcome this, we extend history matching to a kernel feature space, in which outputs with moving patterns can be compared with the observations. We develop kernel-based history matching as a generalisation of history matching and examine the multiple possible interpretations of the usual implausibility measure and threshold for defining NROY.
Automatic kernel selection based on expert modeller judgement is introduced, enabling experts to specify the important features that the model should be able to reproduce.
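    The core history-matching computation is the implausibility measure I(x) = |z - E[f(x)]| / sqrt(Var_em + Var_obs + Var_disc), with inputs where I(x) exceeds a threshold (commonly 3) ruled out. A minimal sketch for a scalar output, using a hypothetical stand-in emulator rather than a trained GP:

```python
import numpy as np

def emulator(x):
    # Hypothetical stand-in for a trained GP emulator: returns the
    # emulator mean and variance at each input (illustrative only).
    mean = np.sin(3.0 * x) + 0.5 * x
    var = 0.005 * (1.0 + x**2)
    return mean, var

z = 0.8          # physical observation
var_obs = 0.02   # observation-error variance
var_disc = 0.01  # model-discrepancy variance
threshold = 3.0  # the usual cut-off, motivated by Pukelsheim's 3-sigma rule

x = np.linspace(0.0, 3.0, 1000)
mean, var_em = emulator(x)

# Implausibility: standardised distance between observation and emulator
# mean, accounting for emulator, observation, and discrepancy variances.
implausibility = np.abs(z - mean) / np.sqrt(var_em + var_obs + var_disc)
nroy = x[implausibility < threshold]
print(f"NROY fraction: {nroy.size / x.size:.2f}")
```

    In a real wave of history matching, the NROY set would then be resampled, the simulator rerun there, and the emulator refit; the thesis's kernel-based generalisation replaces the scalar comparison above with one computed in a kernel feature space.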

    Statistical and Machine Learning Models for Remote Sensing Data Mining - Recent Advancements

    This book is a reprint of the Special Issue entitled "Statistical and Machine Learning Models for Remote Sensing Data Mining - Recent Advancements" that was published in Remote Sensing, MDPI. It provides insights into both core technical challenges and selected critical applications of satellite remote sensing image analytics.

    Quantifying Vegetation Biophysical Variables from Imaging Spectroscopy Data: A Review on Retrieval Methods

    An unprecedented spectroscopic data stream will soon become available with forthcoming Earth-observing satellite missions equipped with imaging spectroradiometers. This data stream will open up a vast array of opportunities to quantify a diversity of biochemical and structural vegetation properties. Processing such large data streams requires reliable retrieval techniques that enable the spatiotemporally explicit quantification of biophysical variables. With the aim of preparing for this new era of Earth observation, this review summarizes the state-of-the-art retrieval methods that have been applied in experimental imaging spectroscopy studies inferring all kinds of vegetation biophysical variables. The identified retrieval methods are categorized into: (1) parametric regression, including vegetation indices, shape indices and spectral transformations; (2) nonparametric regression, including linear and nonlinear machine learning regression algorithms; (3) physically based methods, including inversion of radiative transfer models (RTMs) using numerical optimization and look-up table approaches; and (4) hybrid regression methods, which combine RTM simulations with machine learning regression. For each of these categories, an overview of widely applied methods for mapping vegetation properties is given. In view of processing imaging spectroscopy data, a critical aspect is the challenge of dealing with spectral multicollinearity. The ability to provide robust estimates, retrieval uncertainties and acceptable processing speed are other important aspects for operational processing. Recommendations towards new-generation spectroscopy-based processing chains for the operational production of biophysical variables are given.
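    The hybrid category (4) can be sketched as: simulate a training database with an RTM, fit a machine-learning regressor to the inverse mapping, then apply it to observed spectra. The toy "RTM" and kernel ridge regressor below are illustrative assumptions, not PROSAIL or any operational retrieval chain; all band positions and constants are made up.

```python
import numpy as np

rng = np.random.default_rng(3)

def rtm(lai):
    # Toy "radiative transfer model": reflectance in four bands as a
    # saturating function of leaf area index (illustrative; not PROSAIL).
    bands = np.array([0.5, 1.0, 2.0, 4.0])
    return 0.4 * (1.0 - np.exp(-np.outer(lai, bands) / 4.0))

# Hybrid step 1: simulate a training database with the RTM.
lai_train = rng.uniform(0.0, 8.0, 300)
R_train = rtm(lai_train) + 0.005 * rng.normal(size=(300, 4))

# Hybrid step 2: fit a machine-learning regressor (kernel ridge) to the
# inverse mapping from spectra to the biophysical variable.
def rbf_kernel(A, B, gamma=300.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

lam = 1e-3  # ridge regularisation
K = rbf_kernel(R_train, R_train)
coef = np.linalg.solve(K + lam * np.eye(K.shape[0]), lai_train)

def retrieve(R):
    # Fast retrieval: one kernel evaluation per observed spectrum.
    return rbf_kernel(R, R_train) @ coef

# Hybrid step 3: apply the trained regressor to "observed" spectra.
lai_true = np.array([1.5, 4.0, 6.5])
R_obs = rtm(lai_true)
print("retrieved LAI:", retrieve(R_obs))
```

    This illustrates why the review highlights hybrid methods for operational use: the physics lives in the (offline) simulation step, while the per-pixel retrieval cost is a single regression evaluation.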