
    Uncertainty and sensitivity analysis for long-running computer codes: a critical review

    Thesis (S.M.), Massachusetts Institute of Technology, Dept. of Nuclear Science and Engineering, February 2010. Cataloged from the PDF version of the thesis. Includes bibliographical references (p. 137-146).

    This thesis presents a critical review of existing methods for performing probabilistic uncertainty and sensitivity analysis for complex, computationally expensive simulation models. Uncertainty analysis (UA) methods reviewed include standard Monte Carlo simulation, Latin Hypercube sampling, importance sampling, line sampling, and subset simulation. Sensitivity analysis (SA) methods include scatter plots, Monte Carlo filtering, regression analysis, variance-based methods (Sobol' sensitivity indices and Sobol' Monte Carlo algorithms), and Fourier amplitude sensitivity tests. In addition, this thesis reviews several existing metamodeling techniques that are intended to provide quick-running approximations to the computer models being studied. Because stochastic simulation-based UA and SA rely on a large number (e.g., several thousands) of simulations, metamodels are recognized as a necessary compromise when UA and SA must be performed with long-running (i.e., several hours or days per simulation) computational models. This thesis discusses the use of polynomial Response Surfaces (RS), Artificial Neural Networks (ANN), and Kriging/Gaussian Processes (GP) for metamodeling. Moreover, two methods are discussed for estimating the uncertainty introduced by the metamodel. The first of these methods is based on a bootstrap sampling procedure and can be utilized with any metamodeling technique. The second method is specific to GP models and is based on a Bayesian interpretation of the underlying stochastic process. Finally, to demonstrate the use of these methods, the results from two case studies involving the reliability assessment of passive nuclear safety systems are presented.

    The general conclusions of this work are that polynomial RSs are frequently incapable of adequately representing the complex input/output behavior exhibited by many mechanistic models. In addition, the goodness-of-fit of the RS should not be misinterpreted as a measure of the predictive capability of the metamodel, since RSs are necessarily biased predictors for deterministic computer models. Furthermore, the extent of this bias is not measured by standard goodness-of-fit metrics (e.g., the coefficient of determination, R²), so these metrics tend to provide overly optimistic indications of the quality of the metamodel. The bootstrap procedure does provide an indication of the extent of this bias, with the bootstrap confidence intervals for the RS estimates generally being significantly wider than those of the alternative metamodeling methods. It has been found that the added flexibility afforded by ANNs and GPs can make these methods superior for approximating complex models. In addition, GPs are exact interpolators, which is an important feature when the underlying computer model is deterministic (i.e., when there is no justification for including a random error component in the metamodel). On the other hand, when the number of observations from the computer model is sufficiently large, all three methods appear to perform comparably, indicating that in such cases RSs can still provide useful approximations.

    by Dustin R. Langewisch. S.M.
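The bootstrap procedure for metamodel uncertainty described in this abstract lends itself to a short illustration. The sketch below, which assumes a quadratic response surface fitted with scikit-learn and synthetic stand-ins for the expensive code runs, resamples the training runs with replacement, refits the metamodel, and reports percentile confidence intervals on its predictions; it illustrates the general idea only and is not the thesis code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

# Stand-ins for a handful of long-running code runs: X_train are input vectors,
# y_train the corresponding code outputs, X_new the points to be predicted.
X_train = rng.uniform(0.0, 1.0, size=(50, 3))
y_train = np.sin(3.0 * X_train[:, 0]) + X_train[:, 1] ** 2
X_new = rng.uniform(0.0, 1.0, size=(10, 3))

n_boot = 500
preds = np.empty((n_boot, len(X_new)))
for b in range(n_boot):
    # Resample the training runs with replacement and refit the metamodel.
    idx = rng.integers(0, len(X_train), size=len(X_train))
    rs = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
    rs.fit(X_train[idx], y_train[idx])
    preds[b] = rs.predict(X_new)

# Percentile bootstrap confidence intervals on the metamodel predictions; wide
# intervals flag inputs where the response surface is an unreliable predictor.
lower, upper = np.percentile(preds, [2.5, 97.5], axis=0)
```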

    Uncertainty-Integrated Surrogate Modeling for Complex System Optimization

    Approximation models such as surrogate models provide a tractable substitute for expensive physical simulations and an effective solution to the potential lack of quantitative models of system behavior. These capabilities not only enable the efficient design of complex systems, but are also essential for the effective analysis of physical phenomena and characteristics in Engineering, Material Science, Biomedical Science, and various other disciplines. Since these models provide an abstraction of the real system behavior (often a low-fidelity representation), it is important to quantify the accuracy and reliability of such approximation models without investing additional expensive system evaluations (simulations or physical experiments). Standard error measures, such as the mean squared error, the cross-validation error, and Akaike's information criterion, however provide limited (often inadequate) information regarding the accuracy of the final surrogate model, while other more effective dedicated error measures are tailored to only one class of surrogate models. This lack of accuracy information and of the ability to compare and test diverse surrogate models reduces confidence in model application, restricts appropriate model selection, and undermines the effectiveness of surrogate-based optimization.

    A key contribution of this dissertation is the development of a new model-independent approach to quantify the fidelity of a trained surrogate model in a given region of the design domain. This method is called the Predictive Estimation of Model Fidelity (PEMF). The PEMF method is derived from the hypothesis that the accuracy of an approximation model is related to the amount of data resources leveraged to train the model. In PEMF, intermediate surrogate models are iteratively constructed over heuristic subsets of sample points. The median and the maximum errors estimated over the remaining points are used to determine the respective error distributions at each iteration. The estimated modes of the error distributions are represented as functions of the density of intermediate training points through nonlinear regression, assuming a smooth decreasing trend of errors with increasing sample density. These regression functions are then used to predict the expected median and maximum errors in the final surrogate models. It is observed that the model fidelities estimated by PEMF are up to two orders of magnitude more accurate and statistically more stable than those based on the popularly used leave-one-out cross-validation method, when applied to a variety of benchmark problems.

    By leveraging this new paradigm for quantifying the fidelity of surrogate models, a novel automated surrogate model selection framework is also developed. This PEMF-based model selection framework is called the Concurrent Surrogate Model Selection (COSMOS). COSMOS, unlike existing model selection methods, coherently operates at all three levels necessary to facilitate optimal selection, i.e., (1) selecting the model type, (2) selecting the kernel function type, and (3) determining the optimal values of the typically user-prescribed parameters. The selection criteria that guide optimal model selection are determined by PEMF, and the search process is performed using a mixed-integer nonlinear programming (MINLP) solver.
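The PEMF procedure described above can be sketched compactly: intermediate surrogates are fitted on subsets of increasing size, the median held-out error is recorded for each size, and a smooth decreasing trend is regressed and extrapolated to the full sample size. The sketch below uses a kernel ridge surrogate, a power-law error trend and the median in place of the distribution mode; these are illustrative assumptions, not the dissertation's exact formulation.

```python
import numpy as np
from scipy.optimize import curve_fit
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(120, 2))
y = np.sin(4 * X[:, 0]) * np.cos(3 * X[:, 1])          # stand-in for the expensive model

subset_sizes = [30, 50, 70, 90]
median_errors = []
for n in subset_sizes:
    errs = []
    for _ in range(20):                                 # heuristic subsets per size
        idx = rng.choice(len(X), size=n, replace=False)
        hold = np.setdiff1d(np.arange(len(X)), idx)     # remaining (held-out) points
        model = KernelRidge(kernel="rbf", alpha=1e-3, gamma=2.0).fit(X[idx], y[idx])
        errs.append(np.median(np.abs(model.predict(X[hold]) - y[hold])))
    median_errors.append(np.median(errs))               # median used here in place of the mode

# Smoothly decreasing error-vs-size trend, extrapolated to the full data set.
def power_law(n, a, b):
    return a * np.asarray(n, dtype=float) ** (-b)

(a, b), _ = curve_fit(power_law, subset_sizes, median_errors, p0=(1.0, 0.5))
predicted_error_full = power_law(len(X), a, b)
print(f"Predicted median error of the final surrogate: {predicted_error_full:.4f}")
```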
The effectiveness of COSMOS is demonstrated by successfully applying it to different benchmark and practical engineering problems, where it offers a first-of-its-kind globally competitive model selection. In this dissertation, the knowledge about the accuracy of a surrogate estimated using PEMF is also applied to develop a novel model management approach for engineering optimization. This approach adaptively selects computational models (both physics-based models and surrogate models) of differing levels of fidelity and computational cost to be used during optimization, with the overall objective of yielding optimal designs with high-fidelity function estimates at a reasonable computational expense. In this technique, a new adaptive model switching (AMS) metric is defined to guide the switch from one model to the next higher-fidelity model during the optimization process. The switching criterion is based on whether the uncertainty associated with the current model output dominates the latest improvement of the relative fitness function, where both the model output uncertainty and the function improvement (across the population) are expressed as probability distributions. This adaptive model switching technique is applied to two practical problems through Particle Swarm Optimization to successfully illustrate: (i) the computational advantage of this method over purely high-fidelity model-based optimization, and (ii) the accuracy advantage of this method over purely low-fidelity model-based optimization. Motivated by the unique capabilities of the model switching concept, a new model refinement approach is also developed in this dissertation. The model refinement approach can be perceived as an adaptive sequential sampling approach applied in surrogate-based optimization. Decisions regarding when to perform additional system evaluations to refine the model are guided by the same model-uncertainty principles as in the adaptive model switching technique. The effectiveness of this new model refinement technique is illustrated through application to practical surrogate-based optimization in the area of energy sustainability.
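The AMS switching criterion lends itself to a small numerical sketch: given samples of the current model's output error and of the recent fitness improvement across the population, a switch is triggered when the error distribution dominates the improvement distribution. The Monte Carlo dominance estimate, the 0.5 threshold and all variable names below are assumptions for illustration, not the dissertation's exact criterion.

```python
import numpy as np

def should_switch(model_error_samples, improvement_samples, threshold=0.5, rng=None):
    """Return True if P(model uncertainty > fitness improvement) exceeds the threshold."""
    if rng is None:
        rng = np.random.default_rng()
    err = rng.choice(model_error_samples, size=10_000, replace=True)
    imp = rng.choice(improvement_samples, size=10_000, replace=True)
    return np.mean(err > imp) > threshold

# Example: errors of the current low-fidelity model vs. the recent improvement of
# the best fitness across the particle swarm population (synthetic values).
errors = np.abs(np.random.default_rng(2).normal(0.05, 0.02, size=200))
improvements = np.abs(np.random.default_rng(3).normal(0.01, 0.005, size=30))
if should_switch(errors, improvements):
    print("Uncertainty dominates recent improvement: switch to the higher-fidelity model.")
```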

    Random Forest Spatial Interpolation

    For many decades, kriging and deterministic interpolation techniques, such as inverse distance weighting and nearest neighbour interpolation, have been the most popular spatial interpolation techniques. Kriging with external drift and regression kriging have become basic techniques that benefit both from spatial autocorrelation and covariate information. More recently, machine learning techniques, such as random forest and gradient boosting, have become increasingly popular and are now often used for spatial interpolation. Some attempts have been made to explicitly take the spatial component into account in machine learning, but so far, none of these approaches have taken the natural route of incorporating the nearest observations and their distances to the prediction location as covariates. In this research, we explored the value of including observations at the nearest locations and their distances from the prediction location by introducing Random Forest Spatial Interpolation (RFSI). We compared RFSI with deterministic interpolation methods, ordinary kriging, regression kriging, Random Forest and Random Forest for spatial prediction (RFsp) in three case studies. The first case study made use of synthetic data, i.e., simulations from normally distributed stationary random fields with a known semivariogram, for which ordinary kriging is known to be optimal. The second and third case studies evaluated the performance of the various interpolation methods using daily precipitation data for the 2016–2018 period in Catalonia, Spain, and mean daily temperature for the year 2008 in Croatia. Results of the synthetic case study showed that RFSI outperformed most simple deterministic interpolation techniques and performed similarly to inverse distance weighting and RFsp. As expected, kriging was the most accurate technique in the synthetic case study. In the precipitation and temperature case studies, RFSI mostly outperformed regression kriging, inverse distance weighting, random forest, and RFsp. Moreover, RFSI was substantially faster than RFsp, particularly when the training dataset was large and high-resolution prediction maps were made.
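The core RFSI idea, using the values of the nearest observations and their distances to the prediction location as covariates for a random forest, can be sketched briefly. The sketch below uses synthetic data with scikit-learn and SciPy; the variable names and the choice of five neighbours are illustrative, and the authors' reference implementation would include further environmental covariates.

```python
import numpy as np
from scipy.spatial import cKDTree
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 100.0, size=(500, 2))          # observation locations
values = np.sin(coords[:, 0] / 10.0) + rng.normal(0.0, 0.1, 500)

def rfsi_features(query_coords, obs_coords, obs_values, n_neighbors=5, skip_self=False):
    """Nearest-observation values and their distances as covariates."""
    tree = cKDTree(obs_coords)
    k = n_neighbors + 1 if skip_self else n_neighbors
    dist, idx = tree.query(query_coords, k=k)
    if skip_self:                                         # drop each point's own zero-distance entry
        dist, idx = dist[:, 1:], idx[:, 1:]
    return np.hstack([obs_values[idx], dist])

# Train on the observations (excluding each point's own value), then predict on a grid.
X_train = rfsi_features(coords, coords, values, n_neighbors=5, skip_self=True)
rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_train, values)

gx, gy = np.meshgrid(np.linspace(0, 100, 50), np.linspace(0, 100, 50))
grid = np.column_stack([gx.ravel(), gy.ravel()])
pred = rf.predict(rfsi_features(grid, coords, values, n_neighbors=5))
```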

    Using deep learning for multivariate mapping of soil with quantified uncertainty

    Digital soil mapping (DSM) techniques are widely employed to generate soil maps. Soil properties are typically predicted individually, while ignoring the interrelation between them. Models for predicting multiple properties exist, but they are computationally demanding and often fail to provide an accurate description of the associated uncertainty. In this paper a convolutional neural network (CNN) model is described to predict several soil properties with quantified uncertainty. CNN has the advantage that it incorporates spatial contextual information from the environmental covariates surrounding an observation. A single CNN model can be trained to predict multiple soil properties simultaneously. I further propose a two-step approach to estimate the uncertainty of the prediction for mapping using a neural network model. The methodology is tested by mapping six soil properties over the French metropolitan territory using measurements from the LUCAS dataset and a large set of environmental covariates portraying the factors of soil formation. Results indicate that the multivariate CNN model produces accurate maps, as shown by the coefficient of determination and the concordance correlation coefficient, compared to a conventional machine learning technique. For this country-extent mapping, the maps predicted by CNN have a detailed pattern with significant spatial variation. Evaluation of the uncertainty maps using the median of the standardized squared prediction error and accuracy plots suggests that the uncertainty was accurately quantified, albeit slightly underestimated. Tests conducted using different window sizes of input covariates to predict the soil properties indicate that CNN benefits from using local contextual information within a radius of 4.5 km. I conclude that CNN is an effective model to predict several soil properties and that the associated uncertainty can be accurately quantified with the proposed approach.
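A multivariate CNN of the kind described above can be sketched in a few lines of PyTorch: the input is a window of environmental covariates centred on each observation, and a single output layer predicts all soil properties jointly. The layer sizes, the 15x15 window and the ten covariate channels are placeholders rather than the paper's architecture, and the two-step uncertainty estimation is not shown.

```python
import torch
import torch.nn as nn

class MultiSoilCNN(nn.Module):
    """Jointly predicts several soil properties from a patch of covariates."""
    def __init__(self, n_covariates=10, n_properties=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(n_covariates, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, n_properties)    # one output per soil property

    def forward(self, x):                           # x: (batch, covariates, window, window)
        return self.head(self.features(x))

model = MultiSoilCNN()
patch = torch.randn(8, 10, 15, 15)                  # a batch of covariate windows
predictions = model(patch)                          # shape (8, 6): six properties at once
```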

    Robust Surrogate Models for Uncertainty Quantification and Nuclear Engineering Applications

    In this thesis, a framework is proposed that quantifies the uncertainties introduced when surrogate models are used for Uncertainty Quantification. The proposed framework has been applied to a variety of Nuclear Engineering problems.

    Application of machine learning in operational flood forecasting and mapping

    Considering the computational effort and expertise required to simulate 2D hydrodynamic models, it is widely understood that it is practically impossible to run these types of models during a real-time flood event. To allow for real-time flood forecasting and mapping, an automated, computationally efficient and robust data-driven modelling engine has been proposed as an alternative to traditional 2D hydraulic models. The concept of a computationally efficient model relies heavily on replacing time-consuming 2D hydrodynamic software packages with a simplified model structure that is fast, reliable and robustly retains sufficient accuracy for applications in real-time flood forecasting, mapping and sequential updating. This thesis presents a novel data-driven modelling framework that uses rainfall data from meteorological stations to forecast flood inundation maps. The proposed framework takes advantage of highly efficient machine learning (ML) algorithms and also utilizes state-of-the-art hydraulic models as a system component. The aim of this research has been to develop an integrated system in which a data-driven rainfall-streamflow forecasting model sets up the upstream boundary conditions for machine learning based classifiers, which then map out multi-step-ahead flood extents during an extreme flood event.

    To achieve the aim and objectives of this research, firstly, a comprehensive investigation was undertaken to search for a robust ML-based multi-step-ahead rainfall-streamflow forecasting model. Three potential models were tested: Support Vector Regression (SVR), Deep Belief Network (DBN) and Wavelet-decomposed Artificial Neural Network (WANN). The analysis revealed that SVR-based models perform most efficiently in forecasting streamflow for shorter lead times. This study also tested the portability of model parameters and performance deterioration rates. Secondly, multiple ML-based models (SVR, Random Forest (RF) and Multi-layer Perceptron (MLP)) were deployed to simulate flood inundation extents. These models were trained and tested for two geomorphologically distinct case study areas. In the first case study, the models were trained using outputs from the LISFLOOD-FP hydraulic model and upstream flow data for a large rural catchment (Niger Inland Delta, Mali). For the second case study, a similar approach was adopted, though the 2D Flood Modeller software package was used to generate target data for the machine learning algorithms and to model inundation extent for a semi-urban floodplain (Upton-Upon-Severn, UK). In both cases, the machine learning algorithms performed comparably in simulating seasonal and event-based fluvial flooding.

    Finally, a framework was developed to generate flood extent maps from rainfall data using the knowledge learned from the case studies. The research activity focused on the town of Upton-Upon-Severn, and the analysis time frame covers the flooding event of October-November 2000. RF-based models were trained to forecast the upstream boundary conditions, which were systematically fed into MLP-based classifiers. The classifiers detected the state (wet/dry) of randomly selected locations within the floodplain at every time step (one hour in this study). The forecasted states of the sampled locations were then spatially interpolated using a regression kriging method to produce high-resolution (9 m) probabilistic inundation maps.
Results show that the proposed data-centric modelling engine can efficiently emulate the outcomes of the hydraulic model with considerably high accuracy, measured in terms of flood arrival time error and classification accuracy during the flood growing, peak and receding periods. The key feature of the proposed modelling framework is that it can substantially reduce computational time: generating flood maps for a floodplain of ~4 km² at 9 m spatial resolution takes ~14 seconds, which is significantly lower than the run time of a fully 2D hydrodynamic model.
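The forecasting chain described above, a rainfall/flow forecast feeding wet/dry classifiers whose point predictions are then spatially interpolated, can be sketched with scikit-learn. The sketch below uses synthetic stand-ins for the hydrological features and the hydraulic-model labels; all names and shapes are illustrative, and the final regression-kriging interpolation to a 9 m map is only noted in a comment.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Stage 1: multi-step-ahead boundary forecast from lagged rainfall/flow features.
X_hydro = rng.random((1000, 12))                    # lagged rainfall and flow (synthetic)
y_flow = X_hydro @ rng.random(12) + rng.normal(0.0, 0.05, 1000)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_hydro, y_flow)
boundary_forecast = rf.predict(X_hydro[-24:])       # next 24 hourly steps (illustrative)

# Stage 2: classify the wet/dry state at sampled floodplain locations.
X_state = rng.random((2000, 3))                     # forecast flow + location coordinates
y_state = (X_state[:, 0] > 0.6).astype(int)         # stand-in for hydraulic-model labels
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
clf.fit(X_state, y_state)

# Wet/dry probabilities at the sampled locations for one forecast step; these point
# predictions would then be spatially interpolated (regression kriging) to a 9 m map.
n_locations = 50
locs = rng.random((n_locations, 2))
step_features = np.column_stack([np.full(n_locations, boundary_forecast[0]), locs])
wet_probability = clf.predict_proba(step_features)[:, 1]
```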