Sequential design of computer experiments for the estimation of a probability of failure
This paper deals with the problem of estimating the volume of the excursion
set of a function f above a given threshold, under a probability measure on
the input space that is assumed to be known. In the industrial world, this
corresponds to the problem of estimating a probability of failure of a
system. When only an expensive-to-simulate model of the system is available,
the budget for simulations is usually severely limited, and classical Monte
Carlo methods therefore ought to be avoided. One of the main contributions of
this article is to derive SUR (stepwise uncertainty reduction) strategies
from a Bayesian decision-theoretic formulation of the problem of estimating a
probability of failure. These sequential strategies use a Gaussian process
model of f and aim at performing evaluations of f as efficiently as possible
to infer the value of the probability of failure. We compare these strategies
to other strategies also based on a Gaussian process model for estimating a
probability of failure.
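To make the setting concrete, here is a minimal sketch of the kind of sequential strategy the paper formalizes, written in Python with scikit-learn. The test function, threshold, and the simple misclassification-uncertainty criterion are illustrative stand-ins, not the paper's exact SUR criterion; the idea is only that a GP surrogate is refitted after each evaluation and the next simulation is placed where the posterior is most undecided about the threshold.

```python
# Minimal sketch: GP-based sequential estimation of a probability of
# failure P(f(X) > T). The acquisition below is a simple pointwise
# misclassification-uncertainty heuristic, not the paper's SUR criterion.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def f(x):  # illustrative stand-in for the expensive-to-simulate model
    return np.sin(3 * x[:, 0]) + 0.5 * x[:, 0]

T = 0.8                                           # failure threshold
rng = np.random.default_rng(0)
X_mc = rng.uniform(-2, 2, size=(2000, 1))         # Monte Carlo points under the input measure
X = rng.uniform(-2, 2, size=(5, 1))               # small initial design
y = f(X)

for _ in range(20):                               # severely limited evaluation budget
    gp = GaussianProcessRegressor(kernel=RBF(1.0), normalize_y=True).fit(X, y)
    m, s = gp.predict(X_mc, return_std=True)
    p = norm.cdf((m - T) / np.maximum(s, 1e-12))  # P(f(x) > T) under the GP posterior
    x_next = X_mc[np.argmax(p * (1 - p))]         # most "undecided" point w.r.t. the threshold
    X = np.vstack([X, x_next[None, :]])
    y = np.append(y, f(x_next[None, :]))

gp = GaussianProcessRegressor(kernel=RBF(1.0), normalize_y=True).fit(X, y)
m, s = gp.predict(X_mc, return_std=True)
print("estimated probability of failure:",
      norm.cdf((m - T) / np.maximum(s, 1e-12)).mean())
```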
An informational approach to the global optimization of expensive-to-evaluate functions
In many global optimization problems motivated by engineering applications,
the number of function evaluations is severely limited by time or cost. To
ensure that each evaluation contributes to the localization of good candidates
for the role of global minimizer, a sequential choice of evaluation points is
usually carried out. In particular, when Kriging is used to interpolate past
evaluations, the uncertainty associated with the lack of information on the
function can be expressed and used to compute a number of criteria accounting
for the interest of an additional evaluation at any given point. This paper
introduces minimizer entropy as a new Kriging-based criterion for the
sequential choice of points at which the function should be evaluated. Based on
\emph{stepwise uncertainty reduction}, it accounts for the informational gain
on the minimizer expected from a new evaluation. The criterion is approximated
using conditional simulations of the Gaussian process model behind Kriging, and
then inserted into an algorithm similar in spirit to the \emph{Efficient Global
Optimization} (EGO) algorithm. An empirical comparison is carried out between
our criterion and \emph{expected improvement}, one of the reference criteria in
the literature. Experimental results indicate major evaluation savings over
EGO. Finally, the method, which we call IAGO (for Informational Approach to
Global Optimization), is extended to robust optimization problems, where both
the factors to be tuned and the function evaluations are corrupted by noise.
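For reference, the expected-improvement criterion against which the paper benchmarks its approach has a well-known closed form under a Kriging (GP) model; a minimal sketch follows. The minimizer-entropy criterion itself has no such closed form and, as the abstract notes, is approximated via conditional simulations of the underlying Gaussian process.

```python
# Standard expected improvement (EI), the reference criterion mentioned
# above. m and s are the Kriging posterior mean and standard deviation at
# candidate points; f_min is the best evaluation observed so far.
import numpy as np
from scipy.stats import norm

def expected_improvement(m, s, f_min):
    s = np.maximum(s, 1e-12)      # guard against zero predictive variance
    z = (f_min - m) / s
    return (f_min - m) * norm.cdf(z) + s * norm.pdf(z)
```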
A Recommendation System for Meta-modeling: A Meta-learning Based Approach
Various meta-modeling techniques have been developed to replace computationally expensive simulation models. The performance of these meta-modeling techniques varies across models, which makes existing model selection/recommendation approaches (e.g., trial-and-error, ensembles) problematic. To address this gap, we propose a general meta-modeling recommendation system using meta-learning, which can automate the meta-modeling recommendation process by intelligently adapting the learning bias to problem characterizations. The proposed intelligent recommendation system includes four modules: (1) a problem module, (2) a meta-feature module, which includes a comprehensive set of meta-features to characterize the geometrical properties of problems, (3) a meta-learner module, which compares the performance of instance-based and model-based learning approaches for optimal framework design, and (4) a performance evaluation module, which introduces two criteria, Spearman's rank correlation coefficient and hit ratio, to evaluate the system on the accuracy of model ranking prediction and the precision of the best model recommendation, respectively. To further improve the performance of meta-learning for meta-modeling recommendation, different types of feature reduction techniques, including singular value decomposition, stepwise regression, and ReliefF, are studied. Experiments show that our proposed framework is able to achieve 94% correlation on model rankings and a 91% hit ratio on best model recommendation. Moreover, the computational cost of meta-modeling recommendation is significantly reduced from the order of minutes to seconds compared to the traditional trial-and-error and ensemble processes. The proposed framework can significantly advance research in meta-modeling recommendation and can be applied to data-driven system modeling.
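As an illustration of the two evaluation criteria (not the paper's code, and with hypothetical placeholder inputs), Spearman's rank correlation and the top-1 hit ratio can be computed as follows:

```python
# Spearman's rank correlation between predicted and true meta-model
# rankings for one problem, and the hit ratio of the best-model
# recommendation over a collection of problems.
import numpy as np
from scipy.stats import spearmanr

def ranking_correlation(pred_scores, true_scores):
    return spearmanr(pred_scores, true_scores).correlation

def hit_ratio(pred_scores_per_problem, true_scores_per_problem):
    # fraction of problems where the top-ranked meta-model is truly best
    hits = [np.argmax(p) == np.argmax(t)
            for p, t in zip(pred_scores_per_problem, true_scores_per_problem)]
    return float(np.mean(hits))
```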
Quantifying uncertainties on excursion sets under a Gaussian random field prior
We focus on the problem of estimating and quantifying uncertainties on the
excursion set of a function under a limited evaluation budget. We adopt a
Bayesian approach where the objective function is assumed to be a realization
of a Gaussian random field. In this setting, the posterior distribution on the
objective function gives rise to a posterior distribution on excursion sets.
Several approaches exist to summarize the distribution of such sets based on
random closed set theory. While the recently proposed Vorob'ev approach
exploits analytical formulae, further notions of variability require Monte
Carlo estimators relying on Gaussian random field conditional simulations. In
the present work we propose a method to choose Monte Carlo simulation points
and obtain quasi-realizations of the conditional field at fine designs through
affine predictors. The points are chosen optimally in the sense that they
minimize the posterior expected distance in measure between the excursion set
and its reconstruction. The proposed method reduces the computational costs due
to Monte Carlo simulations and enables the computation of quasi-realizations on
fine designs in large dimensions. We apply this reconstruction approach to
obtain realizations of an excursion set on a fine grid which allow us to give a
new measure of uncertainty based on the distance transform of the excursion
set. Finally, we present a safety engineering test case where the simulation
method is employed to compute a Monte Carlo estimate of a contour line.
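To illustrate the Vorob'ev approach mentioned above, the sketch below (a simplification on a discretized grid, not the paper's implementation) computes the Vorob'ev expectation from the posterior coverage function p(x) = P(f(x) >= T): the quantile level is tuned by bisection so that the volume of the level set of p matches the expected excursion volume.

```python
# Vorob'ev expectation of an excursion set on a grid. p holds the
# coverage function p(x) = P(f(x) >= T) under the GP posterior at each
# grid cell; cell_volume is the volume of one cell.
import numpy as np

def vorobev_expectation(p, cell_volume):
    expected_volume = p.sum() * cell_volume   # E[volume of excursion set]
    lo, hi = 0.0, 1.0
    for _ in range(50):                       # bisection on the quantile level
        alpha = 0.5 * (lo + hi)
        vol = (p >= alpha).sum() * cell_volume
        if vol > expected_volume:
            lo = alpha                        # set too large -> raise the level
        else:
            hi = alpha                        # set too small -> lower the level
    return p >= alpha                         # boolean mask of the Vorob'ev set
```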
ABC for Climate: Dealing with Expensive Simulators
A single molecule or molecule complex detection method is disclosed in certain aspects, comprising nano- or micro-fluidic channels.
Sequential search strategies based on kriging
This manuscript has been written to obtain the French Habilitation à Diriger des Recherches. It is not intended to provide new academic results, nor should it be considered a reference textbook. Instead, this manuscript is a brief (and incomplete) summary of my teaching and research activities. You will find in this manuscript a compilation of some articles to which I made a significant contribution, together with some introductory paragraphs about sequential search strategies based on kriging.
Sequential Design for Gaussian Process Surrogates in Noisy Level Set Estimation
We consider the problem of learning the level set for which a noisy black-box function exceeds a given threshold. To efficiently reconstruct the level set, we investigate Gaussian process (GP) metamodels and sequential design frameworks. Our focus is on strongly stochastic samplers, in particular with heavy-tailed simulation noise and low signal-to-noise ratio. We introduce four GP-based metamodels for level set estimation that are robust to noise misspecification and evaluate their performance. In conjunction with these metamodels, we develop several acquisition functions for guiding the sequential experimental designs, extending existing stepwise uncertainty reduction criteria to the stochastic contour-finding context. This also motivates our development of (approximate) updating formulas to efficiently compute such acquisition functions for the proposed metamodels. To expedite sequential design in stochastic experiments, we also develop adaptive batching designs, which are natural extensions of sequential design heuristics, with the benefit of replication growing as response features are learned, inputs concentrate, and the metamodeling overhead rises. We develop four novel schemes that simultaneously or sequentially determine the sequential design inputs and the respective number of replicates. Our schemes are benchmarked on synthetic examples and an application in quantitative finance (Bermudan option pricing).
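As a simple point of comparison (not one of the paper's SUR-based acquisitions), the classical "straddle" heuristic conveys the contour-finding idea of balancing posterior uncertainty against distance to the threshold:

```python
# "Straddle"-style level-set acquisition: favor candidate points whose GP
# posterior is both uncertain and close to the threshold T. m, s are the
# posterior mean and standard deviation at candidate points.
import numpy as np

def straddle(m, s, T, beta=1.96):
    return beta * s - np.abs(m - T)
```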
Fast uncertainty reduction strategies relying on Gaussian process models
This work deals with sequential and batch-sequential evaluation strategies of real-valued functions under a limited evaluation budget, using Gaussian process models. Optimal Stepwise Uncertainty Reduction (SUR) strategies are investigated for two different problems, motivated by real test cases in nuclear safety. First, we consider the problem of identifying the excursion set above a given threshold T of a real-valued function f. Then we study the question of finding the set of "safe controlled configurations", i.e. the set of controlled inputs where the function remains below T, whatever the value of some other non-controlled inputs. New SUR strategies are presented, together with efficient procedures and formulas to compute and use them in real-world applications. The use of fast formulas to quickly recalculate the posterior mean or covariance function of a Gaussian process (referred to as the "kriging update formulas") does not only provide substantial computational savings; it is also one of the key tools to derive closed-form formulas enabling a practical use of computationally intensive sampling strategies. A contribution in batch-sequential optimization (with the multi-points Expected Improvement) is also presented.
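To illustrate the kriging update formulas mentioned above, here is a NumPy sketch of the noise-free, rank-one update of the posterior mean and covariance after a single new observation; the noisy and batch settings treated in the thesis require more general formulas, and the variable names here are illustrative.

```python
# Noise-free kriging update: fold one new observation (x_new, y_new) into
# an existing GP posterior without refitting. m_x, k_xx: posterior mean
# vector and covariance matrix at the prediction points; k_xn: posterior
# covariance between the prediction points and x_new; m_new, k_nn:
# posterior mean and variance at x_new.
import numpy as np

def kriging_update(m_x, k_xx, k_xn, m_new, k_nn, y_new):
    m_upd = m_x + k_xn * (y_new - m_new) / k_nn   # updated posterior mean
    k_upd = k_xx - np.outer(k_xn, k_xn) / k_nn    # updated posterior covariance
    return m_upd, k_upd
```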
Recommended from our members
Statistical Emulation for Environmental Sustainability Analysis
The potential effects of climate change on the environment and society are many. In order to effectively quantify the uncertainty associated with these effects, highly complex simulation models are run with detailed representations of ecosystem processes. These models are computationally expensive and can involve computer runs of several days for their outputs. Computationally cheaper models can be obtained from large ensembles of simulations using a statistical emulation.
The purpose of this thesis is to construct cheaper computational models (emulators) from simulation outputs of Lund-Potsdam-Jena-managed Land (LPJmL), which is a dynamic global vegetation and crop model. This research work is part of a project called ERMITAGE. The project links together several key component models into a common framework to better understand how the management and interaction of land, water and the earth's climate system could be improved.
The thesis focuses specifically on emulation of major outputs from the LPJmL model; carbon fluxes (NPP, carbon loss due to heterotrophic respiration and fire carbon) and potential crop yields (cereal, rice, maize and oil crops). Future decadal changes in carbon fluxes and crop yields are modelled as linear functions of climate change and other relevant variables. The emulators are constructed using a combination of statistical techniques of stepwise least squares regression, principal component analysis, weighted least squares regression, censored regression and Gaussian process regression.
Further modelling involves sensitivity analyses to identify the relative contribution of each input variable to the total output variance, using the Sobol global sensitivity method. The data cover the period 2001-2100 and comprise climate scenarios from several GCMs and RCPs. Under cross-validation, the percentage of variance explained ranges from 52% to 96% for carbon fluxes, from 60% to 88% for the rainfed crops, and from 62% to 93% for the irrigated crops, averaged over climate scenarios.
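As a rough sketch of the emulation pipeline described above (only a sketch: the thesis combines several regression techniques, and the shapes and names here are assumptions), dimension reduction of the simulator output followed by per-component regression can be written compactly with scikit-learn:

```python
# Minimal PCA-plus-regression emulator: reduce the simulator's
# multivariate output with principal component analysis, then regress the
# retained components on the inputs. Plain least squares stands in for
# the thesis's richer set of regression techniques.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

def fit_emulator(X, Y, n_components=5):
    # X: (runs, inputs) climate drivers; Y: (runs, outputs) simulator fields
    pca = PCA(n_components=n_components).fit(Y)
    reg = LinearRegression().fit(X, pca.transform(Y))
    return pca, reg

def predict_emulator(pca, reg, X_new):
    return pca.inverse_transform(reg.predict(X_new))
```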
Parallel Gaussian Process Surrogate Bayesian Inference with Noisy Likelihood Evaluations
We consider Bayesian inference when only a limited number of noisy log-likelihood evaluations can be obtained. This occurs, for example, when complex simulator-based statistical models are fitted to data and the synthetic likelihood (SL) method is used to form the noisy log-likelihood estimates using computationally costly forward simulations. We frame the inference task as a sequential Bayesian experimental design problem, where the log-likelihood function is modelled with a hierarchical Gaussian process (GP) surrogate model, which is used to efficiently select additional log-likelihood evaluation locations. Motivated by recent progress in the related problem of batch Bayesian optimisation, we develop various batch-sequential design strategies which allow some of the potentially costly simulations to be run in parallel. We analyse the properties of the resulting method theoretically and empirically. Experiments with several toy problems and simulation models suggest that our method is robust, highly parallelisable, and sample-efficient.
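As a simplified stand-in for the batch-sequential designs developed in the paper (the names and the greedy criterion below are assumptions for illustration), a batch can be chosen greedily by GP posterior variance; since the posterior variance of a GP does not depend on the observed values, it can be "fantasy"-updated after each pick without running any simulation.

```python
# Greedy max-variance batch selection over a candidate grid. k_xx is the
# GP posterior covariance matrix over the candidates; q is the batch
# size; noise_var reflects that log-likelihood evaluations are noisy.
import numpy as np

def greedy_variance_batch(k_xx, q, noise_var=1e-6):
    k = k_xx.copy()
    batch = []
    for _ in range(q):
        i = int(np.argmax(np.diag(k)))            # most uncertain candidate
        batch.append(i)
        ki = k[:, i]
        k = k - np.outer(ki, ki) / (k[i, i] + noise_var)  # variance reduction
    return batch
```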