33 research outputs found

    Data-driven Kriging models based on FANOVA-decomposition

    Kriging models have been widely used in computer experiments for the analysis of time-consuming computer codes. Based on kernels, they are flexible and can be tuned to many situations. In this paper, we construct kernels that reproduce the computer code complexity by mimicking its interaction structure. While the standard tensor-product kernel implicitly assumes that all interactions are active, the new kernels are suited to a general interaction structure and take advantage of the absence of interaction between some inputs. The methodology is twofold. First, the interaction structure is estimated from the data using an initial standard Kriging model, and is represented by a so-called FANOVA graph. New FANOVA-based sensitivity indices are introduced to detect active interactions. Then this graph is used to derive the form of the kernel, and the corresponding Kriging model is estimated by maximum likelihood. The performance of the overall procedure is illustrated by several 3-dimensional and 6-dimensional simulated and real examples. A substantial improvement is observed when the computer code has a relatively high level of complexity.
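    A minimal sketch of the kernel construction described above, assuming the FANOVA graph is summarized by its cliques of interacting inputs; the function and parameter names below are illustrative, not taken from the paper. Each clique contributes one tensor-product term and the terms are summed, so inputs belonging to different cliques cannot interact in the resulting model.

```python
import numpy as np

def gauss_1d(x, y, theta):
    """Squared-exponential kernel on a single input dimension."""
    return np.exp(-((x - y) ** 2) / (2.0 * theta ** 2))

def fanova_graph_kernel(x, y, cliques, thetas):
    """Additive-over-cliques kernel: one tensor-product term per clique of a
    (hypothetical) FANOVA graph, so inputs in different cliques never interact
    inside the model."""
    total = 0.0
    for clique in cliques:
        term = 1.0
        for i in clique:
            term *= gauss_1d(x[i], y[i], thetas[i])
        total += term
    return total

# Example: 3 inputs where only x0 and x1 interact; x2 enters additively.
cliques = [(0, 1), (2,)]
x, y = np.array([0.1, 0.4, 0.7]), np.array([0.2, 0.3, 0.9])
print(fanova_graph_kernel(x, y, cliques, thetas=np.ones(3)))
```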

    On ANOVA decompositions of kernels and Gaussian random field paths

    The FANOVA (or "Sobol'-Hoeffding") decomposition of multivariate functions has been used for high-dimensional model representation and global sensitivity analysis. When the objective function f has no simple analytic form and is costly to evaluate, a practical limitation is that computing FANOVA terms may be unaffordable due to numerical integration costs. Several approximate approaches relying on random field models have been proposed to alleviate these costs, where f is substituted by a (kriging) predictor or by conditional simulations. In the present work, we focus on FANOVA decompositions of Gaussian random field sample paths, and we notably introduce an associated kernel decomposition (into 2^{2d} terms) called KANOVA. An interpretation in terms of tensor product projections is obtained, and it is shown that projected kernels control both the sparsity of Gaussian random field sample paths and the dependence structure between FANOVA effects. Applications on simulated data show the relevance of the approach for designing new classes of covariance kernels dedicated to high-dimensional kriging.
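    As a hedged, back-of-the-envelope illustration of the kind of kernel decomposition at stake (a sketch, not the paper's exact construction): an ANOVA-type product kernel built from univariate sub-kernels expands into one term per subset of inputs, and applying such a decomposition in each of the kernel's two arguments is what yields the 2^{2d} KANOVA terms mentioned above.

```latex
% Sketch only: the k_i^0 are univariate sub-kernels; the expansion below is the
% kernel-level analogue of the Sobol'-Hoeffding (FANOVA) decomposition.
\[
  k(\mathbf{x},\mathbf{x}')
  \;=\; \prod_{i=1}^{d}\bigl(1 + k_i^0(x_i,x_i')\bigr)
  \;=\; \sum_{u \subseteq \{1,\dots,d\}} \; \prod_{i \in u} k_i^0(x_i,x_i').
\]
```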

    Total Interaction Index: A Variance-based Sensitivity Index for Function Decomposition

    http://mucm.ac.uk/UCM2012/Forms/Downloads/Posters/Fruth.pdf

    Additive Kernels for Gaussian Process Modeling

    Gaussian Process (GP) models are often used as mathematical approximations of computationally expensive experiments. Provided that its kernel is suitably chosen and that enough data is available to obtain a reasonable fit of the simulator, a GP model can beneficially be used for tasks such as prediction, optimization, or Monte-Carlo-based quantification of uncertainty. However, these conditions become unrealistic for classical GPs as the input dimension increases. One popular alternative is then to turn to Generalized Additive Models (GAMs), relying on the assumption that the simulator's response can approximately be decomposed as a sum of univariate functions. Although such an approach has been successfully applied in approximation, it is nevertheless not completely compatible with the GP framework and its versatile applications. The ambition of the present work is to give an insight into the use of GPs for additive models by integrating additivity within the kernel and proposing a parsimonious numerical method for data-driven parameter estimation. The first part of this article deals with the kernels naturally associated with additive processes and the properties of the GP models based on such kernels. The second part is dedicated to a numerical procedure, based on relaxation, for additive kernel parameter estimation. Finally, the efficiency of the proposed method is illustrated and compared to other approaches on Sobol's g-function.
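    A minimal sketch of the additive-kernel idea discussed above (names and hyperparameters are illustrative, not the article's code): the kernel is a sum of univariate kernels, one per input, so the associated GP has additive sample paths.

```python
import numpy as np

def additive_kernel(x, y, thetas, sigmas):
    """Additive kernel: a sum of univariate squared-exponential kernels,
    one per input dimension, yielding a GP with additive sample paths."""
    return sum(
        s ** 2 * np.exp(-((xi - yi) ** 2) / (2.0 * t ** 2))
        for xi, yi, t, s in zip(x, y, thetas, sigmas)
    )

x, y = np.array([0.1, 0.5, 0.9]), np.array([0.2, 0.4, 0.8])
print(additive_kernel(x, y, thetas=np.ones(3), sigmas=np.ones(3)))
```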

    Extending Morris Method: identification of the interaction graph using cycle-equitable designs

    The paper presents designs that allow detection of mixed effects when performing preliminary screening of the inputs of a scalar function of d input factors, in the spirit of Morris' Elementary Effects approach. We introduce the class of (d,c)-cycle equitable designs as those that enable computation of exactly c second-order effects on all possible pairs of input factors. Using these designs, we propose a fast Mixed Effects screening method that enables efficient identification of the interaction graph of the input variables. The design definition is formally supported by the establishment of an isometry between sub-graphs of the unit cube Q_d equipped with the Manhattan metric, and a set of polynomials in (X_1, 
, X_d) on which a convenient inner product is defined. In the paper we present systems of equations that recursively define these (d,c)-cycle equitable designs for generic values of c ≄ 1, from which direct algorithmic implementations are derived. Application cases are presented, illustrating the use of the proposed designs to estimate the interaction graph of specific functions.
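    For orientation, a small sketch of the finite-difference quantities such designs are built to expose (a simplified illustration, not the paper's design construction): the classic Morris elementary effect for a single factor, and the mixed second-order difference whose non-zero value signals an interaction between a pair of factors.

```python
import numpy as np

def elementary_effect(f, x, i, delta):
    """Classic Morris elementary effect of factor i at base point x."""
    e = np.zeros_like(x); e[i] = delta
    return (f(x + e) - f(x)) / delta

def second_order_effect(f, x, i, j, delta):
    """Mixed (second-order) finite difference for the pair (i, j); a
    non-zero value flags an interaction between factors i and j."""
    ei = np.zeros_like(x); ei[i] = delta
    ej = np.zeros_like(x); ej[j] = delta
    return (f(x + ei + ej) - f(x + ei) - f(x + ej) + f(x)) / delta ** 2

f = lambda x: x[0] * x[1] + x[2]             # x0 and x1 interact, x2 does not
x0 = np.array([0.2, 0.4, 0.6])
print(second_order_effect(f, x0, 0, 1, 0.1))  # ~1: interaction present
print(second_order_effect(f, x0, 0, 2, 0.1))  # ~0: no interaction
```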

    Black-box optimization of mixed discrete-continuous optimization problems

    In numerous applications in industry it is becoming standard practice to study complex, real-world processes with the help of computer experiments – simulations. With increasing computing capabilities it has become customary to perform simulation studies beforehand, where the desired process characteristics can be optimized. Computer experiments which have only continuous inputs have been studied and applied with great success in the past in a large variety of different fields. However, many experiments in practice have mixed quantitative and qualitative inputs. Such mixed-input experiments have only recently begun to receive more attention, but the field of research is still very new. Computer experiments very often take a long time to run, ranging from hours to days, making it impossible to perform direct optimization on the computer code. Instead, the simulator can be considered as a black-box function, and a (meta-)model, which is cheaper to evaluate, is used to interpolate the simulation. In this thesis we develop models and optimization methods for experiments with purely continuous inputs, as well as for experiments with mixed qualitative-quantitative inputs.

    The optimization of expensive-to-evaluate black-box functions is often performed with the help of model-based sequential strategies. A popular choice is the efficient global optimization (EGO) algorithm, which is based on the prominent Kriging metamodel and the expected improvement (EI) search criterion. Kriging allows for great flexibility and can be used to approximate highly non-linear functions. It also provides a local uncertainty estimator at unknown locations, which, together with the EI criterion, can be used to guide the EGO algorithm to less explored regions of the search space. EGO-based strategies have been successfully applied in numerous simulation studies. However, there are a few drawbacks of the EGO algorithm – for example, both the Kriging model and the EI criterion operate under the normality assumption, and the classical Kriging model assumes stationarity; both of these assumptions are fairly restrictive and can lead to a substantial loss of accuracy when they are violated. One further drawback of EGO is its inability to make adequate use of parallel computing. Moreover, the classical version of the EGO algorithm is only suitable for use in computer experiments with purely continuous inputs. The Kriging model uses Euclidean distances in order to interpolate the unknown black-box function, making interpolation of mixed-input functions difficult.

    In this work we address all of the drawbacks of the classical Kriging model and the powerful EGO algorithm described in the previous paragraph. We develop an assumption-robust version of the EGO algorithm, called keiEGO, which does not rely on the Kriging model and the EI criterion. Instead, the robust alternatives – the kernel interpolation (KI) metamodel and the statistical lower bound (SLB) criterion – are implemented. The KI metamodel and the SLB criterion are less sophisticated than Kriging and the EI criterion, but they are completely free of the normality and stationarity assumptions. The keiEGO algorithm is compared to the classical Kriging model based on a few synthetic function examples and also on a simulation case study of a sheet metal forming process developed by the IUL institute of the TU Dortmund University in the course of the collaborative research center 708.
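    Since the abstract leans on the expected improvement criterion, here is a compact sketch of it in its standard minimization form (an illustration under the usual Gaussian-prediction assumption, not code from the thesis):

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_min):
    """Expected improvement (minimization form) used by EGO: how much, in
    expectation, a Gaussian prediction N(mu, sigma^2) at a candidate point
    improves on the best observed value f_min."""
    sigma = np.maximum(sigma, 1e-12)          # guard against zero variance
    z = (f_min - mu) / sigma
    return (f_min - mu) * norm.cdf(z) + sigma * norm.pdf(z)

# A candidate predicted slightly above the incumbent but very uncertain can
# carry more EI than a confident prediction just below it.
print(expected_improvement(mu=1.2, sigma=0.8, f_min=1.0))
print(expected_improvement(mu=0.95, sigma=0.01, f_min=1.0))
```

    In an EGO-type loop, the simulator is then evaluated at the point maximizing this criterion, the metamodel is refitted, and the procedure repeats.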
    Furthermore, we develop a method for parallel optimization, called ParOF, based on a technique from the field of sensitivity analysis called the FANOVA graph. This method makes the use of parallel computations possible in the optimization with EGO, but also achieves a dimensionality reduction of the original problem. This makes modeling and optimization much easier, since the curse of dimensionality then works in reverse. The ParOF algorithm is also compared to the classical EGO algorithm, based on synthetic functions and on the same sheet metal forming case study mentioned before.

    The last part of this thesis is dedicated to EGO-like optimization of experiments with mixed inputs, thus addressing the last of the issues mentioned above. We start by assessing different state-of-the-art metamodels suitable for modeling and predicting mixed inputs. We then present a new class of Kriging models capable of modeling mixed inputs, called the Gower Kriging and developed in the course of this work. The Gower Kriging is also distance-based: it uses the Gower similarity measure, which constitutes a viable distance on the space of mixed quantitative-qualitative elements. With the help of the Gower Kriging we are able to produce a generalized EGO algorithm capable of optimization in this mixed space. We then perform a small benchmarking study based on several synthetic examples of mixed-input functions of variable complexity. In this benchmark study we compare the Gower-Kriging-based EGO method to EGO variations implemented with other state-of-the-art models for mixed data, based on their optimization capabilities.
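    The Gower measure mentioned above has a standard definition for mixed data; the helper below is a small illustrative sketch of that definition (how it is embedded into a Kriging kernel is not shown and would be an assumption):

```python
import numpy as np

def gower_distance(x, y, is_numeric, ranges):
    """Gower distance between two mixed-type points: numeric coordinates
    contribute a range-normalized absolute difference, categorical ones a
    simple 0/1 mismatch; the per-coordinate terms are averaged."""
    d = 0.0
    for xi, yi, num, r in zip(x, y, is_numeric, ranges):
        d += abs(xi - yi) / r if num else float(xi != yi)
    return d / len(x)

# Two points with two continuous inputs and one categorical level (coded 0/1/2).
x = [0.3, 10.0, 0]
y = [0.7, 25.0, 2]
print(gower_distance(x, y, is_numeric=[True, True, False], ranges=[1.0, 50.0, None]))
```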

    New methods for the sensitivity analysis of black-box functions with an application to sheet metal forming

    The general field of the thesis is the sensitivity analysis of black-box functions. Sensitivity analysis studies how the variation of the output can be apportioned to the variation of the input sources. It is an important tool in the construction, analysis, and optimization of computer experiments. The total interaction index is presented, which can be used for the screening of interactions. Several variance-based estimation methods are suggested. Their properties are analyzed theoretically as well as on simulations. A further chapter concerns the sensitivity analysis for models that can take functions as input variables and return a scalar value as output. A very economical sequential approach is presented, which not only discovers the sensitivity of those functional variables as a whole but also identifies relevant regions in the functional domain. As a third concept, support index functions, i.e. functions of sensitivity indices over the support of the input distribution, are suggested. Finally, all three methods are successfully applied in the sensitivity analysis of sheet metal forming models.
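    As a hedged illustration of a variance-based estimator of the kind discussed, here is a Monte Carlo sketch in the spirit of the Liu-Owen superset-importance formula for a pair of inputs; it is not claimed to be the estimator proposed in the thesis.

```python
import numpy as np

def total_interaction_index(f, d, j, k, n=10_000, rng=None):
    """Monte Carlo sketch of a total interaction index for the input pair
    (j, k): resample the j-th and k-th coordinates independently and average
    the squared mixed difference (Liu-Owen superset-importance formula)."""
    rng = np.random.default_rng(rng)
    X = rng.random((n, d))                  # base points, uniform on [0, 1]^d
    Zj, Zk = rng.random(n), rng.random(n)   # independent resamples of x_j, x_k
    Xj, Xk, Xjk = X.copy(), X.copy(), X.copy()
    Xj[:, j] = Zj
    Xk[:, k] = Zk
    Xjk[:, j], Xjk[:, k] = Zj, Zk
    delta = f(X) - f(Xj) - f(Xk) + f(Xjk)
    return 0.25 * np.mean(delta ** 2)

# x0*x1 contains a genuine interaction; x0 and x2 do not interact.
f = lambda X: X[:, 0] * X[:, 1] + X[:, 2]
print(total_interaction_index(f, d=3, j=0, k=1))   # positive: interaction
print(total_interaction_index(f, d=3, j=0, k=2))   # near zero: no interaction
```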

    Multifidelity Information Fusion Algorithms for High-Dimensional Systems and Massive Data sets

    We develop a framework for multifidelity information fusion and predictive inference in high-dimensional input spaces and in the presence of massive data sets. Hence, we tackle simultaneously the "big N" problem for big data and the curse of dimensionality in multivariate parametric problems. The proposed methodology establishes a new paradigm for constructing response surfaces of high-dimensional stochastic dynamical systems, simultaneously accounting for multifidelity in physical models as well as multifidelity in probability space. Scaling to high dimensions is achieved by data-driven dimensionality reduction techniques based on hierarchical functional decompositions and a graph-theoretic approach for encoding custom autocorrelation structure in Gaussian process priors. Multifidelity information fusion is facilitated through stochastic autoregressive schemes and frequency-domain machine learning algorithms that scale linearly with the data. Taken together, these new developments lead to linear-complexity algorithms, as demonstrated in benchmark problems involving deterministic and stochastic fields in up to 10⁔ input dimensions and 10⁔ training points on a standard desktop computer.
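    A toy sketch of the autoregressive fusion idea the abstract refers to, i.e. the classical f_high(x) ≈ rho·f_low(x) + delta(x) scheme; the helper below is a hypothetical illustration of that relationship, not the authors' algorithm.

```python
import numpy as np

def fuse_two_fidelities(y_low, y_high, rho=None):
    """Toy autoregressive fusion: estimate the scale factor rho between a cheap
    and an expensive model by least squares at shared inputs, and return the
    residual discrepancy delta to be modeled separately."""
    if rho is None:
        rho = np.dot(y_low, y_high) / np.dot(y_low, y_low)
    delta = y_high - rho * y_low
    return rho, delta

y_low = np.array([0.9, 1.8, 3.1, 4.2])       # cheap-model outputs at shared inputs
y_high = np.array([2.0, 3.9, 6.4, 8.5])      # expensive-model outputs at the same inputs
rho, delta = fuse_two_fidelities(y_low, y_high)
print(rho, delta)                             # rho near 2, small residual discrepancy
```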