33 research outputs found

    Data-driven Kriging models based on FANOVA-decomposition

    Kriging models have been widely used in computer experiments for the analysis of time-consuming computer codes. Based on kernels, they are flexible and can be tuned to many situations. In this paper, we construct kernels that reproduce the computer code complexity by mimicking its interaction structure. While the standard tensor-product kernel implicitly assumes that all interactions are active, the new kernels are suited to a general interaction structure and take advantage of the absence of interaction between some inputs. The methodology is twofold. First, the interaction structure is estimated from the data using an initial standard Kriging model, and is represented by a so-called FANOVA graph. New FANOVA-based sensitivity indices are introduced to detect active interactions. Then this graph is used to derive the form of the kernel, and the corresponding Kriging model is estimated by maximum likelihood. The performance of the overall procedure is illustrated by several 3-dimensional and 6-dimensional simulated and real examples. A substantial improvement is observed when the computer code has a relatively high level of complexity.
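    A minimal sketch of the kernel construction described above, assuming the FANOVA graph is summarized by its cliques of interacting inputs; the function and parameter names below are illustrative, not taken from the paper. Each clique contributes one tensor-product term and the terms are summed, so inputs belonging to different cliques cannot interact in the resulting model.

```python
import numpy as np

def gauss_1d(x, y, theta):
    """Squared-exponential kernel on a single input dimension."""
    return np.exp(-((x - y) ** 2) / (2.0 * theta ** 2))

def fanova_graph_kernel(x, y, cliques, thetas):
    """Additive-over-cliques kernel: one tensor-product term per clique of a
    (hypothetical) FANOVA graph, so inputs in different cliques never interact
    inside the model."""
    total = 0.0
    for clique in cliques:
        term = 1.0
        for i in clique:
            term *= gauss_1d(x[i], y[i], thetas[i])
        total += term
    return total

# Example: 3 inputs where only x0 and x1 interact; x2 enters additively.
cliques = [(0, 1), (2,)]
x, y = np.array([0.1, 0.4, 0.7]), np.array([0.2, 0.3, 0.9])
print(fanova_graph_kernel(x, y, cliques, thetas=np.ones(3)))
```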

    On ANOVA decompositions of kernels and Gaussian random field paths

    The FANOVA (or "Sobol'-Hoeffding") decomposition of multivariate functions has been used for high-dimensional model representation and global sensitivity analysis. When the objective function f has no simple analytic form and is costly to evaluate, a practical limitation is that computing FANOVA terms may be unaffordable due to numerical integration costs. Several approximate approaches relying on random field models have been proposed to alleviate these costs, where f is substituted by a (kriging) predictor or by conditional simulations. In the present work, we focus on FANOVA decompositions of Gaussian random field sample paths, and we notably introduce an associated kernel decomposition (into 2^{2d} terms) called KANOVA. An interpretation in terms of tensor product projections is obtained, and it is shown that projected kernels control both the sparsity of Gaussian random field sample paths and the dependence structure between FANOVA effects. Applications on simulated data show the relevance of the approach for designing new classes of covariance kernels dedicated to high-dimensional kriging.
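    As a hedged, back-of-the-envelope illustration of the kind of kernel decomposition at stake (a sketch, not the paper's exact construction): an ANOVA-type product kernel built from univariate sub-kernels expands into one term per subset of inputs, and applying such a decomposition in each of the kernel's two arguments is what yields the 2^{2d} KANOVA terms mentioned above.

```latex
% Sketch only: the k_i^0 are univariate sub-kernels; the expansion below is the
% kernel-level analogue of the Sobol'-Hoeffding (FANOVA) decomposition.
\[
  k(\mathbf{x},\mathbf{x}')
  \;=\; \prod_{i=1}^{d}\bigl(1 + k_i^0(x_i,x_i')\bigr)
  \;=\; \sum_{u \subseteq \{1,\dots,d\}} \; \prod_{i \in u} k_i^0(x_i,x_i').
\]
```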

    Total Interaction Index: A Variance-based Sensitivity Index for Function Decomposition

    http://mucm.ac.uk/UCM2012/Forms/Downloads/Posters/Fruth.pdf

    Additive Kernels for Gaussian Process Modeling

    Gaussian Process (GP) models are often used as mathematical approximations of computationally expensive experiments. Provided that its kernel is suitably chosen and that enough data is available to obtain a reasonable fit of the simulator, a GP model can beneficially be used for tasks such as prediction, optimization, or Monte-Carlo-based quantification of uncertainty. However, these conditions become unrealistic for classical GPs as the input dimension increases. One popular alternative is then to turn to Generalized Additive Models (GAMs), relying on the assumption that the simulator's response can approximately be decomposed as a sum of univariate functions. Although such an approach has been successfully applied in approximation, it is nevertheless not completely compatible with the GP framework and its versatile applications. The ambition of the present work is to give an insight into the use of GPs for additive models by integrating additivity within the kernel and proposing a parsimonious numerical method for data-driven parameter estimation. The first part of this article deals with the kernels naturally associated with additive processes and the properties of the GP models based on such kernels. The second part is dedicated to a numerical procedure, based on relaxation, for additive kernel parameter estimation. Finally, the efficiency of the proposed method is illustrated and compared to other approaches on Sobol's g-function.
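    A minimal sketch of the additive-kernel idea discussed above (names and hyperparameters are illustrative, not the article's code): the kernel is a sum of univariate kernels, one per input, so the associated GP has additive sample paths.

```python
import numpy as np

def additive_kernel(x, y, thetas, sigmas):
    """Additive kernel: a sum of univariate squared-exponential kernels,
    one per input dimension, yielding a GP with additive sample paths."""
    return sum(
        s ** 2 * np.exp(-((xi - yi) ** 2) / (2.0 * t ** 2))
        for xi, yi, t, s in zip(x, y, thetas, sigmas)
    )

x, y = np.array([0.1, 0.5, 0.9]), np.array([0.2, 0.4, 0.8])
print(additive_kernel(x, y, thetas=np.ones(3), sigmas=np.ones(3)))
```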

    Extending Morris Method: identification of the interaction graph using cycle-equitable designs

    The paper presents designs that allow detection of mixed effects when performing preliminary screening of the inputs of a scalar function of d input factors, in the spirit of Morris' Elementary Effects approach. We introduce the class of (d,c)-cycle equitable designs as those that enable computation of exactly c second-order effects on all possible pairs of input factors. Using these designs, we propose a fast Mixed Effects screening method that enables efficient identification of the interaction graph of the input variables. The design definition is formally supported by the establishment of an isometry between sub-graphs of the unit cube Q_d equipped with the Manhattan metric, and a set of polynomials in (X_1, 
, X_d) on which a convenient inner product is defined. In the paper we present systems of equations that recursively define these (d,c)-cycle equitable designs for generic values of c ≄ 1, from which direct algorithmic implementations are derived. Application cases are presented, illustrating the use of the proposed designs to estimate the interaction graph of specific functions.
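    For orientation, a small sketch of the finite-difference quantities such designs are built to expose (a simplified illustration, not the paper's design construction): the classic Morris elementary effect for a single factor, and the mixed second-order difference whose non-zero value signals an interaction between a pair of factors.

```python
import numpy as np

def elementary_effect(f, x, i, delta):
    """Classic Morris elementary effect of factor i at base point x."""
    e = np.zeros_like(x); e[i] = delta
    return (f(x + e) - f(x)) / delta

def second_order_effect(f, x, i, j, delta):
    """Mixed (second-order) finite difference for the pair (i, j); a
    non-zero value flags an interaction between factors i and j."""
    ei = np.zeros_like(x); ei[i] = delta
    ej = np.zeros_like(x); ej[j] = delta
    return (f(x + ei + ej) - f(x + ei) - f(x + ej) + f(x)) / delta ** 2

f = lambda x: x[0] * x[1] + x[2]             # x0 and x1 interact, x2 does not
x0 = np.array([0.2, 0.4, 0.6])
print(second_order_effect(f, x0, 0, 1, 0.1))  # ~1: interaction present
print(second_order_effect(f, x0, 0, 2, 0.1))  # ~0: no interaction
```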

    Black-box optimization of mixed discrete-continuous optimization problems

    In numerous applications in industry it is becoming standard practice to study complex, real-world processes with the help of computer experiments – simulations. With increasing computing capabilities it has become customary to perform simulation studies beforehand, where the desired process characteristics can be optimized. Computer experiments which have only continuous inputs have been studied and applied with great success in the past in a large variety of different fields. However, many experiments in practice have mixed quantitative and qualitative inputs. Such mixed-input experiments have only recently begun to receive more attention, but the field of research is still very new. Computer experiments very often take a long time to run, ranging from hours to days, making it impossible to perform direct optimization on the computer code. Instead, the simulator can be considered as a black-box function, and a (meta-)model, which is cheaper to evaluate, is used to interpolate the simulation. In this thesis we develop models and optimization methods for experiments with purely continuous inputs, as well as for experiments with mixed qualitative-quantitative inputs.

    The optimization of expensive-to-evaluate black-box functions is often performed with the help of model-based sequential strategies. A popular choice is the efficient global optimization (EGO) algorithm, which is based on the prominent Kriging metamodel and the expected improvement (EI) search criterion. Kriging allows for great flexibility and can be used to approximate highly non-linear functions. It also provides a local uncertainty estimator at unknown locations, which, together with the EI criterion, can be used to guide the EGO algorithm to less explored regions of the search space. EGO-based strategies have been successfully applied in numerous simulation studies. However, there are a few drawbacks of the EGO algorithm – for example, both the Kriging model and the EI criterion operate under the normality assumption, and the classical Kriging model assumes stationarity; both of these assumptions are fairly restrictive and can lead to a substantial loss of accuracy when they are violated. One further drawback of EGO is its inability to make adequate use of parallel computing. Moreover, the classical version of the EGO algorithm is only suitable for use in computer experiments with purely continuous inputs. The Kriging model uses Euclidean distances in order to interpolate the unknown black-box function, making interpolation of mixed-input functions difficult.

    In this work we address all of the drawbacks of the classical Kriging model and the powerful EGO algorithm described in the previous paragraph. We develop an assumption-robust version of the EGO algorithm, called keiEGO, which does not rely on the Kriging model and the EI criterion. Instead, the robust alternatives – the kernel interpolation (KI) metamodel and the statistical lower bound (SLB) criterion – are implemented. The KI metamodel and the SLB criterion are less sophisticated than Kriging and the EI criterion, but they are completely free of the normality and stationarity assumptions. The keiEGO algorithm is compared to the classical Kriging model based on a few synthetic function examples and also on a simulation case study of a sheet metal forming process developed by the IUL institute of the TU Dortmund University in the course of the collaborative research center 708.
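    Since the abstract leans on the expected improvement criterion, here is a compact sketch of it in its standard minimization form (an illustration under the usual Gaussian-prediction assumption, not code from the thesis):

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_min):
    """Expected improvement (minimization form) used by EGO: how much, in
    expectation, a Gaussian prediction N(mu, sigma^2) at a candidate point
    improves on the best observed value f_min."""
    sigma = np.maximum(sigma, 1e-12)          # guard against zero variance
    z = (f_min - mu) / sigma
    return (f_min - mu) * norm.cdf(z) + sigma * norm.pdf(z)

# A candidate predicted slightly above the incumbent but very uncertain can
# carry more EI than a confident prediction just below it.
print(expected_improvement(mu=1.2, sigma=0.8, f_min=1.0))
print(expected_improvement(mu=0.95, sigma=0.01, f_min=1.0))
```

    In an EGO-type loop, the simulator is then evaluated at the point maximizing this criterion, the metamodel is refitted, and the procedure repeats.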
    Furthermore, we develop a method for parallel optimization, called ParOF, based on a technique from the field of sensitivity analysis called the FANOVA graph. This method makes the use of parallel computations possible in the optimization with EGO, but also achieves a dimensionality reduction of the original problem. This makes modeling and optimization much easier, since the curse of dimensionality then works in reverse. The ParOF algorithm is also compared to the classical EGO algorithm, based on synthetic functions and on the same sheet metal forming case study mentioned before.

    The last part of this thesis is dedicated to EGO-like optimization of experiments with mixed inputs, thus addressing the last of the issues mentioned above. We start by assessing different state-of-the-art metamodels suitable for modeling and predicting mixed inputs. We then present a new class of Kriging models capable of modeling mixed inputs, called the Gower Kriging and developed in the course of this work. The Gower Kriging is also distance-based: it uses the Gower similarity measure, which constitutes a viable distance on the space of mixed quantitative-qualitative elements. With the help of the Gower Kriging we are able to produce a generalized EGO algorithm capable of optimization in this mixed space. We then perform a small benchmarking study based on several synthetic examples of mixed-input functions of variable complexity. In this benchmark study we compare the Gower-Kriging-based EGO method to EGO variations implemented with other state-of-the-art models for mixed data, based on their optimization capabilities.
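    The Gower measure mentioned above has a standard definition for mixed data; the helper below is a small illustrative sketch of that definition (how it is embedded into a Kriging kernel is not shown and would be an assumption):

```python
import numpy as np

def gower_distance(x, y, is_numeric, ranges):
    """Gower distance between two mixed-type points: numeric coordinates
    contribute a range-normalized absolute difference, categorical ones a
    simple 0/1 mismatch; the per-coordinate terms are averaged."""
    d = 0.0
    for xi, yi, num, r in zip(x, y, is_numeric, ranges):
        d += abs(xi - yi) / r if num else float(xi != yi)
    return d / len(x)

# Two points with two continuous inputs and one categorical level (coded 0/1/2).
x = [0.3, 10.0, 0]
y = [0.7, 25.0, 2]
print(gower_distance(x, y, is_numeric=[True, True, False], ranges=[1.0, 50.0, None]))
```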

    New methods for the sensitivity analysis of black-box functions with an application to sheet metal forming

    The general field of the thesis is the sensitivity analysis of black-box functions. Sensitivity analysis studies how the variation of the output can be apportioned to the variation of the input sources. It is an important tool in the construction, analysis, and optimization of computer experiments. The total interaction index is presented, which can be used for the screening of interactions. Several variance-based estimation methods are suggested. Their properties are analyzed theoretically as well as on simulations. A further chapter concerns the sensitivity analysis for models that can take functions as input variables and return a scalar value as output. A very economical sequential approach is presented, which not only discovers the sensitivity of those functional variables as a whole but also identifies relevant regions in the functional domain. As a third concept, support index functions, i.e. functions of sensitivity indices over the support of the input distribution, are suggested. Finally, all three methods are successfully applied in the sensitivity analysis of sheet metal forming models.
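    As a hedged illustration of a variance-based estimator of the kind discussed, here is a Monte Carlo sketch in the spirit of the Liu-Owen superset-importance formula for a pair of inputs; it is not claimed to be the estimator proposed in the thesis.

```python
import numpy as np

def total_interaction_index(f, d, j, k, n=10_000, rng=None):
    """Monte Carlo sketch of a total interaction index for the input pair
    (j, k): resample the j-th and k-th coordinates independently and average
    the squared mixed difference (Liu-Owen superset-importance formula)."""
    rng = np.random.default_rng(rng)
    X = rng.random((n, d))                  # base points, uniform on [0, 1]^d
    Zj, Zk = rng.random(n), rng.random(n)   # independent resamples of x_j, x_k
    Xj, Xk, Xjk = X.copy(), X.copy(), X.copy()
    Xj[:, j] = Zj
    Xk[:, k] = Zk
    Xjk[:, j], Xjk[:, k] = Zj, Zk
    delta = f(X) - f(Xj) - f(Xk) + f(Xjk)
    return 0.25 * np.mean(delta ** 2)

# x0*x1 contains a genuine interaction; x0 and x2 do not interact.
f = lambda X: X[:, 0] * X[:, 1] + X[:, 2]
print(total_interaction_index(f, d=3, j=0, k=1))   # positive: interaction
print(total_interaction_index(f, d=3, j=0, k=2))   # near zero: no interaction
```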

    Multifidelity Information Fusion Algorithms for High-Dimensional Systems and Massive Data sets

    We develop a framework for multifidelity information fusion and predictive inference in high-dimensional input spaces and in the presence of massive data sets. Hence, we tackle simultaneously the "big N" problem for big data and the curse of dimensionality in multivariate parametric problems. The proposed methodology establishes a new paradigm for constructing response surfaces of high-dimensional stochastic dynamical systems, simultaneously accounting for multifidelity in physical models as well as multifidelity in probability space. Scaling to high dimensions is achieved by data-driven dimensionality reduction techniques based on hierarchical functional decompositions and a graph-theoretic approach for encoding custom autocorrelation structure in Gaussian process priors. Multifidelity information fusion is facilitated through stochastic autoregressive schemes and frequency-domain machine learning algorithms that scale linearly with the data. Taken together, these new developments lead to linear-complexity algorithms, as demonstrated in benchmark problems involving deterministic and stochastic fields in up to 10⁔ input dimensions and 10⁔ training points on a standard desktop computer.
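    A toy sketch of the autoregressive fusion idea the abstract refers to, i.e. the classical f_high(x) ≈ rho·f_low(x) + delta(x) scheme; the helper below is a hypothetical illustration of that relationship, not the authors' algorithm.

```python
import numpy as np

def fuse_two_fidelities(y_low, y_high, rho=None):
    """Toy autoregressive fusion: estimate the scale factor rho between a cheap
    and an expensive model by least squares at shared inputs, and return the
    residual discrepancy delta to be modeled separately."""
    if rho is None:
        rho = np.dot(y_low, y_high) / np.dot(y_low, y_low)
    delta = y_high - rho * y_low
    return rho, delta

y_low = np.array([0.9, 1.8, 3.1, 4.2])       # cheap-model outputs at shared inputs
y_high = np.array([2.0, 3.9, 6.4, 8.5])      # expensive-model outputs at the same inputs
rho, delta = fuse_two_fidelities(y_low, y_high)
print(rho, delta)                             # rho near 2, small residual discrepancy
```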