ANOVA kernels and RKHS of zero mean functions for model-based sensitivity analysis
Given a reproducing kernel Hilbert space H of real-valued functions and a suitable measure μ over the source space D (a subset of R), we decompose H as the sum of a subspace of functions centered with respect to μ and its orthogonal complement in H. This decomposition leads to a special case of ANOVA kernels, for which the functional ANOVA representation of the best predictor can be elegantly derived, in either an interpolation or a regularization framework. The proposed kernels prove particularly convenient for analyzing the effect of each (group of) variable(s) and for computing sensitivity indices without recursion.
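A minimal numerical sketch of the centering idea, assuming a squared-exponential base kernel and μ = Uniform[0, 1] (both illustrative choices, not the paper's): subtracting a rank-one correction from k yields a kernel k0 whose RKHS contains only functions with zero mean under μ.

```python
import numpy as np

# Base 1-D kernel (squared exponential); an illustrative assumption.
def k(x, y, ls=0.3):
    return np.exp(-0.5 * ((x - y) / ls) ** 2)

# Quadrature grid for mu = Uniform[0, 1].
s = np.linspace(0.0, 1.0, 2001)
w = np.full_like(s, 1.0 / len(s))  # equal quadrature weights

def k_mean(x):
    # m(x) = int k(x, t) dmu(t)
    return np.sum(k(np.asarray(x)[..., None], s) * w, axis=-1)

K_int = np.sum(k_mean(s) * w)  # double integral of k over mu x mu

def k0(x, y):
    # Centered kernel: functions in its RKHS integrate to 0 under mu.
    return k(x, y) - k_mean(x) * k_mean(y) / K_int

# Sanity check: int k0(x, t) dmu(t) should vanish for any fixed x.
residual = np.sum(k0(0.37, s) * w)
```

The full ANOVA kernel over several inputs is then built from products of (1 + k0_i) terms, which is what makes the per-variable effects directly readable off the predictor.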
Decomposing feature-level variation with Covariate Gaussian Process Latent Variable Models
The interpretation of complex high-dimensional data typically requires the
use of dimensionality reduction techniques to extract explanatory
low-dimensional representations. However, in many real-world problems these
representations may not be sufficient to aid interpretation on their own, and
it would be desirable to interpret the model in terms of the original features
themselves. Our goal is to characterise how feature-level variation depends on
latent low-dimensional representations, external covariates, and non-linear
interactions between the two. In this paper, we propose to achieve this through
a structured kernel decomposition in a hybrid Gaussian Process model which we
call the Covariate Gaussian Process Latent Variable Model (c-GPLVM). We
demonstrate the utility of our model on simulated examples and applications in
disease progression modelling from high-dimensional gene expression data in the
presence of additional phenotypes. In each setting we show how the c-GPLVM can
extract low-dimensional structures from high-dimensional data sets whilst
allowing a breakdown of feature-level variability that is not present in other
commonly used dimensionality reduction approaches.
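The structured kernel decomposition can be sketched as an additive-plus-interaction combination of a latent-space kernel and a covariate kernel, so the variance splits into latent-only, covariate-only, and joint parts. The RBF form, 1-D inputs, and unit weights below are assumptions for illustration, not the paper's exact parameterization.

```python
import numpy as np

def rbf(a, b, ls=1.0):
    # Squared-exponential kernel on a 1-D input (illustrative choice).
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def structured_kernel(z, c, z2, c2):
    # Additive latent + covariate terms plus an interaction term.
    Kz = rbf(z, z2)
    Kc = rbf(c, c2)
    return Kz + Kc + Kz * Kc

rng = np.random.default_rng(0)
z = rng.normal(size=5)   # latent coordinates (hypothetical)
c = rng.normal(size=5)   # external covariate values (hypothetical)
K = structured_kernel(z, c, z, c)

# A valid covariance must be symmetric positive semi-definite;
# sums and products of PSD kernels preserve this.
eigmin = np.linalg.eigvalsh(K).min()
```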
On ANOVA decompositions of kernels and Gaussian random field paths
The FANOVA (or "Sobol'-Hoeffding") decomposition of multivariate functions
has been used for high-dimensional model representation and global sensitivity
analysis. When the objective function f has no simple analytic form and is
costly to evaluate, a practical limitation is that computing FANOVA terms may
be unaffordable due to numerical integration costs. Several approximate
approaches relying on random field models have been proposed to alleviate these
costs, where f is substituted by a (kriging) predictor or by conditional
simulations. In the present work, we focus on FANOVA decompositions of Gaussian
random field sample paths, and we notably introduce an associated kernel
decomposition (into 2^{2d} terms) called KANOVA. An interpretation in terms of
tensor product projections is obtained, and it is shown that projected kernels
control both the sparsity of Gaussian random field sample paths and the
dependence structure between FANOVA effects. Applications on simulated data
show the relevance of the approach for designing new classes of covariance
kernels dedicated to high-dimensional kriging.
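For d = 1 the 2^{2d} = 4 KANOVA terms can be computed numerically by centering the kernel in each argument; the squared-exponential base kernel and uniform measure below are illustrative assumptions.

```python
import numpy as np

# Quadrature for mu = Uniform[0, 1] (illustrative measure).
s = np.linspace(0.0, 1.0, 400)
w = np.full_like(s, 1.0 / len(s))

def k(x, y, ls=0.25):
    return np.exp(-0.5 * ((x[:, None] - y[None, :]) / ls) ** 2)

K = k(s, s)
m = K @ w            # int k(x, t) dmu(t), evaluated on the grid
c = w @ K @ w        # double integral of k

# The four KANOVA pieces, which sum back to k by construction.
K00 = np.full_like(K, c)                             # constant part
K10 = (m - c)[:, None] * np.ones_like(w)[None, :]    # centered in x only
K01 = np.ones_like(w)[:, None] * (m - c)[None, :]    # centered in y only
K11 = K - K10 - K01 - K00                            # doubly centered part

# The doubly centered piece integrates to zero in each argument.
zero_x = np.max(np.abs(K11 @ w))
```

The same centering, applied independently in each of the 2d kernel arguments, generates the 2^{2d} projected kernels discussed in the abstract.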
Metamodels for mixed variables by multiple kernel regression
This paper is concerned with the development of metamodels specifically tailored to mixed variables, in particular continuous and categorical variables. In practice, we propose a surrogate model based on multiple kernel regression and apply it to six benchmark test functions and a rigid-frame structural analysis. Compared to other metamodels (support vector regression, ordinary least squares), the numerical results show the efficiency of the method, owing to the flexible selection of different types of kernel functions. Further work will include the use of these metamodels for mixed-variable surrogate-based optimization involving computationally expensive simulations.
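A sketch of multiple kernel regression for mixed inputs, assuming an RBF kernel on the continuous variables, a simple overlap kernel on the categorical one, and fixed combination weights (the paper's actual kernels and weight selection may differ):

```python
import numpy as np

def rbf(X, X2, ls=1.0):
    d2 = ((X[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def cat_kernel(c, c2):
    # Overlap kernel on a categorical variable: 1 if levels match.
    return (c[:, None] == c2[None, :]).astype(float)

rng = np.random.default_rng(1)
X = rng.uniform(size=(40, 2))            # continuous inputs
c = rng.integers(0, 3, size=40)          # a categorical input, 3 levels
y = np.sin(3 * X[:, 0]) + 0.5 * c        # toy response (hypothetical)

# Multiple-kernel combination: weighted sum of continuous and
# categorical kernels plus their product (weights fixed here;
# in practice they would be tuned).
def K_mix(Xa, ca, Xb, cb):
    Kc = rbf(Xa, Xb)
    Kd = cat_kernel(ca, cb)
    return 0.5 * Kc + 0.3 * Kd + 0.2 * Kc * Kd

K = K_mix(X, c, X, c)
alpha = np.linalg.solve(K + 1e-6 * np.eye(len(y)), y)  # kernel ridge fit
pred = K @ alpha
train_rmse = float(np.sqrt(np.mean((pred - y) ** 2)))
```

Because sums and products of positive semi-definite kernels remain positive semi-definite, the combined kernel is a valid covariance for any non-negative weights.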
Multifidelity Information Fusion Algorithms for High-Dimensional Systems and Massive Data sets
We develop a framework for multifidelity information fusion and predictive inference in high-dimensional input spaces and in the presence of massive data sets. Hence, we tackle simultaneously the "big N" problem of big data and the curse of dimensionality in multivariate parametric problems. The proposed methodology establishes a new paradigm for constructing response surfaces of high-dimensional stochastic dynamical systems, simultaneously accounting for multifidelity in physical models as well as multifidelity in probability space. Scaling to high dimensions is achieved by data-driven dimensionality reduction techniques based on hierarchical functional decompositions and a graph-theoretic approach for encoding custom autocorrelation structure in Gaussian process priors. Multifidelity information fusion is facilitated through stochastic autoregressive schemes and frequency-domain machine learning algorithms that scale linearly with the data. Taken together, these new developments lead to linear-complexity algorithms, as demonstrated in benchmark problems involving deterministic and stochastic fields in up to 10^5 input dimensions and 10^5 training points on a standard desktop computer.
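The stochastic autoregressive idea can be sketched in its simplest two-fidelity form, f_high(x) ≈ ρ · f_low(x) + δ(x). The toy simulators, the least-squares estimate of ρ, and the kernel-ridge discrepancy model below are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def f_low(x):
    return np.sin(8 * x)                  # cheap, biased model (toy)

def f_high(x):
    return 1.2 * np.sin(8 * x) + 0.3 * x  # expensive model (toy)

x_hi = np.linspace(0, 1, 9)               # few high-fidelity samples
yl, yh = f_low(x_hi), f_high(x_hi)

rho = float(yl @ yh / (yl @ yl))          # scaling between fidelities

def rbf(a, b, ls=0.2):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

# Model the discrepancy delta(x) = f_high(x) - rho * f_low(x)
# with a small kernel-ridge regression on the high-fidelity sites.
resid = yh - rho * yl
alpha = np.linalg.solve(rbf(x_hi, x_hi) + 1e-6 * np.eye(len(x_hi)), resid)

def predict(x):
    return rho * f_low(x) + rbf(x, x_hi) @ alpha

x_test = np.linspace(0, 1, 101)
rmse = float(np.sqrt(np.mean((predict(x_test) - f_high(x_test)) ** 2)))
```

The fused predictor leans on the cheap model for the oscillatory trend and spends the few expensive evaluations only on the smooth discrepancy.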
Sensitivity Prewarping for Local Surrogate Modeling
In the continual effort to improve product quality and decrease operations
costs, computational modeling is increasingly being deployed to determine
feasibility of product designs or configurations. Surrogate modeling of these
computer experiments via local models, which induce sparsity by only
considering short range interactions, can tackle huge analyses of complicated
input-output relationships. However, narrowing focus to local scale means that
global trends must be re-learned over and over again. In this article, we
propose a framework for incorporating information from a global sensitivity
analysis into the surrogate model as an input rotation and rescaling
preprocessing step. We discuss the relationship between several sensitivity
analysis methods based on kernel regression before describing how they give
rise to a transformation of the input variables. Specifically, we perform an
input warping such that the "warped simulator" is equally sensitive to all
input directions, freeing local models to focus on local dynamics. Numerical
experiments on observational data and benchmark test functions, including a
high-dimensional computer simulator from the automotive industry, provide
empirical validation.
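One way to realize such a rotation and rescaling, assuming gradient access and a gradient outer-product sensitivity matrix C (an active-subspace-style construction, used here for illustration rather than as the paper's exact method): with C = VΛVᵀ, warping inputs by W = Λ^{1/2}Vᵀ makes the warped simulator's expected gradient outer product the identity, i.e. equally sensitive in all directions.

```python
import numpy as np

# Toy quadratic simulator f(x) = 0.5 x^T A x with known gradient A x
# (both hypothetical, chosen so sensitivities are easy to compute).
A = np.array([[4.0, 1.0], [1.0, 0.5]])

rng = np.random.default_rng(2)
X = rng.normal(size=(5000, 2))   # samples of the input distribution
G = X @ A.T                      # gradient of f at each sample

# Estimate C = E[grad f grad f^T], the sensitivity matrix.
C = G.T @ G / len(X)

# Warp inputs by W = Lambda^{1/2} V^T, so g(z) = f(W^{-1} z)
# satisfies E[grad g grad g^T] = I (equal sensitivity).
lam, V = np.linalg.eigh(C)
W = np.diag(np.sqrt(lam)) @ V.T
Winv = np.linalg.inv(W)

Gw = G @ Winv                    # chain rule: grad g = W^{-T} grad f
Cw = Gw.T @ Gw / len(X)
off = float(np.abs(Cw - np.eye(2)).max())
```

After this preprocessing, a local surrogate fitted in z-space no longer has to relearn the dominant global trend in every neighborhood.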