230 research outputs found
Differentiating the multipoint Expected Improvement for optimal batch design
This work deals with parallel optimization of expensive objective functions
which are modeled as sample realizations of Gaussian processes. The study is
formalized as a Bayesian optimization problem, or continuous multi-armed bandit
problem, where a batch of q > 0 arms is pulled in parallel at each iteration.
Several algorithms have been developed for choosing batches by trading off
exploitation and exploration. As of today, the maximum Expected Improvement
(EI) and Upper Confidence Bound (UCB) selection rules appear as the most
prominent approaches for batch selection. Here, we build upon recent work on
the multipoint Expected Improvement criterion, for which an analytic expansion
relying on Tallis' formula was recently established. Since the computational burden
of this selection rule remains an issue in applications, we derive a
closed-form expression for the gradient of the multipoint Expected Improvement,
which aims at facilitating its maximization using gradient-based ascent
algorithms. Substantial computational savings are shown in application. In
addition, our algorithms are tested numerically and compared to
state-of-the-art UCB-based batch-sequential algorithms. Combining starting
designs relying on UCB with gradient-based EI local optimization finally
appears as a sound option for batch design in distributed Gaussian Process
optimization.
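To make the multipoint Expected Improvement criterion above concrete, here is a minimal Monte Carlo sketch. It is not the closed-form expansion or gradient derived in the paper. It assumes a minimization convention and a known posterior mean vector and covariance matrix at the q batch points; the function names and numerical values are illustrative assumptions.

```python
import math
import numpy as np


def mc_multipoint_ei(mean, cov, best_observed, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the multipoint EI (minimization convention).

    mean, cov: assumed GP posterior mean (q,) and covariance (q, q) at the
    q candidate batch points. best_observed: current best (lowest) value.
    """
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    cov = np.atleast_2d(np.asarray(cov, dtype=float))
    # A small jitter keeps the Cholesky factorization numerically stable.
    chol = np.linalg.cholesky(cov + 1e-12 * np.eye(len(mean)))
    z = rng.standard_normal((n_samples, len(mean)))
    samples = mean + z @ chol.T  # joint posterior draws at the batch
    improvement = np.maximum(best_observed - samples.min(axis=1), 0.0)
    return float(improvement.mean())


def analytic_ei(mean, std, best_observed):
    """Closed-form single-point EI, used here only as a sanity check."""
    u = (best_observed - mean) / std
    pdf = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))
    return (best_observed - mean) * cdf + std * pdf


# Illustrative values only: one point, then a correlated batch of two points.
single = mc_multipoint_ei([0.0], [[1.0]], best_observed=0.0)
batch = mc_multipoint_ei([0.0, 0.2], [[1.0, 0.3], [0.3, 1.0]], best_observed=0.0)
```

With q = 1 the estimate should agree with the classical single-point EI formula up to Monte Carlo error, which gives a quick check on the sampler.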
Additive Kernels for Gaussian Process Modeling
Gaussian Process (GP) models are often used as mathematical approximations of
computationally expensive experiments. Provided that its kernel is suitably
chosen and that enough data is available to obtain a reasonable fit of the
simulator, a GP model can beneficially be used for tasks such as prediction,
optimization, or Monte-Carlo-based quantification of uncertainty. However, the
former conditions become unrealistic when using classical GPs as the dimension
of input increases. One popular alternative is then to turn to Generalized
Additive Models (GAMs), relying on the assumption that the simulator's response
can approximately be decomposed as a sum of univariate functions. Although such an
approach has been successfully applied in approximation, it is not
completely compatible with the GP framework and its versatile applications. The
ambition of the present work is to give an insight into the use of GPs for
additive models by integrating additivity within the kernel, and proposing a
parsimonious numerical method for data-driven parameter estimation. The first
part of this article deals with the kernels naturally associated with additive
processes and the properties of the GP models based on such kernels. The second
part is dedicated to a numerical procedure based on relaxation for additive
kernel parameter estimation. Finally, the efficiency of the proposed method is
illustrated and compared to other approaches on Sobol's g-function.
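One common way to build additivity into a kernel is to sum univariate kernels, one per input dimension. The sketch below illustrates that idea only. The squared-exponential form, the function names, and the parameter values are assumptions, and the relaxation-based estimation procedure from the abstract is not reproduced.

```python
import numpy as np


def rbf_1d(x, y, lengthscale, variance):
    """Univariate squared-exponential kernel between two 1-D arrays."""
    diff = x[:, None] - y[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)


def additive_kernel(X, Y, lengthscales, variances):
    """Additive kernel: k(x, y) = sum_i k_i(x_i, y_i) over input dimensions."""
    K = np.zeros((X.shape[0], Y.shape[0]))
    for i in range(X.shape[1]):
        K += rbf_1d(X[:, i], Y[:, i], lengthscales[i], variances[i])
    return K


# Illustrative design of 20 points in dimension 3 (hypothetical parameters).
rng = np.random.default_rng(0)
X = rng.uniform(size=(20, 3))
K = additive_kernel(X, X, lengthscales=[0.3, 0.5, 0.2], variances=[1.0, 0.5, 2.0])
```

Each univariate term carries its own variance and lengthscale, so the number of parameters grows linearly with the input dimension.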
Invariances of random fields paths, with applications in Gaussian Process Regression
We study pathwise invariances of centred random fields that can be controlled
through the covariance. A result involving composition operators is obtained in
second-order settings, and we show that various path properties including
additivity boil down to invariances of the covariance kernel. These results are
extended to a broader class of operators in the Gaussian case, via the Loève
isometry. Several covariance-driven pathwise invariances are illustrated,
including fields with symmetric paths, centred paths, harmonic paths, or sparse
paths. The proposed approach delivers a number of promising results and
perspectives in Gaussian process regression.
Quantifying uncertainties on excursion sets under a Gaussian random field prior
We focus on the problem of estimating and quantifying uncertainties on the
excursion set of a function under a limited evaluation budget. We adopt a
Bayesian approach where the objective function is assumed to be a realization
of a Gaussian random field. In this setting, the posterior distribution on the
objective function gives rise to a posterior distribution on excursion sets.
Several approaches exist to summarize the distribution of such sets based on
random closed set theory. While the recently proposed Vorob'ev approach
exploits analytical formulae, further notions of variability require Monte
Carlo estimators relying on Gaussian random field conditional simulations. In
the present work we propose a method to choose Monte Carlo simulation points
and obtain quasi-realizations of the conditional field at fine designs through
affine predictors. The points are chosen optimally in the sense that they
minimize the posterior expected distance in measure between the excursion set
and its reconstruction. The proposed method reduces the computational costs due
to Monte Carlo simulations and enables the computation of quasi-realizations on
fine designs in large dimensions. We apply this reconstruction approach to
obtain realizations of an excursion set on a fine grid which allow us to give a
new measure of uncertainty based on the distance transform of the excursion
set. Finally we present a safety engineering test case where the simulation
method is employed to compute a Monte Carlo estimate of a contour line.
Profile extrema for visualizing and quantifying uncertainties on excursion regions. Application to coastal flooding
We consider the problem of describing excursion sets of a real-valued
function f, i.e. the set of inputs where f is above a fixed threshold. Such
regions are hard to visualize if the input space dimension, d, is higher than
2. For a given projection matrix from the input space to a lower-dimensional
subspace, we introduce profile sup (inf) functions that
associate to each point in the projection's image the sup (inf) of the function
constrained over the pre-image of this point by the considered projection.
Plots of profile extrema functions convey a simple, although intrinsically
partial, visualization of the set. We consider expensive-to-evaluate functions
where only a very limited number of evaluations, n, is available,
and we surrogate f with a posterior quantity of a Gaussian process
(GP) model. We first compute profile extrema functions for the posterior mean
given n evaluations of f. We quantify the uncertainty on such estimates by
studying the distribution of GP profile extrema with posterior
quasi-realizations obtained from an approximating process. We control such
approximation with a bound inherited from the Borell-TIS inequality. The
technique is applied to analytical functions and to a
coastal flooding test case for a site located on the Atlantic French coast.
Here f is a numerical model returning the area of flooded surface in the
coastal region given some offshore conditions. Profile extrema functions
allowed us to better understand which offshore conditions impact large flooding
events.
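The profile sup and inf functions described above can be illustrated on a grid for the simplest case, a coordinate projection in two dimensions. This is a brute-force sketch under stated assumptions: the test function, grid, and threshold are hypothetical, and the function is cheap to evaluate, whereas the paper works with a GP surrogate of an expensive function.

```python
import numpy as np


def f(x1, x2):
    """Hypothetical cheap test function standing in for the expensive f."""
    return np.sin(3.0 * x1) + x2 ** 2


grid = np.linspace(0.0, 1.0, 101)
X1, X2 = np.meshgrid(grid, grid, indexing="ij")
values = f(X1, X2)  # values[i, j] = f(grid[i], grid[j])

# Projection onto the first coordinate: for each x1, take sup/inf over x2.
profile_sup = values.max(axis=1)
profile_inf = values.min(axis=1)

threshold = 1.2  # illustrative excursion threshold
# Where profile_sup < threshold, no x2 places (x1, x2) in the excursion set.
excluded_x1 = grid[profile_sup < threshold]
# Where profile_inf > threshold, every x2 places (x1, x2) in the excursion set.
always_in_x1 = grid[profile_inf > threshold]
```

Reading the two profiles against the threshold identifies the values of x1 for which the excursion set is empty or full along the x2 direction, which is the kind of partial information these plots convey.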
On ANOVA decompositions of kernels and Gaussian random field paths
The FANOVA (or "Sobol'-Hoeffding") decomposition of multivariate functions
has been used for high-dimensional model representation and global sensitivity
analysis. When the objective function f has no simple analytic form and is
costly to evaluate, a practical limitation is that computing FANOVA terms may
be unaffordable due to numerical integration costs. Several approximate
approaches relying on random field models have been proposed to alleviate these
costs, where f is substituted by a (kriging) predictor or by conditional
simulations. In the present work, we focus on FANOVA decompositions of Gaussian
random field sample paths, and we notably introduce an associated kernel
decomposition (into 2^{2d} terms) called KANOVA. An interpretation in terms of
tensor product projections is obtained, and it is shown that projected kernels
control both the sparsity of Gaussian random field sample paths and the
dependence structure between FANOVA effects. Applications on simulated data
show the relevance of the approach for designing new classes of covariance
kernels dedicated to high-dimensional kriging.
Fast calculation of Gaussian Process multiple-fold cross-validation residuals and their covariances
We generalize fast Gaussian process leave-one-out formulae to multiple-fold
cross-validation, highlighting in turn in broad settings the covariance
structure of cross-validation residuals. The employed approach, which relies on
block matrix inversion via Schur complements, is applied to both Simple and
Universal Kriging frameworks. We illustrate how resulting covariances affect
model diagnostics and how to properly transform residuals in the first place.
Beyond that, we examine how accounting for dependency between such residuals
affects cross-validation-based estimation of the scale parameter. It is found in
two distinct cases, namely in scale estimation and in broader covariance
parameter estimation via pseudo-likelihood, that correcting for covariances
between cross-validation residuals leads back to maximum likelihood estimation
or to an original variation thereof. The proposed fast calculation of Gaussian
Process multiple-fold cross-validation residuals is implemented and benchmarked
against a naive implementation, all in the R language. Numerical experiments
highlight the accuracy of our approach as well as the substantial speed-ups
that it enables. It is noteworthy, however, as supported by a discussion on the
main drivers of computational costs and by a dedicated numerical benchmark,
that speed-ups steeply decline as the number of folds (say, all sharing the
same size) decreases. Overall, our results enable fast multiple-fold
cross-validation, have direct consequences in GP model diagnostics, and pave
the way to future work on hyperparameter fitting as well as on the promising
field of goal-oriented fold design.
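The block-inversion idea behind fast multiple-fold residuals can be sketched for the simplest setting, a zero-mean GP with a known covariance matrix K (Simple Kriging). The sketch is in Python, not the authors' R implementation, and the kernel, data, and fold layout are illustrative assumptions. For a fold with index set I, the Schur complement identity gives the residual as the inverse of the (I, I) block of K^{-1} applied to the I entries of K^{-1} y.

```python
import numpy as np


def fast_cv_residuals(K, y, folds):
    """Fold residuals from a single inverse of K (zero-mean GP, known kernel)."""
    K_inv = np.linalg.inv(K)
    K_inv_y = K_inv @ y
    # e_I = (K^{-1}[I, I])^{-1} (K^{-1} y)[I], from block matrix inversion.
    return [np.linalg.solve(K_inv[np.ix_(I, I)], K_inv_y[I]) for I in folds]


def naive_cv_residuals(K, y, folds):
    """Reference: refit on the complement of each fold and predict the fold."""
    n = len(y)
    out = []
    for I in folds:
        rest = np.setdiff1d(np.arange(n), I)
        weights = np.linalg.solve(K[np.ix_(rest, rest)], y[rest])
        out.append(y[I] - K[np.ix_(I, rest)] @ weights)
    return out


# Hypothetical 1-D data set with three folds of equal size.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(size=12))
K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / 0.2) ** 2) + 1e-2 * np.eye(12)
y = np.sin(6.0 * x) + 0.1 * rng.standard_normal(12)
folds = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]

fast = fast_cv_residuals(K, y, folds)
naive = naive_cv_residuals(K, y, folds)
```

The fast version inverts the full matrix once and then solves one small system per fold, whereas the naive version solves a system of size n minus the fold size for every fold.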