Asymptotic analysis of the role of spatial sampling for covariance parameter estimation of Gaussian processes
Covariance parameter estimation of Gaussian processes is analyzed in an
asymptotic framework. The spatial sampling is a randomly perturbed regular grid
and its deviation from the perfect regular grid is controlled by a single
scalar regularity parameter. Consistency and asymptotic normality are proved
for the Maximum Likelihood and Cross Validation estimators of the covariance
parameters. The asymptotic covariance matrices of the covariance parameter
estimators are deterministic functions of the regularity parameter. By means of
an exhaustive study of the asymptotic covariance matrices, it is shown that the
estimation is improved when the regular grid is strongly perturbed. Hence, an
asymptotic confirmation is given to the commonly admitted fact that using
groups of observation points with small spacing is beneficial to covariance
function estimation. Finally, the prediction error, using a consistent
estimator of the covariance parameters, is analyzed in detail.
Comment: 47 pages. Supplementary material (pdf) is available in the arXiv source.
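The sampling scheme in this abstract can be illustrated with a minimal sketch. It assumes the perturbation is uniform on [-1, 1] in each coordinate and scaled by a regularity parameter `epsilon` in [0, 0.5); the abstract only says that a single scalar controls the deviation from the regular grid, so the function name `perturbed_grid`, the uniform law, and the range of `epsilon` are illustrative assumptions.

```python
import itertools
import random


def perturbed_grid(n_per_side, epsilon, dim=2, seed=0):
    """Jitter each node of the regular grid {1, ..., n_per_side}^dim.

    Each coordinate is shifted by epsilon * u with u uniform on [-1, 1]
    (an assumed perturbation law). epsilon = 0 gives the perfect grid.
    """
    if not 0.0 <= epsilon < 0.5:
        raise ValueError("epsilon must lie in [0, 0.5) so points stay distinct")
    rng = random.Random(seed)
    points = []
    for node in itertools.product(range(1, n_per_side + 1), repeat=dim):
        points.append(tuple(v + epsilon * rng.uniform(-1.0, 1.0) for v in node))
    return points


grid = perturbed_grid(4, 0.0)      # perfect regular grid
jittered = perturbed_grid(4, 0.4)  # strongly perturbed grid
```

With a large `epsilon`, some neighbouring points end up close together, which is the "groups of observation points with small spacing" situation the abstract refers to.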
Asymptotic analysis of covariance parameter estimation for Gaussian processes in the misspecified case
In parametric estimation of the covariance function of Gaussian processes, it is
often the case that the true covariance function does not belong to the
parametric set used for estimation. This situation is called the misspecified
case. In this case, it has been shown that, for irregular spatial sampling of
observation points, Cross Validation can yield smaller prediction errors than
Maximum Likelihood. Motivated by this observation, we provide a general
asymptotic analysis of the misspecified case, for independent and uniformly
distributed observation points. We prove that the Maximum Likelihood estimator
asymptotically minimizes a Kullback-Leibler divergence, within the misspecified
parametric set, while Cross Validation asymptotically minimizes the integrated
square prediction error. In a Monte Carlo simulation, we show that the
covariance parameters estimated by Maximum Likelihood and Cross Validation, and
the corresponding Kullback-Leibler divergences and integrated square prediction
errors, can be strongly contrasting. On a more technical level, we provide new
increasing-domain asymptotic results for independent and uniformly distributed
observation points.
Comment: Supplementary material (pdf) is available in the arXiv source.
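For orientation, the two estimators compared in this abstract are commonly written as follows for a zero-mean Gaussian process with observation vector $y \in \mathbb{R}^n$ and covariance matrix $R_\theta$. The notation and the zero-mean setting are assumptions here, and the paper's exact normalization may differ.

```latex
\hat{\theta}_{\mathrm{ML}} \in \operatorname*{arg\,min}_{\theta}
  \frac{1}{n}\left( \log\det R_\theta + y^{\top} R_\theta^{-1} y \right),
\qquad
\hat{\theta}_{\mathrm{CV}} \in \operatorname*{arg\,min}_{\theta}
  \frac{1}{n}\sum_{i=1}^{n} \left( y_i - \hat{y}_{\theta,-i} \right)^2,
\quad\text{where}\quad
y_i - \hat{y}_{\theta,-i} = \frac{\left( R_\theta^{-1} y \right)_i}{\left( R_\theta^{-1} \right)_{ii}} .
```

Here $\hat{y}_{\theta,-i}$ is the leave-one-out prediction of $y_i$ from the other observations. The last identity lets the cross-validation criterion be evaluated from a single inverse of $R_\theta$.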
Differentiating the multipoint Expected Improvement for optimal batch design
This work deals with parallel optimization of expensive objective functions
which are modeled as sample realizations of Gaussian processes. The study is
formalized as a Bayesian optimization problem, or continuous multi-armed bandit
problem, where a batch of q > 0 arms is pulled in parallel at each iteration.
Several algorithms have been developed for choosing batches by trading off
exploitation and exploration. As of today, the maximum Expected Improvement
(EI) and Upper Confidence Bound (UCB) selection rules appear as the most
prominent approaches for batch selection. Here, we build upon recent work on
the multipoint Expected Improvement criterion, for which an analytic expansion
relying on Tallis' formula was recently established. The computational burden
of this selection rule still being an issue in applications, we derive a
closed-form expression for the gradient of the multipoint Expected Improvement,
which aims at facilitating its maximization using gradient-based ascent
algorithms. Substantial computational savings are shown in application. In
addition, our algorithms are tested numerically and compared to
state-of-the-art UCB-based batch-sequential algorithms. Combining starting
designs relying on UCB with gradient-based EI local optimization finally
appears as a sound option for batch design in distributed Gaussian Process
optimization.
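As a point of reference for the criterion discussed in this abstract, the multipoint Expected Improvement of a batch can be approximated by plain Monte Carlo. The sketch below is not the closed-form expression or gradient derived in the paper. It assumes a minimization convention, in which the improvement is max(0, best - min of the batch values), and it takes the posterior mean vector and covariance matrix of the batch as given inputs. The names `cholesky` and `multipoint_ei_mc` are hypothetical.

```python
import math
import random


def cholesky(a):
    """Lower-triangular Cholesky factor of a small symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def multipoint_ei_mc(mean, cov, best, n_samples=20000, seed=0):
    """Monte Carlo estimate of E[max(0, best - min_i Y_i)] with Y ~ N(mean, cov)."""
    rng = random.Random(seed)
    low = cholesky(cov)
    q = len(mean)
    total = 0.0
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(q)]
        # Correlated Gaussian sample: y = mean + L z
        y = [mean[i] + sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(q)]
        total += max(0.0, best - min(y))
    return total / n_samples


ei_single = multipoint_ei_mc([0.0], [[1.0]], best=0.0)
ei_batch = multipoint_ei_mc([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], best=0.0)
```

For a single point, the estimate can be checked against the usual closed-form EI, which equals the standard normal density at zero in this example. The sampling noise of such an estimator is one reason an analytic gradient is attractive for gradient-based maximization.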
Unsupervised Learning via Mixtures of Skewed Distributions with Hypercube Contours
Mixture models whose components have skewed hypercube contours are developed
via a generalization of the multivariate shifted asymmetric Laplace density.
Specifically, we develop mixtures of multiple scaled shifted asymmetric Laplace
distributions. The component densities have two unique features: they include a
multivariate weight function, and the marginal distributions are also
asymmetric Laplace. We use these mixtures of multiple scaled shifted asymmetric
Laplace distributions for clustering applications, but they could equally well
be used in the supervised or semi-supervised paradigms. The
expectation-maximization algorithm is used for parameter estimation and the
Bayesian information criterion is used for model selection. Simulated and real
data sets are used to illustrate the approach and, in some cases, to visualize
the skewed hypercube structure of the components.
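The building block named in this abstract, the shifted asymmetric Laplace (SAL) distribution, can be simulated in one dimension through its usual normal variance-mean mixture representation, X = mu + W * alpha + sqrt(W) * sigma * Z, with W standard exponential and Z standard normal. The sketch below covers only this univariate base case, not the multiple scaled generalization developed in the paper. The function name `sample_sal` and the parameter names are illustrative assumptions.

```python
import math
import random


def sample_sal(mu, alpha, sigma, n, seed=0):
    """Draw n values from a univariate shifted asymmetric Laplace distribution.

    Uses X = mu + W * alpha + sqrt(W) * sigma * Z, with W ~ Exp(1) and Z ~ N(0, 1).
    alpha controls the skewness; alpha = 0 gives a symmetric Laplace shape.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        w = rng.expovariate(1.0)
        draws.append(mu + w * alpha + math.sqrt(w) * sigma * rng.gauss(0.0, 1.0))
    return draws


draws = sample_sal(mu=0.0, alpha=1.0, sigma=1.0, n=20000)
sample_mean = sum(draws) / len(draws)  # theoretical mean is mu + alpha
```

Under this representation the mean is mu + alpha and the variance is sigma squared plus alpha squared, so the sample mean above should be close to 1.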