Parallel Gaussian Process Optimization with Upper Confidence Bound and Pure Exploration
In this paper, we consider the challenge of maximizing an unknown function f
for which evaluations are noisy and are acquired with high cost. An iterative
procedure uses the previous measurements to actively select the next evaluation of
f that is predicted to be the most useful. We focus on the case where the
function can be evaluated in parallel with batches of fixed size and analyze
the benefit compared to the purely sequential procedure in terms of cumulative
regret. We introduce the Gaussian Process Upper Confidence Bound and Pure
Exploration algorithm (GP-UCB-PE) which combines the UCB strategy and Pure
Exploration in the same batch of evaluations along the parallel iterations. We
prove theoretical upper bounds on the regret with batches of size K for this
procedure, which show an improvement of order $\sqrt{K}$ for fixed
iteration cost over purely sequential versions. Moreover, the multiplicative
constants involved have the property of being dimension-free. We also confirm
empirically the efficiency of GP-UCB-PE on real and synthetic problems compared
to state-of-the-art competitors.
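The batch construction described in this abstract can be illustrated with a minimal sketch: the first point of each batch maximizes the upper confidence bound, and the remaining points are chosen by pure exploration (largest posterior standard deviation), updating the variance after each pick. Everything specific here is an assumption made for illustration and not taken from the abstract: the squared-exponential kernel, the fixed `beta`, the noise level, the finite candidate grid, the filter that restricts exploration to candidates whose UCB exceeds the best lower confidence bound, and the function names `rbf_kernel`, `gp_posterior`, and `select_batch`.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    # Squared-exponential kernel between two sets of points (one point per row).
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale ** 2)

def gp_posterior(x_obs, y_obs, x_cand, noise=1e-2):
    # Posterior mean and standard deviation of a zero-mean, unit-variance GP.
    k_cand = rbf_kernel(x_cand, x_obs)
    gram = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    solved = np.linalg.solve(gram, k_cand.T)  # shape (n_obs, n_cand)
    mean = solved.T @ y_obs
    var = 1.0 - np.sum(k_cand.T * solved, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def select_batch(x_obs, y_obs, x_cand, batch_size, beta=4.0):
    # Hypothetical helper: returns indices into x_cand for one batch.
    mean, std = gp_posterior(x_obs, y_obs, x_cand)
    ucb = mean + np.sqrt(beta) * std
    lcb = mean - np.sqrt(beta) * std
    chosen = [int(np.argmax(ucb))]  # first point: UCB rule
    # Assumed filter: explore only where the optimum could still plausibly lie.
    relevant = ucb >= np.max(lcb)
    x_aug = x_obs
    for _ in range(batch_size - 1):
        x_aug = np.vstack([x_aug, x_cand[chosen[-1]]])
        # The posterior std depends only on input locations, so dummy targets suffice.
        _, std_k = gp_posterior(x_aug, np.zeros(len(x_aug)), x_cand)
        std_k = np.where(relevant, std_k, -np.inf)
        std_k[chosen] = -np.inf  # never pick the same candidate twice
        if not np.isfinite(std_k).any():
            break  # no relevant candidates left; return a shorter batch
        chosen.append(int(np.argmax(std_k)))  # remaining points: pure exploration
    return chosen

x_cand = np.linspace(0.0, 1.0, 50)[:, None]
x_obs = np.array([[0.1], [0.5], [0.9]])
y_obs = np.sin(6.0 * x_obs[:, 0])
batch = select_batch(x_obs, y_obs, x_cand, batch_size=4)
```

The exploration points need no new observations, because the Gaussian process posterior variance is a function of the query locations alone; this is what allows a whole batch to be chosen before any of its evaluations return.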
Novel Exploration Techniques (NETs) for Malaria Policy Interventions
The task of decision-making under uncertainty is daunting, especially for
problems which have significant complexity. Healthcare policy makers across the
globe are facing problems under challenging constraints, with limited tools to
help them make data-driven decisions. In this work we frame the process of
finding an optimal malaria policy as a stochastic multi-armed bandit problem,
and implement three agent-based strategies to explore the policy space. We
apply Gaussian process regression to the findings of each agent, both for
comparison and to account for stochastic results from simulating the spread of
malaria in a fixed population. The generated policy spaces are compared with
published results to give a direct reference with human expert decisions for
the same simulated population. Our novel approach provides a powerful resource
for policy makers, and a platform that can be readily extended to capture
more nuanced policy spaces in the future.
Comment: Under-revie
Lower Bounds on Regret for Noisy Gaussian Process Bandit Optimization
In this paper, we consider the problem of sequentially optimizing a black-box
function $f$ based on noisy samples and bandit feedback. We assume that $f$ is
smooth in the sense of having a bounded norm in some reproducing kernel Hilbert
space (RKHS), yielding a commonly-considered non-Bayesian form of Gaussian
process bandit optimization. We provide algorithm-independent lower bounds on
the simple regret, measuring the suboptimality of a single point reported after
$T$ rounds, and on the cumulative regret, measuring the sum of regrets over the $T$
chosen points. For the isotropic squared-exponential kernel in $d$
dimensions, we find that an average simple regret of $\epsilon$ requires
$T = \Omega\big(\frac{1}{\epsilon^2} (\log\frac{1}{\epsilon})^{d/2}\big)$, and the
average cumulative regret is at least $\Omega\big(\sqrt{T (\log T)^{d/2}}\big)$, thus matching existing upper bounds up to the replacement of $d/2$ by
$2d + O(1)$ in both cases. For the Matérn-$\nu$ kernel, we give analogous
bounds of the form $\Omega\big((\frac{1}{\epsilon})^{2 + d/\nu}\big)$ and
$\Omega\big(T^{\frac{\nu + d}{2\nu + d}}\big)$, and discuss the resulting
gaps to the existing upper bounds.
Comment: Appearing in COLT 2017. This version corrects a few minor mistakes in
Table I, which summarizes the new and existing regret bound
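The two regret notions named in the abstract above can be made concrete with a small sketch. The objective `f`, its maximizer at 0.3, the queried points, and the helper names are all hypothetical and chosen only for illustration; the definitions follow the abstract's wording (suboptimality of one reported point versus the sum of suboptimalities over all chosen points).

```python
import numpy as np

def f(x):
    # Hypothetical noiseless objective with its maximum f(0.3) = 0.
    return -(x - 0.3) ** 2

def simple_regret(f_opt, f_reported):
    # Suboptimality of the single point reported after the final round.
    return float(f_opt - f_reported)

def cumulative_regret(f_opt, f_chosen):
    # Sum of suboptimalities over every point chosen during the run.
    return float(np.sum(f_opt - np.asarray(f_chosen)))

f_opt = f(0.3)
chosen = np.array([0.0, 0.5, 0.3])  # hypothetical queries over three rounds
r_cum = cumulative_regret(f_opt, f(chosen))
r_simple = simple_regret(f_opt, f(chosen[-1]))  # report the last query
```

Dividing the cumulative regret by the number of rounds gives the average cumulative regret that the stated bounds refer to.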