A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning

Authors: Brochu Eric; Cora Vlad M.; de Freitas Nando
Publication date: 01/01/2009

We present a tutorial on Bayesian optimization, a method of finding the maximum of expensive cost functions. Bayesian optimization employs the Bayesian technique of setting a prior over the objective function and combining it with evidence to get a posterior function. This permits a utility-based selection of the next observation to make on the objective function, which must take into account both exploration (sampling from areas of high uncertainty) and exploitation (sampling areas likely to offer improvement over the current best observation). We also present two detailed extensions of Bayesian optimization, with experiments (active user modelling with preferences, and hierarchical reinforcement learning), and a discussion of the pros and cons of Bayesian optimization based on our experiences.
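The prior/posterior/utility loop described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the tutorial's own code: it assumes a zero-mean Gaussian process prior with a squared-exponential kernel, expected improvement as the utility, a fixed 1-D grid of candidates, and a hypothetical `toy_objective`. All function names and parameter values are illustrative choices.

```python
import math
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between 1-D point sets a and b."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_oq = rbf_kernel(x_obs, x_query)
    solved = np.linalg.solve(k_oo, k_oq)        # K^{-1} k(x_obs, x_query)
    mean = solved.T @ y_obs
    var = 1.0 - np.sum(k_oq * solved, axis=0)   # prior variance k(x, x) is 1
    return mean, np.sqrt(np.clip(var, 1e-12, None))

# Standard normal CDF, vectorized over arrays via math.erf.
_norm_cdf = np.vectorize(lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

def expected_improvement(mean, std, best):
    """Expected improvement over the current best observation (maximization)."""
    z = (mean - best) / std
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (mean - best) * _norm_cdf(z) + std * pdf

def bayes_opt(objective, n_iter=15, seed=0):
    """Maximize `objective` on [0, 1] using a GP posterior and EI."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 201)           # candidate locations
    x_obs = rng.uniform(0.0, 1.0, size=3)       # initial random design
    y_obs = objective(x_obs)
    for _ in range(n_iter):
        mean, std = gp_posterior(x_obs, y_obs, grid)
        ei = expected_improvement(mean, std, y_obs.max())
        x_next = grid[np.argmax(ei)]            # trades off mean and uncertainty
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(np.array([x_next]))[0])
    best = np.argmax(y_obs)
    return x_obs[best], y_obs[best]

def toy_objective(x):
    """Hypothetical stand-in for an expensive function; maximum at x = 0.3."""
    return -(x - 0.3) ** 2

x_best, y_best = bayes_opt(toy_objective)
```

Expected improvement is large where the posterior mean is high (exploitation) or the posterior standard deviation is high (exploration), which is the trade-off the abstract refers to. A real application would replace the grid search and the fixed kernel hyperparameters with something more careful.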

arXiv.org e-Print Archive

CiteSeerX

Oxford University Research Archive

Lower Bounds on Regret for Noisy Gaussian Process Bandit Optimization

Authors: Bogunovic Ilija; Cevher Volkan; Scarlett Jonathan
Publication date: 31/05/2017

In this paper, we consider the problem of sequentially optimizing a black-box function $f$ based on noisy samples and bandit feedback. We assume that $f$ is smooth in the sense of having a bounded norm in some reproducing kernel Hilbert space (RKHS), yielding a commonly-considered non-Bayesian form of Gaussian process bandit optimization. We provide algorithm-independent lower bounds on the simple regret, measuring the suboptimality of a single point reported after $T$ rounds, and on the cumulative regret, measuring the sum of regrets over the $T$ chosen points. For the isotropic squared-exponential kernel in $d$ dimensions, we find that an average simple regret of $\epsilon$ requires $T = \Omega\big(\frac{1}{\epsilon^2} (\log\frac{1}{\epsilon})^{d/2}\big)$, and the average cumulative regret is at least $\Omega\big( \sqrt{T(\log T)^{d/2}} \big)$, thus matching existing upper bounds up to the replacement of $d/2$ by $2d+O(1)$ in both cases. For the Matérn-$\nu$ kernel, we give analogous bounds of the form $\Omega\big( (\frac{1}{\epsilon})^{2+d/\nu}\big)$ and $\Omega\big( T^{\frac{\nu + d}{2\nu + d}} \big)$, and discuss the resulting gaps to the existing upper bounds.

Comment: Appearing in COLT 2017. This version corrects a few minor mistakes in Table I, which summarizes the new and existing regret bounds
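For reference, the two regret notions named in this abstract are commonly written as follows. The symbols $D$, $x^*$, $x_t$, $x^{(T)}$, $r_T$ and $R_T$ are assumed notation for this note and do not appear in the abstract itself.

```latex
% Assumed notation: D is the input domain, x^* a maximizer of f over D,
% x_t the point queried in round t, x^{(T)} the point reported after T rounds.
x^* \in \arg\max_{x \in D} f(x), \qquad
r_T = f(x^*) - f\big(x^{(T)}\big), \qquad
R_T = \sum_{t=1}^{T} \big( f(x^*) - f(x_t) \big).
```

Here $r_T$ is the simple regret of the single reported point and $R_T$ is the cumulative regret summed over the $T$ chosen points, matching the verbal definitions in the abstract.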

arXiv.org e-Print Archive

Procrastinated Tree Search: Black-box Optimization with Delayed, Noisy, and Multi-fidelity Feedback

Authors: Basu Debabrota; Trummer Immanuel; Wang Junxiong
Publication date: 14/10/2021

In black-box optimization problems, we aim to maximize an unknown objective function, where the function is only accessible through feedbacks of an evaluation or simulation oracle. In real life, the feedbacks of such oracles are often noisy and available after some unknown delay that may depend on the computation time of the oracle. Additionally, if the exact evaluations are expensive but coarse approximations are available at a lower cost, the feedbacks can have multi-fidelity. In order to address this problem, we propose a generic extension of hierarchical optimistic tree search (HOO), called ProCrastinated Tree Search (PCTS), that flexibly accommodates a delay- and noise-tolerant bandit algorithm. We provide a generic proof technique to quantify regret of PCTS under delayed, noisy, and multi-fidelity feedbacks. Specifically, we derive regret bounds of PCTS enabled with delayed-UCB1 (DUCB1) and delayed-UCB-V (DUCBV) algorithms. Given a horizon $T$, PCTS retains the regret bound of non-delayed HOO for expected delay of $O(\log T)$ and worsens by $O(T^{\frac{1-\alpha}{d+2}})$ for expected delays of $O(T^{1-\alpha})$ for $\alpha \in (0,1]$. We experimentally validate on multiple synthetic functions and hyperparameter tuning problems that PCTS outperforms the state-of-the-art black-box optimization methods for feedbacks with different noise levels, delays, and fidelity.
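To make the idea of a delay-tolerant bandit rule concrete, here is a minimal sketch of UCB1 on a finite set of arms where each reward only becomes visible a fixed number of rounds after the pull. It is a generic illustration under assumed Bernoulli rewards and a constant delay, not the paper's DUCB1 or PCTS; the function name `delayed_ucb1` and all parameter values are hypothetical.

```python
import math
import random
from collections import deque

def delayed_ucb1(arm_means, horizon, delay, seed=0):
    """Play Bernoulli arms with UCB1 when each reward arrives `delay` rounds late."""
    rng = random.Random(seed)
    n_arms = len(arm_means)
    pulls = [0] * n_arms          # how often each arm was played
    received = [0] * n_arms       # how many rewards have arrived per arm
    reward_sum = [0.0] * n_arms   # sum of the rewards that have arrived
    pending = deque()             # (arrival_round, arm, reward), in arrival order

    for t in range(1, horizon + 1):
        # Incorporate every reward whose delay has elapsed.
        while pending and pending[0][0] <= t:
            _, arm, reward = pending.popleft()
            received[arm] += 1
            reward_sum[arm] += reward

        waiting = [a for a in range(n_arms) if received[a] == 0]
        if waiting:
            # No feedback yet for some arm: play the least-played such arm.
            arm = min(waiting, key=lambda a: pulls[a])
        else:
            # UCB1 index computed from the feedback received so far only.
            arm = max(
                range(n_arms),
                key=lambda a: reward_sum[a] / received[a]
                + math.sqrt(2.0 * math.log(t) / received[a]),
            )

        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        pulls[arm] += 1
        pending.append((t + delay, arm, reward))
    return pulls

pulls = delayed_ucb1([0.2, 0.5, 0.9], horizon=2000, delay=5)
```

The only change from plain UCB1 is that the confidence index uses the count of rewards that have arrived rather than the count of pulls, so the algorithm keeps exploring while feedback is still in flight.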

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Hal-Diderot

Association for the Advancement of Artificial Intelligence: AAAI Publications

An Entropy Search Portfolio for Bayesian Optimization

Authors: Bouchard-Côté Alexandre; de Freitas Nando; Hoffman Matthew W.; Shahriari Bobak; Wang Ziyu
Publication date: 01/01/2014

Bayesian optimization is a sample-efficient method for black-box global optimization. However, the performance of a Bayesian optimization method very much depends on its exploration strategy, i.e. the choice of acquisition function, and it is not clear a priori which choice will result in superior performance. While portfolio methods provide an effective, principled way of combining a collection of acquisition functions, they are often based on measures of past performance which can be misleading. To address this issue, we introduce the Entropy Search Portfolio (ESP): a novel approach to portfolio construction which is motivated by information theoretic considerations. We show that ESP outperforms existing portfolio methods on several real and synthetic problems, including geostatistical datasets and simulated control tasks. We not only show that ESP is able to offer performance as good as the best, but unknown, acquisition function, but surprisingly it often gives better performance. Finally, over a wide range of conditions we find that ESP is robust to the inclusion of poor acquisition functions.

Comment: 10 pages, 5 figures
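As a small illustration of why the choice of acquisition function matters, the sketch below lets three common acquisition functions each nominate a candidate from the same posterior. It assumes a Gaussian posterior summarized by a mean and standard deviation at each candidate, and it covers only the nomination step of a portfolio: ESP's information-theoretic choice among the nominees is not implemented here. The function names, the `kappa` parameter and the numbers are illustrative assumptions.

```python
import math
import numpy as np

def _norm_cdf(z):
    """Standard normal CDF, applied elementwise."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

def _norm_pdf(z):
    """Standard normal density, applied elementwise."""
    return np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)

def probability_of_improvement(mean, std, best):
    return _norm_cdf((mean - best) / std)

def expected_improvement(mean, std, best):
    z = (mean - best) / std
    return (mean - best) * _norm_cdf(z) + std * _norm_pdf(z)

def upper_confidence_bound(mean, std, kappa=2.0):
    return mean + kappa * std

def nominate(candidates, mean, std, best):
    """Return the candidate each acquisition function would evaluate next."""
    scores = {
        "PI": probability_of_improvement(mean, std, best),
        "EI": expected_improvement(mean, std, best),
        "UCB": upper_confidence_bound(mean, std),
    }
    return {name: float(candidates[np.argmax(s)]) for name, s in scores.items()}

# Hypothetical posterior over three candidates; the best observed value is 0.3.
nominees = nominate(
    candidates=np.array([0.0, 0.5, 1.0]),
    mean=np.array([0.0, 0.4, 0.1]),
    std=np.array([0.1, 0.1, 1.0]),
    best=0.3,
)
```

In this example probability of improvement nominates the candidate at 0.5, whose mean is slightly above the incumbent, while expected improvement and the upper confidence bound nominate the highly uncertain candidate at 1.0. A portfolio method then has to decide which nominee to evaluate.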

arXiv.org e-Print Archive

CiteSeerX

Oxford University Research Archive