3,686 research outputs found

    A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning

    We present a tutorial on Bayesian optimization, a method of finding the maximum of expensive cost functions. Bayesian optimization employs the Bayesian technique of setting a prior over the objective function and combining it with evidence to get a posterior function. This permits a utility-based selection of the next observation to make on the objective function, which must take into account both exploration (sampling from areas of high uncertainty) and exploitation (sampling areas likely to offer improvement over the current best observation). We also present two detailed extensions of Bayesian optimization, with experiments (active user modelling with preferences, and hierarchical reinforcement learning), and a discussion of the pros and cons of Bayesian optimization based on our experiences.
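The exploration/exploitation trade-off described in this abstract can be made concrete with a small sketch. The abstract does not name a specific utility, so the code below assumes expected improvement (a commonly used acquisition function) and assumes the posterior at each candidate point is already summarized by a mean and a standard deviation. The function names and the two candidate points are hypothetical and chosen only for illustration.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean, std, best):
    """Expected improvement over `best` for a maximization problem.

    `mean` and `std` are the posterior mean and standard deviation of the
    objective at a candidate point (assumed Gaussian); `best` is the
    largest value observed so far.
    """
    if std <= 0.0:
        # No uncertainty left: improvement is deterministic.
        return max(mean - best, 0.0)
    z = (mean - best) / std
    # First term rewards a high mean (exploitation),
    # second term rewards high uncertainty (exploration).
    return (mean - best) * normal_cdf(z) + std * normal_pdf(z)


# Hypothetical posterior summaries at two candidate points.
best_so_far = 1.0
candidates = {
    "near_current_best": (1.1, 0.05),  # slightly better mean, low uncertainty
    "unexplored_region": (0.8, 0.6),   # worse mean, high uncertainty
}
scores = {
    name: expected_improvement(mean, std, best_so_far)
    for name, (mean, std) in candidates.items()
}
next_point = max(scores, key=scores.get)
```

Under these made-up numbers the uncertain candidate scores higher, which is the sense in which a utility of this kind can favor exploration even when the posterior mean is below the current best.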

    Lower Bounds on Regret for Noisy Gaussian Process Bandit Optimization

    In this paper, we consider the problem of sequentially optimizing a black-box function $f$ based on noisy samples and bandit feedback. We assume that $f$ is smooth in the sense of having a bounded norm in some reproducing kernel Hilbert space (RKHS), yielding a commonly-considered non-Bayesian form of Gaussian process bandit optimization. We provide algorithm-independent lower bounds on the simple regret, measuring the suboptimality of a single point reported after $T$ rounds, and on the cumulative regret, measuring the sum of regrets over the $T$ chosen points. For the isotropic squared-exponential kernel in $d$ dimensions, we find that an average simple regret of $\epsilon$ requires $T = \Omega\big(\frac{1}{\epsilon^2} (\log\frac{1}{\epsilon})^{d/2}\big)$, and the average cumulative regret is at least $\Omega\big( \sqrt{T(\log T)^{d/2}} \big)$, thus matching existing upper bounds up to the replacement of $d/2$ by $2d+O(1)$ in both cases. For the Matérn-$\nu$ kernel, we give analogous bounds of the form $\Omega\big( (\frac{1}{\epsilon})^{2+d/\nu}\big)$ and $\Omega\big( T^{\frac{\nu + d}{2\nu + d}} \big)$, and discuss the resulting gaps to the existing upper bounds. Comment: Appearing in COLT 2017. This version corrects a few minor mistakes in Table I, which summarizes the new and existing regret bounds.
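The two regret notions in this abstract can be illustrated with a minimal sketch. The objective, its maximum, the queried points, and the reported point below are all hypothetical; the code only shows how simple regret (one reported point after $T$ rounds) differs from cumulative regret (a sum over the $T$ chosen points), and ignores observation noise.

```python
def simple_regret(f, f_max, reported_point):
    """Suboptimality of the single point reported after all rounds."""
    return f_max - f(reported_point)


def cumulative_regret(f, f_max, chosen_points):
    """Sum of per-round suboptimalities over all chosen points."""
    return sum(f_max - f(x) for x in chosen_points)


def toy_objective(x):
    """Hypothetical objective with maximum value 0.0 at x = 1.0."""
    return -((x - 1.0) ** 2)


F_MAX = 0.0
queried = [0.0, 2.0, 1.0]  # points chosen over T = 3 rounds
reported = 0.5             # point reported at the end

total = cumulative_regret(toy_objective, F_MAX, queried)
final = simple_regret(toy_objective, F_MAX, reported)
```

An algorithm can have small simple regret while accumulating large cumulative regret, since exploratory queries far from the optimum count against the latter but not the former.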

    Procrastinated Tree Search: Black-box Optimization with Delayed, Noisy, and Multi-fidelity Feedback

    In black-box optimization problems, we aim to maximize an unknown objective function that is only accessible through feedback from an evaluation or simulation oracle. In real life, the feedback of such oracles is often noisy and available only after some unknown delay that may depend on the computation time of the oracle. Additionally, if exact evaluations are expensive but coarse approximations are available at a lower cost, the feedback can be multi-fidelity. In order to address this problem, we propose a generic extension of hierarchical optimistic tree search (HOO), called ProCrastinated Tree Search (PCTS), that flexibly accommodates a delay- and noise-tolerant bandit algorithm. We provide a generic proof technique to quantify the regret of PCTS under delayed, noisy, and multi-fidelity feedback. Specifically, we derive regret bounds of PCTS enabled with the delayed-UCB1 (DUCB1) and delayed-UCB-V (DUCBV) algorithms. Given a horizon $T$, PCTS retains the regret bound of non-delayed HOO for an expected delay of $O(\log T)$ and worsens by $O(T^{\frac{1-\alpha}{d+2}})$ for expected delays of $O(T^{1-\alpha})$ for $\alpha \in (0,1]$. We experimentally validate on multiple synthetic functions and hyperparameter tuning problems that PCTS outperforms the state-of-the-art black-box optimization methods for feedback with different noise levels, delays, and fidelity.

    An Entropy Search Portfolio for Bayesian Optimization

    Bayesian optimization is a sample-efficient method for black-box global optimization. However, the performance of a Bayesian optimization method very much depends on its exploration strategy, i.e. the choice of acquisition function, and it is not clear a priori which choice will result in superior performance. While portfolio methods provide an effective, principled way of combining a collection of acquisition functions, they are often based on measures of past performance which can be misleading. To address this issue, we introduce the Entropy Search Portfolio (ESP): a novel approach to portfolio construction which is motivated by information theoretic considerations. We show that ESP outperforms existing portfolio methods on several real and synthetic problems, including geostatistical datasets and simulated control tasks. We not only show that ESP is able to offer performance as good as the best, but unknown, acquisition function, but surprisingly it often gives better performance. Finally, over a wide range of conditions we find that ESP is robust to the inclusion of poor acquisition functions. Comment: 10 pages, 5 figures