7 research outputs found

    A simple parameter-free and adaptive approach to optimization under a minimal local smoothness assumption

    Get PDF
    We study the problem of optimizing a function under a \emph{budgeted number of evaluations}. We only assume that the function is \emph{locally} smooth around one of its global optima. The difficulty of optimization is measured in terms of 1) the amount of \emph{noise} bb of the function evaluation and 2) the local smoothness, dd, of the function. A smaller dd results in smaller optimization error. We come with a new, simple, and parameter-free approach. First, for all values of bb and dd, this approach recovers at least the state-of-the-art regret guarantees. Second, our approach additionally obtains these results while being \textit{agnostic} to the values of both bb and dd. This leads to the first algorithm that naturally adapts to an \textit{unknown} range of noise bb and leads to significant improvements in a moderate and low-noise regime. Third, our approach also obtains a remarkable improvement over the state-of-the-art SOO algorithm when the noise is very low which includes the case of optimization under deterministic feedback (b=0b=0). There, under our minimal local smoothness assumption, this improvement is of exponential magnitude and holds for a class of functions that covers the vast majority of functions that practitioners optimize (d=0d=0). We show that our algorithmic improvement is borne out in experiments as we empirically show faster convergence on common benchmarks

    A simple parameter-free and adaptive approach to optimization under a minimal local smoothness assumption

    Get PDF
    International audienceWe study the problem of optimizing a function under a budgeted number of evaluations. We only assume that the function is locally smooth around one of its global optima. The difficulty of optimization is measured in terms of 1) the amount of noise b of the function evaluation and 2) the local smoothness, d, of the function. A smaller d results in smaller optimization error. We come with a new, simple, and parameter-free approach. First, for all values of b and d, this approach recovers at least the state-of-the-art regret guarantees. Second, our approach additionally obtains these results while being agnostic to the values of both b and d. This leads to the first algorithm that naturally adapts to an unknown range of noise b and leads to significant improvements in a moderate and low-noise regime. Third, our approach also obtains a remarkable improvement over the state-of-the-art SOO algorithm when the noise is very low which includes the case of optimization under deterministic feedback (b=0). There, under our minimal local smoothness assumption, this improvement is of exponential magnitude and holds for a class of functions that covers the vast majority of functions that practitioners optimize (d=0). We show that our algorithmic improvement is borne out in experiments as we empirically show faster convergence on common benchmarks

    A simple dynamic bandit algorithm for hyper-parameter tuning

    Get PDF
    International audienceHyper-parameter tuning is a major part of modern machine learning systems. The tuning itself can be seen as a sequential resource allocation problem. As such, methods for multi-armed bandits have been already applied. In this paper, we view hyper-parameter optimization as an instance of best-arm identification in infinitely many-armed bandits. We propose D-TTTS, a new adaptive algorithm inspired by Thompson sampling, which dynamically balances between refining the estimate of the quality of hyper-parameter configurations previously explored and adding new hyper-parameter configurations to the pool of candidates. The algorithm is easy to implement and shows competitive performance compared to state-of-the-art algorithms for hyper-parameter tuning

    Regret analysis of the Piyavskii-Shubert algorithm for global Lipschitz optimization

    Full text link
    We consider the problem of maximizing a non-concave Lipschitz multivariate function f over a compact domain. We provide regret guarantees (i.e., optimization error bounds) for a very natural algorithm originally designed by Piyavskii and Shubert in 1972. Our results hold in a general setting in which values of f can only be accessed approximately. In particular, they yield state-of-the-art regret bounds both when f is observed exactly and when evaluations are perturbed by an independent subgaussian noise

    Exploiting Higher Order Smoothness in Derivative-free Optimization and Continuous Bandits

    Full text link
    We study the problem of zero-order optimization of a strongly convex function. The goal is to find the minimizer of the function by a sequential exploration of its values, under measurement noise. We study the impact of higher order smoothness properties of the function on the optimization error and on the cumulative regret. To solve this problem we consider a randomized approximation of the projected gradient descent algorithm. The gradient is estimated by a randomized procedure involving two function evaluations and a smoothing kernel. We derive upper bounds for this algorithm both in the constrained and unconstrained settings and prove minimax lower bounds for any sequential search method. Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters. Based on this algorithm, we also propose an estimator of the minimum value of the function achieving almost sharp oracle behavior. We compare our results with the state-of-the-art, highlighting a number of key improvements

    Scale-free adaptive planning for deterministic dynamics & discounted rewards

    Get PDF
    International audienceWe address the problem of planning in an environment with deterministic dynamics and stochas-tic discounted rewards under a limited numerical budget where the ranges of both rewards and noise are unknown. We introduce PlaTγPOOS, an adaptive, robust, and efficient alternative to the OLOP (open-loop optimistic planning) algorithm. Whereas OLOP requires a priori knowledge of the ranges of both rewards and noise, PlaTγPOOS dynamically adapts its behavior to both. This allows PlaTγPOOS to be immune to two vulnerabil-ities of OLOP: failure when given underestimated ranges of noise and rewards and inefficiency when these are overestimated. PlaTγPOOS additionally adapts to the global smoothness of the value function. PlaTγPOOS acts in a provably more efficient manner vs. OLOP when OLOP is given an overestimated reward and show that in the case of no noise, PlaTγPOOS learns exponentially faster
    corecore