7 research outputs found
A simple parameter-free and adaptive approach to optimization under a minimal local smoothness assumption
We study the problem of optimizing a function under a budgeted number
of evaluations. We only assume that the function is locally smooth
around one of its global optima. The difficulty of optimization is measured in
terms of 1) the amount of noise b of the function evaluation and 2)
the local smoothness, d, of the function. A smaller d results in smaller
optimization error. We propose a new, simple, and parameter-free approach.
First, for all values of b and d, this approach recovers at least the
state-of-the-art regret guarantees. Second, our approach additionally obtains
these results while being agnostic to the values of both b and d.
This leads to the first algorithm that naturally adapts to an unknown
range of noise b and leads to significant improvements in a moderate and
low-noise regime. Third, our approach also obtains a remarkable improvement
over the state-of-the-art SOO algorithm when the noise is very low, which
includes the case of optimization under deterministic feedback (b=0). There,
under our minimal local smoothness assumption, this improvement is of
exponential magnitude and holds for a class of functions that covers the vast
majority of functions that practitioners optimize (d=0). We show that our
algorithmic improvement is borne out in experiments as we empirically show
faster convergence on common benchmarks.
A simple parameter-free and adaptive approach to optimization under a minimal local smoothness assumption
We study the problem of optimizing a function under a budgeted number of evaluations. We only assume that the function is locally smooth around one of its global optima. The difficulty of optimization is measured in terms of 1) the amount of noise b of the function evaluation and 2) the local smoothness, d, of the function. A smaller d results in smaller optimization error. We propose a new, simple, and parameter-free approach. First, for all values of b and d, this approach recovers at least the state-of-the-art regret guarantees. Second, our approach additionally obtains these results while being agnostic to the values of both b and d. This leads to the first algorithm that naturally adapts to an unknown range of noise b and leads to significant improvements in a moderate and low-noise regime. Third, our approach also obtains a remarkable improvement over the state-of-the-art SOO algorithm when the noise is very low, which includes the case of optimization under deterministic feedback (b=0). There, under our minimal local smoothness assumption, this improvement is of exponential magnitude and holds for a class of functions that covers the vast majority of functions that practitioners optimize (d=0). We show that our algorithmic improvement is borne out in experiments as we empirically show faster convergence on common benchmarks.
A simple dynamic bandit algorithm for hyper-parameter tuning
Hyper-parameter tuning is a major part of modern machine learning systems. The tuning itself can be seen as a sequential resource allocation problem. As such, methods for multi-armed bandits have already been applied. In this paper, we view hyper-parameter optimization as an instance of best-arm identification in infinitely many-armed bandits. We propose D-TTTS, a new adaptive algorithm inspired by Thompson sampling, which dynamically balances between refining the estimate of the quality of hyper-parameter configurations previously explored and adding new hyper-parameter configurations to the pool of candidates. The algorithm is easy to implement and shows competitive performance compared to state-of-the-art algorithms for hyper-parameter tuning.
Regret analysis of the Piyavskii-Shubert algorithm for global Lipschitz optimization
We consider the problem of maximizing a non-concave Lipschitz multivariate
function f over a compact domain. We provide regret guarantees (i.e.,
optimization error bounds) for a very natural algorithm originally designed by
Piyavskii and Shubert in 1972. Our results hold in a general setting in which
values of f can only be accessed approximately. In particular, they yield
state-of-the-art regret bounds both when f is observed exactly and when
evaluations are perturbed by independent subgaussian noise.
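To make the algorithm named in the abstract above concrete, here is a minimal one-dimensional sketch of the classical Piyavskii-Shubert scheme for maximization with exact evaluations. It is not the paper's multivariate or noisy setting; the function name `piyavskii_shubert`, its parameters, and the assumption that a valid Lipschitz constant is supplied by the caller are all illustrative choices.

```python
import math


def piyavskii_shubert(f, a, b, lipschitz, n_evals):
    """Maximize f on [a, b] with n_evals exact evaluations (1-D sketch).

    Assumes `lipschitz` is a valid, strictly positive Lipschitz constant
    of f on [a, b]. Returns the best evaluated point and its value.
    """
    xs = [a, b]            # evaluated points, kept sorted
    ys = [f(a), f(b)]      # corresponding function values
    for _ in range(n_evals - 2):
        best_peak, best_x, best_i = -math.inf, None, None
        for i in range(len(xs) - 1):
            x0, x1, y0, y1 = xs[i], xs[i + 1], ys[i], ys[i + 1]
            # The two Lipschitz cones y0 + L(x - x0) and y1 + L(x1 - x)
            # intersect at x_peak, where the upper bound equals `peak`.
            peak = 0.5 * (y0 + y1) + 0.5 * lipschitz * (x1 - x0)
            x_peak = 0.5 * (x0 + x1) + (y1 - y0) / (2.0 * lipschitz)
            if peak > best_peak:
                best_peak, best_x, best_i = peak, x_peak, i
        # Evaluate f where the piecewise-linear upper bound is largest.
        xs.insert(best_i + 1, best_x)
        ys.insert(best_i + 1, f(best_x))
    k = max(range(len(ys)), key=ys.__getitem__)
    return xs[k], ys[k]


if __name__ == "__main__":
    # Hypothetical test function: maximum at x = 0.3, |f'| <= 1.4 on [0, 1].
    x_best, y_best = piyavskii_shubert(lambda x: -(x - 0.3) ** 2, 0.0, 1.0, 2.0, 200)
    print(x_best, y_best)
```

Each iteration rescans all intervals, so this sketch costs quadratic time in the number of evaluations; a priority queue over interval peaks would be the usual refinement.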
Exploiting Higher Order Smoothness in Derivative-free Optimization and Continuous Bandits
We study the problem of zero-order optimization of a strongly convex
function. The goal is to find the minimizer of the function by a sequential
exploration of its values, under measurement noise. We study the impact of
higher order smoothness properties of the function on the optimization error
and on the cumulative regret. To solve this problem we consider a randomized
approximation of the projected gradient descent algorithm. The gradient is
estimated by a randomized procedure involving two function evaluations and a
smoothing kernel. We derive upper bounds for this algorithm both in the
constrained and unconstrained settings and prove minimax lower bounds for any
sequential search method. Our results imply that the zero-order algorithm is
nearly optimal in terms of sample complexity and the problem parameters. Based
on this algorithm, we also propose an estimator of the minimum value of the
function achieving almost sharp oracle behavior. We compare our results with
the state-of-the-art, highlighting a number of key improvements.
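As a rough illustration of the idea in the abstract above, the sketch below runs projected gradient descent with a gradient estimated from two function evaluations per step. It deliberately swaps in the plain sphere-sampling two-point estimator and omits the smoothing kernel that the abstract mentions, so it does not exploit higher-order smoothness. The function names, the Euclidean-ball constraint, the step size 1/(alpha t), and the default parameters are all assumptions made for this example.

```python
import math
import random


def two_point_gradient(f, x, h, rng):
    """Estimate the gradient of f at x from two evaluations.

    Draws a direction z uniformly on the unit sphere and returns
    d * (f(x + h z) - f(x - h z)) / (2 h) * z. No smoothing kernel is used.
    """
    d = len(x)
    z = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in z))
    z = [v / norm for v in z]
    f_plus = f([xi + h * zi for xi, zi in zip(x, z)])
    f_minus = f([xi - h * zi for xi, zi in zip(x, z)])
    scale = d * (f_plus - f_minus) / (2.0 * h)
    return [scale * zi for zi in z]


def project_ball(x, radius):
    """Euclidean projection onto the ball of the given radius around 0."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    return [v * radius / norm for v in x]


def zero_order_pgd(f, x0, radius, steps, alpha=1.0, h=1e-3, seed=0):
    """Projected gradient descent driven by the two-point estimate.

    `alpha` is an assumed strong-convexity parameter; the step size
    1 / (alpha * t) is a common choice, not one taken from the abstract.
    """
    rng = random.Random(seed)
    x = list(x0)
    for t in range(1, steps + 1):
        g = two_point_gradient(f, x, h, rng)
        eta = 1.0 / (alpha * t)
        x = project_ball([xi - eta * gi for xi, gi in zip(x, g)], radius)
    return x


if __name__ == "__main__":
    c = [0.5, -0.25]  # hypothetical minimizer, inside the unit ball

    def quadratic(x):
        return 0.5 * sum((xi - ci) ** 2 for xi, ci in zip(x, c))

    print(zero_order_pgd(quadratic, [0.0, 0.0], radius=1.0, steps=5000))
```

The usage example evaluates the objective without noise; under measurement noise the perturbation size `h` would have to be chosen to balance bias against variance, which is the trade-off the abstract's analysis addresses.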
Scale-free adaptive planning for deterministic dynamics & discounted rewards
We address the problem of planning in an environment with deterministic dynamics and stochastic discounted rewards under a limited numerical budget where the ranges of both rewards and noise are unknown. We introduce PlaTγPOOS, an adaptive, robust, and efficient alternative to the OLOP (open-loop optimistic planning) algorithm. Whereas OLOP requires a priori knowledge of the ranges of both rewards and noise, PlaTγPOOS dynamically adapts its behavior to both. This allows PlaTγPOOS to be immune to two vulnerabilities of OLOP: failure when given underestimated ranges of noise and rewards and inefficiency when these are overestimated. PlaTγPOOS additionally adapts to the global smoothness of the value function. PlaTγPOOS is provably more efficient than OLOP when OLOP is given an overestimated reward, and we show that in the case of no noise, PlaTγPOOS learns exponentially faster.