
Multi-scale exploration of convex functions and bandit convex optimization

Abstract

We construct a new map from a convex function to a distribution on its domain, with the property that this distribution is a multi-scale exploration of the function. We use this map to solve a decade-old open problem in adversarial bandit convex optimization by showing that the minimax regret for this problem is $\tilde{O}(\mathrm{poly}(n)\sqrt{T})$, where $n$ is the dimension and $T$ is the number of rounds. This bound is obtained by studying the dual Bayesian maximin regret via the information ratio analysis of Russo and Van Roy, and then using the multi-scale exploration to solve the Bayesian problem.

Comment: Preliminary version; 22 pages.
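For context, the regret quantity being bounded can be stated as follows. This is the standard definition of regret in adversarial bandit convex optimization, not a formula taken from the paper itself: the learner picks points $x_t$ in a convex body $\mathcal{K} \subset \mathbb{R}^n$, the adversary picks convex losses $f_t$, and only the scalar value $f_t(x_t)$ is observed.

```latex
% Standard adversarial regret (sketch; notation assumed, not from the source):
% the learner's cumulative loss minus that of the best fixed point in hindsight.
\[
  R_T \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \min_{x \in \mathcal{K}} \sum_{t=1}^{T} f_t(x),
\]
% The abstract's claim is that the minimax value of this quantity satisfies
\[
  \min_{\text{algorithms}} \; \max_{f_1,\dots,f_T} \; \mathbb{E}[R_T]
  \;=\; \tilde{O}\!\left(\mathrm{poly}(n)\,\sqrt{T}\right),
\]
% i.e., regret growing as \sqrt{T} up to polynomial factors in the dimension n
% and logarithmic factors in T.
```

The significance of the $\sqrt{T}$ rate is that the average per-round excess loss $R_T/T$ vanishes as $T \to \infty$ at the same rate as in the full-information setting, despite the learner observing only bandit (single-point) feedback.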
