Stochastic convex optimization with bandit feedback
This paper addresses the problem of minimizing a convex, Lipschitz function
$f$ over a convex, compact set $\mathcal{X}$ under a stochastic bandit feedback
model. In this model, the algorithm is allowed to observe noisy realizations of
the function value $f(x)$ at any query point $x \in \mathcal{X}$. The quantity of
interest is the regret of the algorithm, which is the sum of the function
values at the algorithm's query points minus the optimal function value. We
demonstrate a generalization of the ellipsoid algorithm that incurs
$\tilde{O}(\mathrm{poly}(d)\sqrt{T})$ regret. Since any algorithm has regret at least
$\Omega(\sqrt{T})$ on this problem, our algorithm is optimal in terms of the
scaling with $T$.
- …
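
To make the feedback model and the regret concrete, here is a minimal Python sketch. It is not the paper's ellipsoid-based algorithm. The objective `f`, the noise level `sigma`, and the explore-then-commit `naive_grid_player` are all hypothetical choices made for illustration. Reading the abstract's definition over $T$ queries $x_1, \dots, x_T$, the regret is taken to be $\sum_{t=1}^{T} f(x_t) - T \min_{x \in \mathcal{X}} f(x)$.

```python
import random


def f(x):
    """Hypothetical convex, Lipschitz objective on X = [0, 1]; its minimum is 0 at x = 0.25."""
    return (x - 0.25) ** 2


def noisy_value(x, rng, sigma=0.1):
    """Bandit feedback: the player sees only a noisy realization of f(x)."""
    return f(x) + rng.gauss(0.0, sigma)


def regret(queries, f_opt=0.0):
    """Sum of true function values at the query points minus T times the optimal value."""
    return sum(f(x) for x in queries) - len(queries) * f_opt


def naive_grid_player(T, rng, grid=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Illustrative baseline (an assumption, not the paper's method):
    sample each grid point equally during the first half of the rounds,
    then commit to the point with the lowest empirical mean."""
    per_point = max(1, (T // 2) // len(grid))
    queries, means = [], []
    for x in grid:
        samples = [noisy_value(x, rng) for _ in range(per_point)]
        queries.extend([x] * per_point)
        means.append(sum(samples) / per_point)
    best = grid[means.index(min(means))]
    queries.extend([best] * (T - len(queries)))
    return queries[:T]


if __name__ == "__main__":
    rng = random.Random(0)
    queries = naive_grid_player(1000, rng)
    print("regret of the naive baseline over 1000 rounds:", regret(queries))
```

Note that the regret is charged on the true values $f(x_t)$ at every query, including exploratory ones, so an algorithm cannot explore for free. That is what separates this setting from optimization where only the final point matters.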