
Newtron: an efficient bandit algorithm for online multiclass prediction

By Elad Hazan and Satyen Kale

Abstract

We present an efficient algorithm for the problem of online multiclass prediction with bandit feedback in the fully adversarial setting. We measure its regret with respect to the log-loss defined in [AR09], which is parameterized by a scalar α. We prove that the regret of NEWTRON is O(log T) when α is a constant that does not vary with horizon T, and at most O(T^{2/3}) if α is allowed to increase to infinity with T. For α = O(log T), the regret is bounded by O(√T), thus solving the open problem of [KSST08, AR09]. Our algorithm is based on a novel application of the online Newton method [HAK07]. We test our algorithm and show it to perform well in experiments, even when α is a small constant.

Year: 2011
OAI identifier: oai:CiteSeerX.psu:10.1.1.352.7939
Provided by: CiteSeerX