Skip to main content
Article thumbnail
Location of Repository

Online Decision Problems with Large Strategy Sets

By Robert David Kleinberg


In an online decision problem, an algorithm performs a sequence of trials, each of which involves selecting one element from a fixed set of alternatives (the “strategy set”) whose costs vary over time. After T trials, the combined cost of the algorithm’s choices is compared with that of the single strategy whose combined cost is minimum. Their difference is called regret, and one seeks algorithms which are efficient in that their regret is sublinear in T and polynomial in the problem size. We study an important class of online decision problems called generalized multiarmed bandit problems. In the past such problems have found applications in areas as diverse as statistics, computer science, economic theory, and medical decision-making. Most existing algorithms were efficient only in the case of a small (i.e. polynomialsized) strategy set. We extend the theory by supplying non-trivial algorithms and lower bounds for cases in which the strategy set is much larger (exponential or infinite) an

Year: 2005
OAI identifier: oai:CiteSeerX.psu:
Provided by: CiteSeerX
Download PDF:
Sorry, we are unable to provide the full text but you may find it at the following location(s):
  • (external link)
  • (external link)
  • Suggested articles

    To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.