Finite-Time Logarithmic Bayes Regret Upper Bounds

Atsidakou, Alexia; Caramanis, Constantine; Katariya, Sumeet; Kveton, Branislav; Sanghavi, Sujay

Authors: Alexia Atsidakou, Constantine Caramanis, Sumeet Katariya, Branislav Kveton, Sujay Sanghavi
Publication date: 21 January 2024

Abstract

We derive the first finite-time logarithmic Bayes regret upper bounds for Bayesian bandits. In a multi-armed bandit, we obtain $O(c_\Delta \log n)$ and $O(c_h \log^2 n)$ upper bounds for an upper confidence bound algorithm, where $c_h$ and $c_\Delta$ are constants depending on the prior distribution and the gaps of bandit instances sampled from it, respectively. The latter bound asymptotically matches the lower bound of Lai (1987). Our proofs are a major technical departure from prior works, while being simple and general. To show the generality of our techniques, we apply them to linear bandits. Our results provide insights on the value of the prior in the Bayesian setting, both in the objective and as side information given to the learner. They significantly improve upon existing $\tilde{O}(\sqrt{n})$ bounds, which have become standard in the literature despite the logarithmic lower bound of Lai (1987).
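To make the quantity in the abstract concrete, the following is a minimal sketch of how Bayes regret could be estimated by simulation: bandit instances are drawn from a prior, a UCB rule is run on each for n rounds, and the regret is averaged over instances. Everything here is an assumption for illustration and not taken from the paper: the Gaussian prior and reward noise, the generic UCB1-style index (which does not use the prior and is not necessarily the algorithm the authors analyze), and the function names `ucb_regret` and `bayes_regret`.

```python
import math
import random


def ucb_regret(means, n, sigma=1.0, rng=random):
    """Run a generic UCB1-style rule for n rounds on one Gaussian bandit
    instance with the given arm means; return the cumulative pseudo-regret.
    Illustrative assumption, not the paper's algorithm."""
    k = len(means)
    pulls = [0] * k      # number of times each arm was pulled
    sums = [0.0] * k     # sum of observed rewards per arm
    best = max(means)
    regret = 0.0
    for t in range(1, n + 1):
        if t <= k:
            arm = t - 1  # pull every arm once to initialize the estimates
        else:
            # empirical mean plus a confidence width that shrinks with pulls
            arm = max(
                range(k),
                key=lambda i: sums[i] / pulls[i]
                + sigma * math.sqrt(2.0 * math.log(t) / pulls[i]),
            )
        reward = rng.gauss(means[arm], sigma)
        pulls[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]  # gap of the pulled arm
    return regret


def bayes_regret(k, n, runs, prior_mean=0.0, prior_std=1.0, seed=0):
    """Monte Carlo estimate of Bayes regret: average the n-round regret over
    `runs` bandit instances whose arm means are sampled from a Gaussian prior
    (an assumed prior, chosen only for this sketch)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        means = [rng.gauss(prior_mean, prior_std) for _ in range(k)]
        total += ucb_regret(means, n, rng=rng)
    return total / runs
```

Calling, for example, `bayes_regret(k=5, n=1000, runs=200)` for several values of `n` and plotting the estimates against `log n` is one rough way to see whether the growth looks logarithmic rather than like $\sqrt{n}$. A simulation of this kind only illustrates the definition; it does not verify the bounds.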

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2306.09136

Last time updated on 22/08/2024