SPRT-based Efficient Best Arm Identification in Stochastic Bandits
This paper investigates the best arm identification (BAI) problem in
stochastic multi-armed bandits in the fixed confidence setting. The general
class of the exponential family of bandits is considered. The state-of-the-art
algorithms for the exponential family of bandits face computational challenges.
To mitigate these challenges, a novel framework is proposed, which views the
BAI problem as sequential hypothesis testing, and is amenable to tractable
analysis for the exponential family of bandits. Based on this framework, a BAI
algorithm is designed that leverages the canonical sequential probability ratio
tests. This algorithm has three features for both settings: (1) its sample
complexity is asymptotically optimal, (2) it is guaranteed to be δ-PAC,
and (3) it addresses the computational challenge of the state-of-the-art
approaches. Specifically, these approaches, which are focused only on the
Gaussian setting, require Thompson sampling from the arm that is deemed the
best and a challenger arm. This paper analytically shows that identifying the
challenger is computationally expensive and that the proposed algorithm
circumvents it. Finally, numerical experiments are provided to support the
analysis.
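
For readers unfamiliar with the canonical sequential probability ratio test mentioned in the abstract, the following is a minimal sketch of Wald's classical SPRT between two simple Gaussian hypotheses with known variance. It is not the paper's BAI algorithm: the function name `sprt_gaussian`, its parameters, and the Gaussian model with two fixed candidate means are assumptions chosen purely for illustration.

```python
import math


def sprt_gaussian(samples, mu0, mu1, sigma=1.0, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: mean = mu0 versus H1: mean = mu1 (known sigma).

    Illustrative only; all names here are hypothetical.
    Returns (decision, n_used), where decision is "H0", "H1", or
    "undecided" if the samples run out before a threshold is crossed.
    """
    # Wald's approximate thresholds for target error rates alpha and beta.
    upper = math.log((1.0 - beta) / alpha)   # accept H1 at or above this
    lower = math.log(beta / (1.0 - alpha))   # accept H0 at or below this

    llr = 0.0  # cumulative log-likelihood ratio log(f1 / f0)
    n = 0
    for x in samples:
        n += 1
        # Per-sample log-likelihood ratio for two Gaussians with equal variance.
        llr += (mu1 - mu0) * (x - (mu0 + mu1) / 2.0) / sigma**2
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", n


if __name__ == "__main__":
    # Observations sitting exactly at mu1 push the statistic up by 0.5 each.
    print(sprt_gaussian([1.0] * 100, mu0=0.0, mu1=1.0))
```

The test stops as soon as the accumulated evidence crosses either threshold, so the number of samples used adapts to how well separated the two hypotheses are. That adaptive stopping is the property that makes sequential testing a natural lens for fixed-confidence identification problems.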