5 research outputs found

PAC Identification of Many Good Arms in Stochastic Multi-Armed Bandits

Author: Chaudhuri Arghya Roy
Kalyanakrishnan Shivaram
Publication date: 10/07/2017

We consider the problem of identifying any $k$ out of the best $m$ arms in an $n$-armed stochastic multi-armed bandit. Framed in the PAC setting, this particular problem generalises both the problem of `best subset selection' and that of selecting `one out of the best $m$' arms [arcsk 2017]. In applications such as crowd-sourcing and drug-designing, identifying a single good solution is often not sufficient. Moreover, finding the best subset might be hard due to the presence of many indistinguishably close solutions. Our generalisation of identifying exactly $k$ arms out of the best $m$, where $1 \leq k \leq m$, serves as a more effective alternative. We present a lower bound on the worst-case sample complexity for general $k$, and a fully sequential PAC algorithm, \GLUCB, which is more sample-efficient on easy instances. Also, extending our analysis to infinite-armed bandits, we present a PAC algorithm that is independent of $n$, which identifies an arm from the best $\rho$ fraction of arms using at most an additive poly-log number of samples more than the lower bound, thereby improving over [arcsk 2017] and [Aziz+AKA:2018]. The problem of identifying $k > 1$ distinct arms from the best $\rho$ fraction is not always well-defined; for a special class of this problem, we present lower and upper bounds. Finally, through a reduction, we establish a relation between upper bounds for the `one out of the best $\rho$' problem for infinite instances and the `one out of the best $m$' problem for finite instances. We conjecture that it is more efficient to solve `small' finite instances using the latter formulation, rather than going through the former.

arXiv.org e-Print Archive

opac.isi.ac.id

Indonesian Institute of the Art Yogyakarta

Finding All ε-Good Arms in Stochastic Bandits

Author: Jain Lalit
Mason Blake
Nowak Robert
Tripathy Ardhendu S.
Publication venue: Scholars' Mine
Publication date: 12/12/2020

The pure-exploration problem in stochastic multi-armed bandits aims to find one or more arms with the largest (or near largest) means. Examples include finding an ε-good arm, best-arm identification, top-k arm identification, and finding all arms with means above a specified threshold. However, the problem of finding all ε-good arms has been overlooked in past work, although arguably this may be the most natural objective in many applications. For example, a virologist may conduct preliminary laboratory experiments on a large candidate set of treatments and move all ε-good treatments into more expensive clinical trials. Since the ultimate clinical efficacy is uncertain, it is important to identify all ε-good candidates. Mathematically, the all-ε-good arm identification problem presents significant new challenges and surprises that do not arise in the pure-exploration objectives studied in the past. We introduce two algorithms to overcome these and demonstrate their great empirical performance on a large-scale crowd-sourced dataset of 2.2M ratings collected by the New Yorker Caption Contest as well as a dataset testing hundreds of possible cancer drugs.

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine