8 research outputs found

    Finding All ε-Good Arms in Stochastic Bandits

    The pure-exploration problem in stochastic multi-armed bandits aims to find one or more arms with the largest (or near-largest) means. Examples include finding an ε-good arm, best-arm identification, top-k arm identification, and finding all arms with means above a specified threshold. However, the problem of finding all ε-good arms has been overlooked in past work, although arguably it may be the most natural objective in many applications. For example, a virologist may conduct preliminary laboratory experiments on a large candidate set of treatments and move all ε-good treatments into more expensive clinical trials. Since the ultimate clinical efficacy is uncertain, it is important to identify all ε-good candidates. Mathematically, the all-ε-good arm identification problem presents significant new challenges and surprises that do not arise in the pure-exploration objectives studied in the past. We introduce two algorithms to overcome these and demonstrate their strong empirical performance on a large-scale crowd-sourced dataset of 2.2M ratings collected by the New Yorker Caption Contest, as well as a dataset testing hundreds of possible cancer drugs.
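    To make the objective concrete, below is a minimal elimination-style sketch in Python that certifies each arm as ε-good (mean within ε of the largest mean) or not, using Hoeffding confidence intervals. This is only an illustration of the problem, not either of the paper's algorithms; the `pull` callback, the Bernoulli reward model, and the means in the demo are assumptions made for the example.

```python
# Minimal sketch: classify every arm as "good" (mean >= max mean - eps) or
# "bad" using anytime Hoeffding confidence intervals. Illustrative only; not
# the paper's algorithm. `pull(i)` is an assumed callback returning a reward
# in [0, 1] for arm i.
import math
import random

def find_all_eps_good(pull, n_arms, eps, delta, max_rounds=100_000):
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    good, bad = set(), set()
    for t in range(1, max_rounds + 1):
        # Sample every still-unclassified arm once per round; classified arms
        # keep their last (still valid) confidence intervals.
        for i in range(n_arms):
            if i not in good and i not in bad:
                sums[i] += pull(i)
                counts[i] += 1
        # Anytime Hoeffding radius with a crude union bound over arms/rounds.
        def rad(i):
            return math.sqrt(math.log(4 * n_arms * counts[i] ** 2 / delta)
                             / (2 * counts[i]))
        mean = lambda i: sums[i] / counts[i]
        ucb_max = max(mean(j) + rad(j) for j in range(n_arms))  # >= true max
        lcb_max = max(mean(j) - rad(j) for j in range(n_arms))  # <= true max
        for i in range(n_arms):
            if i in good or i in bad:
                continue
            if mean(i) - rad(i) >= ucb_max - eps:   # certified eps-good
                good.add(i)
            elif mean(i) + rad(i) < lcb_max - eps:  # certified not eps-good
                bad.add(i)
        if len(good) + len(bad) == n_arms:
            break
    return good

# Demo on Bernoulli arms (means are made up for illustration).
mus = [0.9, 0.88, 0.7, 0.6, 0.5]
print(find_all_eps_good(lambda i: float(random.random() < mus[i]),
                        len(mus), eps=0.1, delta=0.05))
```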

    Variance-Dependent Best Arm Identification

    We study the problem of identifying the best arm in a stochastic multi-armed bandit game. Given a set of $n$ arms indexed from $1$ to $n$, each arm $i$ is associated with an unknown reward distribution supported on $[0,1]$ with mean $\theta_i$ and variance $\sigma_i^2$. Assume $\theta_1 > \theta_2 \geq \cdots \geq \theta_n$. We propose an adaptive algorithm which explores the gaps and variances of the rewards of the arms and makes future decisions based on the gathered information, using a novel approach called grouped median elimination. The proposed algorithm guarantees to output the best arm with probability $(1-\delta)$ and uses at most $O\left(\sum_{i=1}^{n}\left(\frac{\sigma_i^2}{\Delta_i^2}+\frac{1}{\Delta_i}\right)\left(\ln\delta^{-1}+\ln\ln\Delta_i^{-1}\right)\right)$ samples, where $\Delta_i$ ($i \geq 2$) denotes the reward gap between arm $i$ and the best arm, and we define $\Delta_1 = \Delta_2$. This achieves a significant advantage over variance-independent algorithms in some favorable scenarios and is the first result that removes the extra $\ln n$ factor on the best arm compared with the state of the art. We further show that $\Omega\left(\sum_{i=1}^{n}\left(\frac{\sigma_i^2}{\Delta_i^2}+\frac{1}{\Delta_i}\right)\ln\delta^{-1}\right)$ samples are necessary for an algorithm to achieve the same goal, thereby illustrating that our algorithm is optimal up to doubly logarithmic terms.
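    For context, here is a minimal Python sketch of classic median elimination (Even-Dar et al., 2002), the variance-independent $(\epsilon,\delta)$-PAC baseline. The paper's grouped median elimination additionally adapts to the per-arm variances $\sigma_i^2$ and is not reproduced here; the `pull` callback and the demo means are assumptions made for the example.

```python
# Minimal sketch of classic median elimination: each round, sample every
# surviving arm enough to estimate its mean to accuracy eps_l/2 w.h.p., then
# drop the worse half. Returns an eps-optimal arm with prob. >= 1 - delta.
import math
import random

def median_elimination(pull, arms, eps, delta):
    arms = list(arms)
    eps_l, delta_l = eps / 4.0, delta / 2.0
    while len(arms) > 1:
        # Samples per arm for an (eps_l/2)-accurate mean with prob. 1-delta_l.
        t = math.ceil(math.log(3.0 / delta_l) / (eps_l / 2.0) ** 2)
        means = {a: sum(pull(a) for _ in range(t)) / t for a in arms}
        # Keep the arms at or above the empirical median.
        arms.sort(key=means.get, reverse=True)
        arms = arms[: (len(arms) + 1) // 2]
        eps_l, delta_l = 0.75 * eps_l, delta_l / 2.0
    return arms[0]

# Demo on Bernoulli arms with made-up means.
mus = [0.2, 0.3, 0.5, 0.72, 0.7]
best = median_elimination(lambda a: float(random.random() < mus[a]),
                          range(len(mus)), eps=0.1, delta=0.05)
print("selected arm:", best)
```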

    Searching for structure in complex data: a modern statistical quest

    Current research in statistics has taken interesting new directions as the data collected in scientific studies have become increasingly complex. At first glance, it seems that the number of experiments a scientist conducts must be fairly large for a statistician to draw correct conclusions from noisy measurements of a large number of factors. However, statisticians can often uncover simpler structure in the data, enabling accurate statistical inference based on relatively few experiments. In this snapshot, we will introduce the concept of high-dimensional statistical estimation via optimization and illustrate this principle using an example from medical imaging. We will also present several open questions that are actively being studied by researchers in statistics.
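    A standard instance of this principle is sparse recovery with the lasso, where an underdetermined linear model is fit by minimizing $\frac{1}{2}\|y - Xw\|_2^2 + \lambda\|w\|_1$. The Python sketch below solves it with ISTA (proximal gradient). This is a generic illustration of high-dimensional estimation via optimization, not the snapshot's specific medical-imaging example; the dimensions and $\lambda$ are made up for the demo.

```python
# Sparse recovery via the lasso, solved by ISTA (proximal gradient descent):
# fewer measurements (n=50) than unknowns (p=200), but the true signal is
# sparse, so accurate estimation is still possible.
import numpy as np

def ista_lasso(X, y, lam, n_iter=500):
    # Step size 1/L, where L bounds the Lipschitz constant of the gradient.
    L = np.linalg.norm(X, ord=2) ** 2
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)              # gradient of the smooth part
        z = w - grad / L                      # gradient step
        w = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return w

# Demo: noisy measurements of a high-dimensional but 5-sparse signal.
rng = np.random.default_rng(0)
n, p, k = 50, 200, 5
X = rng.normal(size=(n, p)) / np.sqrt(n)
w_true = np.zeros(p)
w_true[rng.choice(p, k, replace=False)] = 3.0 * rng.normal(size=k)
y = X @ w_true + 0.01 * rng.normal(size=n)
w_hat = ista_lasso(X, y, lam=0.05)
print("recovery error:", np.linalg.norm(w_hat - w_true))
```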