We introduce a variant of the classical PAC multi-armed bandit problem. There
is an ordered set of n arms A[1],…,A[n], each with some stochastic
reward drawn from some unknown bounded distribution. The goal is to identify
the skyline of the set A, consisting of all arms A[i] such that A[i]
has larger expected reward than all lower-numbered arms A[1],…,A[i−1]. We
define a natural notion of an ε-approximate skyline and prove
matching upper and lower bounds for identifying an ε-skyline.
Specifically, we show that in order to identify an ε-skyline from
among n arms with probability 1−δ, Θ(ε2n⋅min{log(εδ1),log(δn)}) samples are necessary and sufficient. When ε≫1/n, our results improve over the naive algorithm, which draws enough samples
to approximate the expected reward of every arm; the algorithm of (Auer et al.,
AISTATS'16) for Pareto-optimal arm identification is likewise superseded. Our
results show that the sample complexity of the skyline problem lies strictly in
between that of best arm identification (Even-Dar et al., COLT'02) and that of
approximating the expected reward of every arm.Comment: 18 pages, 2 Figures; an ALT'18/ISIT'18 submissio