Decoy Bandits Dueling on a Poset
We address the problem of dueling bandits defined on partially ordered sets,
or posets. In this setting, arms may not be comparable, and there may be
several (incomparable) optimal arms. We propose an algorithm, UnchainedBandits,
that efficiently finds the set of optimal arms of any poset even when pairs of
comparable arms cannot be distinguished from pairs of incomparable arms, with a
set of minimal assumptions. This algorithm relies on the concept of decoys,
which stems from social psychology. For the easier case where the
incomparability information may be accessible, we propose a second algorithm,
SlicingBandits, which takes advantage of this information and achieves a
substantial performance gain over UnchainedBandits. We provide
theoretical guarantees and an experimental evaluation for both algorithms.
Bandits Dueling on Partially Ordered Sets
We address the problem of dueling bandits defined on partially ordered sets, or posets. In this setting, arms may not be comparable, and there may be several (incomparable) optimal arms. We propose an algorithm, UnchainedBandits, that efficiently finds the set of optimal arms (the Pareto front) of any poset, even when pairs of comparable arms cannot be distinguished a priori from pairs of incomparable arms, under a minimal set of assumptions. This means that UnchainedBandits does not require information about comparability and can be used with limited knowledge of the poset. To achieve this, the algorithm relies on the concept of decoys, which stems from social psychology. We also provide theoretical guarantees on both the regret incurred and the number of comparisons required by UnchainedBandits, and we report compelling empirical results.
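To make the notion of a Pareto front concrete, here is a minimal Python sketch that computes the maximal (non-dominated) elements of a small poset. It is not UnchainedBandits: it assumes the order relation is fully known and noise-free, whereas the abstract concerns learning the front from noisy pairwise duels. The arm names, score vectors, and the helpers `pareto_front` and `dominates` are hypothetical and chosen only for illustration.

```python
def pareto_front(arms, dominates):
    """Return the arms that no other arm strictly dominates."""
    return [a for a in arms
            if not any(dominates(b, a) for b in arms if b != a)]


# Hypothetical arms, each described by a two-dimensional score vector.
# The coordinate-wise order on these vectors is a partial order.
scores = {"a": (3, 1), "b": (1, 3), "c": (1, 1), "d": (2, 0)}


def dominates(x, y):
    """x strictly dominates y: at least as good on every coordinate, and not equal."""
    vx, vy = scores[x], scores[y]
    return all(p >= q for p, q in zip(vx, vy)) and vx != vy


front = pareto_front(list(scores), dominates)
print(front)  # arms "a" and "b" are incomparable and both optimal
```

In this toy poset, "a" and "b" are incomparable with each other, so both belong to the front, while "c" and "d" are each dominated by at least one other arm.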