    Mistake bounds on the noise-free multi-armed bandit game

    We study the {0, 1}-loss version of the adaptive adversarial multi-armed bandit problem with alpha (>= 1) lossless arms. For this problem, we show a tight bound of K - alpha - Theta(1/T) on the minimax expected number of mistakes (1-losses), where K is the number of arms and T is the number of rounds. (C) 2019 Elsevier Inc. All rights reserved.
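As a rough illustration of where the leading K - alpha term comes from, the sketch below plays a simple deterministic elimination strategy: keep pulling an arm until it incurs a loss, then discard it. Because a lossless arm never incurs a loss, each mistake discards one of the K - alpha other arms, so the strategy makes at most K - alpha mistakes. This is not taken from the paper. The function name `play_elimination`, the fixed `lossless` set, and the adversary that always charges a loss on every other arm are assumptions made for the example, and the sketch says nothing about the Theta(1/T) term.

```python
def play_elimination(K, lossless, T):
    """Play T rounds of a {0, 1}-loss bandit game; return the mistake count.

    Hypothetical setup (not from the paper): `lossless` is a fixed, non-empty
    set of arm indices that never incur a loss, and the adversary charges a
    loss of 1 whenever any other arm is pulled.
    """
    alive = list(range(K))  # arms not yet observed to incur a loss
    mistakes = 0
    for _ in range(T):
        arm = alive[0]  # pull the first arm that has not been discarded
        loss = 0 if arm in lossless else 1  # assumed adversary
        if loss == 1:
            mistakes += 1
            alive.remove(arm)  # an arm seen to lose cannot be lossless
    return mistakes


if __name__ == "__main__":
    # K = 5 arms, alpha = 1 lossless arm placed last: the worst ordering.
    print(play_elimination(K=5, lossless={4}, T=100))
```

With the lossless arm placed last, the strategy discards every other arm first, so the count equals K - alpha. Any other placement gives fewer mistakes.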