Efficient and Robust Algorithms for Adversarial Linear Contextual Bandits
We consider an adversarial variant of the classic $K$-armed linear contextual
bandit problem where the sequence of loss functions associated with each arm
is allowed to change without restriction over time. Under the assumption that
the $d$-dimensional contexts are generated i.i.d. from a known
distribution, we develop computationally efficient algorithms based on the
classic Exp3 algorithm. Our first algorithm, RealLinExp3, is shown to achieve a
regret guarantee of $\widetilde{O}(\sqrt{KdT})$ over $T$ rounds, which matches
the best available bound for this problem. Our second algorithm, RobustLinExp3,
is shown to be robust to misspecification, in that it achieves a regret bound
of $\widetilde{O}((Kd)^{1/3}T^{2/3}) + \varepsilon \sqrt{d} T$ if the true
reward function is linear up to an additive nonlinear error uniformly bounded
in absolute value by $\varepsilon$. To our knowledge, our performance
guarantees constitute the very first results on this problem setting.