994 research outputs found
A Robust Phased Elimination Algorithm for Corruption-Tolerant Gaussian Process Bandits
We consider the sequential optimization of an unknown, continuous, and
expensive to evaluate reward function, from noisy and adversarially corrupted
observed rewards. When the corruption attacks are subject to a suitable budget
and the function lives in a Reproducing Kernel Hilbert Space (RKHS), the
problem can be posed as corrupted Gaussian process (GP) bandit optimization. We
propose a novel robust elimination-type algorithm that runs in epochs, combines
exploration with infrequent switching to select a small subset of actions, and
plays each action for multiple time instants. Our algorithm, Robust GP Phased
Elimination (RGP-PE), successfully balances robustness to corruptions with
exploration and exploitation such that its performance degrades minimally in
the presence (or absence) of adversarial corruptions. When T is the number of
samples and γ_T is the maximal information gain, the
corruption-dependent term in our regret bound is O(C γ_T^{3/2}), which
is significantly tighter than the existing O(C √(T γ_T)) for several
commonly-considered kernels. We perform the first empirical study of robustness
in the corrupted GP bandit setting, and show that our algorithm is robust
against a variety of adversarial attacks.
Comment: Added reference
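The epoch-based structure described above can be illustrated with a minimal sketch. Everything here is illustrative, not the paper's RGP-PE: sample-mean Hoeffding-style confidence widths stand in for the GP posterior, and the function name, noise level, and schedule are assumptions.

```python
import numpy as np

def robust_phased_elimination(f, actions, epochs=6, reps=8, rng=None):
    # Simplified sketch of epoch-based phased elimination (not the paper's
    # RGP-PE): play every surviving action repeatedly, average the noisy
    # rewards, and eliminate actions whose upper confidence bound falls
    # below the best lower confidence bound. Sample-mean confidence widths
    # stand in for the GP posterior used in the paper.
    rng = rng or np.random.default_rng(0)
    alive = list(actions)
    for e in range(epochs):
        n = reps * 2 ** e                        # more pulls per action in later epochs
        means = {a: np.mean([f(a) + rng.normal(0, 0.1) for _ in range(n)])
                 for a in alive}
        width = 1.0 / np.sqrt(n)                 # shrinking confidence width
        best_lcb = max(m - width for m in means.values())
        alive = [a for a in alive if means[a] + width >= best_lcb]
    return max(alive, key=lambda a: means[a])
```

Playing each surviving action many times per epoch is what buys robustness: a corruption budget spread over many repeated pulls of the same action moves its empirical mean only slightly.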
Stochastic Linear Bandits Robust to Adversarial Attacks
We consider a stochastic linear bandit problem in which the rewards are not
only subject to random noise, but also adversarial attacks subject to a
suitable budget C (i.e., an upper bound on the sum of corruption magnitudes
across the time horizon). We provide two variants of a Robust Phased
Elimination algorithm, one that knows C and one that does not. Both variants
are shown to attain near-optimal regret in the non-corrupted case (C = 0),
while incurring additional additive terms respectively having a linear and
quadratic dependency on C in general. We present algorithm-independent lower
bounds showing that these additive terms are near-optimal. In addition, in a
contextual setting, we revisit a setup of diverse contexts, and show that a
simple greedy algorithm is provably robust with a near-optimal additive regret
term, despite performing no explicit exploration and not knowing C.
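The greedy result for diverse contexts can be sketched in a toy simulation. The setup below (arm count, noise level, corruption schedule, the name `greedy_contextual`) is assumed for illustration and is not the paper's exact model:

```python
import numpy as np

def greedy_contextual(theta_star, T=2000, budget=20.0, rng=None):
    # Sketch of the greedy rule in a diverse-contexts setting (illustrative,
    # not the paper's exact setup): fit least squares on all past data and
    # play the arm with the highest estimated reward -- no explicit
    # exploration. An adversary spends a total corruption budget on the
    # earliest rounds; diversity of the random contexts lets the ridge
    # estimate recover anyway.
    rng = rng or np.random.default_rng(1)
    d = theta_star.size
    V, b = np.eye(d), np.zeros(d)                # ridge statistics (lambda = 1)
    spent, regret = 0.0, 0.0
    for _ in range(T):
        arms = rng.normal(size=(5, d))           # fresh, diverse random contexts
        theta_hat = np.linalg.solve(V, b)        # least-squares estimate
        x = arms[int(np.argmax(arms @ theta_hat))]
        r = x @ theta_star + rng.normal(0, 0.1)
        if spent < budget:                       # corrupted early rounds
            r -= 1.0
            spent += 1.0
        V += np.outer(x, x)
        b += r * x
        regret += (arms @ theta_star).max() - x @ theta_star
    return regret / T                            # average per-round regret
```

The corruption biases the estimate only transiently: the bias it injects is bounded by the budget, while the design matrix V keeps growing, so the per-round regret stays small on average.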
Bias-Robust Bayesian Optimization via Dueling Bandits
We consider Bayesian optimization in settings where observations can be
adversarially biased, for example by an uncontrolled hidden confounder. Our
first contribution is a reduction of the confounded setting to the dueling
bandit model. Then we propose a novel approach for dueling bandits based on
information-directed sampling (IDS). Thereby, we obtain the first efficient
kernelized algorithm for dueling bandits that comes with cumulative regret
guarantees. Our analysis further generalizes a previously proposed
semi-parametric linear bandit model to non-linear reward functions, and
uncovers interesting links to doubly-robust estimation.
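The reduction from confounded observations to duels can be sketched as follows. This is a bare-bones illustration of the cancellation idea, not the paper's kernelized IDS algorithm; the confounder model, noise level, and the name `bias_robust_argmax` are assumptions:

```python
import numpy as np

def bias_robust_argmax(f, actions, reps=200, noise=0.05, rng=None):
    # Minimal sketch of a confounded-to-dueling reduction (not the paper's
    # IDS algorithm): within a round, the hidden confounder adds the same
    # bias to both evaluations, so the observed difference is an unbiased
    # estimate of f(x) - f(ref) -- the bias cancels exactly.
    rng = rng or np.random.default_rng(2)
    ref = actions[0]                             # fixed reference arm
    scores = []
    for x in actions:
        diffs = []
        for t in range(reps):
            bias = 10.0 * np.sin(t)              # arbitrary hidden confounder
            y    = f(x)   + bias + rng.normal(0, noise)
            yref = f(ref) + bias + rng.normal(0, noise)
            diffs.append(y - yref)               # confounder drops out
        scores.append(np.mean(diffs))
    return actions[int(np.argmax(scores))]
```

Note that the bias here is large (amplitude 10) relative to the signal, yet the pairwise differences remain informative, which is the point of moving to the dueling feedback model.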
Contextual Search in the Presence of Irrational Agents
We study contextual search, a generalization of binary search in higher
dimensions, which captures settings such as feature-based dynamic pricing.
Standard game-theoretic formulations of this problem assume that agents act in
accordance with a specific behavioral model. In practice, however, some agents
may not subscribe to the dominant behavioral model or may act in ways that are
seemingly arbitrarily irrational. Existing algorithms heavily depend on the
behavioral model being (approximately) accurate for all agents and have poor
performance in the presence of even a few such arbitrarily irrational agents.
We initiate the study of contextual search when some of the agents can behave
in ways inconsistent with the underlying behavioral model. In particular, we
provide two algorithms, one built on robustifying multidimensional binary
search methods and one on translating the setting to a proxy setting
appropriate for gradient descent. Our techniques draw inspiration from learning
theory, game theory, high-dimensional geometry, and convex analysis.
Comment: Compared to the first version titled "Corrupted Multidimensional
Binary Search: Learning in the Presence of Irrational Agents", this version
provides a broader scope of behavioral models of irrationality, specifies how
the results apply to different loss functions, and discusses the power and
limitations of additional algorithmic approaches.
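The gradient-descent flavor of the approach can be illustrated in one dimension. This sketch (the function name, step schedule, and irrationality pattern are all assumptions) contrasts with plain binary search, which a single lie can derail:

```python
def robust_price_search(v, T=500, eta=0.05):
    # 1-D sketch of the gradient-style approach to robust contextual
    # search (illustrative, not the paper's algorithm): instead of halving
    # the search interval on each yes/no answer -- where one irrational
    # response discards the half containing the true value forever --
    # take a small bounded step, so each arbitrary answer shifts the
    # estimate by only O(eta).
    p = 0.5                                      # initial price estimate
    for t in range(T):
        buy = v >= p                             # rational agent response
        if t % 50 == 0:
            buy = not buy                        # arbitrarily irrational agent
        p += eta if buy else -eta                # bounded-step update
        p = min(max(p, 0.0), 1.0)
        eta *= 0.995                             # slowly shrinking step size
    return p
```

The shrinking step size trades convergence speed for robustness: the occasional irrational answer perturbs the estimate by at most the current step, rather than corrupting the entire remaining search.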
- …