
    A Robust Phased Elimination Algorithm for Corruption-Tolerant Gaussian Process Bandits

    We consider the sequential optimization of an unknown, continuous, and expensive-to-evaluate reward function from noisy and adversarially corrupted observed rewards. When the corruption attacks are subject to a suitable budget $C$ and the function lives in a Reproducing Kernel Hilbert Space (RKHS), the problem can be posed as corrupted Gaussian process (GP) bandit optimization. We propose a novel robust elimination-type algorithm that runs in epochs, combines exploration with infrequent switching to select a small subset of actions, and plays each action for multiple time instants. Our algorithm, Robust GP Phased Elimination (RGP-PE), successfully balances robustness to corruptions with exploration and exploitation such that its performance degrades minimally in the presence (or absence) of adversarial corruptions. When $T$ is the number of samples and $\gamma_T$ is the maximal information gain, the corruption-dependent term in our regret bound is $O(C \gamma_T^{3/2})$, which is significantly tighter than the existing $O(C \sqrt{T \gamma_T})$ for several commonly considered kernels. We perform the first empirical study of robustness in the corrupted GP bandit setting, and show that our algorithm is robust against a variety of adversarial attacks.
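
    To make the phased-elimination template concrete (epochs, a shrinking active set, repeated plays of each surviving action), here is a minimal Python sketch. It is not the authors' RGP-PE: it assumes a finite action set and an RBF kernel, the confidence scale beta and the plays-per-action count reps are illustrative placeholders, and the paper's corruption-robust aggregation and exact confidence widths are omitted.

        # Minimal phased-elimination sketch for GP bandits on a finite action set.
        # Illustrative only; beta (confidence scale) and reps (plays per action
        # per epoch) are hypothetical parameters, not the paper's choices.
        import numpy as np

        def rbf_kernel(A, B, ls=0.2):
            d = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
            return np.exp(-d / (2 * ls**2))

        def gp_posterior(X, y, Xq, noise=0.1):
            # Standard GP regression posterior mean and std at query points Xq.
            K = rbf_kernel(X, X) + noise**2 * np.eye(len(X))
            L = np.linalg.cholesky(K)
            Ks = rbf_kernel(X, Xq)
            alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
            v = np.linalg.solve(L, Ks)
            var = np.diag(rbf_kernel(Xq, Xq)) - np.sum(v**2, 0)
            return Ks.T @ alpha, np.sqrt(np.maximum(var, 1e-12))

        def phased_elimination(actions, pull, epochs=5, reps=10, beta=2.0):
            active = np.arange(len(actions))        # arms still in play
            X, y = [], []
            for _ in range(epochs):
                # Play every active action reps times; averaging repeated plays
                # is what dilutes a bounded corruption budget within an epoch.
                for i in active:
                    obs = [pull(actions[i]) for _ in range(reps)]
                    X.append(actions[i]); y.append(np.mean(obs))
                mu, sd = gp_posterior(np.array(X), np.array(y), actions[active])
                lcb_max = (mu - beta * sd).max()
                active = active[mu + beta * sd >= lcb_max]  # drop suboptimal arms
            return actions[active]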

    Stochastic Linear Bandits Robust to Adversarial Attacks

    We consider a stochastic linear bandit problem in which the rewards are not only subject to random noise, but also to adversarial attacks subject to a suitable budget $C$ (i.e., an upper bound on the sum of corruption magnitudes across the time horizon). We provide two variants of a Robust Phased Elimination algorithm, one that knows $C$ and one that does not. Both variants are shown to attain near-optimal regret in the non-corrupted case $C = 0$, while incurring additional additive terms respectively having a linear and quadratic dependency on $C$ in general. We present algorithm-independent lower bounds showing that these additive terms are near-optimal. In addition, in a contextual setting, we revisit a setup of diverse contexts, and show that a simple greedy algorithm is provably robust with a near-optimal additive regret term, despite performing no explicit exploration and not knowing $C$.
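
    The elimination step underlying both variants can be pictured with a short sketch: a regularized least-squares estimate of the reward parameter, per-arm confidence widths, and removal of arms whose upper confidence bound falls below the best lower bound. The constants width and reg below are illustrative placeholders, not the paper's C-dependent confidence radii.

        # One elimination step for stochastic linear bandits (illustrative;
        # width and reg are hypothetical, not the paper's C-aware radii).
        import numpy as np

        def eliminate(arms, X, y, width=1.0, reg=1.0):
            # Regularized least-squares estimate of the unknown parameter.
            V = X.T @ X + reg * np.eye(X.shape[1])
            theta = np.linalg.solve(V, X.T @ y)
            est = arms @ theta
            # Per-arm uncertainty ||a||_{V^{-1}}, the usual linear-bandit width.
            Vinv = np.linalg.inv(V)
            unc = np.sqrt(np.einsum('ij,jk,ik->i', arms, Vinv, arms))
            # Keep only arms that are still plausibly optimal.
            return arms[est + width * unc >= (est - width * unc).max()]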

    Bias-Robust Bayesian Optimization via Dueling Bandits

    We consider Bayesian optimization in settings where observations can be adversarially biased, for example by an uncontrolled hidden confounder. Our first contribution is a reduction of the confounded setting to the dueling bandit model. We then propose a novel approach for dueling bandits based on information-directed sampling (IDS), thereby obtaining the first efficient kernelized algorithm for dueling bandits that comes with cumulative regret guarantees. Our analysis further generalizes a previously proposed semi-parametric linear bandit model to non-linear reward functions, and uncovers interesting links to doubly robust estimation.
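
    The IDS principle the abstract invokes trades off squared expected regret against information gain. The following toy sketch applies that trade-off to pairs of arms using posterior samples of the utilities; the information proxy is a crude stand-in for the mutual-information quantity used in IDS, and nothing here is kernelized.

        # Toy information-directed sampling (IDS) rule for dueling bandits:
        # choose the pair minimizing squared expected regret divided by an
        # information proxy (posterior mass on either arm being optimal).
        import numpy as np

        def ids_pair(util_samples):
            # util_samples: (n_samples, n_arms) posterior draws of arm utilities.
            n = util_samples.shape[1]
            p_best = np.bincount(util_samples.argmax(1), minlength=n)
            p_best = p_best / len(util_samples)
            mean = util_samples.mean(0)
            star = util_samples.max(1).mean()            # E[utility of best arm]
            best_pair, best_ratio = None, np.inf
            for i in range(n):
                for j in range(i + 1, n):
                    regret = 2 * star - mean[i] - mean[j]  # dueling regret of (i, j)
                    gain = p_best[i] + p_best[j] + 1e-9    # crude info proxy
                    if regret**2 / gain < best_ratio:
                        best_pair, best_ratio = (i, j), regret**2 / gain
            return best_pair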

    Contextual Search in the Presence of Irrational Agents

    We study contextual search, a generalization of binary search in higher dimensions, which captures settings such as feature-based dynamic pricing. Standard game-theoretic formulations of this problem assume that agents act in accordance with a specific behavioral model. In practice, however, some agents may not subscribe to the dominant behavioral model, or may act in ways that are seemingly arbitrarily irrational. Existing algorithms depend heavily on the behavioral model being (approximately) accurate for all agents, and perform poorly in the presence of even a few such arbitrarily irrational agents. We initiate the study of contextual search when some of the agents can behave in ways inconsistent with the underlying behavioral model. In particular, we provide two algorithms, one built on robustifying multidimensional binary search methods and one on translating the setting to a proxy setting appropriate for gradient descent. Our techniques draw inspiration from learning theory, game theory, high-dimensional geometry, and convex analysis.
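
    A minimal sketch of the gradient-descent flavor, in the feature-based pricing setting: quote a price from the current estimate, observe only the binary buy/no-buy outcome, and take a subgradient step on a proxy loss. The step size eta and the simple proxy update are illustrative assumptions, not the paper's loss construction (which must additionally withstand irrational agents).

        # Sketch of contextual search via gradient descent for feature-based
        # pricing. Illustrative only: eta and the proxy update are assumptions.
        import numpy as np

        def contextual_pricing(contexts, buys_at, eta=0.05):
            theta = np.zeros(contexts.shape[1])
            revenue = 0.0
            for x in contexts:
                price = theta @ x               # quoted price for context x
                sold = buys_at(x, price)        # binary feedback only
                revenue += price if sold else 0.0
                # Push the estimate up after a sale (price likely too low),
                # down after a rejection (price likely too high).
                theta += eta * x if sold else -eta * x
            return theta, revenue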