3 research outputs found
Bias-Robust Bayesian Optimization via Dueling Bandits
We consider Bayesian optimization in settings where observations can be
adversarially biased, for example by an uncontrolled hidden confounder. Our
first contribution is a reduction of the confounded setting to the dueling
bandit model. Then we propose a novel approach for dueling bandits based on
information-directed sampling (IDS). Thereby, we obtain the first efficient
kernelized algorithm for dueling bandits that comes with cumulative regret
guarantees. Our analysis further generalizes a previously proposed
semi-parametric linear bandit model to non-linear reward functions, and
uncovers interesting links to doubly-robust estimation.
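One way to read the reduction is sketched below. It is an illustrative assumption, not the paper's construction: if two points are evaluated under the same additive bias, the difference of their observations no longer depends on that bias and behaves like dueling (pairwise comparison) feedback. The reward function `f`, the shared-bias model, and the helper names are all hypothetical.

```python
import random


def f(x):
    # Hypothetical reward function, chosen only for illustration.
    return -(x - 0.3) ** 2


def biased_pair(x1, x2, bias, noise_std, rng):
    # Assumption: both evaluations share the same (possibly adversarial) bias.
    y1 = f(x1) + bias + rng.gauss(0.0, noise_std)
    y2 = f(x2) + bias + rng.gauss(0.0, noise_std)
    return y1, y2


def duel_feedback(x1, x2, bias, noise_std, rng):
    # The shared bias cancels in the difference, leaving f(x1) - f(x2) + noise.
    y1, y2 = biased_pair(x1, x2, bias, noise_std, rng)
    return y1 - y2


rng = random.Random(0)
small_bias = duel_feedback(0.2, 0.9, bias=0.0, noise_std=0.0, rng=rng)
large_bias = duel_feedback(0.2, 0.9, bias=1000.0, noise_std=0.0, rng=rng)
```

With the noise switched off, `small_bias` and `large_bias` agree up to floating-point error, which is the sense in which the pairwise signal is robust to the bias under this assumed model.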
Procrastinated Tree Search: Black-box Optimization with Delayed, Noisy, and Multi-fidelity Feedback
In black-box optimization problems, we aim to maximize an unknown objective
function, where the function is only accessible through feedback from an
evaluation or simulation oracle. In real life, the feedback from such oracles
is often noisy and available only after some unknown delay that may depend on
the computation time of the oracle. Additionally, if exact evaluations are
expensive but coarse approximations are available at a lower cost, the
feedback can be multi-fidelity. To address this problem, we propose
a generic extension of hierarchical optimistic tree search (HOO), called
ProCrastinated Tree Search (PCTS), that flexibly accommodates a delay- and
noise-tolerant bandit algorithm. We provide a generic proof technique to
quantify the regret of PCTS under delayed, noisy, and multi-fidelity feedback.
Specifically, we derive regret bounds of PCTS enabled with delayed-UCB1 (DUCB1)
and delayed-UCB-V (DUCBV) algorithms. Given a time horizon, PCTS retains the
regret bound of non-delayed HOO when the expected delay is sufficiently small,
and the bound worsens for larger expected delays.
We experimentally validate on multiple synthetic functions
and hyperparameter tuning problems that PCTS outperforms state-of-the-art
black-box optimization methods for feedback with different noise levels,
delays, and fidelities.
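The abstract does not spell out DUCB1, so the following is only a minimal sketch of one common way to adapt UCB1 to delayed feedback: the index of each arm is computed from the rewards that have already arrived, not from the number of pulls. The constant delay, the Bernoulli arms, and the function name `delayed_ucb1` are assumptions made for illustration. PCTS itself applies such a bandit rule inside a tree search, which this sketch omits.

```python
import math
import random
from collections import deque


def delayed_ucb1(arm_means, horizon, delay, seed=0):
    """Play Bernoulli arms; each reward is revealed `delay` rounds after the pull."""
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k    # number of rewards already observed per arm
    sums = [0.0] * k    # sum of observed rewards per arm
    pulls = [0] * k     # number of pulls per arm, observed or not
    pending = deque()   # (arrival_round, arm, reward), FIFO since delay is constant

    for t in range(1, horizon + 1):
        # Deliver every reward whose delay has elapsed.
        while pending and pending[0][0] <= t:
            _, arm, reward = pending.popleft()
            counts[arm] += 1
            sums[arm] += reward

        unobserved = [a for a in range(k) if counts[a] == 0]
        if unobserved:
            # No feedback yet for some arm: pull the least-pulled such arm.
            arm = min(unobserved, key=lambda a: pulls[a])
        else:
            observed = sum(counts)
            # UCB1 index built from observed feedback only.
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(observed) / counts[a]),
            )

        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        pending.append((t + delay, arm, reward))
        pulls[arm] += 1

    return pulls


pulls = delayed_ucb1([0.2, 0.8], horizon=2000, delay=5)
```

In this toy run the arm with the higher mean should receive most of the pulls. Increasing `delay` postpones the point at which the indices become informative, which is the effect the regret analysis quantifies.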