Information Directed Sampling and Bandits with Heteroscedastic Noise
In the stochastic bandit problem, the goal is to maximize an unknown function
via a sequence of noisy evaluations. Typically, the observation noise is
assumed to be independent of the evaluation point and to satisfy a tail bound
uniformly on the domain, a restrictive assumption for many applications. In
this work, we consider bandits with heteroscedastic noise, where we explicitly
allow the noise distribution to depend on the evaluation point. We show that
this leads to new trade-offs for information and regret, which are not taken
into account by existing approaches such as upper confidence bound (UCB)
algorithms or Thompson Sampling. To address these shortcomings, we introduce a
frequentist regret analysis framework that is similar to the Bayesian
framework of Russo and Van Roy (2014), and we prove a new high-probability
regret bound for general, possibly randomized policies, which depends on a
quantity we refer to as regret-information ratio. From this bound, we define a
frequentist version of Information Directed Sampling (IDS) to minimize the
regret-information ratio over all possible action sampling distributions. This
further relies on concentration inequalities for online least squares
regression in separable Hilbert spaces, which we generalize to the case of
heteroscedastic noise. We then formulate several variants of IDS for linear and
reproducing kernel Hilbert space response functions, yielding novel algorithms
for Bayesian optimization. We also prove frequentist regret bounds, which in
the homoscedastic case recover known bounds for UCB, but can be much better
when the noise is heteroscedastic. Empirically, we demonstrate in a linear
setting with heteroscedastic noise that some of our methods can outperform UCB
and Thompson Sampling, while staying competitive when the noise is
homoscedastic.
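To illustrate the selection rule, here is a minimal sketch of IDS over a finite action set, assuming per-action regret estimates delta and information-gain estimates info are already available (e.g., from confidence bounds); the function name and the grid search are our own. It uses the fact, due to Russo and Van Roy (2014), that the minimizing distribution is supported on at most two actions.

    import numpy as np

    def ids_distribution(delta, info, grid=1001):
        # Minimize (expected regret)^2 / (expected information gain) over
        # action distributions. The minimizer mixes at most two actions,
        # so a search over pairs and mixing weights suffices.
        n = len(delta)
        best_ratio, best_p = np.inf, None
        qs = np.linspace(0.0, 1.0, grid)
        for i in range(n):
            for j in range(n):
                d = qs * delta[i] + (1.0 - qs) * delta[j]   # mixed regret
                g = qs * info[i] + (1.0 - qs) * info[j]     # mixed info gain
                ratio = np.full_like(d, np.inf)
                np.divide(d ** 2, g, out=ratio, where=g > 0)
                k = int(np.argmin(ratio))
                if ratio[k] < best_ratio:
                    best_ratio = ratio[k]
                    best_p = np.zeros(n)
                    best_p[i] += qs[k]
                    best_p[j] += 1.0 - qs[k]
        return best_p, best_ratio

The heteroscedastic setting would enter through the estimates themselves: both delta and info would be computed from noise-adaptive confidence sets.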
Bias-Robust Bayesian Optimization via Dueling Bandits
We consider Bayesian optimization in settings where observations can be
adversarially biased, for example by an uncontrolled hidden confounder. Our
first contribution is a reduction of the confounded setting to the dueling
bandit model. Then we propose a novel approach for dueling bandits based on
information-directed sampling (IDS). Thereby, we obtain the first efficient
kernelized algorithm for dueling bandits that comes with cumulative regret
guarantees. Our analysis further generalizes a previously proposed
semi-parametric linear bandit model to non-linear reward functions, and
uncovers interesting links to doubly-robust estimation.
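As a sketch of how such a reduction can work (our reading; evaluate is a hypothetical oracle, and the additive, per-round bias model is an assumption of this sketch): if both points in a round are observed under the same bias, differencing cancels the confounder and yields exactly the relative feedback a dueling-bandit method consumes.

    def duel(evaluate, x, x_prime):
        # Assumed observation model: evaluate(x) returns f(x) + b_t + noise,
        # with the bias b_t shared by both queries of the round.
        y = evaluate(x)
        y_prime = evaluate(x_prime)
        return y - y_prime  # noisy but bias-free estimate of f(x) - f(x_prime)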
Stochastic Bandits with Context Distributions
We introduce a stochastic contextual bandit model where at each time step the
environment chooses a distribution over a context set and samples the context
from this distribution. The learner observes only the context distribution
while the exact context realization remains hidden. This allows for a broad
range of applications where the context is stochastic or when the learner needs
to predict the context. We adapt the UCB algorithm to this setting and show
that it achieves an order-optimal high-probability bound on the cumulative
regret for linear and kernelized reward functions. Our results strictly
generalize previous work in the sense that both our model and the algorithm
reduce to the standard setting when the environment chooses only Dirac delta
distributions and therefore provides the exact context to the learner. We
further analyze a variant where the learner observes the realized context after
choosing the action. Finally, we demonstrate the proposed method on synthetic
and real-world datasets.
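As a sketch of the mechanism for a finite context set and linear rewards (names are ours): since only the distribution is revealed, the algorithm forms the expected feature vector under that distribution and applies the standard linear-UCB rule to it.

    import numpy as np

    def select_action(phi, mu, theta_hat, V_inv, beta):
        # phi: (n_actions, n_contexts, d) feature tensor phi(x, c)
        # mu:  (n_contexts,) context distribution announced this round
        psi = np.tensordot(phi, mu, axes=([1], [0]))   # E_{c~mu}[phi(x, c)]
        mean = psi @ theta_hat                         # plug-in reward estimate
        width = beta * np.sqrt(np.einsum('id,de,ie->i', psi, V_inv, psi))
        return int(np.argmax(mean + width))            # optimistic choice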
Linear Partial Monitoring for Sequential Decision-Making: Algorithms, Regret Bounds and Applications
Partial monitoring is an expressive framework for sequential decision-making
with an abundance of applications, including graph-structured and dueling
bandits, dynamic pricing and transductive feedback models. We survey and extend
recent results on the linear formulation of partial monitoring that naturally
generalizes the standard linear bandit setting. The main result is that a
single algorithm, information-directed sampling (IDS), is (nearly) worst-case
rate optimal in all finite-action games. We present a simple and unified
analysis of stochastic partial monitoring, and further extend the model to the
contextual and kernelized settings.
Information Directed Sampling for Linear Partial Monitoring
Partial monitoring is a rich framework for sequential decision making under
uncertainty that generalizes many well known bandit models, including linear,
combinatorial and dueling bandits. We introduce information directed sampling
(IDS) for stochastic partial monitoring with a linear reward and observation
structure. IDS achieves adaptive worst-case regret rates that depend on precise
observability conditions of the game. Moreover, we prove lower bounds that
classify the minimax regret of all finite games into four possible regimes. IDS
achieves the optimal rate in all cases up to logarithmic factors, without
tuning any hyper-parameters. We further extend our results to the contextual
and the kernelized setting, which significantly increases the range of possible
applications.
Distributionally Robust Bayesian Optimization
Robustness to distributional shift is one of the key challenges of
contemporary machine learning. Attaining such robustness is the goal of
distributionally robust optimization, which seeks a solution to an optimization
problem that is worst-case robust under a specified distributional shift of an
uncontrolled covariate. In this paper, we study such a problem when the
distributional shift is measured via the maximum mean discrepancy (MMD). For
the setting of zeroth-order, noisy optimization, we present a novel
distributionally robust Bayesian optimization algorithm (DRBO). Our algorithm
provably obtains sub-linear robust regret in various settings that differ in
how the uncertain covariate is observed. We demonstrate the robust performance
of our method on both synthetic and real-world benchmarks.
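A sketch of the robust inner step, assuming a discrete covariate set with kernel matrix K, so that the MMD between weight vectors w and w0 is sqrt((w - w0)^T K (w - w0)); the function name and the use of cvxpy are our own choices.

    import cvxpy as cp

    def worst_case_expectation(u, w0, K, eps):
        # Minimize the expected value of u over distributions w within an
        # MMD ball of radius eps around the reference distribution w0.
        w = cp.Variable(len(u), nonneg=True)
        constraints = [
            cp.sum(w) == 1,                                   # w is a distribution
            cp.quad_form(w - w0, cp.psd_wrap(K)) <= eps ** 2, # MMD^2 constraint
        ]
        cp.Problem(cp.Minimize(u @ w), constraints).solve()
        return w.value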
Adaptive and Safe Bayesian Optimization in High Dimensions via One-Dimensional Subspaces
Bayesian optimization is known to be difficult to scale to high dimensions,
because the acquisition step requires solving a non-convex optimization problem
in the same search space. In order to scale the method and keep its benefits,
we propose an algorithm (LineBO) that restricts the problem to a sequence of
iteratively chosen one-dimensional sub-problems that can be solved efficiently.
We show that our algorithm converges globally and obtains a fast local rate
when the function is strongly convex. Further, if the objective has an
invariant subspace, our method automatically adapts to the effective dimension
without changing the algorithm. When combined with the SafeOpt algorithm to
solve the sub-problems, we obtain the first safe Bayesian optimization
algorithm with theoretical guarantees applicable in high-dimensional settings.
We evaluate our method on multiple synthetic benchmarks, where we obtain
competitive performance. Further, we deploy our algorithm to optimize the beam
intensity of the Swiss Free Electron Laser with up to 40 parameters while
satisfying safe operation constraints.
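A minimal sketch of the one-dimensional restriction (grid search along a random direction; hypothetical names, and the actual method also admits coordinate or gradient-informed directions, with the safe variant adding constraint handling):

    import numpy as np

    def linebo_step(acquisition, x, lo=-1.0, hi=1.0, grid=256, rng=None):
        # Restrict the acquisition function to a line through the current
        # iterate x and maximize it there: a 1-D problem instead of a d-D one.
        rng = np.random.default_rng() if rng is None else rng
        direction = rng.standard_normal(len(x))
        direction /= np.linalg.norm(direction)       # random unit direction
        ts = np.linspace(lo, hi, grid)
        candidates = x + ts[:, None] * direction     # points on the line
        values = np.array([acquisition(c) for c in candidates])
        return candidates[np.argmax(values)]         # next evaluation point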
Managing Temporal Resolution in Continuous Value Estimation: A Fundamental Trade-off
A default assumption in reinforcement learning (RL) and optimal control is
that observations arrive at discrete time points on a fixed clock cycle. Yet,
many applications involve continuous-time systems where the time
discretization, in principle, can be managed. The impact of time discretization
on RL methods has not been fully characterized in existing theory, but a more
detailed analysis of its effect could reveal opportunities for improving
data-efficiency. We address this gap by analyzing Monte-Carlo policy evaluation
for LQR systems and uncover a fundamental trade-off between approximation and
statistical error in value estimation. Importantly, these two errors respond
differently to the time discretization, leading to an optimal choice of temporal
resolution for a given data budget. These findings show that managing the
temporal resolution can provably improve policy evaluation efficiency in LQR
systems with finite data. Empirically, we demonstrate the trade-off in
numerical simulations of LQR instances and standard RL benchmarks for
non-linear continuous control.
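To make the trade-off concrete, a sketch under simplifying assumptions of ours (rollout is a hypothetical simulator returning instantaneous costs on a grid of step h): with a fixed budget of observed transitions, a finer resolution tightens the Riemann-sum approximation of the cost integral but leaves fewer episodes, inflating the Monte-Carlo variance.

    import numpy as np

    def mc_value_estimate(rollout, h, horizon, budget):
        # rollout(h, horizon) returns instantaneous costs c(t_k) on the grid
        # t_k = k * h; the cost integral is approximated by a Riemann sum.
        steps_per_episode = int(horizon / h)
        n_episodes = max(1, budget // steps_per_episode)   # fixed data budget
        returns = [h * np.sum(rollout(h, horizon)) for _ in range(n_episodes)]
        return float(np.mean(returns))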
Tuning Particle Accelerators with Safety Constraints using Bayesian Optimization
Tuning machine parameters of particle accelerators is a repetitive and
time-consuming task that is challenging to automate. While many off-the-shelf
optimization algorithms are available, in practice their use is limited because
most methods do not account for safety-critical constraints that apply to each
iteration, including loss signals or step-size limitations. One notable
exception is safe Bayesian optimization, which is a data-driven tuning approach
for global optimization with noisy feedback. We propose and evaluate a
step-size-limited variant of safe Bayesian optimization at two research
facilities of the Paul Scherrer Institut (PSI): a) the Swiss Free Electron
Laser (SwissFEL)
and b) the High-Intensity Proton Accelerator (HIPA). We report promising
experimental results on both machines, tuning up to 16 parameters subject to
more than 200 constraints.
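A sketch of the candidate filter such a variant might apply before the acquisition step (hypothetical names; constraint_lcb stands for a lower confidence bound on the safety constraints, e.g. from a GP model):

    import numpy as np

    def safe_candidates(X, x_current, constraint_lcb, threshold, max_step):
        # Keep candidates within the step-size limit of the current machine
        # setting whose constraint LCB clears the safety threshold.
        close = np.linalg.norm(X - x_current, axis=1) <= max_step
        safe = constraint_lcb(X) >= threshold        # pessimistic safety check
        return X[close & safe]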
O(\alpha^2 L) Radiative Corrections to Deep Inelastic ep Scattering
The leptonic QED radiative corrections are calculated in the next-to-leading
log approximation for unpolarized deeply inelastic ep scattering in the case
of mixed variables. The corrections are determined using mass factorization
in the OMS scheme for the double-differential scattering cross sections.