
267 research outputs found

Federated Linear Contextual Bandits with User-level Differential Privacy

Authors: Hajzinia Meisam, Huang Ruiquan, Melis Luca, Shen Milan, Yang Jing, Zhang Huanyu
Publication date: 08/06/2023

This paper studies federated linear contextual bandits under the notion of user-level differential privacy (DP). We first introduce a unified federated bandits framework that can accommodate various definitions of DP in the sequential decision-making setting. We then formally introduce user-level central DP (CDP) and local DP (LDP) in the federated bandits framework, and investigate the fundamental trade-offs between the learning regrets and the corresponding DP guarantees in a federated linear contextual bandits model. For CDP, we propose a federated algorithm termed ROBIN and show that it is near-optimal in terms of the number of clients $M$ and the privacy budget $\varepsilon$ by deriving nearly-matching upper and lower regret bounds when user-level DP is satisfied. For LDP, we obtain several lower bounds, indicating that learning under user-level $(\varepsilon,\delta)$-LDP must suffer a regret blow-up factor of at least $\min\{1/\varepsilon, M\}$ or $\min\{1/\sqrt{\varepsilon}, \sqrt{M}\}$ under different conditions.

Comment: Accepted by ICML 202
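To get a feel for the two LDP blow-up factors quoted in the abstract, the short sketch below evaluates them for given values of the privacy budget and the number of clients. The function name `ldp_blowup_factors` is hypothetical, constants and logarithmic terms are ignored, and the abstract does not state which conditions select which factor, so both are returned.

```python
import math


def ldp_blowup_factors(epsilon: float, num_clients: int) -> tuple[float, float]:
    """Evaluate the two lower-bound blow-up factors quoted in the abstract.

    Illustrative only: returns (min{1/eps, M}, min{1/sqrt(eps), sqrt(M)})
    with all constants dropped. Which factor applies depends on conditions
    that the abstract does not spell out.
    """
    if epsilon <= 0 or num_clients <= 0:
        raise ValueError("epsilon and num_clients must be positive")
    linear = min(1.0 / epsilon, float(num_clients))
    root = min(1.0 / math.sqrt(epsilon), math.sqrt(num_clients))
    return linear, root


if __name__ == "__main__":
    # Example values chosen for illustration, not taken from the paper.
    for eps in (0.01, 0.1, 1.0):
        print(eps, ldp_blowup_factors(eps, num_clients=50))
```

For a small privacy budget the number of clients caps both factors, while for a larger budget the `1/epsilon` terms take over.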

arXiv.org e-Print Archive

Decentralized Exploration in Multi-Armed Bandits

Authors: Alami Réda, Féraud Raphaël, Laroche Romain
Publication date: 13/05/2019

We consider the decentralized exploration problem: a set of players collaborate to identify the best arm by asynchronously interacting with the same stochastic environment. The objective is to ensure privacy in the best arm identification problem between asynchronous, collaborative, and thrifty players. In the context of a digital service, we advocate that this decentralized approach allows a good balance between the interests of users and those of service providers: the providers optimize their services, while protecting the privacy of the users and saving resources. We define the privacy level as the amount of information an adversary could infer by intercepting the messages concerning a single user. We provide a generic algorithm, Decentralized Elimination, which uses any best arm identification algorithm as a subroutine. We prove that this algorithm ensures privacy, with a low communication cost, and that in comparison to the lower bound of the best arm identification problem, its sample complexity suffers from a penalty depending on the inverse of the probability of the most frequent players. Then, thanks to the genericity of the approach, we extend the proposed algorithm to non-stationary bandits. Finally, experiments illustrate and complete the analysis.

arXiv.org e-Print Archive

Corrupt Bandits for Preserving Local Privacy

Authors: Gajane Pratik, Kaufmann Emilie, Urvoy Tanguy
Publication date: 02/11/2017

We study a variant of the stochastic multi-armed bandit (MAB) problem in which the rewards are corrupted. In this framework, motivated by privacy preservation in online recommender systems, the goal is to maximize the sum of the (unobserved) rewards, based on the observation of a transformation of these rewards through a stochastic corruption process with known parameters. We provide a lower bound on the expected regret of any bandit algorithm in this corrupted setting. We devise a frequentist algorithm, KLUCB-CF, and a Bayesian algorithm, TS-CF, and give upper bounds on their regret. We also provide the appropriate corruption parameters to guarantee a desired level of local privacy and analyze how this impacts the regret. Finally, we present some experimental results that confirm our analysis.
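The idea of learning from corrupted rewards with known corruption parameters can be illustrated with binary randomized response, a standard mechanism for epsilon-local differential privacy. The sketch below is not the paper's KLUCB-CF or TS-CF algorithm. It only shows, under the assumption of Bernoulli rewards, how a reward is flipped before leaving the user and how the learner inverts the known corruption to recover an unbiased mean estimate. All function names are hypothetical.

```python
import math
import random


def keep_probability(epsilon: float) -> float:
    """Probability of reporting the true bit under randomized response."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def corrupt(reward: int, epsilon: float, rng: random.Random) -> int:
    """Report the true binary reward with probability p, else its flip."""
    p = keep_probability(epsilon)
    return reward if rng.random() < p else 1 - reward


def debias_mean(corrupted_mean: float, epsilon: float) -> float:
    """Invert the known corruption.

    If mu is the true mean, the corrupted mean is (2p - 1) * mu + (1 - p),
    so mu = (corrupted_mean - (1 - p)) / (2p - 1). Requires epsilon > 0,
    since p = 1/2 at epsilon = 0 makes the map non-invertible.
    """
    p = keep_probability(epsilon)
    return (corrupted_mean - (1.0 - p)) / (2.0 * p - 1.0)


def estimate_arm_mean(true_mean: float, epsilon: float,
                      n_pulls: int, seed: int = 0) -> float:
    """Simulate pulls of one Bernoulli arm seen only through corruption."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_pulls):
        reward = 1 if rng.random() < true_mean else 0
        total += corrupt(reward, epsilon, rng)
    return debias_mean(total / n_pulls, epsilon)


if __name__ == "__main__":
    print(estimate_arm_mean(true_mean=0.7, epsilon=1.0, n_pulls=100_000))
```

Dividing by `2p - 1` inflates the estimator's variance as epsilon shrinks, which gives an intuition for why stronger local privacy costs more regret.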

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot