Federated Multi-Armed Bandits
Federated multi-armed bandits (FMAB) is a new bandit paradigm that parallels
the federated learning (FL) framework in supervised learning. It is inspired by
practical applications in cognitive radio and recommender systems, and enjoys
features that are analogous to FL. This paper proposes a general framework of
FMAB and then studies two specific federated bandit models. We first study the
approximate model where the heterogeneous local models are random realizations
of the global model from an unknown distribution. This model introduces a new
uncertainty from client sampling, as the global model may not be reliably learned
even if a finite set of local models is perfectly known. Furthermore, this
uncertainty cannot be quantified a priori without knowledge of the
suboptimality gap. We solve the approximate model by proposing Federated Double
UCB (Fed2-UCB), which constructs a novel "double UCB" principle accounting for
uncertainties from both arm and client sampling. We show that gradually
admitting new clients is critical in achieving an O(log(T)) regret while
explicitly considering the communication cost. The exact model, where the
global bandit model is the exact average of heterogeneous local models, is then
studied as a special case. We show that, somewhat surprisingly, the
order-optimal regret can be achieved independent of the number of clients with
a careful choice of the update periodicity. Experiments using both synthetic
and real-world datasets corroborate the theoretical analysis and demonstrate
the effectiveness and efficiency of the proposed algorithms.

Comment: AAAI 2021, Camera Ready. Code is available at:
https://github.com/ShenGroup/FMA
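
To make the exact model concrete, the following is a minimal sketch, not the paper's algorithm. It assumes Bernoulli rewards, a fixed number of pulls per communication round, and a Hoeffding-style confidence radius with a hypothetical failure probability `delta`; the function name `simulate_exact_model` and all of its parameters are illustrative assumptions. The only element taken from the abstract is that the global mean of each arm is the exact average of the heterogeneous local means, so the server averages the clients' local sample means and eliminates arms that look suboptimal globally.

```python
import math
import random


def simulate_exact_model(local_means, pulls_per_round=100, max_rounds=20,
                         delta=0.01, seed=0):
    """Toy federated arm elimination under the exact model (illustrative only).

    local_means[m][k] is the Bernoulli mean of arm k at client m. The global
    mean of arm k is assumed to be the plain average over clients.
    """
    rng = random.Random(seed)
    n_clients = len(local_means)
    n_arms = len(local_means[0])
    sums = [[0.0] * n_arms for _ in range(n_clients)]  # per-client reward sums
    pulls = 0  # pulls so far of each active arm at each client
    active = list(range(n_arms))

    for _ in range(max_rounds):
        if len(active) == 1:
            break
        # Local phase: every client pulls every active arm the same number of times.
        for m in range(n_clients):
            for k in active:
                sums[m][k] += sum(rng.random() < local_means[m][k]
                                  for _ in range(pulls_per_round))
        pulls += pulls_per_round

        # Communication phase: the server averages the local sample means.
        global_est = {
            k: sum(sums[m][k] for m in range(n_clients)) / (n_clients * pulls)
            for k in active
        }
        # Hoeffding radius for an average of n_clients * pulls rewards in [0, 1]
        # (an assumed choice, not the confidence bound used in the paper).
        radius = math.sqrt(math.log(2 / delta) / (2 * n_clients * pulls))

        # Keep only arms whose upper bound reaches the best lower bound.
        best_lcb = max(global_est[k] - radius for k in active)
        active = [k for k in active if global_est[k] + radius >= best_lcb]

    return active


# Heterogeneous clients: client 1 prefers arm 1 locally, but arm 0 is globally
# best (global means are 0.8, 0.4, 0.3).
local_means = [
    [0.9, 0.2, 0.5],
    [0.7, 0.9, 0.1],
    [0.9, 0.3, 0.2],
    [0.7, 0.2, 0.4],
]
print(simulate_exact_model(local_means))
```

In this toy setup no single client could identify the globally best arm from its own rewards alone, which is why the server-side averaging step is needed. The fixed `pulls_per_round` stands in for the update periodicity mentioned in the abstract; the sketch makes no attempt to reproduce the paper's choice of that schedule or its treatment of communication cost.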