In this paper, we study the collaborative learning model, which concerns the
tradeoff between parallelism and communication overhead in multi-agent
multi-armed bandits. For regret minimization in multi-armed bandits, we present
the first set of tradeoffs between the number of rounds of communication among
the agents and the regret of the collaborative learning process.

Comment: 13 pages, 1 figure

Karpov, Nikolai

Communication-Efficient Collaborative Regret Minimization in Multi-Armed Bandits

arXiv.org e-Print Archive