Distributed Linear Bandits under Communication Constraints
We consider distributed linear bandits where agents learn collaboratively
to minimize the overall cumulative regret incurred by all agents. Information
exchange is facilitated by a central server, and both the uplink and downlink
communications are carried over channels with fixed capacity, which limits the
amount of information that can be transmitted in each use of the channels. We
investigate the regret-communication trade-off by (i) establishing
information-theoretic lower bounds on the required communications (in terms of
bits) for achieving a sublinear regret order; and (ii) developing an efficient
algorithm that achieves the minimum sublinear regret order offered by
centralized learning using the minimum order of communications dictated by the
information-theoretic lower bounds. For sparse linear bandits, we show that a
variant of the proposed algorithm offers a better regret-communication
trade-off by leveraging the sparsity of the problem.
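The abstract does not say how messages are encoded for a fixed-capacity channel. As a rough illustration only, the sketch below uses unbiased stochastic quantization, a generic technique that is not claimed to be the paper's scheme. The function names, the range `[lo, hi]`, and the bit budget are all assumptions made for this example.

```python
import numpy as np

def stochastic_quantize(x, bits, lo=-1.0, hi=1.0, rng=None):
    """Map each coordinate of x to an integer in [0, 2**bits - 1].

    Hypothetical helper: rounding is randomized so that the dequantized
    value equals the (clipped) input in expectation.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    levels = 2 ** bits - 1                      # number of grid intervals
    scaled = (np.clip(x, lo, hi) - lo) / (hi - lo) * levels
    base = np.floor(scaled)
    # Round up with probability equal to the fractional part.
    round_up = rng.random(x.shape) < (scaled - base)
    return (base + round_up).astype(int)        # each entry costs `bits` bits

def dequantize(idx, bits, lo=-1.0, hi=1.0):
    """Recover grid values from the integer codes."""
    levels = 2 ** bits - 1
    return lo + np.asarray(idx) / levels * (hi - lo)
```

With `bits=4`, a three-dimensional estimate costs 12 bits per transmission, and the reconstruction error per coordinate is at most one grid step, `(hi - lo) / 15`.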
Random Exploration in Bayesian Optimization: Order-Optimal Regret and Computational Efficiency
We consider Bayesian optimization using Gaussian Process models, also
referred to as kernel-based bandit optimization. We study the methodology of
exploring the domain using random samples drawn from a distribution. We show
that this random exploration approach achieves the optimal error rates. Our
analysis is based on novel concentration bounds in an infinite dimensional
Hilbert space established in this work, which may be of independent interest.
We further develop an algorithm based on random exploration with domain
shrinking and establish its order-optimal regret guarantees under both
noise-free and noisy settings. In the noise-free setting, our analysis closes
the existing gap in regret performance and thereby resolves a COLT open
problem. The proposed algorithm also enjoys a computational advantage over
prevailing methods because random exploration obviates the expensive
optimization of a non-convex acquisition function for choosing the query points
at each iteration.
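To make the idea of random exploration with domain shrinking concrete, here is a minimal one-dimensional sketch. It is not the paper's algorithm: the RBF kernel, the grid discretization, the confidence width `beta`, and the elimination rule (drop points whose upper confidence bound falls below the best lower confidence bound) are all assumptions chosen for illustration.

```python
import numpy as np

def rbf(a, b, length=0.2):
    # Squared-exponential kernel between two 1-D point sets.
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def gp_posterior(x_obs, y_obs, x_test, noise_var):
    # Standard GP regression posterior mean and standard deviation.
    K = rbf(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    Ks = rbf(x_test, x_obs)
    mean = Ks @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def random_exploration_with_shrinking(f, lo=0.0, hi=1.0, epochs=4,
                                      n_per_epoch=15, beta=2.0,
                                      noise_sd=0.05, seed=0):
    rng = np.random.default_rng(seed)
    grid = np.linspace(lo, hi, 201)
    active = np.ones(grid.shape, dtype=bool)   # current (shrinking) domain
    x_obs, y_obs = np.empty(0), np.empty(0)
    for _ in range(epochs):
        # Query points are drawn uniformly at random from the active set;
        # no acquisition function is optimized.
        xs = rng.choice(grid[active], size=n_per_epoch)
        ys = f(xs) + noise_sd * rng.standard_normal(n_per_epoch)
        x_obs = np.concatenate([x_obs, xs])
        y_obs = np.concatenate([y_obs, ys])
        mean, sd = gp_posterior(x_obs, y_obs, grid, noise_sd ** 2)
        ucb, lcb = mean + beta * sd, mean - beta * sd
        # Shrink: keep points that could still be the maximizer.
        active &= ucb >= lcb[active].max()
    survivors = grid[active]
    best = survivors[np.argmax(mean[active])]
    return survivors, best
```

For example, `random_exploration_with_shrinking(lambda x: -4 * (x - 0.3) ** 2)` returns the surviving grid points and the survivor with the highest posterior mean. The only per-iteration cost beyond GP inference is drawing random samples.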
A Communication-Efficient Adaptive Algorithm for Federated Learning under Cumulative Regret
We consider the problem of online stochastic optimization in a distributed
setting with clients connected through a central server. We develop a
distributed online learning algorithm that achieves order-optimal cumulative
regret with low communication cost measured in the total number of bits
transmitted over the entire learning horizon. This is in contrast to existing
studies, which focus on the offline measure of simple regret for learning
efficiency. The holistic measure for communication cost also departs from the
prevailing approach that separately tackles the communication frequency
and the number of bits in each communication round.
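The difference between a holistic bit count and separate per-round measures can be seen with a small accounting sketch. The two communication schedules below are hypothetical and are not taken from the paper; they only show that total bits depend jointly on how often and how much is sent.

```python
def total_bits(schedule):
    """Sum the bits sent over the horizon.

    `schedule` is a list of (round_index, bits_sent) pairs (an assumed format).
    """
    return sum(bits for _, bits in schedule)

horizon = 1000

# Hypothetical policy A: send a short 8-bit message every round.
frequent = [(t, 8) for t in range(1, horizon + 1)]

# Hypothetical policy B: send a 64-bit message only at rounds 1, 2, 4, 8, ...
rare = []
t = 1
while t <= horizon:
    rare.append((t, 64))
    t *= 2

print(total_bits(frequent))  # 1000 rounds * 8 bits
print(total_bits(rare))      # 10 rounds * 64 bits
```

Policy B uses eight times more bits per round but far fewer bits in total, which is the quantity a holistic measure tracks.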