Logarithmic Communication for Distributed Optimization in Multi-Agent Systems
Classically, the design of multi-agent systems is approached using techniques from distributed optimization such as dual descent and consensus algorithms. Such algorithms depend on convergence to global consensus before any individual agent can determine its local action. This creates challenges in communication overhead and robustness, and improving algorithms along these measures has been a focus of the community for decades.
This paper presents a new approach for multi-agent system design based on ideas from the emerging field of local computation algorithms. The framework we develop, LOcal Convex Optimization (LOCO), is the first local computation algorithm for convex optimization problems and can be applied in a wide variety of settings. We demonstrate the generality of the framework via applications to Network Utility Maximization (NUM) and the distributed training of Support Vector Machines (SVMs), providing numerical results that illustrate the improvement over classical distributed optimization approaches in each case.
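For context on the classical baseline the abstract contrasts against, here is a minimal sketch of dual descent for a toy NUM instance with log utilities: links update prices from capacity violations, and each source best-responds to the price of its path. The topology, utility choice, and all names are illustrative assumptions of ours, not the paper's LOCO method (whose details the abstract does not give).

# Illustrative sketch (our own toy example, not the paper's LOCO method):
# classical dual descent for Network Utility Maximization with log utilities.
import numpy as np

# Routing matrix R: R[l, i] = 1 if source i crosses link l (assumed topology).
R = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
c = np.array([1.0, 2.0])      # link capacities
lam = np.ones(R.shape[0])     # dual variables: one price per link
step = 0.1

for _ in range(2000):
    q = R.T @ lam             # total price along each source's path
    # Source best response: argmax_x log(x) - q_i * x  =>  x_i = 1 / q_i
    x = 1.0 / np.maximum(q, 1e-9)
    # Link update: projected subgradient ascent on the dual (capacity violation)
    lam = np.maximum(lam + step * (R @ x - c), 0.0)

print("rates:", x)
print("link loads:", R @ x, "vs capacities:", c)

Every iteration requires fresh price and rate messages between links and sources, and the rates are meaningful only once the prices have converged globally; this is the communication-overhead and robustness bottleneck the abstract describes and that local computation algorithms aim to avoid.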
Optimal Statistical Rates for Decentralised Non-Parametric Regression with Linear Speed-Up
We analyse the learning performance of Distributed Gradient Descent in the context of multi-agent decentralised non-parametric regression with the square loss function when i.i.d. samples are assigned to agents. We show that if agents hold sufficiently many samples with respect to the network size, then Distributed Gradient Descent achieves optimal statistical rates with a number of iterations that scales, up to a threshold, with the inverse of the spectral gap of the gossip matrix divided by the number of samples owned by each agent raised to a problem-dependent power.
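In symbols (our own shorthand, not necessarily the paper's notation), writing \rho for the spectral gap of the gossip matrix, m for the number of samples per agent, and \alpha for the problem-dependent exponent, the claimed scaling reads

\[
  T \;\asymp\; \frac{1}{\rho\, m^{\alpha}} \quad \text{(up to a threshold).}
\]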
The threshold is statistical in origin: it encodes the existence of a "big data" regime in which the number of required iterations does not depend on the network topology. In this regime, Distributed Gradient Descent achieves optimal statistical rates within the same order of iterations as gradient descent run on all the samples in the network. Provided the communication delay is sufficiently small, the distributed protocol thus yields a linear speed-up in runtime over the single-machine protocol. This contrasts with decentralised optimisation algorithms that do not exploit statistics, which yield a linear speed-up only on graphs whose spectral gap is bounded away from zero. Our results exploit the statistical concentration of quantities held by agents and shed new light on the interplay between statistics and communication in decentralised methods. Bounds are given in the standard non-parametric setting under source and capacity assumptions.
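To make the protocol concrete, here is a minimal sketch of Distributed Gradient Descent with gossip averaging, simplified from the paper's setting to parametric least squares (this simplification, the ring topology, and all names are our assumptions): each agent takes a gradient step on the square loss over its own samples and then averages its iterate with its neighbours' via a doubly stochastic gossip matrix.

# Illustrative sketch (our own simplification: parametric least squares rather
# than the paper's non-parametric setting; all names are assumptions).
import numpy as np

rng = np.random.default_rng(0)
n_agents, m, d = 8, 50, 5            # agents, samples per agent, features
w_true = rng.normal(size=d)

# Each agent i holds m i.i.d. samples (X[i], y[i]).
X = rng.normal(size=(n_agents, m, d))
y = X @ w_true + 0.1 * rng.normal(size=(n_agents, m))

# Doubly stochastic gossip matrix for a ring topology (a simple assumed choice).
P = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    P[i, i] = 0.5
    P[i, (i - 1) % n_agents] = 0.25
    P[i, (i + 1) % n_agents] = 0.25

w = np.zeros((n_agents, d))          # one local iterate per agent
step = 0.05

for _ in range(500):
    # Local gradient of the square loss on each agent's own samples
    residual = np.einsum('imd,id->im', X, w) - y
    grads = np.einsum('imd,im->id', X, residual) / m
    # Gossip (neighbour averaging) step combined with a local gradient step
    w = P @ w - step * grads

print("max deviation from w_true:", np.abs(w - w_true).max())

The spectral gap of P controls how fast the gossip step mixes information across the network; the abstract's point is that with enough samples per agent, statistical concentration makes the local gradients similar enough that this mixing rate stops being the bottleneck.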