Improved Convergence Rates for Distributed Resource Allocation
In this paper, we develop a class of decentralized algorithms for solving a
convex resource allocation problem in a network of agents, where the agent
objectives are decoupled while the resource constraints are coupled. The agents
communicate over a connected undirected graph, and they want to collaboratively
determine a solution to the overall network problem, while each agent only
communicates with its neighbors. We first study the connection between the
decentralized resource allocation problem and the decentralized consensus
optimization problem. Then, using a class of algorithms for solving consensus
optimization problems, we propose a novel class of decentralized schemes for
solving resource allocation problems in a distributed manner. Specifically, we
first propose an algorithm for solving the resource allocation problem with a
convergence rate guarantee when the agents' objective functions are
generally convex (possibly nondifferentiable) and per-agent local convex
constraints are allowed. We then propose a gradient-based algorithm for solving
the resource allocation problem when per-agent local constraints are absent and
show that such a scheme can achieve a geometric rate when the objective functions
are strongly convex and have Lipschitz continuous gradients. We also
provide a scalability/network dependency analysis. Based on these two
algorithms, we further propose a gradient projection-based algorithm
which can handle smooth objectives and simple constraints more efficiently.
Numerical experiments demonstrate the viability and performance of all the
proposed algorithms.
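To make the setting in the abstract above concrete (decoupled objectives, one coupled resource constraint, neighbor-only communication), the following minimal sketch runs the classical Laplacian-weighted gradient iteration. It is not one of the algorithms proposed in the paper. The quadratic costs, the 4-agent ring, the step size, and every function and variable name are assumptions chosen for illustration.

```python
def laplacian_gradient_allocation(a, c, x0, neighbors, step=0.05, iters=2000):
    """Classical neighbor-only iteration for
        minimize  sum_i 0.5 * a[i] * (x[i] - c[i])**2
        subject to sum_i x[i] == sum_i x0[i].
    Each agent moves resource toward neighbors with lower marginal cost.
    This is an illustrative stand-in, not the paper's method.
    """
    x = list(x0)
    n = len(x)
    for _ in range(iters):
        # Local gradients (marginal costs); agent i only needs its own data.
        grad = [a[i] * (x[i] - c[i]) for i in range(n)]
        # Exchange with neighbors only. On an undirected graph the pairwise
        # differences cancel in the sum, so the total resource is preserved.
        x = [x[i] - step * sum(grad[i] - grad[j] for j in neighbors[i])
             for i in range(n)]
    return x


# Hypothetical problem data: 4 agents on an undirected ring.
a = [1.0, 2.0, 3.0, 1.5]            # curvature of each agent's cost
c = [1.0, 2.0, 0.5, 3.0]            # each agent's preferred allocation
x0 = [2.5, 2.5, 2.5, 2.5]           # feasible start; total resource = 10
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

allocation = laplacian_gradient_allocation(a, c, x0, neighbors)
marginal_costs = [a[i] * (allocation[i] - c[i]) for i in range(4)]
```

At the returned point the marginal costs agree across agents, which is the optimality condition for this equality-constrained problem, while the total allocation still equals the initial total.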
Nested Distributed Gradient Methods with Adaptive Quantized Communication
In this paper, we consider minimizing a sum of local convex objective
functions in a distributed setting, where communication can be costly. We
propose and analyze a class of nested distributed gradient methods with
adaptive quantized communication (NEAR-DGD+Q). We show the effect of performing
multiple quantized communication steps on the rate of convergence and on the
size of the neighborhood of convergence, and prove R-linear convergence to the
exact solution with an increasing number of consensus steps and adaptive
quantization. We test the performance of the method, as well as some practical
variants, on quadratic functions, and show the effects of multiple quantized
communication steps in terms of iterations/gradient evaluations, communication
and cost.
Comment: 9 pages, 2 figures. arXiv admin note: text overlap with
arXiv:1709.0299
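The nesting described in the abstract above (one local gradient step followed by several quantized consensus rounds) can be sketched loosely as follows. This is an illustration under my own assumptions, not the NEAR-DGD+Q method as specified in the paper: the scalar quadratic objectives, the mixing matrix `W`, the rule of `k` consensus rounds at iteration `k`, the rounding quantizer, and the geometrically shrinking grid are all hypothetical choices.

```python
def quantize(value, grid):
    """Uniform quantizer: round to the nearest multiple of `grid`."""
    return round(value / grid) * grid


def nested_quantized_gradient_sketch(a, b, W, step=0.2, iters=40):
    """Minimize sum_i 0.5 * a[i] * (x - b[i])**2 over a shared scalar x.
    Each agent keeps its own copy x[i]; only quantized values are exchanged.
    """
    n = len(a)
    x = [0.0] * n
    for k in range(1, iters + 1):
        # Computation step: one local gradient step per agent.
        y = [x[i] - step * a[i] * (x[i] - b[i]) for i in range(n)]
        # Communication steps: k rounds at iteration k (assumed schedule),
        # with a grid that becomes finer as k grows (assumed adaptation rule).
        grid = 0.5 ** k
        for _ in range(k):
            q = [quantize(v, grid) for v in y]          # what is transmitted
            y = [sum(W[i][j] * q[j] for j in range(n))  # weighted averaging
                 for i in range(n)]
        x = y
    return x


# Hypothetical data: 4 agents on a ring with equal weights of 1/3.
a = [1.0, 2.0, 3.0, 4.0]
b = [4.0, -1.0, 2.0, 0.5]
t = 1.0 / 3.0
W = [[t, t, 0.0, t],
     [t, t, t, 0.0],
     [0.0, t, t, t],
     [t, 0.0, t, t]]

x_final = nested_quantized_gradient_sketch(a, b, W)
x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)  # exact minimizer
```

With a fixed grid and a fixed number of consensus rounds, the same loop would only reach a neighborhood of `x_star`; growing the number of rounds and refining the grid is what lets the local copies approach the exact minimizer in this toy setting.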
D²: Decentralized Training over Decentralized Data
While training a machine learning model using multiple workers, each of which
collects data from its own data sources, it would be most useful if the
data collected by different workers could be unique and different.
Ironically, recent analysis of decentralized parallel stochastic gradient
descent (D-PSGD) relies on the assumption that the data hosted on different
workers are not too different. In this paper, we ask the question:
Can we design a decentralized parallel stochastic gradient descent algorithm
that is less sensitive to the data variance across workers? We
present D², a novel decentralized parallel stochastic gradient descent
algorithm designed for large data variance among workers (imprecisely,
"decentralized" data). The core of D² is a variance reduction extension of
the standard D-PSGD algorithm, which improves the convergence rate with
respect to the variance among data on different workers. As a result, D² is
robust to data variance among workers. We empirically evaluate D² on image
classification tasks where each worker has access to only the data of a limited
set of labels, and find that D² significantly outperforms D-PSGD.
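The effect of data variance across workers can be seen in a small deterministic toy problem. The sketch below compares a D-PSGD-style update, `x <- W x - step * g(x)`, with a correction of the form `x_next = W (2 x - x_prev - step * (g - g_prev))`, which is how I understand the D² recursion; treat that form as an assumption and check it against the paper. Full gradients replace stochastic ones so the run is reproducible, and the quadratic objectives, mixing matrix, step size, and all names are hypothetical.

```python
def grad(a, b, x):
    """Per-worker gradients of f_i(x) = 0.5 * a[i] * (x - b[i])**2."""
    return [a[i] * (x[i] - b[i]) for i in range(len(x))]


def mix(W, v):
    """One averaging round with mixing matrix W."""
    n = len(v)
    return [sum(W[i][j] * v[j] for j in range(n)) for i in range(n)]


def dpsgd_style(a, b, W, step, iters):
    """Baseline: average with neighbors, then take a local gradient step."""
    x = [0.0] * len(a)
    for _ in range(iters):
        g = grad(a, b, x)
        wx = mix(W, x)
        x = [wx[i] - step * g[i] for i in range(len(x))]
    return x


def d2_style(a, b, W, step, iters):
    """Corrected recursion (assumed form): reuse the previous iterate and
    gradient so that differences between workers' gradients cancel."""
    n = len(a)
    x_prev = [0.0] * n
    g_prev = grad(a, b, x_prev)
    x = mix(W, [x_prev[i] - step * g_prev[i] for i in range(n)])  # first step
    for _ in range(iters - 1):
        g = grad(a, b, x)
        inner = [2 * x[i] - x_prev[i] - step * (g[i] - g_prev[i])
                 for i in range(n)]
        x_prev, g_prev = x, g
        x = mix(W, inner)
    return x


# Hypothetical heterogeneous workers on a ring; W has eigenvalues in [0, 1].
a = [1.0, 2.0, 3.0, 4.0]
b = [4.0, -1.0, 2.0, 0.5]
W = [[0.5, 0.25, 0.0, 0.25],
     [0.25, 0.5, 0.25, 0.0],
     [0.0, 0.25, 0.5, 0.25],
     [0.25, 0.0, 0.25, 0.5]]
x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)

err_baseline = max(abs(v - x_star) for v in dpsgd_style(a, b, W, 0.05, 3000))
err_corrected = max(abs(v - x_star) for v in d2_style(a, b, W, 0.05, 3000))
```

In this toy run the baseline settles at a point where workers disagree, because their local gradients differ even at the optimum, while the corrected recursion drives every worker to the shared minimizer with the same constant step size.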