Efficient Distributed Online Prediction and Stochastic Optimization with Approximate Distributed Averaging
We study distributed methods for online prediction and stochastic
optimization. Our approach is iterative: in each round nodes first perform
local computations and then communicate in order to aggregate information and
synchronize their decision variables. Synchronization is accomplished through
the use of a distributed averaging protocol. When an exact distributed
averaging protocol is used, it is known that the optimal regret bound of
$\mathcal{O}(\sqrt{m})$ can be achieved using the distributed mini-batch
algorithm of Dekel et al. (2012), where $m$ is the total number of samples
processed across the network. We focus on methods using approximate distributed
averaging protocols and show that the optimal regret bound can also be achieved
in this setting. In particular, we propose a gossip-based optimization method
which achieves the optimal regret bound. The amount of communication required
depends on the network topology through the second largest eigenvalue of the
transition matrix of a random walk on the network. In the setting of stochastic
optimization, the proposed gossip-based approach achieves nearly-linear
scaling: the optimization error is guaranteed to be no more than $\epsilon$
after $\mathcal{O}(\frac{1}{n \epsilon^2})$ rounds, each of which involves
$\mathcal{O}(\log n)$ gossip iterations, when nodes communicate over a
well-connected graph. This scaling law is also observed in numerical
experiments on a cluster.
Comment: 30 pages, 2 figures
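The abstract above relies on approximate distributed averaging by gossip, with communication cost governed by the second largest eigenvalue of a random-walk transition matrix. The following is a minimal sketch of that general idea only, not the authors' algorithm: it assumes a synchronous protocol, a ring topology, and a lazy random walk (stay with probability 1/2, move to each neighbor with probability 1/4). The function names `ring_transition_matrix`, `gossip_average`, and `max_disagreement` are hypothetical and chosen for illustration.

```python
def ring_transition_matrix(n):
    """Lazy random walk on an n-node ring (an assumed topology for illustration).

    Each node keeps weight 1/2 on itself and 1/4 on each ring neighbor.
    The matrix is symmetric and doubly stochastic, so repeated application
    preserves the network-wide sum while mixing values toward the average.
    """
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        P[i][i] += 0.5
        P[i][(i - 1) % n] += 0.25
        P[i][(i + 1) % n] += 0.25
    return P


def gossip_average(values, P, iterations):
    """Run `iterations` synchronous gossip steps x <- P x and return the result."""
    x = list(values)
    n = len(x)
    for _ in range(iterations):
        # Each node replaces its value with a weighted average of its neighborhood.
        x = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x


def max_disagreement(x):
    """Largest deviation of any node's value from the exact average."""
    avg = sum(x) / len(x)
    return max(abs(v - avg) for v in x)


if __name__ == "__main__":
    local_values = [float(i) for i in range(8)]  # one local value per node
    P = ring_transition_matrix(len(local_values))
    for k in (0, 5, 20, 80):
        err = max_disagreement(gossip_average(local_values, P, k))
        print(f"{k:3d} gossip iterations: max disagreement = {err:.3e}")
```

Running the script prints how the disagreement shrinks as the number of gossip iterations grows. A ring mixes slowly because its second largest eigenvalue is close to 1; a better-connected graph would need fewer iterations for the same accuracy, which is the dependence on topology that the abstract describes.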
Communication Efficient Distributed Optimization using an Approximate Newton-type Method
We present a novel Newton-type method for distributed optimization, which is
particularly well suited for stochastic optimization and learning problems. For
quadratic objectives, the method enjoys a linear rate of convergence which
provably \emph{improves} with the data size, requiring an essentially constant
number of iterations under reasonable assumptions. We provide theoretical and
empirical evidence of the advantages of our method compared to other
approaches, such as one-shot parameter averaging and ADMM
- …