Efficient Distributed Online Prediction and Stochastic Optimization with Approximate Distributed Averaging

Authors: Rabbat, Michael G.; Tsianos, Konstantinos I.
Publication date: 05/03/2014

We study distributed methods for online prediction and stochastic optimization. Our approach is iterative: in each round nodes first perform local computations and then communicate in order to aggregate information and synchronize their decision variables. Synchronization is accomplished through the use of a distributed averaging protocol. When an exact distributed averaging protocol is used, it is known that the optimal regret bound of $\mathcal{O}(\sqrt{m})$ can be achieved using the distributed mini-batch algorithm of Dekel et al. (2012), where $m$ is the total number of samples processed across the network. We focus on methods using approximate distributed averaging protocols and show that the optimal regret bound can also be achieved in this setting. In particular, we propose a gossip-based optimization method which achieves the optimal regret bound. The amount of communication required depends on the network topology through the second largest eigenvalue of the transition matrix of a random walk on the network. In the setting of stochastic optimization, the proposed gossip-based approach achieves nearly-linear scaling: the optimization error is guaranteed to be no more than $\epsilon$ after $\mathcal{O}(\frac{1}{n \epsilon^2})$ rounds, each of which involves $\mathcal{O}(\log n)$ gossip iterations, when nodes communicate over a well-connected graph. This scaling law is also observed in numerical experiments on a cluster.

Comment: 30 pages, 2 figures
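The approximate distributed averaging step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it only shows synchronous gossip averaging under a lazy random walk, assuming an undirected graph in which every node has the same degree (so the per-node average is preserved). The function names `gossip_round`, `gossip_average`, `hypercube`, and `max_deviation` are hypothetical and chosen for illustration.

```python
def gossip_round(values, neighbors):
    """One synchronous gossip iteration (lazy random walk step).

    Each node keeps half of its own value and takes the other half from
    the mean of its neighbors' values. Assumes a regular undirected graph,
    so the update matrix is symmetric and the network average is preserved.
    """
    return [
        0.5 * values[i]
        + 0.5 * sum(values[j] for j in neighbors[i]) / len(neighbors[i])
        for i in range(len(values))
    ]


def gossip_average(values, neighbors, iterations):
    """Run a fixed number of gossip iterations and return the node values."""
    for _ in range(iterations):
        values = gossip_round(values, neighbors)
    return values


def hypercube(k):
    """Adjacency lists of a k-dimensional hypercube with 2**k nodes
    (an illustrative topology, not one taken from the paper)."""
    return [[i ^ (1 << b) for b in range(k)] for i in range(1 << k)]


def max_deviation(values):
    """Largest distance of any node's value from the network average."""
    mean = sum(values) / len(values)
    return max(abs(v - mean) for v in values)


if __name__ == "__main__":
    graph = hypercube(4)                       # 16 nodes, degree 4
    start = [float(i) for i in range(16)]      # one local value per node
    for t in (0, 10, 30, 60):
        print(t, max_deviation(gossip_average(start, graph, t)))
```

In this sketch the deviation from the true average shrinks geometrically, at a rate set by the second largest eigenvalue of the update matrix, which is the quantity the abstract ties to the communication cost.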

arXiv.org e-Print Archive

CiteSeerX

Communication Efficient Distributed Optimization using an Approximate Newton-type Method

Authors: Shamir, Ohad; Srebro, Nathan; Zhang, Tong
Publication date: 13/05/2014

We present a novel Newton-type method for distributed optimization, which is particularly well suited for stochastic optimization and learning problems. For quadratic objectives, the method enjoys a linear rate of convergence which provably improves with the data size, requiring an essentially constant number of iterations under reasonable assumptions. We provide theoretical and empirical evidence of the advantages of our method compared to other approaches, such as one-shot parameter averaging and ADMM.

arXiv.org e-Print Archive

CiteSeerX