On the Linear Convergence of the ADMM in Decentralized Consensus Optimization
In decentralized consensus optimization, a connected network of agents
collaboratively minimizes the sum of the agents' local objective functions over a
common decision variable, with information exchange restricted to
neighboring agents. To this end, one can first obtain a problem
reformulation and then apply the alternating direction method of multipliers
(ADMM). The method carries out iterative computation at the individual agents and
information exchange between neighbors. This approach has been observed to
converge quickly and is regarded as powerful. This paper establishes its linear
convergence rate for the decentralized consensus optimization problem with strongly
convex local objective functions. The theoretical convergence rate is
explicitly given in terms of the network topology, the properties of local
objective functions, and the algorithm parameter. This result is not only a
performance guarantee but also a guideline toward accelerating the ADMM
convergence.
Comment: 11 figures, IEEE Transactions on Signal Processing, 201
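As a rough illustration of the setup in this abstract, the following is a minimal sketch (illustrative, not the paper's exact updates) of a decentralized consensus ADMM recursion on a toy problem with scalar strongly convex local objectives f_i(x) = 0.5 * (x - a_i)^2, for which the consensus minimizer is the average of the a_i; the penalty parameter c, the variable names, and the ring topology are assumptions made for the example.

```python
# A minimal sketch (illustrative, not the paper's exact updates) of a
# decentralized consensus ADMM recursion.  Each agent i holds the strongly
# convex local objective f_i(x) = 0.5 * (x - a_i)^2, so the consensus
# minimizer of sum_i f_i is the average of the a_i values.
import numpy as np

def decentralized_admm(a, neighbors, c=1.0, iters=200):
    """Run a decentralized ADMM-style recursion for scalar quadratic f_i."""
    n = len(a)
    x = np.zeros(n)        # primal iterate held by each agent
    alpha = np.zeros(n)    # local dual variable held by each agent
    for _ in range(iters):
        x_new = np.empty(n)
        for i in range(n):
            d_i = len(neighbors[i])
            nbr_sum = sum(x[j] for j in neighbors[i])
            # Closed-form minimizer of
            #   f_i(v) + alpha_i * v + c * sum_{j in N_i} (v - (x_i + x_j)/2)^2
            x_new[i] = (a[i] - alpha[i] + c * (d_i * x[i] + nbr_sum)) / (1.0 + 2.0 * c * d_i)
        for i in range(n):
            # Dual ascent on the disagreement with the neighbors
            alpha[i] += c * sum(x_new[i] - x_new[j] for j in neighbors[i])
        x = x_new
    return x

# Five agents on a ring network; only neighbors exchange information.
neighbors = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(decentralized_admm(a, neighbors), "consensus target:", a.mean())
```

Each agent uses only its own data a_i and its neighbors' latest iterates, matching the communication pattern described above; the penalty parameter c plays the role of the tunable algorithm parameter mentioned in the abstract.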
Optimal Algorithms for Non-Smooth Distributed Optimization in Networks
In this work, we consider the distributed optimization of non-smooth convex
functions using a network of computing units. We investigate this problem under
two regularity assumptions: (1) the Lipschitz continuity of the global
objective function, and (2) the Lipschitz continuity of local individual
functions. Under the local regularity assumption, we provide the first optimal
first-order decentralized algorithm called multi-step primal-dual (MSPD) and
its corresponding optimal convergence rate. A notable aspect of this result is
that, for non-smooth functions, while the dominant term of the error is in
$O(1/\sqrt{t})$, the structure of the communication network only impacts a
second-order term in $O(1/t)$, where $t$ is the time. In other words, the error due
to limits in communication resources decreases at a fast rate even in the case
of non-strongly-convex objective functions. Under the global regularity
assumption, we provide a simple yet efficient algorithm called distributed
randomized smoothing (DRS) based on a local smoothing of the objective
function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the
optimal convergence rate, where $d$ is the underlying dimension.
Comment: 17 pages
Stochastic Subgradient Algorithms for Strongly Convex Optimization over Distributed Networks
We study diffusion- and consensus-based optimization of a sum of unknown
convex objective functions over distributed networks. The only access to these
functions is through stochastic gradient oracles, each of which is
available only at a particular node, and a limited number of gradient oracle calls is
allowed at each node. In this framework, we introduce a convex optimization
algorithm based on stochastic gradient descent (SGD) updates. In particular,
we use a carefully designed time-dependent weighted averaging of the SGD
iterates, which yields a convergence rate of $O(1/T)$
after $T$ gradient updates for each node on
a network of $N$ nodes. We then show that after $T$ gradient oracle calls, the
average SGD iterate achieves a mean square deviation (MSD) of
$O(1/(NT))$. This rate of convergence is optimal as it
matches the performance lower bound up to constant terms. Similar to the SGD
algorithm, the computational complexity of the proposed algorithm also scales
linearly with the dimensionality of the data. Furthermore, the communication
load of the proposed method is the same as that of the SGD
algorithm. Thus, the proposed algorithm is highly efficient in terms of
complexity and communication load. We illustrate the merits of the algorithm
with respect to state-of-the-art methods on benchmark real-life data sets and
widely studied network topologies.
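A minimal sketch of the flavor of method described above: consensus-style mixing combined with local SGD steps and a time-dependent weighted average of the iterates. The quadratic local objectives, the ring mixing matrix, the 1/t step size, and the linear-in-t averaging weights are illustrative assumptions, not the paper's exact scheme.

```python
# Hedged sketch: consensus-based stochastic gradient descent with a
# time-dependent weighted average of the iterates (illustrative choices only).
import numpy as np

rng = np.random.default_rng(1)

N, d, T = 5, 3, 2000
a = rng.standard_normal((N, d))            # node i holds f_i(x) = 0.5*||x - a_i||^2
x_star = a.mean(axis=0)                    # minimizer of (1/N) * sum_i f_i

# Doubly stochastic mixing matrix for a ring network.
W = np.zeros((N, N))
for i in range(N):
    W[i, i] = 1 / 3
    W[i, (i - 1) % N] = 1 / 3
    W[i, (i + 1) % N] = 1 / 3

x = np.zeros((N, d))                       # current iterate at each node
x_avg = np.zeros((N, d))                   # time-weighted running averages
for t in range(1, T + 1):
    # Stochastic gradient oracle: true gradient plus noise, one call per node.
    noisy_grad = (x - a) + 0.1 * rng.standard_normal((N, d))
    x = W @ x - (1.0 / t) * noisy_grad     # consensus mixing + local SGD step
    # Weight iterate t proportionally to t; the weights 2t/(T(T+1)) sum to 1.
    x_avg += 2.0 * t / (T * (T + 1)) * x

print(np.linalg.norm(x_avg.mean(axis=0) - x_star))  # should be small
```

The weighted average places more mass on later, more accurate iterates, which is the role the "carefully designed time-dependent weighted averaging" plays in the abstract; the per-iteration cost and communication are the same as for plain distributed SGD.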