Dual Averaging for Distributed Optimization: Convergence Analysis and Network Scaling
The goal of decentralized optimization over a network is to optimize a global
objective formed by a sum of local (possibly nonsmooth) convex functions using
only local computation and communication. It arises in various application
domains, including distributed tracking and localization, multi-agent
coordination, estimation in sensor networks, and large-scale optimization in
machine learning. We develop and analyze distributed algorithms based on dual
averaging of subgradients, and we provide sharp bounds on their convergence
rates as a function of the network size and topology. Our method of analysis
allows for a clear separation between the convergence of the optimization
algorithm itself and the effects of communication constraints arising from the
network structure. In particular, we show that the number of iterations
required by our algorithm scales inversely in the spectral gap of the network.
The sharpness of this prediction is confirmed both by theoretical lower bounds
and simulations for various networks. Our approach covers the case of
deterministic optimization and communication as well as problems with
stochastic optimization and/or communication.
Comment: 40 pages, 4 figures
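A minimal sketch of the distributed dual-averaging update described above, assuming a ring network with doubly stochastic mixing weights, hypothetical quadratic local objectives f_i(x) = 0.5*(x - b_i)^2, and the proximal function psi(x) = 0.5*x^2 over the box [-1, 1]; the step-size schedule and problem data are illustrative, not the paper's exact construction.

```python
import numpy as np

# Hypothetical setup: n nodes on a ring, each holding a local quadratic
# f_i(x) = 0.5*(x - b_i)^2; the global objective is their average, whose
# unconstrained minimizer is mean(b).
rng = np.random.default_rng(0)
n, T = 20, 2000
b = rng.normal(size=n)

# Doubly stochastic mixing matrix for a ring: each node averages its dual
# variable with those of its two neighbors.
P = np.zeros((n, n))
for i in range(n):
    P[i, i] = 0.5
    P[i, (i - 1) % n] = 0.25
    P[i, (i + 1) % n] = 0.25

# The spectral gap 1 - sigma_2(P) of the mixing matrix is the quantity the
# abstract says governs the required number of iterations.
sigma2 = np.sort(np.abs(np.linalg.eigvalsh(P)))[-2]
print("spectral gap:", round(1.0 - sigma2, 4))

z = np.zeros(n)        # dual variables: locally mixed sums of subgradients
x = np.zeros(n)        # primal iterates, one per node
x_avg = np.zeros(n)    # running averages of the primal iterates

for t in range(1, T + 1):
    g = x - b                            # subgradient of f_i at x_i
    z = P @ z + g                        # mix dual variables, add new subgradient
    alpha = 1.0 / np.sqrt(t)             # illustrative step-size schedule
    x = np.clip(-alpha * z, -1.0, 1.0)   # prox step for psi(x) = 0.5*x^2 on [-1, 1]
    x_avg += (x - x_avg) / t             # running average of the iterates

print("consensus optimum ~", round(b.mean(), 4))
print("first node averages:", np.round(x_avg[:5], 4))
```

On this ring, making the graph larger shrinks the spectral gap and slows the decay of disagreement between nodes, which is the network scaling the abstract describes.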
Stochastic Subgradient Algorithms for Strongly Convex Optimization over Distributed Networks
We study diffusion- and consensus-based optimization of a sum of unknown
convex objective functions over distributed networks. The only access to
these functions is through stochastic gradient oracles, each of which is
available at a single node, and a limited number of oracle calls is allowed
at each node. In this framework, we introduce a convex optimization algorithm
based on stochastic gradient descent (SGD) updates. Specifically, we use a
carefully designed time-dependent weighted averaging of the SGD iterates and
characterize its convergence rate after T gradient updates at each node on a
network of N nodes. We then show that after T gradient oracle calls, the
average SGD iterate achieves a mean square deviation (MSD) that matches the
performance lower bound up to constant terms, so this rate of convergence is
optimal. Similar to the SGD
algorithm, the computational complexity of the proposed algorithm also scales
linearly with the dimensionality of the data. Furthermore, the communication
load of the proposed method is the same as the communication load of the SGD
algorithm. Thus, the proposed algorithm is highly efficient in terms of
complexity and communication load. We illustrate the merits of the algorithm
against state-of-the-art methods on benchmark real-life data sets and widely
studied network topologies.
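A minimal sketch of distributed SGD with time-dependent weighted averaging, assuming a hypothetical ring network, noisy gradient oracles for strongly convex quadratic local losses, and weights proportional to t; that weighting is a standard choice for strongly convex problems and stands in for the paper's carefully designed schedule, which may differ.

```python
import numpy as np

# Hypothetical setup: n nodes, each seeing noisy gradients of a local
# strongly convex quadratic f_i(x) = 0.5*mu*(x - b_i)^2; the network goal
# is to minimize their sum using only neighbor communication.
rng = np.random.default_rng(1)
n, T, mu, noise = 10, 5000, 1.0, 0.5
b = rng.normal(size=n)

# Doubly stochastic mixing matrix for a ring (the consensus step).
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = 0.25
    W[i, (i + 1) % n] = 0.25

x = np.zeros(n)        # current SGD iterate at each node
x_w = np.zeros(n)      # time-weighted average of the iterates

for t in range(1, T + 1):
    # Stochastic gradient oracle: true local gradient plus noise.
    g = mu * (x - b) + noise * rng.normal(size=n)
    x = W @ x - g / (mu * t)     # consensus step, then SGD step of size 1/(mu*t)
    # Weighted averaging with weights proportional to t, kept as a running
    # update; for strongly convex losses this recovers the optimal O(1/T)
    # decay that the last iterate alone does not guarantee.
    x_w += (2.0 / (t + 1)) * (x - x_w)

print("optimum of the sum ~", round(b.mean(), 4))
print("weighted averages  :", np.round(x_w[:5], 4))
```

The update x_w += 2/(t+1) * (x - x_w) implements weights proportional to t without storing past iterates, so the memory and communication cost per step stay the same as plain SGD, consistent with the efficiency claim above.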
Gossip Algorithms for Distributed Signal Processing
Gossip algorithms are attractive for in-network processing in sensor networks
because they require no specialized routing, have no bottleneck or single
point of failure, and are robust to unreliable wireless network conditions.
Recently, there has been a surge of activity in the computer
science, control, signal processing, and information theory communities,
developing faster and more robust gossip algorithms and deriving theoretical
performance guarantees. This article presents an overview of recent work in the
area. We describe convergence rate results, which are related to the number of
transmitted messages and thus the amount of energy consumed in the network for
gossiping. We discuss issues related to gossiping over wireless links,
including the effects of quantization and noise, and we illustrate the use of
gossip algorithms for canonical signal processing tasks including distributed
estimation, source localization, and compression.
Comment: Submitted to Proceedings of the IEEE, 29 pages
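A minimal sketch of randomized pairwise gossip for distributed averaging, the canonical primitive behind the estimation tasks surveyed above; the ring topology and sensor readings are hypothetical.

```python
import numpy as np

# Randomized pairwise gossip for distributed averaging: each node holds a
# scalar measurement; at every tick a random edge (i, j) activates and both
# endpoints replace their values with the pairwise average. Because each
# exchange preserves the sum, all values converge to the global mean with
# no routing and no fusion center.
rng = np.random.default_rng(2)
n = 30
values = rng.normal(loc=5.0, scale=2.0, size=n)   # hypothetical sensor readings
target = values.mean()

# Ring topology: edges between consecutive nodes.
edges = [(i, (i + 1) % n) for i in range(n)]

for tick in range(20000):
    i, j = edges[rng.integers(len(edges))]        # wake up a random edge
    avg = 0.5 * (values[i] + values[j])           # one message exchange
    values[i] = values[j] = avg

# Maximum deviation from the true mean tracks convergence; the number of
# ticks (hence messages and energy) needed depends on the topology, which
# is what the convergence rate results in the article quantify.
print("true mean     :", round(target, 4))
print("max deviation :", float(np.abs(values - target).max()))
```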