43,714 research outputs found

    Dual Averaging for Distributed Optimization: Convergence Analysis and Network Scaling

    Full text link
    The goal of decentralized optimization over a network is to optimize a global objective formed by a sum of local (possibly nonsmooth) convex functions using only local computation and communication. It arises in various application domains, including distributed tracking and localization, multi-agent co-ordination, estimation in sensor networks, and large-scale optimization in machine learning. We develop and analyze distributed algorithms based on dual averaging of subgradients, and we provide sharp bounds on their convergence rates as a function of the network size and topology. Our method of analysis allows for a clear separation between the convergence of the optimization algorithm itself and the effects of communication constraints arising from the network structure. In particular, we show that the number of iterations required by our algorithm scales inversely in the spectral gap of the network. The sharpness of this prediction is confirmed both by theoretical lower bounds and simulations for various networks. Our approach includes both the cases of deterministic optimization and communication, as well as problems with stochastic optimization and/or communication.Comment: 40 pages, 4 figure

    Stochastic Subgradient Algorithms for Strongly Convex Optimization over Distributed Networks

    Full text link
    We study diffusion and consensus based optimization of a sum of unknown convex objective functions over distributed networks. The only access to these functions is through stochastic gradient oracles, each of which is only available at a different node, and a limited number of gradient oracle calls is allowed at each node. In this framework, we introduce a convex optimization algorithm based on the stochastic gradient descent (SGD) updates. Particularly, we use a carefully designed time-dependent weighted averaging of the SGD iterates, which yields a convergence rate of O(NNT)O\left(\frac{N\sqrt{N}}{T}\right) after TT gradient updates for each node on a network of NN nodes. We then show that after TT gradient oracle calls, the average SGD iterate achieves a mean square deviation (MSD) of O(NT)O\left(\frac{\sqrt{N}}{T}\right). This rate of convergence is optimal as it matches the performance lower bound up to constant terms. Similar to the SGD algorithm, the computational complexity of the proposed algorithm also scales linearly with the dimensionality of the data. Furthermore, the communication load of the proposed method is the same as the communication load of the SGD algorithm. Thus, the proposed algorithm is highly efficient in terms of complexity and communication load. We illustrate the merits of the algorithm with respect to the state-of-art methods over benchmark real life data sets and widely studied network topologies

    Gossip Algorithms for Distributed Signal Processing

    Full text link
    Gossip algorithms are attractive for in-network processing in sensor networks because they do not require any specialized routing, there is no bottleneck or single point of failure, and they are robust to unreliable wireless network conditions. Recently, there has been a surge of activity in the computer science, control, signal processing, and information theory communities, developing faster and more robust gossip algorithms and deriving theoretical performance guarantees. This article presents an overview of recent work in the area. We describe convergence rate results, which are related to the number of transmitted messages and thus the amount of energy consumed in the network for gossiping. We discuss issues related to gossiping over wireless links, including the effects of quantization and noise, and we illustrate the use of gossip algorithms for canonical signal processing tasks including distributed estimation, source localization, and compression.Comment: Submitted to Proceedings of the IEEE, 29 page
    • …
    corecore