
    Distributed Stochastic Optimization over Time-Varying Noisy Network

    This paper is concerned with a distributed stochastic multi-agent optimization problem over a class of time-varying networks with slowly decreasing communication noise effects. The problem is considered in a composite optimization setting, which is more general than the settings treated in prior work on noisy network optimization. It is noteworthy that existing methods for noisy network optimization are Euclidean-projection based. We present two related classes of non-Euclidean methods and investigate their convergence behavior. One is a distributed stochastic composite mirror descent method (DSCMD-N), which provides a more general algorithmic framework than earlier work in this literature. As a counterpart, we also consider a distributed stochastic composite dual averaging method (DSCDA-N) for noisy network optimization. Main error bounds for DSCMD-N and DSCDA-N are obtained. The trade-off among stepsizes, noise decay rates, and convergence rates is analyzed in detail. To the best of our knowledge, this is the first work to analyze and derive convergence rates of optimization algorithms in noisy network optimization. We show that an optimal rate of O(1/\sqrt{T}) for nonsmooth convex optimization can be obtained for the proposed methods under an appropriate communication noise condition. Moreover, convergence rates of different orders are derived comprehensively, in both the expectation and the high-probability sense. Comment: 27 pages
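The abstract does not spell out the updates, but the mirror-descent idea it builds on can be illustrated with a small sketch. The code below is a generic distributed stochastic mirror descent loop over a noisy network, using the negative-entropy mirror map on the probability simplex. The function names, the clipping step that keeps noisy iterates inside the simplex, the decaying noise model, and the toy linear losses are illustrative assumptions, not the paper's DSCMD-N.

```python
import numpy as np

def entropic_mirror_step(x, g, step):
    """Mirror descent step with the negative-entropy mirror map
    (multiplicative update on the probability simplex)."""
    y = x * np.exp(-step * g)
    return y / y.sum()

def dscmd_sketch(subgrads, W, T, d, step_fn, noise_fn, rng):
    """Hypothetical sketch of distributed stochastic mirror descent over a
    noisy network (not the paper's exact DSCMD-N).
    subgrads[i](x, rng) -> stochastic subgradient of agent i's loss at x.
    W is a doubly stochastic mixing matrix; noise_fn models link noise."""
    n = W.shape[0]
    X = np.full((n, d), 1.0 / d)                 # all agents start at the uniform point
    for t in range(T):
        # Communication: mix noisy copies of neighbors' iterates.
        mixed = W @ (X + noise_fn(t, rng, X.shape))
        # Keep iterates strictly inside the simplex (simplifying assumption).
        mixed = np.clip(mixed, 1e-12, None)
        mixed /= mixed.sum(axis=1, keepdims=True)
        # Local computation: one stochastic mirror descent step per agent.
        for i in range(n):
            X[i] = entropic_mirror_step(mixed[i], subgrads[i](mixed[i], rng), step_fn(t))
    return X.mean(axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, T = 3, 4, 2000
    W = np.array([[0.50, 0.25, 0.25],
                  [0.25, 0.50, 0.25],
                  [0.25, 0.25, 0.50]])
    c = rng.normal(size=(n, d))                  # toy linear losses f_i(x) = <c_i, x>
    subgrads = [lambda x, rng, ci=ci: ci + 0.1 * rng.normal(size=d) for ci in c]
    x_avg = dscmd_sketch(subgrads, W, T, d,
                         step_fn=lambda t: 1.0 / np.sqrt(t + 1),
                         noise_fn=lambda t, rng, shape: rng.normal(0, 1.0 / (t + 1), shape),
                         rng=rng)
    print("averaged iterate:", x_avg)
```

The 1/sqrt(t) stepsizes paired with communication noise that decays in t mirror the stepsize/noise trade-off discussed in the abstract; other schedules would change the attainable rate.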

    Dual Averaging for Distributed Optimization: Convergence Analysis and Network Scaling

    The goal of decentralized optimization over a network is to optimize a global objective formed by a sum of local (possibly nonsmooth) convex functions, using only local computation and communication. It arises in various application domains, including distributed tracking and localization, multi-agent coordination, estimation in sensor networks, and large-scale optimization in machine learning. We develop and analyze distributed algorithms based on dual averaging of subgradients, and we provide sharp bounds on their convergence rates as a function of the network size and topology. Our method of analysis allows for a clear separation between the convergence of the optimization algorithm itself and the effects of communication constraints arising from the network structure. In particular, we show that the number of iterations required by our algorithm scales inversely in the spectral gap of the network. The sharpness of this prediction is confirmed both by theoretical lower bounds and by simulations for various networks. Our approach covers deterministic optimization and communication as well as problems with stochastic optimization and/or communication. Comment: 40 pages, 4 figures
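As a rough illustration of "dual averaging of subgradients" in a decentralized setting, the sketch below has each agent accumulate stochastic subgradients in a dual variable, mix the dual variables with its neighbors through a doubly stochastic matrix, and map back to the feasible set with a proximal step. The mixing matrix P, the Euclidean-ball constraint, the quadratic prox function, and the stepsize rule alpha_fn are illustrative assumptions; the abstract indicates that the actual iteration count depends on the network's spectral gap.

```python
import numpy as np

def dda_sketch(subgrads, P, T, d, radius, alpha_fn, rng):
    """Hypothetical sketch of distributed dual averaging on a network.
    subgrads[i](x, rng) -> stochastic subgradient of agent i's loss at x.
    P is a doubly stochastic mixing matrix; the feasible set is assumed
    to be a Euclidean ball of the given radius."""
    n = P.shape[0]
    Z = np.zeros((n, d))                  # dual states (accumulated subgradients)
    X = np.zeros((n, d))                  # primal iterates
    X_avg = np.zeros((n, d))              # running averages (reported iterates)
    for t in range(1, T + 1):
        G = np.array([subgrads[i](X[i], rng) for i in range(n)])
        Z = P @ Z + G                     # mix neighbors' duals, add local subgradient
        # Proximal step with psi(x) = ||x||^2 / 2 restricted to the ball:
        # the unconstrained minimizer is -alpha * z, then project onto the ball.
        X = -alpha_fn(t) * Z
        norms = np.linalg.norm(X, axis=1, keepdims=True)
        X = np.where(norms > radius, radius * X / np.maximum(norms, 1e-12), X)
        X_avg += (X - X_avg) / t          # running average of primal iterates
    return X_avg.mean(axis=0)
```

Structuring the sketch so that consensus acts on the accumulated subgradients rather than on the primal iterates is one way to read the abstract's separation between the optimization algorithm's convergence and the network-induced effects.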

    Optimal Algorithms for Non-Smooth Distributed Optimization in Networks

    In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm, called multi-step primal-dual (MSPD), and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in O(1/\sqrt{t}), the structure of the communication network only impacts a second-order term in O(1/t), where t is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS), based on a local smoothing of the objective function, and show that DRS is within a d^{1/4} multiplicative factor of the optimal convergence rate, where d is the underlying dimension. Comment: 17 pages
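The "local smoothing of the objective function" behind DRS can be illustrated with the standard Gaussian-smoothing gradient estimator. The snippet below is a minimal, single-machine sketch of that building block only; the function name, the smoothing parameter gamma, and the toy objective are assumptions, and the distributed and communication-related components of DRS are not shown.

```python
import numpy as np

def smoothed_subgradient(f, x, gamma, rng):
    """Two-point Gaussian smoothing estimator: an unbiased estimate of the
    gradient of the smoothed function f_gamma(x) = E[f(x + gamma * u)],
    u ~ N(0, I). A generic stand-in for the local-smoothing idea, not the
    paper's exact DRS update."""
    u = rng.normal(size=x.shape)
    return (f(x + gamma * u) - f(x)) / gamma * u

# Tiny usage sketch: smooth the nonsmooth f(x) = ||x||_1 and take gradient steps.
rng = np.random.default_rng(0)
f = lambda x: np.abs(x).sum()
x = np.ones(5)
for t in range(1, 501):
    x -= (0.5 / np.sqrt(t)) * smoothed_subgradient(f, x, gamma=0.1, rng=rng)
print(x)  # entries should drift toward 0, the minimizer of ||x||_1
```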