
    Performance Limits of Stochastic Sub-Gradient Learning, Part II: Multi-Agent Case

    The analysis in Part I revealed interesting properties for subgradient learning algorithms in the context of stochastic optimization when gradient noise is present. These algorithms are used when the risk functions are non-smooth and involve non-differentiable components. They have long been recognized as slowly converging methods. However, it was revealed in Part I that the rate of convergence becomes linear for stochastic optimization problems, with the error iterate converging at an exponential rate $\alpha^i$ to within an $O(\mu)$-neighborhood of the optimizer, for some $\alpha \in (0,1)$ and small step-size $\mu$. The conclusion was established under weaker assumptions than the prior literature and, moreover, several important problems (such as LASSO, SVM, and Total Variation) were shown to satisfy these weaker assumptions automatically (but not the previously used conditions from the literature). These results revealed that subgradient learning methods have more favorable behavior than originally thought when used to enable continuous adaptation and learning. The results of Part I were exclusive to single-agent adaptation. The purpose of the current Part II is to examine the implications of these discoveries when a collection of networked agents employs subgradient learning as their cooperative mechanism. The analysis will show that, despite the coupled dynamics that arise in a networked scenario, the agents are still able to attain linear convergence in the stochastic case; they are also able to reach agreement within $O(\mu)$ of the optimizer.
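As a rough illustration of the single-agent setting described in this abstract, the sketch below runs a constant step-size stochastic subgradient recursion on a scalar LASSO-type risk. It is not taken from the paper: the risk $J(w) = \tfrac{1}{2}E(d - uw)^2 + \delta|w|$, the data model $d = u\,w_{\text{true}} + v$ with unit-variance Gaussian $u$, and all names and parameter values are assumptions chosen for illustration. Under these assumptions the minimizer is $w_{\text{true}} - \delta$ (for $w_{\text{true}} > \delta$), and the iterate should settle into a small neighborhood of it whose size shrinks with $\mu$.

```python
import random


def sign(x):
    """A valid subgradient of |x|: sign(x), with 0 chosen at x = 0."""
    return (x > 0) - (x < 0)


def stochastic_subgradient(mu=0.01, delta=0.1, w_true=1.0, noise_std=0.1,
                           iters=5000, seed=0):
    """Constant step-size stochastic subgradient on a scalar LASSO-type risk.

    Assumed (hypothetical) risk: J(w) = 0.5 * E(d - u*w)^2 + delta * |w|,
    with streaming data d = u * w_true + v and u ~ N(0, 1).
    """
    rng = random.Random(seed)
    w = 0.0
    for _ in range(iters):
        u = rng.gauss(0.0, 1.0)                      # regressor sample
        d = u * w_true + rng.gauss(0.0, noise_std)   # noisy measurement
        # Instantaneous subgradient: quadratic part plus l1 penalty part.
        g = -u * (d - u * w) + delta * sign(w)
        w -= mu * g
    return w
```

Calling `stochastic_subgradient()` with a smaller `mu` should, under the same assumptions, give a tighter neighborhood around `w_true - delta` at the cost of a slower transient.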

    Diffusion Adaptation Strategies for Distributed Optimization and Learning over Networks

    We propose an adaptive diffusion mechanism to optimize a global cost function in a distributed manner over a network of nodes. The cost function is assumed to consist of a collection of individual components. Diffusion adaptation allows the nodes to cooperate and diffuse information in real-time; it also helps alleviate the effects of stochastic gradient noise and measurement noise through a continuous learning process. We analyze the mean-square-error performance of the algorithm in some detail, including its transient and steady-state behavior. We also apply the diffusion algorithm to two problems: distributed estimation with sparse parameters and distributed localization. Compared to well-studied incremental methods, diffusion methods do not require the use of a cyclic path over the nodes and are robust to node and link failure. Diffusion methods also endow networks with adaptation abilities that enable the individual nodes to continue learning even when the cost function changes with time. Examples involving such dynamic cost functions with moving targets are common in the context of biological networks.
    Comment: 34 pages, 6 figures, to appear in IEEE Transactions on Signal Processing, 201
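The cooperation pattern described in this abstract can be sketched with an adapt-then-combine style recursion: each node takes a local stochastic-gradient step, then averages the intermediate estimates of its neighbors. The sketch below is only an illustration under stated assumptions, not the paper's algorithm or experiments: the scalar least-mean-squares cost at each node, the ring topology, the uniform 1/3 combination weights, and every name and parameter value are hypothetical.

```python
import random


def atc_diffusion_lms(num_agents=5, mu=0.02, w_true=2.0, noise_std=0.1,
                      iters=3000, seed=0):
    """Adapt-then-combine diffusion sketch for a scalar parameter.

    Assumptions (hypothetical): each node k observes d = u * w_true + v,
    nodes sit on a ring, and each averages itself with its two neighbors.
    """
    rng = random.Random(seed)
    w = [0.0] * num_agents
    for _ in range(iters):
        # Adapt: every node takes a local LMS (stochastic-gradient) step.
        psi = []
        for k in range(num_agents):
            u = rng.gauss(0.0, 1.0)
            d = u * w_true + rng.gauss(0.0, noise_std)
            psi.append(w[k] + mu * u * (d - u * w[k]))
        # Combine: uniform averaging over the ring neighborhood.
        # psi[k - 1] wraps to the last node when k == 0.
        w = [(psi[k - 1] + psi[k] + psi[(k + 1) % num_agents]) / 3.0
             for k in range(num_agents)]
    return w
```

No node needs a cyclic path through the network here; each one only exchanges its intermediate estimate with its immediate neighbors, which is the contrast with incremental methods that the abstract draws.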

    On the topology of free paratopological groups

    The result often known as Joiner's lemma is fundamental in understanding the topology of the free topological group $F(X)$ on a Tychonoff space $X$. In this paper, an analogue of Joiner's lemma for the free paratopological group $FP(X)$ on a $T_1$ space $X$ is proved. Using this, it is shown that the following conditions are equivalent for a space $X$: (1) $X$ is $T_1$; (2) $FP(X)$ is $T_1$; (3) the subspace $X$ of $FP(X)$ is closed; (4) the subspace $X^{-1}$ of $FP(X)$ is discrete; (5) the subspace $X^{-1}$ is $T_1$; (6) the subspace $X^{-1}$ is closed; and (7) the subspace $FP_n(X)$ is closed for all $n \in \mathbb{N}$, where $FP_n(X)$ denotes the subspace of $FP(X)$ consisting of all words of length at most $n$.
    Comment: http://blms.oxfordjournals.org/cgi/content/abstract/bds031?ijkey=9Su2bYV9e19JMxf&keytype=re

    Linear Convergence of Primal-Dual Gradient Methods and their Performance in Distributed Optimization

    In this work, we revisit a classical incremental implementation of the primal-descent dual-ascent gradient method used for the solution of equality-constrained optimization problems. We provide a short proof that establishes the linear (exponential) convergence of the algorithm for smooth strongly-convex cost functions and study its relation to the non-incremental implementation. We also study the effect of the augmented Lagrangian penalty term on the performance of distributed optimization algorithms for the minimization of aggregate cost functions over multi-agent networks.
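A minimal sketch of a primal-descent dual-ascent iteration may help make the abstract concrete. The toy problem, minimizing $\tfrac{1}{2}(x^2 + y^2)$ subject to $x + y = 1$, is an assumption chosen for illustration, as are the step size and function name. "Incremental" is interpreted here as the dual step using the freshly updated primal iterate; that reading is also an assumption and may differ from the paper's exact formulation. The solution of this toy problem is $x = y = 0.5$ with multiplier $\lambda = -0.5$.

```python
def primal_dual(mu=0.1, iters=500):
    """Primal-descent dual-ascent on a toy equality-constrained problem.

    Assumed (hypothetical) problem: minimize 0.5 * (x^2 + y^2)
    subject to x + y = 1, with Lagrangian
    L(x, y, lam) = 0.5 * (x^2 + y^2) + lam * (x + y - 1).
    """
    x, y, lam = 0.0, 0.0, 0.0
    for _ in range(iters):
        # Primal descent: gradient of L with respect to x and y.
        x -= mu * (x + lam)
        y -= mu * (y + lam)
        # Dual ascent using the updated primal iterate (incremental form).
        lam += mu * (x + y - 1.0)
    return x, y, lam
```

For this cost, which is smooth and strongly convex, the error contracts geometrically, which is the kind of linear convergence the abstract refers to.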

    Sparse Distributed Learning Based on Diffusion Adaptation

    This article proposes diffusion LMS strategies for distributed estimation over adaptive networks that are able to exploit sparsity in the underlying system model. The approach relies on convex regularization, common in compressive sensing, to enhance the detection of sparsity via a diffusive process over the network. The resulting algorithms endow networks with learning abilities and allow them to learn the sparse structure from the incoming data in real-time, and also to track variations in the sparsity of the model. We provide convergence and mean-square performance analysis of the proposed method and show under what conditions it outperforms the unregularized diffusion version. We also show how to adaptively select the regularization parameter. Simulation results illustrate the advantage of the proposed filters for sparse data recovery.
    Comment: to appear in IEEE Trans. on Signal Processing, 201