Performance Limits of Stochastic Sub-Gradient Learning, Part II: Multi-Agent Case
The analysis in Part I revealed interesting properties for subgradient
learning algorithms in the context of stochastic optimization when gradient
noise is present. These algorithms are used when the risk functions are
non-smooth and involve non-differentiable components. They have long been
recognized as slowly converging methods. However, it was revealed in Part I
that the rate of convergence becomes linear for stochastic optimization
problems, with the error iterate converging at an exponential rate O(α^i)
to within an O(μ)-neighborhood of the optimizer, for some α ∈ (0,1) and small
step-size μ. The conclusion was established under weaker
assumptions than the prior literature and, moreover, several important problems
(such as LASSO, SVM, and Total Variation) were shown to satisfy these weaker
assumptions automatically (but not the previously used conditions from the
literature). These results revealed that sub-gradient learning methods have
more favorable behavior than originally thought when used to enable continuous
adaptation and learning. The results of Part I were exclusive to single-agent
adaptation. The purpose of the current Part II is to examine the implications
of these discoveries when a collection of networked agents employs subgradient
learning as their cooperative mechanism. The analysis will show that, despite
the coupled dynamics that arises in a networked scenario, the agents are still
able to attain linear convergence in the stochastic case; they are also able to
reach agreement within O(μ) of the optimizer.
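The constant step-size subgradient recursion underlying these results can be illustrated with a minimal, hypothetical sketch; the scalar absolute-loss model, noise level, and step-size below are assumptions for illustration, not the paper's setting:

```python
# Hypothetical illustration: stochastic subgradient learning with a
# constant step-size on a non-smooth risk E|w - d|, whose minimizer is
# the target theta. The iterate settles in a small neighborhood of the
# optimizer rather than converging exactly, mirroring the O(mu) result.
import random

def stochastic_subgradient(theta=1.0, mu=0.01, iters=5000, seed=0):
    """Minimize E|w - d| with d = theta + noise via sign subgradients."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(iters):
        d = theta + rng.gauss(0.0, 0.1)   # noisy streaming observation
        g = 1.0 if w > d else -1.0        # subgradient of |w - d| at w
        w -= mu * g                       # constant step-size update
    return w

w = stochastic_subgradient()
# w fluctuates within a small step-size-dependent band around theta = 1.0
```

The constant step-size is what enables continuous adaptation: the iterate keeps tracking the optimizer if it drifts, at the price of a persistent O(μ)-sized fluctuation.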
Diffusion Adaptation Strategies for Distributed Optimization and Learning over Networks
We propose an adaptive diffusion mechanism to optimize a global cost function
in a distributed manner over a network of nodes. The cost function is assumed
to consist of a collection of individual components. Diffusion adaptation
allows the nodes to cooperate and diffuse information in real-time; it also
helps alleviate the effects of stochastic gradient noise and measurement noise
through a continuous learning process. We analyze the mean-square-error
performance of the algorithm in some detail, including its transient and
steady-state behavior. We also apply the diffusion algorithm to two problems:
distributed estimation with sparse parameters and distributed localization.
Compared to well-studied incremental methods, diffusion methods do not require
the use of a cyclic path over the nodes and are robust to node and link
failure. Diffusion methods also endow networks with adaptation abilities that
enable the individual nodes to continue learning even when the cost function
changes with time. Examples involving such dynamic cost functions with moving
targets are common in the context of biological networks.
Comment: 34 pages, 6 figures, to appear in IEEE Transactions on Signal Processing, 201
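A diffusion strategy of the adapt-then-combine (ATC) form can be sketched as follows; the ring topology, uniform combination weights, and scalar signal model here are illustrative assumptions, not the configuration analyzed in the paper:

```python
# Sketch of adapt-then-combine (ATC) diffusion LMS: each node first adapts
# using its local data, then combines the intermediate estimates of its
# neighbors. No cyclic path over the nodes is required.
import random

def atc_diffusion_lms(theta=2.0, mu=0.05, iters=2000, seed=1):
    rng = random.Random(seed)
    N = 4
    # ring topology; each node averages itself and its two neighbors
    neighbors = {k: [(k - 1) % N, k, (k + 1) % N] for k in range(N)}
    w = [0.0] * N                                # local estimates
    for _ in range(iters):
        psi = []
        for k in range(N):
            u = rng.gauss(0.0, 1.0)              # local regressor
            d = theta * u + rng.gauss(0.0, 0.1)  # noisy local measurement
            psi.append(w[k] + mu * u * (d - u * w[k]))  # adapt step
        # combine step with uniform weights over the neighborhood
        w = [sum(psi[l] for l in neighbors[k]) / 3.0 for k in range(N)]
    return w

w = atc_diffusion_lms()
```

Because every update uses only neighborhood communication, the scheme remains well defined if a node or link drops out: the failed entries simply disappear from the combine step.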
On the topology of free paratopological groups
The result often known as Joiner's lemma is fundamental in understanding the
topology of the free topological group F(X) on a Tychonoff space X. In this
paper, an analogue of Joiner's lemma for the free paratopological group
FP(X) on a space X is proved. Using this, it is shown that the
following conditions are equivalent for a space X: (1) X is T_1; (2)
FP(X) is T_1; (3) the subspace X of FP(X) is closed; (4) the subspace
X^{-1} of FP(X) is discrete; (5) the subspace X^{-1} is T_1; (6) the
subspace X^{-1} is closed; and (7) the subspace FP_n(X) is closed for all
n, where FP_n(X) denotes the subspace of FP(X) consisting of all
words of length at most n.
Comment: http://blms.oxfordjournals.org/cgi/content/abstract/bds031?ijkey=9Su2bYV9e19JMxf&keytype=re
Linear Convergence of Primal-Dual Gradient Methods and their Performance in Distributed Optimization
In this work, we revisit a classical incremental implementation of the
primal-descent dual-ascent gradient method used for the solution of equality
constrained optimization problems. We provide a short proof that establishes
the linear (exponential) convergence of the algorithm for smooth
strongly-convex cost functions and study its relation to the non-incremental
implementation. We also study the effect of the augmented Lagrangian penalty
term on the performance of distributed optimization algorithms for the
minimization of aggregate cost functions over multi-agent networks.
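A minimal sketch of the primal-descent dual-ascent iteration on a toy equality-constrained quadratic program, with the updated primal iterate used in the dual step; the problem data and step-size are assumptions for illustration, and the augmented-Lagrangian penalty term studied in the paper is omitted here:

```python
# Primal-descent / dual-ascent gradient iteration for
#   min 0.5*(x1^2 + x2^2)  subject to  x1 + x2 = 2,
# whose optimum is (x1, x2) = (1, 1) with multiplier lambda = -1.
# For a smooth strongly-convex cost, the iterates converge linearly
# (exponentially fast) to the saddle point.
def primal_dual(mu=0.1, iters=500):
    x1 = x2 = 0.0
    lam = 0.0
    for _ in range(iters):
        g1 = x1 + lam                          # d/dx1 of the Lagrangian
        g2 = x2 + lam                          # d/dx2 of the Lagrangian
        x1, x2 = x1 - mu * g1, x2 - mu * g2    # primal descent step
        lam += mu * (x1 + x2 - 2.0)            # dual ascent on the constraint
    return x1, x2, lam

x1, x2, lam = primal_dual()
```

Using the freshly updated primal variables inside the dual update is the incremental flavor of the method; the non-incremental variant would evaluate the constraint residual at the previous iterate instead.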
Sparse Distributed Learning Based on Diffusion Adaptation
This article proposes diffusion LMS strategies for distributed estimation
over adaptive networks that are able to exploit sparsity in the underlying
system model. The approach relies on convex regularization, common in
compressive sensing, to enhance the detection of sparsity via a diffusive
process over the network. The resulting algorithms endow networks with learning
abilities and allow them to learn the sparse structure from the incoming data
in real-time, and also to track variations in the sparsity of the model. We
provide convergence and mean-square performance analysis of the proposed method
and show under what conditions it outperforms the unregularized diffusion
version. We also show how to adaptively select the regularization parameter.
Simulation results illustrate the advantage of the proposed filters for sparse
data recovery.
Comment: to appear in IEEE Trans. on Signal Processing, 201
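One common way to realize convex sparsity regularization in an LMS-type recursion is a soft-threshold (shrinkage) correction after the adapt step; the sketch below is an illustrative variant under that assumption, with a hypothetical topology and signal model, not necessarily the exact algorithm of the article:

```python
# Illustrative sparse diffusion LMS: adapt with local data, shrink the
# intermediate estimate toward zero (l1 proximal step), then combine
# with the neighbors' estimates. The true model is sparse.
import random

def soft(x, t):
    """Soft-thresholding operator, the proximal map of t*|x|."""
    return max(abs(x) - t, 0.0) * (1.0 if x >= 0 else -1.0)

def sparse_diffusion_lms(mu=0.05, rho=0.002, iters=3000, seed=2):
    rng = random.Random(seed)
    theta = [1.0, 0.0, 0.0, -0.5]        # sparse true parameter vector
    N, M = 3, len(theta)                 # 3 fully connected nodes
    w = [[0.0] * M for _ in range(N)]
    for _ in range(iters):
        psi = []
        for k in range(N):
            u = [rng.gauss(0.0, 1.0) for _ in range(M)]          # regressor
            d = sum(ui * ti for ui, ti in zip(u, theta)) + rng.gauss(0.0, 0.1)
            e = d - sum(ui * wi for ui, wi in zip(u, w[k]))      # local error
            # adapt, then shrink each entry toward zero to exploit sparsity
            psi.append([soft(wi + mu * ui * e, mu * rho)
                        for wi, ui in zip(w[k], u)])
        # combine with uniform weights over all nodes
        w = [[sum(psi[l][m] for l in range(N)) / N for m in range(M)]
             for k in range(N)]
    return w[0]

w = sparse_diffusion_lms()
```

The threshold mu * rho plays the role of the regularization parameter: too small and the zero entries are not suppressed, too large and the nonzero entries are biased toward zero, which is why adapting it online matters.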
