    Distributed Dictionary Learning

    The paper studies distributed Dictionary Learning (DL) problems where the learning task is distributed over a multi-agent network with time-varying (nonsymmetric) connectivity. This formulation is relevant, for instance, in big-data scenarios where massive amounts of data are collected or stored in different spatial locations and it is infeasible to aggregate and/or process all the data in a fusion center, due to resource limitations, communication overhead, or privacy considerations. We develop a general distributed algorithmic framework for the (nonconvex) DL problem and establish its asymptotic convergence. The new method hinges on Successive Convex Approximation (SCA) techniques coupled with i) a gradient tracking mechanism, instrumental in locally estimating the missing global information; and ii) a consensus step, as a mechanism to distribute the computations among the agents. To the best of our knowledge, this is the first distributed algorithm with provable convergence for the DL problem and, more generally, for bi-convex optimization problems over (time-varying) directed graphs.
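
    The gradient tracking and consensus steps named above can be illustrated on a much simpler problem than DL. The sketch below is not the paper's algorithm: it assumes a static, undirected ring network with doubly stochastic weights, scalar quadratic losses f_i(x) = 0.5 * c_i * (x - a_i)^2, and a plain gradient step in place of an SCA surrogate. All names (`ring_weights`, `mix`, `gradient_tracking`) and parameter values are hypothetical.

```python
# Illustrative gradient tracking on scalar quadratics (assumed setup, not the paper's method).

def ring_weights(n):
    """Doubly stochastic weights on a ring: 1/3 for self and each of the two neighbors."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W

def mix(W, v):
    """Consensus step: each agent takes a weighted average of its neighbors' values."""
    n = len(v)
    return [sum(W[i][j] * v[j] for j in range(n)) for i in range(n)]

def gradient_tracking(a, c, alpha=0.02, iters=3000):
    """Each agent i holds f_i(x) = 0.5 * c[i] * (x - a[i])**2 and a tracker y[i]
    that estimates the network-average gradient."""
    n = len(a)
    W = ring_weights(n)
    grad = lambda x: [c[i] * (x[i] - a[i]) for i in range(n)]
    x = [0.0] * n
    g = grad(x)
    y = g[:]  # trackers start at the local gradients
    for _ in range(iters):
        wx, wy = mix(W, x), mix(W, y)
        x_new = [wx[i] - alpha * y[i] for i in range(n)]
        g_new = grad(x_new)
        # Tracking update: averaged tracker plus the change in the local gradient.
        y = [wy[i] + g_new[i] - g[i] for i in range(n)]
        x, g = x_new, g_new
    return x
```

    For example, `gradient_tracking([1, 2, 3, 4, 5], [1, 2, 3, 1, 2])` drives every agent toward the minimizer of the sum of the local losses, which is the curvature-weighted average of the `a[i]`.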

    Fast ADMM Algorithm for Distributed Optimization with Adaptive Penalty

    We propose new methods to speed up convergence of the Alternating Direction Method of Multipliers (ADMM), a common optimization tool in large-scale and distributed learning. The proposed method accelerates convergence by automatically deciding, in each iteration, the constraint penalty needed for parameter consensus. We also propose an extension of the method that adaptively determines the maximum number of iterations during which the penalty is updated. We show that this approach effectively leads to an adaptive, dynamic network topology underlying the distributed optimization. The utility of the new penalty update schemes is demonstrated on both synthetic and real data, including a computer vision application of distributed structure from motion. Comment: 8 pages manuscript, 2 pages appendix, 5 figure
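
    To make the idea of a per-iteration penalty update concrete, here is a minimal sketch of global-consensus ADMM with the standard residual-balancing heuristic. This is a swapped-in, well-known rule and not the penalty update scheme proposed in the paper. The toy problem (each agent i holds f_i(x) = 0.5 * (x - a_i)^2), the function name `consensus_admm`, and the values of `mu`, `tau`, `adapt_iters`, `rho_min`, and `rho_max` are all assumptions made for illustration.

```python
import math

def consensus_admm(a, rho=1.0, iters=400, adapt_iters=50,
                   mu=10.0, tau=2.0, rho_min=0.1, rho_max=10.0):
    """Scaled-form consensus ADMM for min sum_i 0.5*(x_i - a[i])**2 s.t. x_i = z.
    The penalty rho is adapted by residual balancing for the first adapt_iters
    iterations only (an assumed cap), then kept fixed."""
    n = len(a)
    x = [0.0] * n
    u = [0.0] * n  # scaled dual variables
    z = 0.0
    for k in range(iters):
        # Local closed-form minimization at each agent.
        x = [(a[i] + rho * (z - u[i])) / (1.0 + rho) for i in range(n)]
        z_old = z
        z = sum(x[i] + u[i] for i in range(n)) / n  # consensus variable
        u = [u[i] + x[i] - z for i in range(n)]     # dual ascent
        r = math.sqrt(sum((xi - z) ** 2 for xi in x))   # primal residual
        s = rho * math.sqrt(n) * abs(z - z_old)         # dual residual
        if k < adapt_iters:
            new_rho = rho
            if r > mu * s:
                new_rho = min(rho * tau, rho_max)
            elif s > mu * r:
                new_rho = max(rho / tau, rho_min)
            # Rescale the scaled duals so the unscaled dual rho*u is unchanged.
            u = [ui * rho / new_rho for ui in u]
            rho = new_rho
    return x, z, rho
```

    Calling `consensus_admm([1.0, 4.0, 7.0])` returns the local variables, the consensus value (the average of the `a[i]` for this toy problem), and the final penalty.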

    On the Convergence of Decentralized Gradient Descent

    Consider the consensus problem of minimizing $f(x)=\sum_{i=1}^n f_i(x)$, where each $f_i$ is known only to one individual agent $i$ in a connected network of $n$ agents. All the agents collaboratively solve this problem and obtain the solution, subject to data exchanges restricted to neighboring agents. Such algorithms avoid the need for a fusion center, offer better network load balance, and improve data privacy. We study the decentralized gradient descent method, in which each agent $i$ updates its variable $x_{(i)}$, a local approximation of the unknown variable $x$, by combining the average of its neighbors' variables with the negative gradient step $-\alpha \nabla f_i(x_{(i)})$. The iteration is $x_{(i)}(k+1) \gets \sum_{\text{neighbor } j \text{ of } i} w_{ij} x_{(j)}(k) - \alpha \nabla f_i(x_{(i)}(k))$ for each agent $i$, where the averaging coefficients form a symmetric doubly stochastic matrix $W=[w_{ij}] \in \mathbb{R}^{n \times n}$. We analyze the convergence of this iteration and derive its convergence rate, assuming that each $f_i$ is proper closed convex and lower bounded, $\nabla f_i$ is Lipschitz continuous with constant $L_{f_i}$, and the stepsize $\alpha$ is fixed. Provided that $\alpha < O(1/L_h)$, where $L_h=\max_i\{L_{f_i}\}$, the objective error at the averaged solution, $f(\frac{1}{n}\sum_i x_{(i)}(k))-f^*$, decreases at a rate of $O(1/k)$ until it reaches $O(\alpha)$. If the $f_i$ are further (restricted) strongly convex, then both $\frac{1}{n}\sum_i x_{(i)}(k)$ and each $x_{(i)}(k)$ converge to the global minimizer $x^*$ at a linear rate until reaching an $O(\alpha)$-neighborhood of $x^*$. We also develop an iteration for decentralized basis pursuit and establish its linear convergence to an $O(\alpha)$-neighborhood of the true unknown sparse signal.
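
    The iteration above can be sketched in a few lines. The example assumes scalar quadratic losses f_i(x) = 0.5 * (x - a_i)^2 and a ring network whose symmetric doubly stochastic matrix W places weight 1/3 on each agent and on each of its two neighbors (so the sum includes the agent's own variable). The names `ring_weights` and `dgd` and the parameter values are hypothetical.

```python
# Decentralized gradient descent on scalar quadratics (illustrative assumptions only).

def ring_weights(n):
    """Symmetric doubly stochastic W on a ring: 1/3 for self and each neighbor."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W

def dgd(a, alpha=0.05, iters=3000):
    """x_i(k+1) = sum_j w_ij * x_j(k) - alpha * grad f_i(x_i(k)),
    with f_i(x) = 0.5 * (x - a[i])**2, so grad f_i(x) = x - a[i]."""
    n = len(a)
    W = ring_weights(n)
    x = [0.0] * n
    for _ in range(iters):
        avg = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [avg[i] - alpha * (x[i] - a[i]) for i in range(n)]
    return x
```

    With a fixed stepsize the local variables do not agree exactly: in this toy setting their average settles at the minimizer of the sum, while each individual `x[i]` stays at a distance from it that shrinks as `alpha` is reduced, in line with the O(α)-neighborhood behavior described above.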

    New and Provable Results for Network Inference Problems and Multi-agent Optimization Algorithms

    Our ability to understand networks is important to many applications, from the analysis and modeling of biological networks to the analysis of social networks. Unveiling network dynamics allows us to make predictions and decisions. Moreover, network dynamics models have inspired new ideas for computational methods involving multi-agent cooperation, offering effective solutions for optimization tasks. This dissertation presents new theoretical results on network inference and multi-agent optimization, split into two parts. The first part deals with modeling and identification of network dynamics. I study two types of network dynamics arising from social and gene networks. Based on the network dynamics, the proposed network identification method works like a 'network RADAR', meaning that interaction strengths between agents are inferred by injecting a 'signal' into the network and observing the resultant reverberation. In social networks, this is accomplished by stubborn agents whose opinions do not change throughout a discussion. In gene networks, genes are suppressed to create desired perturbations. The steady states under these perturbations are characterized. In contrast to the common assumption of full-rank input, I adopt a weaker assumption in which low-rank input is used, to better model the empirical network data. Importantly, a network is proven to be identifiable from low-rank data whose rank grows proportionally to the network's sparsity. The proposed method is applied to synthetic and empirical data, and is shown to offer superior performance compared to prior work. The second part is concerned with algorithms on networks. I develop three consensus-based algorithms for multi-agent optimization. The first method is a decentralized Frank-Wolfe (DeFW) algorithm. The main advantage of DeFW lies in its projection-free nature: the costly projection step of traditional algorithms is replaced by a low-cost linear optimization step. I prove the convergence rates of DeFW for convex and non-convex problems. I also develop two consensus-based alternating optimization algorithms, one for least squares problems and one for non-convex problems. These algorithms exploit the problem structure for faster convergence, and their efficacy is demonstrated by numerical simulations. I conclude this dissertation by describing future research directions. Dissertation/Thesis. Doctoral Dissertation, Electrical Engineering 201
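
    The projection-free idea behind Frank-Wolfe methods can be shown with a small centralized example. This is not the DeFW algorithm from the dissertation: it runs on a single node, and the objective 0.5 * ||x - b||^2, the l1-ball constraint, the step size 2/(k+2), and the names `lmo_l1` and `frank_wolfe` are assumptions chosen for illustration.

```python
# Centralized Frank-Wolfe over an l1 ball (illustrative sketch, not DeFW).

def lmo_l1(grad, radius):
    """Linear minimization oracle for the l1 ball: the minimizer of <grad, s>
    over ||s||_1 <= radius is a signed vertex at the largest-magnitude entry."""
    i = max(range(len(grad)), key=lambda j: abs(grad[j]))
    s = [0.0] * len(grad)
    s[i] = -radius if grad[i] > 0 else radius
    return s

def frank_wolfe(b, radius=1.0, iters=50):
    """Minimize 0.5 * ||x - b||^2 subject to ||x||_1 <= radius without projections."""
    x = [0.0] * len(b)
    for k in range(iters):
        grad = [xi - bi for xi, bi in zip(x, b)]
        s = lmo_l1(grad, radius)  # cheap linear step replaces a projection
        gamma = 2.0 / (k + 2)     # standard diminishing step size
        # Convex combination keeps the iterate inside the feasible set.
        x = [(1.0 - gamma) * xi + gamma * si for xi, si in zip(x, s)]
    return x
```

    Every iterate is a convex combination of feasible points, so feasibility holds by construction, and the oracle costs only one pass over the gradient rather than a projection onto the l1 ball.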