
    Local Exact-Diffusion for Decentralized Optimization and Learning

    Distributed optimization methods with local updates have recently attracted significant attention due to their potential to reduce the communication cost of distributed methods. In these algorithms, a collection of nodes performs several local updates based on their local data, and the nodes then communicate with each other to exchange estimate information. While there have been many studies on distributed local methods with centralized network connections, there has been less work on decentralized networks. In this work, we propose and investigate a locally updated decentralized method called Local Exact-Diffusion (LED). We establish the convergence of LED in both the convex and the nonconvex settings under the stochastic online regime. Our convergence rate improves over the rate of existing decentralized methods. When we specialize the network to the centralized case, we recover the state-of-the-art bound for centralized methods. We also link LED to several other independently studied distributed methods, including Scaffnew, FedGate, and VRL-SGD. Additionally, we numerically investigate the benefits of local updates for decentralized networks and demonstrate the effectiveness of the proposed method.
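
    The sketch below illustrates the local-update structure described in the abstract: each node runs several gradient steps on its own data, and communication with neighbors happens only once per round, with an exact-diffusion-style correction term. The function name, step-sizes, and the exact placement of the correction are illustrative assumptions, not the paper's precise recursion.

```python
import numpy as np

def local_exact_diffusion(grads, A, x0, mu=0.01, local_steps=4, rounds=200):
    """Sketch of a locally updated exact-diffusion round.

    grads[i]: stochastic gradient oracle of node i, maps (d,) -> (d,)
    A:        n-by-n doubly stochastic combination (mixing) matrix
    x0:       common initial model, shape (d,)
    """
    n = len(grads)
    x = np.tile(x0, (n, 1))        # per-node models, shape (n, d)
    psi_prev = x.copy()            # previous adaptation output (correction state)
    for _ in range(rounds):
        psi = x.copy()
        for _ in range(local_steps):            # local updates: no communication
            for i in range(n):
                psi[i] -= mu * grads[i](psi[i])
        # one communication round: exact-diffusion correction, then combine
        x, psi_prev = A @ (psi + x - psi_prev), psi
    return x.mean(axis=0)
```

    With local_steps=1 the recursion collapses to a standard exact-diffusion iteration, which is one way to read LED as a local-update extension of that method.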

    Linear Convergence of Primal-Dual Gradient Methods and their Performance in Distributed Optimization

    In this work, we revisit a classical incremental implementation of the primal-descent dual-ascent gradient method used for the solution of equality-constrained optimization problems. We provide a short proof that establishes the linear (exponential) convergence of the algorithm for smooth, strongly convex cost functions and study its relation to the non-incremental implementation. We also study the effect of the augmented-Lagrangian penalty term on the performance of distributed optimization algorithms for the minimization of aggregate cost functions over multi-agent networks.
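
    As a concrete toy instance of the iteration analyzed above, the sketch below runs an incremental primal-descent dual-ascent recursion on an equality-constrained quadratic; the step-sizes and the example problem are illustrative choices, not values from the paper.

```python
import numpy as np

def incremental_primal_dual(grad_f, B, b, x0, mu_x=0.05, mu_l=0.05, iters=5000):
    """Primal-descent dual-ascent for min f(x) subject to Bx = b.

    The 'incremental' flavor sketched here updates the dual variable
    using the already-updated primal iterate (Gauss-Seidel order).
    """
    x = x0.copy()
    lam = np.zeros(B.shape[0])                    # one multiplier per constraint
    for _ in range(iters):
        x = x - mu_x * (grad_f(x) + B.T @ lam)    # primal descent on the Lagrangian
        lam = lam + mu_l * (B @ x - b)            # dual ascent with the updated x
    return x, lam

# toy instance: min 0.5*||x - c||^2 subject to sum(x) = 1
c = np.array([1.0, 2.0, 3.0])
B, b = np.ones((1, 3)), np.array([1.0])
x, lam = incremental_primal_dual(lambda x: x - c, B, b, np.zeros(3))
print(x, x.sum())   # x approaches c shifted to satisfy the constraint
```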

    Distributed Coupled Multi-Agent Stochastic Optimization

    This work develops effective distributed strategies for the solution of constrained multi-agent stochastic optimization problems with coupled parameters across the agents. In this formulation, each agent is influenced by only a subset of the entries of a global parameter vector or model, and is subject to convex constraints that are only known locally. Problems of this type arise in several applications, most notably in disease propagation models, minimum-cost flow problems, distributed control formulations, and distributed power system monitoring. This work focuses on stochastic settings, where a stochastic risk function is associated with each agent and the objective is to seek the minimizer of the aggregate sum of all risks subject to a set of constraints. Agents are not aware of the statistical distribution of the data and, therefore, can only rely on stochastic approximations in their learning strategies. We derive an effective distributed learning strategy that is able to track drifts in the underlying parameter model. A detailed performance and stability analysis is carried out showing that the resulting coupled diffusion strategy converges at a linear rate to an O(ÎŒ)-neighborhood of the true penalized optimizer.
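
    A minimal sketch of the coupled structure described above: each agent holds only the entries of the global model that influence it, takes a stochastic-gradient step on them, and then averages every entry it shares with a neighbor. The data layout and the plain averaging rule are simplifying assumptions for illustration; the paper's coupled diffusion strategy and its constraint handling are more elaborate.

```python
import numpy as np

def coupled_diffusion(grads, blocks, nbrs, w0, mu=0.01, iters=2000):
    """Sketch: agent k stores only the entries blocks[k] of the global model.

    grads[k]:  stochastic gradient oracle taking agent k's sub-vector
    blocks[k]: global coordinate indices that influence agent k
    nbrs[k]:   neighbors of agent k (assumed to include k itself)
    w0:        initial global model, shape (d,)
    """
    n = len(grads)
    w = [w0[np.asarray(blocks[k])].copy() for k in range(n)]   # local sub-vectors
    pos = [{l: t for t, l in enumerate(blocks[k])} for k in range(n)]
    for _ in range(iters):
        # adaptation: stochastic-gradient step on the locally relevant entries
        w = [wk - mu * grads[k](wk) for k, wk in enumerate(w)]
        # combination: average each entry with the neighbors that share it
        new_w = [wk.copy() for wk in w]
        for k in range(n):
            for t, l in enumerate(blocks[k]):
                sharers = [j for j in nbrs[k] if l in pos[j]]
                new_w[k][t] = np.mean([w[j][pos[j][l]] for j in sharers])
        w = new_w
    return w
```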

    A Linearly Convergent Proximal Gradient Algorithm for Decentralized Optimization

    Decentralized optimization is a powerful paradigm that finds applications in engineering and learning design. This work studies decentralized composite optimization problems with non-smooth regularization terms. Most existing gradient-based proximal decentralized methods are known to converge to the optimal solution with sublinear rates, and it remains unclear whether this family of methods can achieve global linear convergence. To tackle this problem, this work assumes the non-smooth regularization term is common across all networked agents, which is the case for many machine learning problems. Under this condition, we design a proximal gradient decentralized algorithm whose fixed point coincides with the desired minimizer. We then provide a concise proof that establishes its linear convergence. In the absence of the non-smooth term, our analysis technique covers the well-known EXTRA algorithm and provides useful bounds on the convergence rate and step-size.
    Comment: NeurIPS 2019
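
    Since the non-smooth term is common to all agents, its proximal operator can simply be applied after the combination step. The sketch below instantiates this idea with an l1 regularizer (soft-thresholding) on top of an exact-diffusion-style recursion; the placement of the proximal step and all parameter values are illustrative assumptions rather than the paper's precise algorithm.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t*||.||_1, applied entry-wise."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_decentralized(grads, A, x0, lam=0.1, mu=0.01, iters=500):
    """Sketch: exact-diffusion-style recursion followed by the shared prox.

    grads[i]: gradient oracle of agent i's smooth cost
    A:        n-by-n doubly stochastic combination matrix
    lam:      weight of the common l1 regularizer lam*||x||_1
    """
    n = len(grads)
    x = np.tile(x0, (n, 1))
    psi_prev = x.copy()
    for _ in range(iters):
        psi = x - mu * np.stack([grads[i](x[i]) for i in range(n)])
        z = A @ (psi + x - psi_prev)        # correct, then combine with neighbors
        x = soft_threshold(z, mu * lam)     # common proximal step at every agent
        psi_prev = psi
    return x.mean(axis=0)
```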

    A Proximal Diffusion Strategy for Multi-Agent Optimization with Sparse Affine Constraints

    This work develops a proximal primal-dual decentralized strategy for multi-agent optimization problems that involve multiple coupled affine constraints, where each constraint may involve only a subset of the agents. The constraints are generally sparse, meaning that only a small subset of the agents are involved in them. This scenario arises in many applications including decentralized control formulations, resource allocation problems, and smart grids. Traditional decentralized solutions tend to ignore the structure of the constraints and lead to degraded performance. We instead develop a decentralized solution that exploits the sparsity structure. Under constant step-size learning, the asymptotic convergence of the proposed algorithm is established in the presence of non-smooth terms, and it occurs at a linear rate in the smooth case. We also examine how the performance of the algorithm is influenced by the sparsity of the constraints. Simulations illustrate the superior performance of the proposed strategy.
    Comment: accepted for publication in IEEE Transactions on Automatic Control
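
    One way to see how the sparsity can be exploited: give each affine constraint its own multiplier and let only the agents appearing in that constraint update or read it. The saddle-point sketch below does exactly that; the data structures, step-sizes, and the plain (non-augmented, non-proximal) updates are simplifying assumptions, not the paper's algorithm.

```python
import numpy as np

def sparse_primal_dual(grads, cons, w0, mu=0.02, rho=0.02, iters=5000):
    """Sketch: one multiplier per affine constraint, touched only by the
    agents involved in that constraint.

    grads[k]: gradient oracle of agent k's cost at w[k]
    cons[c]:  dict with keys "agents" (involved agents), "B" (per-agent
              matrices, keyed by agent), and "b" (right-hand side), encoding
              sum over k in cons[c]["agents"] of B[k] @ w[k] == b
    """
    n = len(grads)
    w = [w0.copy() for _ in range(n)]
    y = [np.zeros(c["b"].size) for c in cons]       # one dual per constraint
    for _ in range(iters):
        for k in range(n):                          # primal descent
            corr = sum(c["B"][k].T @ y[i]
                       for i, c in enumerate(cons) if k in c["agents"])
            w[k] = w[k] - mu * (grads[k](w[k]) + corr)
        for i, c in enumerate(cons):                # dual ascent, sparse residual
            resid = sum(c["B"][k] @ w[k] for k in c["agents"]) - c["b"]
            y[i] = y[i] + rho * resid
    return w, y
```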

    Diffusion Stochastic Optimization for Min-Max Problems

    The optimistic gradient method is useful in addressing minimax optimization problems. Motivated by the observation that the conventional stochastic version suffers from the need for a large batch size on the order of O(Δ⁻ÂČ) to achieve an Δ-stationary solution, we introduce and analyze a new formulation termed Diffusion Stochastic Same-Sample Optimistic Gradient (DSS-OG). We prove its convergence and resolve the large-batch issue by establishing a tighter upper bound, under the more general setting of nonconvex Polyak-Ɓojasiewicz (PL) risk functions. We also extend the applicability of the proposed method to the distributed scenario, where agents communicate with their neighbors via a left-stochastic protocol. To implement DSS-OG, we can query the stochastic gradient oracles in parallel with some extra memory overhead, resulting in a complexity comparable to its conventional counterpart. To demonstrate the efficacy of the proposed algorithm, we conduct tests by training generative adversarial networks.
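
    The single-agent core of the idea, querying the stochastic gradients at the current and the previous iterate on the same sample, can be sketched as follows. The oracle interface and step-size are illustrative assumptions, and the decentralized (left-stochastic) communication layer described above is omitted.

```python
import numpy as np

def same_sample_optimistic(grad_xy, sample, x0, y0, mu=0.05, iters=2000):
    """Single-agent core of a same-sample optimistic gradient method for
    min_x max_y E[f(x, y; xi)].

    grad_xy(x, y, xi) -> (gx, gy): stochastic gradients on sample xi
    sample() -> xi:                draws one fresh data sample
    """
    x, y = x0.copy(), y0.copy()
    x_prev, y_prev = x.copy(), y.copy()
    for _ in range(iters):
        xi = sample()
        gx, gy = grad_xy(x, y, xi)               # current iterate, sample xi
        gxp, gyp = grad_xy(x_prev, y_prev, xi)   # previous iterate, SAME xi
        x_prev, y_prev = x.copy(), y.copy()
        x = x - mu * (2.0 * gx - gxp)            # optimistic descent in x
        y = y + mu * (2.0 * gy - gyp)            # optimistic ascent in y
    return x, y
```

    The two oracle calls per iteration are independent, so, as the abstract notes, they can be issued in parallel at the cost of storing the previous iterate.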

    On the Influence of Bias-Correction on Distributed Stochastic Optimization

    Various bias-correction methods, such as EXTRA, gradient-tracking methods, and exact diffusion, have recently been proposed to solve distributed deterministic optimization problems. These methods employ constant step-sizes and converge linearly to the exact solution under proper conditions. However, their performance under stochastic and adaptive settings is less explored. It is still unknown whether, when, and why these bias-correction methods can outperform their traditional counterparts (such as consensus and diffusion) with noisy gradients and constant step-sizes. This work studies the performance of exact diffusion under the stochastic and adaptive setting, and provides conditions under which exact diffusion achieves superior steady-state mean-square-deviation (MSD) performance compared to traditional algorithms without bias-correction. In particular, it is proven that this superiority is more evident over sparsely connected network topologies such as lines, cycles, or grids. Conditions are also provided under which the exact diffusion method merely matches or may even degrade the performance of traditional methods. Simulations are provided to validate the theoretical findings.
    Comment: 17 pages, 9 figures, submitted for publication
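
    The sketch below contrasts plain diffusion (adapt-then-combine) with exact diffusion on noisy heterogeneous quadratics over a ring, one of the sparsely connected topologies for which the analysis above predicts the bias-corrected method to do well. The problem data, topology, and step-size are illustrative, so the printed steady-state MSD values are indicative only.

```python
import numpy as np

# noisy heterogeneous quadratics f_i(w) = 0.5*(w - c_i)' H_i (w - c_i) on a ring
rng = np.random.default_rng(0)
n, d, mu, sigma, iters = 10, 5, 0.02, 0.1, 5000
H = [(1.0 + rng.random()) * np.eye(d) for _ in range(n)]   # local Hessians
c = [rng.standard_normal(d) for _ in range(n)]             # local minimizers
w_star = np.linalg.solve(sum(H), sum(Hi @ ci for Hi, ci in zip(H, c)))

A = np.zeros((n, n))                                       # ring topology
for i in range(n):
    A[i, i] = A[i, (i - 1) % n] = A[i, (i + 1) % n] = 1.0 / 3.0

def g(i, w):                                               # noisy local gradient
    return H[i] @ (w - c[i]) + sigma * rng.standard_normal(d)

x_dif = np.zeros((n, d))                                   # plain diffusion
x_ed = np.zeros((n, d)); psi_prev = x_ed.copy()            # exact diffusion
for _ in range(iters):
    x_dif = A @ np.stack([x_dif[i] - mu * g(i, x_dif[i]) for i in range(n)])
    psi = np.stack([x_ed[i] - mu * g(i, x_ed[i]) for i in range(n)])
    x_ed, psi_prev = A @ (psi + x_ed - psi_prev), psi      # bias-corrected

for name, x in (("diffusion", x_dif), ("exact diffusion", x_ed)):
    print(f"{name:16s} steady-state MSD ~ {np.mean((x - w_star) ** 2):.2e}")
```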