
    Distributed Stochastic Optimization with Gradient Tracking over Time-Varying Directed Networks

    We study a distributed method called SAB-TV, which employs gradient tracking to collaboratively minimize the sum of smooth and strongly convex local cost functions for networked agents communicating over a time-varying directed graph. Each agent, assumed to have access to a stochastic first-order oracle that returns an unbiased estimate of the gradient of its local cost function, maintains an auxiliary variable to asymptotically track the stochastic gradient of the global cost. The decision estimate and the gradient tracker are updated over time through limited information exchange with local neighbors, using row- and column-stochastic weights to guarantee both consensus and optimality. With a sufficiently small constant step-size, we show that, in expectation, SAB-TV converges linearly to a neighborhood of the optimal solution. Numerical simulations illustrate the effectiveness of the proposed algorithm.
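    To make the mechanism concrete, below is a minimal sketch of stochastic gradient tracking with row- and column-stochastic weights over a time-varying directed ring, in the spirit of the method described above. It is not the authors' SAB-TV implementation; the quadratic local costs, the two alternating ring digraphs, the step-size, and the noise level are all illustrative assumptions.

```python
# Illustrative sketch of stochastic gradient tracking with row-/column-stochastic
# weights over a time-varying directed ring (NOT the authors' SAB-TV code).
# Assumed setup: local costs f_i(x) = 0.5 * ||x - b_i||^2 with additive gradient noise.
import numpy as np

rng = np.random.default_rng(0)
n, d, alpha, noise = 8, 3, 0.05, 0.01
b = rng.normal(size=(n, d))                  # local targets; the optimum is mean(b)

def stoch_grad(x):
    """Unbiased stochastic gradients of the local costs, one row per agent."""
    return (x - b) + noise * rng.normal(size=x.shape)

def weights(shift):
    """Row-stochastic A and column-stochastic B for a directed ring in which agent i
    sends to agent (i + shift) mod n; changing `shift` mimics a time-varying graph."""
    P = np.roll(np.eye(n), shift, axis=1)    # permutation: P[i, (i + shift) % n] = 1
    A = 0.5 * (np.eye(n) + P)                # rows sum to 1
    B = 0.5 * (np.eye(n) + P.T)              # columns sum to 1
    return A, B

x = rng.normal(size=(n, d))
g = stoch_grad(x)
y = g.copy()                                 # tracker initialized at the local gradients
for k in range(2000):
    A, B = weights(1 if k % 2 == 0 else 3)   # alternate between two strongly connected digraphs
    x_new = A @ x - alpha * y                # consensus step plus tracked-gradient descent
    g_new = stoch_grad(x_new)
    y = B @ y + g_new - g                    # gradient tracking update
    x, g = x_new, g_new

print("distance to optimum:", np.linalg.norm(x - b.mean(axis=0)))
```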

    Robust Fully-Asynchronous Methods for Distributed Training over General Architecture

    Perfect synchronization in distributed machine learning is inefficient and often impossible due to latency, packet losses, and stragglers. We propose a Robust Fully-Asynchronous Stochastic Gradient Tracking method (R-FAST), in which each device performs local computation and communication at its own pace without any form of synchronization. Unlike existing asynchronous distributed algorithms, R-FAST eliminates the impact of data heterogeneity across devices and tolerates packet losses by employing a robust gradient tracking strategy that relies on properly designed auxiliary variables for tracking and buffering the overall gradient vector. More importantly, the proposed method utilizes two spanning-tree graphs for communication, as long as the two share at least one common root, which enables flexible designs of the communication architecture. We show that R-FAST converges in expectation to a neighborhood of the optimum at a geometric rate for smooth and strongly convex objectives, and to a stationary point at a sublinear rate in general non-convex settings. Extensive experiments demonstrate that R-FAST runs 1.5-2 times faster than synchronous benchmark algorithms, such as Ring-AllReduce and D-PSGD, while achieving comparable accuracy, and outperforms existing state-of-the-art asynchronous algorithms, such as AD-PSGD and OSGP, especially in the presence of stragglers.
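    The robustness to packet losses hinges on buffered, cumulative communication: a sender transmits a running total of everything it has emitted, and the receiver applies only the increment since its last successful reception, so dropped packets are recovered later rather than lost. The sketch below illustrates only this bookkeeping on a single link; it is not the full R-FAST recursion, and the 30% drop rate and variable names are illustrative assumptions.

```python
# Sketch of the mass-buffering idea behind robustness to packet losses on one link.
import numpy as np

rng = np.random.default_rng(1)
drop_prob = 0.3
sigma = 0.0      # sender side: running total of all mass emitted so far (the buffer)
rho = 0.0        # receiver side: last running total successfully received
received = 0.0   # total mass the receiver has accumulated

for t in range(1000):
    sigma += rng.random()            # new mass to send (e.g., a gradient-tracking increment)
    if rng.random() > drop_prob:     # packet gets through
        received += sigma - rho      # apply everything missed since the last delivery
        rho = sigma

# Despite ~30% packet losses, no mass is permanently lost: `received` lags `sigma`
# only by the increments emitted since the most recent successful delivery.
print(sigma, received)
```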

    Fully Distributed Nash Equilibrium Seeking in N-Cluster Games

    Distributed optimization and Nash equilibrium (NE) seeking problems have drawn much attention in the control community recently. This paper studies a class of non-cooperative games, known as N-cluster games, which subsumes both the cooperative and non-cooperative natures of the two problems: the agents solve a distributed optimization problem within each cluster, while the clusters play a non-cooperative game against one another. Moreover, we consider a partial-decision information setup, i.e., the agents do not have direct access to other agents' decisions and hence need to communicate with each other through a directed graph whose associated adjacency matrix is assumed to be non-doubly stochastic. To solve the N-cluster game, we propose a fully distributed NE seeking algorithm that combines leader-following consensus with gradient tracking: the leader-following consensus protocol estimates the other agents' decisions, while gradient tracking is employed to track a weighted average of the agents' gradients. Furthermore, the algorithm uses uncoordinated constant step-sizes, which allows each agent to choose its own preferred step-size instead of a uniform coordinated one. We prove that all agents' decisions converge linearly to the corresponding NE so long as the largest step-size and the heterogeneity of the step-sizes are sufficiently small. We verify the derived results through a numerical example of a Cournot competition game.
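    As a concrete (centralized) illustration of the kind of equilibrium such an algorithm seeks, the sketch below runs simultaneous gradient play on a small Cournot-style game with a few clusters, where each cluster chooses a production level, pays a quadratic production cost, and earns revenue at a price that falls with total production. The coefficients are illustrative assumptions, and the distributed, partial-information machinery of the paper is deliberately omitted.

```python
# Centralized gradient play on a small Cournot-style game (illustrative only).
import numpy as np

N, P0 = 3, 10.0                      # number of clusters and base price
c = np.array([1.0, 1.5, 2.0])        # per-cluster production cost coefficients

def cluster_gradients(x):
    """Gradient of F_j(x) = c_j * x_j**2 - x_j * (P0 - sum(x)) with respect to x_j."""
    return 2 * c * x - (P0 - x.sum()) + x

x = np.zeros(N)
for _ in range(2000):
    x = x - 0.05 * cluster_gradients(x)   # each cluster descends its own cost

print("Nash equilibrium production levels:", x)
```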

    Distributed randomized block stochastic gradient tracking methods: Rate analysis and numerical experiments

    Distributed optimization has been a trending research topic over the past few decades, driven mainly by recent advances in wireless sensor technology and by emerging applications in machine learning. Traditionally, optimization problems were addressed using centralized schemes in which all data is assumed to be available in one place. The main reasons motivating distributed implementations include: (i) the collected data may not be available at a centralized location, (ii) the privacy of the agents' data should be preserved, and (iii) data processors have limited memory and computational power. Accordingly, distributed optimization provides a framework in which agents (e.g., data processors or sensors) communicate their local information with each other over a network and seek to minimize a global objective function. In some applications, the data may have a huge sample size or a large number of attributes; such problems are often referred to as big data problems. In this thesis, our goal is to address such high-dimensional distributed optimization problems, where computing the local gradient mappings may become expensive. Recently, a distributed optimization algorithm, called Distributed Stochastic Gradient Tracking (DSGT), was developed for addressing possibly large-scale problems by exploiting stochasticity. We develop a novel iterative method called Distributed Randomized Block Stochastic Gradient Tracking (DRBSGT), which is a randomized block variant of the existing DSGT method. We derive new non-asymptotic convergence rates of order 1/k and 1/k^2 in terms of an optimality metric and a consensus violation metric, respectively. Importantly, while block-coordinate schemes have been studied for distributed optimization before, the proposed algorithm appears to be the first randomized block-coordinate gradient tracking method equipped with such convergence rate statements. We validate the performance of the proposed method on the MNIST data set and a synthetic data set under different network settings. A potential future research direction is to extend the results of this thesis to an asynchronous variant of the proposed method, which would allow communication delays to be taken into account.
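    A plausible reading of the block-coordinate idea is sketched below: a DSGT-style recursion over an undirected ring in which each agent updates only one randomly sampled coordinate block per iteration, rescaled so that the block-sampled stochastic gradient remains unbiased. This is an illustrative sketch rather than the thesis's exact DRBSGT recursion; the quadratic local costs, ring topology, block size, and step-size are assumptions.

```python
# Illustrative randomized-block variant of DSGT-style stochastic gradient tracking.
import numpy as np

rng = np.random.default_rng(2)
n, d, n_blocks, alpha, noise = 6, 12, 4, 0.05, 0.01
blocks = np.split(np.arange(d), n_blocks)        # coordinate blocks of equal size
b = rng.normal(size=(n, d))                      # local costs f_i(x) = 0.5 * ||x - b_i||^2

# Doubly stochastic mixing matrix for an undirected ring with self-loops.
W = 0.5 * np.eye(n) + 0.25 * (np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1))

def block_stoch_grad(x):
    """Each agent samples one block and returns a rescaled masked gradient (unbiased)."""
    g = np.zeros_like(x)
    full = (x - b) + noise * rng.normal(size=x.shape)
    for i in range(n):
        blk = blocks[rng.integers(n_blocks)]
        g[i, blk] = n_blocks * full[i, blk]      # rescale to keep the estimator unbiased
    return g

x = rng.normal(size=(n, d))
g = block_stoch_grad(x)
y = g.copy()                                     # tracker initialized at the local gradients
for _ in range(3000):
    x_new = W @ (x - alpha * y)                  # DSGT-style decision update
    g_new = block_stoch_grad(x_new)
    y = W @ y + g_new - g                        # gradient tracking update
    x, g = x_new, g_new

print("distance to optimum:", np.linalg.norm(x - b.mean(axis=0)))
```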