22 research outputs found

    ExtraPush for convex smooth decentralized optimization over directed networks

    In this note, we extend the algorithms Extra and subgradient-push to a new algorithm ExtraPush for consensus optimization with convex differentiable objective functions over a directed network. When the stationary distribution of the network can be computed in advance, we propose a simplified algorithm called Normalized ExtraPush. Just like Extra, both ExtraPush and Normalized ExtraPush can iterate with a fixed step size. But unlike Extra, they can take a column-stochastic mixing matrix, which is not necessarily doubly stochastic. Therefore, they remove the undirected-network restriction of Extra. Subgradient-push, while it also works for directed networks, is slower on the same type of problem because it must use a sequence of diminishing step sizes. We present preliminary analysis for ExtraPush under a bounded sequence assumption. For Normalized ExtraPush, we show that it naturally produces a bounded, linearly convergent sequence provided that the objective function is strongly convex. In our numerical experiments, ExtraPush and Normalized ExtraPush performed similarly well. They are significantly faster than subgradient-push, even when we hand-optimize the step sizes for the latter. Comment: 16 pages, 3 figures
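
The role of a column-stochastic mixing matrix can be illustrated with push-sum averaging, the consensus step that subgradient-push builds on. The sketch below is not ExtraPush itself; the 3-node graph, the matrix `A`, and the helper names are assumptions chosen for illustration. Each node tracks a value and a weight, and the ratio of the two recovers the network-wide average even though `A` is not doubly stochastic.

```python
# Push-sum averaging on a small directed graph (illustrative sketch).
# Hypothetical graph: edges 0->1, 0->2, 1->2, 2->0, plus a self-loop at each node.
# Column j splits node j's mass equally among itself and its out-neighbors,
# so every column sums to 1 while the rows do not.
A = [
    [1 / 3, 0.0, 1 / 2],
    [1 / 3, 1 / 2, 0.0],
    [1 / 3, 1 / 2, 1 / 2],
]


def mix(M, v):
    """Return the matrix-vector product M v for nested lists."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def push_sum(A, values, iterations=100):
    """Run push-sum and return each node's estimate of the average."""
    x = list(values)          # running values
    w = [1.0] * len(values)   # running weights, initialized to one
    for _ in range(iterations):
        x = mix(A, x)
        w = mix(A, w)
    # The ratio x_i / w_i corrects for the imbalance of the directed graph.
    return [xi / wi for xi, wi in zip(x, w)]


estimates = push_sum(A, [1.0, 2.0, 6.0])
```

Every entry of `estimates` approaches the average of the initial values, which is the property that lets push-based methods avoid a doubly stochastic matrix.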

    Linear Convergence of First- and Zeroth-Order Primal-Dual Algorithms for Distributed Nonconvex Optimization

    This paper considers the distributed nonconvex optimization problem of minimizing a global cost function formed by a sum of local cost functions by using local information exchange. We first propose a distributed first-order primal-dual algorithm. We show that it converges sublinearly to a stationary point if each local cost function is smooth and linearly to the global optimum under an additional condition that the global cost function satisfies the Polyak-Łojasiewicz condition. This condition is weaker than strong convexity, which is a standard condition for proving the linear convergence of distributed optimization algorithms, and the global minimizer is not necessarily unique or finite. Motivated by the situations where the gradients are unavailable, we then propose a distributed zeroth-order algorithm, derived from the proposed distributed first-order algorithm by using a deterministic gradient estimator, and show that it has the same convergence properties as the proposed first-order algorithm under the same conditions. The theoretical results are illustrated by numerical simulations.
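
For reference, the Polyak-Łojasiewicz condition is usually stated as follows. The symbols $f$ (global cost), $f^*$ (its minimum value), and $\nu > 0$ (the constant) are notation assumed here, not taken from the abstract.

```latex
\[
  \frac{1}{2}\,\bigl\|\nabla f(x)\bigr\|^{2} \;\ge\; \nu\,\bigl(f(x) - f^{*}\bigr)
  \qquad \text{for all } x .
\]
```

The inequality bounds the suboptimality gap by the gradient norm without requiring convexity, which is why every stationary point is a global minimizer under it.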

    A Robust Gradient Tracking Method for Distributed Optimization over Directed Networks

    In this paper, we consider the problem of distributed consensus optimization over multi-agent networks with directed network topology. Assuming each agent has a local cost function that is smooth and strongly convex, the global objective is to minimize the average of all the local cost functions. To solve the problem, we introduce a robust gradient tracking method (R-Push-Pull) adapted from the recently proposed Push-Pull/AB algorithm. R-Push-Pull inherits the advantages of Push-Pull and enjoys linear convergence to the optimal solution with exact communication. Under noisy information exchange, R-Push-Pull is more robust than the existing gradient tracking based algorithms; the solutions obtained by each agent reach a neighborhood of the optimum in expectation exponentially fast under a constant stepsize policy. We provide a numerical example that demonstrates the effectiveness of R-Push-Pull.

    A decentralized proximal-gradient method with network independent step-sizes and separated convergence rates

    This paper proposes a novel proximal-gradient algorithm for a decentralized optimization problem with a composite objective containing smooth and nonsmooth terms. Specifically, the smooth and nonsmooth terms are dealt with by gradient and proximal updates, respectively. The proposed algorithm is closely related to a previous algorithm, PG-EXTRA (Shi et al., 2015), but has a few advantages. First of all, agents use uncoordinated step-sizes, and the stable upper bounds on step-sizes are independent of network topologies. The step-sizes depend on local objective functions, and they can be as large as those of the gradient descent. Secondly, for the special case without nonsmooth terms, linear convergence can be achieved under the strong convexity assumption. The dependence of the convergence rate on the objective functions and the network are separated, and the convergence rate of the new algorithm is as good as one of the two convergence rates that match the typical rates for the general gradient descent and the consensus averaging. We provide numerical experiments to demonstrate the efficacy of the introduced algorithm and validate our theoretical discoveries.

    Exponential Convergence for Distributed Smooth Optimization Under the Restricted Secant Inequality Condition

    This paper considers the distributed smooth optimization problem in which the objective is to minimize a global cost function formed by a sum of local smooth cost functions, by using local information exchange. The standard assumption for proving exponential/linear convergence of first-order methods is the strong convexity of the cost functions, which does not hold for many practical applications. In this paper, we first show that the continuous-time distributed primal-dual gradient algorithm converges to one global minimizer exponentially under the assumption that the global cost function satisfies the restricted secant inequality condition. This condition is weaker than the strong convexity condition since it does not require convexity and the global minimizers need not be unique. We then show that the discrete-time distributed primal-dual algorithm constructed by using Euler's approximation method converges to one global minimizer linearly under the same condition. The theoretical results are illustrated by numerical simulations.
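
For reference, the restricted secant inequality is commonly written as below. The notation is assumed here rather than taken from the abstract: $f$ is the global cost, $X^*$ its set of global minimizers, $P_{X^*}(x)$ the projection of $x$ onto $X^*$, and $\nu > 0$ a constant.

```latex
\[
  \bigl\langle \nabla f(x),\; x - P_{X^{*}}(x) \bigr\rangle
  \;\ge\; \nu\,\bigl\| x - P_{X^{*}}(x) \bigr\|^{2}
  \qquad \text{for all } x .
\]
```

The inequality only constrains the gradient along the direction to the nearest minimizer, so it permits nonconvex costs and a non-singleton minimizer set.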

    On the Linear Convergence of Distributed Optimization over Directed Graphs

    This paper develops a fast distributed algorithm, termed DEXTRA, to solve the optimization problem when $n$ agents reach agreement and collaboratively minimize the sum of their local objective functions over the network, where the communication between the agents is described by a directed graph. Existing algorithms solve the problem restricted to directed graphs with convergence rates of $O(\ln k/\sqrt{k})$ for general convex objective functions and $O(\ln k/k)$ when the objective functions are strongly-convex, where $k$ is the number of iterations. We show that, with the appropriate step-size, DEXTRA converges at a linear rate $O(\tau^{k})$ for $0<\tau<1$, given that the objective functions are restricted strongly-convex. The implementation of DEXTRA requires each agent to know its local out-degree. Simulation examples further illustrate our findings.
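
To show why knowledge of the local out-degree is useful, the sketch below builds a column-stochastic weight matrix in which each agent splits its mass equally among itself and its out-neighbors. This is one common weighting, assumed here for illustration; the abstract does not state that DEXTRA uses exactly these weights, and the function name and example graph are hypothetical.

```python
def column_stochastic_weights(n, edges):
    """Build an n-by-n column-stochastic matrix from directed edges (src, dst).

    Agent j sends an equal share 1 / (out_degree(j) + 1) to itself and to each
    out-neighbor, so it only needs its own out-degree to choose its weights.
    """
    out_neighbors = [{j} for j in range(n)]  # each agent keeps a self-loop
    for src, dst in edges:
        out_neighbors[src].add(dst)
    A = [[0.0] * n for _ in range(n)]
    for j in range(n):
        share = 1.0 / len(out_neighbors[j])
        for i in out_neighbors[j]:
            A[i][j] = share  # entry (i, j): weight on the message from j to i
    return A


# Hypothetical 3-agent directed graph.
edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
A = column_stochastic_weights(3, edges)
column_sums = [sum(A[i][j] for i in range(3)) for j in range(3)]
```

Each column of `A` sums to one by construction, while the rows generally do not, which is the situation that rules out doubly stochastic methods on directed graphs.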

    Distributed Optimization over Directed Graphs with Row Stochasticity and Constraint Regularity

    This paper deals with an optimization problem over a network of agents, where the cost function is the sum of the individual objectives of the agents and the constraint set is the intersection of local constraints. Most existing methods employing subgradient and consensus steps for solving this problem require the weight matrix associated with the network to be column stochastic or even doubly stochastic, conditions that can be hard to arrange in directed networks. Moreover, known convergence analyses for distributed subgradient methods vary depending on whether the problem is unconstrained or constrained, and whether the local constraint sets are identical or nonidentical and compact. The main goals of this paper are: (i) removing the common column stochasticity requirement; (ii) relaxing the compactness assumption; and (iii) providing a unified convergence analysis. Specifically, assuming the communication graph to be fixed and strongly connected and the weight matrix to (only) be row stochastic, a distributed projected subgradient algorithm and its variation are presented to solve the problem for cost functions that are convex and Lipschitz continuous. Based on a regularity assumption on the local constraint sets, a unified convergence analysis is given that can be applied to both unconstrained and constrained problems and without assuming compactness of the constraint sets or an interior point in their intersection. Further, we also establish an upper bound on the absolute objective error evaluated at each agent's available local estimate under a nonincreasing step size sequence. This bound allows us to analyze the convergence rate of both algorithms. Comment: 14 pages, 3 figures

    A Unification and Generalization of Exact Distributed First Order Methods

    Recently, there has been significant progress in the development of distributed first order methods. (At least) two different types of methods, designed from very different perspectives, have been proposed that achieve both exact and linear convergence when a constant step size is used -- a favorable feature that was not achievable by most prior methods. In this paper, we unify, generalize, and improve convergence speed of these exact distributed first order methods. We first carry out a novel unifying analysis that sheds light on how the different existing methods compare. The analysis reveals that a major difference between the methods is on how a past dual gradient of an associated augmented Lagrangian dual function is weighted. We then capitalize on the insights from the analysis to derive a novel method -- with a tuned past gradient weighting -- that improves upon the existing methods. We establish for the proposed generalized method global R-linear convergence rate under strongly convex costs with Lipschitz continuous gradients. Comment: revised Dec 17, 201

    Push-Pull Gradient Methods for Distributed Optimization in Networks

    In this paper, we focus on solving a distributed convex optimization problem in a network, where each agent has its own convex cost function and the goal is to minimize the sum of the agents' cost functions while obeying the network connectivity structure. In order to minimize the sum of the cost functions, we consider new distributed gradient-based methods where each node maintains two estimates, namely, an estimate of the optimal decision variable and an estimate of the gradient for the average of the agents' objective functions. From the viewpoint of an agent, the information about the gradients is pushed to the neighbors, while the information about the decision variable is pulled from the neighbors, hence the name "push-pull gradient methods". The methods utilize two different graphs for the information exchange among agents, and as such, unify the algorithms with different types of distributed architecture, including decentralized (peer-to-peer), centralized (master-slave), and semi-centralized (leader-follower) architecture. We show that the proposed algorithms and their many variants converge linearly for strongly convex and smooth objective functions over a network (possibly with unidirectional data links) in both synchronous and asynchronous random-gossip settings. In particular, under the random-gossip setting, "push-pull" is the first class of algorithms for distributed optimization over directed graphs. Moreover, we numerically evaluate our proposed algorithms in both scenarios, and show that they outperform other existing linearly convergent schemes, especially for ill-conditioned problems and networks that are not well balanced. Comment: Parts of the results appear in Proceedings of the 57th IEEE Conference on Decision and Control (see arXiv:1803.07588)
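
The two-estimate structure described above can be sketched with a commonly used gradient-tracking update: decision variables are mixed with a row-stochastic matrix (pull), and gradient estimates with a column-stochastic matrix (push). Everything concrete below is an assumption for illustration, including the matrices `R` and `C`, the scalar quadratic costs f_i(x) = (x - b_i)^2 / 2, the step size, and the function names; it is a minimal sketch, not the authors' implementation.

```python
# Hypothetical 3-node directed graph: edges 0->1, 0->2, 1->2, 2->0, plus self-loops.
R = [  # row-stochastic: node i averages itself and its in-neighbors (pull)
    [1 / 2, 0.0, 1 / 2],
    [1 / 2, 1 / 2, 0.0],
    [1 / 3, 1 / 3, 1 / 3],
]
C = [  # column-stochastic: node j splits mass among itself and out-neighbors (push)
    [1 / 3, 0.0, 1 / 2],
    [1 / 3, 1 / 2, 0.0],
    [1 / 3, 1 / 2, 1 / 2],
]
b = [1.0, 2.0, 6.0]  # local costs f_i(x) = (x - b_i)^2 / 2, minimized jointly at mean(b)


def mix(M, v):
    """Return the matrix-vector product M v for nested lists."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def local_gradients(x):
    """Gradient of each local cost evaluated at that node's own estimate."""
    return [xi - bi for xi, bi in zip(x, b)]


def push_pull(alpha=0.05, iterations=3000):
    x = [0.0] * len(b)         # decision-variable estimates
    g = local_gradients(x)
    y = list(g)                # gradient trackers, initialized to the local gradients
    for _ in range(iterations):
        # Pull step: take a local descent step, then average with in-neighbors.
        x_new = mix(R, [xi - alpha * yi for xi, yi in zip(x, y)])
        g_new = local_gradients(x_new)
        # Push step: mix trackers, then add the change in local gradients.
        y = [cy + gn - go for cy, gn, go in zip(mix(C, y), g_new, g)]
        x, g = x_new, g_new
    return x


solution = push_pull()
```

With a sufficiently small constant step size, every entry of `solution` approaches the minimizer of the summed cost, here the mean of `b`.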

    Distributed Dual Gradient Tracking for Resource Allocation in Unbalanced Networks

    This paper proposes a distributed dual gradient tracking algorithm (DDGT) to solve resource allocation problems over an unbalanced network, where each node in the network holds a private cost function and computes the optimal resource by interacting only with its neighboring nodes. Our key idea is the novel use of the distributed push-pull gradient algorithm (PPG) to solve the dual problem of the resource allocation problem. To study the convergence of the DDGT, we first establish the sublinear convergence rate of PPG for non-convex objective functions, which advances the existing results on PPG as they require the strong convexity of objective functions. Then we show that the DDGT converges linearly for strongly convex and Lipschitz smooth cost functions, and sublinearly without the Lipschitz smoothness. Finally, experimental results suggest that DDGT outperforms existing algorithms. Comment: Accepted by IEEE Transactions on Signal Processing. This version fixed some typos in the accepted version