9 research outputs found

    A geometrically converging dual method for distributed optimization over time-varying graphs

    Full text link
    In this paper we consider a distributed convex optimization problem over time-varying undirected networks. We propose a dual method, primarily averaged network dual ascent (PANDA), that is proven to converge R-linearly to the optimal point given that the agents objective functions are strongly convex and have Lipschitz continuous gradients. Like dual decomposition, PANDA requires half the amount of variable exchanges per iterate of methods based on DIGing, and can provide with practical improved performance as empirically demonstrated.Comment: Submitted to Transactions on Automatic Contro

    An Accelerated Method For Decentralized Distributed Stochastic Optimization Over Time-Varying Graphs

    Full text link
    We consider a distributed stochastic optimization problem that is solved by a decentralized network of agents with only local communication between neighboring agents. The goal of the whole system is to minimize a global objective function given as a sum of local objectives held by each agent. Each local objective is defined as an expectation of a convex smooth random function and the agent is allowed to sample stochastic gradients for this function. For this setting we propose the first accelerated (in the sense of Nesterov's acceleration) method that simultaneously attains optimal up to a logarithmic factor communication and oracle complexity bounds for smooth strongly convex distributed stochastic optimization. We also consider the case when the communication graph is allowed to vary with time and obtain complexity bounds for our algorithm, which are the first upper complexity bounds for this setting in the literature

    On the Complexity of Approximating Wasserstein Barycenter

    Get PDF
    We study the complexity of approximating Wassertein barycenter of mm discrete measures, or histograms of size nn by contrasting two alternative approaches, both using entropic regularization. The first approach is based on the Iterative Bregman Projections (IBP) algorithm for which our novel analysis gives a complexity bound proportional to mn2Δ2\frac{mn^2}{\varepsilon^2} to approximate the original non-regularized barycenter. Using an alternative accelerated-gradient-descent-based approach, we obtain a complexity proportional to mn2.5Δ\frac{mn^{2.5}}{\varepsilon} . As a byproduct, we show that the regularization parameter in both approaches has to be proportional to Δ\varepsilon, which causes instability of both algorithms when the desired accuracy is high. To overcome this issue, we propose a novel proximal-IBP algorithm, which can be seen as a proximal gradient method, which uses IBP on each iteration to make a proximal step. We also consider the question of scalability of these algorithms using approaches from distributed optimization and show that the first algorithm can be implemented in a centralized distributed setting (master/slave), while the second one is amenable to a more general decentralized distributed setting with an arbitrary network topology.Comment: Corrected misprints. Added a reference to accelerated Iterative Bregman Projections introduced in arXiv:1906.0362

    On the complexity of approximating Wasserstein barycenter

    Get PDF
    We study the complexity of approximating Wassertein barycenter of discrete measures, or histograms by contrasting two alternative approaches, both using entropic regularization. We provide a novel analysis for our approach based on the Iterative Bregman Projections (IBP) algorithm to approximate the original non-regularized barycenter. We also get the complexity bound for alternative accelerated-gradient-descent-based approach and compare it with the bound obtained for IBP. As a byproduct, we show that the regularization parameter in both approaches has to be proportional to ", which causes instability of both algorithms when the desired accuracy is high. To overcome this issue, we propose a novel proximal-IBP algorithm, which can be seen as a proximal gradient method, which uses IBP on each iteration to make a proximal step. We also consider the question of scalability of these algorithms using approaches from distributed optimization and show that the first algorithm can be implemented in a centralized distributed setting (master/slave), while the second one is amenable to a more general decentralized distributed setting with an arbitrary network topology

    Fenchel Dual Gradient Methods for Distributed Convex Optimization Over Time-Varying Networks

    No full text
    corecore