240 research outputs found

    Optimal Algorithms for Non-Smooth Distributed Optimization in Networks

    Full text link
    In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in O(1/t)O(1/\sqrt{t}), the structure of the communication network only impacts a second-order term in O(1/t)O(1/t), where tt is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a d1/4d^{1/4} multiplicative factor of the optimal convergence rate, where dd is the underlying dimension.Comment: 17 page

    GMRES-Accelerated ADMM for Quadratic Objectives

    Full text link
    We consider the sequence acceleration problem for the alternating direction method-of-multipliers (ADMM) applied to a class of equality-constrained problems with strongly convex quadratic objectives, which frequently arise as the Newton subproblem of interior-point methods. Within this context, the ADMM update equations are linear, the iterates are confined within a Krylov subspace, and the General Minimum RESidual (GMRES) algorithm is optimal in its ability to accelerate convergence. The basic ADMM method solves a Îş\kappa-conditioned problem in O(Îş)O(\sqrt{\kappa}) iterations. We give theoretical justification and numerical evidence that the GMRES-accelerated variant consistently solves the same problem in O(Îş1/4)O(\kappa^{1/4}) iterations for an order-of-magnitude reduction in iterations, despite a worst-case bound of O(Îş)O(\sqrt{\kappa}) iterations. The method is shown to be competitive against standard preconditioned Krylov subspace methods for saddle-point problems. The method is embedded within SeDuMi, a popular open-source solver for conic optimization written in MATLAB, and used to solve many large-scale semidefinite programs with error that decreases like O(1/k2)O(1/k^{2}), instead of O(1/k)O(1/k), where kk is the iteration index.Comment: 31 pages, 7 figures. Accepted for publication in SIAM Journal on Optimization (SIOPT

    Dual-Free Stochastic Decentralized Optimization with Variance Reduction

    Get PDF
    We consider the problem of training machine learning models on distributed data in a decentralized way. For finite-sum problems, fast single-machine algorithms for large datasets rely on stochastic updates combined with variance reduction. Yet, existing decentralized stochastic algorithms either do not obtain the full speedup allowed by stochastic updates, or require oracles that are more expensive than regular gradients. In this work, we introduce a Decentralized stochastic algorithm with Variance Reduction called DVR. DVR only requires computing stochastic gradients of the local functions, and is computationally as fast as a standard stochastic variance-reduced algorithms run on a 1/n1/n fraction of the dataset, where nn is the number of nodes. To derive DVR, we use Bregman coordinate descent on a well-chosen dual problem, and obtain a dual-free algorithm using a specific Bregman divergence. We give an accelerated version of DVR based on the Catalyst framework, and illustrate its effectiveness with simulations on real data
    • …
    corecore