
    Accelerated first-order primal-dual proximal methods for linearly constrained composite convex programming

    Motivated by big data applications, first-order methods have been extremely popular in recent years. However, naive gradient methods generally converge slowly; hence, much effort has been made to accelerate various first-order methods. This paper proposes two accelerated methods for solving structured linearly constrained convex programs with composite convex objectives. The first method is the accelerated linearized augmented Lagrangian method (LALM). At each update of the primal variable, it allows linearization of both the differentiable function and the augmented term, which yields easy subproblems. Assuming merely weak convexity, we show that LALM achieves $O(1/t)$ convergence if the parameters are kept fixed throughout and can be accelerated to $O(1/t^2)$ if the parameters are adapted, where $t$ is the total number of iterations. The second method is the accelerated linearized alternating direction method of multipliers (LADMM). In addition to composite convexity, it further assumes a two-block structure on the objective. Unlike classic ADMM, our method allows linearization of both the objective and the augmented term to keep the updates simple. Assuming strong convexity of one block variable, we show that LADMM also enjoys $O(1/t^2)$ convergence with adaptive parameters. This result is a significant improvement over that in [Goldstein et al., SIIMS'14], which requires strong convexity of both block variables and no linearization of the objective or augmented term. Numerical experiments are performed on quadratic programming, image denoising, and support vector machines. The proposed accelerated methods are compared to nonaccelerated ones and to existing accelerated methods. The results demonstrate the validity of the acceleration and the superior performance of the proposed methods over existing ones.
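
    Since the abstract only describes the update in words, the following is a minimal runnable sketch of a fixed-parameter linearized ALM step for $\min_x f(x) + g(x)$ s.t. $Ax = b$, with $f$ smooth and $g$ proximable. The instance below ($f$ quadratic, $g = \mu\|x\|_1$), the step sizes, and the iteration count are illustrative assumptions, not the paper's exact accelerated algorithm.

```python
# Sketch of a linearized ALM iteration: linearize both the smooth term f and
# the augmented quadratic, so the primal update is a single prox step.
import numpy as np

def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def linearized_alm(A, b, c, mu=0.1, beta=1.0, iters=500):
    # f(x) = 0.5*||x - c||^2 (smooth), g(x) = mu*||x||_1 (proximable)
    m, n = A.shape
    x, lam = np.zeros(n), np.zeros(m)
    # eta must dominate L_f + beta*||A||^2 for the linearized step to be valid
    eta = 1.0 + beta * np.linalg.norm(A, 2) ** 2
    for _ in range(iters):
        # linearize both f and the augmented term at the current x
        grad = (x - c) + A.T @ (lam + beta * (A @ x - b))
        x = soft_threshold(x - grad / eta, mu / eta)   # easy prox subproblem
        lam = lam + beta * (A @ x - b)                 # multiplier update
    return x

A = np.random.randn(5, 20); b = A @ np.random.randn(20); c = np.random.randn(20)
print(np.linalg.norm(A @ linearized_alm(A, b, c) - b))  # feasibility violation
```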

    An Adaptive Primal-Dual Framework for Nonsmooth Convex Minimization

    We propose a new self-adaptive, double-loop smoothing algorithm to solve composite, nonsmooth, and constrained convex optimization problems. Our algorithm is based on Nesterov's smoothing technique via general Bregman distance functions. It self-adaptively selects the number of iterations in the inner loop to achieve a desired complexity bound without requiring the accuracy a priori, as in variants of the Augmented Lagrangian method (ALM). We prove an $O(\frac{1}{k})$ convergence rate on the last iterate of the outer sequence for both unconstrained and constrained settings, in contrast to the ergodic rates that are common in the ALM and alternating direction method of multipliers literature. Compared to existing inexact ALM or quadratic penalty methods, our analysis does not rely on worst-case bounds for the subproblem solved by the inner loop. Therefore, our algorithm can be viewed either as a restarting technique applied to the ASGARD method in [TranDinh2015b] but with rigorous theoretical guarantees, or as an inexact ALM with explicit inner-loop termination rules and adaptive parameters. Our algorithm only requires the parameters to be initialized once, and it automatically updates them during the iteration process without tuning. We illustrate the superiority of our methods on several examples in comparison with the state-of-the-art. Comment: 39 pages, 7 figures, and 5 tables.
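
    To make the double-loop structure concrete, here is a hedged sketch on $\min_x \|Ax - b\|_1$: smooth the $\ell_1$ term by its Moreau envelope (a Huber function) with parameter $\beta$, run accelerated gradient steps in an inner loop, then shrink $\beta$ and restart. The fixed loop lengths and halving schedule below are illustrative stand-ins for the paper's self-adaptive termination rules.

```python
import numpy as np

def huber_grad(r, beta):
    # gradient of the Moreau envelope of |.|, applied entrywise
    return np.clip(r / beta, -1.0, 1.0)

def smoothed_apg(A, b, beta0=1.0, outer=10, inner=100):
    x = np.zeros(A.shape[1])
    beta = beta0
    for _ in range(outer):                        # outer loop: tighten the smoothing
        L = np.linalg.norm(A, 2) ** 2 / beta      # Lipschitz constant of the smoothed gradient
        y, x_old, t = x.copy(), x.copy(), 1.0
        for _ in range(inner):                    # inner loop: accelerated gradient steps
            x_new = y - A.T @ huber_grad(A @ y - b, beta) / L
            t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
            y = x_new + ((t - 1) / t_new) * (x_new - x_old)
            x_old, t = x_new, t_new
        x, beta = x_old, beta / 2.0               # restart with a smaller smoothness parameter
    return x
```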

    The primal-dual hybrid gradient method reduces to a primal method for linearly constrained optimization problems

    In this work, we show that for linearly constrained optimization problems the primal-dual hybrid gradient algorithm, analyzed by Chambolle and Pock [3], can be written as an entirely primal algorithm. This allows us to prove convergence of the iterates even in degenerate cases where the linear system is inconsistent or strong duality does not hold. We also obtain new convergence rates that seem to improve on existing ones in the literature. For decentralized distributed optimization, we show that the new scheme is much more efficient than the original one.
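
    For reference, here is a minimal sketch of the standard primal-dual hybrid gradient iteration on $\min_x f(x)$ s.t. $Ax = b$, the paper's starting point, with $f = \mu\|x\|_1$ as an assumed proximable objective and step sizes satisfying $\tau\sigma\|A\|^2 < 1$. The paper's contribution, the equivalent purely primal rewriting, is not reproduced here.

```python
import numpy as np

def pdhg(A, b, mu=0.1, iters=1000):
    m, n = A.shape
    tau = sigma = 0.99 / np.linalg.norm(A, 2)
    x, y = np.zeros(n), np.zeros(m)
    for _ in range(iters):
        z = x - tau * (A.T @ y)                                     # gradient step on <y, Ax>
        x_new = np.sign(z) * np.maximum(np.abs(z) - tau * mu, 0.0)  # prox of mu*||.||_1
        y = y + sigma * (A @ (2 * x_new - x) - b)                   # dual step at extrapolated point
        x = x_new
    return x
```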

    Decentralized Accelerated Gradient Methods With Increasing Penalty Parameters

    In this paper, we study the communication and (sub)gradient computation costs in distributed optimization and give a sharp complexity analysis for the proposed distributed accelerated gradient methods. We present two algorithms based on the framework of the accelerated penalty method with increasing penalty parameters. Our first algorithm is for smooth distributed optimization, and it attains the near-optimal $O\left(\sqrt{\frac{L}{\epsilon(1-\sigma_2(W))}}\log\frac{1}{\epsilon}\right)$ communication complexity and the optimal $O\left(\sqrt{\frac{L}{\epsilon}}\right)$ gradient computation complexity for $L$-smooth convex problems, where $\sigma_2(W)$ denotes the second largest singular value of the weight matrix $W$ associated with the network and $\epsilon$ is the target accuracy. When the problem is $\mu$-strongly convex and $L$-smooth, our algorithm has the near-optimal $O\left(\sqrt{\frac{L}{\mu(1-\sigma_2(W))}}\log^2\frac{1}{\epsilon}\right)$ complexity for communications and the optimal $O\left(\sqrt{\frac{L}{\mu}}\log\frac{1}{\epsilon}\right)$ complexity for gradient computations. Our communication complexities are worse than the lower bounds for smooth distributed optimization only by a factor of $\log\frac{1}{\epsilon}$. As far as we know, our method is the first to achieve both the communication and gradient computation lower bounds, up to an extra logarithmic factor, for smooth distributed optimization. Our second algorithm is designed for non-smooth distributed optimization, and it achieves both the optimal $O\left(\frac{1}{\epsilon\sqrt{1-\sigma_2(W)}}\right)$ communication complexity and the optimal $O\left(\frac{1}{\epsilon^2}\right)$ subgradient computation complexity, which match the communication and subgradient computation complexity lower bounds for non-smooth distributed optimization. Comment: The previous name of this paper was "A Sharp Convergence Rate Analysis for Distributed Accelerated Gradient Methods". The contents are consistent.
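
    A hedged sketch of the accelerated-penalty idea for decentralized consensus follows: stack the local variables as rows of $x$, penalize disagreement with $\frac{\beta_k}{2} x^\top (I - W) x$ for an increasing $\beta_k$, and take accelerated gradient steps. Each product with the mixing matrix $W$ is one round of neighbor communication. The $\beta$ schedule, momentum, and step size below are illustrative, not the paper's exact parameter choices.

```python
import numpy as np

def accelerated_penalty_consensus(grads, W, L, d, iters=200):
    # grads: list of local gradient oracles; W: doubly stochastic mixing matrix
    n = W.shape[0]
    x = np.zeros((n, d)); y = x.copy(); t = 1.0
    for k in range(iters):
        beta = L * (k + 1)                             # increasing penalty parameter
        G = np.stack([grads[i](y[i]) for i in range(n)])
        G += beta * (y - W @ y)                        # gradient of the consensus penalty
        x_new = y - G / (L + 2 * beta)                 # step with the penalized Lipschitz constant
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y = x_new + ((t - 1) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x.mean(axis=0)
```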

    Randomized First-Order Methods for Saddle Point Optimization

    In this paper, we present novel randomized algorithms for solving saddle point problems whose dual feasible region is given by the direct product of many convex sets. Our algorithms achieve $O(1/N)$ and $O(1/N^2)$ rates of convergence, respectively, for general bilinear and smooth bilinear saddle point problems, based on a new primal-dual termination criterion, and each iteration of these algorithms needs to solve only one randomly selected dual subproblem. Moreover, these algorithms require neither strong convexity assumptions on the objective function nor the incorporation of a strongly convex perturbation term, and they do not require the primal or dual feasible regions to be bounded or an estimate of the distance from the initial point to the set of optimal solutions to be available. We show that when applied to linearly constrained problems, these randomized primal-dual (RPD) methods are equivalent to certain randomized variants of the alternating direction method of multipliers (ADMM), while a direct extension of ADMM does not necessarily converge when the number of blocks exceeds two.
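
    Below is a hedged sketch of the randomized idea for a bilinear saddle point $\min_x \max_{y_i \in Y_i} \sum_i \langle A_i x - b_i, y_i\rangle$ with a product dual region: each iteration solves (here, in closed form) only one randomly selected dual subproblem, then takes a primal gradient step. The instance $Y_i = [-1,1]^{m_i}$ (so the inner max evaluates $\|Ax-b\|_1$), the step sizes, and the ergodic averaging are illustrative assumptions.

```python
import numpy as np

def randomized_primal_dual(blocks, x0, tau=0.01, sigma=0.1, iters=2000, seed=0):
    # blocks: list of (A_i, b_i) pairs; dual block i lives in Y_i = [-1, 1]^{m_i}
    rng = np.random.default_rng(seed)
    x, x_sum = x0.copy(), np.zeros_like(x0)
    ys = [np.zeros(A.shape[0]) for A, _ in blocks]
    for _ in range(iters):
        i = rng.integers(len(blocks))                        # one random dual subproblem
        A_i, b_i = blocks[i]
        ys[i] = np.clip(ys[i] + sigma * (A_i @ x - b_i), -1.0, 1.0)  # ascent + projection onto Y_i
        x = x - tau * sum(A.T @ y for (A, _), y in zip(blocks, ys))  # primal gradient step
        x_sum += x
    return x_sum / iters                                     # ergodic primal average
```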

    Proximal Alternating Penalty Algorithms for Constrained Convex Optimization

    We develop two new proximal alternating penalty algorithms to solve a wide class of constrained convex optimization problems. Our approach mainly relies on a novel combination of the classical quadratic penalty, alternating minimization, Nesterov's acceleration, and an adaptive strategy for the parameters. The first algorithm is designed to solve generic and possibly nonsmooth constrained convex problems without requiring any Lipschitz gradient continuity or strong convexity, while achieving the best-known $O(\frac{1}{k})$ convergence rate in a non-ergodic sense, where $k$ is the iteration counter. The second algorithm is designed to solve semi-strongly convex, but not necessarily strongly convex, problems. This algorithm achieves the best-known $O(\frac{1}{k^2})$ convergence rate on the primal constrained problem. Such a rate is obtained in two cases: (i) averaging only on the iterate sequence of the strongly convex term, or (ii) using two proximal operators of this term without averaging. In both algorithms, we allow one to linearize the second subproblem so as to use the proximal operator of the corresponding objective term. We then customize our methods to different convex problems, leading to new variants. As a byproduct, these variants preserve the same convergence guarantees as our main algorithms. We verify our theoretical development on different numerical examples and compare our methods with some existing state-of-the-art algorithms. Comment: 35 pages, 6 figures and 1 table. The code is available at: https://github.com/quoctd/PAPA-s1.
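
    The following is a hedged sketch of a quadratic-penalty alternating scheme for $\min f(u) + g(v)$ s.t. $Au + Bv = b$: alternate a proximal-gradient step on $u$ with an exact minimization over $v$ while the penalty $\rho_k$ grows. The instance ($f = \mu\|\cdot\|_1$, $g = \frac{1}{2}\|\cdot - c\|^2$), the $\rho$ schedule, and the omission of Nesterov momentum are illustrative simplifications of the paper's algorithms.

```python
import numpy as np

def alternating_penalty(A, B, b, c, mu=0.1, iters=300):
    n, p = A.shape[1], B.shape[1]
    u, v = np.zeros(n), np.zeros(p)
    for k in range(1, iters + 1):
        rho = 0.1 * k                                   # increasing penalty parameter
        r = A @ u + B @ v - b
        Lu = rho * np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the u-gradient
        z = u - rho * (A.T @ r) / Lu                    # gradient step on the penalty term
        u = np.sign(z) * np.maximum(np.abs(z) - mu / Lu, 0.0)   # prox of mu*||.||_1
        # exact minimization over v: (I + rho*B^T B) v = c + rho*B^T (b - A u)
        v = np.linalg.solve(np.eye(p) + rho * B.T @ B, c + rho * B.T @ (b - A @ u))
    return u, v
```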

    Efficiency of minimizing compositions of convex functions and smooth maps

    We consider the global efficiency of algorithms for minimizing the sum of a convex function and a composition of a Lipschitz convex function with a smooth map. The basic algorithm we rely on is the prox-linear method, which in each iteration solves a regularized subproblem formed by linearizing the smooth map. When the subproblems are solved exactly, the method has efficiency $\mathcal{O}(\varepsilon^{-2})$, akin to gradient descent for smooth minimization. We show that when the subproblems can only be solved by first-order methods, a simple combination of smoothing, the prox-linear method, and a fast-gradient scheme yields an algorithm with complexity $\widetilde{\mathcal{O}}(\varepsilon^{-3})$. The technique readily extends to minimizing an average of $m$ composite functions, with complexity $\widetilde{\mathcal{O}}(m/\varepsilon^{2}+\sqrt{m}/\varepsilon^{3})$ in expectation. We round off the paper with an inertial prox-linear method that automatically accelerates in the presence of convexity.
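
    Here is a hedged sketch of the prox-linear method in the scalar composite case $\min_x |c(x)|$ with smooth $c: \mathbb{R}^n \to \mathbb{R}$: linearize $c$ at the current point and solve the regularized subproblem, which in this special case reduces to a closed-form soft-threshold along the gradient direction. The test function and the regularization parameter $t$ are illustrative assumptions.

```python
import numpy as np

def prox_linear(c, grad_c, x0, t=10.0, iters=100):
    x = x0.copy()
    for _ in range(iters):
        a, g = c(x), grad_c(x)
        q = g @ g
        if q == 0.0:
            break
        # minimize |a + g.T d| + (t/2)*||d||^2 over steps d = u*g
        if a > q / t:
            u = -1.0 / t
        elif a < -q / t:
            u = 1.0 / t
        else:
            u = -a / q          # the linearization is driven exactly to zero
        x = x + u * g
    return x

# example: minimize |c(x)| for c(x) = ||x||^2 - 1, i.e. find a point on the unit circle
x = prox_linear(lambda x: x @ x - 1.0, lambda x: 2.0 * x, np.array([2.0, 0.5]))
print(x, x @ x)
```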

    Adaptive Smoothing Algorithms for Nonsmooth Composite Convex Minimization

    We propose an adaptive smoothing algorithm based on Nesterov's smoothing technique in [Nesterov2005c] for solving "fully" nonsmooth composite convex optimization problems. Our method combines Nesterov's accelerated proximal gradient scheme with a new homotopy strategy for the smoothness parameter. By an appropriate choice of smoothing functions, we develop a new algorithm that has an $\mathcal{O}\left(\frac{1}{\varepsilon}\right)$ worst-case iteration complexity while preserving the same per-iteration complexity as Nesterov's method, and that automatically updates the smoothness parameter at each iteration. We then customize our algorithm to four special cases that cover various applications. We also specialize our algorithm to constrained convex optimization problems and show its convergence guarantee on a primal sequence of iterates. We demonstrate our algorithm on three numerical examples and compare it with other related algorithms. Comment: This paper has 23 pages, 3 figures and 1 table.
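
    In contrast to the restarted double-loop scheme above, the homotopy here is single-loop. A hedged sketch on $\min_x \|Ax - b\|_1 + \mu\|x\|_1$: take one accelerated proximal gradient step per iteration on a smoothed model while shrinking the smoothness parameter $\beta_k$ at every iteration. The decay schedule $\beta_k = \beta_0/(k+1)$ and the problem instance are illustrative assumptions, not the paper's exact update rule.

```python
import numpy as np

def adaptive_smoothing(A, b, mu=0.05, beta0=1.0, iters=500):
    x = np.zeros(A.shape[1]); y = x.copy(); t = 1.0
    normA2 = np.linalg.norm(A, 2) ** 2
    for k in range(iters):
        beta = beta0 / (k + 1)                          # homotopy on the smoothness parameter
        L = normA2 / beta
        g = A.T @ np.clip((A @ y - b) / beta, -1, 1)    # gradient of the smoothed l1 data term
        z = y - g / L
        x_new = np.sign(z) * np.maximum(np.abs(z) - mu / L, 0.0)   # prox of mu*||.||_1
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y = x_new + ((t - 1) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x
```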

    Bregman Augmented Lagrangian and Its Acceleration

    We study the Bregman Augmented Lagrangian method (BALM) for solving convex problems with linear constraints. For the classical Augmented Lagrangian method, the convergence rate and its relation to the proximal point method are well understood. However, the convergence rate of BALM has not yet been thoroughly studied in the literature. In this paper, we analyze the convergence rates of BALM in terms of the primal objective as well as the feasibility violation. We also develop, for the first time, an accelerated Bregman proximal point method that improves the convergence rate from $O(1/\sum_{k=0}^{T-1}\eta_k)$ to $O(1/(\sum_{k=0}^{T-1}\sqrt{\eta_k})^2)$, where $\{\eta_k\}_{k=0}^{T-1}$ is the sequence of proximal parameters. When applied to the dual of linearly constrained convex programs, this leads to the construction of an accelerated BALM that achieves the improved rates for both primal and dual convergence. Comment: 25 pages, 2 figures.
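
    A hedged sketch of an accelerated proximal point scheme follows, shown in the Euclidean special case (a Bregman distance would replace the quadratic in the prox). The constant step size and FISTA-style momentum are illustrative simplifications; the paper's method handles varying $\eta_k$ and Bregman geometries.

```python
import numpy as np

def accelerated_ppa(prox, x0, eta=1.0, iters=50):
    # prox(y, eta) returns argmin_x f(x) + (1/(2*eta))*||x - y||^2
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(iters):
        x_new = prox(y, eta)
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y = x_new + ((t - 1) / t_new) * (x_new - x)   # extrapolation between prox steps
        x, t = x_new, t_new
    return x

# example: f(x) = |x| in one dimension, whose prox is the soft-threshold
prox = lambda y, eta: np.sign(y) * np.maximum(np.abs(y) - eta, 0.0)
print(accelerated_ppa(prox, np.array([5.0]), eta=0.5, iters=30))
```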

    Inertial primal-dual methods for linear equality constrained convex optimization problems

    Inspired by a second-order primal-dual dynamical system [Zeng X, Lei J, Chen J. Dynamical primal-dual accelerated method with applications to network optimization. 2019; arXiv:1912.03690], we propose an inertial primal-dual method for the linear equality constrained convex optimization problem. When the objective function has a "nonsmooth + smooth" composite structure, we further propose an inexact inertial primal-dual method that linearizes the smooth individual function and solves the subproblem inexactly. Assuming merely convexity, we prove that the proposed methods enjoy an $\mathcal{O}(1/k^2)$ convergence rate on $\mathcal{L}(x_k,\lambda^*)-\mathcal{L}(x^*,\lambda^*)$ and an $\mathcal{O}(1/k)$ convergence rate on the primal feasibility, where $\mathcal{L}$ is the Lagrangian function and $(x^*,\lambda^*)$ is a saddle point of $\mathcal{L}$. Numerical results are reported to demonstrate the validity of the proposed methods.
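
    To close, a hedged sketch of an inertial primal-dual step for $\min_x f(x)$ s.t. $Ax = b$ with $f = \mu\|x\|_1$: extrapolate both the primal and dual sequences with a Nesterov-type coefficient $(k-1)/(k+\alpha)$, then take a proximal primal step and a dual ascent step. The coefficient, step sizes, and instance are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def inertial_primal_dual(A, b, mu=0.1, alpha=3.0, iters=2000):
    m, n = A.shape
    tau = sigma = 0.9 / np.linalg.norm(A, 2)
    x = np.zeros(n); x_old = x.copy()
    lam = np.zeros(m); lam_old = lam.copy()
    for k in range(1, iters + 1):
        gamma = (k - 1) / (k + alpha)               # inertial coefficient
        xe = x + gamma * (x - x_old)                # extrapolated primal point
        le = lam + gamma * (lam - lam_old)          # extrapolated dual point
        z = xe - tau * (A.T @ le)
        x_old, x = x, np.sign(z) * np.maximum(np.abs(z) - tau * mu, 0.0)  # prox of mu*||.||_1
        lam_old, lam = lam, le + sigma * (A @ x - b)                      # dual ascent
    return x
```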