
    On the Convergence of (Stochastic) Gradient Descent with Extrapolation for Non-Convex Optimization

    Extrapolation is a well-known technique for solving convex optimization problems and variational inequalities, and it has recently attracted attention for non-convex optimization. Several recent works have empirically shown its success in some machine learning tasks. However, it has not been analyzed for non-convex minimization, and a gap remains between theory and practice. In this paper, we analyze gradient descent and stochastic gradient descent with extrapolation for finding an approximate first-order stationary point of smooth non-convex optimization problems. Our convergence upper bounds show that the algorithms with extrapolation converge faster than their counterparts without extrapolation.
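
    To make the extrapolated step concrete, here is a minimal sketch (not the paper's exact scheme): the gradient is evaluated at a look-ahead point x_t + γ(x_t − x_{t−1}) before the descent step is taken. The test function, step size, and extrapolation weight are illustrative assumptions.

    ```python
    import numpy as np

    def gd_with_extrapolation(grad, x0, eta=0.05, gamma=0.5, iters=2000):
        """Gradient descent with extrapolation (sketch).

        Each step evaluates the gradient at the extrapolated point
        x_t + gamma * (x_t - x_{t-1}) instead of at x_t itself.
        """
        x_prev = x0.copy()
        x = x0.copy()
        for _ in range(iters):
            z = x + gamma * (x - x_prev)        # look-ahead (extrapolated) point
            x_prev, x = x, x - eta * grad(z)    # descend along the gradient at z
        return x

    # Illustrative smooth non-convex test: f(x) = sum(x_i^2 + 0.5*sin(3*x_i)).
    grad_f = lambda x: 2 * x + 1.5 * np.cos(3 * x)
    x_out = gd_with_extrapolation(grad_f, np.ones(5))
    print(np.linalg.norm(grad_f(x_out)))  # small: an approximate stationary point
    ```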

    Accelerated Training of Max-Margin Markov Networks with Kernels

    Abstract. Structured output prediction is an important machine learning problem both in theory and practice, and the max-margin Markov network (M3N) is an effective approach. All state-of-the-art algorithms for optimizing M3N objectives take at least O(1/ε) iterations to find an accurate solution. [1] broke this barrier by proposing an excessive gap reduction technique (EGR) which converges in O(1/√ε) iterations. However, it is restricted to Euclidean projections, which consequently require an intractable amount of computation per iteration when applied to M3N. In this paper, we show that by extending EGR to Bregman projection, this faster rate of convergence can be retained; more importantly, the updates can be performed efficiently by exploiting graphical model factorization. Further, we design a kernelized procedure which allows all computations per iteration to be performed at the same cost as the state-of-the-art approaches.
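
    The computational point of the Bregman extension can be seen in isolation: under the entropy (KL) distance, the Bregman projection of a linear update onto the probability simplex has a closed-form multiplicative solution, whereas the Euclidean projection requires a sorting procedure. The sketch below shows only this generic entropic step (with an assumed gradient direction), not the M3N-specific factorized update.

    ```python
    import numpy as np

    def kl_bregman_step(q, g, eta=1.0):
        """Bregman projection step under the entropy distance (sketch).

        Solves  argmin_{p in simplex}  eta * <g, p> + KL(p || q)
        in closed form: p_i is proportional to q_i * exp(-eta * g_i).
        q must be strictly positive.
        """
        logits = np.log(q) - eta * g
        logits -= logits.max()        # subtract max for numerical stability
        p = np.exp(logits)
        return p / p.sum()

    # Illustrative use: one entropic update on a 4-point distribution.
    q = np.full(4, 0.25)
    g = np.array([1.0, 0.0, -1.0, 0.5])   # assumed gradient-like direction
    print(kl_bregman_step(q, g))          # mass shifts toward low-g coordinates
    ```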

    An efficient symmetric primal-dual algorithmic framework for saddle point problems

    In this paper, we propose a new primal-dual algorithmic framework for a class of convex-concave saddle point problems frequently arising in image processing and machine learning. Our framework updates the primal variable between two calculations of the dual variable, yielding a symmetric iterative scheme, which we accordingly call the symmetric primal-dual algorithm (SPIDA). It is noteworthy that the subproblems of SPIDA are equipped with Bregman proximal regularization terms, which make SPIDA versatile: the framework covers some existing algorithms such as the classical augmented Lagrangian method (ALM), linearized ALM, and Jacobian splitting algorithms for linearly constrained optimization problems. Besides, the framework allows us to derive customized versions so that SPIDA works as efficiently as possible on structured optimization problems. Theoretically, under some mild conditions, we prove the global convergence of SPIDA and establish a linear convergence rate under a generalized error bound condition defined via the Bregman distance. Finally, a series of numerical experiments on the matrix game, basis pursuit, robust principal component analysis, and image restoration demonstrate that SPIDA works well on synthetic and real-world datasets.

    Comment: 32 pages; 5 figures; 7 tables
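
    The symmetric update order, with the primal step sandwiched between two dual steps, can be sketched on a toy bilinear-coupled saddle point. The problem, the plain quadratic (Euclidean) proximal terms, and the step sizes below are illustrative assumptions, not the paper's SPIDA specification.

    ```python
    import numpy as np

    def symmetric_primal_dual(A, b, iters=500):
        """Symmetric primal-dual sketch for the toy saddle point
            min_x max_y  0.5*||x - b||^2 + <A x, y> - 0.5*||y||^2,
        whose exact solution is x* = (I + A^T A)^{-1} b, y* = A x*.
        The primal update sits between two dual proximal updates.
        """
        sigma = tau = 0.5 / np.linalg.norm(A, 2)          # conservative steps
        x = np.zeros(A.shape[1])
        y = np.zeros(A.shape[0])
        for _ in range(iters):
            y = (y + sigma * (A @ x)) / (1 + sigma)       # first dual update
            x = (x + tau * (b - A.T @ y)) / (1 + tau)     # primal update between
            y = (y + sigma * (A @ x)) / (1 + sigma)       # second dual update
        return x, y

    # Illustrative check against the closed-form saddle point.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 3))
    b = rng.standard_normal(3)
    x, y = symmetric_primal_dual(A, b)
    x_star = np.linalg.solve(np.eye(3) + A.T @ A, b)
    print(np.linalg.norm(x - x_star))   # should be small
    ```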

    Iterative Methods for the Elasticity Imaging Inverse Problem

    Cancers of the soft tissue rank among the deadliest diseases throughout the world, and effective treatments for such cancers rely on early and accurate detection of tumors within the interior of the body. One such diagnostic tool, known as elasticity imaging or elastography, uses measurements of tissue displacement to reconstruct the variable elasticity between healthy and unhealthy tissue inside the body. This gives rise to a challenging parameter identification inverse problem: identifying the Lamé parameter μ in a system of partial differential equations in linear elasticity. Due to the near incompressibility of human tissue, however, common techniques for solving the direct and inverse problems are rendered ineffective by a phenomenon known as the "locking effect". Alternative methods, such as mixed finite element methods, must be applied to overcome this complication. Using these methods, this work reposes the problem as a generalized saddle point problem and presents several optimization formulations, including the modified output least squares (MOLS), energy output least squares (EOLS), and equation error (EE) frameworks, for solving the elasticity imaging inverse problem. Subsequently, numerous iterative optimization methods, including gradient, extragradient, and proximal point methods, are explored and applied to the related optimization problem. All of the iterative techniques under consideration are implemented and applied to each of the developed optimization frameworks on a representative numerical example in elasticity imaging. A thorough analysis and comparison of the methods is subsequently presented.
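
    Of the iterative methods named above, the extragradient method is the least standard; a minimal sketch follows. A box-constrained least-squares problem stands in for the elasticity imaging objective here purely for illustration.

    ```python
    import numpy as np

    def extragradient(F, project, x0, eta, iters=500):
        """Projected extragradient method (sketch).

        Takes a predictor step, then re-evaluates the operator F at the
        predicted point to form the corrector step; F could be the gradient
        of an output least squares or equation error functional.
        """
        x = x0.copy()
        for _ in range(iters):
            x_pred = project(x - eta * F(x))       # predictor step
            x = project(x - eta * F(x_pred))       # corrector step
        return x

    # Illustrative stand-in problem: min ||Ax - b||^2 over the box [0, 1]^n.
    rng = np.random.default_rng(1)
    A = rng.standard_normal((8, 4))
    b = rng.standard_normal(8)
    F = lambda x: 2 * A.T @ (A @ x - b)            # gradient of the objective
    project = lambda x: np.clip(x, 0.0, 1.0)       # projection onto the box
    eta = 0.25 / np.linalg.norm(A, 2) ** 2         # below 1/L for L = 2*||A||^2
    print(extragradient(F, project, np.zeros(4), eta))
    ```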

    Frank-Wolfe Algorithms for Saddle Point Problems

    We extend the Frank-Wolfe (FW) optimization algorithm to solve constrained smooth convex-concave saddle point (SP) problems. Remarkably, the method only requires access to linear minimization oracles. Leveraging recent advances in FW optimization, we provide the first proof of convergence of a FW-type saddle point solver over polytopes, thereby partially answering a 30-year-old conjecture. We also survey other convergence results and highlight gaps in the theoretical underpinnings of FW-style algorithms. Motivating applications without known efficient alternatives are explored through structured prediction with combinatorial penalties as well as games over matching polytopes involving an exponential number of constraints.

    Comment: Appears in Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS 2017). 39 pages
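
    A minimal illustration of an FW-style update that touches the feasible sets only through linear minimization oracles: a simultaneous FW sketch on a bilinear matrix game over simplices, where each oracle call returns a vertex. With step size 1/(k+2) this amounts to a fictitious-play-style averaging scheme; it is a sketch only, since, as the paper discusses, naive FW updates on saddle points are not guaranteed to converge in general.

    ```python
    import numpy as np

    def sp_frank_wolfe(M, iters=5000):
        """Simultaneous Frank-Wolfe sketch for the matrix game
            min_{x in simplex} max_{y in simplex}  x^T M y.
        The only oracle used is linear minimization over each simplex,
        which simply returns a basis vector (a pure strategy).
        """
        m, n = M.shape
        x = np.full(m, 1.0 / m)
        y = np.full(n, 1.0 / n)
        for k in range(iters):
            s = np.eye(m)[np.argmin(M @ y)]      # LMO for x: best pure row play
            t = np.eye(n)[np.argmax(M.T @ x)]    # LMO for y: best pure column play
            gamma = 1.0 / (k + 2)
            x = (1 - gamma) * x + gamma * s
            y = (1 - gamma) * y + gamma * t
        return x, y

    # Illustrative use: rock-paper-scissors, whose unique equilibrium is uniform.
    M = np.array([[0., 1., -1.],
                  [-1., 0., 1.],
                  [1., -1., 0.]])
    x, y = sp_frank_wolfe(M)
    print(x, y)   # both approach (1/3, 1/3, 1/3)
    ```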