On the Convergence of (Stochastic) Gradient Descent with Extrapolation for Non-Convex Optimization

Author: Jin Rong
Xu Yi
Yang Sen
Yang Tianbao
Yuan Zhuoning
Publication date: 05/02/2019

Extrapolation is a well-known technique for solving convex optimization problems and variational inequalities, and it has recently attracted some attention for non-convex optimization. Several recent works have empirically shown its success in some machine learning tasks. However, it has not been analyzed for non-convex minimization, and there remains a gap between theory and practice. In this paper, we analyze gradient descent and stochastic gradient descent with extrapolation for finding an approximate first-order stationary point in smooth non-convex optimization problems. Our convergence upper bounds show that the algorithms with extrapolation can converge faster than those without extrapolation.

arXiv.org e-Print Archive

Crossref

Accelerated Training of Max-Margin Markov Networks with Kernels

Author: A. Beck
A. Nemirovski
B. Taskar
C. Andrieu
C. Teo
F. Kschischang
G. Bakir
I. Tsochantaridis
M. Collins
M. Wainwright
S. Boyd
S.L. Lauritzen
Y. Altun
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2011

Structured output prediction is an important machine learning problem both in theory and practice, and the max-margin Markov network (M3N) is an effective approach. All state-of-the-art algorithms for optimizing M3N objectives take at least O(1/ε) iterations to find an ε-accurate solution. [1] broke this barrier by proposing an excessive gap reduction technique (EGR) which converges in O(1/√ε) iterations. However, it is restricted to Euclidean projections, which consequently requires an intractable amount of computation for each iteration when applied to M3N. In this paper, we show that by extending EGR to Bregman projection, this faster rate of convergence can be retained, and more importantly, the updates can be performed efficiently by exploiting graphical model factorization. Further, we design a kernelized procedure which allows all computations per iteration to be performed at the same cost as the state-of-the-art approaches.

CiteSeerX

Crossref

An efficient symmetric primal-dual algorithmic framework for saddle point problems

Author: He Hongjin
Wang Kai
Yu Jintao
Publication date: 25/07/2023

In this paper, we propose a new primal-dual algorithmic framework for a class of convex-concave saddle point problems frequently arising in image processing and machine learning. Our framework updates the primal variable between two updates of the dual variable, which yields a symmetric iterative scheme; we accordingly call it the symmetric primal-dual algorithm (SPIDA). Notably, the subproblems of SPIDA are equipped with Bregman proximal regularization terms, which make SPIDA versatile: the framework covers some existing algorithms such as the classical augmented Lagrangian method (ALM), linearized ALM, and Jacobian splitting algorithms for linearly constrained optimization problems. In addition, the framework allows us to derive customized versions so that SPIDA works as efficiently as possible for structured optimization problems. Theoretically, under some mild conditions, we prove the global convergence of SPIDA and estimate the linear convergence rate under a generalized error bound condition defined by the Bregman distance. Finally, a series of numerical experiments on the matrix game, basis pursuit, robust principal component analysis, and image restoration demonstrate that SPIDA works well on synthetic and real-world datasets. Comment: 32 pages; 5 figures; 7 tables

arXiv.org e-Print Archive

Iterative Methods for the Elasticity Imaging Inverse Problem

Author: Winkler Brian C.
Publication venue: RIT Scholar Works
Publication date: 29/07/2014

Cancers of the soft tissue rank among the deadliest diseases throughout the world, and effective treatments for such cancers rely on early and accurate detection of tumors within the interior of the body. One such diagnostic tool, known as elasticity imaging or elastography, uses measurements of tissue displacement to reconstruct the variable elasticity between healthy and unhealthy tissue inside the body. This gives rise to a challenging parameter identification inverse problem, that of identifying the Lamé parameter μ in a system of partial differential equations in linear elasticity. Because human tissue is nearly incompressible, however, common techniques for solving the direct and inverse problems are rendered ineffective by a phenomenon known as the “locking effect”. Alternative methods, such as mixed finite element methods, must be applied to overcome this complication. Using these methods, this work recasts the problem as a generalized saddle point problem and presents several optimization formulations, including the modified output least squares (MOLS), energy output least squares (EOLS), and equation error (EE) frameworks, for solving the elasticity imaging inverse problem. Subsequently, numerous iterative optimization methods, including gradient, extragradient, and proximal point methods, are explored and applied to solve the related optimization problem. Implementations of all of the iterative techniques under consideration are applied to all of the developed optimization frameworks using a representative numerical example in elasticity imaging. A thorough analysis and comparison of the methods is subsequently presented.

RIT Scholar Works

Frank-Wolfe Algorithms for Saddle Point Problems

Author: Gidel Gauthier
Jebara Tony
Lacoste-Julien Simon
Publication date: 25/10/2016

We extend the Frank-Wolfe (FW) optimization algorithm to solve constrained smooth convex-concave saddle point (SP) problems. Remarkably, the method only requires access to linear minimization oracles. Leveraging recent advances in FW optimization, we provide the first proof of convergence of a FW-type saddle point solver over polytopes, thereby partially answering a 30-year-old conjecture. We also survey other convergence results and highlight gaps in the theoretical underpinnings of FW-style algorithms. Motivating applications without known efficient alternatives are explored through structured prediction with combinatorial penalties as well as games over matching polytopes involving an exponential number of constraints. Comment: Appears in: Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS 2017). 39 pages

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server