Consistent Dynamic Mode Decomposition
We propose a new method for computing Dynamic Mode Decomposition (DMD)
evolution matrices, which we use to analyze dynamical systems. Unlike the
majority of existing methods, our approach is based on a variational
formulation consisting of data alignment penalty terms and constitutive
orthogonality constraints. Our method does not make any assumptions on the
structure of the data or their size, and thus it is applicable to a wide range
of problems including non-linear scenarios or extremely small observation sets.
In addition, our technique is robust to noise that is independent of the
dynamics and it does not require input data to be sequential. Our key idea is
to introduce a regularization term for the forward and backward dynamics. The
obtained minimization problem is solved efficiently using the Alternating
Direction Method of Multipliers (ADMM), which requires two Sylvester equation solves per
iteration. Our numerical scheme converges empirically and is similar to a
provably convergent ADMM scheme. We compare our approach to various
state-of-the-art methods on several benchmark dynamical systems.
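For context, the baseline exact-DMD computation that consistency-regularized variants build on can be sketched as follows (a minimal sketch of standard exact DMD, not the authors' variational method; function and variable names are our own):

```python
import numpy as np

def exact_dmd(X, Y, rank):
    """Standard exact DMD: fit a linear operator A with Y ~ A X.

    X, Y: snapshot matrices with Y[:, k] the successor of X[:, k].
    Returns the eigenvalues (DMD spectrum) and modes of the fitted operator.
    """
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
    # Project the evolution operator onto the leading POD subspace.
    A_tilde = U.conj().T @ Y @ Vh.conj().T @ np.diag(1.0 / s)
    eigvals, W = np.linalg.eig(A_tilde)
    # Lift the reduced eigenvectors back to the full state space.
    modes = Y @ Vh.conj().T @ np.diag(1.0 / s) @ W
    return eigvals, modes
```

Roughly, the consistency idea above would additionally fit a backward operator from the reversed snapshot pairs and penalize the deviation of the forward-backward composition from the identity; handling that coupling is what leads to the ADMM iterations with Sylvester solves.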
On the Iteration Complexity of Smoothed Proximal ALM for Nonconvex Optimization Problem with Convex Constraints
It is well-known that the lower bound of iteration complexity for solving
nonconvex unconstrained optimization problems is Ω(1/ε²), which
can be achieved by standard gradient descent algorithm when the objective
function is smooth. This lower bound still holds for nonconvex constrained
problems, while it is still unknown whether a first-order method can achieve
this lower bound. In this paper, we show that a simple single-loop first-order
algorithm called smoothed proximal augmented Lagrangian method (ALM) can
achieve this iteration complexity lower bound. The key technical contribution
is a strong local error bound for a general convex constrained problem, which
is of independent interest.
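The single-loop structure (one primal gradient step, one dual step, and one proximal-center update per iteration) can be sketched on a toy equality-constrained problem; the parameter names and the convex toy objective below are our own, chosen only to show the mechanics, not the paper's algorithm verbatim:

```python
import numpy as np

def smoothed_prox_alm(grad_f, a, b, x0, eta=0.05, rho=2.0, p=1.0,
                      alpha=0.5, beta=0.5, iters=5000):
    """Single-loop smoothed proximal ALM sketch for min f(x) s.t. a @ x = b.

    Each iteration takes one gradient step on the proximal augmented
    Lagrangian, one dual ascent step, and one update of the proximal
    center z (the "smoothing" sequence).
    """
    x = x0.astype(float)
    z = x.copy()
    y = 0.0
    for _ in range(iters):
        r = a @ x - b                       # constraint residual
        g = grad_f(x) + (y + rho * r) * a + p * (x - z)
        x = x - eta * g                     # primal gradient step
        y = y + alpha * rho * (a @ x - b)   # dual update
        z = z + beta * (x - z)              # proximal-center (smoothing) update
    return x, y

# Toy problem: min 0.5*||x||^2  s.t.  x1 + x2 = 1  (solution x = (0.5, 0.5))
x, y = smoothed_prox_alm(lambda x: x, np.array([1.0, 1.0]), 1.0,
                         np.array([2.0, -1.0]))
```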
Exterior-point Optimization for Nonconvex Learning
In this paper we present the nonconvex exterior-point optimization solver
(NExOS) -- a novel first-order algorithm tailored to constrained nonconvex
learning problems. We consider the problem of minimizing a convex function over
nonconvex constraints, where the projection onto the constraint set is
single-valued around local minima. A wide range of nonconvex learning problems
have this structure including (but not limited to) sparse and low-rank
optimization problems. By exploiting the underlying geometry of the constraint
set, NExOS finds a locally optimal point by solving a sequence of penalized
problems with strictly decreasing penalty parameters. NExOS solves each
penalized problem by applying a first-order algorithm, which converges linearly
to a local minimum of the corresponding penalized formulation under regularity
conditions. Furthermore, the local minima of the penalized problems converge to
a local minimum of the original problem as the penalty parameter goes to zero.
We implement NExOS in the open-source Julia package NExOS.jl, which has been
extensively tested on many instances from a wide variety of learning problems.
We demonstrate that our algorithm, in spite of being general purpose,
outperforms specialized methods on several examples of well-known nonconvex
learning problems involving sparse and low-rank optimization. For sparse
regression problems, NExOS finds locally optimal solutions that dominate
glmnet in terms of support recovery, while achieving training loss that is
smaller by an order of magnitude. For low-rank optimization with real-world
data, NExOS recovers solutions with a 3-fold reduction in training loss and a
proportion of explained variance that is 2 times better than the nuclear norm
heuristic.
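The penalty-continuation idea can be illustrated on sparse least squares (a hand-rolled sketch in the spirit of the abstract, with our own penalty schedule and parameter choices, not the NExOS.jl implementation):

```python
import numpy as np

def sparse_exterior_point(A, b, k, mus=(1.0, 0.1, 0.01, 0.001), inner=300):
    """Exterior-point sketch for min 0.5*||Ax - b||^2 s.t. ||x||_0 <= k.

    Solves a sequence of penalized problems
        f(x) + (1/(2*mu)) * dist^2(x, {||x||_0 <= k})
    with strictly decreasing mu; dist^2 is differentiable wherever the
    projection (keep the k largest-magnitude entries) is single-valued.
    """
    def project(x):                      # projection onto the sparsity set
        z = np.zeros_like(x)
        idx = np.argsort(np.abs(x))[-k:]
        z[idx] = x[idx]
        return z

    x = np.zeros(A.shape[1])
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of grad f
    for mu in mus:
        eta = 1.0 / (L + 1.0 / mu)       # step size for the penalized problem
        for _ in range(inner):
            g = A.T @ (A @ x - b) + (x - project(x)) / mu
            x = x - eta * g
        # shrinking mu drives x toward the (nonconvex) constraint set
    return project(x)                    # land exactly on the constraint set
```

On noiseless data with a truly sparse signal, this continuation typically recovers the correct support, which is the behavior the glmnet comparison above measures.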
Convex Optimization and Extensions, with a View Toward Large-Scale Problems
Machine learning is a major source of optimization problems of current interest. These problems tend to be challenging because of their enormous scale, which makes it difficult to apply traditional optimization algorithms. We explore three avenues to designing algorithms suited to handling these challenges, with a view toward large-scale ML tasks. The first is to develop better general methods for unconstrained minimization. The second is to tailor methods to the features of modern systems, namely the availability of distributed computing. The third is to use specialized algorithms to exploit specific problem structure.
Chapters 2 and 3 focus on improving quasi-Newton methods, a mainstay of unconstrained optimization. In Chapter 2, we analyze an extension of quasi-Newton methods wherein we use block updates, which add curvature information to the Hessian approximation on a higher-dimensional subspace. This defines a family of methods, Block BFGS, that form a spectrum between the classical BFGS method and Newton's method, in terms of the amount of curvature information used. We show that by adding a correction step, the Block BFGS method inherits the convergence guarantees of BFGS for deterministic problems, most notably a Q-superlinear convergence rate for strongly convex problems. To explore the tradeoff between reduced iterations and greater work per iteration of block methods, we present a set of numerical experiments.
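For reference, the classical BFGS recursion that the block methods of Chapter 2 generalize updates an inverse-Hessian approximation from a single secant pair per iteration. Below is a minimal sketch with a fixed unit step and no line search (names are ours; this is the classical rank-two update, not the block variant or its correction step):

```python
import numpy as np

def bfgs(grad, x0, iters=50):
    """Classical BFGS with unit steps on a well-scaled problem."""
    n = x0.size
    x, H = x0.astype(float), np.eye(n)   # H approximates the inverse Hessian
    g = grad(x)
    for _ in range(iters):
        p = -H @ g                       # quasi-Newton search direction
        x_new = x + p                    # unit step; fine for mild curvature
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g      # secant pair
        sy = s @ y
        if sy > 1e-12:                   # curvature condition
            rho = 1.0 / sy
            V = np.eye(n) - rho * np.outer(s, y)
            H = V @ H @ V.T + rho * np.outer(s, s)
        x, g = x_new, g_new
    return x
```

A block update would instead incorporate several directions at once, enriching H with curvature on a higher-dimensional subspace at a higher per-iteration cost.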
In Chapter 3, we focus on the problem of step size determination. To obviate the need for line searches, and for pre-computing fixed step sizes, we derive an analytic step size, which we call curvature-adaptive, for self-concordant functions. This adaptive step size allows us to generalize the damped Newton method of Nesterov to other iterative methods, including gradient descent and quasi-Newton methods. We provide simple proofs of convergence, including superlinear convergence for adaptive BFGS, allowing us to obtain superlinear convergence without line searches.
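Nesterov's damped Newton step, the starting point that Chapter 3 generalizes, scales the Newton direction by 1/(1 + λ(x)), where λ(x) is the Newton decrement; this is the analytic, line-search-free step the chapter refers to. A minimal sketch on a toy self-concordant objective (the objective and names are our own illustration):

```python
import numpy as np

def damped_newton(grad, hess, x0, iters=50):
    """Damped Newton for self-concordant f: step t = 1 / (1 + lambda(x)),
    with lambda(x) = sqrt(g^T H^{-1} g) the Newton decrement."""
    x = x0.astype(float)
    for _ in range(iters):
        g = grad(x)
        d = np.linalg.solve(hess(x), g)  # Newton direction H^{-1} g
        lam = np.sqrt(g @ d)             # Newton decrement
        x = x - d / (1.0 + lam)          # curvature-adaptive step
    return x

# Toy self-concordant objective f(x) = sum(x_i - log x_i); minimizer x_i = 1.
grad = lambda x: 1.0 - 1.0 / x
hess = lambda x: np.diag(1.0 / x ** 2)
```

The damping keeps iterates inside the domain (here x > 0) without any line search, and the step approaches a full Newton step as λ(x) → 0.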
In Chapter 4, we move from general algorithms to hardware-influenced algorithms. We consider a form of distributed stochastic gradient descent that we call Leader SGD (LSGD), which is inspired by the Elastic Averaging SGD (EASGD) method. These methods are intended for distributed settings where communication between machines may be expensive, which makes the design of their consensus mechanism important. We show that LSGD avoids an issue with spurious stationary points that affects EASGD, and provide a convergence analysis of LSGD. In the stochastic strongly convex setting, LSGD converges at the rate O(1/k) with diminishing step sizes, matching other distributed methods. We also analyze the impact of varying communication delays and of stochasticity in the selection of the leader points, and investigate under what conditions LSGD may produce better search directions than the gradient alone.
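A schematic of the leader-pull consensus mechanism might look as follows (toy objective, parameters, and names are our own; the actual method runs across machines with minibatch gradients, whereas this single-process sketch only shows the update rule):

```python
import numpy as np

def leader_sgd(workers, grad, loss, lam=0.5, steps=500, noise=0.1, seed=0):
    """Schematic Leader SGD: each worker takes a noisy gradient step plus
    a pull toward the current leader (the lowest-loss worker).
    EASGD would instead pull every worker toward a running average."""
    rng = np.random.default_rng(seed)
    x = np.array(workers, dtype=float)
    for k in range(steps):
        eta = 2.0 / (k + 10)                          # diminishing step sizes
        leader = x[np.argmin([loss(w) for w in x])]   # current best worker
        for i in range(len(x)):
            g = grad(x[i]) + noise * rng.standard_normal()
            x[i] -= eta * (g + lam * (x[i] - leader)) # gradient + leader pull
    return x

# Toy objective f(x) = 0.5*(x - 3)^2, shared by all workers.
x = leader_sgd([0.0, 5.0, -2.0],
               grad=lambda x: x - 3.0,
               loss=lambda x: 0.5 * (x - 3.0) ** 2)
```

Pulling toward the best current point, rather than the average, is what avoids the spurious stationary points that can arise when the consensus target is itself a poor point.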
In Chapter 5, we switch again to focus on algorithms that exploit problem structure. Specifically, we consider problems where variables satisfy multiaffine constraints, which motivates us to apply the Alternating Direction Method of Multipliers (ADMM). Problems that can be formulated with such a structure include representation learning (e.g., with dictionaries) and deep learning. We show that ADMM can be applied directly to multiaffine problems. By extending the theory of nonconvex ADMM, we prove that ADMM is convergent on multiaffine problems satisfying certain assumptions and, more broadly, analyze the theoretical properties of ADMM for general problems, investigating the effect of different types of structure.
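A minimal instance of a multiaffine constraint is matrix factorization, Z = XY: the constraint is affine in each block separately but not jointly. The sketch below applies ADMM to a toy formulation of this kind (our own illustration, not the thesis's general framework; each block update is a closed-form least-squares solve):

```python
import numpy as np

def multiaffine_admm(M, r, rho=2.0, iters=1000, seed=0):
    """ADMM sketch for min 0.5*||Z - M||^2  s.t.  Z = X @ Y.

    The constraint Z - X @ Y = 0 is multiaffine: fixing any two of
    X, Y, Z leaves an affine constraint in the remaining block."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    X = rng.standard_normal((m, r))
    Y = rng.standard_normal((r, n))
    Z = M.copy()
    Lam = np.zeros_like(M)                 # dual variable for Z - X @ Y = 0
    for _ in range(iters):
        W = Z + Lam / rho                  # target the X and Y blocks must fit
        X = np.linalg.lstsq(Y.T, W.T, rcond=None)[0].T   # min ||W - X @ Y||
        Y = np.linalg.lstsq(X, W, rcond=None)[0]         # min ||W - X @ Y||
        Z = (M + rho * (X @ Y) - Lam) / (1.0 + rho)      # proximal Z update
        Lam = Lam + rho * (Z - X @ Y)      # dual ascent on the constraint
    return X, Y, Z
```

Because each block subproblem is an affine least-squares solve, the alternating structure of ADMM fits multiaffine constraints naturally; the convergence guarantees discussed above concern exactly when such iterations are safe.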