On the Burer-Monteiro method for general semidefinite programs
Consider a semidefinite program (SDP) involving an $n \times n$ positive
semidefinite matrix $X$. The Burer-Monteiro method uses the substitution $X = Y Y^T$ to obtain a nonconvex optimization problem in terms of an
$n \times p$ matrix $Y$. Boumal et al. showed that this nonconvex method provably solves
equality-constrained SDPs with a generic cost matrix when $p \gtrsim \sqrt{2m}$, where $m$ is the number of constraints. In this note we extend
their result to arbitrary SDPs, possibly involving inequalities or multiple
semidefinite constraints. We derive similar guarantees for a fixed cost matrix
and generic constraints. We illustrate applications to matrix sensing and
integer quadratic minimization.
Comment: 10 pages
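The substitution above can be sketched on a standard equality-constrained example, the max-cut SDP relaxation ($\min \langle C, X\rangle$ s.t. $\mathrm{diag}(X) = 1$, $X \succeq 0$). The example, the quadratic penalty, and all parameter choices below are illustrative assumptions, not taken from the note:

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative sketch (assumed example, not from the note): the max-cut
# SDP relaxation  min <C, X>  s.t.  diag(X) = 1, X PSD,  solved via the
# Burer-Monteiro substitution X = Y Y^T with Y an n x p matrix, handling
# the m = n equality constraints with a quadratic penalty.
rng = np.random.default_rng(0)
n, p, rho = 6, 4, 1000.0                 # p ~ sqrt(2m) suffices generically
C = rng.standard_normal((n, n))
C = (C + C.T) / 2                        # generic symmetric cost matrix

def f(y):
    Y = y.reshape(n, p)
    X = Y @ Y.T
    resid = np.diag(X) - 1.0             # violation of diag(X) = 1
    return np.trace(C @ X) + rho * resid @ resid

res = minimize(f, rng.standard_normal(n * p), method="L-BFGS-B")
Y = res.x.reshape(n, p)
X = Y @ Y.T                              # PSD by construction, rank <= p
```

The factorization makes $X \succeq 0$ automatic; only the linear constraints remain, enforced here approximately through the penalty term.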
Low-Rank Univariate Sum of Squares Has No Spurious Local Minima
We study the problem of decomposing a polynomial $p$ into a sum of $r$
squares by minimizing a quadratically penalized objective $f_p(u) = \|\sum_{i=1}^r u_i^2 - p\|^2$. This objective is nonconvex
and is equivalent to the rank-$r$ Burer-Monteiro factorization of a
semidefinite program (SDP) encoding the sum of squares decomposition. We show
that for all univariate polynomials $p$, if $r \ge 2$ then $f_p(u)$
has no spurious second-order critical points, showing that all local optima are
also global optima. This is in contrast to previous work showing that for
general SDPs, in addition to genericity conditions, $r$ has to be roughly the
square root of the number of constraints (the degree of $p$) for there to be no
spurious second-order critical points. Our proof uses tools from computational
algebraic geometry and can be interpreted as constructing a certificate using
the first- and second-order necessary conditions. We also show that by choosing
a norm based on sampling equally-spaced points on the circle, the gradient
$\nabla f_p(u)$ can be computed in nearly linear time using fast Fourier
transforms. Experimentally we demonstrate that this method has very fast
convergence using first-order optimization algorithms such as L-BFGS, with
near-linear scaling to million-degree polynomials.
Comment: 18 pages, to appear in SIAM Journal on Optimization
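A toy version of the penalized objective above can be minimized directly in coefficient space. The polynomial, the plain Euclidean norm on coefficients (standing in for the paper's sampled norm), and the parameterization below are assumptions for illustration:

```python
import numpy as np
from numpy.polynomial import polynomial as P
from scipy.optimize import minimize

# Toy sketch (assumed setup): decompose p(x) = 1 + x^4 into r = 2 squares
# by minimizing f_p(u) = ||u_1^2 + u_2^2 - p||^2 over the coefficients of
# u_1, u_2 (each of degree <= 2), with the plain Euclidean norm on
# coefficients standing in for the paper's sampled norm.
p = np.array([1.0, 0.0, 0.0, 0.0, 1.0])    # ascending coefficients of 1 + x^4

def f(u):
    u1, u2 = u[:3], u[3:]
    s = P.polymul(u1, u1) + P.polymul(u2, u2)  # coefficients of u_1^2 + u_2^2
    return np.sum((s - p) ** 2)

rng = np.random.default_rng(1)
res = minimize(f, rng.standard_normal(6), method="L-BFGS-B")
```

Since $r = 2$, the result above predicts no spurious second-order critical points, so local search should reach a global decomposition such as $(x^2)^2 + 1^2$ with objective value near zero.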
The Global Geometry of Centralized and Distributed Low-rank Matrix Recovery without Regularization
Low-rank matrix recovery is a fundamental problem in signal processing and
machine learning. A popular recent approach to recovering a low-rank
matrix X is to factorize it as a product of two smaller matrices, i.e., X =
UV^T, and then optimize over U, V instead of X. Despite the resulting
non-convexity, recent results have shown that many factorized objective
functions actually have benign global geometry (no spurious local minima
and the so-called strict saddle property), ensuring convergence to a
global minimum for many local-search algorithms. Such results hold whenever the
original objective function is restricted strongly convex and smooth. However,
most of these results actually consider a modified cost function that includes
a balancing regularizer. While useful for deriving theory, this balancing
regularizer does not appear to be necessary in practice. In this work, we close
this theory-practice gap by proving that the unaltered factorized non-convex
problem, without the balancing regularizer, also has similar benign global
geometry. Moreover, we also extend our theoretical results to the field of
distributed optimization.
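A minimal sketch of the unregularized factorized approach in the simplest centralized setting (full observation of a planted rank-2 matrix; the model, step size, and iteration count are illustrative assumptions):

```python
import numpy as np

# Sketch (assumed setting): recover a planted rank-2 matrix M by plain
# gradient descent on the unregularized factorized objective
#   g(U, V) = ||U V^T - M||_F^2,
# with no balancing regularizer ||U^T U - V^T V||_F^2 added.
rng = np.random.default_rng(0)
n, r = 8, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank-2 target
U = 0.1 * rng.standard_normal((n, r))    # small random initialization
V = 0.1 * rng.standard_normal((n, r))
step = 0.02                              # half-gradient step size
for _ in range(20000):
    R = U @ V.T - M                      # residual
    U, V = U - step * R @ V, V - step * R.T @ U
```

Consistent with the benign-geometry claim, plain local search converges to a global minimizer here even though U and V are never explicitly balanced.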
Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points
This paper studies the role of over-parametrization in solving non-convex
optimization problems. The focus is on the important class of low-rank matrix
sensing, where we propose an infinite hierarchy of non-convex problems via the
lifting technique and the Burer-Monteiro factorization. This contrasts with the
existing over-parametrization technique, where the search rank is limited by the
dimension of the matrix and does not allow a rich over-parametrization of
arbitrary degree. We show that although the spurious solutions of the problem
remain stationary points through the hierarchy, they will be transformed into
strict saddle points (under some technical conditions) and can be escaped via
local search methods. This is the first result in the literature showing that
over-parametrization creates negative curvature directions for escaping spurious
solutions. We also derive a bound on how much over-parametrization is required
to enable the elimination of spurious solutions.
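As a simplified, assumed illustration of over-parametrization in low-rank matrix sensing (over-parametrizing only the Burer-Monteiro search rank past the planted rank, not the paper's tensor-lifting hierarchy; all data and parameters are hypothetical):

```python
import numpy as np
from scipy.optimize import minimize

# Assumed illustration (not the paper's lifting hierarchy): low-rank matrix
# sensing  min_Y sum_k (<A_k, Y Y^T> - b_k)^2  with the Burer-Monteiro
# search rank p over-parametrized past the planted rank 1.
rng = np.random.default_rng(2)
n, p, m = 5, 3, 40                       # search rank p = 3 > true rank 1
z = rng.standard_normal(n)
Xstar = np.outer(z, z)                   # planted rank-1 PSD ground truth
A = rng.standard_normal((m, n, n))
A = (A + A.transpose(0, 2, 1)) / 2       # symmetric Gaussian measurements
b = np.einsum('kij,ij->k', A, Xstar)     # noiseless observations

def loss(y):
    X = y.reshape(n, p) @ y.reshape(n, p).T
    return np.sum((np.einsum('kij,ij->k', A, X) - b) ** 2)

res = minimize(loss, rng.standard_normal(n * p), method="L-BFGS-B")
```

With m = 40 generic measurements of a 5 x 5 symmetric matrix (15 degrees of freedom), any zero-loss factorization recovers the ground truth, so reaching a near-zero objective by local search indicates no spurious solution was encountered.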