
    On the Burer-Monteiro method for general semidefinite programs

    Consider a semidefinite program (SDP) involving an $n \times n$ positive semidefinite matrix $X$. The Burer-Monteiro method uses the substitution $X = Y Y^T$ to obtain a nonconvex optimization problem in terms of an $n \times p$ matrix $Y$. Boumal et al. showed that this nonconvex method provably solves equality-constrained SDPs with a generic cost matrix when $p \gtrsim \sqrt{2m}$, where $m$ is the number of constraints. In this note we extend their result to arbitrary SDPs, possibly involving inequalities or multiple semidefinite constraints. We derive similar guarantees for a fixed cost matrix and generic constraints. We illustrate applications to matrix sensing and integer quadratic minimization. Comment: 10 pages
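The substitution described in this abstract can be illustrated with a short sketch. It is not code from the paper: the helper names `smallest_rank` and `bm_objective` are hypothetical, and the exact threshold $p(p+1)/2 > m$ is an assumed reading of the abstract's $p \gtrsim \sqrt{2m}$. The sketch only shows that $X = Y Y^T$ is positive semidefinite with rank at most $p$ by construction, and that a linear cost $\langle C, X \rangle$ can be evaluated directly from $Y$.

```python
import numpy as np


def smallest_rank(m: int) -> int:
    """Smallest integer p with p(p+1)/2 > m, so p is roughly sqrt(2m).

    Assumption: this inequality is one common way to make the abstract's
    condition p >~ sqrt(2m) precise; the abstract itself does not state it.
    """
    p = 1
    while p * (p + 1) // 2 <= m:
        p += 1
    return p


def bm_objective(Y: np.ndarray, C: np.ndarray) -> float:
    """Linear cost <C, Y Y^T> evaluated as trace(Y^T C Y), without forming X."""
    return float(np.trace(Y.T @ C @ Y))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 6, 3                      # illustrative sizes only
    p = smallest_rank(m)
    Y = rng.standard_normal((n, p))
    C = rng.standard_normal((n, n))
    C = C + C.T                      # symmetric cost matrix
    X = Y @ Y.T                      # PSD with rank <= p by construction
    print(p, np.linalg.matrix_rank(X), bm_objective(Y, C))
```

Working with $Y$ instead of $X$ reduces the number of variables from $n^2$ to $np$, at the price of a nonconvex problem.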

    Low-Rank Univariate Sum of Squares Has No Spurious Local Minima

    We study the problem of decomposing a polynomial $p$ into a sum of $r$ squares by minimizing a quadratically penalized objective $f_p(\mathbf{u}) = \left\lVert \sum_{i=1}^r u_i^2 - p \right\rVert^2$. This objective is nonconvex and is equivalent to the rank-$r$ Burer-Monteiro factorization of a semidefinite program (SDP) encoding the sum of squares decomposition. We show that for all univariate polynomials $p$, if $r \ge 2$ then $f_p(\mathbf{u})$ has no spurious second-order critical points, showing that all local optima are also global optima. This is in contrast to previous work showing that for general SDPs, in addition to genericity conditions, $r$ has to be roughly the square root of the number of constraints (the degree of $p$) for there to be no spurious second-order critical points. Our proof uses tools from computational algebraic geometry and can be interpreted as constructing a certificate using the first- and second-order necessary conditions. We also show that by choosing a norm based on sampling equally-spaced points on the circle, the gradient $\nabla f_p$ can be computed in nearly linear time using fast Fourier transforms. Experimentally we demonstrate that this method has very fast convergence using first-order optimization algorithms such as L-BFGS, with near-linear scaling to million-degree polynomials. Comment: 18 pages, to appear in SIAM Journal on Optimization
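As a rough illustration of the objective $f_p(\mathbf{u})$ in this abstract, the sketch below evaluates it with a norm based on equally spaced points on the unit circle, using FFTs. It is not the authors' implementation: the function name `sos_objective`, the ascending-order coefficient convention, and the $1/N$ normalization are assumptions made for this example.

```python
import numpy as np


def sos_objective(us, p, num_points=None):
    """Evaluate f_p(u) = || sum_i u_i^2 - p ||^2 via samples on the unit circle.

    Assumptions (not from the source): polynomials are real coefficient
    arrays in ascending order, and the norm is the mean squared modulus
    over N equally spaced points on the circle. When N is at least len(p)
    and each u_i^2 has no more coefficients than p, this equals the
    squared 2-norm of the residual's coefficients (Parseval).
    """
    us = [np.asarray(u, dtype=float) for u in us]
    p = np.asarray(p, dtype=float)
    if num_points is None:
        num_points = len(p)
    if num_points < len(p) or any(num_points < len(u) for u in us):
        raise ValueError("num_points must cover every coefficient array")
    # A zero-padded FFT evaluates each polynomial at the N-th roots of unity.
    p_vals = np.fft.fft(p, n=num_points)
    # Squaring a polynomial is pointwise squaring of its sampled values.
    sq_vals = sum(np.fft.fft(u, n=num_points) ** 2 for u in us)
    return float(np.sum(np.abs(sq_vals - p_vals) ** 2) / num_points)


if __name__ == "__main__":
    # (1 + x)^2 + (1 - x)^2 = 2 + 2x^2, so the residual vanishes.
    print(sos_objective([[1, 1], [1, -1]], [2, 0, 2]))
```

Each evaluation costs a few FFTs of length $N$, which is where the nearly linear running time mentioned in the abstract comes from.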

    The Global Geometry of Centralized and Distributed Low-rank Matrix Recovery without Regularization

    Low-rank matrix recovery is a fundamental problem in signal processing and machine learning. A recently popular approach to recovering a low-rank matrix X is to factorize it as a product of two smaller matrices, i.e., X = UV^T, and then optimize over U, V instead of X. Despite the resulting non-convexity, recent results have shown that many factorized objective functions actually have benign global geometry (no spurious local minima, and the so-called strict saddle property), ensuring convergence to a global minimum for many local-search algorithms. Such results hold whenever the original objective function is restricted strongly convex and smooth. However, most of these results actually consider a modified cost function that includes a balancing regularizer. While useful for deriving theory, this balancing regularizer does not appear to be necessary in practice. In this work, we close this theory-practice gap by proving that the unaltered factorized non-convex problem, without the balancing regularizer, also has similar benign global geometry. Moreover, we also extend our theoretical results to the field of distributed optimization.

    Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points

    This paper studies the role of over-parametrization in solving non-convex optimization problems. The focus is on the important class of low-rank matrix sensing, where we propose an infinite hierarchy of non-convex problems via the lifting technique and the Burer-Monteiro factorization. This contrasts with the existing over-parametrization technique where the search rank is limited by the dimension of the matrix and it does not allow a rich over-parametrization of an arbitrary degree. We show that although the spurious solutions of the problem remain stationary points through the hierarchy, they will be transformed into strict saddle points (under some technical conditions) and can be escaped via local search methods. This is the first result in the literature showing that over-parametrization creates a negative curvature for escaping spurious solutions. We also derive a bound on how much over-parametrization is required to enable the elimination of spurious solutions.