Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points
This paper studies the role of over-parametrization in solving non-convex
optimization problems. The focus is on the important class of low-rank matrix
sensing, where we propose an infinite hierarchy of non-convex problems via the
lifting technique and the Burer-Monteiro factorization. This contrasts with the
existing over-parametrization technique, in which the search rank is limited
by the dimension of the matrix, precluding rich over-parametrization of
arbitrary degree. We show that although the spurious solutions of the problem
remain stationary points through the hierarchy, they will be transformed into
strict saddle points (under some technical conditions) and can be escaped via
local search methods. This is the first result in the literature showing that
over-parametrization creates a negative curvature for escaping spurious
solutions. We also derive a bound on how much over-parametrization is
required to enable the elimination of spurious solutions.
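As a minimal illustration of the Burer-Monteiro factorization for matrix sensing that the hierarchy builds on, the numpy sketch below runs gradient descent on a factorized least-squares objective with a search rank larger than the true rank. All dimensions, step sizes, and variable names are illustrative assumptions; the sketch implements only the base factorization, not the paper's lifted hierarchy of arbitrary degree.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r_true, r_search, m = 8, 1, 3, 60  # search rank > true rank: over-parametrized

# Ground-truth rank-1 matrix M* and random symmetric sensing matrices A_i
u = rng.normal(size=(n, r_true))
M_star = u @ u.T
A = rng.normal(size=(m, n, n))
A = (A + A.transpose(0, 2, 1)) / 2          # symmetrize each A_i
y = np.einsum('kij,ij->k', A, M_star)       # measurements y_i = <A_i, M*>

# Burer-Monteiro: minimize f(U) = (1/m) * sum_i (<A_i, U U^T> - y_i)^2 over U
U = 0.1 * rng.normal(size=(n, r_search))
lr = 2e-3
for _ in range(5000):
    resid = np.einsum('kij,ij->k', A, U @ U.T) - y
    grad = (4 / m) * np.einsum('k,kij->ij', resid, A) @ U  # uses symmetry of A_i
    U -= lr * grad

# Relative recovery error of the sensed matrix
rel_err = np.linalg.norm(U @ U.T - M_star) / np.linalg.norm(M_star)
print(rel_err)
```

With enough random measurements, gradient descent on this factorized objective typically drives the recovery error close to zero even though the problem is non-convex.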
Algorithmic Regularization in Tensor Optimization: Towards a Lifted Approach in Matrix Sensing
Gradient descent (GD) is crucial for generalization in machine learning
models, as it induces implicit regularization, promoting compact
representations. In this work, we examine the role of GD in inducing implicit
regularization for tensor optimization, particularly within the context of the
lifted matrix sensing framework. This framework has been recently proposed to
address the non-convex matrix sensing problem by transforming spurious
solutions into strict saddles when optimizing over symmetric, rank-1 tensors.
We show that, with sufficiently small initialization scale, GD applied to this
lifted problem results in approximate rank-1 tensors and critical points with
escape directions. Our findings underscore the significance of the tensor
parametrization of matrix sensing, in combination with first-order methods, in
achieving global optimality in such problems.
Comment: NeurIPS23 Poster
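The small-initialization effect described above can be sketched in the simpler matrix (order-2) analogue rather than the paper's lifted symmetric-tensor setting: with a full search rank and a tiny initialization scale, gradient descent on an under-determined matrix sensing instance typically returns an approximately rank-1 solution. Every concrete choice below (dimensions, scales, step size, seed) is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 20       # 20 measurements < 21 dof of a symmetric 6x6: under-determined
alpha = 1e-3       # small initialization scale drives the implicit bias

u = rng.normal(size=(n, 1))
M_star = u @ u.T   # rank-1 ground truth
A = rng.normal(size=(m, n, n))
A = (A + A.transpose(0, 2, 1)) / 2          # symmetrize each A_i
y = np.einsum('kij,ij->k', A, M_star)       # measurements y_i = <A_i, M*>

# Full search rank (r = n), tiny init: many zero-loss solutions exist, but GD
# from small initialization is biased toward a low-rank one
U = alpha * rng.normal(size=(n, n))
lr = 2e-3
for _ in range(8000):
    resid = np.einsum('kij,ij->k', A, U @ U.T) - y
    U -= lr * (4 / m) * np.einsum('k,kij->ij', resid, A) @ U

# Spectrum of the fitted matrix: one singular value typically dominates
s = np.linalg.svd(U @ U.T, compute_uv=False)
print(s / s[0])
```

The normalized spectrum printed at the end typically shows a single dominant singular value, mirroring (in the matrix case) the approximate rank-1 tensors that the paper shows GD produces in the lifted problem.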