Solving Complex Quadratic Systems with Full-Rank Random Matrices
We tackle the problem of recovering a complex signal $\mathbf{x} \in \mathbb{C}^n$ from quadratic measurements of the form $y_i = \mathbf{x}^* \mathbf{A}_i \mathbf{x}$, where $\mathbf{A}_i \in \mathbb{C}^{n \times n}$ is a full-rank, complex random measurement matrix whose entries are generated from a rotation-invariant sub-Gaussian distribution. We formulate the recovery as the
minimization of a nonconvex loss. This problem is related to the well-understood phase retrieval problem, where the measurement matrix is a rank-1 positive semidefinite matrix. Here we study the general full-rank case, which
models a number of key applications such as molecular geometry recovery from
distance distributions and compound measurements in phaseless diffractive
imaging. Most prior works either address the rank-1 case or focus on real
measurements. The few papers that address the full-rank complex case adopt the computationally demanding semidefinite relaxation approach. In this paper
we prove that the general class of problems with rotation-invariant
sub-Gaussian measurement models can be efficiently solved with high probability
via the standard framework comprising a spectral initialization followed by
iterative Wirtinger flow updates on a nonconvex loss. Numerical experiments on
simulated data corroborate our theoretical analysis.
Comment: This updated version of the manuscript addresses several important issues in the initial arXiv submission.
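To make the framework concrete, here is a minimal NumPy sketch of a spectral initialization followed by Wirtinger flow updates for measurements $y_i = \mathbf{x}^* \mathbf{A}_i \mathbf{x}$. It is an illustration, not the paper's construction: the function name, the surrogate initializer, the step size, and the norm heuristic are all assumptions.

```python
import numpy as np

def wirtinger_flow(A, y, n_iters=500, eta=0.05):
    """Sketch: spectral initialization + Wirtinger flow for y_i = z^H A_i z.

    A: (m, n, n) complex measurement matrices; y: (m,) measurements.
    The step size and scaling below are illustrative heuristics.
    """
    m, n, _ = A.shape
    # Spectral initialization: leading eigenvector of the Hermitian part of
    # (1/m) * sum_i conj(y_i) A_i, a surrogate whose expectation is
    # proportional to x x^H under rotation-invariant i.i.d. entry models.
    S = np.tensordot(y.conj(), A, axes=1) / m
    S = (S + S.conj().T) / 2
    _, V = np.linalg.eigh(S)                     # ascending eigenvalues
    z = V[:, -1] * np.sqrt(np.mean(np.abs(y)))   # heuristic scale for ||x||
    for _ in range(n_iters):
        r = np.einsum('j,ijk,k->i', z.conj(), A, z) - y  # z^H A_i z - y_i
        Az = np.einsum('ijk,k->ij', A, z)                # A_i z
        AHz = np.einsum('ikj,k->ij', A.conj(), z)        # A_i^H z
        # Wirtinger gradient of (1/2m) * sum_i |z^H A_i z - y_i|^2
        grad = (r.conj()[:, None] * Az + r[:, None] * AHz).mean(axis=0)
        z = z - eta * grad
    return z
```

As in phase retrieval, the measurements are invariant to a global phase, so recovery error should be measured modulo $e^{i\theta}$.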
Provably Accelerating Ill-Conditioned Low-rank Estimation via Scaled Gradient Descent, Even with Overparameterization
Many problems encountered in science and engineering can be formulated as
estimating a low-rank object (e.g., matrices and tensors) from incomplete, and
possibly corrupted, linear measurements. Through the lens of matrix and tensor
factorization, one of the most popular approaches is to employ simple iterative
algorithms such as gradient descent (GD) to recover the low-rank factors
directly, which allow for small memory and computation footprints. However, the
convergence rate of GD depends linearly, and sometimes even quadratically, on
the condition number of the low-rank object, and therefore GD becomes
painstakingly slow when the problem is ill-conditioned. This chapter introduces a
new algorithmic approach, dubbed scaled gradient descent (ScaledGD), that
provably converges linearly at a constant rate independent of the condition
number of the low-rank object, while maintaining the low per-iteration cost of
gradient descent for a variety of tasks including sensing, robust principal
component analysis and completion. In addition, ScaledGD continues to admit
fast global convergence to the minimax-optimal solution, again almost
independent of the condition number, from a small random initialization when
the rank is over-specified in the presence of Gaussian noise. In total,
ScaledGD highlights the power of appropriate preconditioning in accelerating
nonconvex statistical estimation, where the iteration-varying preconditioners
promote desirable invariance properties of the trajectory with respect to the
symmetry in low-rank factorization without hurting generalization.
Comment: Book chapter for "Explorations in the Mathematics of Data Science - The Inaugural Volume of the Center for Approximation and Mathematical Data Analytics". arXiv admin note: text overlap with arXiv:2104.1452
Landscape Correspondence of Empirical and Population Risks in the Eigendecomposition Problem
Spectral methods include a family of algorithms related to the eigenvectors
of certain data-generated matrices. In this work, we are interested in studying
the geometric landscape of the eigendecomposition problem in various spectral
methods. In particular, we first extend known results regarding the landscape
at critical points to larger regions near the critical points in a special case
of finding the leading eigenvector of a symmetric matrix. For a more general
eigendecomposition problem, inspired by recent findings on the connection
between the landscapes of empirical risk and population risk, we then build a
novel connection between the landscape of an eigendecomposition problem that
uses random measurements and the one that uses the true data matrix. We also
apply our theory to a variety of low-rank matrix optimization problems and
conduct a series of simulations to illustrate our theoretical findings.
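To make the leading-eigenvector special case concrete, here is a minimal NumPy sketch of gradient descent on the population risk $f(\mathbf{x}) = \frac{1}{4}\|\mathbf{M} - \mathbf{x}\mathbf{x}^\top\|_F^2$, whose global minimizers are $\pm\sqrt{\lambda_1}\,\mathbf{v}_1$ when $\lambda_1 > 0$. The function name and step size are illustrative; the empirical-risk variant studied in the paper would replace $\mathbf{M}$ with a surrogate built from random measurements.

```python
import numpy as np

def leading_eigvec_gd(M, n_iters=1000, eta=0.01, seed=0):
    """Sketch: GD on f(x) = 0.25 * ||M - x x^T||_F^2 for symmetric M.

    Converges (for small eta relative to lambda_1) to +/- sqrt(lambda_1) v_1
    when the leading eigenvalue lambda_1 is positive.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(M.shape[0])
    for _ in range(n_iters):
        x = x - eta * ((np.outer(x, x) - M) @ x)  # grad f = (x x^T - M) x
    lam = x @ M @ x / (x @ x)                      # Rayleigh quotient
    return lam, x / np.linalg.norm(x)
```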
Bridging Convex and Nonconvex Optimization in Robust PCA: Noise, Outliers, and Missing Data
This paper delivers improved theoretical guarantees for the convex
programming approach in low-rank matrix estimation, in the presence of (1)
random noise, (2) gross sparse outliers, and (3) missing data. This problem,
often dubbed robust principal component analysis (robust PCA), finds
applications in various domains. Despite the wide applicability of convex
relaxation, the available statistical support (particularly the stability
analysis vis-à-vis random noise) remains highly suboptimal, which we strengthen
in this paper. When the unknown matrix is well-conditioned, incoherent, and of
constant rank, we demonstrate that a principled convex program achieves
near-optimal statistical accuracy, in terms of both the Euclidean loss and the
$\ell_\infty$ loss. All of this happens even when nearly a constant fraction
of observations are corrupted by outliers with arbitrary magnitudes. The key
analysis idea lies in bridging the convex program in use and an auxiliary
nonconvex optimization algorithm, and hence the title of this paper.
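For reference, here is a minimal cvxpy sketch of this kind of convex program: a squared loss on the observed entries plus a nuclear-norm penalty on the low-rank part and an elementwise $\ell_1$ penalty on the sparse part. The function name and the weights lam and mu are placeholders; the paper's exact formulation and its principled, noise-dependent tuning may differ.

```python
import cvxpy as cp
import numpy as np

def robust_pca_convex(Y, mask, lam, mu):
    """Sketch: convex robust PCA with noise, outliers, and missing data.

    Y: observed data matrix; mask: 0/1 array marking observed entries;
    lam, mu: illustrative regularization weights.
    """
    L = cp.Variable(Y.shape)  # low-rank component
    S = cp.Variable(Y.shape)  # sparse outlier component
    resid = cp.multiply(mask, Y - L - S)  # fit only the observed entries
    obj = (0.5 * cp.sum_squares(resid)
           + lam * cp.normNuc(L)          # nuclear norm promotes low rank
           + mu * cp.sum(cp.abs(S)))      # l1 norm promotes sparsity
    cp.Problem(cp.Minimize(obj)).solve()
    return L.value, S.value
```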