
    Nonconvex Optimization Meets Low-Rank Matrix Factorization: An Overview

    Substantial progress has been made recently on developing provably accurate and efficient algorithms for low-rank matrix factorization via nonconvex optimization. While conventional wisdom often takes a dim view of nonconvex optimization algorithms due to their susceptibility to spurious local minima, simple iterative methods such as gradient descent have been remarkably successful in practice. The theoretical footing, however, had been largely lacking until recently. In this tutorial-style overview, we highlight the important role of statistical models in enabling efficient nonconvex optimization with performance guarantees. We review two contrasting approaches: (1) two-stage algorithms, which consist of a tailored initialization step followed by successive refinement; and (2) global landscape analysis and initialization-free algorithms. Several canonical matrix factorization problems are discussed, including but not limited to matrix sensing, phase retrieval, matrix completion, blind deconvolution, robust principal component analysis, phase synchronization, and joint alignment. Special care is taken to illustrate the key technical insights underlying their analyses. This article serves as a testament that the integrated consideration of optimization and statistics leads to fruitful research findings. Comment: Invited overview article.
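    The two-stage recipe is easiest to see on a concrete instance. The sketch below, a minimal illustration rather than any particular algorithm from the overview, applies it to real-valued phase retrieval: a spectral initialization from the measurement-weighted covariance matrix, followed by plain gradient descent on the squared-intensity loss. Problem sizes, the step-size scaling, and iteration counts are illustrative assumptions.

```python
import numpy as np

def phase_retrieval_two_stage(A, y, iters=500, step=0.1):
    """Two-stage nonconvex phase retrieval (illustrative sketch).

    A : (m, n) sensing vectors stacked as rows
    y : (m,)  intensity measurements y_i = (a_i^T x)^2
    """
    m, n = A.shape

    # Stage 1: spectral initialization -- leading eigenvector of
    # (1/m) * sum_i y_i a_i a_i^T, rescaled to the estimated signal norm.
    Y = (A * y[:, None]).T @ A / m
    eigvals, eigvecs = np.linalg.eigh(Y)
    x = eigvecs[:, -1] * np.sqrt(np.mean(y))

    # Stage 2: gradient descent on f(x) = (1/4m) * sum_i ((a_i^T x)^2 - y_i)^2,
    # with the step scaled by the squared norm of the initialization.
    eta = step / (np.linalg.norm(x) ** 2 + 1e-12)
    for _ in range(iters):
        r = (A @ x) ** 2 - y                 # residuals of the intensities
        grad = A.T @ (r * (A @ x)) / m       # gradient of the quartic loss
        x = x - eta * grad
    return x
```

    Up to a global sign flip, this kind of scheme recovers the planted signal when the sensing vectors are generic (e.g. Gaussian) and the number of measurements is a sufficiently large multiple of the dimension; the step size typically needs tuning.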

    Scalable Robust Matrix Factorization with Nonconvex Loss

    Robust matrix factorization (RMF), which uses the $\ell_1$-loss, often outperforms standard matrix factorization using the $\ell_2$-loss, particularly when outliers are present. The state-of-the-art RMF solver is the RMF-MM algorithm, which, however, cannot utilize data sparsity. Moreover, sometimes even the (convex) $\ell_1$-loss is not robust enough. In this paper, we propose the use of nonconvex loss to enhance robustness. To address the resultant difficult optimization problem, we use majorization-minimization (MM) optimization and propose a new MM surrogate. To improve scalability, we exploit data sparsity and optimize the surrogate via its dual with the accelerated proximal gradient algorithm. The resultant algorithm has low time and space complexities and is guaranteed to converge to a critical point. Extensive experiments demonstrate its superiority over the state-of-the-art in terms of both accuracy and scalability.
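    To make the majorization-minimization idea concrete, here is a minimal sketch, not the paper's surrogate or its sparse dual solver, for factorizing a matrix under the nonconvex loss log(1 + r^2/sigma^2): because this loss is a concave function of r^2, it admits a weighted quadratic majorizer, so each MM step reduces to a reweighted least-squares update of the two factors. The loss choice, the ridge term, and the alternating updates are assumptions made purely for illustration.

```python
import numpy as np

def robust_mf_mm(M, rank=5, sigma=1.0, outer=30, lam=1e-3):
    """Majorization-minimization for robust matrix factorization (sketch).

    Loss: sum_ij log(1 + (M - UV^T)_ij^2 / sigma^2), majorized at the
    current residual by a weighted quadratic with weights
    w_ij = 1 / (sigma^2 + r_ij^2), then decreased by alternating
    ridge-regularized weighted least squares on U and V.
    """
    n1, n2 = M.shape
    rng = np.random.default_rng(0)
    U = rng.standard_normal((n1, rank))
    V = rng.standard_normal((n2, rank))

    for _ in range(outer):
        R = M - U @ V.T
        W = 1.0 / (sigma**2 + R**2)          # majorizer weights

        # Row-wise weighted least squares for U (V fixed), then for V.
        # Unoptimized loops, kept explicit for clarity.
        for i in range(n1):
            Wi = np.diag(W[i])
            U[i] = np.linalg.solve(V.T @ Wi @ V + lam * np.eye(rank),
                                   V.T @ Wi @ M[i])
        for j in range(n2):
            Wj = np.diag(W[:, j])
            V[j] = np.linalg.solve(U.T @ Wj @ U + lam * np.eye(rank),
                                   U.T @ Wj @ M[:, j])
    return U, V
```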

    Exploiting the structure effectively and efficiently in low rank matrix recovery

    Low-rank models arise from a wide range of applications, including machine learning, signal processing, computer algebra, computer vision, and imaging science. Low-rank matrix recovery is about reconstructing a low-rank matrix from incomplete measurements. In this survey we review recent developments on low-rank matrix recovery, focusing on three typical scenarios: matrix sensing, matrix completion, and phase retrieval. An overview of effective and efficient approaches for the problem is given, including nuclear norm minimization, projected gradient descent based on matrix factorization, and Riemannian optimization based on the embedded manifold of low-rank matrices. Numerical recipes for the different approaches are emphasized, accompanied by the corresponding theoretical recovery guarantees.
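    As a pointer to what projected gradient descent means in this setting, the sketch below implements singular value projection for matrix completion: a gradient step on the squared error over observed entries, followed by projection onto rank-r matrices via a truncated SVD. It is a generic illustration under an assumed step size and sampling model, not a recipe taken from the survey.

```python
import numpy as np

def svp_matrix_completion(M_obs, mask, r, iters=200, step=1.0):
    """Projected gradient descent onto rank-r matrices (illustrative).

    M_obs : observed matrix with zeros at unobserved positions
    mask  : boolean array, True where an entry is observed
    r     : target rank of the projection
    """
    X = np.zeros_like(M_obs)
    for _ in range(iters):
        # Gradient of (1/2) * ||P_Omega(X - M)||_F^2 is P_Omega(X - M).
        grad = np.where(mask, X - M_obs, 0.0)
        Y = X - step * grad

        # Projection onto rank <= r via truncated SVD.
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = (U[:, :r] * s[:r]) @ Vt[:r]
    return X
```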

    Global Optimality in Low-rank Matrix Optimization

    This paper considers the minimization of a general objective function $f(X)$ over the set of rectangular $n \times m$ matrices that have rank at most $r$. To reduce the computational burden, we factorize the variable $X$ into a product of two smaller matrices and optimize over these two matrices instead of $X$. Despite the resulting nonconvexity, recent studies in matrix completion and sensing have shown that the factored problem has no spurious local minima and obeys the so-called strict saddle property (the function has a direction of negative curvature at every critical point that is not a local minimum). We analyze the global geometry for a general yet well-conditioned objective function $f(X)$ whose restricted strong convexity and restricted strong smoothness constants are comparable. In particular, we show that the reformulated objective function has no spurious local minima and obeys the strict saddle property. These geometric properties imply that a number of iterative optimization algorithms (such as gradient descent) can provably solve the factored problem with global convergence.
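    Since the parenthetical definition is compressed, here is the same property written out; $\mathcal{W}^\star$ denotes the set of local minimizers and the symbols are generic placeholders, not this paper's notation.

```latex
% Strict saddle property (informal form used above): at every critical point w
% that is not a local minimum, the Hessian has a strictly negative eigenvalue,
\nabla g(w) = 0 \ \text{and}\ w \notin \mathcal{W}^{\star}
\;\Longrightarrow\;
\lambda_{\min}\!\bigl(\nabla^{2} g(w)\bigr) < 0 .
% Quantitative "robust" variants additionally require every point to have either a
% large gradient, curvature below a negative threshold, or proximity to a minimizer.
```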

    Harnessing Structures in Big Data via Guaranteed Low-Rank Matrix Estimation

    Low-rank modeling plays a pivotal role in signal processing and machine learning, with applications ranging from collaborative filtering, video surveillance, and medical imaging to dimensionality reduction and adaptive filtering. Many modern high-dimensional data and interactions thereof can be modeled as lying approximately in a low-dimensional subspace or manifold, possibly with additional structures, and their proper exploitation leads to significant reductions in the costs of sensing, computation, and storage. In recent years, there has been a plethora of progress in understanding how to exploit low-rank structures using computationally efficient procedures in a provable manner, including both convex and nonconvex approaches. On one side, convex relaxations such as nuclear norm minimization often lead to statistically optimal procedures for estimating low-rank matrices, where first-order methods are developed to address the computational challenges; on the other side, there is emerging evidence that properly designed nonconvex procedures, such as projected gradient descent, often provide globally optimal solutions with a much lower computational cost in many problems. This survey article provides a unified overview of these recent advances on low-rank matrix estimation from incomplete measurements. Attention is paid to rigorous characterization of the performance of these algorithms, and to problems where the low-rank matrix has additional structural properties that require new algorithmic designs and theoretical analysis. Comment: To appear in IEEE Signal Processing Magazine.
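    As a concrete instance of the convex side, the sketch below runs proximal gradient descent on a nuclear-norm-regularized completion objective; the proximal step is singular value soft-thresholding. The regularization weight, step size, and iteration count are assumptions for illustration and are not drawn from the survey.

```python
import numpy as np

def nuclear_norm_completion(M_obs, mask, lam=0.1, step=1.0, iters=300):
    """Proximal gradient for min_X (1/2)||P_Omega(X - M)||_F^2 + lam * ||X||_*."""
    X = np.zeros_like(M_obs)
    for _ in range(iters):
        grad = np.where(mask, X - M_obs, 0.0)        # gradient of the smooth part
        Y = X - step * grad

        # Prox of the nuclear norm: soft-threshold the singular values.
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        s = np.maximum(s - step * lam, 0.0)
        X = (U * s) @ Vt
    return X
```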

    Model-free Nonconvex Matrix Completion: Local Minima Analysis and Applications in Memory-efficient Kernel PCA

    This work studies low-rank approximation of a positive semidefinite matrix from partial entries via nonconvex optimization. We characterize how well local-minimum-based low-rank factorization approximates a fixed positive semidefinite matrix without any assumptions on rank-matching, the condition number, or the eigenspace incoherence parameter. Furthermore, under certain assumptions on rank-matching and well-boundedness of the condition number and eigenspace incoherence parameter, a corollary of our main theorem improves the state-of-the-art sampling-rate results for nonconvex matrix completion with no spurious local minima in Ge et al. [2016, 2017]. In addition, we investigate when the proposed nonconvex optimization results in accurate low-rank approximations even in the presence of large condition numbers, large incoherence parameters, or rank mismatching. We also propose to apply the nonconvex optimization to memory-efficient kernel PCA. Compared to the well-known Nyström methods, numerical experiments indicate that the proposed nonconvex optimization approach yields more stable results in both low-rank approximation and clustering. Comment: Main theorem improved.
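    The nonconvex program in question can be written as fitting a symmetric factor to the observed entries. The sketch below, an illustrative rendering under assumed dimensions and step size rather than the authors' exact procedure, runs gradient descent on a thin factor G so that G G^T matches the sampled entries of a PSD matrix K; the memory savings come from storing only the observed entries and the thin factor.

```python
import numpy as np

def psd_completion_factored(K_obs, mask, r, iters=500, step=1e-3):
    """Gradient descent on (1/2)||P_Omega(G G^T - K)||_F^2 over G (n x r).

    K_obs : observed entries of a PSD matrix (zeros elsewhere)
    mask  : boolean array of observed positions, assumed symmetric
    """
    n = K_obs.shape[0]
    rng = np.random.default_rng(0)
    G = rng.standard_normal((n, r)) / np.sqrt(n)

    for _ in range(iters):
        # Dense residual for clarity; a scalable variant would only
        # evaluate G G^T on the observed index set.
        R = np.where(mask, G @ G.T - K_obs, 0.0)
        grad = 2.0 * R @ G                  # gradient w.r.t. G (symmetric mask)
        G = G - step * grad
    return G                                # low-rank approximation is G @ G.T
```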

    The Global Optimization Geometry of Low-Rank Matrix Optimization

    This paper considers general rank-constrained optimization problems that minimize a general objective function $f(X)$ over the set of rectangular $n \times m$ matrices that have rank at most $r$. To tackle the rank constraint and also to reduce the computational burden, we factorize $X$ into $UV^T$, where $U$ and $V$ are $n \times r$ and $m \times r$ matrices, respectively, and then optimize over the small matrices $U$ and $V$. We characterize the global optimization geometry of the nonconvex factored problem and show that the corresponding objective function satisfies the robust strict saddle property as long as the original objective function $f$ satisfies restricted strong convexity and smoothness properties, ensuring global convergence of many local search algorithms (such as noisy gradient descent) in polynomial time for solving the factored problem. We also provide a comprehensive analysis of the optimization geometry of a matrix factorization problem where we aim to find $n \times r$ and $m \times r$ matrices $U$ and $V$ such that $UV^T$ approximates a given matrix $X^\star$. Aside from the robust strict saddle property, we show that the objective function of the matrix factorization problem has no spurious local minima and obeys the strict saddle property not only for the exact-parameterization case where $\mathrm{rank}(X^\star) = r$, but also for the over-parameterization case where $\mathrm{rank}(X^\star) < r$ and the under-parameterization case where $\mathrm{rank}(X^\star) > r$. These geometric properties imply that a number of iterative optimization algorithms (such as gradient descent) converge to a global solution with random initialization.
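    The matrix factorization problem analyzed in the second half admits a very small implementation. Below is a hedged sketch of gradient descent with random initialization on the factored objective, with the usual balancing term added so that the two factors stay comparable in scale; the regularization weight and step size are illustrative assumptions, not values from the paper.

```python
import numpy as np

def factored_fit(X_star, r, iters=2000, step=1e-2, mu=0.25):
    """Gradient descent on
       (1/2)||U V^T - X*||_F^2 + (mu/4)||U^T U - V^T V||_F^2
    with small random initialization (illustrative sketch)."""
    n, m = X_star.shape
    rng = np.random.default_rng(0)
    U = rng.standard_normal((n, r)) * 0.01
    V = rng.standard_normal((m, r)) * 0.01

    for _ in range(iters):
        R = U @ V.T - X_star                 # fitting residual
        D = U.T @ U - V.T @ V                # factor imbalance
        grad_U = R @ V + mu * U @ D
        grad_V = R.T @ U - mu * V @ D
        U -= step * grad_U                   # step size may need tuning
        V -= step * grad_V                   # to the scale of X_star
    return U, V
```

    Choosing r equal to, larger than, or smaller than the rank of X_star lets the same sketch illustrate the exact-, over-, and under-parameterization regimes discussed in the abstract.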

    Convergence Analysis for Rectangular Matrix Completion Using Burer-Monteiro Factorization and Gradient Descent

    We address the rectangular matrix completion problem by lifting the unknown matrix to a positive semidefinite matrix in higher dimension, and optimizing a nonconvex objective over the semidefinite factor using a simple gradient descent scheme. With $O(\mu r^2 \kappa^2 n \max(\mu, \log n))$ random observations of an $n_1 \times n_2$ $\mu$-incoherent matrix of rank $r$ and condition number $\kappa$, where $n = \max(n_1, n_2)$, the algorithm linearly converges to the global optimum with high probability.
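    To spell out the lifting in the first sentence: writing the unknown rectangular matrix as $UV^T$ and stacking the factors yields a positive semidefinite matrix of size $(n_1+n_2) \times (n_1+n_2)$ whose off-diagonal block is the matrix being completed, and the gradient scheme operates on the stacked factor. This is the standard Burer-Monteiro lifting, written out here for orientation rather than quoted from the paper.

```latex
% Lifting the rectangular unknown X = U V^T to a PSD matrix: with the stacked
% semidefinite factor W = [U; V], the lifted variable is
Z \;=\; W W^{T} \;=\;
\begin{bmatrix} U \\ V \end{bmatrix}
\begin{bmatrix} U^{T} & V^{T} \end{bmatrix}
\;=\;
\begin{bmatrix} U U^{T} & U V^{T} \\ V U^{T} & V V^{T} \end{bmatrix}
\succeq 0 ,
% so the matrix of interest appears as the off-diagonal block of Z.
```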

    Noisy Matrix Completion: Understanding Statistical Guarantees for Convex Relaxation via Nonconvex Optimization

    This paper studies noisy low-rank matrix completion: given partial and noisy entries of a large low-rank matrix, the goal is to estimate the underlying matrix faithfully and efficiently. Arguably one of the most popular paradigms to tackle this problem is convex relaxation, which achieves remarkable efficacy in practice. However, the theoretical support of this approach is still far from optimal in the noisy setting, falling short of explaining its empirical success. We make progress towards demystifying the practical efficacy of convex relaxation vis-à-vis random noise. When the rank and the condition number of the unknown matrix are bounded by a constant, we demonstrate that the convex programming approach achieves near-optimal estimation errors, in terms of the Euclidean loss, the entrywise loss, and the spectral norm loss, for a wide range of noise levels. All of this is enabled by bridging convex relaxation with the nonconvex Burer-Monteiro approach, a seemingly distinct algorithmic paradigm that is provably robust against noise. More specifically, we show that an approximate critical point of the nonconvex formulation serves as an extremely tight approximation of the convex solution, thus allowing us to transfer the desired statistical guarantees of the nonconvex approach to its convex counterpart.
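    For orientation, the convex and nonconvex programs being bridged can be written in the following standard regularized forms; this is a generic statement of the pair, with $\lambda$ a regularization parameter, and the paper's exact formulations may differ in details.

```latex
% Convex relaxation: nuclear-norm-regularized least squares over the observed set Omega.
\min_{Z \in \mathbb{R}^{n_1 \times n_2}} \;
\tfrac{1}{2}\bigl\|\mathcal{P}_{\Omega}(Z - M)\bigr\|_F^2 + \lambda \|Z\|_{*} ,
% Nonconvex Burer-Monteiro counterpart: factor Z = U V^T and penalize the factors,
% using the identity ||Z||_* = min_{U V^T = Z} (||U||_F^2 + ||V||_F^2)/2.
\min_{U \in \mathbb{R}^{n_1 \times r},\, V \in \mathbb{R}^{n_2 \times r}} \;
\tfrac{1}{2}\bigl\|\mathcal{P}_{\Omega}(U V^{T} - M)\bigr\|_F^2
+ \tfrac{\lambda}{2}\bigl(\|U\|_F^2 + \|V\|_F^2\bigr) .
```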

    How Much Restricted Isometry is Needed In Nonconvex Matrix Recovery?

    When the linear measurements of an instance of low-rank matrix recovery satisfy a restricted isometry property (RIP), i.e. they are approximately norm-preserving, the problem is known to contain no spurious local minima, so exact recovery is guaranteed. In this paper, we show that moderate RIP is not enough to eliminate spurious local minima, so existing results can only hold for near-perfect RIP. In fact, counterexamples are ubiquitous: we prove that every $x$ is the spurious local minimum of a rank-1 instance of matrix recovery that satisfies RIP. One specific counterexample has RIP constant $\delta = 1/2$, but causes randomly initialized stochastic gradient descent (SGD) to fail 12% of the time. SGD is frequently able to avoid and escape spurious local minima, but this empirical result shows that it can occasionally be defeated by their existence. Hence, while exact recovery guarantees will likely require a proof of no spurious local minima, arguments based solely on norm preservation will only be applicable to a narrow set of nearly-isotropic instances. Comment: 32nd Conference on Neural Information Processing Systems (NIPS 2018).