
    Exploiting the structure effectively and efficiently in low rank matrix recovery

    Low rank models arise in a wide range of applications, including machine learning, signal processing, computer algebra, computer vision, and imaging science. Low rank matrix recovery is about reconstructing a low rank matrix from incomplete measurements. In this survey we review recent developments on low rank matrix recovery, focusing on three typical scenarios: matrix sensing, matrix completion and phase retrieval. An overview of effective and efficient approaches for the problem is given, including nuclear norm minimization, projected gradient descent based on matrix factorization, and Riemannian optimization based on the embedded manifold of low rank matrices. Numerical recipes for the different approaches are emphasized, accompanied by the corresponding theoretical recovery guarantees.
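
    A minimal sketch of the nuclear-norm side of this toolbox, assuming a NumPy setting: `svt` below implements singular value soft-thresholding (the proximal operator of the nuclear norm), and the loop uses it inside proximal gradient descent on a toy matrix-completion instance. The names `svt` and `tau` and the problem sizes are illustrative choices, not taken from the survey itself.

```python
import numpy as np

def svt(Y, tau):
    """Singular value soft-thresholding: prox of tau * nuclear norm at Y.

    Returns the minimizer of 0.5*||X - Y||_F^2 + tau*||X||_*, obtained by
    shrinking every singular value of Y by tau and clipping at zero.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

# Toy matrix completion: proximal gradient on the squared loss over observed entries.
rng = np.random.default_rng(0)
m, n, r = 50, 40, 3
M = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # low-rank ground truth
mask = rng.random((m, n)) < 0.5                                 # observed-entry pattern

X, tau, step = np.zeros((m, n)), 1.0, 1.0
for _ in range(200):
    grad = mask * (X - M)                  # gradient of 0.5*||P_Omega(X - M)||_F^2
    X = svt(X - step * grad, tau * step)   # proximal (soft-thresholding) step

print("relative error of the regularized estimate:",
      np.linalg.norm(X - M) / np.linalg.norm(M))
```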

    Finding Low-Rank Solutions via Non-Convex Matrix Factorization, Efficiently and Provably

    A rank-$r$ matrix $X \in \mathbb{R}^{m \times n}$ can be written as a product $U V^\top$, where $U \in \mathbb{R}^{m \times r}$ and $V \in \mathbb{R}^{n \times r}$. One could exploit this observation in optimization: e.g., consider the minimization of a convex function $f(X)$ over rank-$r$ matrices, where the set of rank-$r$ matrices is modeled via the factorization $UV^\top$. Though such a parameterization reduces the number of variables and is more computationally efficient (of particular interest is the case $r \ll \min\{m, n\}$), it comes at a cost: $f(UV^\top)$ becomes a non-convex function w.r.t. $U$ and $V$. We study such a parameterization for optimization of generic convex objectives $f$, and focus on first-order, gradient descent algorithmic solutions. We propose the Bi-Factored Gradient Descent (BFGD) algorithm, an efficient first-order method that operates on the $U, V$ factors. We show that when $f$ is (restricted) smooth, BFGD has local sublinear convergence, and linear convergence when $f$ is both (restricted) smooth and (restricted) strongly convex. For several key applications, we provide simple and efficient initialization schemes that provide approximate solutions good enough for the above convergence results to hold. Comment: 45 pages.
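
    As a rough illustration of factored first-order methods of this kind, the sketch below runs plain gradient descent on the $(U, V)$ factors for a small matrix-sensing objective $f(X) = \frac{1}{2p}\|\mathcal{A}(X) - y\|^2$. The spectral initialization, step size, and problem dimensions are demo assumptions; the balancing regularizer and the exact BFGD step-size rule from the paper are omitted.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, r, p = 20, 15, 2, 600

# Ground truth and Gaussian sensing operator: y_i = <A_i, X*>.
Xstar = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
A = rng.standard_normal((p, m, n))
measure = lambda X: np.tensordot(A, X, axes=([1, 2], [0, 1]))
y = measure(Xstar)

# f(X) = 0.5/p * ||A(X) - y||^2, so grad f(X) = (1/p) * sum_i (<A_i, X> - y_i) A_i.
grad_f = lambda X: np.tensordot(measure(X) - y, A, axes=(0, 0)) / p

# Spectral initialization from the back-projection (1/p) * sum_i y_i A_i, then gradient
# steps on the factors via the chain rule: dU = grad_f(X) @ V and dV = grad_f(X)^T @ U.
U0, s0, V0t = np.linalg.svd(np.tensordot(y, A, axes=(0, 0)) / p, full_matrices=False)
U = U0[:, :r] * np.sqrt(s0[:r])
V = V0t[:r].T * np.sqrt(s0[:r])

step = 0.05 / s0[0]               # conservative step tied to the estimated spectral norm
for _ in range(3000):
    G = grad_f(U @ V.T)
    U, V = U - step * G @ V, V - step * G.T @ U

print("relative error:", np.linalg.norm(U @ V.T - Xstar) / np.linalg.norm(Xstar))
```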

    Dropping Convexity for Faster Semi-definite Optimization

    We study the minimization of a convex function $f(X)$ over the set of $n \times n$ positive semi-definite matrices, but when the problem is recast as $\min_U g(U) := f(UU^\top)$, with $U \in \mathbb{R}^{n \times r}$ and $r \leq n$. We study the performance of gradient descent on $g$, which we refer to as Factored Gradient Descent (FGD), under standard assumptions on the original function $f$. We provide a rule for selecting the step size and, with this choice, show that the local convergence rate of FGD mirrors that of standard gradient descent on the original $f$: i.e., after $k$ steps, the error is $O(1/k)$ for smooth $f$, and exponentially small in $k$ when $f$ is (restricted) strongly convex. In addition, we provide a procedure to initialize FGD for (restricted) strongly convex objectives and when one only has access to $f$ via a first-order oracle; for several problem instances, such proper initialization leads to global convergence guarantees. FGD and similar procedures are widely used in practice for problems that can be posed as matrix factorization. To the best of our knowledge, this is the first paper to provide precise convergence rate guarantees for general convex functions under standard convex assumptions. Comment: 40 pages.
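
    The sketch below shows the basic mechanics in NumPy, assuming the simplest smooth and strongly convex choice $f(X) = \frac{1}{2}\|X - M\|_F^2$ for a rank-$r$ PSD target $M$: gradient descent on $g(U) = f(UU^\top)$ with a step size tied to the spectral norm of the target, in the spirit of (but not identical to) the paper's step-size rule. The small random initialization and the constant $0.1$ are demo assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 30, 3

# Rank-r PSD target and the objective f(X) = 0.5*||X - M||_F^2.
B = rng.standard_normal((n, r))
M = B @ B.T
grad_f = lambda X: X - M            # any smooth convex f could be plugged in here

# Factored Gradient Descent on g(U) = f(U U^T): grad g(U) = (grad_f(X) + grad_f(X)^T) @ U.
U = rng.standard_normal((n, r)) * 0.01     # small random init (the paper uses a dedicated init)
step = 0.1 / np.linalg.norm(M, 2)          # step tied to the spectral norm of the target

for _ in range(1500):
    G = grad_f(U @ U.T)
    U = U - step * (G + G.T) @ U

print("relative error:", np.linalg.norm(U @ U.T - M) / np.linalg.norm(M))
```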

    Harnessing Structures in Big Data via Guaranteed Low-Rank Matrix Estimation

    Low-rank modeling plays a pivotal role in signal processing and machine learning, with applications ranging from collaborative filtering, video surveillance, and medical imaging to dimensionality reduction and adaptive filtering. Many modern high-dimensional data and interactions thereof can be modeled as lying approximately in a low-dimensional subspace or manifold, possibly with additional structures, and proper exploitation of this structure leads to significant reductions in the cost of sensing, computation and storage. In recent years, there has been a plethora of progress in understanding how to exploit low-rank structures using computationally efficient procedures in a provable manner, including both convex and nonconvex approaches. On one side, convex relaxations such as nuclear norm minimization often lead to statistically optimal procedures for estimating low-rank matrices, where first-order methods are developed to address the computational challenges; on the other side, there is emerging evidence that properly designed nonconvex procedures, such as projected gradient descent, often provide globally optimal solutions with a much lower computational cost in many problems. This survey article provides a unified overview of these recent advances on low-rank matrix estimation from incomplete measurements. Attention is paid to rigorous characterization of the performance of these algorithms, and to problems where the low-rank matrix has additional structural properties that require new algorithmic designs and theoretical analysis. Comment: To appear in IEEE Signal Processing Magazine.

    Nonconvex Low-Rank Matrix Recovery with Arbitrary Outliers via Median-Truncated Gradient Descent

    Recent work has demonstrated the effectiveness of gradient descent for directly recovering the factors of low-rank matrices from random linear measurements in a globally convergent manner when initialized properly. However, the performance of existing algorithms is highly sensitive to outliers that may take arbitrary values. In this paper, we propose a truncated gradient descent algorithm to improve the robustness against outliers, where the truncation rules out the contributions of samples whose measurement residuals deviate significantly from the sample median, adaptively in each iteration. We demonstrate that, when initialized in a basin of attraction close to the ground truth, the proposed algorithm converges to the ground truth at a linear rate for the Gaussian measurement model with a near-optimal number of measurements, even when a constant fraction of the measurements are arbitrarily corrupted. In addition, we propose a new truncated spectral method that ensures an initialization in the basin of attraction at only slightly higher sample requirements. We finally provide numerical experiments to validate the superior performance of the proposed approach. Comment: 30 pages, 3 figures.
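
    A small sketch of the median-truncation idea for PSD matrix sensing with outliers, assuming NumPy and an initialization placed near the truth by hand (the paper's truncated spectral initialization is not reproduced here); the truncation constant 3 and all problem sizes are demo choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n, r, p = 25, 2, 1500

# Rank-r PSD ground truth, Gaussian linear measurements, and 5% arbitrary corruptions.
Bstar = rng.standard_normal((n, r))
Xstar = Bstar @ Bstar.T
A = rng.standard_normal((p, n, n))
y = np.tensordot(A, Xstar, axes=([1, 2], [0, 1]))
bad = rng.choice(p, p // 20, replace=False)
y[bad] += 100.0 * rng.standard_normal(len(bad))

# Median-truncated gradient descent on the factor U of X = U U^T.
U = Bstar + 0.1 * rng.standard_normal((n, r))          # hand-placed init inside the basin
step = 0.1 / (np.linalg.norm(Xstar, 2) * p)
for _ in range(600):
    res = np.tensordot(A, U @ U.T, axes=([1, 2], [0, 1])) - y
    keep = np.abs(res) <= 3.0 * np.median(np.abs(res))  # drop samples far from the median residual
    G = np.tensordot(res[keep], A[keep], axes=(0, 0))   # gradient built from kept samples only
    U = U - step * (G + G.T) @ U

print("relative error:", np.linalg.norm(U @ U.T - Xstar) / np.linalg.norm(Xstar))
```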

    Convergence Analysis for Rectangular Matrix Completion Using Burer-Monteiro Factorization and Gradient Descent

    We address the rectangular matrix completion problem by lifting the unknown matrix to a positive semidefinite matrix in higher dimension, and optimizing a nonconvex objective over the semidefinite factor using a simple gradient descent scheme. With $O(\mu r^2 \kappa^2 n \max(\mu, \log n))$ random observations of an $n_1 \times n_2$ $\mu$-incoherent matrix of rank $r$ and condition number $\kappa$, where $n = \max(n_1, n_2)$, the algorithm linearly converges to the global optimum with high probability.
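
    For intuition, here is a hedged NumPy sketch of the factored view this analysis takes: a gradient step on the stacked semidefinite factor $W = [U; V]$ of the lifted matrix is the same as the paired updates of $U$ and $V$ below. The zero-filled spectral initialization and step size are demo assumptions, and any regularization terms used in the paper's objective are left out.

```python
import numpy as np

rng = np.random.default_rng(4)
n1, n2, r = 60, 50, 2

# Rank-r ground truth and a random set of observed entries.
Mstar = rng.standard_normal((n1, r)) @ rng.standard_normal((r, n2))
mask = rng.random((n1, n2)) < 0.4

# Spectral initialization from the rescaled zero-filled matrix, then gradient descent on the
# factors; only the off-diagonal block U V^T of the lifted PSD matrix is compared to data.
U0, s0, V0t = np.linalg.svd(np.where(mask, Mstar, 0.0) / mask.mean(), full_matrices=False)
U = U0[:, :r] * np.sqrt(s0[:r])
V = V0t[:r].T * np.sqrt(s0[:r])

step = 0.2 / s0[0]
for _ in range(1000):
    R = mask * (U @ V.T - Mstar)          # residual on the observed entries only
    U, V = U - step * R @ V, V - step * R.T @ U

print("relative error:", np.linalg.norm(U @ V.T - Mstar) / np.linalg.norm(Mstar))
```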

    Low-rank Solutions of Linear Matrix Equations via Procrustes Flow

    In this paper we study the problem of recovering a low-rank matrix from linear measurements. Our algorithm, which we call Procrustes Flow, starts from an initial estimate obtained by a thresholding scheme followed by gradient descent on a non-convex objective. We show that as long as the measurements obey a standard restricted isometry property, our algorithm converges to the unknown matrix at a geometric rate. In the case of Gaussian measurements, such convergence occurs for an $n_1 \times n_2$ matrix of rank $r$ when the number of measurements exceeds a constant times $(n_1 + n_2) r$. Comment: Added new results for general rectangular matrices.

    How Much Restricted Isometry is Needed In Nonconvex Matrix Recovery?

    When the linear measurements of an instance of low-rank matrix recovery satisfy a restricted isometry property (RIP), i.e., they are approximately norm-preserving, the problem is known to contain no spurious local minima, so exact recovery is guaranteed. In this paper, we show that moderate RIP is not enough to eliminate spurious local minima, so existing results can only hold for near-perfect RIP. In fact, counterexamples are ubiquitous: we prove that every $x$ is the spurious local minimum of a rank-1 instance of matrix recovery that satisfies RIP. One specific counterexample has RIP constant $\delta = 1/2$, but causes randomly initialized stochastic gradient descent (SGD) to fail 12% of the time. SGD is frequently able to avoid and escape spurious local minima, but this empirical result shows that it can occasionally be defeated by their existence. Hence, while exact recovery guarantees will likely require a proof of no spurious local minima, arguments based solely on norm preservation will only be applicable to a narrow set of nearly-isotropic instances. Comment: 32nd Conference on Neural Information Processing Systems (NIPS 2018).

    Coordinate Descent Algorithms for Phase Retrieval

    Phase retrieval aims at recovering a complex-valued signal from magnitude-only measurements, and it attracts much attention since it has numerous applications in many disciplines. However, phase recovery involves solving a system of quadratic equations, making it a challenging nonconvex optimization problem. To tackle phase retrieval in an effective and efficient manner, we apply coordinate descent (CD) such that a single unknown is solved at each iteration while all other variables are kept fixed. As a result, only the minimization of a univariate quartic polynomial is needed, which is easily achieved by finding the closed-form roots of a cubic equation. Three computationally simple algorithms, referred to as cyclic, randomized and greedy CD, based on different updating rules, are devised. It is proved that the three CDs globally converge to a stationary point of the nonconvex problem, and specifically, the randomized CD locally converges to the global minimum and attains exact recovery at a geometric rate with high probability if the sample size is large enough. The cyclic and randomized CDs are also modified via minimization of the $\ell_1$-regularized quartic polynomial for phase retrieval of sparse signals. Furthermore, a novel application of the three CDs, namely blind equalization in digital communications, is proposed. It is demonstrated that the CD methodology is superior to the state-of-the-art techniques in terms of computational efficiency and/or recovery performance.
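
    The cyclic variant is easy to sketch for real-valued signals, assuming NumPy: fixing all coordinates but one turns the residual of each sample into a quadratic in the free coordinate, so the coordinate objective is a quartic whose minimizer lies among the real roots of its cubic derivative. The spectral initialization, the real-valued setting, and all sizes are demo assumptions rather than the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 20, 200

# Real-valued phase retrieval: observe y_i = (a_i^T x*)^2 and recover x* up to a global sign.
xstar = rng.standard_normal(n)
A = rng.standard_normal((p, n))
y = (A @ xstar) ** 2

# Spectral initialization (not part of CD itself; it supplies a point near the truth).
D = (A * y[:, None]).T @ A / p
_, V = np.linalg.eigh(D)
x = np.sqrt(y.mean()) * V[:, -1]

Ax = A @ x
for _ in range(30):                         # cyclic CD sweeps
    for j in range(n):
        c = A[:, j]
        b = Ax - c * x[j]                   # measurements with coordinate j removed
        # (a_i^T x)^2 - y_i = alpha_i*t^2 + beta_i*t + gamma_i in t = x_j, so the coordinate
        # objective sum_i (.)^2 is a quartic in t and its derivative is a cubic.
        alpha, beta, gamma = c ** 2, 2 * b * c, b ** 2 - y
        quart = np.array([np.sum(alpha ** 2), 2 * np.sum(alpha * beta),
                          np.sum(beta ** 2 + 2 * alpha * gamma),
                          2 * np.sum(beta * gamma), np.sum(gamma ** 2)])
        roots = np.roots(np.polyder(quart))
        cands = roots[np.abs(roots.imag) < 1e-8].real
        x[j] = cands[np.argmin(np.polyval(quart, cands))]   # closed-form coordinate minimizer
        Ax = b + c * x[j]

err = min(np.linalg.norm(x - xstar), np.linalg.norm(x + xstar)) / np.linalg.norm(xstar)
print("relative error (up to sign):", err)
```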

    Low-Rank Positive Semidefinite Matrix Recovery from Corrupted Rank-One Measurements

    We study the problem of estimating a low-rank positive semidefinite (PSD) matrix from a set of rank-one measurements using sensing vectors composed of i.i.d. standard Gaussian entries, which are possibly corrupted by arbitrary outliers. This problem arises in applications such as phase retrieval, covariance sketching, quantum state tomography, and power spectrum estimation. We first propose a convex optimization algorithm that seeks the PSD matrix with the minimum $\ell_1$-norm of the observation residual. The advantage of our algorithm is that it is free of parameters, therefore eliminating the need for tuning and allowing easy implementation. We establish that with high probability, a low-rank PSD matrix can be exactly recovered as soon as the number of measurements is large enough, even when a fraction of the measurements are corrupted by outliers with arbitrary magnitudes. Moreover, the recovery is also stable against bounded noise. With the additional information of an upper bound on the rank of the PSD matrix, we propose another non-convex algorithm based on subgradient descent that demonstrates excellent empirical performance in terms of computational efficiency and accuracy. Comment: 12 pages, 7 figures.
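
    As a rough sketch of the non-convex route, the code below runs subgradient descent on the factor $U$ of $X = UU^\top$ for the robust loss $\frac{1}{p}\sum_i |a_i^\top U U^\top a_i - y_i|$, using normalized subgradient steps with a geometrically decaying step size and an initialization placed near the truth by hand; the step schedule, initialization, and problem sizes are assumptions, not the paper's algorithm verbatim.

```python
import numpy as np

rng = np.random.default_rng(6)
n, r, p = 20, 2, 1000

# Rank-r PSD ground truth, rank-one Gaussian measurements y_i = a_i^T X* a_i, 5% outliers.
Bstar = rng.standard_normal((n, r))
Xstar = Bstar @ Bstar.T
A = rng.standard_normal((p, n))
y = np.sum((A @ Bstar) ** 2, axis=1)
bad = rng.choice(p, p // 20, replace=False)
y[bad] = rng.uniform(0.0, 100.0, len(bad))             # arbitrary corruptions

# Subgradient descent on U for the l1 residual loss, with geometric step decay.
U = Bstar + 0.1 * rng.standard_normal((n, r))          # hand-placed initialization
step = 0.2
for _ in range(500):
    Z = A @ U
    res = np.sum(Z ** 2, axis=1) - y
    G = (2.0 / p) * A.T @ (np.sign(res)[:, None] * Z)  # a subgradient with respect to U
    U = U - step * G / np.linalg.norm(G)
    step *= 0.98

print("relative error:", np.linalg.norm(U @ U.T - Xstar) / np.linalg.norm(Xstar))
```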