39 research outputs found

    Fast matrix completion without the condition number

    We give the first algorithm for Matrix Completion whose running time and sample complexity are polynomial in the rank of the unknown target matrix, linear in the dimension of the matrix, and logarithmic in the condition number of the matrix. To the best of our knowledge, all previous algorithms either incurred a quadratic dependence on the condition number of the unknown matrix or a quadratic dependence on the dimension of the matrix in the running time. Our algorithm is based on a novel extension of Alternating Minimization which we show has theoretical guarantees under standard assumptions even in the presence of noise.
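
    A minimal numpy sketch of the vanilla alternating minimization pattern the abstract builds on (not the paper's noise-robust extension; all function and variable names are illustrative):

        import numpy as np

        def altmin_complete(M_obs, mask, r, iters=50):
            """Fit M ~ U @ V.T on the observed entries by alternating least squares.

            M_obs : (n1, n2) array, arbitrary values where mask is False
            mask  : (n1, n2) boolean array of observed positions
            r     : target rank
            """
            n1, n2 = M_obs.shape
            rng = np.random.default_rng(0)
            U = rng.standard_normal((n1, r))
            V = rng.standard_normal((n2, r))
            for _ in range(iters):
                # Update each row of U by least squares over that row's observed columns.
                for i in range(n1):
                    cols = mask[i]
                    if cols.any():
                        U[i] = np.linalg.lstsq(V[cols], M_obs[i, cols], rcond=None)[0]
                # Symmetric update for V over each column's observed rows.
                for j in range(n2):
                    rows = mask[:, j]
                    if rows.any():
                        V[j] = np.linalg.lstsq(U[rows], M_obs[rows, j], rcond=None)[0]
            return U @ V.T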

    Convergence Analysis for Rectangular Matrix Completion Using Burer-Monteiro Factorization and Gradient Descent

    We address the rectangular matrix completion problem by lifting the unknown matrix to a positive semidefinite matrix in higher dimension, and optimizing a nonconvex objective over the semidefinite factor using a simple gradient descent scheme. With $O(\mu r^2 \kappa^2 n \max(\mu, \log n))$ random observations of an $n_1 \times n_2$, $\mu$-incoherent matrix of rank $r$ and condition number $\kappa$, where $n = \max(n_1, n_2)$, the algorithm converges linearly to the global optimum with high probability.
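
    A minimal sketch of gradient descent on a factored completion objective, assuming numpy; the paper's actual lifting to a higher-dimensional PSD matrix, its step sizes, and its initialization are not reproduced here:

        import numpy as np

        def bm_gradient_descent(M_obs, mask, r, eta=0.01, iters=500):
            """Gradient descent on f(U, V) = 0.5 * ||mask * (U V^T - M_obs)||_F^2,
            a simplified stand-in for the paper's lifted PSD formulation."""
            n1, n2 = M_obs.shape
            rng = np.random.default_rng(0)
            U = 0.1 * rng.standard_normal((n1, r))
            V = 0.1 * rng.standard_normal((n2, r))
            for _ in range(iters):
                R = mask * (U @ V.T - M_obs)   # residual on observed entries only
                U_new = U - eta * (R @ V)      # dF/dU
                V = V - eta * (R.T @ U)        # dF/dV (uses the pre-update U)
                U = U_new
            return U @ V.T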

    Longitudinal data analysis using matrix completion

    In clinical practice and biomedical research, measurements are often collected sparsely and irregularly in time. Examples include measurements of spine bone mineral density, cancer growth monitored through mammography or biopsy, progression of vision defects, or assessment of gait in patients with neurological disorders. Since data collection is often costly and inconvenient, estimating progression from sparse observations is of great interest to practitioners. From the statistical standpoint, such data are often analyzed with a mixed-effects model in which time is treated as both a random and a fixed effect. Alternatively, researchers use Gaussian processes or functional data analysis, where observations are assumed to be drawn from a certain distribution of processes. These models are flexible but rely on probabilistic assumptions and require very careful implementation. In this study, we propose an alternative elementary framework for analyzing longitudinal data based on matrix completion. Our method yields point estimates of progression curves by iterative application of the SVD. The framework covers multivariate longitudinal data and regression, and can easily be extended to other settings. We apply our method to understand trends in the progression of motor impairment in children with Cerebral Palsy. Our model approximates individual progression curves and explains 30% of the variability. The low-rank representation of progression trends enables us to discover that subtypes of Cerebral Palsy exhibit different progression trends.
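
    The "iterative application of the SVD" can be sketched as hard-impute-style alternation between a rank-r truncation and re-imposing the observed entries (an illustrative sketch, not the authors' exact procedure):

        import numpy as np

        def svd_impute(M_obs, mask, r, iters=100):
            """Iteratively fill missing entries with a rank-r SVD approximation."""
            X = np.where(mask, M_obs, 0.0)               # zeros in the gaps to start
            for _ in range(iters):
                U, s, Vt = np.linalg.svd(X, full_matrices=False)
                low_rank = (U[:, :r] * s[:r]) @ Vt[:r]   # best rank-r approximation
                X = np.where(mask, M_obs, low_rank)      # keep observed, impute the rest
            return low_rank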

    Provable Burer-Monteiro factorization for a class of norm-constrained matrix problems

    We study the projected gradient descent method on low-rank matrix problems with a strongly convex objective. We use the Burer-Monteiro factorization approach to implicitly enforce low-rankness; this factorization introduces non-convexity into the objective. We focus on constraint sets that include both positive semi-definite (PSD) constraints and specific matrix norm constraints. Such criteria appear in quantum state tomography and phase retrieval applications. We show that non-convex projected gradient descent exhibits local linear convergence in the factored space. We build our theory on a novel descent lemma that non-trivially extends recent results on the unconstrained problem. The resulting algorithm, Projected Factored Gradient Descent (ProjFGD), shows superior performance compared to the state of the art on quantum state tomography and sparse phase retrieval applications. Comment: 28 pages.
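
    A schematic of the factored projected-gradient pattern, assuming a trace-style constraint trace(X) <= 1 of the kind arising in quantum state tomography (the constraint choice, step size, and callback interface are assumptions, not the paper's exact setup):

        import numpy as np

        def projfgd(grad_f, n, r, eta=0.1, iters=300, seed=0):
            """Sketch of min f(X) s.t. X PSD, trace(X) <= 1, rank(X) <= r,
            via X = U U^T; trace(X) <= 1 becomes ||U||_F <= 1.
            grad_f(X) must return the gradient of f at X."""
            rng = np.random.default_rng(seed)
            U = rng.standard_normal((n, r))
            U /= np.linalg.norm(U)          # start feasible
            for _ in range(iters):
                G = grad_f(U @ U.T) @ U     # chain rule (the constant 2 is folded into eta)
                U = U - eta * G
                nrm = np.linalg.norm(U)
                if nrm > 1.0:               # project back onto the Frobenius-norm ball
                    U /= nrm
            return U @ U.T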

    Matrix Completion has No Spurious Local Minimum

    Matrix completion is a basic machine learning problem with wide applications, especially in collaborative filtering and recommender systems. Simple non-convex optimization algorithms are popular and effective in practice. Despite recent progress in proving that various non-convex algorithms converge from a good initial point, it remains unclear why random or arbitrary initialization suffices in practice. We prove that the commonly used non-convex objective function for positive semidefinite matrix completion has no spurious local minima: all local minima must also be global. Therefore, many popular optimization algorithms, such as (stochastic) gradient descent, can provably solve positive semidefinite matrix completion with arbitrary initialization in polynomial time. The result generalizes to the setting where the observed entries contain noise. We believe that our main proof strategy can be useful for understanding geometric properties of other statistical problems involving partial or noisy observations. Comment: NIPS'16 best student paper; fixed Theorem 2.3 in the preliminary section of the previous version; the results are not affected.
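
    The objective in question can be sketched as follows; per the landscape result above, gradient descent from arbitrary initialization is expected to find a global minimum (a numpy sketch assuming a symmetric observation mask; step size and iteration count are illustrative):

        import numpy as np

        def psd_mc_gd(M_obs, mask, r, eta=0.01, iters=1000, seed=None):
            """Gradient descent from random init on
            f(U) = ||mask * (U U^T - M_obs)||_F^2."""
            n = M_obs.shape[0]
            rng = np.random.default_rng(seed)
            U = rng.standard_normal((n, r))   # arbitrary initialization
            for _ in range(iters):
                R = mask * (U @ U.T - M_obs)
                U -= eta * 4.0 * (R @ U)      # exact gradient when mask and M_obs are symmetric
            return U @ U.T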

    Nearly-optimal Robust Matrix Completion

    In this paper, we consider the problem of Robust Matrix Completion (RMC), where the goal is to recover a low-rank matrix by observing a small number of its entries, out of which a few can be arbitrarily corrupted. We propose a simple method that alternately performs a projected gradient descent step to estimate the low-rank matrix and cleans up a few of the corrupted entries using hard-thresholding. Our algorithm solves RMC using a nearly optimal number of observations as well as a nearly optimal number of corruptions. Our result also implies a significant improvement over the existing time complexity bounds for the low-rank matrix completion problem. Finally, an application of our result to the robust PCA problem (low-rank + sparse matrix separation) leads to a nearly linear time (in the matrix dimensions) algorithm for that problem; existing state-of-the-art methods require quadratic time. Our empirical results corroborate our theory and show that even for moderately sized problems, our method for robust PCA is an order of magnitude faster than existing methods.
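
    A sketch of the alternate-and-clean pattern described above: one gradient step projected onto rank-r matrices, followed by hard-thresholding of large residuals as corruptions (step size and threshold are illustrative, not the paper's tuned values):

        import numpy as np

        def robust_mc(M_obs, mask, r, thresh, eta=0.5, iters=100):
            """Alternate a rank-r projected gradient step with hard-thresholding."""
            L = np.zeros_like(M_obs)
            S = np.zeros_like(M_obs)                        # estimated sparse corruptions
            for _ in range(iters):
                # Gradient step on the cleaned observations, then rank-r projection.
                G = mask * (L + S - M_obs)
                U, s, Vt = np.linalg.svd(L - eta * G, full_matrices=False)
                L = (U[:, :r] * s[:r]) @ Vt[:r]
                # Flag entries with large residual as corruptions.
                R = mask * (M_obs - L)
                S = np.where(np.abs(R) > thresh, R, 0.0)
            return L, S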

    Contextual Bandits with Latent Confounders: An NMF Approach

    Motivated by online recommendation and advertising systems, we consider a causal model for stochastic contextual bandits with a latent low-dimensional confounder. In our model, there are $L$ observed contexts and $K$ arms of the bandit. The observed context influences the reward obtained through a latent confounder variable with cardinality $m$ ($m \ll L, K$). The arm choice and the latent confounder causally determine the reward, while the observed context is correlated with the confounder. Under this model, the $L \times K$ mean reward matrix $\mathbf{U}$ (for each context in $[L]$ and each arm in $[K]$) factorizes into non-negative factors $\mathbf{A}$ ($L \times m$) and $\mathbf{W}$ ($m \times K$). This insight enables us to propose an $\epsilon$-greedy NMF-Bandit algorithm that designs a sequence of interventions (selecting specific arms) to balance learning this low-dimensional structure against selecting the best arm to minimize regret. Our algorithm achieves a regret of $\mathcal{O}\left(L \, \mathrm{poly}(m, \log K) \log T\right)$ at time $T$, compared to $\mathcal{O}(LK \log T)$ for conventional contextual bandits, assuming a constant gap between the best arm and the rest for each context. These guarantees are obtained under mild sufficiency conditions on the factors that are weaker versions of the well-known Statistical RIP condition. We further propose a class of generative models that satisfy our sufficient conditions, and derive a lower bound of $\mathcal{O}\left(Km \log T\right)$. These are the first regret guarantees for online matrix completion with bandit feedback when the rank is greater than one. We further compare the performance of our algorithm with the state of the art on synthetic and real-world datasets. Comment: 37 pages, 2 figures.
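
    A toy skeleton of the epsilon-greedy idea: fit non-negative factors to the observed empirical rewards, then either explore uniformly or exploit the completed estimate (the masked multiplicative-update NMF here is a generic stand-in, not the paper's estimator or intervention design):

        import numpy as np

        def masked_nmf(R_hat, mask, m, iters=200, eps=1e-9):
            """NMF on observed entries only; R_hat must be non-negative."""
            L, K = R_hat.shape
            rng = np.random.default_rng(0)
            A = rng.random((L, m)) + eps
            W = rng.random((m, K)) + eps
            for _ in range(iters):
                AW = A @ W
                A *= ((mask * R_hat) @ W.T) / (((mask * AW) @ W.T) + eps)
                AW = A @ W
                W *= (A.T @ (mask * R_hat)) / ((A.T @ (mask * AW)) + eps)
            return A, W

        def eps_greedy_arm(context, A, W, K, epsilon, rng):
            """Choose an arm: explore uniformly w.p. epsilon, else exploit A @ W."""
            if rng.random() < epsilon:
                return int(rng.integers(K))
            return int(np.argmax(A[context] @ W))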

    Provable Subspace Tracking from Missing Data and Matrix Completion

    We study the problem of subspace tracking in the presence of missing data (ST-miss). In recent work, we studied a related problem called robust ST. In this work, we show that a simple modification of our robust ST solution also provably solves ST-miss and robust ST-miss. To our knowledge, our result is the first 'complete' guarantee for ST-miss: we can prove that, under assumptions on only the algorithm inputs, the output subspace estimates are close to the true data subspaces at all times. Our guarantees hold under mild and easily interpretable assumptions and allow the underlying subspace to change with time in a piecewise constant fashion. In contrast, all existing guarantees for ST are partial results and assume a fixed unknown subspace. Extensive numerical experiments back up our theoretical claims. Finally, our solution can be interpreted as a provably correct, mini-batch, and memory-efficient solution to low-rank Matrix Completion (MC). Comment: Writing changes; includes a detailed discussion of noise analysis; contains a discussion of Matrix Completion; accepted to IEEE Transactions on Signal Processing.
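
    One mini-batch step of the generic subspace-tracking-with-missing-data pattern: solve for coefficients on the observed coordinates, interpolate the gaps, and refresh the basis (a schematic, not the paper's provable algorithm):

        import numpy as np

        def st_miss_step(U, Y, mask, r):
            """U    : (n, r) current orthonormal basis estimate
            Y    : (n, b) mini-batch of columns, arbitrary where mask is False
            mask : (n, b) boolean observation pattern"""
            n, b = Y.shape
            X = np.empty_like(Y, dtype=float)
            for t in range(b):
                obs = mask[:, t]
                # Least-squares coefficients using only the observed coordinates.
                a = np.linalg.lstsq(U[obs], Y[obs, t], rcond=None)[0]
                x = U @ a            # interpolate via the current subspace estimate
                x[obs] = Y[obs, t]   # keep what was actually observed
                X[:, t] = x
            Unew, _, _ = np.linalg.svd(X, full_matrices=False)
            return Unew[:, :r]       # refreshed rank-r basis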

    Static and Dynamic Robust PCA and Matrix Completion: A Review

    Principal Components Analysis (PCA) is one of the most widely used dimension reduction techniques. Robust PCA (RPCA) refers to the problem of PCA when the data may be corrupted by outliers. Recent work by Candès, Wright, Li, and Ma defined RPCA as the problem of decomposing a given data matrix into the sum of a low-rank matrix (true data) and a sparse matrix (outliers). The column space of the low-rank matrix then gives the PCA solution. This simple definition has led to a large amount of interesting new work on provably correct, fast, and practical solutions to RPCA. More recently, the dynamic (time-varying) version of the RPCA problem has been studied, and a series of provably correct, fast, and memory-efficient tracking solutions have been proposed. Dynamic RPCA (or robust subspace tracking) is the problem of tracking data lying in a (slowly) changing subspace while being robust to sparse outliers. This article provides an exhaustive review of the last decade of literature on RPCA and its dynamic counterpart (robust subspace tracking), describing their theoretical guarantees, discussing the pros and cons of various approaches, and providing empirical comparisons of performance and speed. A brief overview of the (low-rank) matrix completion literature is also provided (the focus is on works not discussed in other recent reviews). Matrix completion refers to the problem of completing a low-rank matrix when only a subset of its entries is observed; it can be interpreted as a simpler special case of RPCA in which the indices of the outlier-corrupted entries are known. Comment: To appear in Proceedings of the IEEE, Special Issue on Rethinking PCA for Modern Datasets. arXiv admin note: text overlap with arXiv:1711.0949
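
    The L + S decomposition at the heart of RPCA is often attacked by alternating singular value thresholding with entrywise soft thresholding; a schematic solver in that spirit (parameter defaults are common folklore choices, not tied to any particular guarantee in the review):

        import numpy as np

        def rpca_pcp(M, lam=None, mu=None, iters=200):
            """Alternating-thresholding sketch for M ~ L (low-rank) + S (sparse)."""
            n1, n2 = M.shape
            lam = lam if lam is not None else 1.0 / np.sqrt(max(n1, n2))
            mu = mu if mu is not None else 0.25 * np.abs(M).mean()
            L = np.zeros_like(M)
            S = np.zeros_like(M)
            for _ in range(iters):
                # Singular value thresholding for the low-rank part.
                U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
                L = (U * np.maximum(s - mu, 0.0)) @ Vt
                # Entrywise soft thresholding for the sparse outliers.
                R = M - L
                S = np.sign(R) * np.maximum(np.abs(R) - lam * mu, 0.0)
            return L, S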

    Fast and Sample Efficient Inductive Matrix Completion via Multi-Phase Procrustes Flow

    We revisit the inductive matrix completion problem, which aims to recover a rank-$r$ matrix with ambient dimension $d$ given $n$ features as side prior information. The goal is to use the known features to reduce sample and computational complexity. We present and analyze a new gradient-based non-convex optimization algorithm that converges to the true underlying matrix at a linear rate, with sample complexity depending only linearly on $n$ and logarithmically on $d$. To the best of our knowledge, all previous algorithms either have a quadratic dependency on the number of features in sample complexity or a sub-linear computational convergence rate. In addition, we provide experiments on both synthetic and real-world data to demonstrate the effectiveness of our proposed algorithm. Comment: 35 pages, 3 figures, and 2 tables.
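
    A plain gradient-descent sketch of the inductive model M ~ X U V^T Y^T with known feature matrices X and Y (the paper's multi-phase Procrustes-flow initialization and its rates are not reproduced; all names are illustrative):

        import numpy as np

        def imc_gd(obs, X, Y, r, eta=0.01, iters=500, seed=0):
            """obs : list of (i, j, value) observed entries of M
            X   : (d1, n) row features;  Y : (d2, n) column features"""
            n = X.shape[1]
            rng = np.random.default_rng(seed)
            U = 0.1 * rng.standard_normal((n, r))
            V = 0.1 * rng.standard_normal((n, r))
            for _ in range(iters):
                gU = np.zeros_like(U)
                gV = np.zeros_like(V)
                for i, j, m in obs:
                    xi, yj = X[i], Y[j]
                    resid = xi @ U @ V.T @ yj - m          # model error on one entry
                    gU += resid * np.outer(xi, V.T @ yj)   # d(loss)/dU
                    gV += resid * np.outer(yj, U.T @ xi)   # d(loss)/dV
                U -= eta * gU
                V -= eta * gV
            return U, V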