    Diffusion Approximations for Online Principal Component Estimation and Global Convergence

    In this paper, we propose to adopt diffusion approximation tools to study the dynamics of Oja's iteration, which is an online stochastic gradient descent method for principal component analysis. Oja's iteration maintains a running estimate of the true principal component from streaming data and enjoys low time and space complexity. We show that Oja's iteration for the top eigenvector generates a continuous-state discrete-time Markov chain over the unit sphere. We characterize Oja's iteration in three phases using diffusion approximation and weak convergence tools. Our three-phase analysis further provides a finite-sample error bound for the running estimate, which matches the minimax information lower bound for principal component analysis under the additional assumption of bounded samples. Comment: Appeared in NIPS 201
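As a rough illustration of the update discussed in this abstract, the sketch below runs Oja's iteration, w <- normalize(w + eta * x * (x^T w)), over a stream of samples. The function names, the Gaussian test distribution, the step size, and the sample count are all assumptions made for this example and are not taken from the paper, whose finite-sample bound additionally assumes bounded samples.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def oja_top_eigenvector(samples, step_size, w0):
    """Run Oja's iteration over a stream of samples, one update per sample."""
    w = normalize(w0)
    for x in samples:
        proj = sum(wi * xi for wi, xi in zip(w, x))  # scalar x^T w
        # Stochastic gradient step followed by projection onto the unit sphere.
        w = normalize([wi + step_size * proj * xi for wi, xi in zip(w, x)])
    return w


def make_samples(n, rng):
    """Hypothetical test stream: independent coordinates with std 3, 1, 0.5,
    so the top principal component is the first coordinate axis."""
    scales = [3.0, 1.0, 0.5]
    return [[s * rng.gauss(0.0, 1.0) for s in scales] for _ in range(n)]


rng = random.Random(0)
w = oja_top_eigenvector(make_samples(5000, rng), step_size=0.01, w0=[1.0, 1.0, 1.0])
alignment = abs(w[0])  # |cosine| of the angle to the true top eigenvector
print(alignment)
```

Only the current estimate `w` is stored between samples, which is the low memory footprint the abstract refers to. A constant step size is used here for simplicity; a decaying schedule is another common choice.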

    Convergence of Gradient Descent for Low-Rank Matrix Approximation

    This paper provides a proof of global convergence of gradient search for low-rank matrix approximation. Such approximations have recently been of interest for large-scale problems, as well as for dictionary learning for sparse signal representations and matrix completion. The proof is based on the interpretation of the problem as an optimization on the Grassmann manifold and on the Fubini-Study distance on this space.
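To make the setting concrete, here is a minimal sketch of gradient descent for a rank-one approximation, minimizing 0.5 * ||A - u v^T||_F^2 over the factors u and v. This is plain Euclidean gradient descent on a factorized objective, not the Grassmann-manifold formulation used in the paper's proof; the function names, the test matrix, the starting point, and the step size are assumptions chosen for illustration.

```python
import math


def rank_one_gradient_descent(A, u, v, step_size=0.05, iterations=2000):
    """Gradient descent on f(u, v) = 0.5 * ||A - u v^T||_F^2."""
    m, n = len(A), len(A[0])
    u, v = list(u), list(v)
    for _ in range(iterations):
        # Residual R = A - u v^T.
        R = [[A[i][j] - u[i] * v[j] for j in range(n)] for i in range(m)]
        # Gradients: df/du = -R v and df/dv = -R^T u.
        grad_u = [-sum(R[i][j] * v[j] for j in range(n)) for i in range(m)]
        grad_v = [-sum(R[i][j] * u[i] for i in range(m)) for j in range(n)]
        u = [u[i] - step_size * grad_u[i] for i in range(m)]
        v = [v[j] - step_size * grad_v[j] for j in range(n)]
    return u, v


def frobenius_residual(A, u, v):
    """Frobenius norm of A - u v^T."""
    return math.sqrt(sum((A[i][j] - u[i] * v[j]) ** 2
                         for i in range(len(A)) for j in range(len(A[0]))))


# Hypothetical test matrix with singular values 2 and 1.
A = [[2.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
u, v = rank_one_gradient_descent(A, u=[0.5, 0.5, 0.5], v=[0.5, 0.5])
residual = frobenius_residual(A, u, v)
print(residual)
```

By the Eckart-Young theorem, the best rank-one approximation of this matrix leaves a Frobenius error equal to the second singular value, so the residual should approach 1 when the iteration reaches the global minimum.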

    Faster Eigenvector Computation via Shift-and-Invert Preconditioning

    We give faster algorithms and improved sample complexities for estimating the top eigenvector of a matrix $\Sigma$, i.e., computing a unit vector $x$ such that $x^T \Sigma x \ge (1-\epsilon)\lambda_1(\Sigma)$. Offline Eigenvector Estimation: Given an explicit $A \in \mathbb{R}^{n \times d}$ with $\Sigma = A^TA$, we show how to compute an $\epsilon$-approximate top eigenvector in time $\tilde O([nnz(A) + \frac{d \cdot sr(A)}{gap^2}] \cdot \log 1/\epsilon)$ and $\tilde O([\frac{nnz(A)^{3/4} (d \cdot sr(A))^{1/4}}{\sqrt{gap}}] \cdot \log 1/\epsilon)$. Here $nnz(A)$ is the number of nonzeros in $A$, $sr(A)$ is the stable rank, and $gap$ is the relative eigengap. By separating the $gap$ dependence from the $nnz(A)$ term, our first runtime improves upon the classical power and Lanczos methods. It also improves prior work using fast subspace embeddings [AC09, CW13] and stochastic optimization [Sha15c], giving significantly better dependencies on $sr(A)$ and $\epsilon$. Our second running time improves these further when $nnz(A) \le \frac{d \cdot sr(A)}{gap^2}$. Online Eigenvector Estimation: Given a distribution $D$ with covariance matrix $\Sigma$ and a vector $x_0$ which is an $O(gap)$-approximate top eigenvector for $\Sigma$, we show how to refine to an $\epsilon$-approximation using $O(\frac{var(D)}{gap \cdot \epsilon})$ samples from $D$. Here $var(D)$ is a natural notion of variance. Combining our algorithm with previous work to initialize $x_0$, we obtain improved sample complexity and runtime results under a variety of assumptions on $D$. We achieve our results using a general framework that we believe is of independent interest. We give a robust analysis of the classic method of shift-and-invert preconditioning to reduce eigenvector computation to approximately solving a sequence of linear systems. We then apply fast stochastic variance reduced gradient (SVRG) based system solvers to achieve our claims. Comment: Appearing in ICML 2016. Combination of work in arXiv:1509.05647 and arXiv:1510.0889
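The reduction described in this abstract can be sketched as a power method applied to the inverse of (shift * I - Sigma), where each step is a linear solve. In the sketch below a small conjugate gradient routine stands in for the SVRG-based solvers of the paper, and the shift is supplied by hand, which sidesteps the question of how a suitable shift is found. All function names, the test matrix, and the parameter values are assumptions for illustration.

```python
import math


def matvec(M, x):
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in M]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def conjugate_gradient(M, b, iterations=50, tol=1e-20):
    """Solve M x = b for a symmetric positive definite M.
    Stand-in for the SVRG-based solvers mentioned in the abstract."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    rs = dot(r, r)
    for _ in range(iterations):
        if rs < tol:  # squared residual norm is small enough
            break
        Mp = matvec(M, p)
        alpha = rs / dot(p, Mp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * mpi for ri, mpi in zip(r, Mp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


def shift_and_invert(sigma, shift, x0, outer_iterations=20):
    """Power method on (shift * I - sigma)^{-1}; requires shift > lambda_1(sigma)."""
    d = len(sigma)
    B = [[(shift if i == j else 0.0) - sigma[i][j] for j in range(d)]
         for i in range(d)]
    x = list(x0)
    for _ in range(outer_iterations):
        y = conjugate_gradient(B, x)  # y is approximately B^{-1} x
        norm = math.sqrt(dot(y, y))
        x = [yi / norm for yi in y]
    return x, dot(x, matvec(sigma, x))  # eigenvector estimate, Rayleigh quotient


# Hypothetical test matrix with eigenvalues 3, 1, 1 and top eigenvector (1, 1, 0)/sqrt(2).
sigma = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
top_vector, top_value = shift_and_invert(sigma, shift=3.1, x0=[1.0, 0.0, 0.0])
print(top_vector, top_value)
```

With the shift just above the top eigenvalue, the inverted matrix has a large relative gap between its first and second eigenvalues, so few outer iterations are needed; the cost moves into the linear solves, which is where the paper applies fast stochastic solvers.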

    A Self-learning Algebraic Multigrid Method for Extremal Singular Triplets and Eigenpairs

    A self-learning algebraic multigrid method for dominant and minimal singular triplets and eigenpairs is described. The method consists of two multilevel phases. In the first, multiplicative phase (setup phase), tentative singular triplets are calculated along with a multigrid hierarchy of interpolation operators that approximately fit the tentative singular vectors in a collective and self-learning manner, using multiplicative update formulas. In the second, additive phase (solve phase), the tentative singular triplets are improved up to the desired accuracy by using an additive correction scheme with fixed interpolation operators, combined with a Ritz update. A suitable generalization of the singular value decomposition is formulated that applies to the coarse levels of the multilevel cycles. The proposed algorithm combines and extends two existing multigrid approaches for symmetric positive definite eigenvalue problems to the case of dominant and minimal singular triplets. Numerical tests on model problems from different areas show that the algorithm converges to high accuracy in a modest number of iterations, and is flexible enough to deal with a variety of problems due to its self-learning properties. Comment: 29 page

    Rayleigh quotient with Bolzano booster for faster convergence of dominant eigenvalues

    Computational ranking algorithms are widely used in several informatics fields. One of them is the PageRank algorithm, which underlies the most popular search engine globally. Many researchers have improved the ranking algorithm in order to get better results. Recent research using the Rayleigh quotient to speed up PageRank can guarantee the convergence of the dominant eigenvalue, which serves as the key value for stopping the computation. Bolzano's method (bisection) converges on a linear function by repeatedly dividing an interval into two intervals. This research aims to embed the Bolzano algorithm into the Rayleigh quotient approach for faster computation. This research produces an algorithm that has been tested and validated by mathematicians, and which shows a speedup of at most 7.08% compared to the Rayleigh approach alone. Analysis of the computation results using statistics software shows that the degree of the curve of the new algorithm, Rayleigh with Bolzano booster (RB), is positive and more significant than that of the original method. In other words, the linear function will always be faster in the subsequent computation than the previous method.
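The baseline that this abstract builds on, power iteration with the Rayleigh quotient as the stopping value, can be sketched as follows. The Bolzano booster itself is not specified in the abstract and is deliberately left out. The function names, the small column-stochastic matrix, and the tolerance are hypothetical choices for illustration.

```python
def matvec(M, x):
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in M]


def rayleigh_quotient(M, x):
    """Eigenvalue estimate x^T M x / x^T x for the current vector x."""
    Mx = matvec(M, x)
    return sum(a * b for a, b in zip(x, Mx)) / sum(a * a for a in x)


def power_iteration_rayleigh_stop(M, x0, tol=1e-10, max_iterations=1000):
    """Power iteration that stops once the Rayleigh quotient stabilizes."""
    x = list(x0)
    previous = rayleigh_quotient(M, x)
    for iteration in range(1, max_iterations + 1):
        y = matvec(M, x)
        total = sum(abs(v) for v in y)
        x = [v / total for v in y]  # L1 normalization, as for a rank vector
        current = rayleigh_quotient(M, x)
        if abs(current - previous) < tol:
            return x, current, iteration
        previous = current
    return x, previous, max_iterations


# Hypothetical 3-page link matrix; every column sums to 1, so the dominant eigenvalue is 1.
M = [[0.5, 0.2, 0.3],
     [0.3, 0.5, 0.3],
     [0.2, 0.3, 0.4]]
ranks, eigenvalue, iterations = power_iteration_rayleigh_stop(M, [1.0, 0.0, 0.0])
print(ranks, eigenvalue, iterations)
```

The stopping rule watches a single scalar rather than the whole rank vector, which is the role the abstract assigns to the dominant eigenvalue.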

    Weighted principal component analysis: a weighted covariance eigendecomposition approach

    We present a new straightforward principal component analysis (PCA) method based on the diagonalization of the weighted variance-covariance matrix through two spectral decomposition methods: power iteration and Rayleigh quotient iteration. This method allows one to retrieve a given number of orthogonal principal components amongst the most meaningful ones for the case of problems with weighted and/or missing data. Principal coefficients are then retrieved by fitting principal components to the data while providing the final decomposition. Tests performed on real and simulated cases show that our method is optimal in the identification of the most significant patterns within data sets. We illustrate the usefulness of this method by assessing its quality on the extrapolation of Sloan Digital Sky Survey quasar spectra from measured wavelengths to shorter and longer wavelengths. Our new algorithm also benefits from a fast and flexible implementation. Comment: 12 pages, 9 figure
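A minimal sketch of the idea in this abstract: build a weighted covariance matrix in which zero weights mark missing entries, then extract its leading component by power iteration. The covariance formula used here, with each entry normalized by the sum of the corresponding weight products, is one common form and is an assumption of this example rather than a detail given in the abstract; the function names and the toy data are hypothetical as well.

```python
import math


def weighted_covariance(X, W):
    """Weighted covariance of the columns of X; W[i][j] = 0 marks a missing entry."""
    n, d = len(X), len(X[0])
    means = [sum(W[i][j] * X[i][j] for i in range(n)) / sum(W[i][j] for i in range(n))
             for j in range(d)]
    # Weighted, centered entries; a zero weight removes the entry entirely.
    Z = [[W[i][j] * (X[i][j] - means[j]) for j in range(d)] for i in range(n)]
    C = [[0.0] * d for _ in range(d)]
    for j in range(d):
        for k in range(d):
            numerator = sum(Z[i][j] * Z[i][k] for i in range(n))
            denominator = sum(W[i][j] * W[i][k] for i in range(n))
            C[j][k] = numerator / denominator
    return C


def power_iteration(C, iterations=200):
    """Leading eigenvector of C and its Rayleigh quotient."""
    d = len(C)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        y = [sum(C[j][k] * v[k] for k in range(d)) for j in range(d)]
        norm = math.sqrt(sum(t * t for t in y))
        v = [t / norm for t in y]
    Cv = [sum(C[j][k] * v[k] for k in range(d)) for j in range(d)]
    return v, sum(a * b for a, b in zip(v, Cv))


# Toy data: the last observation is missing its second value (placeholder 99.0, weight 0).
X = [[-4.0, -2.0], [-2.0, -1.0], [0.0, 0.0], [2.0, 1.0], [4.0, 99.0]]
W = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 0.0]]
C = weighted_covariance(X, W)
component, variance = power_iteration(C)
print(C, component, variance)
```

Because the placeholder value carries zero weight, replacing 99.0 with any other number leaves the covariance matrix unchanged. Further components could be obtained by deflating the matrix and repeating the iteration, which this sketch does not attempt.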