
    Spectral Condition-Number Estimation of Large Sparse Matrices

    We describe a randomized Krylov-subspace method for estimating the spectral condition number of a real matrix A or indicating that it is numerically rank deficient. The main difficulty in estimating the condition number is the estimation of the smallest singular value \sigma_{\min} of A. Our method estimates this value by solving a consistent linear least-squares problem with a known solution using a specific Krylov-subspace method called LSQR. In this method, the forward error tends to concentrate in the direction of a right singular vector corresponding to \sigma_{\min}. Extensive experiments show that the method estimates the condition number of a wide array of matrices well. It can sometimes estimate the condition number when running a dense SVD would be impractical due to the computational cost or the memory requirements. The method uses very little memory (it inherits this property from LSQR) and it works equally well on square and rectangular matrices.
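    A minimal sketch of the idea, assuming SciPy's LSQR as the Krylov solver (the paper's own stopping rules and safeguards are not reproduced): run LSQR on a consistent system with a known solution, take the direction of the forward error as an approximate right singular vector for \sigma_{\min}, and pair the resulting estimate with a crude power-iteration estimate of \sigma_{\max}.

        # Hedged sketch, not the authors' implementation.
        import numpy as np
        from scipy.sparse.linalg import lsqr

        def estimate_condition_number(A, lsqr_iters=200, power_iters=20, seed=0):
            rng = np.random.default_rng(seed)
            m, n = A.shape
            x_true = rng.standard_normal(n)            # known solution
            b = A @ x_true                             # consistent right-hand side
            x_k = lsqr(A, b, iter_lim=lsqr_iters)[0]   # stop before full convergence
            err = x_k - x_true                         # forward error
            v = err / np.linalg.norm(err)              # tends toward the right singular
                                                       # vector belonging to sigma_min
            sigma_min = np.linalg.norm(A @ v)
            u = rng.standard_normal(n)                 # crude sigma_max via power iteration on A^T A
            for _ in range(power_iters):
                u = A.T @ (A @ u)
                u /= np.linalg.norm(u)
            sigma_max = np.linalg.norm(A @ u)
            return sigma_max / sigma_min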

    Subsampling Algorithms for Semidefinite Programming

    We derive a stochastic gradient algorithm for semidefinite optimization using randomization techniques. The algorithm uses subsampling to reduce the computational cost of each iteration, and the subsampling ratio explicitly controls granularity, i.e. the tradeoff between cost per iteration and total number of iterations. Furthermore, the total computational cost is directly proportional to the complexity (i.e. rank) of the solution. We study numerical performance on some large-scale problems arising in statistical learning.
    Comment: Final version, to appear in Stochastic Systems.
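    Purely as an illustration of the subsampling mechanism described above (this is not the authors' algorithm, and data_terms / grad_term are placeholder names): when the gradient is a sum over n data terms, sampling a fraction rho of the terms gives an unbiased gradient estimate whose per-iteration cost scales with rho, after which the iterate is projected back onto the positive semidefinite cone.

        # Illustrative sketch of a subsampled stochastic gradient step.
        import numpy as np

        def subsampled_psd_step(X, data_terms, grad_term, rho, step, rng):
            n = len(data_terms)
            idx = rng.choice(n, size=max(1, int(rho * n)), replace=False)
            G = sum(grad_term(X, data_terms[i]) for i in idx) * (n / len(idx))   # unbiased estimate
            X = X - step * G                                   # stochastic gradient step
            w, V = np.linalg.eigh((X + X.T) / 2)               # symmetrize, then eigendecompose
            return V @ np.diag(np.clip(w, 0.0, None)) @ V.T    # project onto the PSD cone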

    Approximating leading singular triplets of a matrix function

    Given a large square matrix A and a sufficiently regular function f so that f(A) is well defined, we are interested in the approximation of the leading singular values and corresponding singular vectors of f(A), and in particular of \|f(A)\|, where \|\cdot\| is the matrix norm induced by the Euclidean vector norm. Since neither f(A) nor f(A)v can be computed exactly, we introduce and analyze an inexact Golub-Kahan-Lanczos bidiagonalization procedure, where the inexactness is related to the inaccuracy of the operations f(A)v, f(A)^*v. Particular outer and inner stopping criteria are devised so as to cope with the lack of a true residual. Numerical experiments with the new algorithm on typical application problems are reported.
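    A bare-bones Golub-Kahan-Lanczos bidiagonalization driven by black-box (and possibly inexact) products v -> f(A)v and v -> f(A)^*v; the paper's inexactness analysis and its outer/inner stopping criteria are not reproduced, and the function names below are our own.

        # Hedged sketch: k steps of Golub-Kahan-Lanczos bidiagonalization.
        import numpy as np

        def gkl_norm_estimate(apply_fA, apply_fA_adj, n, k, seed=0):
            rng = np.random.default_rng(seed)
            U = np.zeros((n, k + 1))
            alphas, betas = np.zeros(k), np.zeros(k)
            U[:, 0] = rng.standard_normal(n)
            U[:, 0] /= np.linalg.norm(U[:, 0])
            v_prev, beta = np.zeros(n), 0.0
            for j in range(k):
                v = apply_fA_adj(U[:, j]) - beta * v_prev    # alpha_j v_j = f(A)^* u_j - beta_j v_{j-1}
                alphas[j] = np.linalg.norm(v)
                v /= alphas[j]
                u = apply_fA(v) - alphas[j] * U[:, j]        # beta_{j+1} u_{j+1} = f(A) v_j - alpha_j u_j
                betas[j] = np.linalg.norm(u)
                U[:, j + 1] = u / betas[j]
                v_prev, beta = v, betas[j]
            B = np.zeros((k + 1, k))                         # lower bidiagonal projection
            B[np.arange(k), np.arange(k)] = alphas
            B[np.arange(1, k + 1), np.arange(k)] = betas
            return np.linalg.svd(B, compute_uv=False)[0]     # estimate of ||f(A)||_2

    The largest singular value of the small bidiagonal matrix grows monotonically with k and converges to \|f(A)\| when the products are exact; the paper studies what happens when they are not.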

    Subspace Iteration Randomization and Singular Value Problems

    A classical problem in matrix computations is the efficient and reliable approximation of a given matrix by a matrix of lower rank. The truncated singular value decomposition (SVD) is known to provide the best such approximation for any given fixed rank. However, the SVD is also known to be very costly to compute. Among the different approaches in the literature for computing low-rank approximations, randomized algorithms have attracted researchers' recent attention due to their surprising reliability and computational efficiency in different application areas. Typically, such algorithms are shown to compute, with very high probability, low-rank approximations that are within a constant factor from optimal, and are known to perform even better in many practical situations. In this paper, we present a novel error analysis that considers randomized algorithms within the subspace iteration framework and show, with very high probability, that highly accurate low-rank approximations as well as singular values can indeed be computed quickly for matrices with rapidly decaying singular values. Such matrices appear frequently in diverse application areas such as data analysis, fast structured matrix computations and fast direct methods for large sparse linear systems of equations, and are the driving motivation for randomized methods. Furthermore, we show that the low-rank approximations computed by these randomized algorithms are actually rank-revealing approximations, and the special case of a rank-1 approximation can also be used to correctly estimate matrix 2-norms with very high probability. Our numerical experiments are in full support of our conclusions.
    Comment: 45 pages, 5 figures.
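    The subspace iteration framework analyzed here is the familiar randomized scheme; a textbook-style sketch (the oversampling parameter p and the number of power steps q are our names) looks like:

        # Standard randomized subspace iteration for a rank-k approximation.
        import numpy as np

        def randomized_subspace_iteration(A, k, q=2, p=5, seed=0):
            rng = np.random.default_rng(seed)
            m, n = A.shape
            Q = np.linalg.qr(A @ rng.standard_normal((n, k + p)))[0]    # range sketch
            for _ in range(q):                                           # power iterations
                Q = np.linalg.qr(A.T @ Q)[0]
                Q = np.linalg.qr(A @ Q)[0]
            Uhat, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)    # small SVD
            return (Q @ Uhat)[:, :k], s[:k], Vt[:k]                      # approximate truncated SVD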

    Fast and Simple PCA via Convex Optimization

    The problem of principal component analysis (PCA) is traditionally solved by spectral or algebraic methods. We show how computing the leading principal component can be reduced to solving a small number of well-conditioned convex optimization problems. This gives rise to a new efficient method for PCA based on recent advances in stochastic methods for convex optimization. In particular, we show that given a d \times d matrix X = \frac{1}{n}\sum_{i=1}^n x_i x_i^{\top} with top eigenvector u and top eigenvalue \lambda_1 it is possible to: (i) compute a unit vector w such that (w^{\top}u)^2 \geq 1-\epsilon in \tilde{O}\left(\frac{d}{\delta^2}+N\right) time, where \delta = \lambda_1 - \lambda_2 and N is the total number of non-zero entries in x_1,...,x_n; (ii) compute a unit vector w such that w^{\top}Xw \geq \lambda_1-\epsilon in \tilde{O}(d/\epsilon^2) time. To the best of our knowledge, these bounds are the fastest to date for a wide regime of parameters. These results can be further accelerated when \delta (in the first case) and \epsilon (in the second case) are smaller than \sqrt{d/N}.
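    A rough sketch of the reduction (our simplification, not the paper's exact method): with a shift \lambda slightly above \lambda_1, each inverse-power step w_{t+1} \propto (\lambda I - X)^{-1} w_t can be obtained by minimizing the well-conditioned convex quadratic g(z) = \frac{1}{2} z^{\top}(\lambda I - X)z - w_t^{\top}z. The paper relies on fast stochastic convex solvers for these subproblems; plain gradient descent is used below only as a stand-in.

        # Hedged sketch: leading eigenvector via repeated convex quadratic minimization.
        import numpy as np

        def leading_component(X, lam, outer=10, inner=300, step=0.1, seed=0):
            # assumes lam > lambda_1(X), so lam*I - X is positive definite;
            # step must be smaller than 2/lam for the inner loop to converge
            rng = np.random.default_rng(seed)
            d = X.shape[0]
            w = rng.standard_normal(d)
            w /= np.linalg.norm(w)
            for _ in range(outer):
                z = np.zeros(d)
                for _ in range(inner):                 # gradient descent on g(z)
                    z -= step * (lam * z - X @ z - w)
                w = z / np.linalg.norm(z)              # normalized inverse-power step
            return w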

    Robust Shift-and-Invert Preconditioning: Faster and More Sample Efficient Algorithms for Eigenvector Computation

    We provide faster algorithms and improved sample complexities for approximating the top eigenvector of a matrix. Offline Setting: Given an n \times d matrix A, we show how to compute an \epsilon approximate top eigenvector in time \tilde O([nnz(A) + \frac{d \cdot sr(A)}{gap^2}]\cdot \log 1/\epsilon) and \tilde O([\frac{nnz(A)^{3/4} (d \cdot sr(A))^{1/4}}{\sqrt{gap}}]\cdot \log 1/\epsilon). Here sr(A) is the stable rank and gap is the multiplicative eigenvalue gap. By separating the gap dependence from nnz(A) we improve on the classic power and Lanczos methods. We also improve prior work using fast subspace embeddings and stochastic optimization, giving significantly improved dependencies on sr(A) and \epsilon. Our second running time improves this further when nnz(A) \le \frac{d \cdot sr(A)}{gap^2}. Online Setting: Given a distribution D with covariance matrix \Sigma and a vector x_0 which is an O(gap) approximate top eigenvector for \Sigma, we show how to refine to an \epsilon approximation using \tilde O(\frac{v(D)}{gap^2} + \frac{v(D)}{gap \cdot \epsilon}) samples from D. Here v(D) is a natural variance measure. Combining our algorithm with previous work to initialize x_0, we obtain a number of improved sample complexity and runtime results. For general distributions, we achieve asymptotically optimal accuracy as a function of sample size as the number of samples grows large. Our results center around a robust analysis of the classic method of shift-and-invert preconditioning to reduce eigenvector computation to approximately solving a sequence of linear systems. We then apply fast SVRG based approximate system solvers to achieve our claims. We believe our results suggest the general effectiveness of shift-and-invert based approaches and imply that further computational gains may be reaped in practice.
    Comment: Manuscript outdated. Updated version at arXiv:1605.0875.

    Perspectives on information-based complexity

    The authors discuss information-based complexity theory, which is a model of finite-precision computations with real numbers, and its applications to numerical analysis.
    Comment: 24 pages. Abstract added in migration.

    Implementing regularization implicitly via approximate eigenvector computation

    Regularization is a powerful technique for extracting useful information from noisy data. Typically, it is implemented by adding some sort of norm constraint to an objective function and then exactly optimizing the modified objective function. This procedure often leads to optimization problems that are computationally more expensive than the original problem, a fact that is clearly problematic if one is interested in large-scale applications. On the other hand, a large body of empirical work has demonstrated that heuristics, and in some cases approximation algorithms, developed to speed up computations sometimes have the side-effect of performing regularization implicitly. Thus, we consider the question: What is the regularized optimization objective that an approximation algorithm is exactly optimizing? We address this question in the context of computing approximations to the smallest nontrivial eigenvector of a graph Laplacian; and we consider three random-walk-based procedures: one based on the heat kernel of the graph, one based on computing the PageRank vector associated with the graph, and one based on a truncated lazy random walk. In each case, we provide a precise characterization of the manner in which the approximation method can be viewed as implicitly computing the exact solution to a regularized problem. Interestingly, the regularization is not on the usual vector form of the optimization problem, but instead it is on a related semidefinite program.
    Comment: 11 pages; a few clarifications.
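    For concreteness, the three approximation procedures mentioned above can each be written as a few lines of linear algebra on a small graph; a hedged sketch with our own parameter names (t, alpha, steps), not the paper's notation:

        # Heat kernel, personalized PageRank, and truncated lazy random walk on a graph.
        import numpy as np
        from scipy.linalg import expm

        def three_procedures(W, s, t=5.0, alpha=0.85, steps=10):
            # W: symmetric adjacency matrix, s: seed distribution (column vector)
            deg = W.sum(axis=1)
            P = W / deg[:, None]                        # row-stochastic random-walk matrix
            L = np.diag(deg) - W                        # combinatorial graph Laplacian
            heat = expm(-t * L) @ s                     # heat-kernel smoothing of s
            ppr = np.linalg.solve(np.eye(len(s)) - alpha * P.T, (1 - alpha) * s)   # PageRank vector
            M = 0.5 * (np.eye(len(s)) + P.T)            # lazy walk operator acting on distributions
            lazy = s.copy()
            for _ in range(steps):                      # truncated lazy random walk
                lazy = M @ lazy
            return heat, ppr, lazy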

    Incremental Method for Spectral Clustering of Increasing Orders

    The smallest eigenvalues and the associated eigenvectors (i.e., eigenpairs) of a graph Laplacian matrix have been widely used for spectral clustering and community detection. However, in real-life applications the number of clusters or communities (say, K) is generally unknown a priori. Consequently, the majority of the existing methods either choose K heuristically or repeat the clustering method with different choices of K and accept the best clustering result. The first option more often than not yields a suboptimal result, while the second option is computationally expensive. In this work, we propose an incremental method for constructing the eigenspectrum of the graph Laplacian matrix. This method leverages the eigenstructure of the graph Laplacian matrix to obtain the K-th eigenpair of the Laplacian matrix given a collection of all the K-1 smallest eigenpairs. Our proposed method adapts the Laplacian matrix such that the batch eigenvalue decomposition problem transforms into an efficient sequential leading eigenpair computation problem. As a practical application, we consider user-guided spectral clustering. Specifically, we demonstrate that users can utilize the proposed incremental method for effective eigenpair computation and for determining the desired number of clusters based on multiple clustering metrics.
    Comment: In the KDD Workshop on Mining and Learning with Graphs, 2016. http://www.mlgworkshop.org/2016
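    A hedged sketch of the sequential idea (our own simplification, not the paper's exact adaptation of the Laplacian): deflate the K-1 known eigenpairs out of L and flip the spectrum with a shift c \ge \lambda_{\max}(L), so that the K-th smallest eigenpair of L becomes the leading eigenpair of the adapted operator.

        # Illustrative sketch: obtain the K-th smallest eigenpair from the K-1 known ones.
        import numpy as np
        from scipy.sparse.linalg import eigsh, LinearOperator

        def next_smallest_eigenpair(L, known_vecs):
            n = L.shape[0]
            V = np.asarray(known_vecs)                     # n x (K-1), orthonormal columns
            c = float(abs(L).sum(axis=1).max())            # Gershgorin bound on lambda_max(L)
            def matvec(x):
                # c*I - L - c*V V^T: unknown eigenpairs map to c - lambda, known ones to -lambda
                return c * x - L @ x - c * (V @ (V.T @ x))
            op = LinearOperator((n, n), matvec=matvec, dtype=float)
            val, vec = eigsh(op, k=1, which='LA')          # leading eigenpair of the adapted operator
            return c - val[0], vec[:, 0]                   # K-th smallest eigenvalue of L and its vector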

    Faster Eigenvector Computation via Shift-and-Invert Preconditioning

    We give faster algorithms and improved sample complexities for estimating the top eigenvector of a matrix \Sigma -- i.e. computing a unit vector x such that x^T \Sigma x \ge (1-\epsilon)\lambda_1(\Sigma): Offline Eigenvector Estimation: Given an explicit A \in \mathbb{R}^{n \times d} with \Sigma = A^T A, we show how to compute an \epsilon approximate top eigenvector in time \tilde O([nnz(A) + \frac{d \cdot sr(A)}{gap^2}] \cdot \log 1/\epsilon) and \tilde O([\frac{nnz(A)^{3/4} (d \cdot sr(A))^{1/4}}{\sqrt{gap}}] \cdot \log 1/\epsilon). Here nnz(A) is the number of nonzeros in A, sr(A) is the stable rank, gap is the relative eigengap. By separating the gap dependence from the nnz(A) term, our first runtime improves upon the classical power and Lanczos methods. It also improves prior work using fast subspace embeddings [AC09, CW13] and stochastic optimization [Sha15c], giving significantly better dependencies on sr(A) and \epsilon. Our second running time improves these further when nnz(A) \le \frac{d \cdot sr(A)}{gap^2}. Online Eigenvector Estimation: Given a distribution D with covariance matrix \Sigma and a vector x_0 which is an O(gap) approximate top eigenvector for \Sigma, we show how to refine to an \epsilon approximation using O(\frac{var(D)}{gap \cdot \epsilon}) samples from D. Here var(D) is a natural notion of variance. Combining our algorithm with previous work to initialize x_0, we obtain improved sample complexity and runtime results under a variety of assumptions on D. We achieve our results using a general framework that we believe is of independent interest. We give a robust analysis of the classic method of shift-and-invert preconditioning to reduce eigenvector computation to approximately solving a sequence of linear systems. We then apply fast stochastic variance reduced gradient (SVRG) based system solvers to achieve our claims.
    Comment: Appearing in ICML 2016. Combination of work in arXiv:1509.05647 and arXiv:1510.0889.
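    The reduction at the heart of both settings is easy to state in code: with a shift \lambda slightly larger than \lambda_1(\Sigma), run power iteration on (\lambda I - \Sigma)^{-1}, solving each linear system only approximately. The paper solves these systems with SVRG; the sketch below (our simplification) substitutes conjugate gradients.

        # Hedged sketch of shift-and-invert power iteration with approximate solves.
        import numpy as np
        from scipy.sparse.linalg import cg, LinearOperator

        def shift_invert_top_eigenvector(Sigma, lam, outer=10, inner=200, seed=0):
            # assumes lam > lambda_1(Sigma), so lam*I - Sigma is positive definite
            d = Sigma.shape[0]
            M = LinearOperator((d, d), matvec=lambda v: lam * v - Sigma @ v, dtype=float)
            x = np.random.default_rng(seed).standard_normal(d)
            x /= np.linalg.norm(x)
            for _ in range(outer):
                y, _ = cg(M, x, maxiter=inner)       # approximate solve of (lam I - Sigma) y = x
                x = y / np.linalg.norm(y)            # power step on the shifted inverse
            return x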