
    Conditioning of Leverage Scores and Computation by QR Decomposition

    The leverage scores of a full-column-rank matrix A are the squared row norms of any orthonormal basis for range(A). We show that the corresponding leverage scores of two matrices A and A + \Delta A are close in the relative sense if they have large magnitude and if all principal angles between the column spaces of A and A + \Delta A are small. We also show three classes of bounds that are based on perturbation results of QR decompositions. They demonstrate that relative differences between individual leverage scores strongly depend on the particular type of perturbation \Delta A. The bounds imply that the relative accuracy of an individual leverage score depends on: its magnitude and the two-norm condition of A, if \Delta A is a general perturbation; the two-norm condition number of A, if \Delta A is a perturbation with the same norm-wise row-scaling as A; (to first order) neither condition number nor leverage score magnitude, if \Delta A is a component-wise row-scaled perturbation. Numerical experiments confirm the qualitative and quantitative accuracy of our bounds. Comment: This version has been accepted to SIMAX but has not yet gone through copy editing.
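
    A minimal sketch of the computation the abstract refers to is shown below: leverage scores as the squared row norms of a thin QR factor, compared before and after a small perturbation. The matrix sizes and the perturbation \Delta A are illustrative choices, not the paper's experiments.

```python
import numpy as np

def leverage_scores(A):
    """Leverage scores of a full-column-rank A: squared row norms of an
    orthonormal basis for range(A), here taken from a thin QR decomposition."""
    Q, _ = np.linalg.qr(A, mode='reduced')  # columns of Q span range(A)
    return np.sum(Q**2, axis=1)

# Illustrative comparison under a small general perturbation Delta A.
rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 20))
dA = 1e-8 * rng.standard_normal(A.shape)
rel_diff = np.abs(leverage_scores(A + dA) - leverage_scores(A)) / leverage_scores(A)
print(rel_diff.max())
```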

    Efficient Algorithms for CUR and Interpolative Matrix Decompositions

    The manuscript describes efficient algorithms for the computation of the CUR and ID decompositions. The methods used are based on simple modifications to the classical truncated pivoted QR decomposition, which means that highly optimized library codes can be utilized for implementation. For certain applications, further acceleration can be attained by incorporating techniques based on randomized projections. Numerical experiments demonstrate advantageous performance compared to existing techniques for computing CUR factorizations.
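
    As a rough illustration of the pivoted-QR route to such factorizations (not the manuscript's optimized or randomized implementation), the sketch below builds a rank-k column interpolative decomposition from a column-pivoted QR; the rank k and the helper name column_id are assumptions for this example.

```python
import numpy as np
from scipy.linalg import qr, solve_triangular

def column_id(A, k):
    """Rank-k column ID, A ~= A[:, J] @ X, from a truncated pivoted QR."""
    Q, R, piv = qr(A, mode='economic', pivoting=True)
    J = piv[:k]                                 # indices of the k selected columns
    T = solve_triangular(R[:k, :k], R[:k, k:])  # interpolation coefficients R11^{-1} R12
    X = np.empty((k, A.shape[1]))
    X[:, piv] = np.hstack([np.eye(k), T])       # undo the column permutation
    return J, X

# A CUR factorization can be assembled by applying the same idea to the rows
# (e.g. a column ID of A.T) and then solving for the middle factor.
```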

    A DEIM Induced CUR Factorization

    We derive a CUR matrix factorization based on the Discrete Empirical Interpolation Method (DEIM). For a given matrix A, such a factorization provides a low-rank approximate decomposition of the form A \approx C U R, where C and R are subsets of the columns and rows of A, and U is constructed to make C U R a good approximation. Given a low-rank singular value decomposition A \approx V S W^T, the DEIM procedure uses V and W to select the columns and rows of A that form C and R. Through an error analysis applicable to a general class of CUR factorizations, we show that the accuracy tracks the optimal approximation error within a factor that depends on the conditioning of submatrices of V and W. For large-scale problems, V and W can be approximated using an incremental QR algorithm that makes one pass through A. Numerical examples illustrate the favorable performance of the DEIM-CUR method compared to CUR approximations based on leverage scores.
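
    The sketch below illustrates the DEIM selection step and the resulting CUR factorization, assuming a (dense) truncated SVD of A is affordable; it shows the general idea rather than the paper's incremental one-pass QR variant, and the middle factor is formed with pseudoinverses for simplicity.

```python
import numpy as np

def deim_select(V):
    """DEIM index selection: greedily pick one row index per column of V."""
    n, k = V.shape
    p = np.zeros(k, dtype=int)
    p[0] = np.argmax(np.abs(V[:, 0]))
    for j in range(1, k):
        # interpolate the new column at the chosen indices; pick the largest residual
        c = np.linalg.solve(V[p[:j], :j], V[p[:j], j])
        r = V[:, j] - V[:, :j] @ c
        p[j] = np.argmax(np.abs(r))
    return p

def deim_cur(A, k):
    """DEIM-induced CUR: row/column indices come from the leading singular vectors."""
    U_left, _, Wt = np.linalg.svd(A, full_matrices=False)
    rows = deim_select(U_left[:, :k])
    cols = deim_select(Wt[:k, :].T)
    C, R = A[:, cols], A[rows, :]
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)  # chosen so that C U R ~= A
    return C, U, R
```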

    Low-distortion Subspace Embeddings in Input-sparsity Time and Applications to Robust Linear Regression

    Low-distortion embeddings are critical building blocks for developing random sampling and random projection algorithms for linear algebra problems. We show that, given a matrix A \in R^{n \times d} with n \gg d and a p \in [1, 2), with constant probability we can construct a low-distortion embedding matrix \Pi \in R^{O(poly(d)) \times n} that embeds A_p, the \ell_p subspace spanned by A's columns, into (R^{O(poly(d))}, \|\cdot\|_p); the distortion of our embeddings is only O(poly(d)), and we can compute \Pi A in O(nnz(A)) time, i.e., input-sparsity time. Our result generalizes the input-sparsity time \ell_2 subspace embedding by Clarkson and Woodruff [STOC'13]; and for completeness, we present a simpler and improved analysis of their construction for \ell_2. These input-sparsity time \ell_p embeddings are optimal, up to constants, in terms of their running time; and the improved running time propagates to applications such as (1 \pm \epsilon)-distortion \ell_p subspace embedding and relative-error \ell_p regression. For \ell_2, we show that a (1 + \epsilon)-approximate solution to the \ell_2 regression problem specified by the matrix A and a vector b \in R^n can be computed in O(nnz(A) + d^3 \log(d/\epsilon)/\epsilon^2) time; and for \ell_p, via a subspace-preserving sampling procedure, we show that a (1 \pm \epsilon)-distortion embedding of A_p into R^{O(poly(d))} can be computed in O(nnz(A) \cdot \log n) time, and we also show that a (1 + \epsilon)-approximate solution to the \ell_p regression problem \min_{x \in R^d} \|Ax - b\|_p can be computed in O(nnz(A) \cdot \log n + poly(d) \log(1/\epsilon)/\epsilon^2) time. Moreover, we can improve the embedding dimension, or equivalently the sample size, to O(d^{3+p/2} \log(1/\epsilon)/\epsilon^2) without increasing the complexity. Comment: 22 pages.
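
    For the \ell_2 case, a minimal sketch of a sparse embedding in the style of Clarkson and Woodruff is given below: \Pi has a single random signed entry per column, so \Pi A can be applied in O(nnz(A)) time. The embedding dimension r and the sketch-and-solve usage are illustrative choices rather than the tuned parameters from the paper.

```python
import numpy as np
import scipy.sparse as sp

def sparse_embedding(n, r, rng):
    """Pi in R^{r x n} with one random signed nonzero per column (CountSketch)."""
    rows = rng.integers(0, r, size=n)        # hash each coordinate to one of r buckets
    signs = rng.choice([-1.0, 1.0], size=n)  # independent random signs
    return sp.csr_matrix((signs, (rows, np.arange(n))), shape=(r, n))

# Sketch-and-solve least squares as a usage example.
rng = np.random.default_rng(0)
n, d = 100_000, 20
A = sp.random(n, d, density=1e-3, format='csr', random_state=0)
b = rng.standard_normal(n)
Pi = sparse_embedding(n, r=10 * d * d, rng=rng)
x = np.linalg.lstsq((Pi @ A).toarray(), Pi @ b, rcond=None)[0]
```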

    Algorithmic and Statistical Perspectives on Large-Scale Data Analysis

    In recent years, ideas from statistics and scientific computing have begun to interact in increasingly sophisticated and fruitful ways with ideas from computer science and the theory of algorithms to aid in the development of improved worst-case algorithms that are useful for large-scale scientific and Internet data analysis problems. In this chapter, I will describe two recent examples---one having to do with selecting good columns or features from a (DNA Single Nucleotide Polymorphism) data matrix, and the other having to do with selecting good clusters or communities from a data graph (representing a social or information network)---that drew on ideas from both areas and that may serve as a model for exploiting complementary algorithmic and statistical perspectives in order to solve applied large-scale data analysis problems. Comment: 33 pages. To appear in Uwe Naumann and Olaf Schenk, editors, "Combinatorial Scientific Computing," Chapman and Hall/CRC Press, 201

    A Statistical Perspective on Algorithmic Leveraging

    One popular method for dealing with large-scale data sets is sampling. For example, by using the empirical statistical leverage scores as an importance sampling distribution, the method of algorithmic leveraging samples and rescales rows/columns of data matrices to reduce the data size before performing computations on the subproblem. This method has been successful in improving computational efficiency of algorithms for matrix problems such as least-squares approximation, least absolute deviations approximation, and low-rank matrix approximation. Existing work has focused on algorithmic issues such as worst-case running times and numerical issues associated with providing high-quality implementations, but none of it addresses statistical aspects of this method. In this paper, we provide a simple yet effective framework to evaluate the statistical properties of algorithmic leveraging in the context of estimating parameters in a linear regression model with a fixed number of predictors. We show that from the statistical perspective of bias and variance, neither leverage-based sampling nor uniform sampling dominates the other. This result is particularly striking, given the well-known result that, from the algorithmic perspective of worst-case analysis, leverage-based sampling provides uniformly superior worst-case algorithmic results, when compared with uniform sampling. Based on these theoretical results, we propose and analyze two new leveraging algorithms. A detailed empirical evaluation of existing leverage-based methods as well as these two new methods is carried out on both synthetic and real data sets. The empirical results indicate that our theory is a good predictor of practical performance of existing and new leverage-based algorithms and that the new algorithms achieve improved performance. Comment: 44 pages, 17 figures.
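
    A small illustration of the sampling estimator studied here is given below: rows are drawn with probabilities proportional to their leverage scores, rescaled, and the reduced least-squares problem is solved. The sample size r and the synthetic regression model are arbitrary choices for demonstration.

```python
import numpy as np

def leveraged_lstsq(A, b, r, rng):
    """Leverage-based sampling for least squares: sample r rows, rescale, solve."""
    Q, _ = np.linalg.qr(A)
    probs = np.sum(Q**2, axis=1) / A.shape[1]  # leverage scores sum to d
    idx = rng.choice(A.shape[0], size=r, replace=True, p=probs)
    w = 1.0 / np.sqrt(r * probs[idx])          # rescaling so E[(SA)^T (SA)] = A^T A
    return np.linalg.lstsq(w[:, None] * A[idx], w * b[idx], rcond=None)[0]

rng = np.random.default_rng(1)
n, d = 20_000, 10
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)
print(np.linalg.norm(leveraged_lstsq(A, b, r=500, rng=rng)
                     - np.linalg.lstsq(A, b, rcond=None)[0]))
```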

    Randomized Dynamic Mode Decomposition

    This paper presents a randomized algorithm for computing the near-optimal low-rank dynamic mode decomposition (DMD). Randomized algorithms are emerging techniques to compute low-rank matrix approximations at a fraction of the cost of deterministic algorithms, easing the computational challenges arising in the area of 'big data'. The idea is to derive a small matrix from the high-dimensional data, which is then used to efficiently compute the dynamic modes and eigenvalues. The algorithm is presented in a modular probabilistic framework, and the approximation quality can be controlled via oversampling and power iterations. The effectiveness of the resulting randomized DMD algorithm is demonstrated on several benchmark examples of increasing complexity. It provides an accurate and efficient approach to extract spatiotemporal coherent structures from big data in a framework that scales with the intrinsic rank of the data rather than the ambient measurement dimension. For this work we assume that the dynamics of the problem under consideration evolve on a low-dimensional subspace that is well characterized by a fast decaying singular value spectrum.
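
    A hedged sketch of the modular procedure described above follows: a randomized range finder with oversampling and power iterations compresses the snapshots, exact DMD is run on the small projected matrices, and the modes are lifted back. It assumes paired snapshot matrices X and Y, with Y containing the time-shifted states; the parameter names and defaults are illustrative.

```python
import numpy as np

def randomized_dmd(X, Y, k, p=10, q=2, seed=0):
    """X, Y: n x m paired snapshot matrices; returns DMD eigenvalues and modes."""
    rng = np.random.default_rng(seed)
    # Randomized range finder for X (oversampling p, q power iterations).
    Q = np.linalg.qr(X @ rng.standard_normal((X.shape[1], k + p)))[0]
    for _ in range(q):
        Q = np.linalg.qr(X @ (X.T @ Q))[0]
    Xs, Ys = Q.T @ X, Q.T @ Y                  # small (k+p) x m matrices
    # Exact DMD on the compressed snapshots.
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    U, s, V = U[:, :k], s[:k], Vt[:k, :].T
    Atilde = U.T @ Ys @ V / s                  # low-rank approximation of the operator
    evals, W = np.linalg.eig(Atilde)
    modes = Q @ (Ys @ V / s @ W)               # lift the DMD modes back to R^n
    return evals, modes
```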

    Randomized methods for matrix computations

    The purpose of this text is to provide an accessible introduction to a set of recently developed algorithms for factorizing matrices. These new algorithms attain high practical speed by reducing the dimensionality of intermediate computations using randomized projections. The algorithms are particularly powerful for computing low-rank approximations to very large matrices, but they can also be used to accelerate algorithms for computing full factorizations of matrices. A key competitive advantage of the algorithms described is that they require less communication than traditional deterministic methods.
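
    The prototypical member of this class is a randomized low-rank SVD: sketch the range of the matrix with a Gaussian test matrix, orthonormalize, and then factor the small projected matrix deterministically. A brief sketch under those assumptions is given below; the oversampling p and power-iteration count q are illustrative defaults.

```python
import numpy as np

def randomized_svd(A, k, p=10, q=1, seed=0):
    """Approximate rank-k SVD of A via a Gaussian sketch with oversampling p
    and q power iterations."""
    rng = np.random.default_rng(seed)
    Q = np.linalg.qr(A @ rng.standard_normal((A.shape[1], k + p)))[0]  # range sketch
    for _ in range(q):                          # power iterations for slowly decaying spectra
        Q = np.linalg.qr(A @ (A.T @ Q))[0]
    B = Q.T @ A                                 # small (k+p) x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k]
```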