Conditioning of Leverage Scores and Computation by QR Decomposition
The leverage scores of a matrix A with full column rank are the squared row norms
of any orthonormal basis for range(A). We show that the corresponding leverage
scores of two matrices A and A + \Delta A are close in the relative sense if
they have large magnitude and if all principal angles between the column spaces
of A and A + \Delta A are small. We also show three classes of bounds that are
based on perturbation results of QR decompositions. They demonstrate that
relative differences between individual leverage scores strongly depend on the
particular type of perturbation \Delta A. The bounds imply that the relative
accuracy of an individual leverage score depends on: its magnitude and the
two-norm condition number of A, if \Delta A is a general perturbation; the two-norm
condition number of A, if \Delta A is a perturbation with the same norm-wise
row-scaling as A; (to first order) neither condition number nor leverage score
magnitude, if \Delta A is a component-wise row-scaled perturbation. Numerical
experiments confirm the qualitative and quantitative accuracy of our bounds.
Comment: This version has been accepted to SIMAX but has not yet gone through copy editing.
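As a minimal sketch of the definition above (not code from the paper), the leverage scores can be read off from a thin QR factorization of A; the NumPy rendering and the function name are my own:

import numpy as np

def leverage_scores(A):
    """Leverage scores of a full column-rank A: squared row norms of an
    orthonormal basis for range(A), here obtained from a thin QR factorization."""
    Q, _ = np.linalg.qr(A, mode="reduced")  # columns of Q are an orthonormal basis for range(A)
    return np.sum(Q**2, axis=1)             # squared Euclidean norm of each row of Q

# Example: the scores sum to the column dimension (trace of the hat matrix).
A = np.random.randn(100, 5)
scores = leverage_scores(A)
assert np.isclose(scores.sum(), 5.0)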
Efficient Algorithms for CUR and Interpolative Matrix Decompositions
The manuscript describes efficient algorithms for the computation of the CUR
and ID decompositions. The methods used are based on simple modifications to
the classical truncated pivoted QR decomposition, which means that highly
optimized library codes can be utilized for implementation. For certain
applications, further acceleration can be attained by incorporating techniques
based on randomized projections. Numerical experiments demonstrate advantageous
performance compared to existing techniques for computing CUR factorizations.
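As a rough illustration of the connection to pivoted QR (not the authors' optimized implementation; the rank k, helper name, and use of SciPy are assumptions), a one-sided interpolative decomposition can be assembled from a truncated column-pivoted QR:

import numpy as np
from scipy.linalg import qr

def column_id(A, k):
    """Rank-k column ID: A is approximated by A[:, cols] @ P, with P obtained
    from a truncated column-pivoted QR factorization of A."""
    Q, R, piv = qr(A, mode="economic", pivoting=True)
    R11, R12 = R[:k, :k], R[:k, k:]
    # Interpolation coefficients expressing the remaining columns in terms
    # of the k selected (pivot) columns.
    T = np.linalg.solve(R11, R12)
    P = np.zeros((k, A.shape[1]))
    P[:, piv[:k]] = np.eye(k)
    P[:, piv[k:]] = T
    return piv[:k], P

A = np.random.randn(200, 50)
cols, P = column_id(A, 10)
approx = A[:, cols] @ P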
A DEIM Induced CUR Factorization
We derive a CUR matrix factorization based on the Discrete Empirical
Interpolation Method (DEIM). For a given matrix A, such a factorization
provides a low-rank approximate decomposition of the form A \approx C U R,
where C and R are subsets of the columns and rows of A, and U is
constructed to make C U R a good approximation. Given a low-rank singular value
decomposition A \approx V S W^T, the DEIM procedure uses V and W to
select the columns and rows of A that form C and R. Through an error
analysis applicable to a general class of CUR factorizations, we show that the
accuracy tracks the optimal approximation error within a factor that depends on
the conditioning of submatrices of V and W. For large-scale problems, V
and W can be approximated using an incremental QR algorithm that makes one
pass through A. Numerical examples illustrate the favorable performance of
the DEIM-CUR method compared to CUR approximations based on leverage scores.
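For concreteness, here is a generic NumPy rendering of the standard greedy DEIM index-selection step applied to a matrix of singular vectors; it is a sketch of the general technique, not the paper's code:

import numpy as np

def deim_indices(V):
    """Greedy DEIM selection: process the columns of V (e.g. leading singular
    vectors) one at a time, each time picking the row where the residual of the
    current column, after interpolation on the previously chosen rows, is largest."""
    n, k = V.shape
    idx = [int(np.argmax(np.abs(V[:, 0])))]
    for j in range(1, k):
        # Interpolate column j on the rows chosen so far, then take the residual.
        c = np.linalg.solve(V[np.ix_(idx, range(j))], V[idx, j])
        r = V[:, j] - V[:, :j] @ c
        idx.append(int(np.argmax(np.abs(r))))
    return np.array(idx)

# Rows selected this way would index R in a CUR factorization; applying the
# same procedure to the right singular vectors selects the columns for C.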
Low-distortion Subspace Embeddings in Input-sparsity Time and Applications to Robust Linear Regression
Low-distortion embeddings are critical building blocks for developing random
sampling and random projection algorithms for linear algebra problems. We show
that, given a matrix A \in \R^{n \times d} with n \gg d and a p \in [1, 2), with
constant probability, we can construct a low-distortion embedding matrix
\Pi \in \R^{O(\poly(d)) \times n} that embeds \A_p, the subspace spanned by A's
columns, into (\R^{O(\poly(d))}, \| \cdot \|_p); the distortion of our embeddings
is only O(\poly(d)), and we can compute \Pi A in O(\nnz(A)) time, i.e.,
input-sparsity time. Our result generalizes the input-sparsity time \ell_2
subspace embedding by Clarkson and Woodruff [STOC'13]; and for completeness, we
present a simpler and improved analysis of their construction for \ell_2. These
input-sparsity time \ell_p embeddings are optimal, up to constants, in terms of
their running time; and the improved running time propagates to applications
such as (1 \pm \epsilon)-distortion \ell_p subspace embedding and relative-error
\ell_p regression. For \ell_2, we show that a (1 + \epsilon)-approximate solution
to the \ell_2 regression problem specified by the matrix A and a vector b can be
computed in O(\nnz(A) + d^3 \log(d/\epsilon) / \epsilon^2) time; and for \ell_p,
via a subspace-preserving sampling procedure, we show that a
(1 \pm \epsilon)-distortion embedding of \A_p into \R^{O(\poly(d))} can be
computed in O(\nnz(A) \cdot \log n) time, and we also show that a
(1 + \epsilon)-approximate solution to the \ell_p regression problem can be
computed in O(\nnz(A) \cdot \log n + \poly(d) \log(1/\epsilon)/\epsilon^2) time.
Moreover, we can improve the embedding dimension, or equivalently the sample
size, without increasing the complexity.
Comment: 22 pages
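As a hedged illustration of the flavor of such embeddings, the following NumPy sketch applies a CountSketch-style sparse embedding, which achieves input-sparsity time in the \ell_2 case; the embedding dimension and function name are assumptions, and this is not the authors' construction for general \ell_p:

import numpy as np

def sparse_embedding(A, m, rng=np.random.default_rng(0)):
    """Apply a CountSketch-style embedding Pi to A: each row of A is hashed to
    one of m output rows and multiplied by a random sign, so forming Pi @ A
    costs O(nnz(A)) operations."""
    n, d = A.shape
    rows = rng.integers(0, m, size=n)        # hash bucket for each input row
    signs = rng.choice([-1.0, 1.0], size=n)  # random sign for each input row
    PA = np.zeros((m, d))
    np.add.at(PA, rows, signs[:, None] * A)  # scatter-add the signed rows
    return PA

A = np.random.randn(10000, 20)
PA = sparse_embedding(A, m=400)  # 400 rows instead of 10000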
Algorithmic and Statistical Perspectives on Large-Scale Data Analysis
In recent years, ideas from statistics and scientific computing have begun to
interact in increasingly sophisticated and fruitful ways with ideas from
computer science and the theory of algorithms to aid in the development of
improved worst-case algorithms that are useful for large-scale scientific and
Internet data analysis problems. In this chapter, I will describe two recent
examples---one having to do with selecting good columns or features from a (DNA
Single Nucleotide Polymorphism) data matrix, and the other having to do with
selecting good clusters or communities from a data graph (representing a social
or information network)---that drew on ideas from both areas and that may serve
as a model for exploiting complementary algorithmic and statistical
perspectives in order to solve applied large-scale data analysis problems.
Comment: 33 pages. To appear in Uwe Naumann and Olaf Schenk, editors,
"Combinatorial Scientific Computing," Chapman and Hall/CRC Press, 2012.
A Statistical Perspective on Algorithmic Leveraging
One popular method for dealing with large-scale data sets is sampling. For
example, by using the empirical statistical leverage scores as an importance
sampling distribution, the method of algorithmic leveraging samples and
rescales rows/columns of data matrices to reduce the data size before
performing computations on the subproblem. This method has been successful in
improving computational efficiency of algorithms for matrix problems such as
least-squares approximation, least absolute deviations approximation, and
low-rank matrix approximation. Existing work has focused on algorithmic issues
such as worst-case running times and numerical issues associated with providing
high-quality implementations, but none of it addresses statistical aspects of
this method.
In this paper, we provide a simple yet effective framework to evaluate the
statistical properties of algorithmic leveraging in the context of estimating
parameters in a linear regression model with a fixed number of predictors. We
show that from the statistical perspective of bias and variance, neither
leverage-based sampling nor uniform sampling dominates the other. This result
is particularly striking, given the well-known result that, from the
algorithmic perspective of worst-case analysis, leverage-based sampling
provides uniformly superior worst-case algorithmic results, when compared with
uniform sampling. Based on these theoretical results, we propose and analyze
two new leveraging algorithms. A detailed empirical evaluation of existing
leverage-based methods as well as these two new methods is carried out on both
synthetic and real data sets. The empirical results indicate that our theory is
a good predictor of practical performance of existing and new leverage-based
algorithms and that the new algorithms achieve improved performance.
Comment: 44 pages, 17 figures.
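To make the sampling step concrete, here is a small NumPy sketch of leverage-based row sampling for least squares; it is a generic rendering of algorithmic leveraging under assumed parameter names, not the specific estimators analyzed in the paper:

import numpy as np

def leveraged_least_squares(A, b, r, rng=np.random.default_rng(0)):
    """Sample r rows with probabilities proportional to their leverage scores,
    rescale them, and solve the reduced least-squares problem."""
    Q, _ = np.linalg.qr(A, mode="reduced")
    probs = np.sum(Q**2, axis=1) / A.shape[1]   # leverage-based sampling distribution
    idx = rng.choice(A.shape[0], size=r, replace=True, p=probs)
    w = 1.0 / np.sqrt(r * probs[idx])           # rescaling so the subproblem matches the full problem in expectation
    x, *_ = np.linalg.lstsq(w[:, None] * A[idx], w * b[idx], rcond=None)
    return x

A, b = np.random.randn(5000, 10), np.random.randn(5000)
x_approx = leveraged_least_squares(A, b, r=200)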
Randomized Dynamic Mode Decomposition
This paper presents a randomized algorithm for computing the near-optimal
low-rank dynamic mode decomposition (DMD). Randomized algorithms are emerging
techniques to compute low-rank matrix approximations at a fraction of the cost
of deterministic algorithms, easing the computational challenges arising in the
area of `big data'. The idea is to derive a small matrix from the
high-dimensional data, which is then used to efficiently compute the dynamic
modes and eigenvalues. The algorithm is presented in a modular probabilistic
framework, and the approximation quality can be controlled via oversampling and
power iterations. The effectiveness of the resulting randomized DMD algorithm
is demonstrated on several benchmark examples of increasing complexity,
providing an accurate and efficient approach to extract spatiotemporal coherent
structures from big data in a framework that scales with the intrinsic rank of
the data, rather than the ambient measurement dimension. For this work we
assume that the dynamics of the problem under consideration evolve on a
low-dimensional subspace that is well characterized by a fast-decaying singular
value spectrum.
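A compressed sketch of the idea (random projection of the snapshot data, followed by a standard exact-DMD computation on the small matrix) is given below; the dimensions, oversampling, and power-iteration parameters are illustrative assumptions rather than the paper's implementation:

import numpy as np

def randomized_dmd(X, Y, k, p=10, q=2, rng=np.random.default_rng(0)):
    """Randomized DMD sketch: X, Y hold snapshot pairs (columns), with Y ~ A X.
    A randomized range finder with oversampling p and q power iterations
    compresses the data before the usual rank-k DMD eigendecomposition."""
    Omega = rng.standard_normal((X.shape[1], k + p))
    Z = X @ Omega
    for _ in range(q):                        # power iterations sharpen the range estimate
        Z = X @ (X.T @ Z)
    Q, _ = np.linalg.qr(Z, mode="reduced")
    Xc, Yc = Q.T @ X, Q.T @ Y                 # compressed snapshot matrices
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    U, s, Vt = U[:, :k], s[:k], Vt[:k]
    Atilde = U.T @ Yc @ Vt.T / s              # small operator on the POD subspace
    evals, W = np.linalg.eig(Atilde)          # DMD eigenvalues
    modes = Q @ (Yc @ Vt.T @ np.diag(1.0 / s) @ W)  # modes lifted back to the full space
    return evals, modes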
Randomized methods for matrix computations
The purpose of this text is to provide an accessible introduction to a set of
recently developed algorithms for factorizing matrices. These new algorithms
attain high practical speed by reducing the dimensionality of intermediate
computations using randomized projections. The algorithms are particularly
powerful for computing low-rank approximations to very large matrices, but they
can also be used to accelerate algorithms for computing full factorizations of
matrices. A key competitive advantage of the algorithms described is that they
require less communication than traditional deterministic methods.
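A bare-bones illustration of the core idea, randomized range finding followed by a deterministic factorization of the small projected matrix, in the spirit of the well-known Halko-Martinsson-Tropp prototype rather than any specific algorithm in this text; the parameter choices are assumptions:

import numpy as np

def randomized_svd(A, k, p=10, rng=np.random.default_rng(0)):
    """Rank-k randomized SVD: sample the range of A with a Gaussian test
    matrix, orthonormalize, and run a deterministic SVD on the small projection."""
    Omega = rng.standard_normal((A.shape[1], k + p))  # random test matrix (p = oversampling)
    Q, _ = np.linalg.qr(A @ Omega, mode="reduced")    # orthonormal basis for an approximate range
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U_small)[:, :k], s[:k], Vt[:k]

A = np.random.randn(2000, 400)
U, s, Vt = randomized_svd(A, k=20)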