Conditioning of Leverage Scores and Computation by QR Decomposition
The leverage scores of a matrix A with full column rank are the squared row norms
of any orthonormal basis for range(A). We show that the corresponding leverage
scores of two matrices A and A + \Delta A are close in the relative sense if
they have large magnitude and if all principal angles between the column spaces
of A and A + \Delta A are small. We also show three classes of bounds that are
based on perturbation results of QR decompositions. They demonstrate that
relative differences between individual leverage scores strongly depend on the
particular type of perturbation \Delta A. The bounds imply that the relative
accuracy of an individual leverage score depends on: its magnitude and the
two-norm condition number of A, if \Delta A is a general perturbation; the two-norm
condition number of A, if \Delta A is a perturbation with the same norm-wise
row-scaling as A; (to first order) neither condition number nor leverage score
magnitude, if \Delta A is a component-wise row-scaled perturbation. Numerical
experiments confirm the qualitative and quantitative accuracy of our bounds.
Comment: This version has been accepted to SIMAX but has not yet gone through copy editing.
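As a minimal sketch of the definition above (not code from the paper), the leverage scores can be read off from a thin QR factorization of A; the NumPy rendering and the function name are my own:

import numpy as np

def leverage_scores(A):
    """Leverage scores of a full column-rank A: squared row norms of an
    orthonormal basis for range(A), here obtained from a thin QR factorization."""
    Q, _ = np.linalg.qr(A, mode="reduced")  # columns of Q are an orthonormal basis for range(A)
    return np.sum(Q**2, axis=1)             # squared Euclidean norm of each row of Q

# Example: the scores sum to the column dimension (trace of the hat matrix).
A = np.random.randn(100, 5)
scores = leverage_scores(A)
assert np.isclose(scores.sum(), 5.0)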
Efficient Algorithms for CUR and Interpolative Matrix Decompositions
The manuscript describes efficient algorithms for the computation of the CUR
and ID decompositions. The methods used are based on simple modifications to
the classical truncated pivoted QR decomposition, which means that highly
optimized library codes can be utilized for implementation. For certain
applications, further acceleration can be attained by incorporating techniques
based on randomized projections. Numerical experiments demonstrate advantageous
performance compared to existing techniques for computing CUR factorizations.
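As a rough illustration of the connection to pivoted QR (not the authors' optimized implementation; the rank k, helper name, and use of SciPy are assumptions), a one-sided interpolative decomposition can be assembled from a truncated column-pivoted QR:

import numpy as np
from scipy.linalg import qr

def column_id(A, k):
    """Rank-k column ID: A is approximated by A[:, cols] @ P, with P obtained
    from a truncated column-pivoted QR factorization of A."""
    Q, R, piv = qr(A, mode="economic", pivoting=True)
    R11, R12 = R[:k, :k], R[:k, k:]
    # Interpolation coefficients expressing the remaining columns in terms
    # of the k selected (pivot) columns.
    T = np.linalg.solve(R11, R12)
    P = np.zeros((k, A.shape[1]))
    P[:, piv[:k]] = np.eye(k)
    P[:, piv[k:]] = T
    return piv[:k], P

A = np.random.randn(200, 50)
cols, P = column_id(A, 10)
approx = A[:, cols] @ P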
A DEIM Induced CUR Factorization
We derive a CUR matrix factorization based on the Discrete Empirical
Interpolation Method (DEIM). For a given matrix A, such a factorization
provides a low-rank approximate decomposition of the form A \approx C U R,
where C and R are subsets of the columns and rows of A, and U is
constructed to make C U R a good approximation. Given a low-rank singular value
decomposition A \approx V S W^T, the DEIM procedure uses V and W to
select the columns and rows of A that form C and R. Through an error
analysis applicable to a general class of CUR factorizations, we show that the
accuracy tracks the optimal approximation error within a factor that depends on
the conditioning of submatrices of V and W. For large-scale problems, V
and W can be approximated using an incremental QR algorithm that makes one
pass through A. Numerical examples illustrate the favorable performance of
the DEIM-CUR method compared to CUR approximations based on leverage scores.
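For concreteness, here is a generic NumPy rendering of the standard greedy DEIM index-selection step applied to a matrix of singular vectors; it is a sketch of the general technique, not the paper's code:

import numpy as np

def deim_indices(V):
    """Greedy DEIM selection: process the columns of V (e.g. leading singular
    vectors) one at a time, each time picking the row where the residual of the
    current column, after interpolation on the previously chosen rows, is largest."""
    n, k = V.shape
    idx = [int(np.argmax(np.abs(V[:, 0])))]
    for j in range(1, k):
        # Interpolate column j on the rows chosen so far, then take the residual.
        c = np.linalg.solve(V[np.ix_(idx, range(j))], V[idx, j])
        r = V[:, j] - V[:, :j] @ c
        idx.append(int(np.argmax(np.abs(r))))
    return np.array(idx)

# Rows selected this way would index R in a CUR factorization; applying the
# same procedure to the right singular vectors selects the columns for C.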
Low-distortion Subspace Embeddings in Input-sparsity Time and Applications to Robust Linear Regression
Low-distortion embeddings are critical building blocks for developing random
sampling and random projection algorithms for linear algebra problems. We show
that, given a matrix A \in \R^{n \times d} with n \gg d and a p \in [1, 2), with
constant probability, we can construct a low-distortion embedding matrix
\Pi \in \R^{O(\poly(d)) \times n} that embeds \A_p, the subspace spanned by A's
columns, into (\R^{O(\poly(d))}, \| \cdot \|_p); the distortion of our embeddings
is only O(\poly(d)), and we can compute \Pi A in O(\nnz(A)) time, i.e.,
input-sparsity time. Our result generalizes the input-sparsity time \ell_2
subspace embedding by Clarkson and Woodruff [STOC'13]; and for completeness, we
present a simpler and improved analysis of their construction for \ell_2. These
input-sparsity time \ell_p embeddings are optimal, up to constants, in terms of
their running time; and the improved running time propagates to applications
such as (1 \pm \epsilon)-distortion \ell_p subspace embedding and relative-error
\ell_p regression. For \ell_2, we show that a (1 + \epsilon)-approximate solution
to the \ell_2 regression problem specified by the matrix A and a vector b can be
computed in O(\nnz(A) + d^3 \log(d/\epsilon) / \epsilon^2) time; and for \ell_p,
via a subspace-preserving sampling procedure, we show that a
(1 \pm \epsilon)-distortion embedding of \A_p into \R^{O(\poly(d))} can be
computed in O(\nnz(A) \cdot \log n) time, and we also show that a
(1 + \epsilon)-approximate solution to the \ell_p regression problem can be
computed in O(\nnz(A) \cdot \log n + \poly(d) \log(1/\epsilon)/\epsilon^2) time.
Moreover, we can improve the embedding dimension, or equivalently the sample
size, without increasing the complexity.
Comment: 22 pages
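As a hedged illustration of the flavor of such embeddings, the following NumPy sketch applies a CountSketch-style sparse embedding, which achieves input-sparsity time in the \ell_2 case; the embedding dimension and function name are assumptions, and this is not the authors' construction for general \ell_p:

import numpy as np

def sparse_embedding(A, m, rng=np.random.default_rng(0)):
    """Apply a CountSketch-style embedding Pi to A: each row of A is hashed to
    one of m output rows and multiplied by a random sign, so forming Pi @ A
    costs O(nnz(A)) operations."""
    n, d = A.shape
    rows = rng.integers(0, m, size=n)        # hash bucket for each input row
    signs = rng.choice([-1.0, 1.0], size=n)  # random sign for each input row
    PA = np.zeros((m, d))
    np.add.at(PA, rows, signs[:, None] * A)  # scatter-add the signed rows
    return PA

A = np.random.randn(10000, 20)
PA = sparse_embedding(A, m=400)  # 400 rows instead of 10000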
Algorithmic and Statistical Perspectives on Large-Scale Data Analysis
In recent years, ideas from statistics and scientific computing have begun to
interact in increasingly sophisticated and fruitful ways with ideas from
computer science and the theory of algorithms to aid in the development of
improved worst-case algorithms that are useful for large-scale scientific and
Internet data analysis problems. In this chapter, I will describe two recent
examples---one having to do with selecting good columns or features from a (DNA
Single Nucleotide Polymorphism) data matrix, and the other having to do with
selecting good clusters or communities from a data graph (representing a social
or information network)---that drew on ideas from both areas and that may serve
as a model for exploiting complementary algorithmic and statistical
perspectives in order to solve applied large-scale data analysis problems.
Comment: 33 pages. To appear in Uwe Naumann and Olaf Schenk, editors,
"Combinatorial Scientific Computing," Chapman and Hall/CRC Press, 2012.
A Statistical Perspective on Algorithmic Leveraging
One popular method for dealing with large-scale data sets is sampling. For
example, by using the empirical statistical leverage scores as an importance
sampling distribution, the method of algorithmic leveraging samples and
rescales rows/columns of data matrices to reduce the data size before
performing computations on the subproblem. This method has been successful in
improving computational efficiency of algorithms for matrix problems such as
least-squares approximation, least absolute deviations approximation, and
low-rank matrix approximation. Existing work has focused on algorithmic issues
such as worst-case running times and numerical issues associated with providing
high-quality implementations, but none of it addresses statistical aspects of
this method.
In this paper, we provide a simple yet effective framework to evaluate the
statistical properties of algorithmic leveraging in the context of estimating
parameters in a linear regression model with a fixed number of predictors. We
show that from the statistical perspective of bias and variance, neither
leverage-based sampling nor uniform sampling dominates the other. This result
is particularly striking, given the well-known result that, from the
algorithmic perspective of worst-case analysis, leverage-based sampling
provides uniformly superior worst-case algorithmic results, when compared with
uniform sampling. Based on these theoretical results, we propose and analyze
two new leveraging algorithms. A detailed empirical evaluation of existing
leverage-based methods as well as these two new methods is carried out on both
synthetic and real data sets. The empirical results indicate that our theory is
a good predictor of practical performance of existing and new leverage-based
algorithms and that the new algorithms achieve improved performance.
Comment: 44 pages, 17 figures.
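To make the sampling step concrete, here is a small NumPy sketch of leverage-based row sampling for least squares; it is a generic rendering of algorithmic leveraging under assumed parameter names, not the specific estimators analyzed in the paper:

import numpy as np

def leveraged_least_squares(A, b, r, rng=np.random.default_rng(0)):
    """Sample r rows with probabilities proportional to their leverage scores,
    rescale them, and solve the reduced least-squares problem."""
    Q, _ = np.linalg.qr(A, mode="reduced")
    probs = np.sum(Q**2, axis=1) / A.shape[1]   # leverage-based sampling distribution
    idx = rng.choice(A.shape[0], size=r, replace=True, p=probs)
    w = 1.0 / np.sqrt(r * probs[idx])           # rescaling so the subproblem matches the full problem in expectation
    x, *_ = np.linalg.lstsq(w[:, None] * A[idx], w * b[idx], rcond=None)
    return x

A, b = np.random.randn(5000, 10), np.random.randn(5000)
x_approx = leveraged_least_squares(A, b, r=200)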
Randomized Dynamic Mode Decomposition
This paper presents a randomized algorithm for computing the near-optimal
low-rank dynamic mode decomposition (DMD). Randomized algorithms are emerging
techniques to compute low-rank matrix approximations at a fraction of the cost
of deterministic algorithms, easing the computational challenges arising in the
area of `big data'. The idea is to derive a small matrix from the
high-dimensional data, which is then used to efficiently compute the dynamic
modes and eigenvalues. The algorithm is presented in a modular probabilistic
framework, and the approximation quality can be controlled via oversampling and
power iterations. The effectiveness of the resulting randomized DMD algorithm
is demonstrated on several benchmark examples of increasing complexity,
providing an accurate and efficient approach to extract spatiotemporal coherent
structures from big data in a framework that scales with the intrinsic rank of
the data, rather than the ambient measurement dimension. For this work we
assume that the dynamics of the problem under consideration evolve on a
low-dimensional subspace that is well characterized by a fast-decaying singular
value spectrum.
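A compressed sketch of the idea (random projection of the snapshot data, followed by a standard exact-DMD computation on the small matrix) is given below; the dimensions, oversampling, and power-iteration parameters are illustrative assumptions rather than the paper's implementation:

import numpy as np

def randomized_dmd(X, Y, k, p=10, q=2, rng=np.random.default_rng(0)):
    """Randomized DMD sketch: X, Y hold snapshot pairs (columns), with Y ~ A X.
    A randomized range finder with oversampling p and q power iterations
    compresses the data before the usual rank-k DMD eigendecomposition."""
    Omega = rng.standard_normal((X.shape[1], k + p))
    Z = X @ Omega
    for _ in range(q):                        # power iterations sharpen the range estimate
        Z = X @ (X.T @ Z)
    Q, _ = np.linalg.qr(Z, mode="reduced")
    Xc, Yc = Q.T @ X, Q.T @ Y                 # compressed snapshot matrices
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    U, s, Vt = U[:, :k], s[:k], Vt[:k]
    Atilde = U.T @ Yc @ Vt.T / s              # small operator on the POD subspace
    evals, W = np.linalg.eig(Atilde)          # DMD eigenvalues
    modes = Q @ (Yc @ Vt.T @ np.diag(1.0 / s) @ W)  # modes lifted back to the full space
    return evals, modes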
Randomized methods for matrix computations
The purpose of this text is to provide an accessible introduction to a set of
recently developed algorithms for factorizing matrices. These new algorithms
attain high practical speed by reducing the dimensionality of intermediate
computations using randomized projections. The algorithms are particularly
powerful for computing low-rank approximations to very large matrices, but they
can also be used to accelerate algorithms for computing full factorizations of
matrices. A key competitive advantage of the algorithms described is that they
require less communication than traditional deterministic methods.
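A bare-bones illustration of the core idea, randomized range finding followed by a deterministic factorization of the small projected matrix, in the spirit of the well-known Halko-Martinsson-Tropp prototype rather than any specific algorithm in this text; the parameter choices are assumptions:

import numpy as np

def randomized_svd(A, k, p=10, rng=np.random.default_rng(0)):
    """Rank-k randomized SVD: sample the range of A with a Gaussian test
    matrix, orthonormalize, and run a deterministic SVD on the small projection."""
    Omega = rng.standard_normal((A.shape[1], k + p))  # random test matrix (p = oversampling)
    Q, _ = np.linalg.qr(A @ Omega, mode="reduced")    # orthonormal basis for an approximate range
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U_small)[:, :k], s[:k], Vt[:k]

A = np.random.randn(2000, 400)
U, s, Vt = randomized_svd(A, k=20)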