765 research outputs found
Conditioning of Leverage Scores and Computation by QR Decomposition
The leverage scores of a full-column rank matrix A are the squared row norms
of any orthonormal basis for range(A). We show that corresponding leverage
scores of two matrices A and A + \Delta A are close in the relative sense, if
they have large magnitude and if all principal angles between the column spaces
of A and A + \Delta A are small. We also show three classes of bounds that are
based on perturbation results of QR decompositions. They demonstrate that
relative differences between individual leverage scores strongly depend on the
particular type of perturbation \Delta A. The bounds imply that the relative
accuracy of an individual leverage score depends on: its magnitude and the
two-norm condition of A, if \Delta A is a general perturbation; the two-norm
condition number of A, if \Delta A is a perturbation with the same norm-wise
row-scaling as A; (to first order) neither condition number nor leverage score
magnitude, if \Delta A is a component-wise row-scaled perturbation. Numerical
experiments confirm the qualitative and quantitative accuracy of our bounds.Comment: This version has been accepted to SIMAX but has not yet gone through
copy editin
Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions
Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed—either explicitly or
implicitly—to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition of an m × n matrix. (i) For a dense input matrix, randomized algorithms require O(mn log(k))
floating-point operations (flops) in contrast to O(mnk) for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to O(k) passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data
Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions
Low-rank matrix approximations, such as the truncated singular value
decomposition and the rank-revealing QR decomposition, play a central role in
data analysis and scientific computing. This work surveys and extends recent
research which demonstrates that randomization offers a powerful tool for
performing low-rank matrix approximation. These techniques exploit modern
computational architectures more fully than classical methods and open the
possibility of dealing with truly massive data sets.
This paper presents a modular framework for constructing randomized
algorithms that compute partial matrix decompositions. These methods use random
sampling to identify a subspace that captures most of the action of a matrix.
The input matrix is then compressed---either explicitly or implicitly---to this
subspace, and the reduced matrix is manipulated deterministically to obtain the
desired low-rank factorization. In many cases, this approach beats its
classical competitors in terms of accuracy, speed, and robustness. These claims
are supported by extensive numerical experiments and a detailed error analysis
Accurate and Efficient Expression Evaluation and Linear Algebra
We survey and unify recent results on the existence of accurate algorithms
for evaluating multivariate polynomials, and more generally for accurate
numerical linear algebra with structured matrices. By "accurate" we mean that
the computed answer has relative error less than 1, i.e., has some correct
leading digits. We also address efficiency, by which we mean algorithms that
run in polynomial time in the size of the input. Our results will depend
strongly on the model of arithmetic: Most of our results will use the so-called
Traditional Model (TM). We give a set of necessary and sufficient conditions to
decide whether a high accuracy algorithm exists in the TM, and describe
progress toward a decision procedure that will take any problem and provide
either a high accuracy algorithm or a proof that none exists. When no accurate
algorithm exists in the TM, it is natural to extend the set of available
accurate operations by a library of additional operations, such as , dot
products, or indeed any enumerable set which could then be used to build
further accurate algorithms. We show how our accurate algorithms and decision
procedure for finding them extend to this case. Finally, we address other
models of arithmetic, and the relationship between (im)possibility in the TM
and (in)efficient algorithms operating on numbers represented as bit strings.Comment: 49 pages, 6 figures, 1 tabl
Provable Deterministic Leverage Score Sampling
We explain theoretically a curious empirical phenomenon: "Approximating a
matrix by deterministically selecting a subset of its columns with the
corresponding largest leverage scores results in a good low-rank matrix
surrogate". To obtain provable guarantees, previous work requires randomized
sampling of the columns with probabilities proportional to their leverage
scores.
In this work, we provide a novel theoretical analysis of deterministic
leverage score sampling. We show that such deterministic sampling can be
provably as accurate as its randomized counterparts, if the leverage scores
follow a moderately steep power-law decay. We support this power-law assumption
by providing empirical evidence that such decay laws are abundant in real-world
data sets. We then demonstrate empirically the performance of deterministic
leverage score sampling, which many times matches or outperforms the
state-of-the-art techniques.Comment: 20th ACM SIGKDD Conference on Knowledge Discovery and Data Minin
A new perturbation bound for the LDU factorization of diagonally dominant matrices
This work introduces a new perturbation bound for the L factor of the LDU factorization
of (row) diagonally dominant matrices computed via the column diagonal dominance pivoting
strategy. This strategy yields L and U factors which are always well-conditioned and, so, the LDU
factorization is guaranteed to be a rank-revealing decomposition. The new bound together with
those for the D and U factors in [F. M. Dopico and P. Koev, Numer. Math., 119 (2011), pp. 337–
371] establish that if diagonally dominant matrices are parameterized via their diagonally dominant
parts and off-diagonal entries, then tiny relative componentwise perturbations of these parameters
produce tiny relative normwise variations of L and U and tiny relative entrywise variations of D when
column diagonal dominance pivoting is used. These results will allow us to prove in a follow-up work
that such perturbations also lead to strong perturbation bounds for many other problems involving
diagonally dominant matrices.Research supported in part by Ministerio de Economía y Competitividad
of Spain under grant MTM2012-32542.Publicad
- …