20,499 research outputs found
Fast computation of spectral projectors of banded matrices
We consider the approximate computation of spectral projectors for symmetric
banded matrices. While this problem has received considerable attention,
especially in the context of linear scaling electronic structure methods, the
presence of small relative spectral gaps challenges existing methods based on
approximate sparsity. In this work, we show how a data-sparse approximation
based on hierarchical matrices can be used to overcome this problem. We prove a
priori bounds on the approximation error and propose a fast algo- rithm based
on the QDWH algorithm, along the works by Nakatsukasa et al. Numerical
experiments demonstrate that the performance of our algorithm is robust with
respect to the spectral gap. A preliminary Matlab implementation becomes faster
than eig already for matrix sizes of a few thousand.Comment: 27 pages, 10 figure
Minimizing Communication in Linear Algebra
In 1981 Hong and Kung proved a lower bound on the amount of communication
needed to perform dense, matrix-multiplication using the conventional
algorithm, where the input matrices were too large to fit in the small, fast
memory. In 2004 Irony, Toledo and Tiskin gave a new proof of this result and
extended it to the parallel case. In both cases the lower bound may be
expressed as (#arithmetic operations / ), where M is the size
of the fast memory (or local memory in the parallel case). Here we generalize
these results to a much wider variety of algorithms, including LU
factorization, Cholesky factorization, factorization, QR factorization,
algorithms for eigenvalues and singular values, i.e., essentially all direct
methods of linear algebra. The proof works for dense or sparse matrices, and
for sequential or parallel algorithms. In addition to lower bounds on the
amount of data moved (bandwidth) we get lower bounds on the number of messages
required to move it (latency). We illustrate how to extend our lower bound
technique to compositions of linear algebra operations (like computing powers
of a matrix), to decide whether it is enough to call a sequence of simpler
optimal algorithms (like matrix multiplication) to minimize communication, or
if we can do better. We give examples of both. We also show how to extend our
lower bounds to certain graph theoretic problems.
We point out recently designed algorithms for dense LU, Cholesky, QR,
eigenvalue and the SVD problems that attain these lower bounds; implementations
of LU and QR show large speedups over conventional linear algebra algorithms in
standard libraries like LAPACK and ScaLAPACK. Many open problems remain.Comment: 27 pages, 2 table
- …