On large-scale diagonalization techniques for the Anderson model of localization
We propose efficient preconditioning algorithms for an eigenvalue problem arising in quantum physics, namely the computation of a few interior eigenvalues and their associated eigenvectors for large-scale sparse real and symmetric indefinite matrices of the Anderson model
of localization. We compare the Lanczos algorithm in the 1987 implementation by Cullum and Willoughby with the shift-and-invert techniques in the implicitly restarted Lanczos method and in the Jacobi–Davidson method. Our preconditioning approaches for the shift-and-invert symmetric indefinite linear system are based on maximum weighted matchings and algebraic multilevel incomplete
LDL^T factorizations. These techniques can be seen as a complement to the alternative idea of using more complete pivoting techniques for the highly ill-conditioned symmetric indefinite Anderson matrices. We demonstrate the effectiveness and the numerical accuracy of these algorithms. Our numerical examples reveal that recent algebraic multilevel preconditioning solvers can accelerate the computation of a large-scale eigenvalue problem corresponding to the Anderson model of localization
by several orders of magnitude.
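The shift-and-invert idea mentioned in this abstract can be illustrated on a toy problem. The sketch below is not the authors' method: it uses a small dense 1D tight-binding matrix, a plain dense solve in place of the preconditioned iterative solvers the abstract describes, and simple inverse iteration rather than Lanczos or Jacobi-Davidson. The names `anderson_1d`, `solve`, and `shift_invert_iteration` are hypothetical.

```python
import math

def anderson_1d(diagonal):
    """Toy 1D Anderson-type matrix: given on-site energies, hopping terms equal to 1."""
    n = len(diagonal)
    h = [[0.0] * n for _ in range(n)]
    for i in range(n):
        h[i][i] = float(diagonal[i])
        if i + 1 < n:
            h[i][i + 1] = h[i + 1][i] = 1.0
    return h

def solve(a, b):
    """Solve a x = b by dense Gaussian elimination with partial pivoting.
    Assumes a is nonsingular (the shift must not be an exact eigenvalue)."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        m[k], m[p] = m[p], m[k]
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= f * m[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x

def shift_invert_iteration(h, sigma, steps=50):
    """Inverse iteration on (H - sigma I): converges to the eigenpair nearest sigma."""
    n = len(h)
    shifted = [[h[i][j] - (sigma if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    x = [1.0] * n
    for _ in range(steps):
        y = solve(shifted, x)
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    hx = [sum(h[i][j] * x[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(x[i] * hx[i] for i in range(n))  # Rayleigh quotient
    return eigenvalue, x
```

For example, `shift_invert_iteration(anderson_1d([0.0] * 5), 0.2)` targets the interior eigenvalue nearest 0.2 of the disorder-free chain, whose eigenvalues are known in closed form as 2cos(kπ/6). In a large-scale setting the dense `solve` would be replaced by a sparse factorization or a preconditioned iterative solver.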
Computing the Rank Profile Matrix
The row (resp. column) rank profile of a matrix describes the staircase shape
of its row (resp. column) echelon form. In an ISSAC'13 paper, we proposed a
recursive Gaussian elimination that can compute simultaneously the row and
column rank profiles of a matrix as well as those of all of its leading
sub-matrices, in the same time as state-of-the-art Gaussian elimination
algorithms. Here we first study the conditions making a Gaussian elimination
algorithm reveal this information. Therefore, we propose the definition of a
new matrix invariant, the rank profile matrix, summarizing all information on
the row and column rank profiles of all the leading sub-matrices. We also
explore the conditions for a Gaussian elimination algorithm to compute all or
part of this invariant, through the corresponding PLUQ decomposition. As a
consequence, we show that the classical iterative CUP decomposition algorithm
can actually be adapted to compute the rank profile matrix. Used, in a Crout
variant, as a base-case to our ISSAC'13 implementation, it delivers a
significant improvement in efficiency. Second, the row (resp. column) echelon
forms of a matrix are usually computed via different dedicated triangular
decompositions. We show here that, from some PLUQ decompositions, it is
possible to recover the row and column echelon forms of a matrix and of any of
its leading sub-matrices thanks to an elementary post-processing algorithm.
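As a small illustration of the row and column rank profiles discussed in this abstract, the sketch below computes both for a matrix with rational entries using plain row-by-row elimination. It is only a direct reading of the definitions, not the recursive PLUQ or Crout-variant CUP algorithms the abstract refers to, and the function name `rank_profiles` is hypothetical.

```python
from fractions import Fraction

def rank_profiles(rows):
    """Return (row_rank_profile, column_rank_profile) as sorted 0-based index lists.

    A row enters the row rank profile when it is linearly independent of all
    earlier rows; the leading columns of the reduced rows give the column
    rank profile. Exact arithmetic via Fraction avoids rounding issues.
    """
    basis = []          # (pivot_column, reduced_row), in insertion order
    row_profile = []
    for i, row in enumerate(rows):
        v = [Fraction(x) for x in row]
        # Eliminate the pivot entries of every previously accepted row.
        for pivot_col, b in basis:
            if v[pivot_col] != 0:
                f = v[pivot_col] / b[pivot_col]
                v = [vj - f * bj for vj, bj in zip(v, b)]
        pivot = next((j for j, vj in enumerate(v) if vj != 0), None)
        if pivot is not None:   # row i is independent of rows 0..i-1
            basis.append((pivot, v))
            row_profile.append(i)
    col_profile = sorted(p for p, _ in basis)
    return row_profile, col_profile
```

Calling `rank_profiles([[0, 2, 4], [0, 1, 2], [1, 0, 0], [1, 1, 2]])` on this rank-2 matrix returns `([0, 2], [0, 1])`: rows 0 and 2 are the first independent rows, and columns 0 and 1 are the first independent columns. Applying the function to each leading sub-matrix gives the information that the rank profile matrix summarizes.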
Computing with functions in spherical and polar geometries I. The sphere
A collection of algorithms is described for numerically computing with smooth
functions defined on the unit sphere. Functions are approximated to essentially
machine precision by using a structure-preserving iterative variant of Gaussian
elimination together with the double Fourier sphere method. We show that this
procedure allows for stable differentiation, reduces the oversampling of
functions near the poles, and converges for certain analytic functions.
Operations such as function evaluation, differentiation, and integration are
particularly efficient and can be computed by essentially one-dimensional
algorithms. A highlight is an optimal complexity direct solver for Poisson's
equation on the sphere using a spectral method. Without parallelization, we
solve Poisson's equation with million degrees of freedom in one minute on
a standard laptop. Numerical results are presented throughout. In a companion
paper (part II) we extend the ideas presented here to computing with functions
on the disk. Comment: 23 pages
LU factorization with panel rank revealing pivoting and its communication avoiding version
We present the LU decomposition with panel rank revealing pivoting (LU_PRRP),
an LU factorization algorithm based on strong rank revealing QR panel
factorization. LU_PRRP is more stable than Gaussian elimination with partial
pivoting (GEPP). Our extensive numerical experiments show that the new
factorization scheme is as numerically stable as GEPP in practice, but it is
more resistant to pathological cases and easily solves the Wilkinson matrix and
the Foster matrix. We also present CALU_PRRP, a communication avoiding version
of LU_PRRP that minimizes communication. CALU_PRRP is based on tournament
pivoting, with the selection of the pivots at each step of the tournament being
performed via strong rank revealing QR factorization. CALU_PRRP is more stable
than CALU, the communication avoiding version of GEPP. CALU_PRRP is also more
stable in practice and is resistant to pathological cases on which GEPP and
CALU fail. Comment: No. RR-7867 (2012)
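The Wilkinson matrix named in this abstract is the classical worst case for GEPP: the entries of its last column double at every elimination step. The sketch below builds that matrix and measures the element growth under textbook partial pivoting. It illustrates only the pathological behaviour of GEPP, not LU_PRRP or CALU_PRRP, and the function names `wilkinson_matrix` and `gepp_growth` are hypothetical.

```python
def wilkinson_matrix(n):
    """n x n matrix with 1 on the diagonal, -1 below it, and 1 in the last column."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):
            a[i][j] = -1.0
        a[i][i] = 1.0
        a[i][n - 1] = 1.0
    return a

def gepp_growth(a):
    """Growth factor of Gaussian elimination with partial pivoting:
    largest entry seen during elimination divided by the largest entry of a.
    Assumes a is square and nonsingular."""
    n = len(a)
    u = [row[:] for row in a]
    max_a = max(abs(x) for row in u for x in row)
    max_seen = max_a
    for k in range(n - 1):
        # Partial pivoting: first row with the largest entry in column k.
        p = max(range(k, n), key=lambda i: abs(u[i][k]))
        u[k], u[p] = u[p], u[k]
        for i in range(k + 1, n):
            f = u[i][k] / u[k][k]
            for j in range(k, n):
                u[i][j] -= f * u[k][j]
        max_seen = max(max_seen, max(abs(x) for row in u for x in row))
    return max_seen / max_a
```

For `n = 10`, `gepp_growth(wilkinson_matrix(10))` is 512.0, that is 2^(n-1), which is why such matrices are used as stress tests for pivoting strategies.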
An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling
We present a sparse linear system solver that is based on a multifrontal
variant of Gaussian elimination, and exploits low-rank approximation of the
resulting dense frontal matrices. We use hierarchically semiseparable (HSS)
matrices, which have low-rank off-diagonal blocks, to approximate the frontal
matrices. For HSS matrix construction, a randomized sampling algorithm is used
together with interpolative decompositions. The combination of the randomized
compression with a fast ULV HSS factorization leads to a solver with lower
computational complexity than the standard multifrontal method for many
applications, resulting in speedups of up to 7-fold for problems in our test
suite. The implementation targets many-core systems by using task parallelism
with dynamic runtime scheduling. Numerical experiments show performance
improvements over state-of-the-art sparse direct solvers. The implementation
achieves high performance and good scalability on a range of modern shared
memory parallel systems, including the Intel Xeon Phi (MIC). The code is part
of a software package called STRUMPACK -- STRUctured Matrices PACKage, which
also has a distributed memory component for dense rank-structured matrices.
Data Structures and Algorithms for Efficient Solution of Simultaneous Linear Equations from 3-D Ice Sheet Models
Two current software packages for solving large systems of sparse simultaneous linear equations are evaluated in terms of their applicability to solving systems of equations generated by the University of Maine Ice Sheet Model. SuperLU, the first package, has been developed by researchers at the University of California at Berkeley and the Lawrence Berkeley National Laboratory. UMFPACK, the second package, has been developed by T. A. Davis of the University of Florida, who has ties with the U. C. Berkeley researchers as well as European researchers. Both packages are direct solvers that use LU factorization with forward and backward substitution. The University of Maine Ice Sheet Model uses the finite element method to solve partial differential equations that describe ice thickness, velocity, and temperature throughout glaciers as functions of position and time. The finite element method generates systems of linear equations having tens of thousands of variables and one hundred or so non-zero coefficients per equation. Matrices representing these systems of equations may be strictly banded or banded with right and lower borders. In order to efficiently interface the software packages with the ice sheet model, a modified compressed column data structure and supporting routines were designed and written. The data structure interfaces directly with both software packages and allows the ice sheet model to access matrix coefficients by row and column number in roughly 100 nanoseconds while only storing non-zero entries of the matrix. No a priori knowledge of the matrix's sparsity pattern is required. Both software packages were tested with matrices produced by the model, and performance characteristics were measured and compared with banded Gaussian elimination. When combined with high performance basic linear algebra subprograms (BLAS), the packages are as much as 5 to 7 times faster than banded Gaussian elimination. The BLAS produced by K. Goto of the University of Texas was used. Memory usage by the packages varied from slightly more than banded Gaussian elimination with UMFPACK, to as much as a 40% savings with SuperLU. In addition, the packages provide componentwise backward error measures and estimates of the matrix's condition number. SuperLU is available for parallel computers as well as single processor computers. UMFPACK is only for single processor computers. Both packages are also capable of efficiently solving the bordered matrix problem.
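The compressed column storage mentioned in this abstract can be sketched as follows. This is the standard compressed sparse column (CSC) layout with a binary-search lookup, not the modified structure designed for the ice sheet model, whose details the abstract does not give. The class name `CSCMatrix` and its methods are hypothetical.

```python
from bisect import bisect_left

class CSCMatrix:
    """Standard compressed sparse column storage.

    Built from a dict mapping (row, col) -> value, so the sparsity pattern
    does not need to be known in advance. Only non-zero entries are stored.
    """

    def __init__(self, n_cols, entries):
        by_col = [[] for _ in range(n_cols)]
        for (r, c), v in entries.items():
            if v != 0:
                by_col[c].append((r, v))
        self.col_ptr = [0]   # col_ptr[c]..col_ptr[c+1] delimits column c
        self.row_idx = []    # row index of each stored entry
        self.values = []     # value of each stored entry
        for col in by_col:
            for r, v in sorted(col):   # keep rows sorted within each column
                self.row_idx.append(r)
                self.values.append(v)
            self.col_ptr.append(len(self.row_idx))

    def get(self, r, c):
        """Return the coefficient at (r, c), or 0.0 if it is not stored."""
        lo, hi = self.col_ptr[c], self.col_ptr[c + 1]
        k = bisect_left(self.row_idx, r, lo, hi)
        if k < hi and self.row_idx[k] == r:
            return self.values[k]
        return 0.0
```

With `m = CSCMatrix(3, {(0, 0): 4.0, (2, 0): -1.0, (1, 1): 3.0, (2, 2): 5.0})`, the arrays `m.col_ptr`, `m.row_idx`, and `m.values` are in the three-array form that CSC-based solvers typically accept, and `m.get(2, 0)` looks up a coefficient by row and column number.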