Parallel matrix inversion techniques
In this paper, we present techniques for inverting sparse, symmetric, positive-definite matrices on parallel and distributed computers. We propose two algorithms, one for SIMD implementation and the other for MIMD implementation. These algorithms are modified versions of Gaussian elimination that take the sparseness of the matrix into account, and they outperform the general parallel Gaussian elimination algorithm. To demonstrate the usefulness of our technique, we applied our sparse matrix algorithm to the snake problem. Our studies reveal that the proposed sparse matrix inversion algorithm significantly reduces the time taken to obtain the solution of the snake problem. We also present the results of our experimental work.
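As a hedged illustration of the core idea (elimination that skips zero multipliers, and hence spends no arithmetic on absent entries of a sparse matrix), here is a minimal serial sketch. The function name and dense-list representation are our own; the paper's actual SIMD/MIMD algorithms and the snake problem formulation are not reproduced.

```python
from fractions import Fraction

def invert_spd(A):
    """Invert a symmetric positive-definite matrix by Gauss-Jordan
    elimination on the augmented system [A | I], skipping zero
    multipliers so that zero entries of a sparse matrix cost no
    arithmetic.  Illustration only: dense storage, no pivoting."""
    n = len(A)
    # Augment with the identity matrix.
    M = [[Fraction(A[i][j]) for j in range(n)] +
         [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n):
        piv = M[k][k]                      # SPD => nonzero without pivoting
        M[k] = [x / piv for x in M[k]]
        for i in range(n):
            if i != k and M[i][k] != 0:    # skip zero multipliers
                f = M[i][k]
                M[i] = [a - f * b for a, b in zip(M[i], M[k])]
    return [row[n:] for row in M]
```

For a banded or otherwise sparse input, most multipliers away from the nonzero pattern are zero, so the inner update loop is skipped for them.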
Symmetric indefinite triangular factorization revealing the rank profile matrix
We present a novel recursive algorithm for reducing a symmetric matrix to a
triangular factorization which reveals the rank profile matrix. That is, the
algorithm computes a factorization $A = P L D L^T P^T$ where $P$ is a permutation matrix,
$L$ is lower triangular with a unit diagonal and $D$ is
symmetric block diagonal with $1 \times 1$ and antidiagonal $2 \times 2$
blocks. The novel algorithm requires $O(n^2 r^{\omega-2})$ arithmetic
operations, where $r$ is the rank and $\omega$ the exponent of matrix multiplication. Furthermore, experimental results demonstrate that our algorithm
can even be slightly more than twice as fast as state-of-the-art
unsymmetric Gaussian elimination in most cases; since the symmetric factorization performs about half the arithmetic, this means it achieves
approximately the same computational speed. By adapting the pivoting strategy
developed in the unsymmetric case, we show how to recover the rank profile
matrix from the permutation matrix and the support of the block-diagonal
matrix. There is an obstruction in characteristic 2 to revealing the rank
profile matrix, which requires relaxing the shape of the block diagonal by
allowing the 2-dimensional blocks to have a non-zero bottom-right coefficient.
This relaxed decomposition can then be transformed into a standard
$P L D L^T P^T$ decomposition at a negligible cost.
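For readers unfamiliar with the factorization shape, a deliberately simplified sketch follows: a dense, unpivoted LDL^T factorization for the case where all blocks of D are 1x1 (e.g. a positive-definite input). The recursive, pivoted, rank-revealing algorithm of the paper is considerably more involved; all names below are our own.

```python
def ldlt(A):
    """Unpivoted LDL^T factorization of a symmetric matrix whose
    leading principal minors are nonzero (e.g. positive definite).
    Returns (L, d) with A = L * diag(d) * L^T, L unit lower triangular.
    Sketch only: the rank-revealing algorithm in the abstract also
    permutes rows/columns and allows 2x2 antidiagonal blocks in D."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    d = [0.0] * n
    for j in range(n):
        L[j][j] = 1.0
        d[j] = A[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (A[i][j] - s) / d[j]
    return L, d
```

Since only L and d are stored, this performs roughly half the arithmetic of an unsymmetric LU factorization of the same matrix, which is the source of the speed comparison in the abstract.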
Linearly scaling direct method for accurately inverting sparse banded matrices
In many problems in Computational Physics and Chemistry, one finds a special
kind of sparse matrices, termed "banded matrices". These matrices, which are
defined as having non-zero entries only within a given distance from the main
diagonal, need often to be inverted in order to solve the associated linear
system of equations. In this work, we introduce a new O(n) algorithm for
solving such a system, being n X n the size of the matrix. We produce the
analytical recursive expressions that allow to directly obtain the solution, as
well as the pseudocode for its computer implementation. Moreover, we review the
different options for possibly parallelizing the method, we describe the
extension to deal with matrices that are banded plus a small number of non-zero
entries outside the band, and we use the same ideas to produce a method for
obtaining the full inverse matrix. Finally, we show that the New Algorithm is
competitive, both in accuracy and in numerical efficiency, when compared to a
standard method based in Gaussian elimination. We do this using sets of large
random banded matrices, as well as the ones that appear when one tries to solve
the 1D Poisson equation by finite differences.Comment: 24 pages, 5 figures, submitted to J. Comp. Phy
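The classic special case of the banded setting is the tridiagonal one, where the well-known Thomas algorithm already solves the system in O(n) with a forward sweep and back substitution. A minimal sketch (our own naming, not the paper's more general recursive method):

```python
def solve_tridiagonal(sub, diag, sup, rhs):
    """Thomas algorithm: O(n) solve of a tridiagonal linear system.
    sub/sup have length n-1, diag/rhs have length n.
    Assumes no pivoting is needed (e.g. diagonally dominant)."""
    n = len(diag)
    cp = [0.0] * n            # modified superdiagonal
    dp = [0.0] * n            # modified right-hand side
    cp[0] = sup[0] / diag[0]
    dp[0] = rhs[0] / diag[0]
    for i in range(1, n):     # forward elimination sweep
        m = diag[i] - sub[i - 1] * cp[i - 1]
        cp[i] = sup[i] / m if i < n - 1 else 0.0
        dp[i] = (rhs[i] - sub[i - 1] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):   # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```

Applied to the 1D Poisson equation -u'' = f discretized by central finite differences, this solves the resulting tridiagonal system exactly for quadratic solutions, since the three-point stencil is exact for polynomials of degree two.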
Computing the Rank Profile Matrix
The row (resp. column) rank profile of a matrix describes the staircase shape
of its row (resp. column) echelon form. In an ISSAC'13 paper, we proposed a
recursive Gaussian elimination that can compute simultaneously the row and
column rank profiles of a matrix, as well as those of all of its leading
sub-matrices, in the same time as state-of-the-art Gaussian elimination
algorithms. Here we first study the conditions that make a Gaussian elimination
algorithm reveal this information. To this end, we propose the definition of a
new matrix invariant, the rank profile matrix, summarizing all information on
the row and column rank profiles of all the leading sub-matrices. We also
explore the conditions for a Gaussian elimination algorithm to compute all or
part of this invariant, through the corresponding PLUQ decomposition. As a
consequence, we show that the classical iterative CUP decomposition algorithm
can actually be adapted to compute the rank profile matrix. Used, in a Crout
variant, as a base case for our ISSAC'13 implementation, it delivers a
significant improvement in efficiency. Second, the row (resp. column) echelon
forms of a matrix are usually computed via different dedicated triangular
decompositions. We show here that, from some PLUQ decompositions, it is
possible to recover the row and column echelon forms of a matrix, and of any of
its leading sub-matrices, thanks to an elementary post-processing algorithm.
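The row and column rank profiles can be computed directly, if inefficiently, with exact Gaussian elimination over the rationals. The sketch below (our own helper, not the authors' recursive PLUQ algorithm) records which columns receive a pivot; applying it to the transpose yields the row rank profile.

```python
from fractions import Fraction

def column_rank_profile(rows):
    """Return the column rank profile: the indices of the first r
    linearly independent columns, found by exact Gaussian elimination
    over the rationals.  Naive O(n^3) sketch for illustration."""
    A = [[Fraction(x) for x in r] for r in rows]
    m, n = len(A), len(A[0])
    pivots, r = [], 0
    for j in range(n):
        # Look for a usable pivot in column j, at or below row r.
        p = next((i for i in range(r, m) if A[i][j] != 0), None)
        if p is None:
            continue                     # column j is dependent
        A[r], A[p] = A[p], A[r]          # bring pivot row up
        for i in range(r + 1, m):        # eliminate below the pivot
            f = A[i][j] / A[r][j]
            for k in range(j, n):
                A[i][k] -= f * A[r][k]
        pivots.append(j)
        r += 1
    return pivots
```

Calling `column_rank_profile` on the transpose gives the row rank profile, since the first independent columns of the transpose are the first independent rows of the original matrix.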
On the equivalence of Gaussian elimination and Gauss-Jordan reduction in solving linear equations
A novel general approach to round-off error analysis using error complexity concepts is described. It is applied to the analysis of the Gaussian elimination and Gauss-Jordan schemes for solving linear equations. The results show that the two algorithms are equivalent in terms of our error complexity measures. Thus the inherently parallel Gauss-Jordan scheme can be implemented with confidence when parallel computers are available.
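As a small concrete check of the equivalence claim (our own sketch, not the paper's error-complexity analysis), both schemes can be run on the same system and their solutions compared; in exact arithmetic they coincide.

```python
from fractions import Fraction

def solve_gaussian(A, b):
    """Gaussian elimination with back substitution, exact arithmetic.
    Sketch: assumes nonzero pivots arise without row exchanges."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(v)] for row, v in zip(A, b)]
    for k in range(n):                       # forward elimination
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            M[i] = [a - f * c for a, c in zip(M[i], M[k])]
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):           # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def solve_gauss_jordan(A, b):
    """Gauss-Jordan reduction: eliminate above and below each pivot,
    so the solution appears directly with no back substitution."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(v)] for row, v in zip(A, b)]
    for k in range(n):
        M[k] = [a / M[k][k] for a in M[k]]   # normalize the pivot row
        for i in range(n):
            if i != k:
                f = M[i][k]
                M[i] = [a - f * c for a, c in zip(M[i], M[k])]
    return [M[i][n] for i in range(n)]
```

Gauss-Jordan removes the sequential back-substitution phase, which is what makes it attractive on parallel machines.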
LU factorization with panel rank revealing pivoting and its communication avoiding version
We present the LU decomposition with panel rank revealing pivoting (LU_PRRP),
an LU factorization algorithm based on strong rank revealing QR panel
factorization. LU_PRRP is more stable than Gaussian elimination with partial
pivoting (GEPP). Our extensive numerical experiments show that the new
factorization scheme is as numerically stable as GEPP in practice, but it is
more resistant to pathological cases and easily solves the Wilkinson matrix and
the Foster matrix. We also present CALU_PRRP, a communication avoiding version
of LU_PRRP that minimizes communication. CALU_PRRP is based on tournament
pivoting, with the selection of the pivots at each step of the tournament being
performed via strong rank revealing QR factorization. CALU_PRRP is more stable
than CALU, the communication avoiding version of GEPP. CALU_PRRP is also more
stable in practice and is resistant to pathological cases on which GEPP and
CALU fail.
Comment: No. RR-7867 (2012).
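The Wilkinson matrix mentioned above is the standard worst case for GEPP, exhibiting element growth of 2^(n-1). The sketch below (our own construction; LU_PRRP itself is not implemented here) demonstrates the growth factor that rank-revealing pivoting strategies are designed to avoid.

```python
def wilkinson(n):
    """Matrix with 1 on the diagonal, -1 below it, and 1 in the last
    column: the classic example where GEPP suffers 2^(n-1) growth."""
    return [[1.0 if j == i or j == n - 1 else (-1.0 if j < i else 0.0)
             for j in range(n)] for i in range(n)]

def gepp_growth_factor(A):
    """Run Gaussian elimination with partial pivoting and return
    max|U| / max|A|, the element growth factor."""
    n = len(A)
    U = [row[:] for row in A]
    amax = max(abs(x) for row in U for x in row)
    for k in range(n - 1):
        p = max(range(k, n), key=lambda i: abs(U[i][k]))  # partial pivot
        U[k], U[p] = U[p], U[k]
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]
            for j in range(k, n):
                U[i][j] -= m * U[k][j]
    return max(abs(x) for row in U for x in row) / amax
```

On the Wilkinson matrix all pivot candidates tie in magnitude, so partial pivoting performs no exchanges and the last column doubles at every elimination step.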
An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling
We present a sparse linear system solver that is based on a multifrontal
variant of Gaussian elimination, and exploits low-rank approximation of the
resulting dense frontal matrices. We use hierarchically semiseparable (HSS)
matrices, which have low-rank off-diagonal blocks, to approximate the frontal
matrices. For HSS matrix construction, a randomized sampling algorithm is used
together with interpolative decompositions. The combination of the randomized
compression with a fast ULV HSS factorization leads to a solver with lower
computational complexity than the standard multifrontal method for many
applications, resulting in speedups of up to 7-fold for problems in our test
suite. The implementation targets many-core systems by using task parallelism
with dynamic runtime scheduling. Numerical experiments show performance
improvements over state-of-the-art sparse direct solvers. The implementation
achieves high performance and good scalability on a range of modern shared
memory parallel systems, including the Intel Xeon Phi (MIC). The code is part
of a software package called STRUMPACK (STRUctured Matrices PACKage), which
also has a distributed-memory component for dense rank-structured matrices.
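The randomized sampling step behind HSS compression can be sketched with the basic randomized range finder of Halko, Martinsson and Tropp: multiply the block by a thin Gaussian test matrix, orthogonalize the result, and project. This is a generic NumPy illustration under our own naming, not STRUMPACK's actual interpolative-decomposition code.

```python
import numpy as np

def randomized_lowrank(A, k, oversample=10, seed=0):
    """Randomized range finder: returns (Q, B) with A ~= Q @ B, where
    Q has k + oversample orthonormal columns.  This is the sampling
    idea used for HSS compression; real HSS codes apply it to the
    low-rank off-diagonal blocks and use interpolative decompositions."""
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((A.shape[1], k + oversample))
    Y = A @ omega                  # sample the range of A
    Q, _ = np.linalg.qr(Y)         # orthonormal basis for the sample
    B = Q.T @ A                    # project A onto that basis
    return Q, B
```

For a block of exact rank k, sampling with k plus a few extra columns captures the range almost surely, so the reconstruction error is at machine-precision level; for numerically low-rank blocks the error decays with the oversampling parameter.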