
    Parallel and fully recursive multifrontal sparse Cholesky

    We describe the design, implementation, and performance of a new parallel sparse Cholesky factorization code. The code uses a multifrontal factorization strategy. Operations on small dense submatrices are performed using new dense matrix subroutines that are part of the code, although the code can also use the BLAS and LAPACK. The new code is recursive at both the sparse and the dense levels; it uses a novel recursive data layout for dense submatrices; and it is parallelized using Cilk, an extension of C specifically designed to parallelize recursive codes. We demonstrate that the new code performs well and scales well on SMPs. In particular, on up to 16 processors, the code outperforms two state-of-the-art message-passing codes. The scalability and high performance that the code achieves imply that recursive schedules, blocked data layouts, and dynamic scheduling are effective in the implementation of sparse factorization codes.
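    The recursion at the dense level can be sketched as divide-and-conquer Cholesky on a 2x2 block partition. The sketch below is only an illustration of that strategy, written in NumPy with an ordinary flat array rather than the paper's recursive blocked layout, and without the Cilk parallelism; all names are my own.

```python
import numpy as np

def recursive_cholesky(A):
    """Return lower-triangular L with A = L @ L.T, by recursion on 2x2 blocks."""
    n = A.shape[0]
    if n == 1:                                   # base case: 1x1 block
        return np.sqrt(A)
    k = n // 2                                   # split A into four blocks
    A11, A21, A22 = A[:k, :k], A[k:, :k], A[k:, k:]
    L11 = recursive_cholesky(A11)                # factor the top-left block
    L21 = np.linalg.solve(L11, A21.T).T          # triangular solve: L21 = A21 L11^{-T}
    L22 = recursive_cholesky(A22 - L21 @ L21.T)  # factor the Schur complement
    L = np.zeros_like(A)
    L[:k, :k], L[k:, :k], L[k:, k:] = L11, L21, L22
    return L

# Usage: factor a random SPD matrix and check the residual.
rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6))
A = B @ B.T + 6 * np.eye(6)                      # SPD by construction
L = recursive_cholesky(A)
```

    In a real implementation the two recursive calls on independent subproblems are where task parallelism (e.g., Cilk's spawn) can be inserted.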

    PARALLEL UNSYMMETRIC-PATTERN MULTIFRONTAL SPARSE LU WITH COLUMN PREORDERING

    Abstract. We present a new parallel sparse LU factorization algorithm and code. The algorithm uses a column-preordering partial-pivoting unsymmetric-pattern multifrontal approach. Our baseline sequential algorithm is based on UMFPACK 4 but is somewhat simpler and is often somewhat faster than UMFPACK version 4.0. Our parallel algorithm is designed for shared-memory machines with a small or moderate number of processors (we tested it on up to 32 processors). We experimentally compare our algorithm with SuperLU MT, an existing shared-memory sparse LU factorization code with partial pivoting. SuperLU MT scales better than our new algorithm, but our algorithm is more reliable and is usually faster in absolute terms (on up to 16 processors; we were not able to run SuperLU MT on 32). More specifically, on large matrices our algorithm is always faster on up to 4 processors, and is usually faster on 8 and 16. The main contribution of this paper is showing that the column-preordering partial-pivoting unsymmetric-pattern multifrontal approach, developed as a sequential algorithm by Davis in several recent versions of UMFPACK, can be effectively parallelized.
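    The column-preordering partial-pivoting structure can be illustrated with a minimal dense sketch: the columns are permuted once up front (here a simple nonzero-count heuristic stands in for a fill-reducing ordering such as COLAMD), and row pivoting is then performed within each column during elimination. This is an assumption-laden toy, not the multifrontal algorithm of the paper.

```python
import numpy as np

def lu_col_preorder(A):
    """Return p, q, L, U with A[p][:, q] = L @ U (row pivots p, column preorder q)."""
    n = A.shape[0]
    q = np.argsort((A != 0).sum(axis=0))          # preorder: sparsest columns first
    U = A[:, q].astype(float).copy()
    p = np.arange(n)
    L = np.eye(n)
    for k in range(n - 1):
        piv = k + np.argmax(np.abs(U[k:, k]))     # partial pivoting within column k
        U[[k, piv]], p[[k, piv]] = U[[piv, k]], p[[piv, k]]
        L[[k, piv], :k] = L[[piv, k], :k]         # swap already-computed multipliers
        L[k+1:, k] = U[k+1:, k] / U[k, k]         # elimination multipliers
        U[k+1:, k:] -= np.outer(L[k+1:, k], U[k, k:])
    return p, q, L, np.triu(U)

# Usage: check that the factorization reproduces the permuted matrix.
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
p, q, L, U = lu_col_preorder(A)
```

    The key property the paper exploits is that fixing the column order in advance leaves only row pivoting to resolve at factorization time, which keeps the symbolic structure predictable enough to parallelize.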

    Combinatorial preconditioners for scalar elliptic finite-element problems

    Abstract. We present a new preconditioner for linear systems arising from finite-element discretizations of scalar elliptic partial differential equations (PDEs). The solver splits the collection {Ke} of element matrices into a subset of matrices that are approximable by diagonally dominant matrices and a subset of matrices that are not approximable. The approximable Ke's are approximated by diagonally dominant matrices Le's that are assembled to form a global diagonally dominant matrix L. A combinatorial graph algorithm then approximates L by another diagonally dominant matrix M that is easier to factor. Finally, M is added to the inapproximable elements to form the preconditioner, which is then factored. When all the element matrices are approximable, which is often the case, the preconditioner is provably efficient. Approximating element matrices by diagonally dominant ones is not a new idea, but we present a new approximation method which is both efficient and provably good. The splitting idea is simple and natural in the context of combinatorial preconditioners, but hard to exploit in other preconditioning paradigms. Experimental results show that on problems in which some of the Ke's are ill conditioned, our new preconditioner is more effective than an algebraic multigrid solver, an incomplete-factorization preconditioner, and a direct solver. Key words. preconditioning, finite elements, support preconditioners, combinatorial preconditioner
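    The approximate-assemble-precondition pipeline can be sketched on a toy 1D problem. In the sketch below, each element matrix is replaced by a deliberately crude diagonally dominant surrogate (the paper's approximation method is more careful and provably good, and the combinatorial graph step that further sparsifies L into M is omitted entirely); the assembled surrogate is then factored and used to precondition CG. All names and the surrogate rule are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import LinearOperator, cg, splu

def dd_surrogate(Ke):
    """Crude diagonally dominant surrogate of a symmetric element matrix."""
    Le = np.minimum(Ke, 0.0)               # keep only nonpositive off-diagonals
    np.fill_diagonal(Le, 0.0)
    np.fill_diagonal(Le, -Le.sum(axis=1))  # diagonal = negated row sum: weakly dominant
    return Le

def assemble(elements, n):
    """Assemble (dofs, Ke) element pairs into a dense global matrix."""
    A = np.zeros((n, n))
    for dofs, Ke in elements:
        A[np.ix_(dofs, dofs)] += Ke
    return A

n = 20
Ke = np.array([[2.0, -1.0], [-1.0, 2.0]])           # one 1D element matrix
elements = [((i, i + 1), Ke) for i in range(n - 1)]
A = assemble(elements, n)
M = assemble([(d, dd_surrogate(K)) for d, K in elements], n)
A[0, 0] += 1.0; M[0, 0] += 1.0                      # "ground" one node so both are SPD

lu = splu(csc_matrix(M))                            # factor the preconditioner once
Minv = LinearOperator((n, n), lu.solve)
b = np.ones(n)
x, info = cg(A, b, M=Minv, maxiter=500)             # preconditioned CG solve
```

    The point of the design is that M is cheap to factor (it is diagonally dominant, hence amenable to combinatorial sparsification), while the CG iteration absorbs the approximation error.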