Search CORE

131 research outputs found

Improving multifrontal methods by means of block low-rank representations

Author: Amestoy Patrick
Ashcraft Cleve
Boiteau Olivier
Buttari Alfredo
L'Excellent Jean-Yves
Weisbecker Clément
Publication venue: HAL CCSD
Publication date: 01/01/2013
Field of study

Submitted for publication to SIAMMatrices coming from elliptic Partial Differential Equations (PDEs) have been shown to have a low-rank property: well defined off-diagonal blocks of their Schur complements can be approximated by low-rank products. Given a suitable ordering of the matrix which gives to the blocks a geometrical meaning, such approximations can be computed using an SVD or a rank-revealing QR factorization. The resulting representation offers a substantial reduction of the memory requirement and gives efficient ways to perform many of the basic dense algebra operations. Several strategies have been proposed to exploit this property. We propose a low-rank format called Block Low-Rank (BLR), and explain how it can be used to reduce the memory footprint and the complexity of direct solvers for sparse matrices based on the multifrontal method. We present experimental results that show how the BLR format delivers gains that are comparable to those obtained with hierarchical formats such as Hierarchical matrices (H matrices) and Hierarchically Semi-Separable (HSS matrices) but provides much greater flexibility and ease of use which are essential in the context of a general purpose, algebraic solver

HAL-ENS-LYON

Scientific Publications of the University of Toulouse II Le Mirail

INRIA a CCSD electronic archive server

Hal-Diderot

Parallel accelerated cyclic reduction preconditioner for three-dimensional elliptic PDEs with variable coefficients

Author: Chávez Gustavo
Keyes David
Turkiyyah George
Zampini Stefano
Publication venue: 'Elsevier BV'
Publication date: 23/12/2017
Field of study

We present a robust and scalable preconditioner for the solution of large-scale linear systems that arise from the discretization of elliptic PDEs amenable to rank compression. The preconditioner is based on hierarchical low-rank approximations and the cyclic reduction method. The setup and application phases of the preconditioner achieve log-linear complexity in memory footprint and number of operations, and numerical experiments exhibit good weak and strong scalability at large processor counts in a distributed memory environment. Numerical experiments with linear systems that feature symmetry and nonsymmetry, definiteness and indefiniteness, constant and variable coefficients demonstrate the preconditioner applicability and robustness. Furthermore, it is possible to control the number of iterations via the accuracy threshold of the hierarchical matrix approximations and their arithmetic operations, and the tuning of the admissibility condition parameter. Together, these parameters allow for optimization of the memory requirements and performance of the preconditioner.Comment: 24 pages, Elsevier Journal of Computational and Applied Mathematics, Dec 201

arXiv.org e-Print Archive

eScholarship - University of California

An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling

Author: Ghysels Pieter
Li Xiaoye S.
Napov Artem
Rouet Francois-Henry
Williams Samuel
Publication venue
Publication date: 25/02/2015
Field of study

We present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination, and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups up to 7 fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK -- STRUctured Matrices PACKage, which also has a distributed memory component for dense rank-structured matrices

arXiv.org e-Print Archive

eScholarship - University of California

DI-fusion

A distributed-memory package for dense Hierarchically Semi-Separable matrix computations using randomization

Author: Ghysels Pieter
Li Xiaoye S.
Napov Artem
Rouet François-Henry
Publication venue
Publication date: 26/06/2015
Field of study

We present a distributed-memory library for computations with dense structured matrices. A matrix is considered structured if its off-diagonal blocks can be approximated by a rank-deficient matrix with low numerical rank. Here, we use Hierarchically Semi-Separable representations (HSS). Such matrices appear in many applications, e.g., finite element methods, boundary element methods, etc. Exploiting this structure allows for fast solution of linear systems and/or fast computation of matrix-vector products, which are the two main building blocks of matrix computations. The compression algorithm that we use, that computes the HSS form of an input dense matrix, relies on randomized sampling with a novel adaptive sampling mechanism. We discuss the parallelization of this algorithm and also present the parallelization of structured matrix-vector product, structured factorization and solution routines. The efficiency of the approach is demonstrated on large problems from different academic and industrial applications, on up to 8,000 cores. This work is part of a more global effort, the STRUMPACK (STRUctured Matrices PACKage) software package for computations with sparse and dense structured matrices. Hence, although useful on their own right, the routines also represent a step in the direction of a distributed-memory sparse solver

arXiv.org e-Print Archive

eScholarship - University of California

DI-fusion

A Direct Elliptic Solver Based on Hierarchically Low-rank Schur Complements

Author: A. Aminfar
B.L. Buzbee
I. Ibragimov
J. Xia
J. Xia
J. Xia
L. Grasedyck
P. Amestoy
P. Swarztrauber
P.G. Schmitz
P.G. Schmitz
R.W. Hockney
S. Ambikasaran
S. Chandrasekaran
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 03/04/2016
Field of study

A parallel fast direct solver for rank-compressible block tridiagonal linear systems is presented. Algorithmic synergies between Cyclic Reduction and Hierarchical matrix arithmetic operations result in a solver with

O(N \log^2 N)

arithmetic complexity and

O(N \log N)

memory footprint. We provide a baseline for performance and applicability by comparing with well known implementations of the

\mathcal{H}

-LU factorization and algebraic multigrid with a parallel implementation that leverages the concurrency features of the method. Numerical experiments reveal that this method is comparable with other fast direct solvers based on Hierarchical Matrices such as

\mathcal{H}

-LU and that it can tackle problems where algebraic multigrid fails to converge

arXiv.org e-Print Archive

Crossref

Recommended from our members

Preparing sparse solvers for exascale computing.

Author: Anzt Hartwig
Boman Erik
Curfman McInnes Lois
Falgout Rob
Ghysels Pieter
Heroux Michael
Li Xiaoye
Meier Yang Ulrike
Rajamanickam Sivasankaran
Rupp Karl
Smith Barry
Tran Mills Richard
Yamazaki Ichitaro
Publication venue: eScholarship, University of California
Publication date: 01/03/2020
Field of study

Sparse solvers provide essential functionality for a wide variety of scientific applications. Highly parallel sparse solvers are essential for continuing advances in high-fidelity, multi-physics and multi-scale simulations, especially as we target exascale platforms. This paper describes the challenges, strategies and progress of the US Department of Energy Exascale Computing project towards providing sparse solvers for exascale computing platforms. We address the demands of systems with thousands of high-performance node devices where exposing concurrency, hiding latency and creating alternative algorithms become essential. The efforts described here are works in progress, highlighting current success and upcoming challenges. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'

eScholarship - University of California

Hierarchical interpolative factorization for elliptic operators: differential equations

Author: Ho Kenneth L.
Ying Lexing
Publication venue
Publication date: 20/04/2015
Field of study

This paper introduces the hierarchical interpolative factorization for elliptic partial differential equations (HIF-DE) in two (2D) and three dimensions (3D). This factorization takes the form of an approximate generalized LU/LDL decomposition that facilitates the efficient inversion of the discretized operator. HIF-DE is based on the multifrontal method but uses skeletonization on the separator fronts to sparsify the dense frontal matrices and thus reduce the cost. We conjecture that this strategy yields linear complexity in 2D and quasilinear complexity in 3D. Estimated linear complexity in 3D can be achieved by skeletonizing the compressed fronts themselves, which amounts geometrically to a recursive dimensional reduction scheme. Numerical experiments support our claims and further demonstrate the performance of our algorithm as a fast direct solver and preconditioner. MATLAB codes are freely available.Comment: 37 pages, 13 figures, 12 tables; to appear, Comm. Pure Appl. Math. arXiv admin note: substantial text overlap with arXiv:1307.266

arXiv.org e-Print Archive

CiteSeerX

Recent advances in sparse direct solvers

Author: Agullo Emmanuel
Amestoy Patrick
Buttari Alfredo
Guermouche Abdou
Joslin Guillaume
L'Excellent Jean-Yves
Li Xiaoye,
Napov Artem
Rouet François-Henry
Sid-Lakhdar Wissam M.
Wang Shen
Weisbecker Clement
Yamazaki Ichitaro
Publication venue: HAL CCSD
Publication date: 01/01/2013
Field of study

International audienceDirect methods for the solution of sparse systems of linear equations of the form A x = b are used in a wide range of numerical simulation applications. Such methods are based on the decomposition of the matrix into a product of triangular factors (e.g., A = L U ), followed by triangular solves. They are known for their numerical accuracy and robustness but are also characterized by a high memory consumption and a large amount of computations. Here we survey some research directions that are being investigated by the sparse direct solver community to alleviate these issues: memory-aware scheduling techniques, low-rank approximations, and distributed/shared memory hybrid programming

HAL-ENS-LYON

Scientific Publications of the University of Toulouse II Le Mirail

INRIA a CCSD electronic archive server

Open Archive Toulouse Archive Ouverte

DI-fusion

Hal-Diderot

Recommended from our members

Parallel accelerated cyclic reduction preconditioner for three-dimensional elliptic PDEs with variable coefficients

Author: Chávez G
Keyes D
Turkiyyah G
Zampini S
Publication venue: eScholarship, University of California
Publication date: 15/12/2018
Field of study

eScholarship - University of California