2,334 research outputs found

An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling

Author: Ghysels Pieter
Li Xiaoye S.
Napov Artem
Rouet Francois-Henry
Williams Samuel
Publication date: 25/02/2015

We present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups of up to 7-fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared-memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK -- STRUctured Matrices PACKage, which also has a distributed-memory component for dense rank-structured matrices.
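The randomized compression mentioned in this abstract can be illustrated with a minimal sketch. It is not STRUMPACK code and makes several simplifying assumptions: it compresses a single dense block rather than a full HSS hierarchy, it uses an orthonormal basis of the sampled range instead of the interpolative decomposition the authors describe, and all function names (`matmul`, `orthonormalize`, `randomized_low_rank`) are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def orthonormalize(vectors, rel_tol=1e-8):
    """Modified Gram-Schmidt; vectors that are numerically dependent are dropped."""
    basis = []
    for v in vectors:
        w = list(v)
        start_norm = sum(x * x for x in w) ** 0.5
        for q in basis:
            dot = sum(x * y for x, y in zip(q, w))
            w = [x - dot * y for x, y in zip(w, q)]
        norm = sum(x * x for x in w) ** 0.5
        if start_norm > 0.0 and norm > rel_tol * start_norm:
            basis.append([x / norm for x in w])
    return basis


def randomized_low_rank(block, rank, oversample=4, seed=0):
    """Return (q, b) with block ~= q @ b, built from random samples of the range."""
    rng = random.Random(seed)
    n = len(block[0])
    # Gaussian test matrix with a few extra columns (oversampling).
    omega = [[rng.gauss(0.0, 1.0) for _ in range(rank + oversample)] for _ in range(n)]
    sample = matmul(block, omega)              # columns span the range of block
    qt = orthonormalize(transpose(sample))     # rows of Q^t
    b = matmul(qt, block)                      # project the block onto the basis
    return transpose(qt), b
```

Only products of the block with random vectors are needed to find the basis, which is what makes sampling attractive when the block is available as an operator rather than as explicit entries.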

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

DI-fusion

Sweeping Preconditioner for the Helmholtz Equation: Moving Perfectly Matched Layers

Author: Engquist Björn
Ying Lexing
Publication date: 01/01/2010

This paper introduces a new sweeping preconditioner for the iterative solution of the variable coefficient Helmholtz equation in two and three dimensions. The algorithms follow the general structure of constructing an approximate $LDL^t$ factorization by eliminating the unknowns layer by layer starting from an absorbing layer or boundary condition. The central idea of this paper is to approximate the Schur complement matrices of the factorization using moving perfectly matched layers (PMLs) introduced in the interior of the domain. Applying each Schur complement matrix is equivalent to solving a quasi-1D problem with a banded LU factorization in the 2D case and to solving a quasi-2D problem with a multifrontal method in the 3D case. The resulting preconditioner has linear application cost and the preconditioned iterative solver converges in a number of iterations that is essentially independent of the number of unknowns or the frequency. Numerical results are presented in both two and three dimensions to demonstrate the efficiency of this new preconditioner. Comment: 25 pages
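The layer-by-layer elimination structure described in this abstract can be sketched on the simplest possible case, a symmetric tridiagonal matrix where each "layer" is a single unknown. This is an assumption-laden illustration only: the factorization below is exact, whereas the paper approximates the Schur complements with moving PMLs, which is not modeled here. The function names are hypothetical.

```python
def ldlt_tridiagonal(diag, off):
    """Factor a symmetric tridiagonal matrix as L D L^t, one unknown at a time.

    diag: main diagonal (length n); off: sub/super-diagonal (length n - 1).
    Returns the pivots d (the successive Schur complements) and the
    sub-diagonal multipliers l of the unit lower-bidiagonal factor L.
    """
    n = len(diag)
    d = [0.0] * n
    l = [0.0] * (n - 1)
    d[0] = diag[0]
    for k in range(n - 1):
        l[k] = off[k] / d[k]
        # Schur complement of the next layer after eliminating layer k.
        d[k + 1] = diag[k + 1] - l[k] * off[k]
    return d, l


def solve_ldlt(d, l, rhs):
    """Solve (L D L^t) x = rhs by a forward sweep, a diagonal solve, a backward sweep."""
    n = len(d)
    y = list(rhs)
    for k in range(1, n):                 # forward sweep: L y = rhs
        y[k] -= l[k - 1] * y[k - 1]
    x = [y[k] / d[k] for k in range(n)]   # diagonal solve: D z = y
    for k in range(n - 2, -1, -1):        # backward sweep: L^t x = z
        x[k] -= l[k] * x[k + 1]
    return x
```

For example, with `diag = [2, 2, 2]`, `off = [-1, -1]` and right-hand side `[0, 0, 4]`, the two sweeps recover the solution `[1, 2, 3]` up to rounding.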

arXiv.org e-Print Archive

CiteSeerX

Parallel Computation of Finite Element Navier-Stokes codes using MUMPS Solver

Author: Raju Mandhapati P.
Publication venue: International Journal of Computer Science Issues, IJCSI
Publication date: 01/01/2009

The study deals with the parallelization of 2D and 3D finite element based Navier-Stokes codes using direct solvers. Development of sparse direct solvers using multifrontal solvers has significantly reduced the computational time of direct solution methods. Although limited by their stringent memory requirements, multifrontal solvers can be computationally efficient. First, the performance of the MUltifrontal Massively Parallel Solver (MUMPS) is evaluated for both 2D and 3D codes in terms of memory requirements and CPU times. The scalability of both Newton and modified Newton algorithms is tested.

arXiv.org e-Print Archive

CiteSeerX

CogPrints Cognitive Sciences Eprint Archive

Using a multifrontal sparse solver in a high performance, finite element code

Author: King Scott D.
Lucas Robert
Raefsky Arthur

We consider the performance of the finite element method on a vector supercomputer. The computationally intensive parts of the finite element method are typically the individual element forms and the solution of the global stiffness matrix, both of which are vectorized in high performance codes. To further increase throughput, new algorithms are needed. We compare a multifrontal sparse solver to a traditional skyline solver in a finite element code on a vector supercomputer. The multifrontal solver uses the Multiple-Minimum Degree reordering heuristic to reduce the number of operations required to factor a sparse matrix and full matrix computational kernels (e.g., BLAS3) to enhance vector performance. The net result is an order-of-magnitude reduction in run time for a finite element application on one processor of a Cray X-MP.
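To show why a reordering heuristic reduces factorization work, here is a minimal sketch of a plain greedy minimum degree ordering on the adjacency graph of a sparse symmetric matrix. It is not the Multiple-Minimum Degree variant named in the abstract (which eliminates several independent vertices per step and uses more refined data structures), and the function names are hypothetical. The graph is assumed to have no self-loops.

```python
def count_fill(adjacency, order):
    """Simulate symmetric elimination in the given order and count fill-in edges."""
    graph = {v: set(nbrs) for v, nbrs in adjacency.items()}
    fill = 0
    for v in order:
        nbrs = sorted(graph.pop(v))
        for u in nbrs:
            graph[u].discard(v)
        # Eliminating v couples all of its remaining neighbours pairwise.
        for i, a in enumerate(nbrs):
            for b in nbrs[i + 1:]:
                if b not in graph[a]:
                    graph[a].add(b)
                    graph[b].add(a)
                    fill += 1
    return fill


def minimum_degree_order(adjacency):
    """Greedy ordering: always eliminate a vertex of smallest current degree."""
    graph = {v: set(nbrs) for v, nbrs in adjacency.items()}
    order = []
    while graph:
        v = min(graph, key=lambda u: (len(graph[u]), u))  # ties broken by label
        nbrs = graph.pop(v)
        for u in nbrs:
            graph[u].discard(v)
            graph[u].update(nbrs - {u})   # neighbours of v become a clique
        order.append(v)
    return order
```

On a star-shaped graph (one hub connected to four leaves), eliminating the hub first creates six fill-in edges, while the greedy ordering eliminates leaves first and creates none.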

NASA Technical Reports Server

A rapidly converging domain decomposition method for the Helmholtz equation

Author: Christiaan C. Stolk
Publication venue: Elsevier BV
Publication date: 01/01/2013

A new domain decomposition method is introduced for the heterogeneous 2-D and 3-D Helmholtz equations. Transmission conditions based on the perfectly matched layer (PML) are derived that avoid artificial reflections and match incoming and outgoing waves at the subdomain interfaces. We focus on a subdivision of the rectangular domain into many thin subdomains along one of the axes, in combination with a certain ordering for solving the subdomain problems and a GMRES outer iteration. When combined with multifrontal methods, the solver has near-linear cost in examples, due to very small iteration numbers that are essentially independent of problem size and number of subdomains. It is, to our knowledge, only the second method with this property, next to the moving PML sweeping method. Comment: 16 pages, 3 figures, 6 tables - v2 accepted for publication in the Journal of Computational Physics

arXiv.org e-Print Archive

Crossref

International Migration, Integration and Social Cohesion online publications

Domain Decomposition Based High Performance Parallel Computing

Author: Khaitan Siddhartha
Raju Mandhapati P.
Publication venue: International Journal of Computer Science Issues, IJCSI
Publication date: 01/10/2009

The study deals with the parallelization of finite element based Navier-Stokes codes using domain decomposition and state-of-the-art sparse direct solvers. There has been significant improvement in the performance of sparse direct solvers. Parallel sparse direct solvers are not found to exhibit good scalability. Hence, the parallelization of sparse direct solvers is done using domain decomposition techniques. A highly efficient sparse direct solver, PARDISO, is used in this study. The scalability of both Newton and modified Newton algorithms is tested.

arXiv.org e-Print Archive

CogPrints Cognitive Sciences Eprint Archive

Adaptive BDDC in Three Dimensions

Author: Jan Mandel
Bedřich Sousedík
Jakub Šístek
Publication venue: Elsevier BV
Publication date: 28/02/2011

The adaptive BDDC method is extended to the selection of face constraints in three dimensions. A new implementation of the BDDC method is presented based on a global formulation without an explicit coarse problem, with massive parallelism provided by a multifrontal solver. Constraints are implemented by a projection and sparsity of the projected operator is preserved by a generalized change of variables. The effectiveness of the method is illustrated on several engineering problems. Comment: 28 pages, 9 figures, 9 tables

arXiv.org e-Print Archive

Crossref

Computational complexity and memory usage for multi-frontal direct solvers in structured mesh finite elements

Author: Calo Victor M.
Collier Nathan
Pardo David
Paszynski Maciej
Publication date: 01/01/2012

The multi-frontal direct solver is the state-of-the-art algorithm for the direct solution of sparse linear systems. This paper provides computational complexity and memory usage estimates for the application of the multi-frontal direct solver algorithm on linear systems resulting from B-spline-based isogeometric finite elements, where the mesh is a structured grid. Specifically, we provide the estimates for systems resulting from $C^{p-1}$ polynomial B-spline spaces and compare them to those obtained using $C^0$ spaces. Comment: 8 pages, 2 figures

arXiv.org e-Print Archive

espace@Curtin