An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling
We present a sparse linear system solver that is based on a multifrontal
variant of Gaussian elimination, and exploits low-rank approximation of the
resulting dense frontal matrices. We use hierarchically semiseparable (HSS)
matrices, which have low-rank off-diagonal blocks, to approximate the frontal
matrices. For HSS matrix construction, a randomized sampling algorithm is used
together with interpolative decompositions. The combination of the randomized
compression with a fast ULV HSS factorization leads to a solver with lower
computational complexity than the standard multifrontal method for many
applications, resulting in speedups of up to 7-fold for problems in our test
suite. The implementation targets many-core systems by using task parallelism
with dynamic runtime scheduling. Numerical experiments show performance
improvements over state-of-the-art sparse direct solvers. The implementation
achieves high performance and good scalability on a range of modern shared
memory parallel systems, including the Intel Xeon Phi (MIC). The code is part
of a software package called STRUMPACK -- STRUctured Matrices PACKage, which
also has a distributed memory component for dense rank-structured matrices.
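The randomized compression step mentioned in this abstract can be illustrated with a minimal sketch. It is not STRUMPACK code: it samples a low-rank block with a Gaussian random matrix and orthonormalizes the samples with Gram-Schmidt, where the paper uses interpolative decompositions. The names `randomized_low_rank` and `num_samples`, and the rank-2 test block, are assumptions made for illustration.

```python
import random

def matmul(A, B):
    """Dense product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def orthonormal_columns(Y, tol=1e-10):
    """Modified Gram-Schmidt on the columns of Y; dependent columns are dropped."""
    basis = []
    for col in transpose(Y):
        v = list(col)
        norm0 = sum(x * x for x in v) ** 0.5
        for _ in range(2):  # second pass improves orthogonality
            for q in basis:
                dot = sum(a * b for a, b in zip(q, v))
                v = [a - dot * b for a, b in zip(v, q)]
        norm = sum(x * x for x in v) ** 0.5
        if norm0 > 0 and norm > tol * norm0:
            basis.append([x / norm for x in v])
    return transpose(basis)

def randomized_low_rank(A, num_samples, seed=0):
    """Return Q, B with A ~= Q B, built from num_samples random samples of A."""
    rng = random.Random(seed)
    n_cols = len(A[0])
    # Gaussian sampling matrix (hypothetical choice; any dense random matrix works here)
    omega = [[rng.gauss(0.0, 1.0) for _ in range(num_samples)] for _ in range(n_cols)]
    Y = matmul(A, omega)            # sample the column space of A
    Q = orthonormal_columns(Y)      # orthonormal basis for the sampled space
    B = matmul(transpose(Q), A)     # project A onto that basis
    return Q, B

# Illustrative rank-2 block standing in for an off-diagonal block of a frontal matrix.
A = [[(i + 1) * (j + 1) + (-1) ** i * (j % 3) for j in range(8)] for i in range(8)]
Q, B = randomized_low_rank(A, num_samples=4)
approx = matmul(Q, B)
max_error = max(abs(approx[i][j] - A[i][j]) for i in range(8) for j in range(8))
```

Only products of the block with random vectors are needed to find the basis, which is why sampling suits settings where the block is available mainly through matrix-vector products.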
Sweeping Preconditioner for the Helmholtz Equation: Moving Perfectly Matched Layers
This paper introduces a new sweeping preconditioner for the iterative
solution of the variable coefficient Helmholtz equation in two and three
dimensions. The algorithms follow the general structure of constructing an
approximate factorization by eliminating the unknowns layer by layer
starting from an absorbing layer or boundary condition. The central idea of
this paper is to approximate the Schur complement matrices of the factorization
using moving perfectly matched layers (PMLs) introduced in the interior of the
domain. Applying each Schur complement matrix is equivalent to solving a
quasi-1D problem with a banded LU factorization in the 2D case and to solving a
quasi-2D problem with a multifrontal method in the 3D case. The resulting
preconditioner has linear application cost and the preconditioned iterative
solver converges in a number of iterations that is essentially independent of
the number of unknowns or the frequency. Numerical results are presented in
both two and three dimensions to demonstrate the efficiency of this new
preconditioner. Comment: 25 pages
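The layer-by-layer elimination described in this abstract can be sketched in one dimension, where each "layer" is a single grid point and each Schur complement is a scalar. This sketch computes the Schur complements exactly rather than approximating them with moving PMLs as the paper does. The grid size, the complex-shifted wavenumber `k` (a stand-in for absorption) and the point source are all assumptions for illustration.

```python
def sweep_solve(diag, off, rhs):
    """Solve a symmetric tridiagonal system by sweeping through the layers.

    diag[m] is the diagonal entry of layer m, off[m] couples layers m and m+1.
    """
    n = len(diag)
    schur = [diag[0]]   # Schur complement of the first layer is its own block
    g = [rhs[0]]
    # Forward sweep: eliminate layer m-1 and update layer m.
    for m in range(1, n):
        factor = off[m - 1] / schur[m - 1]
        schur.append(diag[m] - factor * off[m - 1])
        g.append(rhs[m] - factor * g[m - 1])
    # Backward sweep: recover the solution from the last layer to the first.
    u = [0j] * n
    u[-1] = g[-1] / schur[-1]
    for m in range(n - 2, -1, -1):
        u[m] = (g[m] - off[m] * u[m + 1]) / schur[m]
    return u

# Hypothetical 1D Helmholtz model problem: -u'' - k^2 u = f on (0, 1), u = 0 at both ends.
n = 50
h = 1.0 / (n + 1)
k = 10.0 * (1.0 + 0.05j)              # small imaginary part mimics absorption (assumption)
diag = [2.0 / h**2 - k**2] * n
off = [-1.0 / h**2] * (n - 1)
rhs = [0j] * n
rhs[n // 2] = 1.0 / h                 # discrete point source in the middle

u = sweep_solve(diag, off, rhs)

# Residual check of the computed solution against the tridiagonal operator.
residual = []
for m in range(n):
    r = diag[m] * u[m] - rhs[m]
    if m > 0:
        r += off[m - 1] * u[m - 1]
    if m < n - 1:
        r += off[m] * u[m + 1]
    residual.append(abs(r))
max_residual = max(residual)
```

In two and three dimensions each scalar in `schur` becomes a matrix acting on a whole layer of unknowns, which is the object the abstract proposes to approximate.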
Parallel Computation of Finite Element Navier-Stokes codes using MUMPS Solver
The study deals with the parallelization of 2D and 3D finite element based Navier-Stokes codes using direct solvers. The development of sparse direct solvers based on multifrontal methods has significantly reduced the computational time of direct solution methods. Although limited by their stringent memory requirements, multifrontal solvers can be computationally efficient. First, the performance of the MUltifrontal Massively Parallel Solver (MUMPS) is evaluated for both 2D and 3D codes in terms of memory requirements and CPU times. The scalability of both Newton and modified Newton algorithms is tested.
Using a multifrontal sparse solver in a high performance, finite element code
We consider the performance of the finite element method on a vector supercomputer. The computationally intensive parts of the finite element method are typically the individual element forms and the solution of the global stiffness matrix, both of which are vectorized in high performance codes. To further increase throughput, new algorithms are needed. We compare a multifrontal sparse solver to a traditional skyline solver in a finite element code on a vector supercomputer. The multifrontal solver uses the Multiple-Minimum Degree reordering heuristic to reduce the number of operations required to factor a sparse matrix and full matrix computational kernels (e.g., BLAS3) to enhance vector performance. The net result is an order-of-magnitude reduction in run time for a finite element application on one processor of a Cray X-MP.
A rapidly converging domain decomposition method for the Helmholtz equation
A new domain decomposition method is introduced for the heterogeneous 2-D and
3-D Helmholtz equations. Transmission conditions based on the perfectly matched
layer (PML) are derived that avoid artificial reflections and match incoming
and outgoing waves at the subdomain interfaces. We focus on a subdivision of
the rectangular domain into many thin subdomains along one of the axes, in
combination with a certain ordering for solving the subdomain problems and a
GMRES outer iteration. When combined with multifrontal methods, the solver has
near-linear cost in examples, due to very small iteration numbers that are
essentially independent of problem size and number of subdomains. It is to our
knowledge only the second method with this property next to the moving PML
sweeping method. Comment: 16 pages, 3 figures, 6 tables - v2 accepted for publication in the
Journal of Computational Physics
Domain Decomposition Based High Performance Parallel Computing
The study deals with the parallelization of finite element based Navier-Stokes codes using domain decomposition and state-of-the-art sparse direct solvers. There has been significant improvement in the performance of sparse direct solvers. Parallel sparse direct solvers are not found to exhibit good scalability. Hence, the parallelization of sparse direct solvers is done using domain decomposition techniques. A highly efficient sparse direct solver, PARDISO, is used in this study. The scalability of both Newton and modified Newton algorithms is tested.
Adaptive BDDC in Three Dimensions
The adaptive BDDC method is extended to the selection of face constraints in
three dimensions. A new implementation of the BDDC method is presented based on
a global formulation without an explicit coarse problem, with massive
parallelism provided by a multifrontal solver. Constraints are implemented by a
projection and sparsity of the projected operator is preserved by a generalized
change of variables. The effectiveness of the method is illustrated on several
engineering problems. Comment: 28 pages, 9 figures, 9 tables
Computational complexity and memory usage for multi-frontal direct solvers in structured mesh finite elements
The multi-frontal direct solver is the state-of-the-art algorithm for the
direct solution of sparse linear systems. This paper provides computational
complexity and memory usage estimates for the application of the multi-frontal
direct solver algorithm on linear systems resulting from B-spline-based
isogeometric finite elements, where the mesh is a structured grid. Specifically
we provide the estimates for systems resulting from polynomial
B-spline spaces and compare them to those obtained using spaces. Comment: 8 pages, 2 figures