A rapidly converging domain decomposition method for the Helmholtz equation
A new domain decomposition method is introduced for the heterogeneous 2-D and
3-D Helmholtz equations. Transmission conditions based on the perfectly matched
layer (PML) are derived that avoid artificial reflections and match incoming
and outgoing waves at the subdomain interfaces. We focus on a subdivision of
the rectangular domain into many thin subdomains along one of the axes, in
combination with a certain ordering for solving the subdomain problems and a
GMRES outer iteration. When combined with multifrontal methods, the solver has
near-linear cost in examples, due to very small iteration numbers that are
essentially independent of problem size and number of subdomains. It is to our
knowledge only the second method with this property next to the moving PML
sweeping method.
Comment: 16 pages, 3 figures, 6 tables - v2 accepted for publication in the
Journal of Computational Physics
Using a multifrontal sparse solver in a high performance, finite element code
We consider the performance of the finite element method on a vector supercomputer. The computationally intensive parts of the finite element method are typically the formation of the individual elements and the solution of the global stiffness matrix, both of which are vectorized in high-performance codes. To further increase throughput, new algorithms are needed. We compare a multifrontal sparse solver to a traditional skyline solver in a finite element code on a vector supercomputer. The multifrontal solver uses the Multiple-Minimum Degree reordering heuristic to reduce the number of operations required to factor a sparse matrix, and full-matrix computational kernels (e.g., BLAS3) to enhance vector performance. The net result is an order-of-magnitude reduction in run time for a finite element application on one processor of a Cray X-MP.
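As a rough illustration of why a fill-reducing ordering lowers factorization cost, the sketch below counts the fill-in edges created by symbolic elimination on a small "arrow" matrix graph. It uses a plain greedy minimum-degree ordering, not the Multiple-Minimum Degree variant named in the abstract, and every name in it (`fill_in`, `minimum_degree_order`, `arrow_graph`) is hypothetical and not taken from the paper.

```python
from itertools import combinations


def fill_in(adj, order):
    """Count fill edges created when eliminating vertices in `order`."""
    g = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    fill = 0
    for v in order:
        nbrs = g.pop(v)
        for u in nbrs:
            g[u].discard(v)
        # Eliminating v turns its remaining neighbours into a clique.
        for a, b in combinations(sorted(nbrs), 2):
            if b not in g[a]:
                g[a].add(b)
                g[b].add(a)
                fill += 1
    return fill


def minimum_degree_order(adj):
    """Greedy ordering: always eliminate a vertex of smallest current degree."""
    g = {v: set(nbrs) for v, nbrs in adj.items()}
    order = []
    while g:
        v = min(g, key=lambda x: (len(g[x]), x))  # ties broken by vertex id
        nbrs = g.pop(v)
        for u in nbrs:
            g[u].discard(v)
        for a, b in combinations(nbrs, 2):
            g[a].add(b)
            g[b].add(a)
        order.append(v)
    return order


def arrow_graph(n):
    """Graph of an n-by-n arrow matrix: vertex 0 is coupled to all others."""
    adj = {i: set() for i in range(n)}
    for i in range(1, n):
        adj[0].add(i)
        adj[i].add(0)
    return adj


if __name__ == "__main__":
    adj = arrow_graph(6)
    print("natural order fill:", fill_in(adj, list(range(6))))
    print("minimum-degree fill:", fill_in(adj, minimum_degree_order(adj)))
```

Eliminating the densely coupled vertex first makes all remaining vertices mutually connected, while eliminating the low-degree vertices first creates no fill for this graph; fewer fill entries means fewer operations in the numeric factorization.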
Sweeping Preconditioner for the Helmholtz Equation: Moving Perfectly Matched Layers
This paper introduces a new sweeping preconditioner for the iterative
solution of the variable coefficient Helmholtz equation in two and three
dimensions. The algorithms follow the general structure of constructing an
approximate factorization by eliminating the unknowns layer by layer
starting from an absorbing layer or boundary condition. The central idea of
this paper is to approximate the Schur complement matrices of the factorization
using moving perfectly matched layers (PMLs) introduced in the interior of the
domain. Applying each Schur complement matrix is equivalent to solving a
quasi-1D problem with a banded LU factorization in the 2D case and to solving a
quasi-2D problem with a multifrontal method in the 3D case. The resulting
preconditioner has linear application cost and the preconditioned iterative
solver converges in a number of iterations that is essentially independent of
the number of unknowns or the frequency. Numerical results are presented in
both two and three dimensions to demonstrate the efficiency of this new
preconditioner.
Comment: 25 pages
High-performance direct solution of finite element problems on multi-core processors
A direct solution procedure is proposed and developed which exploits the parallelism that exists in current symmetric multiprocessing (SMP) multi-core processors. Several algorithms are proposed and developed to improve the performance of the direct solution of FE problems. A high-performance sparse direct solver is developed which allows experimentation with the newly developed and existing algorithms. The performance of the algorithms is investigated using a large set of FE problems. Furthermore, operation count estimates are developed to further assess the various algorithms. An out-of-core version of the solver is developed to reduce the memory requirements for the solution. I/O is performed asynchronously, without blocking the thread that makes the I/O request. Asynchronous I/O allows factorization and triangular solution computations to overlap with I/O. The performance of the developed solver is demonstrated on a large number of test problems. A problem with nearly 10 million degrees of freedom is solved on a low-priced desktop computer using the out-of-core version of the direct solver. Furthermore, the developed solver usually outperforms a commonly used shared-memory solver.
Ph.D.
Committee Chair: Will, Kenneth; Committee Member: Emkin, Leroy; Committee Member: Kurc, Ozgur; Committee Member: Vuduc, Richard; Committee Member: White, Donal