792 research outputs found

    A rapidly converging domain decomposition method for the Helmholtz equation

    Full text link
    A new domain decomposition method is introduced for the heterogeneous 2-D and 3-D Helmholtz equations. Transmission conditions based on the perfectly matched layer (PML) are derived that avoid artificial reflections and match incoming and outgoing waves at the subdomain interfaces. We focus on a subdivision of the rectangular domain into many thin subdomains along one of the axes, in combination with a certain ordering for solving the subdomain problems and a GMRES outer iteration. When combined with multifrontal methods, the solver has near-linear cost in examples, due to very small iteration numbers that are essentially independent of problem size and number of subdomains. It is to our knowledge only the second method with this property next to the moving PML sweeping method.Comment: 16 pages, 3 figures, 6 tables - v2 accepted for publication in the Journal of Computational Physic

    Using a multifrontal sparse solver in a high performance, finite element code

    Get PDF
    We consider the performance of the finite element method on a vector supercomputer. The computationally intensive parts of the finite element method are typically the individual element forms and the solution of the global stiffness matrix both of which are vectorized in high performance codes. To further increase throughput, new algorithms are needed. We compare a multifrontal sparse solver to a traditional skyline solver in a finite element code on a vector supercomputer. The multifrontal solver uses the Multiple-Minimum Degree reordering heuristic to reduce the number of operations required to factor a sparse matrix and full matrix computational kernels (e.g., BLAS3) to enhance vector performance. The net result in an order-of-magnitude reduction in run time for a finite element application on one processor of a Cray X-MP

    Sweeping Preconditioner for the Helmholtz Equation: Moving Perfectly Matched Layers

    Full text link
    This paper introduces a new sweeping preconditioner for the iterative solution of the variable coefficient Helmholtz equation in two and three dimensions. The algorithms follow the general structure of constructing an approximate LDLtLDL^t factorization by eliminating the unknowns layer by layer starting from an absorbing layer or boundary condition. The central idea of this paper is to approximate the Schur complement matrices of the factorization using moving perfectly matched layers (PMLs) introduced in the interior of the domain. Applying each Schur complement matrix is equivalent to solving a quasi-1D problem with a banded LU factorization in the 2D case and to solving a quasi-2D problem with a multifrontal method in the 3D case. The resulting preconditioner has linear application cost and the preconditioned iterative solver converges in a number of iterations that is essentially indefinite of the number of unknowns or the frequency. Numerical results are presented in both two and three dimensions to demonstrate the efficiency of this new preconditioner.Comment: 25 page

    High-performance direct solution of finite element problems on multi-core processors

    Get PDF
    A direct solution procedure is proposed and developed which exploits the parallelism that exists in current symmetric multiprocessing (SMP) multi-core processors. Several algorithms are proposed and developed to improve the performance of the direct solution of FE problems. A high-performance sparse direct solver is developed which allows experimentation with the newly developed and existing algorithms. The performance of the algorithms is investigated using a large set of FE problems. Furthermore, operation count estimations are developed to further assess various algorithms. An out-of-core version of the solver is developed to reduce the memory requirements for the solution. I/O is performed asynchronously without blocking the thread that makes the I/O request. Asynchronous I/O allows overlapping factorization and triangular solution computations with I/O. The performance of the developed solver is demonstrated on a large number of test problems. A problem with nearly 10 million degree of freedoms is solved on a low price desktop computer using the out-of-core version of the direct solver. Furthermore, the developed solver usually outperforms a commonly used shared memory solver.Ph.D.Committee Chair: Will, Kenneth; Committee Member: Emkin, Leroy; Committee Member: Kurc, Ozgur; Committee Member: Vuduc, Richard; Committee Member: White, Donal
    • …
    corecore