1,412 research outputs found

Parallel accelerated cyclic reduction preconditioner for three-dimensional elliptic PDEs with variable coefficients

Author: Chávez Gustavo
Keyes David
Turkiyyah George
Zampini Stefano
Publication venue: 'Elsevier BV'
Publication date: 23/12/2017

We present a robust and scalable preconditioner for the solution of large-scale linear systems that arise from the discretization of elliptic PDEs amenable to rank compression. The preconditioner is based on hierarchical low-rank approximations and the cyclic reduction method. The setup and application phases of the preconditioner achieve log-linear complexity in memory footprint and number of operations, and numerical experiments exhibit good weak and strong scalability at large processor counts in a distributed memory environment. Numerical experiments with linear systems that feature symmetry and nonsymmetry, definiteness and indefiniteness, constant and variable coefficients demonstrate the preconditioner's applicability and robustness. Furthermore, it is possible to control the number of iterations via the accuracy threshold of the hierarchical matrix approximations and their arithmetic operations, and the tuning of the admissibility condition parameter. Together, these parameters allow for optimization of the memory requirements and performance of the preconditioner. Comment: 24 pages, Elsevier Journal of Computational and Applied Mathematics, Dec 201

arXiv.org e-Print Archive

eScholarship - University of California

An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling

Author: Ghysels Pieter
Li Xiaoye S.
Napov Artem
Rouet Francois-Henry
Williams Samuel
Publication date: 25/02/2015

We present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination, and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups of up to 7-fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK -- STRUctured Matrices PACKage, which also has a distributed memory component for dense rank-structured matrices.

arXiv.org e-Print Archive

eScholarship - University of California

DI-fusion

Preparing sparse solvers for exascale computing.

Author: Anzt Hartwig
Boman Erik
Curfman McInnes Lois
Falgout Rob
Ghysels Pieter
Heroux Michael
Li Xiaoye
Meier Yang Ulrike
Rajamanickam Sivasankaran
Rupp Karl
Smith Barry
Tran Mills Richard
Yamazaki Ichitaro
Publication venue: eScholarship, University of California
Publication date: 01/03/2020

Sparse solvers provide essential functionality for a wide variety of scientific applications. Highly parallel sparse solvers are essential for continuing advances in high-fidelity, multi-physics and multi-scale simulations, especially as we target exascale platforms. This paper describes the challenges, strategies and progress of the US Department of Energy Exascale Computing project towards providing sparse solvers for exascale computing platforms. We address the demands of systems with thousands of high-performance node devices where exposing concurrency, hiding latency and creating alternative algorithms become essential. The efforts described here are works in progress, highlighting current success and upcoming challenges. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'.

eScholarship - University of California

Frequency-Domain Numerical Modelling of Visco-Acoustic Waves Based on Finite-Difference and Finite-Element Discontinuous Galerkin Methods

Author: Jean Virieux
Romain Brossier
Stephane Operto
Vincent Etienne
Publication venue: 'IntechOpen'
Publication date: 28/09/2010

IntechOpen

Parallel accelerated cyclic reduction preconditioner for three-dimensional elliptic PDEs with variable coefficients

Author: Chávez G
Keyes D
Turkiyyah G
Zampini S
Publication venue: eScholarship, University of California
Publication date: 15/12/2018

eScholarship - University of California

HiFlow3 - A Flexible and Hardware-Aware Parallel Finite Element Package

Author: Anzt Hartwig
Augustin Werner
Baumann Martin
Bockelmann Hendryk
Gengenbach Thomas
Hahn Tobias
Heuveline Vincent
Ketelaer Eva
Lukarski Dimitar
Otzen Andrea
Ritterbusch Sebastian
Rocker Björn
Ronnas Staffan
Schick Michael
Subramanian Chandramowli
Weiss Jan-Philipp
Wilhelm Florian
Publication venue: Karlsruher Institut für Technologie
Publication date: 01/01/2010

KITopen

High-performance Parallel Solver for Integral Equations of Electromagnetics Based on Galerkin Method

Author: Bloshanskaya Lidia
Kruglyakov Mikhail
Publication date: 07/01/2017

A new parallel solver for the volumetric integral equations (IE) of electrodynamics is presented. The solver is based on the Galerkin method, which ensures a convergent numerical solution. The main features include: (i) memory usage 8 times lower than that of analogous IE-based algorithms, without additional restrictions on the background media; (ii) an accurate and stable method to compute the matrix coefficients corresponding to the IE; (iii) a high degree of parallelism. The solver's computational efficiency is shown on a problem of magnetotelluric sounding of high conductivity contrast media. Good agreement with the results obtained with the second-order finite element method is demonstrated. Due to an effective approach to parallelization and distributed data storage, the program exhibits perfect scalability on different hardware platforms. Comment: The main results of this paper were presented at the IAMG 2015 conference, Freiberg, Germany. 28 pages, 11 figure

arXiv.org e-Print Archive

Repository for Publications and Research Data