1,643 research outputs found

Domain Decomposition Based High Performance Parallel Computing

Author: Khaitan Siddhartha
Raju Mandhapati P.
Publication venue: International Journal of Computer Science Issues, IJCSI
Publication date: 01/10/2009

The study deals with the parallelization of finite element based Navier-Stokes codes using domain decomposition and state-of-the-art sparse direct solvers. Although the performance of sparse direct solvers has improved significantly, parallel sparse direct solvers are not found to exhibit good scalability. Hence, the parallelization of sparse direct solvers is done using domain decomposition techniques. The highly efficient sparse direct solver PARDISO is used in this study. The scalability of both Newton and modified Newton algorithms is tested.
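
The abstract does not define the two nonlinear iterations it compares. As a rough sketch only, assuming "modified Newton" means reusing a previously computed Jacobian for several iterations instead of rebuilding it every step, the difference can be illustrated on a toy 2x2 system. The functions `residual`, `jacobian`, `solve2` and `newton`, and the parameter `refresh_every`, are hypothetical names chosen for this example; the system is not a Navier-Stokes discretization and the 2x2 solve merely stands in for a sparse direct solver.

```python
# Toy comparison of Newton and modified Newton (frozen Jacobian).
# Everything here is illustrative and not taken from the paper.

def residual(x, y):
    """Nonlinear system F(x, y) = 0: a circle intersected with a hyperbola."""
    return (x * x + y * y - 4.0, x * y - 1.0)


def jacobian(x, y):
    """Analytic Jacobian of `residual`."""
    return ((2.0 * x, 2.0 * y), (y, x))


def solve2(J, r):
    """Solve the 2x2 linear system J d = r by Cramer's rule."""
    (a, b), (c, d) = J
    det = a * d - b * c
    return ((r[0] * d - b * r[1]) / det, (a * r[1] - c * r[0]) / det)


def newton(x, y, refresh_every=1, tol=1e-12, max_iter=200):
    """Return (x, y, iterations).

    refresh_every=1 rebuilds the Jacobian at every step (full Newton).
    A larger value reuses the last Jacobian (modified Newton), which saves
    the cost of refactoring but usually needs more iterations.
    """
    J = None
    for k in range(max_iter):
        r = residual(x, y)
        if max(abs(r[0]), abs(r[1])) < tol:
            return x, y, k
        if k % refresh_every == 0:
            J = jacobian(x, y)
        dx, dy = solve2(J, r)
        x, y = x - dx, y - dy
    raise RuntimeError("no convergence within max_iter")
```

Calling `newton(2.0, 0.5)` runs full Newton, while `newton(2.0, 0.5, refresh_every=1000)` keeps the initial Jacobian throughout. In a real code the trade-off is between fewer iterations and fewer matrix factorizations.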

arXiv.org e-Print Archive

CogPrints Cognitive Sciences Eprint Archive

Analysis of A Splitting Approach for the Parallel Solution of Linear Systems on GPU Cards

Author: Li Ang
Negrut Dan
Serban Radu
Publication date: 25/09/2015

We discuss an approach for solving sparse or dense banded linear systems $\mathbf{A}\mathbf{x} = \mathbf{b}$ on a Graphics Processing Unit (GPU) card. The matrix $\mathbf{A} \in \mathbb{R}^{N \times N}$ is possibly nonsymmetric and moderately large; i.e., $10000 \leq N \leq 500000$. The split and parallelize (SaP) approach seeks to partition the matrix $\mathbf{A}$ into diagonal sub-blocks $\mathbf{A}_i$, $i=1,\ldots,P$, which are independently factored in parallel. The solution may choose to consider or to ignore the matrices that couple the diagonal sub-blocks $\mathbf{A}_i$. This approach, along with the Krylov subspace-based iterative method that it preconditions, are implemented in a solver called SaP::GPU, which is compared in terms of efficiency with three commonly used sparse direct solvers: PARDISO, SuperLU, and MUMPS. SaP::GPU, which runs entirely on the GPU except several stages involved in preliminary row-column permutations, is robust and compares well in terms of efficiency with the aforementioned direct solvers. In a comparison against Intel's MKL, SaP::GPU also fares well when used to solve dense banded systems that are close to being diagonally dominant. SaP::GPU is publicly available and distributed as open source under a permissive BSD3 license. Comment: 38 pages
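
The idea of factoring diagonal sub-blocks independently and using them to precondition a Krylov method can be sketched on the CPU in plain Python. This is not the SaP::GPU implementation: it assumes a symmetric positive definite tridiagonal matrix so that conjugate gradients can be used (the paper also covers nonsymmetric matrices), it ignores the coupling between blocks, and all function names (`tridiag_matvec`, `thomas`, `block_precond`, `pcg`) are hypothetical.

```python
# Block-diagonal preconditioning of conjugate gradients for a tridiagonal system.
# Storage convention: in row i, lower[i] multiplies x[i-1] and upper[i]
# multiplies x[i+1]; lower[0] and upper[-1] are unused.

def tridiag_matvec(lower, diag, upper, x):
    n = len(diag)
    y = [diag[i] * x[i] for i in range(n)]
    for i in range(1, n):
        y[i] += lower[i] * x[i - 1]
    for i in range(n - 1):
        y[i] += upper[i] * x[i + 1]
    return y


def thomas(lower, diag, upper, rhs, start, stop):
    """Solve the diagonal block made of rows start..stop-1 (Thomas algorithm).

    Entries coupling this block to its neighbours are ignored.
    """
    m = stop - start
    c = [0.0] * m  # eliminated super-diagonal
    d = [0.0] * m  # eliminated right-hand side
    c[0] = upper[start] / diag[start]
    d[0] = rhs[start] / diag[start]
    for j in range(1, m):
        i = start + j
        denom = diag[i] - lower[i] * c[j - 1]
        c[j] = upper[i] / denom
        d[j] = (rhs[i] - lower[i] * d[j - 1]) / denom
    x = [0.0] * m
    x[m - 1] = d[m - 1]
    for j in range(m - 2, -1, -1):
        x[j] = d[j] - c[j] * x[j + 1]
    return x


def block_precond(lower, diag, upper, r, num_blocks):
    """Apply the block-diagonal preconditioner: one independent solve per block."""
    n = len(diag)
    bounds = [n * p // num_blocks for p in range(num_blocks + 1)]
    z = []
    for p in range(num_blocks):  # these solves do not depend on each other
        z.extend(thomas(lower, diag, upper, r, bounds[p], bounds[p + 1]))
    return z


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def pcg(lower, diag, upper, b, num_blocks, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradients; returns (solution, iterations)."""
    x = [0.0] * len(b)
    r = list(b)
    z = block_precond(lower, diag, upper, r, num_blocks)
    p = list(z)
    rz = dot(r, z)
    for k in range(1, max_iter + 1):
        Ap = tridiag_matvec(lower, diag, upper, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, Ap)]
        if dot(r, r) ** 0.5 < tol:
            return x, k
        z = block_precond(lower, diag, upper, r, num_blocks)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    raise RuntimeError("no convergence within max_iter")
```

With `num_blocks=1` the preconditioner is an exact solve, so the iteration ends almost immediately; more blocks expose more parallelism at the price of extra Krylov iterations, which is the trade-off the abstract alludes to when it mentions ignoring the coupling matrices.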

arXiv.org e-Print Archive

CiteSeerX

A domain decomposing parallel sparse linear system solver

Author: Amestoy
Barrett
Benzi
Berry
Chen
Dongarra
Gravvanis
Karypis
Lawrie
Lawson
Li
Manguoglu
Murat Manguoglu
Polizzi
Sameh
Schenk
Publication venue: 'Elsevier BV'
Publication date: 26/08/2011

The solution of large sparse linear systems is often the most time-consuming part of many science and engineering applications. Computational fluid dynamics, circuit simulation, power network analysis, and material science are just a few examples of the application areas in which large sparse linear systems need to be solved effectively. In this paper we introduce a new parallel hybrid sparse linear system solver for distributed memory architectures that contains both direct and iterative components. We show that by using our solver one can alleviate the drawbacks of direct and iterative solvers, achieving better scalability than with direct solvers and more robustness than with classical preconditioned iterative solvers. Comparisons to well-known direct and iterative solvers on a parallel architecture are provided. Comment: To appear in Journal of Computational and Applied Mathematics

arXiv.org e-Print Archive

Crossref

OpenMETU (Middle East Technical University)

Selected inversion as key to a stable Langevin evolution across the QCD phase boundary

Author: Bloch Jacques
Schenk Olaf
Publication venue: 'EDP Sciences'
Publication date: 27/07/2017

We present new results of full QCD at nonzero chemical potential. In PRD 92, 094516 (2015) the complex Langevin method was shown to break down when the inverse coupling decreases and enters the transition region from the deconfined to the confined phase. We found that the stochastic technique used to estimate the drift term can be very unstable for indefinite matrices. This may be avoided by using the full inverse of the Dirac operator, which is, however, too costly for four-dimensional lattices. The major breakthrough in this work was achieved by realizing that the inverse elements necessary for the drift term can be computed efficiently using the selected inversion technique provided by the parallel sparse direct solver package PARDISO. In our new study we show that no breakdown of the complex Langevin method is encountered and that simulations can be performed across the phase boundary. Comment: 8 pages, 6 figures, Proceedings of the 35th International Symposium on Lattice Field Theory, Granada, Spain

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)

Directory of Open Access Journals

FERM3D: A finite element R-matrix electron molecule scattering code

Author: Allan
Bathe
Bouchiha
Boudaiffa
Colle
Dill
Gianturco
Gibson
Granovsky
Greene
Hara
Lane
Lee
Lehoucq
Li
Lucchese
McCall
Morrison
Pople
Press
Rescigno
Scheer
Stefano Tonzani
Tennyson
Tonzani
Trevisan
Werner
Publication venue: 'Elsevier BV'
Publication date: 06/07/2006

FERM3D is a three-dimensional finite element program for the elastic scattering of a low energy electron from a general polyatomic molecule, which is converted to a potential scattering problem. The code is based on tricubic polynomials in spherical coordinates. The electron-molecule interaction is treated as a sum of three terms: electrostatic, exchange, and polarisation. The electrostatic term can be extracted directly from ab initio codes (GAUSSIAN 98 in the work described here), while the exchange term is approximated using a local density functional. A local polarisation potential based on density functional theory [C. Lee, W. Yang and R. G. Parr, Phys. Rev. B 37 (1988) 785] describes the long range attraction to the molecular target induced by the scattering electron. Photoionisation calculations are also possible and illustrated in the present work. The generality and simplicity of the approach are important in extending electron-scattering calculations to more complex targets than is possible with other methods. Comment: 30 pages, 4 figures, preprint, Computer Physics Communications (in press)

arXiv.org e-Print Archive

Crossref

CERN Document Server

A two step viscothermal acoustic FE method

Author: Boer André de
Kampinga Ronald
Wijnant Ysbrand
Publication date: 01/01/2009

Previously, the authors presented a finite element for viscothermal acoustics. This element has the velocity vector, the temperature and the pressure as degrees of freedom. It can be used, for example, to model sound propagation in miniature acoustical transducers. Unfortunately, the large number of coupled degrees of freedom can make the models big and time-consuming to solve. A method with reduced calculation time has been developed. It is possible to partially decouple the temperature degree of freedom, as a result of the differences in the characteristic length scales of acoustics and heat conduction. This leads to a method that uses two sequential steps. In the first step, a scalar field containing information about the thermal effects is calculated (not the temperature). This is a relatively small FE calculation. In the second step, the actual viscothermal acoustical equations are solved. This calculation uses the field calculated in the first step and has the velocity vector and the pressure as the degrees of freedom. The temperature is no longer a degree of freedom, but it can be easily calculated in a post-processing step. The required computational effort is reduced significantly, while the difference in the results, compared to the fully coupled method, is negligible. Along with the theoretical basis for the method, a specific FE calculation is presented to illustrate its accuracy and improvement in calculation time.

University of Twente Research Information

On large-scale diagonalization techniques for the Anderson model of localization

Author: Brandes T.
Cain P.
Golub Gene
Matthias Bollhöfer
Olaf Schenk
Parlett Beresford
Rudolf A. Römer
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2005

We propose efficient preconditioning algorithms for an eigenvalue problem arising in quantum physics, namely the computation of a few interior eigenvalues and their associated eigenvectors for large-scale sparse real and symmetric indefinite matrices of the Anderson model of localization. We compare the Lanczos algorithm in the 1987 implementation by Cullum and Willoughby with the shift-and-invert techniques in the implicitly restarted Lanczos method and in the Jacobi–Davidson method. Our preconditioning approaches for the shift-and-invert symmetric indefinite linear system are based on maximum weighted matchings and algebraic multilevel incomplete $LDL^T$ factorizations. These techniques can be seen as a complement to the alternative idea of using more complete pivoting techniques for the highly ill-conditioned symmetric indefinite Anderson matrices. We demonstrate the effectiveness and the numerical accuracy of these algorithms. Our numerical examples reveal that recent algebraic multilevel preconditioning solvers can accelerate the computation of a large-scale eigenvalue problem corresponding to the Anderson model of localization by several orders of magnitude.
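
To illustrate why a shift-and-invert step turns an interior eigenvalue into an easy target, and why each step requires solving a symmetric indefinite linear system, here is a minimal sketch using plain inverse iteration. It is not one of the methods compared in the paper (Lanczos, Jacobi–Davidson) and it replaces the preconditioned iterative solve with a small dense Gaussian elimination. The test matrix is a 1D discrete Laplacian rather than an Anderson matrix, and the names `laplacian_1d`, `solve_dense` and `shift_invert_eigenvalue` are hypothetical.

```python
import math


def laplacian_1d(n):
    """Dense n x n tridiagonal matrix with 2 on the diagonal and -1 next to it."""
    return [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0)
             for j in range(n)] for i in range(n)]


def solve_dense(A, b):
    """Gaussian elimination with partial pivoting (stand-in for a real solver)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def shift_invert_eigenvalue(A, sigma, iters=30):
    """Approximate the eigenvalue of symmetric A closest to the shift sigma.

    Each step solves (A - sigma I) w = v, which amplifies the eigenvector
    whose eigenvalue lies nearest to sigma.
    """
    n = len(A)
    shifted = [[A[i][j] - (sigma if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    v = [1.0 + 0.1 * j for j in range(n)]  # arbitrary starting vector
    for _ in range(iters):
        w = solve_dense(shifted, v)
        norm = math.sqrt(sum(t * t for t in w))
        v = [t / norm for t in w]
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * Av[i] for i in range(n))  # Rayleigh quotient, |v| = 1
```

For the 10 x 10 Laplacian the exact eigenvalues are 2 - 2 cos(k pi / 11), so `shift_invert_eigenvalue(laplacian_1d(10), 1.7)` can be checked against the k = 5 value. For large sparse matrices the dense solve is exactly the part that becomes expensive, which is where preconditioning enters.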

arXiv.org e-Print Archive

Warwick Research Archives Portal Repository