A bibliography on parallel and vector numerical algorithms

Author: Ortega J. M.
Voigt R. G.

This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming languages, and other topics of interest to scientific computing. Certain conference proceedings and anthologies that have been published in book form are also listed.

NASA Technical Reports Server

A Fast Parallel Poisson Solver on Irregular Domains Applied to Beam Dynamic Simulations

Author: A. Adelmann
P. Arbenz
Y. Ineichen
Publication venue: 'Elsevier BV'
Publication date: 01/01/2009

We discuss the scalable parallel solution of the Poisson equation within a Particle-In-Cell (PIC) code for the simulation of electron beams in particle accelerators of irregular shape. The problem is discretized by finite differences. Depending on the treatment of the Dirichlet boundary, the resulting system of equations is symmetric or 'mildly' nonsymmetric positive definite. In all cases, the system is solved by the preconditioned conjugate gradient algorithm with smoothed aggregation (SA) based algebraic multigrid (AMG) preconditioning. We investigate variants of the implementation of SA-AMG that lead to considerable improvements in the execution times. We demonstrate good scalability of the solver on a distributed-memory parallel computer with up to 2048 processors. We also compare our SAAMG-PCG solver with an FFT-based solver that is more commonly used for applications in beam dynamics.
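
To make the solver structure in this abstract concrete, here is a minimal sketch of the preconditioned conjugate gradient (PCG) iteration applied to a finite-difference Poisson problem. It is not the authors' code and makes several simplifying assumptions: the domain is a regular square rather than an irregular accelerator geometry, the run is serial, and a simple Jacobi (diagonal) preconditioner stands in for SA-AMG. The names `apply_laplacian`, `jacobi_preconditioner`, and `pcg` are hypothetical.

```python
# Sketch only: PCG for the 5-point finite-difference Laplacian on an
# n x n interior grid of the unit square, homogeneous Dirichlet boundary.
# Assumption: a Jacobi (diagonal) preconditioner replaces the SA-AMG
# preconditioner described in the abstract.

def apply_laplacian(u, n):
    """Return A*u for the 5-point stencil (scaled by h^2); u is row-major."""
    out = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            k = i * n + j
            s = 4.0 * u[k]
            if i > 0:
                s -= u[k - n]      # neighbour in the previous row
            if i < n - 1:
                s -= u[k + n]      # neighbour in the next row
            if j > 0:
                s -= u[k - 1]      # left neighbour
            if j < n - 1:
                s -= u[k + 1]      # right neighbour
            out[k] = s
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def jacobi_preconditioner(r):
    """Apply M^-1 with M = diag(A); every diagonal entry of A is 4."""
    return [x / 4.0 for x in r]


def pcg(b, n, precond=jacobi_preconditioner, tol=1e-10, max_iter=500):
    """Solve A x = b; return (x, iterations used)."""
    x = [0.0] * len(b)
    r = list(b)                    # residual for the zero initial guess
    z = precond(r)
    p = list(z)
    rz = dot(r, z)
    if rz == 0.0:                  # zero right-hand side: x = 0 is exact
        return x, 0
    b_norm = dot(b, b) ** 0.5
    for it in range(1, max_iter + 1):
        Ap = apply_laplacian(p, n)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
        z = precond(r)
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


if __name__ == "__main__":
    n = 8
    rhs = [1.0] * (n * n)          # h^2 * f with a constant source term
    solution, iterations = pcg(rhs, n)
    print("iterations:", iterations)
```

Passing the preconditioner as a function argument mirrors the point made in the abstract: the outer PCG loop is unchanged, and the quality of the `precond` step is what determines the iteration count.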

arXiv.org e-Print Archive

CiteSeerX

Crossref

The simulation of fluid flow processes using vector processors

Author: Ierotheou Constantinos Savvas
Publication date: 01/05/1990

In this thesis the potential gains in vectorisation of linear and non-linear systems of equations are investigated. Previous studies carried out on the suitability of algorithms for vectorisation have been based on the solution of Poisson's equation. In accordance with this, a range of algorithms are explored and compared using a VA-1 pipeline processor attached to a MASSCOMP MC5400. Analysis shows that almost full vectorisation is possible, leading to speed-up factors of up to 90. Based on these results the vectorised conjugate gradient with a Jacobi preconditioner (JCGV) is the best of the algorithms considered. This work is extended to the development of a two-dimensional fluid flow code which is used to solve the Navier-Stokes equations; SIMPLE is implemented to handle the non-linear nature of the equations. The first two problems are isothermal flows, viz. the 'moving lid cavity' and the 'sudden expansion in a duct' problem. A study of where the greatest computational effort is expended, and subsequent vectorisation, leads to 98% of SIMPLE being modified. This results in speed-up factors of 6 for the cavity problem and 29 for the sudden expansion problem. In both problems the JCGV is marginally faster than the vectorised Jacobi with under-relaxation (JURV). However, the JCGV algorithm is not robust and the approximation must be relaxed carefully, otherwise high computation times or divergence are likely. Two further problems of increasing complexity are considered; these include scalar quantities of temperature and characteristics of k-ε turbulence. One problem is based on 'turbulent L-shaped flow in a duct' and the other on the 'natural convection in a square cavity'. The higher scalar computation in these problems gives speed-up factors of 5 for the turbulent L-shaped flow and 11 for the natural convection problem. There is little to choose between the JCGV and JURV algorithms; however, the robustness problems with the JCGV algorithm remain.
A multigrid method (ACM) is used to improve the convergence rate of the algorithms, particularly as the size of the problem is increased. Although it is more effective in scalar mode, it also provides worthwhile improvements for the vectorised algorithms, with overall factors of 8.5. Convergence difficulties with the JCG algorithm also prevent its combination with the ACM method. Therefore, the vectorised JUR algorithm with the ACM method is not only more efficient and reliable, but also has scope for improvement as the grid is increased. The potential gains in vectorisation of the SIMPLE family on pipeline architectures have been clearly demonstrated and indicate that such efforts on practical CFD codes should be well rewarded with regard to processor performance.
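
As an illustration of the Jacobi iteration with under-relaxation mentioned in this abstract, the following sketch applies it to a one-dimensional Poisson model problem. It is an assumption-laden toy, not the thesis code: it is scalar pure Python rather than vectorised, it is not embedded in SIMPLE, and the function name `jacobi_under_relaxed`, the relaxation factor `omega=0.8`, and the stopping rule are hypothetical choices. Each point is updated from the previous iterate only, so the updates within one sweep are independent of each other, which is the property that makes Jacobi-type methods natural candidates for vectorisation.

```python
# Sketch only: Jacobi iteration with under-relaxation for the tridiagonal
# system from -u'' = f on (0, 1) with u(0) = u(1) = 0:
#     -u[i-1] + 2*u[i] - u[i+1] = b[i],   where b[i] = h^2 * f(x_i).
# Assumptions: omega and the stopping rule are illustrative choices.

def jacobi_under_relaxed(b, omega=0.8, tol=1e-12, max_iter=10000):
    """Return (u, sweeps used) for the 1D model problem above."""
    n = len(b)
    u = [0.0] * n
    for it in range(1, max_iter + 1):
        new = [0.0] * n
        change = 0.0
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0       # Dirichlet value at x = 0
            right = u[i + 1] if i < n - 1 else 0.0  # Dirichlet value at x = 1
            jacobi_value = (b[i] + left + right) / 2.0
            # Under-relaxation: blend the old value with the Jacobi update.
            new[i] = (1.0 - omega) * u[i] + omega * jacobi_value
            change = max(change, abs(new[i] - u[i]))
        u = new
        if change < tol:
            return u, it
    return u, max_iter


if __name__ == "__main__":
    n = 15
    h = 1.0 / (n + 1)
    u, sweeps = jacobi_under_relaxed([h * h] * n)   # constant source f = 1
    print("sweeps:", sweeps, "midpoint value:", u[n // 2])
```

For the constant source used in the usage example, the discrete solution coincides with the quadratic x(1 - x)/2 at the grid points, which gives a convenient check on the result.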

Greenwich Academic Literature Archive

Solution of partial differential equations on vector and parallel computers

Author: Ortega J. M.
Voigt R. G.

The present status of numerical methods for partial differential equations on vector and parallel computers is reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations, as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed.

NASA Technical Reports Server

A Semicoarsening Multigrid Algorithm for SIMD Machines

Author: Dendy J. E., Jr.
Ida M. P.
Rutledge J. M.
Publication venue: 'The Japan Society for Industrial and Applied Mathematics'
Publication date: 01/11/1992

A semicoarsening multigrid algorithm suitable for use on single instruction multiple data (SIMD) architectures has been implemented on the CM-2. The method performs well for strongly anisotropic problems and for problems with coefficients jumping by orders of magnitude across internal interfaces. The parallel efficiency of this method is analyzed, and its actual performance is compared with its performance on some other machines, both parallel and nonparallel.

Caltech Authors

Parallel unstructured solvers for linear partial differential equations

Author: Becker Dulcenéia
Publication venue: Cranfield University
Publication date: 01/05/2006

This thesis presents the development of a parallel algorithm to solve symmetric systems of linear equations and the computational implementation of a parallel partial differential equations solver for unstructured meshes. The proposed method, called distributive conjugate gradient (DCG), is based on a single-level domain decomposition method and the conjugate gradient method to obtain a highly scalable parallel algorithm. An overview of methods for the discretization of domains and partial differential equations is given. The partition and refinement of meshes are discussed and the formulation of the weighted residual method for two and three dimensions is presented. Some of the methods to solve systems of linear equations are introduced, highlighting the conjugate gradient method and domain decomposition methods. A parallel unstructured PDE solver is proposed and its actual implementation presented. Emphasis is given to the data partition adopted, and the scheme used for communication among adjacent subdomains is explained. A series of experiments in processor scalability is also reported. The derivation and parallelization of DCG are presented and the method is validated through numerical experiments. The method's capabilities and limitations were investigated by solving the Poisson equation with various source terms. The experimental results obtained using the parallel solver developed as part of this work show that the algorithm presented is accurate and highly scalable, achieving roughly linear parallel speed-up in many of the cases tested.

Cranfield CERES