Search CORE

27 research outputs found

Parallel Unsmoothed Aggregation Algebraic Multigrid Algorithms on GPUs

Author: A Krechel
D Goddeke
G Haase
G Karypis
G Karypis
GE Blelloch
H Grossauer
H Sterck De
J Bolz
N Bell
O Axelsson
O Axelsson
PS Vassilevski
R Courant
TV Kolev
VE Henson
W Joubert
Publication venue
Publication date: 11/02/2013
Field of study

We design and implement a parallel algebraic multigrid method for isotropic graph Laplacian problems on multicore Graphical Processing Units (GPUs). The proposed AMG method is based on the aggregation framework. The setup phase of the algorithm uses a parallel maximal independent set algorithm in forming aggregates and the resulting coarse level hierarchy is then used in a K-cycle iteration solve phase with a

\ell^1

-Jacobi smoother. Numerical tests of a parallel implementation of the method for graphics processors are presented to demonstrate its effectiveness.Comment: 18 pages, 3 figure

arXiv.org e-Print Archive

Crossref

A Full-Depth Amalgamated Parallel 3D Geometric Multigrid Solver for GPU Clusters

Author: Brandt A.
Brandvik T.
Corrigan A.
Cwire
Cwire
Elsen E.
Fan Z.
Goodnight N.
Griebel M.
Gropp W. D.
Göddeke D.
Hempel R.
Kindratenko V.
Matsuoka S.
McBryan O. A.
Micikevicius P.
Owens J.D.
Press W. H.
Schive H.
Showerman M.
Thibault J. C.
Tokyo Institute
Wan D.C.
Publication venue: 'IUScholarWorks'
Publication date: 04/01/2011
Field of study

Numerical computations of incompressible flow equations with pressure-based algorithms necessitate the solution of an elliptic Poisson equation, for which multigrid methods are known to be very efficient. In our previous work we presented a dual-level (MPI-CUDA) parallel implementation of the Navier-Stokes equations to simulate buoyancy-driven incompressible fluid flows on GPU clusters with simple iterative methods while focusing on the scalability of the overall solver. In the present study we describe the implementation and performance of a multigrid method to solve the pressure Poisson equation within our MPI-CUDA parallel incompressible flow solver. Various design decisions and algorithmic choices for multigrid methods are explored in light of NVIDIA’s recent Fermi architecture. We discuss how unique aspects of an MPI-CUDA implementation for GPU clusters is related to the software choices made to implement the multigrid method. We propose a new coarse grid solution method of embedded multigrid with amalgamation and show that the parallel implementation retains the numerical efficiency of the multigrid method. Performance measurements on the NCSA Lincoln and TACC Longhorn clusters are presented for up to 64 GPUs

Crossref

Boise State University - ScholarWorks

A scalable parallel finite element framework for growing geometries. Application to metal additive manufacturing

Author: Ayachit U
Burstedde C
Carslaw HS
Cole KD
Ern A
Kaufman L
Kergaßner A
Lindgren LE
Mozaffar M
Schroeder WJ
Wohlers Associates Inc
Publication venue: 'Wiley'
Publication date: 01/01/2019
Field of study

This work introduces an innovative parallel, fully-distributed finite element framework for growing geometries and its application to metal additive manufacturing. It is well-known that virtual part design and qualification in additive manufacturing requires highly-accurate multiscale and multiphysics analyses. Only high performance computing tools are able to handle such complexity in time frames compatible with time-to-market. However, efficiency, without loss of accuracy, has rarely held the centre stage in the numerical community. Here, in contrast, the framework is designed to adequately exploit the resources of high-end distributed-memory machines. It is grounded on three building blocks: (1) Hierarchical adaptive mesh refinement with octree-based meshes; (2) a parallel strategy to model the growth of the geometry; (3) state-of-the-art parallel iterative linear solvers. Computational experiments consider the heat transfer analysis at the part scale of the printing process by powder-bed technologies. After verification against a 3D benchmark, a strong-scaling analysis assesses performance and identifies major sources of parallel overhead. A third numerical example examines the efficiency and robustness of (2) in a curved 3D shape. Unprecedented parallelism and scalability were achieved in this work. Hence, this framework contributes to take on higher complexity and/or accuracy, not only of part-scale simulations of metal or polymer additive manufacturing, but also in welding, sedimentation, atherosclerosis, or any other physical problem where the physical domain of interest grows in time

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

UPCommons. Portal del coneixement obert de la UPC

Scipedia

Large-scale adaptive mantle convection simulation

Author: Alisic Laura
Burstedde Carsten
Ghattas Omar
Gurnis Michael
Stadler Georg
Tan Eh
Wilcox Lucas C.
Publication venue: Royal Astronomical Society
Publication date: 01/01/2013
Field of study

A new generation, parallel adaptive-mesh mantle convection code, Rhea, is described and benchmarked. Rhea targets large-scale mantle convection simulations on parallel computers, and thus has been developed with a strong focus on computational efficiency and parallel scalability of both mesh handling and numerical solvers. Rhea builds mantle convection solvers on a collection of parallel octree-based adaptive finite element libraries that support new distributed data structures and parallel algorithms for dynamic coarsening, refinement, rebalancing and repartitioning of the mesh. In this study we demonstrate scalability to 122 880 compute cores and verify correctness of the implementation. We present the numerical approximation and convergence properties using 3-D benchmark problems and other tests for variable-viscosity Stokes flow and thermal convection

Crossref

Caltech Authors

Calhoun, Institutional Archive of the Naval Postgraduate School