
41 research outputs found

Block-asynchronous multigrid smoothers for GPU-accelerated systems

Author: Anzt H.
Dongarra J.
Gates M.
Tomov S.
Publication venue: Karlsruher Institut für Technologie
Publication date: 01/01/2011

This paper explores the need for asynchronous iteration algorithms as smoothers in multigrid methods. The hardware target for the new algorithms is top-of-the-line, highly parallel hybrid architectures -- multicore-based systems enhanced with GPGPUs. These architectures are the most likely candidates for future high-end supercomputers. To pave the road for their efficient use, challenges related to the established notion that "data movement, not FLOPS, is the bottleneck to performance" must be resolved. Our work is in this direction -- we designed block-asynchronous multigrid smoothers that perform more flops in order to reduce synchronization, and hence data movement. We show that the extra flops are done for "free", while synchronization is reduced and the convergence properties of multigrid with classical smoothers like Gauss-Seidel are preserved.

KITopen

Recommended from our members

Preparing sparse solvers for exascale computing.

Author: Anzt Hartwig
Boman Erik
Curfman McInnes Lois
Falgout Rob
Ghysels Pieter
Heroux Michael
Li Xiaoye
Meier Yang Ulrike
Rajamanickam Sivasankaran
Rupp Karl
Smith Barry
Tran Mills Richard
Yamazaki Ichitaro
Publication venue: eScholarship, University of California
Publication date: 01/03/2020

Sparse solvers provide essential functionality for a wide variety of scientific applications. Highly parallel sparse solvers are essential for continuing advances in high-fidelity, multi-physics and multi-scale simulations, especially as we target exascale platforms. This paper describes the challenges, strategies and progress of the US Department of Energy Exascale Computing Project towards providing sparse solvers for exascale computing platforms. We address the demands of systems with thousands of high-performance node devices where exposing concurrency, hiding latency and creating alternative algorithms become essential. The efforts described here are works in progress, highlighting current success and upcoming challenges. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'.

eScholarship - University of California

Asynchronous and Multiprecision Linear Solvers - Scalable and Fault-Tolerant Numerics for Energy Efficient High Performance Computing

Author: Anzt Hartwig
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2012

Asynchronous methods minimize idle times by removing synchronization barriers, and therefore allow the efficient usage of computer systems. The implied high tolerance with respect to communication latencies improves the fault tolerance. As asynchronous methods also enable the usage of the power and energy saving mechanisms provided by the hardware, they are suitable candidates for the highly parallel and heterogeneous hardware platforms that are expected in the near future.

KITopen

GPU-Accelerated Asynchronous Error Correction for Mixed Precision Iterative Refinement

Author: Anzt H.
Dongarra J.
Heuveline Vincent
Luszczek P.
Publication venue: Karlsruher Institut für Technologie
Publication date: 01/01/2011

In hardware-aware high performance computing, block-asynchronous iteration and mixed precision iterative refinement are two techniques that may be used to leverage the computing power of SIMD accelerators like GPUs in the iterative solution of linear equation systems. Although they use very different approaches for this purpose, they share the basic idea of compensating for the convergence properties of an inferior numerical algorithm by a more efficient usage of the provided computing power. In this paper, we analyze the potential of combining both techniques. To this end, we derive a mixed precision iterative refinement algorithm using a block-asynchronous iteration as an error correction solver, and compare its performance with a pure implementation of a block-asynchronous iteration and an iterative refinement method using double precision for the error correction solver. For matrices from the University of Florida Matrix collection, we report the convergence behaviour and provide the total solver runtime using different GPU architectures.

KITopen

A block-asynchronous relaxation method for graphics processing units

Author: Ashby
Bagnara
Bai
Bridges
Cappello
Chazan
Dongarra
Du
Frommer
Frommer
Hartwig Anzt
Henze
Hubbard
Jack Dongarra
Karniadakis
Kelley
Meier Yang
Powell
Saad
Stanimire Tomov
Strikwerda
Vincent Heuveline
Üresin
Publication venue: Elsevier BV

Crossref

GPU-Accelerated Asynchronous Error Correction for Mixed Precision Iterative Refinement

Author: A. Buttari
A. Frommer
D. Chazan
D. Göddeke
H. Anzt
M. Baboulin
U. Aydin
Z.-Z. Bai
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2011

Crossref

KITopen

The Coffee-table Book of Pseudospectra

Author: Heuveline Vincent
Subramanian Chandramowli
Publication venue: Karlsruher Institut für Technologie
Publication date: 01/01/2012

KITopen

Architecture-Aware Algorithms for Scalable Performance and Resilience on Heterogeneous Architectures

Publication venue: Office of Scientific and Technical Information (OSTI)

Crossref

A unified Energy Footprint for Simulation Software

Author: Anzt H.
Beglarian A.
Chilingaryan S.
Ferrone A.
Heuveline V.
Kopmann A.
Publication venue: Karlsruher Institut für Technologie
Publication date: 01/01/2012

KITopen