Achieving Efficient Strong Scaling with PETSc using Hybrid MPI/OpenMP Optimisation

G. Goumas; G. Schubert; G. Wellein; M. Butler; M.D. Piggott; N. Bell; P. Balaji; S. Williams

Publication date: 1 January 2013
Publisher: 'Springer Science and Business Media LLC'

Abstract

The increasing number of processing elements and the decreasing memory-to-core ratio in modern high-performance platforms make efficient strong scaling a key requirement for numerical algorithms. To scale efficiently on massively parallel systems, scientific software must evolve across the entire stack to exploit the multiple levels of parallelism exposed by modern architectures. In this paper we demonstrate the use of hybrid MPI/OpenMP parallelisation to optimise parallel sparse matrix-vector multiplication in PETSc, a widely used scientific library for the scalable solution of partial differential equations. Using large matrices generated by Fluidity, an open-source CFD application code that uses PETSc as its linear solver engine, we evaluate the effect of explicit communication overlap using task-based parallelism and show how to further improve performance by explicitly load balancing threads within MPI processes. We demonstrate a significant speedup over the pure-MPI mode and efficient strong scaling of sparse matrix-vector multiplication on Fujitsu PRIMEHPC FX10 and Cray XE6 systems.
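
The idea of load balancing threads within a process can be illustrated with a small sketch. The abstract does not describe the exact scheme used in PETSc, so the following is only one plausible approach, assumed here for illustration: split the rows of a CSR matrix into contiguous blocks that hold roughly equal numbers of nonzeros, rather than equal numbers of rows. The function names `balance_rows_by_nnz` and `spmv_block` are hypothetical and are not part of PETSc; the blocks are processed serially here, whereas a hybrid code would hand each block to one OpenMP thread.

```python
from bisect import bisect_left


def balance_rows_by_nnz(row_ptr, num_threads):
    """Split CSR rows into contiguous blocks with roughly equal nonzeros.

    row_ptr is the CSR row-pointer array (length n_rows + 1).
    Returns num_threads + 1 row boundaries; thread t would own rows
    [bounds[t], bounds[t + 1]).
    """
    n_rows = len(row_ptr) - 1
    total_nnz = row_ptr[-1]
    bounds = [0]
    for t in range(1, num_threads):
        target = total_nnz * t / num_threads
        # First row boundary whose cumulative nonzero count reaches the target.
        cut = bisect_left(row_ptr, target)
        # Keep boundaries monotone and within the valid row range.
        bounds.append(min(max(cut, bounds[-1]), n_rows))
    bounds.append(n_rows)
    return bounds


def spmv_block(row_ptr, col_idx, values, x, y, start, end):
    """Compute y[start:end] of y = A x for a CSR matrix A.

    This is the work a single thread would perform on its row block.
    """
    for i in range(start, end):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc


def spmv(row_ptr, col_idx, values, x, num_threads):
    """Serial stand-in for a threaded SpMV over nonzero-balanced row blocks."""
    y = [0.0] * (len(row_ptr) - 1)
    bounds = balance_rows_by_nnz(row_ptr, num_threads)
    for t in range(num_threads):
        spmv_block(row_ptr, col_idx, values, x, y, bounds[t], bounds[t + 1])
    return y
```

For a matrix with one dense row followed by several sparse rows, an equal-row split would give one thread most of the work, while the nonzero-based split assigns the dense row to a thread of its own.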