Towards CFD at Exascale: Hybrid Multicore/Manycore Massively Parallel High-Order Navier-Stokes Solver

Abstract

A GPU-accelerated, high-order, massively parallel 3D Navier-Stokes solver has been developed for heart valve simulation. It is optimized for a Cray XC50 supercomputer by distributing the workload across MPI processes and by offloading double-precision kernels to GPUs. The GPU kernels are written in CUDA C and are called from the legacy Fortran code. For a high-order finite-difference gradient kernel, speedups of 5x (Tesla K20x) and 20x (Tesla P100) were achieved. In combination with 16 MPI processes sharing the GPU of a single Cray XC50 node through the CUDA Multi-Process Service (MPS), a peak speedup of 33x was achieved. Similar performance was also achieved for other differential operators, demonstrating the potential of GPU technology for bringing biomedical CFD to exascale computing.
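To make the offloading pattern concrete, the following is a minimal sketch of a finite-difference gradient kernel in CUDA C with a C-linkage wrapper that Fortran code could bind to. It is not the solver's actual kernel: the fourth-order central stencil, the column-major array layout, the launch configuration, and the names `grad_x_4th` and `grad_x_4th_host` are all assumptions made for illustration, since the abstract does not specify them.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: fourth-order central difference d/dx on a 3D field.
// Assumed layout is Fortran column-major: index = i + nx*(j + ny*k), i fastest.
__global__ void grad_x_4th(const double* __restrict__ f,
                           double* __restrict__ dfdx,
                           int nx, int ny, int nz, double inv_dx)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    int k = blockIdx.z * blockDim.z + threadIdx.z;

    // Only points with a full 5-point stencil in x are updated; the two
    // outermost layers in x are left untouched (assumed to be halo points).
    if (i < 2 || i >= nx - 2 || j >= ny || k >= nz) return;

    size_t idx = (size_t)i + (size_t)nx * ((size_t)j + (size_t)ny * (size_t)k);
    dfdx[idx] = (f[idx - 2] - 8.0 * f[idx - 1]
                 + 8.0 * f[idx + 1] - f[idx + 2]) * inv_dx / 12.0;
}

// Hypothetical host wrapper with C linkage, so that Fortran can call it
// through an ISO_C_BINDING interface. Returns 0 on success, 1 on failure.
extern "C" int grad_x_4th_host(const double* f_host, double* dfdx_host,
                               int nx, int ny, int nz, double dx)
{
    size_t bytes = (size_t)nx * ny * nz * sizeof(double);
    double* f_dev = nullptr;
    double* dfdx_dev = nullptr;

    if (cudaMalloc(&f_dev, bytes) != cudaSuccess) return 1;
    if (cudaMalloc(&dfdx_dev, bytes) != cudaSuccess) {
        cudaFree(f_dev);
        return 1;
    }

    // Copy the input field, and the current output so halo values survive.
    cudaError_t err = cudaMemcpy(f_dev, f_host, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess)
        err = cudaMemcpy(dfdx_dev, dfdx_host, bytes, cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        dim3 block(32, 4, 2);  // 256 threads per block, x fastest for coalescing
        dim3 grid((nx + block.x - 1) / block.x,
                  (ny + block.y - 1) / block.y,
                  (nz + block.z - 1) / block.z);
        grad_x_4th<<<grid, block>>>(f_dev, dfdx_dev, nx, ny, nz, 1.0 / dx);
        err = cudaGetLastError();
    }

    // The blocking copy back also waits for the kernel to finish.
    if (err == cudaSuccess)
        err = cudaMemcpy(dfdx_host, dfdx_dev, bytes, cudaMemcpyDeviceToHost);

    cudaFree(f_dev);
    cudaFree(dfdx_dev);
    return err == cudaSuccess ? 0 : 1;
}
```

On the Fortran side, such a wrapper would typically be declared in an `interface` block with `bind(C, name="grad_x_4th_host")` and the scalar arguments marked `value`. The sketch copies the arrays to and from the device on every call for simplicity; a production solver would more plausibly keep the fields resident on the GPU between operator calls to avoid transfer overhead.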
