O2ATH: An OpenMP Offloading Toolkit for the Sunway Heterogeneous Manycore Platform
The next generation Sunway supercomputer employs the SW26010pro processor,
which features a specialized on-chip heterogeneous architecture. Applications
with significant hotspots can benefit from the much greater compute capacity
of the Sunway many-core architecture, but only through careful and intensive
manual many-core parallelization. However, some legacy projects with
large codebases, such as CESM, ROMS and WRF, contain numerous lines of code and
do not have significant hotspots. The cost of manually porting such
applications to the Sunway architecture is almost unaffordable. To overcome
this challenge, we have developed a toolkit named O2ATH. O2ATH forwards GNU
OpenMP runtime library calls to Sunway's Athread library, which greatly
simplifies the parallelization work on the Sunway architecture. O2ATH enables
users to write both MPE and CPE code in a single file, and parallelization can
be achieved by utilizing OpenMP directives and attributes. In practice, O2ATH
has helped us to port two large projects, CESM and ROMS, to the CPEs of the
next generation Sunway supercomputers via the OpenMP offload method. In the
experiments, kernel speedups range from 3 to 15 times, yielding
whole-application speedups of 3 to 6 times. Furthermore, O2ATH requires significantly
fewer code modifications compared to manually crafting CPE functions. This
indicates that O2ATH can greatly enhance development efficiency when porting or
optimizing large software projects on Sunway supercomputers.
Comment: 15 pages, 6 figures, 5 tables
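
The forwarding idea described in the abstract can be illustrated with a minimal sketch in C. It is not O2ATH's implementation. It assumes only that the compiler outlines the body of a parallel region into a function and passes it to a runtime entry point shaped like libgomp's GOMP_parallel(fn, data, num_threads, flags), and that a shim can hand that function to a different backend. Every name prefixed with sketch_ is hypothetical, and the backend here is a serial stand-in rather than the real Athread API.

```c
#include <assert.h>

/* Hypothetical worker count; the real number of CPEs is not taken from the source. */
#define SKETCH_NUM_WORKERS 64

/* Id of the worker currently executing (the stand-in backend runs workers serially). */
static int sketch_worker_id;

/* Hypothetical stand-in for "spawn on all workers, then join".
 * A real shim would call the platform's threading library here. */
static void sketch_backend_spawn_join(void (*fn)(void *), void *data) {
    for (int id = 0; id < SKETCH_NUM_WORKERS; ++id) {
        sketch_worker_id = id;
        fn(data);
    }
}

/* Entry point with the same shape as libgomp's GOMP_parallel.
 * Instead of creating host threads, it forwards the outlined function. */
void sketch_gomp_parallel(void (*fn)(void *), void *data,
                          unsigned num_threads, unsigned flags) {
    (void)num_threads; /* ignored in this sketch */
    (void)flags;       /* ignored in this sketch */
    sketch_backend_spawn_join(fn, data);
}

/* Counterparts of omp_get_thread_num / omp_get_num_threads for the sketch. */
int sketch_thread_num(void) { return sketch_worker_id; }
int sketch_num_threads(void) { return SKETCH_NUM_WORKERS; }

/* Shared variables captured by the parallel region. */
struct axpy_args { double a; const double *x; double *y; int n; };

/* Roughly what a compiler would outline from:
 *   #pragma omp parallel for
 *   for (int i = 0; i < n; ++i) y[i] = a * x[i] + y[i];
 * Each worker handles one contiguous chunk (static schedule). */
static void axpy_outlined(void *p) {
    struct axpy_args *args = p;
    int nthr = sketch_num_threads();
    int tid = sketch_thread_num();
    int chunk = (args->n + nthr - 1) / nthr;
    int lo = tid * chunk;
    int hi = (lo + chunk < args->n) ? lo + chunk : args->n;
    for (int i = lo; i < hi; ++i)
        args->y[i] = args->a * args->x[i] + args->y[i];
}

/* Host-side function: packs the shared variables and enters the region. */
void axpy(double a, const double *x, double *y, int n) {
    assert(n >= 0);
    struct axpy_args args = { a, x, y, n };
    sketch_gomp_parallel(axpy_outlined, &args, 0, 0);
}
```

In this sketch, calling axpy(2.0, x, y, n) updates every element of y exactly once, because the chunks assigned to the workers do not overlap and together cover the range 0 to n-1. The point of the design is that the loop body stays in the original source file and only the runtime entry point changes, which is consistent with the abstract's claim that far fewer code modifications are needed than when CPE functions are written by hand.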