New challenges in Astronomy and Astrophysics (AA) are driving the need for a
large number of exceptionally computationally intensive simulations. "Exascale"
(and beyond) computational facilities are mandatory to address the scale of
theoretical problems and of the data coming from the new generation of
observational facilities in AA. The High Performance Computing (HPC) sector is
currently undergoing a profound phase of innovation, in which the primary
obstacle to achieving Exascale is power consumption. The goal of this work is
to provide insight into the performance and energy footprint of contemporary
architectures for a real astrophysical application in an HPC context. We use a
state-of-the-art N-body application that we re-engineered and optimized to
fully exploit the underlying heterogeneous hardware. We
quantitatively evaluate the impact of computation on energy consumption when
running on four different platforms. Two of them represent current HPC
systems (Intel-based and equipped with NVIDIA GPUs), one is a micro-cluster
based on ARM MPSoCs, and one is a "prototype towards Exascale" equipped with
ARM MPSoCs tightly coupled with FPGAs. We investigate the behavior of these
different devices, finding that high-end GPUs excel in time-to-solution,
while MPSoC-FPGA systems outperform GPUs in power consumption. Our experience
reveals that considering FPGAs for computationally intensive applications
seems very promising, as their performance is improving to meet the
requirements of scientific applications. This work can serve as a reference
for the future development of platforms for astrophysics applications in
which computationally intensive calculations are required.

Comment: 15 pages, 4 figures, 3 tables; preprint (V2) submitted to MDPI
(Special Issue: Energy-Efficient Computing on Parallel Architectures).