Computational Fluid and Particle Dynamics Simulations for Respiratory System: Runtime Optimization on an Arm Cluster
Computational fluid and particle dynamics (CFPD) simulations are of paramount importance for studying and improving drug effectiveness. The computational requirements of CFPD codes involve high-performance computing (HPC) resources. For these reasons, in this paper we introduce and evaluate system software techniques for improving performance and tolerating load imbalance in a state-of-the-art production CFPD code. We demonstrate the benefits of these techniques on both Intel- and Arm-based HPC clusters, showing the importance of mechanisms applied at runtime to improve performance independently of the underlying architecture. We run a real CFPD simulation of particle tracking in the human respiratory system, showing performance improvements of up to 2x while keeping the computational resources constant.
This work is partially supported by the Spanish Government (SEV-2015-0493), by the Spanish Ministry of Science and Technology project (TIN2015-65316-P), by the Generalitat de Catalunya (2017-SGR-1414), and by the European Mont-Blanc projects (288777, 610402 and 671697). Peer Reviewed. Postprint (author's final draft).
Runtime Mechanisms to Survive New HPC Architectures: A Use-Case in Human Respiratory Simulations
Computational Fluid and Particle Dynamics (CFPD) simulations are of paramount importance for studying and improving drug effectiveness. The computational requirements of CFPD codes demand high-performance computing (HPC) resources. For these reasons, in this paper we introduce and evaluate system software techniques for improving performance and tolerating load imbalance in a state-of-the-art production CFPD code. We demonstrate the benefits of these techniques on Intel-, IBM-, and Arm-based HPC technologies ranked in the Top500 list of supercomputers, showing the importance of mechanisms applied at runtime to improve performance independently of the underlying architecture. We run a real CFPD simulation of particle tracking in the human respiratory system, showing performance improvements of up to 2x across different architectures when applying runtime techniques and keeping the computational resources constant.
This work is partially supported by the Spanish Government (SEV-2015-0493), by the Spanish Ministry of Science and Technology project (TIN2015-65316-P), by the Generalitat de Catalunya (2017-SGR-1414), and by the European Mont-Blanc projects (288777, 610402 and 671697). Peer Reviewed. Preprint.
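To make the notion of load imbalance concrete, the following minimal sketch computes a commonly used load-balance metric (average over maximum per-process compute time) and the speedup bound that perfectly even work distribution would give on the same resources. The function names and timing values are hypothetical illustrations; they are not taken from the paper, and this is not the runtime mechanism the authors evaluate.

```python
from statistics import mean


def load_balance(times):
    """Return avg/max of per-process compute times (1.0 means perfectly balanced)."""
    if not times:
        raise ValueError("times must be non-empty")
    return mean(times) / max(times)


def ideal_rebalancing_speedup(times):
    """Upper bound on speedup if the same work were spread evenly over the same processes."""
    if not times:
        raise ValueError("times must be non-empty")
    return max(times) / mean(times)


# Hypothetical per-process compute times in seconds (illustrative, not measured data).
times = [4.0, 1.0, 1.0, 2.0]
print(load_balance(times))               # 0.5
print(ideal_rebalancing_speedup(times))  # 2.0
```

With these made-up timings the slowest process takes twice the average, so rebalancing alone could at best halve the runtime without adding resources.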
Performance and energy consumption of HPC workloads on a cluster based on Arm ThunderX2 CPU
In this paper, we analyze the performance and energy consumption of an
Arm-based high-performance computing (HPC) system developed within the European
project Mont-Blanc 3. This system, called Dibona, has been integrated by
ATOS/Bull and is powered by Marvell's latest CPU, the ThunderX2. This CPU
is the same one that powers the Astra supercomputer, the first Arm-based
supercomputer to enter the Top500, in November 2018. Our study ranges from
micro-benchmarks to large production codes. We include an interdisciplinary
evaluation of three scientific applications (a finite-element fluid dynamics
code, a smoothed particle hydrodynamics code, and a lattice Boltzmann code) and
the Graph 500 benchmark, focusing on parallel and energy efficiency as well as
studying their scalability up to thousands of Armv8 cores. For comparison, we
run the same tests on state-of-the-art x86 nodes included in Dibona and the
Tier-0 supercomputer MareNostrum4. Our experiments show that the ThunderX2 delivers
25% lower performance on average, mainly due to its small vector unit, which is
somewhat compensated by its 30% wider links between the CPU and the main
memory. We found that the software ecosystem of the Armv8 architecture is
comparable to the one available for Intel. Our results also show that ThunderX2
delivers similar or better energy-to-solution and scalability, proving that
Arm-based chips are legitimate contenders in the market of next-generation HPC
systems.
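The abstract refers to parallel efficiency and energy-to-solution without defining them. The sketch below uses the usual textbook definitions: strong-scaling efficiency relative to a reference run, and energy as average power multiplied by runtime. All function names and numbers are hypothetical and are not results from the paper.

```python
def parallel_efficiency(t_ref, n_ref, t_n, n):
    """Strong-scaling efficiency of a run on n cores relative to a reference run on n_ref cores."""
    return (t_ref * n_ref) / (t_n * n)


def energy_to_solution(avg_power_watts, runtime_seconds):
    """Energy in joules, assuming a constant average power draw over the whole run."""
    return avg_power_watts * runtime_seconds


# Hypothetical runs: 100 s on 64 cores, 31.25 s on 256 cores (illustrative only).
print(parallel_efficiency(100.0, 64, 31.25, 256))  # 0.8
# Hypothetical node drawing 300 W on average for 31.25 s.
print(energy_to_solution(300.0, 31.25))            # 9375.0
```

Because energy is the product of power and time, a slower system can still match a faster one on energy-to-solution if its average power draw is proportionally lower.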