Towards a Mini-App for Smoothed Particle Hydrodynamics at Exascale
The smoothed particle hydrodynamics (SPH) technique is a purely Lagrangian
method, used in numerical simulations of fluids in astrophysics and
computational fluid dynamics, among many other fields. SPH simulations with
detailed physics represent computationally-demanding calculations. The
parallelization of SPH codes is not trivial due to the absence of a structured
grid. Additionally, the performance of the SPH codes can be, in general,
adversely impacted by several factors, such as multiple time-stepping,
long-range interactions, and/or boundary conditions. This work presents
insights into the current performance and functionalities of three SPH codes:
SPHYNX, ChaNGa, and SPH-flow. These codes are the starting point of an
interdisciplinary co-design project, SPH-EXA, for the development of an
Exascale-ready SPH mini-app. To gain such insights, a rotating square patch
test was implemented as a common test simulation for the three SPH codes and
analyzed on two modern HPC systems. Furthermore, to stress the differences with
the codes stemming from the astrophysics community (SPHYNX and ChaNGa), an
additional test case, the Evrard collapse, has also been carried out. This work
extrapolates the common basic SPH features in the three codes for the purpose
of consolidating them into a pure-SPH, Exascale-ready, optimized, mini-app.
Moreover, the outcome of this work serves as direct feedback to the parent codes, to
improve their performance and overall scalability.
Comment: 18 pages, 4 figures, 5 tables, 2018 IEEE International Conference on Cluster Computing proceedings for WRAp1
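To illustrate why SPH needs no structured grid, the following minimal sketch evaluates the standard SPH density sum, rho_i = sum_j m_j W(|x_i - x_j|, h), in one dimension with a cubic spline kernel. It is not taken from SPHYNX, ChaNGa, or SPH-flow: the function names, the brute-force neighbor loop, and the choice h = 1.2 dx are assumptions made for illustration only.

```python
def cubic_spline_1d(r, h):
    """Cubic spline (M4) kernel in 1D; it vanishes for |r| >= 2h."""
    q = abs(r) / h
    sigma = 2.0 / (3.0 * h)  # 1D normalization constant
    if q < 1.0:
        return sigma * (1.0 - 1.5 * q**2 + 0.75 * q**3)
    if q < 2.0:
        return sigma * 0.25 * (2.0 - q) ** 3
    return 0.0


def sph_density(positions, masses, h):
    """Brute-force O(N^2) density sum over all particle pairs.

    Production codes replace this loop with a tree or cell-list
    neighbor search; the loop is kept here for readability.
    """
    return [
        sum(m_j * cubic_spline_1d(x_i - x_j, h)
            for x_j, m_j in zip(positions, masses))
        for x_i in positions
    ]


# Hypothetical setup: 50 equally spaced particles sampling a fluid of density 1.
dx = 0.1
positions = [i * dx for i in range(50)]
masses = [1.0 * dx] * len(positions)
density = sph_density(positions, masses, h=1.2 * dx)
```

Particles in the interior recover a density close to 1, while particles at the ends of the line see fewer neighbors and underestimate it, a small-scale view of why boundary conditions need special treatment in SPH.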
Generic support for policy-based self-adaptive systems
This paper presents a policy definition language which forms part of a generic policy toolkit for autonomic computing systems in which the policies themselves can be modified dynamically and automatically. Targeted enhancements to the current state of practice include: policy self-adaptation, where the policy itself is dynamically modified to match environmental conditions; improved support for developers who are not autonomics experts; and easier deployment of adaptive policies into legacy code.
The policy definition language permits powerful expression of self-managing behaviours and facilitates a diverse policy behaviour space. Features include support for multiple versions of a given policy type, multiple configuration templates, and meta policies to dynamically select between policy instances.
An example deployment scenario illustrates advanced functionality in the context of a multi-policy stock trading system which is sensitive to environmental volatility.
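The idea of a meta policy that selects between policy instances can be sketched as follows. This is plain Python, not the paper's policy definition language; the policy names, the thresholds, and the use of the standard deviation of recent price changes as a volatility measure are all hypothetical.

```python
from statistics import pstdev


def cautious(change_pct):
    """Hypothetical policy instance: trade only on large price moves."""
    if change_pct >= 5.0:
        return "sell"
    if change_pct <= -5.0:
        return "buy"
    return "hold"


def aggressive(change_pct):
    """Hypothetical policy instance: trade on small price moves."""
    if change_pct >= 1.0:
        return "sell"
    if change_pct <= -1.0:
        return "buy"
    return "hold"


POLICIES = {"cautious": cautious, "aggressive": aggressive}


def meta_policy(recent_changes, volatility_limit=2.0):
    """Pick a policy instance by name from the observed volatility."""
    volatility = pstdev(recent_changes)
    return "cautious" if volatility > volatility_limit else "aggressive"


def decide(recent_changes):
    """Apply the selected policy to the most recent price change."""
    name = meta_policy(recent_changes)
    return name, POLICIES[name](recent_changes[-1])


calm_market = [0.2, -0.1, 0.3, 1.5]
volatile_market = [4.0, -6.0, 5.0, 1.5]
calm_decision = decide(calm_market)
volatile_decision = decide(volatile_market)
```

The same final price change of 1.5 percent leads to a trade in the calm market and to no action in the volatile one, because the meta policy swaps the active policy instance rather than editing its rules.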
From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation
Starting from a high-level problem description in terms of partial
differential equations using abstract tensor notation, the Chemora framework
discretizes, optimizes, and generates complete high performance codes for a
wide range of compute architectures. Chemora extends the capabilities of
Cactus, facilitating the usage of large-scale CPU/GPU systems in an efficient
manner for complex applications, without low-level code tuning. Chemora
achieves parallelism through MPI and multi-threading, combining OpenMP and
CUDA. Optimizations include high-level code transformations, efficient loop
traversal strategies, dynamically selected data and instruction cache usage
strategies, and JIT compilation of GPU code tailored to the problem
characteristics. The discretization is based on higher-order finite differences
on multi-block domains. Chemora's capabilities are demonstrated by simulations
of black hole collisions. This problem provides an acid test of the framework,
as the Einstein equations contain hundreds of variables and thousands of terms.
Comment: 18 pages, 4 figures, accepted for publication in Scientific Programmin
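For readers unfamiliar with higher-order finite differences, the sketch below compares the classic second-order and fourth-order central difference stencils for a first derivative. It is a hand-written illustration, not code generated by Chemora; the function names and the test function are assumptions.

```python
import math


def central_diff_2nd_order(f, x, h):
    """Second-order accurate central difference for f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def central_diff_4th_order(f, x, h):
    """Fourth-order accurate central difference for f'(x), five-point stencil."""
    return (-f(x + 2 * h) + 8 * f(x + h) - 8 * f(x - h) + f(x - 2 * h)) / (12.0 * h)


# Compare both stencils against the exact derivative of sin at x0.
x0, h = 0.5, 0.01
exact = math.cos(x0)
err2 = abs(central_diff_2nd_order(math.sin, x0, h) - exact)
err4 = abs(central_diff_4th_order(math.sin, x0, h) - exact)
```

At the same grid spacing the wider stencil is several orders of magnitude more accurate, at the cost of reading more neighboring points, which is one reason loop traversal and cache strategies matter for such codes.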
An Extensible Timing Infrastructure for Adaptive Large-scale Applications
Real-time access to accurate and reliable timing information is necessary to
profile scientific applications, and crucial as simulations become increasingly
complex, adaptive, and large-scale. The Cactus Framework provides flexible and
extensible capabilities for timing information through a well-designed
infrastructure and timing API. Applications built with Cactus automatically
gain access to built-in timers, such as gettimeofday and getrusage,
system-specific hardware clocks, and high-level interfaces such as PAPI. We
describe the Cactus timer interface, its motivation, and its implementation. We
then demonstrate how this timing information can be used by an example
scientific application to profile itself, and to dynamically adapt itself to a
changing environment at run time.
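The notion of an extensible timer interface with pluggable clocks can be sketched as below. This is not the Cactus timing API: the class and method names are hypothetical, and Python's `time.perf_counter` and `time.process_time` stand in for wall-clock and CPU-time sources such as gettimeofday and getrusage.

```python
import time


class TimerRegistry:
    """Named timers that sample every registered clock on start and stop."""

    def __init__(self):
        self.clocks = {}   # clock name -> zero-argument function returning seconds
        self.started = {}  # timer name -> clock readings taken at start
        self.elapsed = {}  # timer name -> accumulated seconds per clock

    def register_clock(self, name, read):
        """Add a new clock; existing timers pick it up on their next start."""
        self.clocks[name] = read

    def start(self, timer):
        self.started[timer] = {name: read() for name, read in self.clocks.items()}

    def stop(self, timer):
        begin = self.started.pop(timer)
        totals = self.elapsed.setdefault(timer, {})
        for name, t0 in begin.items():
            totals[name] = totals.get(name, 0.0) + self.clocks[name]() - t0
        return totals


timers = TimerRegistry()
timers.register_clock("wall", time.perf_counter)
timers.register_clock("cpu", time.process_time)

timers.start("evolve")
checksum = sum(i * i for i in range(100000))  # stand-in for a simulation step
result = timers.stop("evolve")
```

An application could inspect `result` after each step and, for example, reduce its output frequency when the wall-clock time per step exceeds a budget, which is the kind of run-time adaptation the abstract describes.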
SPH-EXA: Enhancing the Scalability of SPH codes Via an Exascale-Ready SPH Mini-App
Numerical simulations of fluids in astrophysics and computational fluid
dynamics (CFD) are among the most computationally-demanding calculations, in
terms of sustained floating-point operations per second, or FLOP/s. It is
expected that these numerical simulations will significantly benefit from the
future Exascale computing infrastructures, which will perform 10^18 FLOP/s. The
performance of the SPH codes is, in general, adversely impacted by several
factors, such as multiple time-stepping, long-range interactions, and/or
boundary conditions. In this work, an extensive study of three SPH
implementations, SPHYNX, ChaNGa, and XXX, is performed to gain insights and to
expose any limitations and characteristics of the codes. These codes are the
starting point of an interdisciplinary co-design project, SPH-EXA, for the
development of an Exascale-ready SPH mini-app. We implemented a rotating square
patch as a joint test simulation for the three SPH codes and analyzed their
performance on a modern HPC system, Piz Daint. The performance profiling and
scalability analysis conducted on the three parent codes exposed
their performance issues, such as load imbalance, both in MPI and OpenMP.
Two-level load balancing has been successfully applied to SPHYNX to overcome
its load imbalance. The performance analysis shapes and drives the design of
the SPH-EXA mini-app towards the use of efficient parallelization methods,
fault-tolerance mechanisms, and load balancing approaches.
Comment: arXiv admin note: substantial text overlap with arXiv:1809.0801
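Load imbalance of the kind mentioned above can be quantified and reduced as in the following sketch. It does not reproduce the two-level load balancing applied to SPHYNX; it swaps in a simple greedy longest-processing-time heuristic, and the per-chunk costs are made-up numbers standing in for the work of a clustered particle distribution.

```python
import heapq


def imbalance(loads):
    """Percent by which the busiest worker exceeds the mean load."""
    mean = sum(loads) / len(loads)
    return (max(loads) / mean - 1.0) * 100.0


def block_assign(costs, workers):
    """Static assignment: contiguous blocks with equal numbers of chunks."""
    size = -(-len(costs) // workers)  # ceiling division
    return [sum(costs[w * size:(w + 1) * size]) for w in range(workers)]


def greedy_assign(costs, workers):
    """Give each chunk, most expensive first, to the least loaded worker."""
    heap = [(0.0, w) for w in range(workers)]
    for cost in sorted(costs, reverse=True):
        load, w = heapq.heappop(heap)
        heapq.heappush(heap, (load + cost, w))
    return [load for load, _ in sorted(heap, key=lambda entry: entry[1])]


# Hypothetical per-chunk costs: a few dense regions dominate the work.
costs = [90, 80, 30, 20, 10, 10, 5, 5]
static_loads = block_assign(costs, workers=4)
greedy_loads = greedy_assign(costs, workers=4)
```

Comparing `imbalance(static_loads)` with `imbalance(greedy_loads)` shows how much idle time an equal-count decomposition can leave when the work per chunk is uneven.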