14 research outputs found

    The Peano software---parallel, automaton-based, dynamically adaptive grid traversals

    We discuss the design decisions, design alternatives, and rationale behind the third generation of Peano, a framework for dynamically adaptive Cartesian meshes derived from spacetrees. Peano ties the mesh traversal to the mesh storage and supports only one element-wise traversal order resulting from space-filling curves; users are not free to choose a traversal order themselves. The traversal can exploit regular grid subregions, shared memory, and distributed memory systems with almost no modifications to a serial application code. We formalize the software design by means of two interacting automata: one for the multiscale grid traversal and one for the application-specific algorithmic steps. This yields a callback-based programming paradigm. We further sketch the supported application types and the two data storage schemes realized before we detail high-performance computing aspects and lessons learned. Special emphasis is put on observations regarding the programming idioms and algorithmic concepts used. This transforms our report from a “one way to implement things” code description into a generic discussion and summary of alternatives, rationale, and design decisions to be made for any tree-based adaptive mesh refinement software.
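
    The callback-based paradigm can be illustrated with a minimal sketch. The types and names below (Cell, Observer, traverse) are made up for illustration and are not Peano's actual API: the framework-owned traversal automaton walks the spacetree in its fixed order and hands every event to an application-specific observer automaton through callbacks.

        // Minimal sketch, assuming a depth-first spacetree walk; not Peano's real API.
        #include <iostream>
        #include <memory>
        #include <vector>

        // A spacetree cell that may be refined into children.
        struct Cell {
          int level;
          std::vector<std::unique_ptr<Cell>> children;
        };

        // Application-side automaton: reacts to the events the traversal issues.
        class Observer {
        public:
          virtual ~Observer() = default;
          virtual void enterCell(const Cell& cell) = 0;
          virtual void leaveCell(const Cell& cell) = 0;
        };

        // Grid-side automaton: owns the traversal order; the application cannot change it.
        void traverse(const Cell& cell, Observer& observer) {
          observer.enterCell(cell);
          for (const auto& child : cell.children) {   // order fixed by the framework
            traverse(*child, observer);
          }
          observer.leaveCell(cell);
        }

        // An application-specific algorithmic step realised purely through callbacks.
        class CountCells : public Observer {
        public:
          int count = 0;
          void enterCell(const Cell&) override { ++count; }
          void leaveCell(const Cell&) override {}
        };

        int main() {
          Cell root{0, {}};
          root.children.push_back(std::make_unique<Cell>(Cell{1, {}}));
          CountCells counter;
          traverse(root, counter);
          std::cout << counter.count << " cells visited\n";
        }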

    On-the-fly memory compression for multibody algorithms

    Memory and bandwidth demands challenge developers of particle-based codes that have to scale on new architectures, as the growth of concurrency outpaces improvements in memory access facilities, as the memory per core tends to stagnate, and as communication networks cannot increase bandwidth arbitrarily. We propose to analyse each particle of such a code to find out whether a hierarchical data representation that stores data with reduced precision caps the memory demands without exceeding given error bounds. For admissible candidates, we perform this compression and thus reduce the pressure on the memory subsystem, lower the total memory footprint, and reduce the data to be exchanged via MPI. Notably, our analysis and transformation change the data compression dynamically, i.e. the choice of data format follows the solution characteristics, and they do not require us to alter the core simulation code.
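
    As a hedged illustration of the per-particle analysis (all names are hypothetical, not taken from the code), the sketch below checks whether a particle's double-precision attributes survive a round trip through single precision within a given absolute error bound and, if so, stores them in the reduced format:

        #include <cmath>
        #include <variant>

        // Full-precision particle record (illustrative).
        struct Particle           { double x, y, z, mass; };
        // Reduced-precision counterpart used when the error bound admits it.
        struct CompressedParticle { float  x, y, z, mass; };

        // A particle is held either in full or in reduced precision.
        using StoredParticle = std::variant<Particle, CompressedParticle>;

        // Check whether all attributes survive a round trip through single
        // precision within the prescribed absolute error bound.
        bool fitsSinglePrecision(const Particle& p, double maxError) {
          const double values[] = {p.x, p.y, p.z, p.mass};
          for (double v : values) {
            const double roundTripError =
                std::abs(v - static_cast<double>(static_cast<float>(v)));
            if (roundTripError > maxError) {
              return false;   // keep this particle in double precision
            }
          }
          return true;
        }

        // Compress admissible particles on the fly; the core simulation code never changes.
        StoredParticle compressIfAdmissible(const Particle& p, double maxError) {
          if (fitsSinglePrecision(p, maxError)) {
            return CompressedParticle{static_cast<float>(p.x), static_cast<float>(p.y),
                                      static_cast<float>(p.z), static_cast<float>(p.mass)};
          }
          return p;
        }

        int main() {
          StoredParticle stored = compressIfAdmissible({0.25, 1.0, -3.5, 2.0}, 1e-6);
          return std::holds_alternative<CompressedParticle>(stored) ? 0 : 1;
        }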

    SFC-based Communication Metadata Encoding for Adaptive Mesh

    This volume of the series “Advances in Parallel Computing” contains the proceedings of the International Conference on Parallel Programming – ParCo 2013 – held from 10 to 13 September 2013 in Garching, Germany. The conference was hosted by the Technische Universität München (Department of Informatics) and the Leibniz Supercomputing Centre. The present paper studies two adaptive mesh refinement (AMR) codes whose grids rely on recursive subdivision in combination with space-filling curves (SFCs). A non-overlapping domain decomposition based upon these SFCs yields several well-known advantageous properties with respect to communication demands, balancing, and partition connectivity. However, the administration of the metadata, i.e. tracking which partitions exchange data and in which cardinality, is nontrivial due to the SFC's fractal meandering and the dynamic adaptivity. We introduce an analysed tree grammar for the metadata that compresses it hierarchically along the subdivision tree without loss of information and applies run-length encoding. Hence, its metadata memory footprint is very small, and it can be computed and maintained on the fly even for permanently changing grids. It facilitates a fork-join pattern for shared-data parallelism as well as replicated-data parallelism that tackles latency and bandwidth constraints through communication in the background, and it reduces memory requirements by avoiding adjacency information stored per element. We demonstrate this for shared- and distributed-memory parallelised domain decompositions. This work was supported by the German Research Foundation (DFG) as part of the Transregional Collaborative Research Centre “Invasive Computing” (SFB/TR 89). It is partially based on work supported by Award No. UK-c0020, made by the King Abdullah University of Science and Technology (KAUST).
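
    A minimal sketch of the run-length encoding idea, under the assumption that the metadata of interest is the owner rank of each cell along the space-filling curve (the paper's actual tree grammar is richer and works hierarchically along the subdivision tree):

        #include <cstddef>
        #include <utility>
        #include <vector>

        // (rank, run length) pairs along the space-filling curve.
        using RankRuns = std::vector<std::pair<int, std::size_t>>;

        // Owner ranks along the SFC form long contiguous runs, so the communication
        // metadata shrinks from one entry per cell to one entry per run.
        RankRuns encodeOwners(const std::vector<int>& ownerAlongCurve) {
          RankRuns runs;
          for (int rank : ownerAlongCurve) {
            if (!runs.empty() && runs.back().first == rank) {
              ++runs.back().second;          // extend the current run
            } else {
              runs.emplace_back(rank, 1);    // a new run starts at a partition boundary
            }
          }
          return runs;
        }

        int main() {
          // Twelve cells owned by three ranks compress into three runs: (0,4) (1,5) (2,3).
          // The encoding is cheap enough to be recomputed on the fly when the grid changes.
          const RankRuns runs = encodeOwners({0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2});
          return runs.size() == 3 ? 0 : 1;
        }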

    Parallel Multiscale Contact Dynamics for Rigid Non-spherical Bodies

    The simulation of large numbers of rigid bodies of non-analytical shapes or vastly varying sizes which collide with each other is computationally challenging. The fundamental problem is the identification of all contact points between all particles at every time step. In the Discrete Element Method (DEM), this is particularly difficult for particles of arbitrary geometry that exhibit sharp features (e.g. rock granulates). While most codes avoid non-spherical or non-analytical shapes due to the computational complexity, we introduce an iterative contact detection method for triangulated geometries. The new method improves upon a naive brute-force approach which checks all possible geometric configurations of contact and thus exhibits substantial execution branching. Our iterative approach has limited branching and a high number of floating-point operations per processed byte. It is thus suitable for modern Single Instruction Multiple Data (SIMD) CPU hardware. As only the naive brute-force approach is robust and always yields a correct solution, we propose a hybrid solution that combines the best of the two worlds to produce fast and robust contacts. In terms of the DEM workflow, we furthermore propose a multilevel tree-based data structure strategy that holds all particles in the domain on multiple scales in grids. Grids reduce the total computational complexity of the simulation. The data structure is combined with the DEM phases to form a single-touch tree-based traversal that identifies contact points between particle pairs and introduces concurrency to the system during particle comparisons in one multiscale grid sweep. Finally, a reluctant adaptivity variant is introduced which enables us to realise an improved time stepping scheme with larger time steps than standard adaptivity while still minimising the grid administration overhead. Four different parallelisation strategies that exploit multicore architectures are discussed for the triad of methodological ingredients. Each parallelisation scheme exhibits unique behaviour depending on the grid and particle geometry at hand. The fusion of them into a task-based parallelisation workflow yields promising speedups. Our work shows that new computer architectures can push the boundary of DEM computability, but only if the right data structures and algorithms are chosen.
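
    The hybrid strategy can be pictured as a simple dispatcher: try the branch-poor, SIMD-friendly iterative search first and fall back to the robust brute-force comparison only when it fails. The function names below are illustrative placeholders, not the code's API, and the two contact routines are stubbed out:

        #include <optional>

        // Placeholder types; real triangles carry vertex coordinates.
        struct Triangle     {};
        struct ContactPoint { double x, y, z, penetrationDepth; };

        // Branch-poor iterative search; may fail for degenerate configurations,
        // signalled by std::nullopt. (Placeholder body: the real routine iterates
        // on a closest-point problem between the two triangles.)
        std::optional<ContactPoint> iterativeContact(const Triangle&, const Triangle&) {
          return std::nullopt;
        }

        // Robust brute-force check of all geometric configurations; always yields
        // an answer but branches heavily. (Placeholder body.)
        std::optional<ContactPoint> bruteForceContact(const Triangle&, const Triangle&) {
          return std::nullopt;
        }

        // The hybrid: take the fast path whenever it succeeds, fall back otherwise.
        std::optional<ContactPoint> hybridContact(const Triangle& a, const Triangle& b) {
          if (auto contact = iterativeContact(a, b)) {
            return contact;
          }
          return bruteForceContact(a, b);
        }

        int main() {
          Triangle a, b;
          (void)hybridContact(a, b);
          return 0;
        }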

    Asynchronous Teams and Tasks in a Message Passing Environment

    As the discipline of scientific computing grows, so too does the "skills gap" between increasingly complex scientific applications and the efficient algorithms they require. Increasing demand for computational power on the march towards exascale requires innovative approaches. Closing the skills gap avoids the many pitfalls that lead to poor utilisation of resources and wasted investment. This thesis tackles two challenges: asynchronous algorithms for parallel computing and fault tolerance. First, I present a novel asynchronous task invocation methodology for Discontinuous Galerkin codes called enclave tasking. The approach modifies the parallel ordering of tasks, which allows for efficient scaling on dynamic meshes up to 756 cores. It ensures high levels of concurrency and intermixes tasks of different computational properties. Critical tasks along domain boundaries are prioritised to overlap computation and communication. The second contribution is the teaMPI library, which forms teams of MPI processes exchanging consistency data through an asynchronous "heartbeat". In contrast to previous approaches, teaMPI operates fully asynchronously with reduced overhead. It is also capable of detecting individually slow or failing ranks and inconsistent data among replicas. Finally, I provide an outlook on how asynchronous teams using enclave tasking can be combined into an advanced team-based diffusive load balancing scheme. Both concepts are integrated into and contribute towards the ExaHyPE project, a next-generation code that solves hyperbolic equation systems on dynamically adaptive Cartesian grids.
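
    A hedged sketch of the enclave-tasking priority idea (names hypothetical, not taken from ExaHyPE): boundary "skeleton" tasks are drained first so that their results can be communicated early, while interior "enclave" tasks fill the remaining time and hide the communication.

        #include <functional>
        #include <queue>
        #include <vector>

        enum class TaskKind { Skeleton, Enclave };   // boundary vs. interior cell task

        struct CellTask {
          TaskKind              kind;
          std::function<void()> work;
        };

        // Order skeleton (boundary) tasks before enclave (interior) tasks.
        struct ByPriority {
          bool operator()(const CellTask& a, const CellTask& b) const {
            return a.kind == TaskKind::Enclave && b.kind == TaskKind::Skeleton;
          }
        };

        // Drain boundary tasks first so their results can be sent early; enclave
        // tasks are processed afterwards, overlapping computation and communication.
        void runTraversalTasks(std::vector<CellTask> tasks) {
          std::priority_queue<CellTask, std::vector<CellTask>, ByPriority> queue(
              ByPriority{}, std::move(tasks));
          while (!queue.empty()) {
            queue.top().work();
            queue.pop();
          }
        }

        int main() {
          int order = 0;
          runTraversalTasks({
              {TaskKind::Enclave,  [&order] { order = order * 10 + 2; }},
              {TaskKind::Skeleton, [&order] { order = order * 10 + 1; }},
          });
          return order == 12 ? 0 : 1;   // the skeleton task ran before the enclave task
        }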

    The EU Center of Excellence for Exascale in Solid Earth (ChEESE): Implementation, results, and roadmap for the second phase
