194 research outputs found

Dynamic task fusion for a block-structured finite volume solver over a dynamically adaptive mesh with local time stepping

Author: Li Baojiu
Schulz Holger
Weinzierl Tobias
Zhang Han
Publication venue: Springer Verlag
Publication date: 29/05/2022

Load balancing of generic wave equation solvers over dynamically adaptive meshes with local time stepping is difficult, as the load changes with every time step. Task-based programming promises to mitigate the load balancing problem. We study a Finite Volume code over dynamically adaptive block-structured meshes for two astrophysics simulations, where the patches (blocks) define tasks. They are classified into urgent and low priority tasks. Urgent tasks are algorithmically latency-sensitive. They are processed directly as part of our bulk-synchronous mesh traversals. Non-urgent tasks are held back in an additional task queue on top of the task runtime system. If they lack global side-effects, i.e. do not alter the global solver state, we can generate optimised compute kernels for these tasks. Furthermore, we propose to use the additional queue to merge tasks without side-effects into task assemblies, and to balance out imbalanced bulk-synchronous processing phases.

Durham Research Online

SFC-based Communication Metadata Encoding for Adaptive Mesh

Author: Bungartz H-J
Schreiber M
Weinzierl T
Publication venue: IOS Press
Publication date: 31/03/2016

This volume of the series “Advances in Parallel Computing” contains the proceedings of the International Conference on Parallel Programming – ParCo 2013 – held from 10 to 13 September 2013 in Garching, Germany. The conference was hosted by the Technische Universität München (Department of Informatics) and the Leibniz Supercomputing Centre. The present paper studies two adaptive mesh refinement (AMR) codes whose grids rely on recursive subdivision in combination with space-filling curves (SFCs). A non-overlapping domain decomposition based upon these SFCs yields several well-known advantageous properties with respect to communication demands, balancing, and partition connectivity. However, the administration of the meta data, i.e. tracking which partitions exchange data in which cardinality, is nontrivial due to the SFC’s fractal meandering and the dynamic adaptivity. We introduce an analysed tree grammar for the meta data that restricts it without loss of information hierarchically along the subdivision tree and applies run-length encoding. Hence, its meta data memory footprint is very small, and it can be computed and maintained on-the-fly even for permanently changing grids. It facilitates a fork-join pattern for shared data parallelism. It also facilitates replicated data parallelism, tackling latency and bandwidth constraints respectively through communication in the background, and reduces memory requirements by avoiding adjacency information stored per element. We demonstrate this by means of shared and distributed parallelized domain decompositions. This work was supported by the German Research Foundation (DFG) as part of the Transregional Collaborative Research Centre “Invasive Computing” (SFB/TR 89). It is partially based on work supported by Award No. UK-c0020, made by the King Abdullah University of Science and Technology (KAUST).
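
The run-length encoding step mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the paper's tree grammar: it only assumes that communication metadata can be written as a sequence of neighbour ranks in SFC order, and the names `rle_encode`, `rle_decode` and `boundary_ranks` are hypothetical.

```python
from itertools import groupby

def rle_encode(ranks):
    """Collapse each run of equal neighbour ranks into a (rank, count) pair."""
    return [(rank, sum(1 for _ in run)) for rank, run in groupby(ranks)]

def rle_decode(pairs):
    """Expand (rank, count) pairs back into the per-element sequence."""
    return [rank for rank, count in pairs for _ in range(count)]

# Hypothetical input: the rank that owns the neighbouring cell for each
# shared face along one partition boundary, listed in SFC order.
boundary_ranks = [1, 1, 1, 3, 3, 1, 1, 1, 1]

encoded = rle_encode(boundary_ranks)
print(encoded)  # [(1, 3), (3, 2), (1, 4)]
```

Because a space-filling curve tends to keep neighbouring cells on the same partition, long runs are plausible in such a sequence, which is why storing run lengths instead of per-element adjacency can shrink the metadata.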

Open Research Exeter

A parallel adaptive method for simulating shock-induced combustion with detailed chemical kinetics in complex domains

Author: Azarenok
Bell
Benkiewicz
Berger
Berger
Berger
Courant
Crutchfield
Deiterding
Einfeldt
Fedkiw
Fickett
Grossmann
Harten
Henshaw
Hu
Janenko
Kaps
Khokhlov
Khokhlov
Larrouturou
LeVeque
Loth
Mader
Mittal
Nettleton
Oran
Oran
Osher
Quirk
Ralf Deiterding
Rendleman
Sanders
Sharpe
Strehlow
Strehlow
Thomas
Toro
van Leer
Westbrook
Williams
Yuan
Publication venue: Elsevier BV
Publication date

Crossref

Generation of initial molecular dynamics configurations in arbitrary geometries and in parallel

Author: Borg M.K.
Macpherson G.B.
Reese J.M.
Publication venue: Informa UK Limited
Publication date: 01/12/2007

A computational pre-processing tool for generating initial configurations of molecules for molecular dynamics simulations in geometries described by a mesh of unstructured arbitrary polyhedra is described. The mesh is divided into separate zones and each can be filled with a single crystal lattice of atoms. Each zone is filled by creating an expanding cube of crystal unit cells, initiated from an anchor point for the lattice. Each unit cell places the appropriate atoms for the user-specified crystal structure and orientation. The cube expands until the entire zone is filled with the lattice; zones with concave and disconnected volumes may be filled. When the mesh is spatially decomposed into portions for distributed parallel processing, each portion may be filled independently, meaning that the entire molecular system never needs to fit onto a single processor, allowing very large systems to be created. The computational time required to fill a zone with molecules scales linearly with the number of cells in the zone for a fixed number of molecules, and better than linearly with the number of molecules for a fixed number of mesh cells. Our tool, molConfig, has been implemented in the open source C++ code OpenFOAM

University of Strathclyde Institutional Repository

Recommended from our members

Computational Fluid Dynamics with Embedded Cut Cells on Graphics Hardware

Author: Roosing Alo
Publication venue: University of Cambridge
Publication date: 31/10/2019

The advent of general purpose computing on graphics cards has led to significant software speedup in many fields. Designing code for GPUs, however, requires careful consideration of the underlying hardware. This thesis explores the implementation of fluid dynamics simulations featuring embedded cut cells using the CUDA programming platform. We demonstrate efficient generation and handling of geometric surface data in rectilinear computational grids. This is added to a split Euler solver to define piecewise linear cut cells describing solid surfaces in fluid flows. To reduce the memory footprint of embedded boundaries, we present a system of compressed data structures. The software is extended to run on multiple graphics cards and shows good scaling. Simulating embedded boundaries requires a description of object surfaces. We implement a fast and robust narrow band signed distance field generator for graphics cards based on the Characteristic/Scan Conversion algorithm for stereolithography files. The thesis presents an augmented approach to handle commonly occurring complex configurations and we show that the method is correct for all closed surfaces. We discuss efficient feature construction and work scheduling and demonstrate high-speed distance generation for complex geometries. At the core of our simulation implementation is a split Euler solver for high-speed flow. We present a one-dimensional method that achieves coalesced memory access and uses shared memory caching to best harness the potential of GPU hardware. Multidimensional simulations use a framework of data transposes to align data with sweep dimensions to maintain optimal memory access. Analysis of the solver shows that compute resources are used efficiently. The solver is extended to include cut cells describing solid boundaries in the domain. We present a compression and mapping method to reduce the memory footprint of the surface information. 
The cut cell solver is validated with different flow regimes and we simulate shock wave interaction with complex geometries to demonstrate the stability of the implementation. We conclude with multi-card parallelisation and analyse existing literature on domain segmentation and GPU communication. We present a system of domain splitting and message passing with overlapping compute and communication streams. A comparison of naïve and GPU-aware Open MPI shows the benefits of using CUDA specific library calls. The complete software pipeline demonstrates good scaling for up to thirty-two cards on a GPU cluster

Apollo (Cambridge)

Efficient GPU Offloading with OpenMP for a Hyperbolic Finite Volume Solver on Dynamically Adaptive Meshes

Author: Bader M
Brito Gadeschi G
Weinzierl T
Wille M
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2023

We identify and show how to overcome an OpenMP bottleneck in the administration of GPU memory. It arises for a wave equation solver on dynamically adaptive block-structured Cartesian meshes, which keeps all CPU threads busy and allows all of them to offload sets of patches to the GPU. Our studies show that multithreaded, concurrent, non-deterministic access to the GPU leads to performance breakdowns, since the GPU memory bookkeeping as offered through OpenMP’s map clause, i.e., the allocation and freeing, becomes another runtime challenge besides expensive data transfer and actual computation. We, therefore, propose to retain the memory management responsibility on the host: A caching mechanism acquires memory on the accelerator for all CPU threads, keeps hold of this memory and hands it out to the offloading threads upon demand. We show that this user-managed, CPU-based memory administration helps us to overcome the GPU memory bookkeeping bottleneck and speeds up the time-to-solution of Finite Volume kernels by more than an order of magnitude
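
The caching idea described in this abstract (acquire memory once, keep hold of it, hand it out on demand) can be sketched in Python as follows. This is a conceptual illustration under loud assumptions: a `bytearray` stands in for a device allocation, no OpenMP or GPU call is involved, and `BufferPool`, `acquire` and `release` are hypothetical names rather than the paper's interface.

```python
import threading
from collections import defaultdict

class BufferPool:
    """Host-managed cache of buffers, keyed by size (assumed stand-in for GPU memory)."""

    def __init__(self):
        self._free = defaultdict(list)   # size -> list of idle buffers
        self._lock = threading.Lock()    # many threads may request buffers
        self.allocations = 0             # counts the expensive "real" allocations

    def acquire(self, size):
        """Hand out a cached buffer if one is idle, otherwise allocate a new one."""
        with self._lock:
            if self._free[size]:
                return self._free[size].pop()
            self.allocations += 1
        return bytearray(size)           # placeholder for a device allocation

    def release(self, buf):
        """Return the buffer to the cache instead of freeing it."""
        with self._lock:
            self._free[len(buf)].append(buf)

pool = BufferPool()
for _ in range(100):                     # 100 offloads of equally sized patches
    buf = pool.acquire(4096)
    pool.release(buf)

print(pool.allocations)  # 1
```

In this toy setting, one hundred requests trigger a single allocation, which mirrors the abstract's point that repeated allocation and freeing, not only data transfer, can dominate the runtime.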

Durham Research Online

Efficient GPU Offloading with OpenMP for a Hyperbolic Finite Volume Solver on Dynamically Adaptive Meshes

Author: Baboulin M.
Bader M
Bhatele A.
Brito Gadeschi G
Hammond J.
Kruse C.
Weinzierl T
Wille M
Publication venue: Springer Verlag
Publication date: 10/05/2023

Durham Research Online

Study of interpolation methods for high-accuracy computations on overlapping grids

Author: CHICHEPORTICHE Jérémie
GLOERFELT Xavier
Publication venue: Elsevier BV
Publication date: 01/01/2012

An overset strategy can be an efficient way to keep high-accuracy discretization by decomposing a complex geometry into topologically simple subdomains. Apart from the grid assembly algorithm, the key point of the overset technique lies in the interpolation processes which ensure the communications between the overlapping grids. The family of explicit Lagrange and optimized interpolation schemes is studied. The a priori interpolation error is analyzed in the Fourier space, and combined with the error of the chosen discretization to highlight the modification of the numerical error. When high-accuracy algorithms are used, an optimization of the interpolation coefficients can enhance the resolvability, which can be useful when high-frequency waves or small turbulent scales need to be supported by a grid. For general curvilinear grids in more than one space dimension, a mapping in a computational space followed by a tensorization of 1-D interpolations is preferred to a direct evaluation of the coefficient in the physical domain. A high-order extension of the isoparametric mapping is accurate and robust since it avoids the inversion of a matrix which may be ill-conditioned. A posteriori error analyses indicate that the interpolation stencil size must be tailored to the accuracy of the discretization scheme. For well discretized wavelengths, the results show that the choice of a stencil smaller than the stencil of the corresponding finite-difference scheme can be acceptable. Besides, the gain of optimization to capture high-frequency phenomena is also underlined. Adding order constraints to the optimization allows an interesting trade-off when a large range of scales is considered. Finally, the ability of the present overset strategy to preserve accuracy is illustrated by the diffraction of an acoustic source by two cylinders, and the generation of acoustic tones in a rotor–stator interaction. Some recommendations are formulated in the closing section.
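
As a small illustration of the explicit Lagrange interpolation this abstract refers to, the Python sketch below computes 1-D interpolation weights for a receiver point lying between donor grid points. It covers only the standard, non-optimized Lagrange case; the four-point stencil, the node positions and the name `lagrange_weights` are assumptions made for the example.

```python
def lagrange_weights(nodes, x):
    """Return the Lagrange basis values l_j(x) for the given donor nodes."""
    weights = []
    for j, xj in enumerate(nodes):
        w = 1.0
        for m, xm in enumerate(nodes):
            if m != j:
                w *= (x - xm) / (xj - xm)
        weights.append(w)
    return weights

# Hypothetical four-point donor stencil on a uniform grid, with the
# receiver point of the overlapping grid at x = 1.5.
donor_nodes = [0.0, 1.0, 2.0, 3.0]
weights = lagrange_weights(donor_nodes, 1.5)

# A four-point stencil reproduces cubic polynomials exactly.
donor_values = [xi ** 3 for xi in donor_nodes]
interpolated = sum(w * v for w, v in zip(weights, donor_values))

print(weights)       # [-0.0625, 0.5625, 0.5625, -0.0625]
print(interpolated)  # 3.375
```

In more than one dimension, the tensorization mentioned in the abstract would apply such 1-D weights direction by direction in the computational space; that step is not shown here.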

Crossref

HAL Descartes

SAM : Science Arts et Métiers

Hal-Diderot

Spectral/hp element methods: recent developments, applications, and perspectives

Author: Cantwell Chris D.
Engsig-Karup Allan P.
Eskilsson Claes
Monteserin Carlos
Sherwin Spencer J.
Xu Hui
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

The spectral/hp element method combines the geometric flexibility of the classical h-type finite element technique with the desirable numerical properties of spectral methods, employing high-degree piecewise polynomial basis functions on coarse finite element-type meshes. The spatial approximation is based upon orthogonal polynomials, such as Legendre or Chebyshev polynomials, modified to accommodate C0-continuous expansions. Computationally and theoretically, by increasing the polynomial order p, high-precision solutions and fast convergence can be obtained and, in particular, under certain regularity assumptions an exponential reduction in approximation error between numerical and exact solutions can be achieved. This method has now been applied in many simulation studies of both fundamental and practical engineering flows. This paper briefly describes the formulation of the spectral/hp element method and provides an overview of its application to computational fluid dynamics. In particular, it focuses on the use of the spectral/hp element method in transitional flows and ocean engineering. Finally, some of the major challenges to be overcome in order to use the spectral/hp element method in more complex science and engineering applications are discussed.

arXiv.org e-Print Archive

Crossref

VBN

Online Research Database In Technology

Chaste: a test-driven approach to software development for biological modelling

Author: Bernabeu M O
Bordas R
Byrne H. M.
Chapman S. J.
Cooper J
Fletcher A G
Garny A.
Gavaghan D. J.
Maini P. K.
Mirams G R
Murray P J
Osbourne J
Pathmanathan P.
Pitt-Francis J
Rodriguez B
van Leeuwen I. M. M.
Walter A
Waters S. L.
Whiteley J. P.
Publication venue: Elsevier
Publication date: 01/01/2009

Chaste (‘Cancer, heart and soft-tissue environment’) is a software library and a set of test suites for computational simulations in the domain of biology. Current functionality has arisen from modelling in the fields of cancer, cardiac physiology and soft-tissue mechanics. It is released under the LGPL 2.1 licence. Chaste has been developed using agile programming methods. The project began in 2005 when it was reasoned that the modelling of a variety of physiological phenomena required both a generic mathematical modelling framework, and a generic computational/simulation framework. The Chaste project evolved from the Integrative Biology (IB) e-Science Project, an inter-institutional project aimed at developing a suitable IT infrastructure to support physiome-level computational modelling, with a primary focus on cardiac and cancer modelling.

Crossref

Repository@Nottingham

UCL Discovery

Oxford University Research Archive

University of Dundee Online Publications