Matrix-free GPU implementation of a preconditioned conjugate gradient solver for anisotropic elliptic PDEs

Authors: Guo Xu; Mueller Eike; Scheichl Robert; Shi Sinan
Publication date: 01/01/2013

Many problems in geophysical and atmospheric modelling require the fast solution of elliptic partial differential equations (PDEs) in "flat" three-dimensional geometries. In particular, an anisotropic elliptic PDE for the pressure correction has to be solved at every time step in the dynamical core of many numerical weather prediction (NWP) models, and equations of a very similar structure arise in global ocean models, subsurface flow simulations and gas and oil reservoir modelling. The elliptic solve is often the bottleneck of the forecast, so an algorithmically optimal method has to be used and implemented efficiently. Graphics Processing Units (GPUs) have been shown to be highly efficient for a wide range of applications in scientific computing, and recently iterative solvers have been parallelised on these architectures. We describe the GPU implementation and optimisation of a Preconditioned Conjugate Gradient (PCG) algorithm for the solution of a three-dimensional anisotropic elliptic PDE for the pressure correction in NWP. Our implementation exploits the strong vertical anisotropy of the elliptic operator in the construction of a suitable preconditioner. As the algorithm is memory bound, performance can be improved significantly by reducing the amount of global memory access. We achieve this by using a matrix-free implementation which does not require explicit storage of the matrix and instead recalculates the local stencil. Global memory access can also be reduced by rewriting the algorithm using loop fusion, and we show that this further reduces the runtime on the GPU. We demonstrate the performance of our matrix-free GPU code by comparing it to a sequential CPU implementation and to a matrix-explicit GPU code which uses existing libraries. The absolute performance of the algorithm for different problem sizes is quantified in terms of floating point throughput and global memory bandwidth. Comment: 18 pages, 7 figures
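The matrix-free idea in the abstract above can be illustrated with a small CPU sketch in plain Python. It is not the authors' GPU code. The model operator (a 7-point stencil with couplings `AH` and `AV`, a zeroth-order term `C`, and zero Dirichlet boundaries), the function names, and the choice of a vertical tridiagonal line solve as the preconditioner are all assumptions made for illustration. The operator is never stored: `apply_operator` recomputes the stencil at every grid point.

```python
# Matrix-free preconditioned conjugate gradient (PCG) on an nx * ny * nz grid.
# Hypothetical model operator, NOT the one from the paper: a 7-point stencil
# with weak horizontal coupling AH, strong vertical coupling AV, a zeroth-order
# term C, and zero Dirichlet boundaries (out-of-range neighbours count as 0).
AH, AV, C = 1.0, 100.0, 1.0
DIAG = 4.0 * AH + 2.0 * AV + C


def apply_operator(u, nx, ny, nz):
    """Return A*u by recomputing the stencil on the fly (no stored matrix)."""
    out = [0.0] * len(u)
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                c = (i * ny + j) * nz + k  # k (vertical) is the fastest index
                s = DIAG * u[c]
                if i > 0:
                    s -= AH * u[c - ny * nz]
                if i < nx - 1:
                    s -= AH * u[c + ny * nz]
                if j > 0:
                    s -= AH * u[c - nz]
                if j < ny - 1:
                    s -= AH * u[c + nz]
                if k > 0:
                    s -= AV * u[c - 1]
                if k < nz - 1:
                    s -= AV * u[c + 1]
                out[c] = s
    return out


def solve_vertical_lines(r, nx, ny, nz):
    """Preconditioner: solve tridiag(-AV, DIAG, -AV) z = r in every column."""
    z = [0.0] * len(r)
    for col in range(nx * ny):
        base = col * nz
        cp = [0.0] * nz  # modified super-diagonal (Thomas algorithm)
        dp = [0.0] * nz  # modified right-hand side
        cp[0] = -AV / DIAG
        dp[0] = r[base] / DIAG
        for k in range(1, nz):
            denom = DIAG + AV * cp[k - 1]
            cp[k] = -AV / denom
            dp[k] = (r[base + k] + AV * dp[k - 1]) / denom
        z[base + nz - 1] = dp[nz - 1]
        for k in range(nz - 2, -1, -1):
            z[base + k] = dp[k] - cp[k] * z[base + k + 1]
    return z


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pcg(b, nx, ny, nz, tol=1e-10, max_iter=200):
    """Solve A x = b; returns (x, number of iterations performed)."""
    x = [0.0] * len(b)
    b_norm = dot(b, b) ** 0.5
    if b_norm == 0.0:
        return x, 0
    r = list(b)
    z = solve_vertical_lines(r, nx, ny, nz)
    p = list(z)
    rz = dot(r, z)
    for it in range(1, max_iter + 1):
        ap = apply_operator(p, nx, ny, nz)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
        z = solve_vertical_lines(r, nx, ny, nz)
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


if __name__ == "__main__":
    nx, ny, nz = 4, 4, 6
    solution, iterations = pcg([1.0] * (nx * ny * nz), nx, ny, nz)
    print("iterations:", iterations)
```

In this sketch the vertical index is stored contiguously, so each column solve touches one contiguous slice. Separate passes are used for the vector updates for readability; fusing them into fewer loops is the kind of rewrite the abstract refers to as loop fusion.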

arXiv.org e-Print Archive

CiteSeerX

OPUS

Crossref

Spectral methods for CFD

Authors: Hussaini M. Yousuff; Streett Craig L.; Zang Thomas A.

One of the objectives of these notes is to provide a basic introduction to spectral methods with a particular emphasis on applications to computational fluid dynamics. Another objective is to summarize some of the most important developments in spectral methods in the last two years. The fundamentals of spectral methods for simple problems will be covered in depth, and the essential elements of several fluid dynamical applications will be sketched.

NASA Technical Reports Server

An adaptive Cartesian embedded boundary approach for fluid simulations of two- and three-dimensional low temperature plasma filaments in complex geometries

Author: Marskar Robert
Publication venue: Elsevier BV
Publication date: 01/01/2019

We review a scalable two- and three-dimensional computer code for low-temperature plasma simulations in multi-material complex geometries. Our approach is based on embedded boundary (EB) finite volume discretizations of the minimal fluid-plasma model on adaptive Cartesian grids, extended to also account for charging of insulating surfaces. We discuss the spatial and temporal discretization methods, and show that the resulting overall method is second-order convergent, monotone, and conservative (for smooth solutions). Weak scalability with parallel efficiencies over 70% is demonstrated up to 8192 cores and more than one billion cells. We then demonstrate the use of adaptive mesh refinement in multiple two- and three-dimensional simulation examples at modest core counts. The examples include two-dimensional simulations of surface streamers along insulators with surface roughness; fully three-dimensional simulations of filaments in experimentally realizable pin-plane geometries; and three-dimensional simulations of positive plasma discharges in multi-material complex geometries. The largest computational example uses up to 800 million mesh cells with billions of unknowns on 4096 computing cores. Our use of computer-aided design (CAD) and constructive solid geometry (CSG) combined with capabilities for parallel computing offers possibilities for performing three-dimensional transient plasma-fluid simulations, also in multi-material complex geometries at moderate pressures and comparatively large scale. Comment: 40 pages, 21 figures

arXiv.org e-Print Archive

SINTEF Open

Schnelle Löser für Partielle Differentialgleichungen

Publication venue: Zürich : EMS Publ. House
Publication date: 01/01/2014

This workshop was well attended by 52 participants with broad geographic representation from 11 countries and 3 continents. It was a nice blend of researchers with various backgrounds.

Repositorium für Naturwissenschaften und Technik

A high-order semi-explicit discontinuous Galerkin solver for 3D incompressible flow with application to DNS and LES of turbulent channel flow

Authors: Fehn Niklas; Krank Benjamin; Kronbichler Martin; Wall Wolfgang A.
Publication venue: Elsevier BV
Publication date: 05/07/2016

We present an efficient discontinuous Galerkin scheme for simulation of the incompressible Navier-Stokes equations including laminar and turbulent flow. We consider a semi-explicit high-order velocity-correction method for time integration as well as nodal equal-order discretizations for velocity and pressure. The non-linear convective term is treated explicitly while a linear system is solved for the pressure Poisson equation and the viscous term. The key feature of our solver is a consistent penalty term reducing the local divergence error in order to overcome recently reported instabilities in spatially under-resolved high-Reynolds-number flows as well as small time steps. This penalty method is similar to the grad-div stabilization widely used in continuous finite elements. We further review and compare our method to several other techniques recently proposed in literature to stabilize the method for such flow configurations. The solver is specifically designed for large-scale computations through matrix-free linear solvers including efficient preconditioning strategies and tensor-product elements, which have allowed us to scale this code up to 34.4 billion degrees of freedom and 147,456 CPU cores. We validate our code and demonstrate optimal convergence rates with laminar flows present in a vortex problem and flow past a cylinder and show applicability of our solver to direct numerical simulation as well as implicit large-eddy simulation of turbulent channel flow at $Re_{\tau} = 180$ as well as $590$. Comment: 28 pages, in preparation for submission to Journal of Computational Physics

arXiv.org e-Print Archive

OPUS Augsburg

Crossref