Matrix-free GPU implementation of a preconditioned conjugate gradient solver for anisotropic elliptic PDEs
Many problems in geophysical and atmospheric modelling require the fast
solution of elliptic partial differential equations (PDEs) in "flat"
three-dimensional geometries. In particular, an anisotropic elliptic PDE for the
pressure correction has to be solved at every time step in the dynamical core
of many numerical weather prediction models, and equations of a very similar
structure arise in global ocean models, subsurface flow simulations and gas and
oil reservoir modelling. The elliptic solve is often the bottleneck of the
forecast, and an algorithmically optimal method has to be used and implemented
efficiently. Graphics Processing Units have been shown to be highly efficient
for a wide range of applications in scientific computing, and recently
iterative solvers have been parallelised on these architectures. We describe
the GPU implementation and optimisation of a Preconditioned Conjugate Gradient
(PCG) algorithm for the solution of a three-dimensional anisotropic elliptic
PDE for the pressure correction in numerical weather prediction (NWP). Our
implementation exploits the strong
vertical anisotropy of the elliptic operator in the construction of a suitable
preconditioner. As the algorithm is memory bound, performance can be improved
significantly by reducing the amount of global memory access. We achieve this
by using a matrix-free implementation which does not require explicit storage
of the matrix and instead recalculates the local stencil. Global memory access
can also be reduced by rewriting the algorithm using loop fusion and we show
that this further reduces the runtime on the GPU. We demonstrate the
performance of our matrix-free GPU code by comparing it to a sequential CPU
implementation and to a matrix-explicit GPU code which uses existing libraries.
The absolute performance of the algorithm for different problem sizes is
quantified in terms of floating point throughput and global memory bandwidth.
Comment: 18 pages, 7 figures
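The ideas named in this abstract (matrix-free operator application, a preconditioner built from the strong vertical coupling, and loop fusion) can be illustrated with a minimal CPU sketch in plain Python. This is not the authors' GPU code: the model operator (identity plus a 7-point stencil with zero boundary values), the coupling strengths `lam_h` and `lam_v`, and all function names are assumptions chosen for illustration only.

```python
def apply_operator(u, nx, ny, nz, lam_h, lam_v):
    """Matrix-free y = A u: the 7-point stencil is recomputed, never stored."""
    y = [0.0] * len(u)
    plane = ny * nz
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                c = (i * ny + j) * nz + k  # vertical index k is contiguous
                s = (1.0 + 4.0 * lam_h + 2.0 * lam_v) * u[c]
                if i > 0:
                    s -= lam_h * u[c - plane]
                if i < nx - 1:
                    s -= lam_h * u[c + plane]
                if j > 0:
                    s -= lam_h * u[c - nz]
                if j < ny - 1:
                    s -= lam_h * u[c + nz]
                if k > 0:
                    s -= lam_v * u[c - 1]
                if k < nz - 1:
                    s -= lam_v * u[c + 1]
                y[c] = s
    return y


def apply_preconditioner(r, nx, ny, nz, lam_h, lam_v):
    """z = M^{-1} r, where M keeps only the diagonal and vertical couplings.

    Each vertical column is an independent tridiagonal system (Thomas algorithm).
    """
    z = [0.0] * len(r)
    diag = 1.0 + 4.0 * lam_h + 2.0 * lam_v
    for col in range(nx * ny):
        base = col * nz
        cp = [0.0] * nz
        dp = [0.0] * nz
        cp[0] = -lam_v / diag
        dp[0] = r[base] / diag
        for k in range(1, nz):  # forward elimination
            denom = diag + lam_v * cp[k - 1]
            cp[k] = -lam_v / denom
            dp[k] = (r[base + k] + lam_v * dp[k - 1]) / denom
        z[base + nz - 1] = dp[nz - 1]
        for k in range(nz - 2, -1, -1):  # back substitution
            z[base + k] = dp[k] - cp[k] * z[base + k + 1]
    return z


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pcg(b, nx, ny, nz, lam_h, lam_v, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradients; returns (solution, iterations)."""
    u = [0.0] * len(b)
    r = list(b)
    z = apply_preconditioner(r, nx, ny, nz, lam_h, lam_v)
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for it in range(1, max_iter + 1):
        q = apply_operator(p, nx, ny, nz, lam_h, lam_v)
        alpha = rz / dot(p, q)
        # Fused loop: update u and r and accumulate ||r||^2 in a single pass,
        # instead of three separate sweeps over memory.
        rr = 0.0
        for c in range(len(u)):
            u[c] += alpha * p[c]
            r[c] -= alpha * q[c]
            rr += r[c] * r[c]
        if rr ** 0.5 <= tol * b_norm:
            return u, it
        z = apply_preconditioner(r, nx, ny, nz, lam_h, lam_v)
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zc + beta * pc for zc, pc in zip(z, p)]
        rz = rz_new
    return u, max_iter


if __name__ == "__main__":
    nx, ny, nz = 4, 4, 8
    rhs = [1.0] * (nx * ny * nz)
    solution, iterations = pcg(rhs, nx, ny, nz, lam_h=1.0, lam_v=100.0)
    print("PCG iterations:", iterations)
```

In this sketch each column solve is independent of the others, which is the kind of parallelism a GPU implementation could exploit. The fused update loop stands in for the reduction in global memory traffic that the abstract attributes to loop fusion.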
Spectral methods for CFD
One of the objectives of these notes is to provide a basic introduction to spectral methods with a particular emphasis on applications to computational fluid dynamics. Another objective is to summarize some of the most important developments in spectral methods in the last two years. The fundamentals of spectral methods for simple problems will be covered in depth, and the essential elements of several fluid dynamical applications will be sketched.
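As a small illustration of what a spectral method does for a "simple problem", the sketch below differentiates a periodic function by multiplying its Fourier coefficients by i·k. It is not taken from the notes: the function names are hypothetical, and a naive O(n²) discrete Fourier transform from the standard library is used in place of an FFT to keep the example self-contained.

```python
import cmath
import math


def dft(values, sign):
    """Naive discrete Fourier transform; sign=-1 is forward, +1 is the unscaled inverse."""
    n = len(values)
    return [
        sum(values[m] * cmath.exp(sign * 2j * math.pi * m * k / n) for m in range(n))
        for k in range(n)
    ]


def spectral_derivative(u):
    """Derivative of a 2*pi-periodic function sampled at x_m = 2*pi*m/n."""
    n = len(u)
    u_hat = dft(u, -1)
    du_hat = []
    for k in range(n):
        wave = k if k < n / 2 else k - n  # map index to signed wavenumber
        if n % 2 == 0 and k == n // 2:
            wave = 0  # the Nyquist mode is zeroed for odd-order derivatives
        du_hat.append(1j * wave * u_hat[k])
    return [(v / n).real for v in dft(du_hat, +1)]


if __name__ == "__main__":
    n = 16
    x = [2.0 * math.pi * m / n for m in range(n)]
    u = [math.sin(3.0 * xm) for xm in x]
    du = spectral_derivative(u)
    err = max(abs(d - 3.0 * math.cos(3.0 * xm)) for d, xm in zip(du, x))
    print("max error:", err)
```

For a band-limited input such as sin(3x), the computed derivative should agree with the exact one up to rounding error, which reflects the rapid convergence that makes spectral methods attractive for smooth problems.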
An adaptive Cartesian embedded boundary approach for fluid simulations of two- and three-dimensional low temperature plasma filaments in complex geometries
We review a scalable two- and three-dimensional computer code for
low-temperature plasma simulations in multi-material complex geometries. Our
approach is based on embedded boundary (EB) finite volume discretizations of
the minimal fluid-plasma model on adaptive Cartesian grids, extended to also
account for charging of insulating surfaces. We discuss the spatial and
temporal discretization methods, and show that the resulting overall method is
second order convergent, monotone, and conservative (for smooth solutions).
Weak scalability with parallel efficiencies over 70% is demonstrated up to
8192 cores and more than one billion cells. We then demonstrate the use of
adaptive mesh refinement in multiple two- and three-dimensional simulation
examples at modest core counts. The examples include two-dimensional
simulations of surface streamers along insulators with surface roughness; fully
three-dimensional simulations of filaments in experimentally realizable
pin-plane geometries; and three-dimensional simulations of positive plasma
discharges in multi-material complex geometries. The largest computational
example uses up to million mesh cells with billions of unknowns on
computing cores. Our use of computer-aided design (CAD) and constructive solid
geometry (CSG) combined with capabilities for parallel computing offers
possibilities for performing three-dimensional transient plasma-fluid
simulations, also in multi-material complex geometries at moderate pressures
and comparatively large scale.
Comment: 40 pages, 21 figures
Schnelle Löser für Partielle Differentialgleichungen
This workshop was well attended by 52 participants with broad geographic representation from 11 countries and 3 continents. It was a nice blend of researchers with various backgrounds.
A high-order semi-explicit discontinuous Galerkin solver for 3D incompressible flow with application to DNS and LES of turbulent channel flow
We present an efficient discontinuous Galerkin scheme for simulation of the
incompressible Navier-Stokes equations including laminar and turbulent flow. We
consider a semi-explicit high-order velocity-correction method for time
integration as well as nodal equal-order discretizations for velocity and
pressure. The non-linear convective term is treated explicitly while a linear
system is solved for the pressure Poisson equation and the viscous term. The
key feature of our solver is a consistent penalty term reducing the local
divergence error in order to overcome recently reported instabilities in
spatially under-resolved high-Reynolds-number flows as well as small time
steps. This penalty method is similar to the grad-div stabilization widely used
in continuous finite elements. We further review and compare our method to
several other techniques recently proposed in the literature to stabilize the
method for such flow configurations. The solver is specifically designed for
large-scale computations through matrix-free linear solvers including efficient
preconditioning strategies and tensor-product elements, which have allowed us
to scale this code up to 34.4 billion degrees of freedom and 147,456 CPU cores.
We validate our code and demonstrate optimal convergence rates with laminar
flows present in a vortex problem and flow past a cylinder and show
applicability of our solver to direct numerical simulation as well as implicit
large-eddy simulation of turbulent channel flow at as well as
.
Comment: 28 pages, in preparation for submission to Journal of Computational Physics