    Matrix-free GPU implementation of a preconditioned conjugate gradient solver for anisotropic elliptic PDEs

    Many problems in geophysical and atmospheric modelling require the fast solution of elliptic partial differential equations (PDEs) in "flat" three-dimensional geometries. In particular, an anisotropic elliptic PDE for the pressure correction has to be solved at every time step in the dynamical core of many numerical weather prediction (NWP) models, and equations of a very similar structure arise in global ocean models, subsurface flow simulations and gas and oil reservoir modelling. The elliptic solve is often the bottleneck of the forecast, so an algorithmically optimal method has to be used and implemented efficiently. Graphics Processing Units (GPUs) have been shown to be highly efficient for a wide range of applications in scientific computing, and recently iterative solvers have been parallelised on these architectures. We describe the GPU implementation and optimisation of a Preconditioned Conjugate Gradient (PCG) algorithm for the solution of a three-dimensional anisotropic elliptic PDE for the pressure correction in NWP. Our implementation exploits the strong vertical anisotropy of the elliptic operator in the construction of a suitable preconditioner. As the algorithm is memory bound, performance can be improved significantly by reducing the amount of global memory access. We achieve this by using a matrix-free implementation which does not require explicit storage of the matrix and instead recalculates the local stencil. Global memory access can also be reduced by rewriting the algorithm using loop fusion, and we show that this further reduces the runtime on the GPU. We demonstrate the performance of our matrix-free GPU code by comparing it to a sequential CPU implementation and to a matrix-explicit GPU code which uses existing libraries. The absolute performance of the algorithm for different problem sizes is quantified in terms of floating point throughput and global memory bandwidth. Comment: 18 pages, 7 figures
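The matrix-free PCG idea in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' GPU code. The 7-point stencil, the anisotropy parameter `lam`, the zero Dirichlet boundaries, and the function names `apply_operator`, `apply_preconditioner` and `pcg` are all assumptions made for illustration. The operator is recomputed from the stencil on every application instead of being stored as a matrix, and the preconditioner solves a tridiagonal system along each vertical column, which is one simple way to exploit a strong vertical coupling.

```python
import numpy as np

def apply_operator(u, lam):
    """Matrix-free 7-point stencil (assumed model problem): horizontal
    Laplacian plus a vertical term scaled by lam, zero Dirichlet boundaries."""
    p = np.pad(u, 1)  # zero halo around the grid
    horiz = (4.0 * u
             - p[2:, 1:-1, 1:-1] - p[:-2, 1:-1, 1:-1]
             - p[1:-1, 2:, 1:-1] - p[1:-1, :-2, 1:-1])
    vert = lam * (2.0 * u - p[1:-1, 1:-1, 2:] - p[1:-1, 1:-1, :-2])
    return horiz + vert

def apply_preconditioner(r, lam):
    """Vertical line preconditioner: solve the tridiagonal system
    (diag = 4 + 2*lam, off-diagonal = -lam) in every column at once
    using the Thomas algorithm."""
    nz = r.shape[2]
    diag, off = 4.0 + 2.0 * lam, -lam
    c = np.empty(nz)        # modified upper-diagonal coefficients
    d = np.empty_like(r)    # modified right-hand side
    c[0] = off / diag
    d[..., 0] = r[..., 0] / diag
    for k in range(1, nz):  # forward elimination
        denom = diag - off * c[k - 1]
        c[k] = off / denom
        d[..., k] = (r[..., k] - off * d[..., k - 1]) / denom
    z = np.empty_like(r)
    z[..., -1] = d[..., -1]
    for k in range(nz - 2, -1, -1):  # back substitution
        z[..., k] = d[..., k] - c[k] * z[..., k + 1]
    return z

def pcg(b, lam, tol=1e-8, max_iter=500):
    """Preconditioned conjugate gradient; returns (solution, iterations)."""
    u = np.zeros_like(b)
    r = b.copy()  # residual for the zero initial guess
    z = apply_preconditioner(r, lam)
    p = z.copy()
    rz = np.vdot(r, z)
    b_norm = np.linalg.norm(b)
    for it in range(1, max_iter + 1):
        Ap = apply_operator(p, lam)
        alpha = rz / np.vdot(p, Ap)
        u += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * b_norm:
            return u, it
        z = apply_preconditioner(r, lam)
        rz_new = np.vdot(r, z)
        p = z + (rz_new / rz) * p
        rz = rz_new
    return u, max_iter

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    b = rng.standard_normal((6, 6, 12))
    u, iters = pcg(b, lam=100.0)
    print("iterations:", iters)
```

In this sketch the vector updates and dot products are separate NumPy calls. The loop fusion mentioned in the abstract would merge such steps into fewer passes over memory, which NumPy cannot express directly.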

    Spectral methods for CFD

    One of the objectives of these notes is to provide a basic introduction to spectral methods, with a particular emphasis on applications to computational fluid dynamics. Another objective is to summarize some of the most important developments in spectral methods in the last two years. The fundamentals of spectral methods for simple problems will be covered in depth, and the essential elements of several fluid dynamical applications will be sketched.

    An adaptive Cartesian embedded boundary approach for fluid simulations of two- and three-dimensional low temperature plasma filaments in complex geometries

    We review a scalable two- and three-dimensional computer code for low-temperature plasma simulations in multi-material complex geometries. Our approach is based on embedded boundary (EB) finite volume discretizations of the minimal fluid-plasma model on adaptive Cartesian grids, extended to also account for charging of insulating surfaces. We discuss the spatial and temporal discretization methods, and show that the resulting overall method is second order convergent, monotone, and conservative (for smooth solutions). Weak scalability with parallel efficiencies over 70% is demonstrated up to 8192 cores and more than one billion cells. We then demonstrate the use of adaptive mesh refinement in multiple two- and three-dimensional simulation examples at modest core counts. The examples include two-dimensional simulations of surface streamers along insulators with surface roughness; fully three-dimensional simulations of filaments in experimentally realizable pin-plane geometries; and three-dimensional simulations of positive plasma discharges in multi-material complex geometries. The largest computational example uses up to 800 million mesh cells with billions of unknowns on 4096 computing cores. Our use of computer-aided design (CAD) and constructive solid geometry (CSG), combined with capabilities for parallel computing, offers possibilities for performing three-dimensional transient plasma-fluid simulations, also in multi-material complex geometries at moderate pressures and comparatively large scale. Comment: 40 pages, 21 figures

    A high-order semi-explicit discontinuous Galerkin solver for 3D incompressible flow with application to DNS and LES of turbulent channel flow

    We present an efficient discontinuous Galerkin scheme for simulation of the incompressible Navier-Stokes equations including laminar and turbulent flow. We consider a semi-explicit high-order velocity-correction method for time integration as well as nodal equal-order discretizations for velocity and pressure. The non-linear convective term is treated explicitly while a linear system is solved for the pressure Poisson equation and the viscous term. The key feature of our solver is a consistent penalty term reducing the local divergence error in order to overcome recently reported instabilities in spatially under-resolved high-Reynolds-number flows as well as small time steps. This penalty method is similar to the grad-div stabilization widely used in continuous finite elements. We further review and compare our method to several other techniques recently proposed in the literature to stabilize the method for such flow configurations. The solver is specifically designed for large-scale computations through matrix-free linear solvers including efficient preconditioning strategies and tensor-product elements, which have allowed us to scale this code up to 34.4 billion degrees of freedom and 147,456 CPU cores. We validate our code and demonstrate optimal convergence rates with laminar flows present in a vortex problem and flow past a cylinder, and show applicability of our solver to direct numerical simulation as well as implicit large-eddy simulation of turbulent channel flow at $Re_{\tau}=180$ as well as $590$. Comment: 28 pages, in preparation for submission to Journal of Computational Physics