168 research outputs found

    Matrix-free GPU implementation of a preconditioned conjugate gradient solver for anisotropic elliptic PDEs

    Get PDF
    Many problems in geophysical and atmospheric modelling require the fast solution of elliptic partial differential equations (PDEs) in "flat" three dimensional geometries. In particular, an anisotropic elliptic PDE for the pressure correction has to be solved at every time step in the dynamical core of many numerical weather prediction models, and equations of a very similar structure arise in global ocean models, subsurface flow simulations and gas and oil reservoir modelling. The elliptic solve is often the bottleneck of the forecast, and an algorithmically optimal method has to be used and implemented efficiently. Graphics Processing Units have been shown to be highly efficient for a wide range of applications in scientific computing, and recently iterative solvers have been parallelised on these architectures. We describe the GPU implementation and optimisation of a Preconditioned Conjugate Gradient (PCG) algorithm for the solution of a three dimensional anisotropic elliptic PDE for the pressure correction in NWP. Our implementation exploits the strong vertical anisotropy of the elliptic operator in the construction of a suitable preconditioner. As the algorithm is memory bound, performance can be improved significantly by reducing the amount of global memory access. We achieve this by using a matrix-free implementation which does not require explicit storage of the matrix and instead recalculates the local stencil. Global memory access can also be reduced by rewriting the algorithm using loop fusion and we show that this further reduces the runtime on the GPU. We demonstrate the performance of our matrix-free GPU code by comparing it to a sequential CPU implementation and to a matrix-explicit GPU code which uses existing libraries. The absolute performance of the algorithm for different problem sizes is quantified in terms of floating point throughput and global memory bandwidth.Comment: 18 pages, 7 figure

    Numerical solution of 3-D electromagnetic problems in exploration geophysics and its implementation on massively parallel computers

    Get PDF
    The growing significance, technical development and employment of electromagnetic (EM) methods in exploration geophysics have led to the increasing need for reliable and fast techniques of interpretation of 3-D EM data sets acquired in complex geological environments. The first and most important step to creating an inversion method is the development of a solver for the forward problem. In order to create an efficient, reliable and practical 3-D EM inversion, it is necessary to have a 3-D EM modelling code that is highly accurate, robust and very fast. This thesis focuses precisely on this crucial and very demanding step to building a 3-D EM interpretation method. The thesis presents as its main contribution a highly accurate, robust, very fast and extremely scalable numerical method for 3-D EM modelling in geophysics that is based on finite elements (FE) and designed to run on massively parallel computing platforms. Thanks to the fact that the FE approach supports completely unstructured tetrahedral meshes as well as local mesh refinements, the presented solver is able to represent complex geometries of subsurface structures very precisely and thus improve the solution accuracy and avoid misleading artefacts in images. Consequently, it can be successfully used in geological environments of arbitrary geometrical complexities. The parallel implementation of the method, which is based on the domain decomposition and a hybrid MPI-OpenMP scheme, has proved to be highly scalable - the achieved speed-up is close to the linear for more than a thousand processors. Thanks to this, the code is able to deal with extremely large problems, which may have hundreds of millions of degrees of freedom, in a very efficient way. The importance of having this forward-problem solver lies in the fact that it is now possible to create a 3-D EM inversion that can deal with data obtained in extremely complex geological environments in a way that is realistic for practical use in industry. So far, such imaging tool has not been proposed due to a lack of efficient, parallel FE solutions as well as the limitations of efficient solvers based on finite differences. In addition, the thesis discusses physical, mathematical and numerical aspects and challenges of 3-D EM modelling, which have been studied during my research in order to properly design the presented software for EM field simulations on 3-D areas of the Earth. Through this work, a physical problem formulation based on the secondary Coulomb-gauged EM potentials has been validated, proving that it can be successfully used with the standard nodal FE method to give highly accurate numerical solutions. Also, this work has shown that Krylov subspace iterative methods are the best solution for solving linear systems that arise after FE discretisation of the problem under consideration. More precisely, it has been discovered empirically that the best iterative method for this kind of problems is biconjugate gradient stabilised with an elaborate preconditioner. Since most commonly used preconditioners proved to be either unable to improve the convergence of the implemented solvers to the desired extent, or impractical in the parallel context, I have proposed a preconditioning technique for Krylov methods that is based on algebraic multigrid. Tests for various problems with different conductivity structures and characteristics have shown that the new preconditioner greatly improves the convergence of different Krylov subspace methods, which significantly reduces the total execution time of the program and improves the solution quality. Furthermore, the preconditioner is very practical for parallel implementation. Finally, it has been concluded that there are not any restrictions in employing classical parallel programming models, MPI and OpenMP, for parallelisation of the presented FE solver. Moreover, they have proved to be enough to provide an excellent scalability for it

    Solving the Poisson equation on small aspect ratio domains using unstructured meshes

    Full text link
    We discuss the ill conditioning of the matrix for the discretised Poisson equation in the small aspect ratio limit, and motivate this problem in the context of nonhydrostatic ocean modelling. Efficient iterative solvers for the Poisson equation in small aspect ratio domains are crucial for the successful development of nonhydrostatic ocean models on unstructured meshes. We introduce a new multigrid preconditioner for the Poisson problem which can be used with finite element discretisations on general unstructured meshes; this preconditioner is motivated by the fact that the Poisson problem has a condition number which is independent of aspect ratio when Dirichlet boundary conditions are imposed on the top surface of the domain. This leads to the first level in an algebraic multigrid solver (which can be extended by further conventional algebraic multigrid stages), and an additive smoother. We illustrate the method with numerical tests on unstructured meshes, which show that the preconditioner makes a dramatic improvement on a more standard multigrid preconditioner approach, and also show that the additive smoother produces better results than standard SOR smoothing. This new solver method makes it feasible to run nonhydrostatic unstructured mesh ocean models in small aspect ratio domains.Comment: submitted to Ocean Modellin

    Parallel 3-D marine controlled-source electromagnetic modelling using high-order tetrahedral Nédélec elements

    Get PDF
    We present a parallel and high-order Nédélec finite element solution for the marine controlled-source electromagnetic (CSEM) forward problem in 3-D media with isotropic conductivity. Our parallel Python code is implemented on unstructured tetrahedral meshes, which support multiple-scale structures and bathymetry for general marine 3-D CSEM modelling applications. Based on a primary/secondary field approach, we solve the diffusive form of Maxwell’s equations in the low-frequency domain. We investigate the accuracy and performance advantages of our new high-order algorithm against a low-order implementation proposed in our previous work. The numerical precision of our high-order method has been successfully verified by comparisons against previously published results that are relevant in terms of scale and geological properties. A convergence study confirms that high-order polynomials offer a better trade-off between accuracy and computation time. However, the optimum choice of the polynomial order depends on both the input model and the required accuracy as revealed by our tests. Also, we extend our adaptive-meshing strategy to high-order tetrahedral elements. Using adapted meshes to both physical parameters and high-order schemes, we are able to achieve a significant reduction in computational cost without sacrificing accuracy in the modelling. Furthermore, we demonstrate the excellent performance and quasi-linear scaling of our implementation in a state-of-the-art high-performance computing architecture.This project has received funding from the European Union's Horizon 2020 programme under the Marie Sklodowska-Curie grant agreement No. 777778. Furthermore, the research leading to these results has received funding from the European Union's Horizon 2020 programme under the ChEESE Project (https://cheese-coe.eu/ ), grant agreement No. 823844. In addition, the authors would also like to thank the support of the Ministerio de Educación y Ciencia (Spain) under Projects TEC2016-80386-P and TIN2016-80957-P. The authors would like to thank the Editors-in-Chief and to both reviewers, Dr. Martin Cuma and Dr. Raphael Rochlitz, for their valuable comments and suggestions which helped to improve the quality of the manuscript. This work benefited from the valuable suggestions, comments, and proofreading of Dr. Otilio Rojas (BSC). Last but not least, Octavio Castillo-Reyes thanks Natalia Gutierrez (BSC) for her support in CSEM modeling with BSIT.Peer ReviewedPostprint (author's final draft

    The LifeV library: engineering mathematics beyond the proof of concept

    Get PDF
    LifeV is a library for the finite element (FE) solution of partial differential equations in one, two, and three dimensions. It is written in C++ and designed to run on diverse parallel architectures, including cloud and high performance computing facilities. In spite of its academic research nature, meaning a library for the development and testing of new methods, one distinguishing feature of LifeV is its use on real world problems and it is intended to provide a tool for many engineering applications. It has been actually used in computational hemodynamics, including cardiac mechanics and fluid-structure interaction problems, in porous media, ice sheets dynamics for both forward and inverse problems. In this paper we give a short overview of the features of LifeV and its coding paradigms on simple problems. The main focus is on the parallel environment which is mainly driven by domain decomposition methods and based on external libraries such as MPI, the Trilinos project, HDF5 and ParMetis. Dedicated to the memory of Fausto Saleri.Comment: Review of the LifeV Finite Element librar

    Robust and efficient primal-dual Newton-Krylov solvers for viscous-plastic sea-ice models

    Full text link
    We present a Newton-Krylov solver for a viscous-plastic sea-ice model. This constitutive relation is commonly used in climate models to describe the material properties of sea ice. Due to the strong nonlinearity introduced by the material law in the momentum equation, the development of fast, robust and scalable solvers is still a substantial challenge. In this paper, we propose a novel primal-dual Newton linearization for the implicitly-in-time discretized momentum equation. Compared to existing methods, it converges faster and more robustly with respect to mesh refinement, and thus enables numerically converged sea-ice simulations at high resolutions. Combined with an algebraic multigrid-preconditioned Krylov method for the linearized systems, which contain strongly varying coefficients, the resulting solver scales well and can be used in parallel. We present experiments for two challenging test problems and study solver performance for problems with up to 8.4 million spatial unknowns.Comment: 18 pages, 7 figure
    corecore