68 research outputs found

    The Dune framework: Basic concepts and recent developments

    Get PDF
    This paper presents the basic concepts and the module structure of the Distributed and Unified Numerics Environment and reflects on recent developments and general changes that happened since the release of the first Dune version in 2007 and the main papers describing that state Bastian etal. (2008a, 2008b). This discussion is accompanied with a description of various advanced features, such as coupling of domains and cut cells, grid modifications such as adaptation and moving domains, high order discretizations and node level performance, non-smooth multigrid methods, and multiscale methods. A brief discussion on current and future development directions of the framework concludes the paper

    Fast Matrix-Free Evaluation of Discontinuous Galerkin Finite Element Operators

    No full text

    AIMES: advanced computation and I/O methods for earth-system simulations

    Get PDF
    Dealing with extreme scale Earth-system models is challenging from the computer science perspective, as the required computing power and storage capacity are steadily increasing. Scientists perform runs with growing resolution or aggregate results from many similar smaller-scale runs with slightly different initial conditions (the so-called ensemble runs). In the fifth Coupled Model Intercomparison Project (CMIP5), the produced datasets require more than three Petabytes of storage and the compute and storage requirements are increasing significantly for CMIP6. Climate scientists across the globe are developing next-generation models based on improved numerical formulation leading to grids that are discretized in alternative forms such as an icosahedral (geodesic) grid. The developers of these models face similar problems in scaling, maintaining and optimizing code. Performance portability and the maintainability of code are key concerns of scientists as, compared to industry projects, model code is continuously revised and extended to incorporate further levels of detail. This leads to a rapidly growing code base that is rarely refactored. However, code modernization is important to maintain productivity of the scientist working with the code and for utilizing performance provided by modern and future architectures. The need for performance optimization is motivated by the evolution of the parallel architecture landscape from homogeneous flat machines to heterogeneous combinations of processors with deep memory hierarchy. Notably, the rise of many-core, throughput-oriented accelerators, such as GPUs, requires non-trivial code changes at minimum and, even worse, may necessitate a substantial rewrite of the existing codebase. At the same time, the code complexity increases the difficulty for computer scientists and vendors to understand and optimize the code for a given system. Storing the products of climate predictions requires a large storage and archival system which is expensive. Often, scientists restrict the number of scientific variables and write interval to keep the costs balanced. Compression algorithms can reduce the costs significantly but can also increase the scientific yield of simulation runs. In the AIMES project, we addressed the key issues of programmability, computational efficiency and I/O limitations that are common in next-generation icosahedral earth-system models. The project focused on the separation of concerns between domain scientist, computational scientists, and computer scientists

    Matrix-free multigrid block-preconditioners for higher order Discontinuous Galerkin discretisations

    Get PDF
    Efficient and suitably preconditioned iterative solvers for elliptic partial differential equations (PDEs) of the convection-diffusion type are used in all fields of science and engineering. To achieve optimal performance, solvers have to exhibit high arithmetic intensity and need to exploit every form of parallelism available in modern manycore CPUs. The computationally most expensive components of the solver are the repeated applications of the linear operator and the preconditioner. For discretisations based on higher-order Discontinuous Galerkin methods, sum-factorisation results in a dramatic reduction of the computational complexity of the operator application while, at the same time, the matrix-free implementation can run at a significant fraction of the theoretical peak floating point performance. Multigrid methods for high order methods often rely on block-smoothers to reduce high-frequency error components within one grid cell. Traditionally, this requires the assembly and expensive dense matrix solve in each grid cell, which counteracts any improvements achieved in the fast matrix-free operator application. To overcome this issue, we present a new matrix-free implementation of block-smoothers. Inverting the block matrices iteratively avoids storage and factorisation of the matrix and makes it is possible to harness the full power of the CPU. We implemented a hybrid multigrid algorithm with matrix-free block-smoothers in the high order DG space combined with a low order coarse grid correction using algebraic multigrid where only low order components are explicitly assembled. The effectiveness of this approach is demonstrated by solving a set of representative elliptic PDEs of increasing complexity, including a convection dominated problem and the stationary SPE10 benchmark.Comment: 28 pages, 10 figures, 10 tables; accepted for publication in Journal of Computational Physic

    Composable code generation for high order, compatible finite element methods

    Get PDF
    It has been widely recognised in the HPC communities across the world, that exploiting modern computer architectures, including exascale machines, to a full extent requires software commu- nities to adapt their algorithms. Computational methods with a high ratio of floating point op- erations to bandwidth are favorable. For solving partial differential equations, which can model many physical problems, high order finite element methods can calculate approximations with a high efficiency when a good solver is employed. Matrix-free algorithms solve the corresponding equations with a high arithmetic intensity. Vectorisation speeds up the operations by calculating one instruction on multiple data elements. Another recent development for solving partial differential are compatible (mimetic) finite ele- ment methods. In particular with application to geophysical flows, compatible discretisations ex- hibit desired numerical properties required for accurate approximations. Among others, this has been recognised by the UK Met office and their new dynamical core for weather and climate fore- casting is built on a compatible discretisation. Hybridisation has been proven to be an efficient solver for the corresponding equation systems, because it removes some inter-elemental coupling and localises expensive operations. This thesis combines the recent advances on vectorised, matrix-free, high order finite element methods in the HPC community on the one hand and hybridised, compatible discretisations in the geophysical community on the other. In previous work, a code generation framework has been developed to support the localised linear algebra required for hybridisation. First, the framework is adapted to support vectorisation and further, extended so that the equations can be solved fully matrix-free. Promising performance results are completing the thesis.Open Acces

    New approaches for efficient on-the-fly FE operator assembly in a high-performance mantle convection framework

    Get PDF
    • …
    corecore