The Dune framework: Basic concepts and recent developments
This paper presents the basic concepts and the module structure of the Distributed and Unified Numerics Environment (Dune) and reflects on recent developments and general changes that have happened since the release of the first Dune version in 2007 and the main papers describing that state (Bastian et al., 2008a, 2008b). This discussion is accompanied by a description of various advanced features, such as coupling of domains and cut cells, grid modifications such as adaptation and moving domains, high order discretizations and node level performance, non-smooth multigrid methods, and multiscale methods. A brief discussion of current and future development directions of the framework concludes the paper.
AIMES: advanced computation and I/O methods for earth-system simulations
Dealing with extreme scale Earth-system models is challenging from the computer science perspective, as the required computing power and storage capacity are steadily increasing.
Scientists perform runs with growing resolution or aggregate results from many similar smaller-scale runs with slightly different initial conditions (the so-called ensemble runs).
In the fifth Coupled Model Intercomparison Project (CMIP5), the produced datasets require more than three Petabytes of storage and the compute and storage requirements are increasing significantly for CMIP6.
Climate scientists across the globe are developing next-generation models based on improved numerical formulations, leading to grids that are discretized in alternative forms such as an icosahedral (geodesic) grid.
The developers of these models face similar problems in scaling, maintaining and optimizing code.
Performance portability and the maintainability of code are key concerns of scientists as, compared to industry projects, model code is continuously revised and extended to incorporate further levels of detail.
This leads to a rapidly growing code base that is rarely refactored.
However, code modernization is important for maintaining the productivity of the scientists working
with the code and for utilizing the performance provided by modern and future architectures.
The need for performance optimization is motivated by the evolution of the parallel architecture landscape from
homogeneous flat machines to heterogeneous combinations of processors with deep memory hierarchies.
Notably, the rise of many-core, throughput-oriented accelerators, such as GPUs, requires non-trivial code changes at minimum and, even worse, may necessitate a substantial rewrite of the existing codebase.
At the same time, the code complexity increases the difficulty for computer scientists and vendors to understand and optimize the code for a given system.
Storing the products of climate predictions requires a large and expensive storage and archival system.
Often, scientists restrict the number of scientific variables and the write interval to keep the costs
balanced.
Compression algorithms can not only reduce the costs significantly but also increase the scientific yield of simulation runs.
In the AIMES project, we addressed the key issues of programmability, computational efficiency and I/O limitations that are common in next-generation icosahedral earth-system models.
The project focused on the separation of concerns between domain scientists, computational scientists, and computer scientists.
Matrix-free multigrid block-preconditioners for higher order Discontinuous Galerkin discretisations
Efficient and suitably preconditioned iterative solvers for elliptic partial
differential equations (PDEs) of the convection-diffusion type are used in all
fields of science and engineering. To achieve optimal performance, solvers have
to exhibit high arithmetic intensity and need to exploit every form of
parallelism available in modern manycore CPUs. The computationally most
expensive components of the solver are the repeated applications of the linear
operator and the preconditioner. For discretisations based on higher-order
Discontinuous Galerkin methods, sum-factorisation results in a dramatic
reduction of the computational complexity of the operator application while, at
the same time, the matrix-free implementation can run at a significant fraction
of the theoretical peak floating point performance. Multigrid methods for high
order methods often rely on block-smoothers to reduce high-frequency error
components within one grid cell. Traditionally, this requires the assembly and
expensive dense matrix solve in each grid cell, which counteracts any
improvements achieved in the fast matrix-free operator application. To overcome
this issue, we present a new matrix-free implementation of block-smoothers.
Inverting the block matrices iteratively avoids storage and factorisation of
the matrix and makes it possible to harness the full power of the CPU. We
implemented a hybrid multigrid algorithm with matrix-free block-smoothers in
the high order DG space combined with a low order coarse grid correction using
algebraic multigrid where only low order components are explicitly assembled.
The effectiveness of this approach is demonstrated by solving a set of
representative elliptic PDEs of increasing complexity, including a
convection-dominated problem and the stationary SPE10 benchmark.
Comment: 28 pages, 10 figures, 10 tables; accepted for publication in Journal
of Computational Physics
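The idea of inverting a block iteratively without storing it can be illustrated with a minimal sketch. The code below is not the implementation from the paper: it assumes a small symmetric positive definite stand-in block (a 1D Laplacian stencil) and uses conjugate gradients, whereas the DG blocks of a convection-diffusion problem are generally non-symmetric and would need a different Krylov method. The names `apply_block` and `cg` are hypothetical.

```python
def apply_block(v):
    """Matrix-free action of a stand-in SPD block (assumption, not a DG operator):
    (A v)_i = 2 v_i - v_{i-1} - v_{i+1}, with zero values outside the block."""
    n = len(v)
    return [
        2.0 * v[i]
        - (v[i - 1] if i > 0 else 0.0)
        - (v[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]


def dot(x, y):
    """Euclidean inner product of two lists."""
    return sum(a * b for a, b in zip(x, y))


def cg(apply_op, b, tol=1e-12, max_iter=100):
    """Solve A x = b with conjugate gradients, using only the action of A.
    No matrix entries are stored and nothing is factorised."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x for the zero initial guess
    p = list(r)          # first search direction
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr <= tol * tol:
            break
        Ap = apply_op(p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


# Solve one local block system; only apply_block is ever called.
block_rhs = [1.0] * 5
block_solution = cg(apply_block, block_rhs)
```

In a block smoother, a solve of this kind would replace the dense factorisation in each grid cell, so the memory footprint per cell is a handful of vectors rather than a full block matrix.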
Composable code generation for high order, compatible finite element methods
It has been widely recognised in HPC communities across the world that exploiting modern
computer architectures, including exascale machines, to their full extent requires software
communities to adapt their algorithms. Computational methods with a high ratio of floating point
operations to bandwidth are favourable. For solving partial differential equations, which can model
many physical problems, high order finite element methods can calculate approximations with
high efficiency when a good solver is employed. Matrix-free algorithms solve the corresponding
equations with a high arithmetic intensity. Vectorisation speeds up the operations by applying
one instruction to multiple data elements.
Another recent development for solving partial differential equations is compatible (mimetic) finite
element methods. In particular for applications to geophysical flows, compatible discretisations
exhibit the numerical properties required for accurate approximations. Among others, this has
been recognised by the UK Met Office, whose new dynamical core for weather and climate
forecasting is built on a compatible discretisation. Hybridisation has been proven to be an efficient
solver approach for the corresponding equation systems, because it removes some inter-elemental coupling
and localises expensive operations.
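How removing inter-element coupling localises the expensive work can be seen in a small algebraic sketch. It is a generic illustration of local elimination (static condensation), not the code generation framework of the thesis, and it assumes a toy system with two elements, 2x2 element matrices and a single Lagrange multiplier; the names `solve2x2` and `hybridised_solve` are hypothetical.

```python
def solve2x2(a, b):
    """Solve the 2x2 system a x = b by Cramer's rule (a is one element matrix)."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [
        (b[0] * a[1][1] - a[0][1] * b[1]) / det,
        (a[0][0] * b[1] - a[1][0] * b[0]) / det,
    ]


def hybridised_solve(A, C, f, g):
    """Solve the saddle-point system
        A_e u_e + C_e lam = f_e   for every element e,
        sum_e C_e^T u_e   = g,
    by eliminating each u_e locally. Because the element unknowns only
    couple through the multiplier, every inverse is a small local solve."""
    schur = 0.0
    rhs = -g
    for Ae, Ce, fe in zip(A, C, f):
        w = solve2x2(Ae, Ce)                 # A_e^{-1} C_e
        z = solve2x2(Ae, fe)                 # A_e^{-1} f_e
        schur += Ce[0] * w[0] + Ce[1] * w[1]
        rhs += Ce[0] * z[0] + Ce[1] * z[1]
    lam = rhs / schur                        # reduced system is 1x1 here
    # Local back-substitution recovers the element unknowns.
    u = [
        solve2x2(Ae, [fe[0] - Ce[0] * lam, fe[1] - Ce[1] * lam])
        for Ae, Ce, fe in zip(A, C, f)
    ]
    return u, lam


# Toy data (assumed values, chosen only so that the matrices are invertible).
A = [[[2.0, 1.0], [1.0, 3.0]], [[4.0, 1.0], [1.0, 2.0]]]
C = [[1.0, -1.0], [-1.0, 1.0]]
f = [[1.0, 2.0], [3.0, 4.0]]
g = 0.0
u, lam = hybridised_solve(A, C, f, g)
```

The global problem shrinks to a system for the multipliers alone, while all work on the element matrices stays local and can be done element by element.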
This thesis combines the recent advances on vectorised, matrix-free, high order finite element
methods in the HPC community on the one hand and hybridised, compatible discretisations in
the geophysical community on the other. In previous work, a code generation framework was
developed to support the localised linear algebra required for hybridisation. Here, the framework
is first adapted to support vectorisation and then extended so that the equations can be solved fully
matrix-free. Promising performance results complete the thesis.