5 research outputs found
Efficient distributed matrix-free multigrid methods on locally refined meshes for FEM computations
This work studies three multigrid variants for matrix-free finite-element
computations on locally refined meshes: geometric local smoothing, geometric
global coarsening, and polynomial global coarsening. We have integrated the
algorithms into the same framework-the open-source finite-element library
deal.II-, which allows us to make fair comparisons regarding their
implementation complexity, computational efficiency, and parallel scalability
as well as to compare the measurements with theoretically derived performance
models. Serial simulations and parallel weak and strong scaling on up to
147,456 CPU cores on 3,072 compute nodes are presented. The results obtained
indicate that global coarsening algorithms show a better parallel behavior for
comparable smoothers due to the better load balance particularly on the
expensive fine levels. In the serial case, the costs of applying hanging-node
constraints might be significant, leading to advantages of local smoothing,
even though the number of solver iterations needed is slightly higher.Comment: 34 pages, 17 figure
Multigrid – adaptive local refinement solver for incompressible flows
A non-linear multigrid solver for incompressible Navier-Stokes equations, exploiting finite volume discretization of the equations, is extended by adaptive local refinement. The multigrid is the outer iterative cycle, while the SIMPLE algorithm is used as a smoothing procedure. Error indicators are used to define the refinement subdomain. A special implementation approach is used, which allows to perform unstructured local refinement in conjunction with the finite volume discretization. The multigrid - adaptive local refinement algorithm is tested on 2D Poisson equation and further is applied to a lid-driven flows in a cavity (2D and 3D case), comparing the results with bench-mark data. The software design principles of the solver are also discussed
Multigrid – adaptive local refinement solver for incompressible flows
A non-linear multigrid solver for incompressible Navier-Stokes equations, exploiting finite volume discretization of the equations, is extended by adaptive local refinement. The multigrid is the outer iterative cycle, while the SIMPLE algorithm is used as a smoothing procedure. Error indicators are used to define the refinement subdomain. A special implementation approach is used, which allows to perform unstructured local refinement in conjunction with the finite volume discretization. The multigrid - adaptive local refinement algorithm is tested on 2D Poisson equation and further is applied to a lid-driven flows in a cavity (2D and 3D case), comparing the results with bench-mark data. The software design principles of the solver are also discussed
Matrix-free finite-element computations at extreme scale and for challenging applications
For numerical computations based on finite element methods (FEM), it is common practice to assemble the system matrix related to the discretized system and to pass this matrix to an iterative solver. However, the assembly step can be costly and the matrix might become locally dense, e.g., in the context of high-order, high-dimensional, or strongly coupled multicomponent FEM, leading to high costs when applying the matrix due to limited bandwidth on modern CPU- and GPU-based hardware. Matrix-free algorithms are a means of accelerating FEM computations on HPC systems, by applying the effect of the system matrix without assembling it. Despite convincing arguments for
matrix-free computations as a means of improving performance, their usage still tends to be an exception at the time of writing of this thesis, not least because they have not yet proven their applicability in all areas of computational science, e.g., solid mechanics.
In this thesis, we further develop a state-of-the-art matrix-free framework for high-order FEM computations with focus on the preconditioning and adopt it in novel application fields. In the context of high-order FEM, we develop means of improving cache efficiency by interleaving cell loops with vector updates, which we use to increase the throughput of preconditioned conjugate gradient methods and of block smoothers based on additive Schwarz methods; we also propose an algorithm for the fast application of hanging-node constraints in 3D for up to 137 refinement configurations. We develop efficient geometric and polynomial multigrid solvers with optimized transfer operators, whose performance is experimentally investigated in detail in the context of locally refined meshes, indicating the superiority of global-coarsening algorithms. We apply the developed solvers in the context of novel stage-parallel implicit Runge–Kutta methods and demonstrate the benefit of stage–parallel solvers in decreasing the time to solution at the scaling limit. Novel challenging application fields of matrix-free computations include high-dimensional computational plasma physics, solid-state-sintering simulations with a high and dynamically changing number of strongly coupled components, and coupled multiphysics problems with evaluation and integration at arbitrary points. In the context of these fields, we detail computational challenges, propose modified versions of the standard matrix-free algorithms for high-performance
computing, and discuss preconditioning-related topics.
The efficiency of the derived algorithms on the node level and at extreme scales is demonstrated experimentally on SuperMUC-NG, one of Germany’s leading supercomputers, with up to 150k processes and by solving systems of up to 5 × 1012 unknowns. Such problem sizes would not be conceivable for equivalent matrix-based algorithms. The major achievements of this thesis allow to run larger simulations faster and more efficiently, enabling progress and new possibilities for a range of application fields in computational science