264 research outputs found

    The deal.II Library, Version 9.1

    This paper provides an overview of the new features of the finite element library deal.II, version 9.1.

    A Parallel Geometric Multigrid Method for Adaptive Finite Elements

    Applications in a variety of scientific disciplines use systems of Partial Differential Equations (PDEs) to model physical phenomena. Numerical solutions to these models are often found using the Finite Element Method (FEM), where the problem is discretized and the solution of a large linear system, containing millions or even billions of unknowns, is required. Often, the domain of these solves contains localized features that require very high resolution of the underlying finite element mesh to be resolved accurately, while a mesh with uniform resolution would require far too much computational time and memory to be feasible on a modern machine. Therefore, techniques like adaptive mesh refinement, where the resolution of the mesh is increased only where necessary, must be used. Even with adaptive mesh refinement, these systems can still contain far more than a million unknowns (large mantle convection applications like the ones in [90] show simulations with over 600 billion unknowns), and attempting to solve them on a single processing unit is infeasible because of the computational time and memory required. For this reason, any application code aimed at solving large problems must be built on a parallel framework, allowing the concurrent use of multiple processing units to solve a single problem, and the code must exhibit efficient scaling to large numbers of processing units. Multigrid methods are currently the only known optimal solvers for linear systems arising from discretizations of elliptic boundary value problems. These methods can be represented as an iterative scheme with contraction number less than one, independent of the resolution of the discretization [24, 54, 25, 103], and with optimal complexity in the number of unknowns in the system [29]. Geometric multigrid (GMG) methods, where the hierarchy of spaces is defined by linear systems from finite element discretizations on meshes of decreasing resolution, have been shown to be robust for many different problem formulations, giving mesh-independent convergence for highly adaptive meshes [26, 61, 83, 18], but these methods require specific implementations for each type of equation, boundary condition, and mesh required by the specific application. Their implementation in a massively parallel environment is not obvious, and research into this topic is far from exhaustive.
    We present an implementation of a massively parallel, adaptive geometric multigrid (GMG) method in the open-source finite element library deal.II [5], and perform extensive tests showing the scaling of the V-cycle application on systems with up to 137 billion unknowns run on up to 65,536 processors, demonstrating the low communication overhead of the proposed algorithms. We then show the flexibility of the GMG method by applying it to four different PDE systems: the Poisson equation, linear elasticity, advection-diffusion, and the Stokes equations. For the Stokes equations, we implement a fully matrix-free, adaptive, GMG-based solver in the mantle convection code ASPECT [13] and compare it to the matrix-based method currently used. We show improvements in robustness, parallel scaling, and memory consumption for simulations with up to 27 billion unknowns and 114,688 processors.
    Finally, we test the performance of IDR(s) methods compared to the FGMRES method currently used in ASPECT, showing the effects of the flexible preconditioning used for the Stokes solves in ASPECT, demonstrating the possible reduction in memory consumption with IDR(s), and highlighting the potential for solving large-scale problems. Parts of the work in this thesis have been submitted to peer-reviewed journals in the form of two publications ([36] and [34]), and the implementations discussed have been integrated into two open-source codes, deal.II and ASPECT. For the contributions to deal.II, including a full-length tutorial program, Step-63 [35], the author is listed as a contributing author of the newest deal.II release (see [5]). The implementation in ASPECT is based on work by the author and Timo Heister. The goal of this work is to enable the community of geoscientists using ASPECT to solve larger problems than currently possible. Over the course of this thesis, the author was partially funded by NSF Award OAC-1835452 and by the Computational Infrastructure in Geodynamics initiative (CIG), through the NSF under Awards EAR-0949446 and EAR-1550901 and The University of California, Davis.
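
    The V-cycle whose scaling is studied above has a simple recursive structure: pre-smooth, restrict the residual, recurse on the coarser level, prolongate the correction, and post-smooth. The following is a minimal, library-agnostic C++ sketch of that recursion; the Level struct and names such as smooth, restrict_residual, and prolongate_add are illustrative placeholders, not the deal.II interfaces used in the thesis.

        #include <cstddef>
        #include <functional>
        #include <vector>

        using Vec = std::vector<double>;

        // One multigrid level: operator application, smoother, and grid transfer.
        // All member names are illustrative placeholders, not deal.II interfaces.
        struct Level
        {
          std::function<Vec(const Vec &)>         apply_A;            // y = A_l x
          std::function<void(Vec &, const Vec &)> smooth;             // a few smoother sweeps on A_l x = b
          std::function<Vec(const Vec &)>         restrict_residual;  // transfer residual to level l-1
          std::function<void(Vec &, const Vec &)> prolongate_add;     // add interpolated coarse correction
          std::function<Vec(const Vec &)>         coarse_solve;       // used on level 0 only
        };

        // Recursive V-cycle: pre-smooth, coarse-grid correction, post-smooth.
        void v_cycle(const std::vector<Level> &levels, unsigned int l, Vec &x, const Vec &b)
        {
          const Level &L = levels[l];
          if (l == 0)                                      // coarsest level: solve (nearly) exactly
            {
              x = L.coarse_solve(b);
              return;
            }

          L.smooth(x, b);                                  // pre-smoothing
          const Vec Ax = L.apply_A(x);
          Vec r(b.size());
          for (std::size_t i = 0; i < b.size(); ++i)       // residual r = b - A_l x
            r[i] = b[i] - Ax[i];

          const Vec rc = L.restrict_residual(r);           // restrict residual to level l-1
          Vec ec(rc.size(), 0.0);
          v_cycle(levels, l - 1, ec, rc);                  // approximately solve A_{l-1} e_c = r_c
          L.prolongate_add(x, ec);                         // x += P e_c

          L.smooth(x, b);                                  // post-smoothing
        }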

    End-to-end GPU acceleration of low-order-refined preconditioning for high-order finite element discretizations

    In this paper, we present algorithms and implementations for the end-to-end GPU acceleration of matrix-free low-order-refined preconditioning of high-order finite element problems. The methods described here allow for the construction of effective preconditioners for high-order problems with optimal memory usage and computational complexity. The preconditioners are based on the construction of a spectrally equivalent low-order discretization on a refined mesh, which is then amenable to, for example, algebraic multigrid preconditioning. The constants of equivalence are independent of mesh size and polynomial degree. For vector finite element problems in H(curl) and H(div) (e.g., for electromagnetic or radiation diffusion problems), a specially constructed interpolation-histopolation basis is used to ensure fast convergence. Detailed performance studies are carried out to analyze the efficiency of the GPU algorithms. The kernel throughput of each of the main algorithmic components is measured, and the strong and weak parallel scalability of the methods is demonstrated. The different relative weighting and significance of the algorithmic components on GPUs and CPUs are discussed. Results on problems involving adaptively refined nonconforming meshes are shown, and the use of the preconditioners on a large-scale magnetic diffusion problem using all spaces of the finite element de Rham complex is illustrated. Comment: 23 pages, 13 figures
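
    The key structural point of the low-order-refined approach is that the Krylov iteration only ever applies the matrix-free high-order operator, while the preconditioner is built once from the spectrally equivalent low-order matrix (for example as an algebraic multigrid hierarchy). The C++ sketch below is a generic preconditioned conjugate gradient loop that shows where such a preconditioner enters; it is not the paper's GPU implementation, and both the operator A and the preconditioner action M_inv are passed in as opaque callables.

        #include <cmath>
        #include <cstddef>
        #include <functional>
        #include <vector>

        using Vec = std::vector<double>;
        using Op  = std::function<Vec(const Vec &)>;

        static double dot(const Vec &a, const Vec &b)
        {
          double s = 0.0;
          for (std::size_t i = 0; i < a.size(); ++i)
            s += a[i] * b[i];
          return s;
        }

        // Preconditioned conjugate gradients. In the low-order-refined setting,
        // `A` would be the matrix-free high-order operator and `M_inv` the action
        // of a multigrid cycle built on the spectrally equivalent low-order
        // matrix; here both are generic so the sketch stays self-contained.
        Vec pcg(const Op &A, const Op &M_inv, const Vec &b, double tol, unsigned int max_it)
        {
          Vec x(b.size(), 0.0);
          Vec r = b;                                   // r = b - A*0
          Vec z = M_inv(r);
          Vec p = z;
          double rz = dot(r, z);

          for (unsigned int it = 0; it < max_it && std::sqrt(dot(r, r)) > tol; ++it)
            {
              const Vec    Ap    = A(p);
              const double alpha = rz / dot(p, Ap);
              for (std::size_t i = 0; i < x.size(); ++i)
                {
                  x[i] += alpha * p[i];
                  r[i] -= alpha * Ap[i];
                }
              z = M_inv(r);
              const double rz_new = dot(r, z);
              const double beta   = rz_new / rz;
              rz                  = rz_new;
              for (std::size_t i = 0; i < p.size(); ++i)
                p[i] = z[i] + beta * p[i];
            }
          return x;
        }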

    Adaptive control in rollforward recovery for extreme scale multigrid

    With the increasing number of compute components, failures in future exa-scale computer systems are expected to become more frequent. This motivates the study of novel resilience techniques. Here, we extend a recently proposed algorithm-based recovery method for multigrid iterations by introducing an adaptive control. After a fault, the healthy part of the system continues the iterative solution process, while the solution in the faulty domain is reconstructed by an asynchronous online recovery. The computations in the faulty and healthy subdomains must be coordinated in a sensitive way; in particular, both under- and over-solving must be avoided, since both waste computational resources and therefore increase the overall time-to-solution. To control the local recovery and guarantee an optimal re-coupling, we introduce a stopping criterion based on a mathematical error estimator. It involves hierarchical weighted sums of residuals within the context of uniformly refined meshes and is well suited to parallel high-performance computing. The re-coupling process is steered by local contributions of the error estimator. We propose and compare two criteria which differ in their weights. Failure scenarios when solving up to 6.9·10^11 unknowns on more than 245,766 parallel processes are reported on a state-of-the-art peta-scale supercomputer, demonstrating the robustness of the method.
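
    A minimal C++ sketch of the re-coupling control described above, under the assumption that recovery on the faulty subdomain is iterated until a locally computed, weighted residual estimate drops below a tolerance tied to the estimate reported by the healthy part; the function names are illustrative and not the actual code of the paper.

        #include <functional>

        // The faulty subdomain is solved asynchronously until its local error
        // estimate is comparable to that of the healthy part, at which point the
        // two are re-coupled. Stopping too early (under-solving) or too late
        // (over-solving) both waste compute time. All names are illustrative.
        void recover_faulty_subdomain(
          const std::function<void()>   &local_multigrid_cycle,  // one cycle on the faulty part only
          const std::function<double()> &local_error_estimate,   // weighted residual sums, faulty part
          const double                   healthy_error_estimate, // latest estimate from the healthy part
          const double                   safety_factor)          // re-couple when locally comparable
        {
          while (local_error_estimate() > safety_factor * healthy_error_estimate)
            local_multigrid_cycle();
        }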

    Efficient distributed matrix-free multigrid methods on locally refined meshes for FEM computations

    This work studies three multigrid variants for matrix-free finite-element computations on locally refined meshes: geometric local smoothing, geometric global coarsening, and polynomial global coarsening. We have integrated the algorithms into the same framework, the open-source finite-element library deal.II, which allows us to make fair comparisons regarding their implementation complexity, computational efficiency, and parallel scalability, as well as to compare the measurements with theoretically derived performance models. Serial simulations and parallel weak and strong scaling on up to 147,456 CPU cores on 3,072 compute nodes are presented. The results obtained indicate that global-coarsening algorithms show better parallel behavior for comparable smoothers due to the better load balance, particularly on the expensive fine levels. In the serial case, the costs of applying hanging-node constraints might be significant, leading to advantages of local smoothing, even though the number of solver iterations needed is slightly higher. Comment: 34 pages, 17 figures
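
    For polynomial global coarsening, one of the three variants compared above, the mesh stays fixed and each coarser level uses a lower polynomial degree; geometric local smoothing instead uses the refinement levels of the one adaptively refined mesh, and geometric global coarsening uses a sequence of complete, globally coarser meshes. The small C++ helper below builds one common degree sequence for the polynomial variant (repeated halving, p -> p/2 -> ... -> 1); this is an illustrative choice of coarsening sequence, not necessarily the paper's default.

        #include <vector>

        // Degree sequence for a polynomial-global-coarsening hierarchy obtained by
        // repeatedly halving the polynomial degree until reaching linear elements.
        // Index 0 is the coarsest level, matching the usual multigrid convention.
        std::vector<unsigned int> polynomial_coarsening_sequence(unsigned int p)
        {
          std::vector<unsigned int> degrees;          // filled fine-to-coarse: p, p/2, ..., 1
          for (unsigned int d = p; d >= 1; d /= 2)
            degrees.push_back(d);
          return std::vector<unsigned int>(degrees.rbegin(), degrees.rend());
        }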

    A New Paradigm for Parallel Adaptive Meshing Algorithms
