10 research outputs found
Chaotic multigrid methods for the solution of elliptic equations
Supercomputer power has been doubling approximately every 14 months for several decades, increasing the capabilities of scientific modelling at a similar rate. However, to utilize these machines effectively for applications such as computational fluid dynamics, improvements to strong scalability are required. Here, the particular focus is on semi-implicit, viscous-flow CFD, where the largest bottleneck to strong scalability is the parallel solution of the linear pressure-correction equation, an elliptic Poisson equation. State-of-the-art linear solvers, such as Krylov subspace or multigrid methods, provide excellent numerical performance for elliptic equations, but do not scale efficiently due to frequent synchronization between processes. Complete desynchronization is possible for basic, Jacobi-like solvers using the theory of "chaotic relaxations". These non-deterministic, chaotic solvers scale superbly, as demonstrated herein, but lack the numerical performance to converge elliptic equations, even with the relatively lax convergence requirements of the example CFD application. However, these chaotic principles can also be applied to multigrid solvers. In this paper, a "chaotic-cycle" algebraic multigrid method is described and implemented as an open-source library. It is tested on a model Poisson equation, and also within the context of CFD. Two CFD test cases are used: the canonical lid-driven cavity flow and the flow simulation of a ship (KVLCC2). The chaotic-cycle multigrid shows good scalability and numerical performance compared to classical V-, W- and F-cycles. On 2048 cores the chaotic-cycle multigrid solver performs up to faster than Flexible-GMRES and faster than classical V-cycle multigrid. Further improvements to chaotic-cycle multigrid can be made, relating to coarse-grid communications and desynchronized residual computations. It is expected that the chaotic-cycle multigrid could be applied to other scientific fields, wherever a scalable elliptic-equation solver is required.
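The idea of a chaotic relaxation mentioned in the abstract above can be illustrated with a small serial simulation. This is only a sketch under stated assumptions, not the open-source library the abstract describes: the function names `chaotic_relaxation` and `residual_norm`, the 1D model matrix tridiag(-1, 2, -1), and the bounded-delay model (`max_delay`) are all choices made here for illustration. Each Jacobi-like update reads its neighbour values from a randomly chosen recent iterate, which mimics processes that do not wait for each other.

```python
import random


def chaotic_relaxation(b, sweeps, max_delay=3, seed=0):
    """Approximately solve tridiag(-1, 2, -1) x = b with stale-data Jacobi updates."""
    rng = random.Random(seed)
    n = len(b)
    history = [[0.0] * n]  # recent iterates; history[-1] is the newest
    for _ in range(sweeps):
        new = [0.0] * n
        for i in range(n):
            # Each neighbour value is read from a randomly chosen recent
            # iterate, mimicking a process that has not yet received the
            # latest data from its neighbour (bounded delay).
            left = rng.choice(history)[i - 1] if i > 0 else 0.0
            right = rng.choice(history)[i + 1] if i < n - 1 else 0.0
            new[i] = 0.5 * (b[i] + left + right)
        history.append(new)
        if len(history) > max_delay:
            history.pop(0)  # forget iterates older than the delay bound
    return history[-1]


def residual_norm(x, b):
    """Max-norm of b - A x for A = tridiag(-1, 2, -1)."""
    n = len(x)
    worst = 0.0
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        worst = max(worst, abs(b[i] - (2.0 * x[i] - left - right)))
    return worst


b = [1.0] * 8
x = chaotic_relaxation(b, sweeps=1000)
print(residual_norm(x, b))
```

The iteration still converges despite the stale reads, but only at a Jacobi-like rate, which is consistent with the abstract's remark that such solvers scale well yet lack numerical performance on their own.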
Nonlinear FETI-DP and BDDC Methods
In the simulation of deformation processes in materials science, the microscopic material structure often has to be taken into account, as in the simulation of modern high-strength steels. A straightforward finite element discretization of the complete deformed body that resolves the microscopic structure leads to very large nonlinear problems, and a solution is out of reach even on modern supercomputers. In homogenization approaches, such as the computational scale-bridging approach FE2, the macroscopic scale of the deformed object is decoupled from the microscopic scale of the material structure. These approaches only consider the microstructure in a localized fashion on independent and parallel representative volume elements (RVEs). This introduces massive parallelism on the macroscopic level and is thus ideal for modern computer architectures with large numbers of parallel computational cores.
Nevertheless, the discretization of an RVE can still result in large nonlinear problems, and thus highly scalable parallel solvers are necessary. In this context, nonlinear FETI-DP (Finite Element Tearing and Interconnecting - Dual-Primal) and BDDC (Balancing Domain Decomposition by Constraints) domain decomposition methods are discussed in this thesis, which are parallel solution methods for nonlinear problems arising from a finite element discretization. These approaches can be viewed as strategies to further localize the computational work and to extend the parallel scalability of classical FETI-DP and BDDC methods towards extreme-scale supercomputers. Variants providing an inexact solution of the FETI-DP coarse problem are also considered in this thesis, combining two successful paradigms, i.e., nonlinear domain decomposition and AMG (Algebraic Multigrid). An efficient implementation of the resulting inexact reduced Nonlinear-FETI-DP-1 method is presented and scalability beyond 200,000 computational cores is shown.
Finally, a highly scalable FE2 implementation using recent inexact reduced FETI-DP methods to solve the RVE problems on the microscopic level is presented and scalability on all 458,752 cores of the JUQUEEN BlueGene/Q system at Forschungszentrum Jülich is demonstrated.
Seeking Space Aliens and the Strong Approximation Property: a (Disjoint) Study in Dust Plumes on Planetary Satellites and Nonsymmetric Algebraic Multigrid
PART I: One of the most fascinating questions to humans has long been whether life exists outside of our planet. To our knowledge, water is a fundamental building block of life, which makes liquid water on other bodies in the universe a topic of great interest. In fact, there are large bodies of water right here in our solar system, underneath the icy crust of moons around Saturn and Jupiter. The NASA-ESA Cassini Mission spent two decades studying the Saturnian system. One of the many exciting discoveries was a "plume" on the south pole of Enceladus, emitting hundreds of kg/s of water vapor and frozen water-ice particles from Enceladus' subsurface ocean. It has since been determined that Enceladus likely has a global liquid water ocean separating its rocky core from icy surface, with conditions that are relatively favorable to support life. The plume is of particular interest because it gives direct access to ocean particles from space, by flying through the plume. Recently, evidence has been found for similar geological activity occurring on Jupiter's moon Europa, long considered one of the most likely candidate bodies to support life in our solar system. Here, a model for plume-particle dynamics is developed based on studies of the Enceladus plume and data from the Cassini Cosmic Dust Analyzer. A C++, OpenMP/MPI parallel software package is then built to run large-scale simulations of dust plumes on planetary satellites. In the case of Enceladus, data from simulations and the Cassini mission provide insight into the structure of emissions on the surface, the total mass production of the plume, and the distribution of particles being emitted. Each of these is fundamental to understanding the plume and, for Europa and Enceladus, simulation data provide important results for the planning of future missions to these icy moons. In particular, this work has contributed to the Europa Clipper mission and proposed Enceladus Life Finder.
PART II: Solving large, sparse linear systems arises often in the modeling of biological and physical phenomena, data analysis through graphs and networks, and other scientific applications. This work focuses primarily on linear systems resulting from the discretization of partial differential equations (PDEs). Because solving linear systems is the bottleneck of many large simulation codes, there is a rich field of research in developing "fast" solvers, with the ultimate goal being a method that solves an n × n linear system in O(n) operations. One of the most effective classes of solvers is algebraic multigrid (AMG), which is a multilevel iterative method based on projecting the problem into progressively smaller spaces, and scales like O(n) or O(n log n) for certain classes of problems. The field of AMG is well-developed for symmetric positive definite matrices, and is typically most effective on linear systems resulting from the discretization of scalar elliptic PDEs, such as the heat equation. Systems of PDEs can add additional difficulties, but the underlying linear algebraic theory is consistent and, in many cases, an elliptic system of PDEs can be handled well by AMG with appropriate modifications of the solver. Solving general, nonsymmetric linear systems remains the wild west of AMG (and other fast solvers), lacking significant results in convergence theory as well as robust methods. Here, we develop new theoretical motivation and practical variations of AMG to solve nonsymmetric linear systems, often resulting from the discretization of hyperbolic PDEs. In particular, multilevel convergence of AMG for nonsymmetric systems is proven for the first time.
A new nonsymmetric AMG solver is also developed based on an approximate ideal restriction, referred to as AIR, which is able to solve advection-dominated, hyperbolic-type problems that are outside the scope of existing AMG solvers and other fast iterative methods. AIR demonstrate
Algebraic analysis of aggregation-based multigrid
A convergence analysis of two-grid methods based on coarsening by (unsmoothed) aggregation is presented. For diagonally dominant symmetric (M-)matrices, it is shown that the analysis can be conducted locally; that is, the convergence factor can be bounded above by computing separately for each aggregate a parameter, which in some sense measures its quality. The procedure is purely algebraic and can be used to control a posteriori the quality of automatic coarsening algorithms. Assuming the aggregation pattern is sufficiently regular, it is further shown that the resulting bound is asymptotically sharp for a large class of elliptic boundary value problems, including problems with variable and discontinuous coefficients. In particular, the analysis of typical examples shows that the convergence rate is insensitive to discontinuities under some reasonable assumptions on the aggregation scheme
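A two-grid method with coarsening by unsmoothed aggregation, as analysed in the abstract above, can be sketched as follows. This is a minimal illustration under assumptions made here, not the authors' code: the helper names (`two_grid`, `solve_dense`, `poisson_1d`), the damped-Jacobi smoother with `omega = 2/3`, and the 1D Poisson test matrix with pairwise aggregates are all hypothetical choices. The prolongation is piecewise constant (each fine unknown takes the value of its aggregate), and the coarse matrix is the Galerkin product P^T A P.

```python
def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def solve_dense(A, b):
    """Gaussian elimination with partial pivoting, for the small coarse system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(M[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x


def two_grid(A, b, agg, iters, omega=2.0 / 3.0):
    """agg[i] is the index of the aggregate that fine unknown i belongs to."""
    n, nc = len(b), max(agg) + 1
    # Galerkin coarse matrix Ac = P^T A P with P[i][agg[i]] = 1 (unsmoothed).
    Ac = [[0.0] * nc for _ in range(nc)]
    for i in range(n):
        for j in range(n):
            Ac[agg[i]][agg[j]] += A[i][j]

    def jacobi(x):
        r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
        return [x[i] + omega * r[i] / A[i][i] for i in range(n)]

    x = [0.0] * n
    for _ in range(iters):
        x = jacobi(x)  # pre-smoothing
        r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
        rc = [0.0] * nc
        for i in range(n):
            rc[agg[i]] += r[i]  # restriction: sum the residual per aggregate
        ec = solve_dense(Ac, rc)  # exact coarse-grid solve
        x = [x[i] + ec[agg[i]] for i in range(n)]  # piecewise-constant correction
        x = jacobi(x)  # post-smoothing
    return x


def poisson_1d(n):
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]


n = 16
A = poisson_1d(n)
x_true = [float((7 * i) % 5 - 2) for i in range(n)]
b = matvec(A, x_true)
x = two_grid(A, b, agg=[i // 2 for i in range(n)], iters=30)
print(max(abs(u - v) for u, v in zip(x, x_true)))
```

Changing the `agg` list is the only step needed to try a different aggregation pattern, which is the quantity whose quality the abstract's local analysis is meant to assess.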
Analysis of an aggregation-based algebraic two-grid method for a rotated anisotropic diffusion problem
A two-grid convergence analysis based on the paper [Algebraic analysis of aggregation-based multigrid, by A. Napov and Y. Notay, Numer. Lin. Alg. Appl. 18 (2011), pp. 539-564] is derived for various aggregation schemes applied to a finite element discretization of a rotated anisotropic diffusion equation. As expected, it is shown that the best aggregation scheme is one in which aggregates are aligned with the anisotropy. In practice, however, this is not what automatic aggregation procedures do. We suggest approaches for determining appropriate aggregates based on eigenvectors associated with small eigenvalues of a block splitting matrix or based on minimizing a quantity related to the spectral radius of the iteration matrix.
Analysis of an Aggregation-based Algebraic Multigrid Method and its Parallelization
Thesis (Ph.D.)--University of Washington, 2014
The interests of this thesis are twofold. First, a two-grid convergence analysis based on the paper [Algebraic analysis of aggregation-based multigrid, by A. Napov and Y. Notay, Numer. Lin. Alg. Appl. 18 (2011), pp. 539-564] is derived for various aggregation schemes applied to a finite element discretization of a rotated anisotropic diffusion equation. As expected, it is shown that the best aggregation scheme is one in which aggregates are aligned with the anisotropy. In practice, however, this is not what automatic aggregation procedures do. We suggest an approach for determining appropriate aggregates based on eigenvectors associated with small eigenvalues of a block splitting matrix. In the second part of the thesis, several issues regarding the parallel implementation of aggregation-based multigrid methods are discussed. The coarsest-grid solve in multigrid cycles has been a bottleneck that prevents parallel multigrid algorithms from attaining a good speedup. A parallel direct solver for linear systems (MUMPS) is compared with a few steps of the preconditioned conjugate gradient (PCG) method for solving the coarsest-grid system, with tests run on the TACC Lonestar multiprocessor machine. To precondition the conjugate gradient iterations, a parallel sparse approximate inverse (SAI) algorithm is used to construct an approximate inverse of the original matrix, so that the preconditioner solve step, which is inherently sequential, is replaced by matrix-vector multiplications. The linear systems tested arise from the discretization of 2D or 3D partial differential equations and are symmetric positive definite. The results show that using PCG on the coarsest grid attains better speedup and better overall performance than MUMPS when the number of processors is greater than about 100. The effects of different decompositions of the physical domain (rows/slabs versus blocks/pencils) on the scaling and efficiency of aggregation-based algebraic multigrid are also studied; the blocks/pencils decomposition reduces the amount of communication and hence performs better.
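The point about replacing the preconditioner solve with matrix-vector multiplications can be sketched in a few lines. The example below is an illustration under assumptions made here, not the thesis implementation: it uses the simplest possible sparse approximate inverse, a diagonal matrix M chosen to minimize the Frobenius norm of I - AM, whereas the thesis's parallel SAI algorithm presumably uses a richer sparsity pattern. The function names `diagonal_sai` and `pcg` and the 1D Poisson test matrix are hypothetical.

```python
import math


def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def diagonal_sai(A):
    """Diagonal M minimizing ||I - A M||_F, i.e. m_j = a_jj / ||A e_j||^2."""
    n = len(A)
    return [A[j][j] / sum(A[i][j] ** 2 for i in range(n)) for j in range(n)]


def pcg(A, b, m_diag, tol=1e-10, max_iter=200):
    """Preconditioned CG; the preconditioner is applied as a product z = M r."""
    n = len(b)
    x = [0.0] * n
    r = b[:]
    z = [m * ri for m, ri in zip(m_diag, r)]  # a multiplication, not a solve
    p = z[:]
    rz = dot(r, z)
    for k in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pj for xi, pj in zip(x, p)]
        r = [ri - alpha * aj for ri, aj in zip(r, Ap)]
        if math.sqrt(dot(r, r)) < tol:
            return x, k
        z = [m * ri for m, ri in zip(m_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pj for zi, pj in zip(z, p)]
        rz = rz_new
    return x, max_iter


n = 16
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
      for j in range(n)] for i in range(n)]
b = [1.0] * n
x, iterations = pcg(A, b, diagonal_sai(A))
print(iterations)
```

Because applying M is an elementwise (or, more generally, sparse matrix-vector) product, it parallelizes in the same way as the products with A, unlike the triangular solves used by factorization-based preconditioners.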