20 research outputs found

    Two novel aggregation-based algebraic multigrid methods

    Get PDF
    In the last two decades, substantial effort has been devoted to solving large systems of linear equations with algebraic multigrid (AMG) methods. These systems usually arise from discretizing the partial differential equations (PDEs) encountered in engineering problems. The main principle of this methodology is the elimination of the so-called algebraically smooth error after the smoother has been applied. Smoothed aggregation multigrid is a particular class of AMG methods whose coarsening process differs from that of classical AMG; it is also a very popular and effective iterative solver and preconditioner for many problems. In this paper, we present two novel methods, both of which modify the aggregation algorithm, and both of which lead to better performance when applied to several problems, such as the Helmholtz equation.
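
    The snippet below is a minimal Python/SciPy sketch of the general smoothed-aggregation coarsening idea this abstract builds on, not the paper's two new algorithms: unknowns are greedily grouped into aggregates by strength of connection, a piecewise-constant tentative prolongator is built from the aggregates, and one damped-Jacobi step smooths it. The threshold theta and damping omega are illustrative choices, and the greedy grouping is a deliberately simplified stand-in for production aggregation schemes.

        import numpy as np
        import scipy.sparse as sp

        def aggregate(A, theta=0.25):
            """Greedily group each unknown with its strong, unassigned neighbours."""
            A = A.tocsr()
            n = A.shape[0]
            agg = -np.ones(n, dtype=int)       # aggregate id per unknown, -1 = free
            n_agg = 0
            for i in range(n):
                if agg[i] != -1:
                    continue
                row = A.getrow(i)
                off = [(j, abs(v)) for j, v in zip(row.indices, row.data) if j != i]
                agg[i] = n_agg
                if off:
                    m = max(w for _, w in off)
                    for j, w in off:
                        if agg[j] == -1 and w >= theta * m:   # strong connection
                            agg[j] = n_agg
                n_agg += 1
            return agg, n_agg

        def smoothed_prolongator(A, agg, n_agg, omega=2.0 / 3.0):
            """Piecewise-constant tentative prolongator, smoothed by damped Jacobi."""
            n = A.shape[0]
            P_tent = sp.csr_matrix((np.ones(n), (np.arange(n), agg)), shape=(n, n_agg))
            Dinv = sp.diags(1.0 / A.diagonal())
            return P_tent - omega * (Dinv @ (A @ P_tent))

        # usage: one coarsening step for a 1D Poisson model problem
        n = 16
        A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format='csr')
        agg, n_agg = aggregate(A)
        P = smoothed_prolongator(A, agg, n_agg)
        A_c = (P.T @ A @ P).tocsr()            # Galerkin coarse-level operator
        print(A.shape, '->', A_c.shape)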

    Graph coarsening: From scientific computing to machine learning

    Full text link
    The general method of graph coarsening or graph reduction has been a remarkably useful and ubiquitous tool in scientific computing, and it is now starting to have a similar impact in machine learning. The goal of this paper is to take a broad look at coarsening techniques that have been successfully deployed in scientific computing and to see how similar principles are finding their way into more recent applications related to machine learning. In scientific computing, coarsening plays a central role in algebraic multigrid methods as well as in the related class of multilevel incomplete LU factorizations. In machine learning, graph coarsening goes under various names, e.g., graph downsampling or graph reduction. Its goal in most cases is to replace some original graph by one which has fewer nodes, but whose structure and characteristics are similar to those of the original graph. As will be seen, a common strategy in these methods is to rely on spectral properties to define the coarse graph.
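
    As a concrete illustration of the basic mechanics, the Python sketch below implements one coarsening pass by heavy-edge matching, a standard strategy from multilevel graph methods rather than any specific technique from this survey: each vertex is merged with its heaviest unmatched neighbour, and edge weights between the resulting super-vertices are summed.

        import numpy as np
        import scipy.sparse as sp

        def heavy_edge_coarsen(A):
            """One level of coarsening of a weighted adjacency matrix A."""
            A = A.tocsr()
            n = A.shape[0]
            match = -np.ones(n, dtype=int)
            # match each unvisited vertex to its heaviest unmatched neighbour
            for i in range(n):
                if match[i] != -1:
                    continue
                row = A.getrow(i)
                best, best_w = i, 0.0            # fall back to a singleton
                for j, w in zip(row.indices, row.data):
                    if j != i and match[j] == -1 and w > best_w:
                        best, best_w = j, w
                match[i] = i
                match[best] = i                  # merges i with best (or itself)
            # relabel the match roots as consecutive coarse vertex ids
            roots = {r: k for k, r in enumerate(np.unique(match))}
            cmap = np.array([roots[r] for r in match])
            nc = len(roots)
            # coarse adjacency: A_c = C^T A C with C the membership matrix
            C = sp.csr_matrix((np.ones(n), (np.arange(n), cmap)), shape=(n, nc))
            Ac = (C.T @ A @ C).tolil()
            Ac.setdiag(0)                        # drop self-loops from merged edges
            return Ac.tocsr(), cmap

        # usage: coarsen a small path graph with unit edge weights
        A = sp.diags([1.0, 1.0], [-1, 1], shape=(8, 8), format='csr')
        Ac, cmap = heavy_edge_coarsen(A)
        print(cmap, Ac.shape)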

    Chaotic multigrid methods for the solution of elliptic equations

    Get PDF
    Supercomputer power has been doubling approximately every 14 months for several decades, increasing the capabilities of scientific modelling at a similar rate. However, to utilize these machines effectively for applications such as computational fluid dynamics, improvements to strong scalability are required. Here, the particular focus is on semi-implicit, viscous-flow CFD, where the largest bottleneck to strong scalability is the parallel solution of the linear pressure-correction equation, an elliptic Poisson equation. State-of-the-art linear solvers, such as Krylov subspace or multigrid methods, provide excellent numerical performance for elliptic equations but do not scale efficiently, due to frequent synchronization between processes. Complete desynchronization is possible for basic, Jacobi-like solvers using the theory of ‘chaotic relaxations’. These non-deterministic, chaotic solvers scale superbly, as demonstrated herein, but lack the numerical performance needed to converge elliptic equations, even with the relatively lax convergence requirements of the example CFD application. However, these chaotic principles can also be applied to multigrid solvers. In this paper, a ‘chaotic-cycle’ algebraic multigrid method is described and implemented as an open-source library. It is tested on a model Poisson equation and within the context of CFD. Two CFD test cases are used: the canonical lid-driven cavity flow and the flow simulation of a ship (KVLCC2). The chaotic-cycle multigrid shows good scalability and numerical performance compared to classical V-, W- and F-cycles. On 2048 cores, the chaotic-cycle multigrid solver performs faster than both Flexible-GMRES and classical V-cycle multigrid. Further improvements to chaotic-cycle multigrid can be made, relating to coarse-grid communications and desynchronized residual computations. It is expected that the chaotic-cycle multigrid could be applied in other scientific fields, wherever a scalable elliptic-equation solver is required.
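
    To make the ‘chaotic relaxation’ idea concrete, here is a serial Python cartoon of a desynchronized Jacobi-style sweep; this is an illustrative sketch, not the paper's distributed implementation. Unknowns are updated in place and in random order, so each update sees an unpredictable mix of old and new neighbour values, which is the situation the chaotic-relaxation convergence theory addresses; on a parallel machine the mixing would come from processes relaxing without synchronization.

        import numpy as np

        def chaotic_jacobi(A, b, sweeps=300, seed=0):
            """Jacobi-style relaxation with a random, in-place update order."""
            rng = np.random.default_rng(seed)
            n = len(b)
            x = np.zeros(n)
            d = A.diagonal()
            for _ in range(sweeps):
                for i in rng.permutation(n):   # arbitrary order, no synchronization
                    # each update uses whatever neighbour values are in memory now
                    x[i] = (b[i] - A[i] @ x + d[i] * x[i]) / d[i]
            return x

        # usage: 1D Poisson model problem
        n = 32
        A = (np.diag(2.0 * np.ones(n))
             - np.diag(np.ones(n - 1), 1)
             - np.diag(np.ones(n - 1), -1))
        b = np.ones(n)
        x = chaotic_jacobi(A, b)
        print(np.linalg.norm(A @ x - b))       # residual shrinks with more sweeps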

    Algebraic Multigrid for Stokes Equations

    Full text link

    Nonlinear FETI-DP and BDDC Methods

    Get PDF
    In the simulation of deformation processes in material science, the consideration of a microscopic material structure is often necessary, as in the simulation of modern high-strength steels. A straightforward finite element discretization of the complete deformed body resolving the microscopic structure leads to very large nonlinear problems, and a solution is out of reach even on modern supercomputers. In homogenization approaches, such as the computational scale-bridging approach FE2, the macroscopic scale of the deformed object is decoupled from the microscopic scale of the material structure. These approaches consider the microstructure only in a localized fashion, on independent and parallel representative volume elements (RVEs). This introduces massive parallelism on the macroscopic level and is thus ideal for modern computer architectures with large numbers of parallel computational cores. Nevertheless, the discretization of an RVE can still result in large nonlinear problems, and thus highly scalable parallel solvers are necessary. In this context, nonlinear FETI-DP (Finite Element Tearing and Interconnecting - Dual-Primal) and BDDC (Balancing Domain Decomposition by Constraints) domain decomposition methods are discussed in this thesis; these are parallel solution methods for nonlinear problems arising from a finite element discretization. These approaches can be viewed as strategies to further localize the computational work and to extend the parallel scalability of classical FETI-DP and BDDC methods towards extreme-scale supercomputers. Variants providing an inexact solution of the FETI-DP coarse problem are also considered in this thesis, combining two successful paradigms, i.e., nonlinear domain decomposition and AMG (Algebraic Multigrid). An efficient implementation of the resulting inexact reduced Nonlinear-FETI-DP-1 method is presented, and scalability beyond 200,000 computational cores is shown. Finally, a highly scalable FE2 implementation using recent inexact reduced FETI-DP methods to solve the RVE problems on the microscopic level is presented, and scalability on all 458,752 cores of the JUQUEEN BlueGene/Q system at Forschungszentrum Jülich is demonstrated.
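
    The dual-primal machinery itself is involved, but the underlying ‘tear and interconnect’ idea can be illustrated in a few lines. The Python sketch below is a toy illustration using plain static condensation onto the interface, not FETI-DP proper: a 1D Poisson problem is split into two subdomains whose interiors are eliminated independently (and could be processed in parallel), leaving only a small interface Schur-complement problem to couple them.

        import numpy as np

        def poisson(n, h):
            """1D Poisson stiffness matrix (linear FEM) on n interior nodes."""
            return (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
                    - np.diag(np.ones(n - 1), -1)) / h

        n = 15                     # interior unknowns; node 7 is the interface
        h = 1.0 / (n + 1)
        A = poisson(n, h)
        f = np.ones(n)
        I1, I2, G = np.arange(0, 7), np.arange(8, 15), np.array([7])

        # independent local subdomain blocks and their interface couplings
        A11, A22 = A[np.ix_(I1, I1)], A[np.ix_(I2, I2)]
        A1G, A2G = A[np.ix_(I1, G)], A[np.ix_(I2, G)]
        AGG = A[np.ix_(G, G)]

        # eliminate both interiors: interface Schur complement and its load
        S = AGG - A1G.T @ np.linalg.solve(A11, A1G) - A2G.T @ np.linalg.solve(A22, A2G)
        g = (f[G] - A1G.T @ np.linalg.solve(A11, f[I1])
                  - A2G.T @ np.linalg.solve(A22, f[I2]))
        uG = np.linalg.solve(S, g)               # tiny interface problem

        # back-substitute for the interior unknowns, subdomain by subdomain
        u1 = np.linalg.solve(A11, f[I1] - A1G @ uG)
        u2 = np.linalg.solve(A22, f[I2] - A2G @ uG)

        u = np.empty(n); u[I1], u[I2], u[G] = u1, u2, uG
        print(np.linalg.norm(A @ u - f))         # ~ machine precision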