70 research outputs found

    Adaptive Aggregation Based Domain Decomposition Multigrid for the Lattice Wilson Dirac Operator

    In lattice QCD computations a substantial amount of work is spent in solving discretized versions of the Dirac equation. Conventional Krylov solvers show critical slowing down for large system sizes and physically interesting parameter regions. We present a domain decomposition adaptive algebraic multigrid method used as a preconditioner to solve the "clover improved" Wilson discretization of the Dirac equation. This approach combines and improves two approaches, namely domain decomposition and adaptive algebraic multigrid, that have been used separately in lattice QCD before. We show in extensive numerical tests conducted with a parallel production code implementation that considerable speed-up over conventional Krylov subspace methods, domain decomposition methods and other hierarchical approaches can be achieved for realistic system sizes. Comment: Additional comparison to method of arXiv:1011.2775 and to mixed-precision odd-even preconditioned BiCGStab. Results of numerical experiments changed slightly due to more systematic use of odd-even preconditioning.

    An algebraic multigrid method for $Q_2-Q_1$ mixed discretizations of the Navier-Stokes equations

    Algebraic multigrid (AMG) preconditioners are considered for discretized systems of partial differential equations (PDEs) where unknowns associated with different physical quantities are not necessarily co-located at mesh points. Specifically, we investigate a $Q_2-Q_1$ mixed finite element discretization of the incompressible Navier-Stokes equations where the number of velocity nodes is much greater than the number of pressure nodes. Consequently, some velocity degrees-of-freedom (dofs) are defined at spatial locations where there are no corresponding pressure dofs. Thus, AMG approaches leveraging this co-located structure are not applicable. This paper instead proposes an automatic AMG coarsening that mimics certain pressure/velocity dof relationships of the $Q_2-Q_1$ discretization. The main idea is to first automatically define coarse pressures in a somewhat standard AMG fashion and then to carefully (but automatically) choose coarse velocity unknowns so that the spatial location relationship between pressure and velocity dofs resembles that on the finest grid. To define coefficients within the inter-grid transfers, an energy minimization AMG (EMIN-AMG) is utilized. EMIN-AMG is not tied to specific coarsening schemes and grid transfer sparsity patterns, and so it is applicable to the proposed coarsening. Numerical results highlighting solver performance are given on Stokes and incompressible Navier-Stokes problems. Comment: Submitted to a journal

    Efficient Nonlinear Solvers for Nodal High-Order Finite Elements in 3D

    Conventional high-order finite element methods are rarely used for industrial problems because the Jacobian rapidly loses sparsity as the order is increased, leading to unaffordable solve times and memory requirements. This effect typically limits order to at most quadratic, despite the favorable accuracy and stability properties offered by quadratic and higher order discretizations. We present a method in which the action of the Jacobian is applied matrix-free exploiting a tensor product basis on hexahedral elements, while much sparser matrices based on $Q_1$ sub-elements on the nodes of the high-order basis are assembled for preconditioning. With this "dual-order" scheme, storage is independent of spectral order and a natural taping scheme is available to update a full-accuracy matrix-free Jacobian during residual evaluation. Matrix-free Jacobian application circumvents the memory bandwidth bottleneck typical of sparse matrix operations, providing several times greater floating point performance and better use of multiple cores with a shared memory bus. Computational results for the p-Laplacian and Stokes problem, using block preconditioners and AMG, demonstrate mesh-independent convergence rates and weak (bounded) dependence on order, even for highly deformed meshes and nonlinear systems with several orders of magnitude dynamic range in coefficients. For spectral orders around 5, the dual-order scheme requires half the memory and similar time to assembled quadratic ($Q_2$) elements, making it very affordable for general use.
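    The matrix-free idea in the abstract above can be illustrated with a small sketch. This is not the authors' code: it is a 2D toy, assuming the operator has the Kronecker form K = A ⊗ B for two small dense 1D matrices `A` and `B` (hypothetical stand-ins for 1D basis operators). It compares applying the assembled K with applying A U Bᵀ to the reshaped vector, which never forms K.

```python
# Sketch only: a 2D tensor-product operator applied with and without assembly.
# A, B, and u are made-up illustration data, not taken from the paper.

def matmul(X, Y):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]

def kron(A, B):
    """Assembled Kronecker product A (x) B, used only as a reference."""
    p, q = len(B), len(B[0])
    return [[A[i // p][j // q] * B[i % p][j % q]
             for j in range(len(A[0]) * q)]
            for i in range(len(A) * p)]

def apply_assembled(A, B, u):
    """Form K = A (x) B explicitly, then compute K u."""
    K = kron(A, B)
    return [sum(kij * uj for kij, uj in zip(row, u)) for row in K]

def apply_matrix_free(A, B, u):
    """Compute (A (x) B) u as A U B^T, where U is the row-major reshape of u."""
    n, m = len(A[0]), len(B[0])
    U = [u[j * m:(j + 1) * m] for j in range(n)]
    V = matmul(matmul(A, U), transpose(B))
    return [v for row in V for v in row]

A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
B = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
u = [float(i + 1) for i in range(9)]

y_ref = apply_assembled(A, B, u)
y_mf = apply_matrix_free(A, B, u)
max_diff = max(abs(a - b) for a, b in zip(y_ref, y_mf))
```

    In this toy setting the matrix-free path stores only the two 1D matrices, while the assembled path stores their full Kronecker product; the gap widens as the 1D size grows, which is the storage effect the abstract refers to.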

    Performance of an MPI-only semiconductor device simulator on a quad socket/quad core InfiniBand platform.
