33 research outputs found

    Augmented Block-Arnoldi Recycling CFD Solvers

    One of the limitations of recycled GCRO methods is the large amount of computation required to orthogonalize the basis vectors of the newly generated Krylov subspace for the approximate solution when combined with those of the recycle subspace. Recent advancements in low synchronization Gram-Schmidt and generalized minimal residual algorithms, Swirydowicz et al.~\cite{2020-swirydowicz-nlawa}, Carson et al.~\cite{Carson2022}, and Lund~\cite{Lund2022}, can be incorporated, thereby mitigating the loss of orthogonality of the basis vectors. An augmented Arnoldi formulation of recycling leads to a matrix decomposition and the associated algorithm can also be viewed as a {\it block} Krylov method. Generalizations of both classical and modified block Gram-Schmidt algorithms have been proposed, Carson et al.~\cite{Carson2022}. Here, an inverse compact $WY$ modified Gram-Schmidt algorithm is applied for the inter-block orthogonalization scheme with a block lower triangular correction matrix $T_k$ at iteration $k$. When combined with a weighted (oblique inner product) projection step, the inverse compact $WY$ scheme leads to significant (over 10$\times$ in certain cases) reductions in the number of solver iterations per linear system. The weight is also interpreted in terms of the angle between restart residuals in LGMRES, as defined by Baker et al.~\cite{Baker2005}. In many cases, the recycle subspace eigen-spectrum can substitute for a preconditioner

    Block Iterative Methods and Recycling for Improved Scalability of Linear Solvers

    Contemporary large-scale Partial Differential Equation (PDE) simulations usually require the solution of large and sparse linear systems. Moreover, it is often necessary to solve these linear systems with different or multiple Right-Hand Sides (RHSs). In this paper, various strategies will be presented to extend the scalability of existing linear solvers using appropriate recycling strategies or block methods, i.e., by treating multiple right-hand sides simultaneously. The scalability of this work is assessed by performing simulations on up to 8,192 cores for solving linear systems arising from various physical phenomena modeled by Poisson's equation, the system of linear elasticity, or Maxwell's equation. This work is shipped as part of an open-source software package, readily available and usable in any C, C++, or Python code. In particular, some simulations are performed on top of a well-established library, PETSc, and it is shown how our approaches can be used to decrease time to solution by 30%

    A General Algorithm for Reusing Krylov Subspace Information. I. Unsteady Navier-Stokes

    A general algorithm is developed that reuses available information to accelerate the iterative convergence of linear systems with multiple right-hand sides $A x = b^{(i)}$, which are commonly encountered in steady or unsteady simulations of nonlinear equations. The algorithm is based on the classical GMRES algorithm with eigenvector enrichment but also includes a Galerkin projection preprocessing step and several novel Krylov subspace reuse strategies. The new approach is applied to a set of test problems, including an unsteady turbulent airfoil, and is shown in some cases to provide significant improvement in computational efficiency relative to baseline approaches
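
The Galerkin projection preprocessing step mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract gives no formulas, so the form used here (an initial guess in the span of previously stored vectors whose residual is orthogonal to that span) is an assumption. The names `galerkin_initial_guess`, `solve_dense`, `matvec`, and `dot` are hypothetical, and the reused vectors `W` are assumed to be linearly independent with a nonsingular projected matrix.

```python
def matvec(A, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve_dense(M, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            m = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= m * aug[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x


def galerkin_initial_guess(A, W, b):
    """Return x0 in span(W) such that b - A x0 is orthogonal to span(W).

    W is a list of reused vectors (assumption: kept from earlier solves).
    """
    AW = [matvec(A, w) for w in W]
    G = [[dot(wi, awj) for awj in AW] for wi in W]  # projected matrix W^T A W
    c = solve_dense(G, [dot(w, b) for w in W])      # coefficients in span(W)
    return [sum(c[j] * W[j][i] for j in range(len(W))) for i in range(len(b))]


# Illustrative data only: a small symmetric system and two reused vectors.
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
b = [1.0, 2.0, 3.0]
x0 = galerkin_initial_guess(A, W, b)
r0 = [bi - yi for bi, yi in zip(b, matvec(A, x0))]
```

In this sketch `x0` would then be handed to the Krylov solver as its starting point, so that the iteration only has to reduce the part of the residual that the reused subspace could not capture.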

    Improving Pseudo-Time Stepping Convergence for CFD Simulations With Neural Networks

    Computational fluid dynamics (CFD) simulations of viscous fluids described by the Navier-Stokes equations are considered. Depending on the Reynolds number of the flow, the Navier-Stokes equations may exhibit a highly nonlinear behavior. The system of nonlinear equations resulting from the discretization of the Navier-Stokes equations can be solved using nonlinear iteration methods, such as Newton's method. However, fast quadratic convergence is typically only obtained in a local neighborhood of the solution, and for many configurations, the classical Newton iteration does not converge at all. In such cases, so-called globalization techniques may help to improve convergence. In this paper, pseudo-transient continuation is employed in order to improve nonlinear convergence. The classical algorithm is enhanced by a neural network model that is trained to predict a local pseudo-time step. Generalization of the novel approach is facilitated by predicting the local pseudo-time step separately on each element using only local information on a patch of adjacent elements as input. Numerical results for standard benchmark problems, including flow through a backward facing step geometry and Couette flow, show the performance of the machine learning-enhanced globalization approach; as the software for the simulations, the CFD module of COMSOL Multiphysics is employed
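
To make the globalization idea in this abstract concrete, here is a minimal scalar sketch of pseudo-transient continuation. It is not the paper's method: the neural network that predicts a local pseudo-time step is replaced by the common switched evolution relaxation (SER) heuristic, and the test function, the starting point, and the names `ptc_solve` and `newton_step` are assumptions chosen only for illustration.

```python
import math


def ptc_solve(f, df, u0, dt0=1.0, tol=1e-10, max_iter=100):
    """Pseudo-transient continuation for a scalar equation f(u) = 0.

    Each step solves (1/dt + f'(u)) * du = -f(u). The pseudo-time step is
    then updated with the SER heuristic dt <- dt * |f_old| / |f_new|
    (assumption: this stands in for a learned step-size predictor).
    """
    u, dt = u0, dt0
    f_old = f(u)
    for it in range(1, max_iter + 1):
        du = -f_old / (1.0 / dt + df(u))  # damped Newton-like update
        u += du
        f_new = f(u)
        if abs(f_new) < tol:
            return u, it
        dt *= abs(f_old) / abs(f_new)     # grow dt as the residual drops
        f_old = f_new
    return u, max_iter


def newton_step(f, df, u):
    """One undamped Newton step, for comparison."""
    return u - f(u) / df(u)


def datan(u):
    """Derivative of atan(u)."""
    return 1.0 / (1.0 + u * u)


# atan(u) = 0 from u0 = 3: plain Newton overshoots, the damped scheme does not.
root, iters = ptc_solve(math.atan, datan, 3.0)
overshoot = newton_step(math.atan, datan, 3.0)
```

For small `dt` the update behaves like a cautious step along the residual, and as `dt` grows the term `1/dt` vanishes and the iteration approaches Newton's method, which is why the scheme can keep fast local convergence while avoiding the initial overshoot.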

    Contribution to the study of efficient iterative methods for the numerical solution of partial differential equations

    Multigrid and domain decomposition methods provide efficient algorithms for the numerical solution of partial differential equations arising in the modelling of many applications in Computational Science and Engineering. This manuscript covers certain aspects of modern iterative methods for the solution of large-scale problems arising from the discretization of partial differential equations. More specifically, we focus on geometric multigrid methods, non-overlapping substructuring methods and flexible Krylov subspace methods with a particular emphasis on their combination. Firstly, the combination of multigrid and Krylov subspace methods is investigated on a linear partial differential equation modelling wave propagation in heterogeneous media. Secondly, we focus on non-overlapping domain decomposition methods for a specific finite element discretization known as the hp finite element, where unrefinement/refinement is allowed both by decreasing/increasing the step size h or by decreasing/increasing the polynomial degree p of the approximation on each element. Results on condition number bounds for the domain decomposition preconditioned operators are given and illustrated by numerical results on academic problems in two and three dimensions. Thirdly, we review recent advances related to a class of Krylov subspace methods allowing variable preconditioning. We examine in detail flexible Krylov subspace methods including augmentation and/or spectral deflation, where deflation aims at capturing approximate invariant subspace information. We also present flexible Krylov subspace methods for the solution of linear systems with multiple right-hand sides given simultaneously. The efficiency of the numerical methods is demonstrated on challenging applications in seismics requiring the solution of huge linear systems of equations with multiple right-hand sides on parallel distributed memory computers. Finally, we present current and future perspectives on the design of efficient algorithms on extreme scale machines for the solution of problems coming from the discretization of partial differential equations

    Kernel solver design of FPGA-based real-time simulator for active distribution networks

    The field-programmable gate array (FPGA)-based real-time simulator takes advantage of many merits of FPGA, such as small time-step, high simulation precision, rich I/O interface resources, and low cost. The sparse linear equations formed by the node conductance matrix need to be solved repeatedly within each time-step, which introduces great challenges to the performance of the real-time simulator. In this paper, a fine-grained solver of the FPGA-based real-time simulator for active distribution networks is designed to meet the computational demand. The framework of the solver, offline process design on PC and online process design on FPGA are proposed in detail. The modified IEEE 33-node system with photovoltaics is simulated on a 4-FPGA-based real-time simulator. Simulation results are compared with PSCAD/EMTDC under the same conditions to validate the solver design