
    Analyzing the effect of local rounding error propagation on the maximal attainable accuracy of the pipelined Conjugate Gradient method

    Pipelined Krylov subspace methods typically offer improved strong scaling on parallel HPC hardware compared to standard Krylov subspace methods for large and sparse linear systems. In pipelined methods the traditional synchronization bottleneck is mitigated by overlapping time-consuming global communications with useful computations. However, to achieve this communication hiding, pipelined methods introduce additional recurrence relations for a number of auxiliary variables that are required to update the approximate solution. This paper studies the influence of the local rounding errors introduced by these additional recurrences in the pipelined Conjugate Gradient (CG) method. Specifically, we analyze the impact of local round-off effects on the attainable accuracy of the pipelined CG algorithm and compare it with that of the traditional CG method. Furthermore, we estimate the gap between the true residual and the recursively computed residual used in the algorithm. Based on this estimate we suggest an automated residual replacement strategy to reduce the loss of attainable accuracy in the final iterative solution. The resulting pipelined CG method with residual replacement improves the maximal attainable accuracy of pipelined CG while maintaining the efficient parallel performance of the pipelined method. This conclusion is substantiated by numerical results for a variety of benchmark problems. Comment: 26 pages, 6 figures, 2 tables, 4 algorithms
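
    The "gap" between the true residual b - A x and the recursively updated residual r can be made concrete with a small sketch. The code below is plain textbook CG, not the pipelined variant, and it measures the gap directly with an extra matrix-vector product instead of using the estimate proposed in the paper. The names `matvec`, `dot`, and `cg_with_gap` are hypothetical helpers introduced only for this illustration.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


def cg_with_gap(A, b, iters):
    """Textbook CG for SPD A that also records ||(b - A x) - r|| per step.

    Illustration only: the gap is computed explicitly here, which costs
    one extra matvec per iteration.
    """
    x = [0.0] * len(b)
    r = b[:]            # recursively updated residual (x0 = 0)
    p = r[:]            # search direction
    rr = dot(r, r)
    gaps = []
    for _ in range(iters):
        if rr == 0.0:   # exact convergence, nothing left to do
            break
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        # True residual, recomputed from the current iterate.
        true_r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
        gap = [t - ri for t, ri in zip(true_r, r)]
        gaps.append(dot(gap, gap) ** 0.5)
        rr_new = dot(r, r)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, gaps


x, gaps = cg_with_gap([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0], 2)
```

    A residual replacement step would overwrite `r` with `true_r` once the gap grows too large. When to trigger that replacement is exactly what the paper's estimate is meant to decide, and it is not reproduced here.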

    On choice of preconditioner for minimum residual methods for nonsymmetric matrices

    Existing convergence bounds for Krylov subspace methods such as GMRES for nonsymmetric linear systems give little mathematical guidance for the choice of preconditioner. Here, we establish a desirable mathematical property of a preconditioner which guarantees that convergence of a minimum residual method will essentially depend only on the eigenvalues of the preconditioned system, as is true in the symmetric case. Our theory covers only a subset of nonsymmetric coefficient matrices, but computations indicate that it might be more generally applicable.

    A framework for deflated and augmented Krylov subspace methods

    We consider deflation and augmentation techniques for accelerating the convergence of Krylov subspace methods for the solution of nonsingular linear algebraic systems. Despite some formal similarity, the two techniques are conceptually different from preconditioning. Deflation (in the sense the term is used here) "removes" certain parts from the operator, making it singular, while augmentation adds a subspace to the Krylov subspace (often the one that is generated by the singular operator); in contrast, preconditioning changes the spectrum of the operator without making it singular. Deflation and augmentation have been used in a variety of methods and settings. Typically, deflation is combined with augmentation to compensate for the singularity of the operator, but both techniques can be applied separately. We introduce a framework of Krylov subspace methods that satisfy a Galerkin condition. It includes the families of orthogonal residual (OR) and minimal residual (MR) methods. We show that in this framework augmentation can be achieved either explicitly or, equivalently, implicitly by projecting the residuals appropriately and correcting the approximate solutions in a final step. We study conditions for a breakdown of the deflated methods, and we show several possibilities to avoid such breakdowns for the deflated MINRES method. Numerical experiments illustrate properties of different variants of deflated MINRES analyzed in this paper. Comment: 24 pages, 3 figures
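
    How deflation makes the operator singular can be seen in a minimal sketch. It assumes one common choice of projector, P = I - A u (u^T A u)^{-1} u^T with a single deflation vector u, which is not necessarily the construction used in the paper's framework. The helper names `matvec`, `dot`, and `deflated_matvec` are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


def deflated_matvec(A, u, v):
    """Apply P A to v, where P = I - A u (u^T A u)^{-1} u^T.

    Assumed projector for illustration; requires u^T A u != 0.
    """
    Av = matvec(A, v)
    Au = matvec(A, u)
    coeff = dot(u, Av) / dot(u, Au)
    return [avi - coeff * aui for avi, aui in zip(Av, Au)]


A = [[4.0, 1.0], [1.0, 3.0]]
u = [1.0, 0.0]
in_nullspace = deflated_matvec(A, u, u)            # P A u = 0
other = deflated_matvec(A, u, [0.0, 1.0])          # component along u removed
```

    With this choice, u lies in the null space of P A, so the deflated operator is singular by construction. The solution component associated with u then has to be recovered separately, which is the role the abstract assigns to augmentation or a final correction step.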

    Numerically Stable Recurrence Relations for the Communication Hiding Pipelined Conjugate Gradient Method

    Pipelined Krylov subspace methods (also referred to as communication-hiding methods) have been proposed in the literature as a scalable alternative to classic Krylov subspace algorithms for iteratively computing the solution to a large linear system in parallel. For symmetric and positive definite system matrices the pipelined Conjugate Gradient method outperforms its classic Conjugate Gradient counterpart on large scale distributed memory hardware by overlapping global communication with essential computations like the matrix-vector product, thus hiding global communication. A well-known drawback of the pipelining technique is the (possibly significant) loss of numerical stability. In this work a numerically stable variant of the pipelined Conjugate Gradient algorithm is presented that avoids the propagation of local rounding errors in the finite precision recurrence relations that construct the Krylov subspace basis. The multi-term recurrence relation for the basis vector is replaced by two-term recurrences, improving stability without increasing the overall computational cost of the algorithm. The proposed modification ensures that the pipelined Conjugate Gradient method is able to attain a highly accurate solution independently of the pipeline length. Numerical experiments demonstrate a combination of excellent parallel performance and improved maximal attainable accuracy for the new pipelined Conjugate Gradient algorithm. This work thus resolves one of the major practical restrictions on the usability of pipelined Krylov subspace methods. Comment: 15 pages, 5 figures, 1 table, 2 algorithms

    Predict-and-recompute conjugate gradient variants

    The standard implementation of the conjugate gradient algorithm suffers from communication bottlenecks on parallel architectures, due primarily to the two global reductions required every iteration. In this paper, we introduce several predict-and-recompute type conjugate gradient variants, which decrease the runtime per iteration by overlapping global synchronizations and, in the case of our pipelined variants, matrix-vector products. Through the use of a predict-and-recompute scheme, whereby recursively updated quantities are first used as a predictor for their true values and then recomputed exactly at a later point in the iteration, our variants are observed to have convergence properties nearly as good as the standard conjugate gradient implementation on every problem we tested. It is also verified experimentally that our variants do indeed reduce runtime per iteration in practice, and that they scale similarly to previously studied communication-hiding variants. Finally, because our variants achieve good convergence without the use of any additional input parameters, they have the potential to be used in place of the standard conjugate gradient implementation in a range of applications. Comment: This material is based upon work supported by the NSF GRFP. Code for reproducing all figures and tables in this paper can be found here: https://github.com/tchen01/new_cg_variant