Analyzing the effect of local rounding error propagation on the maximal attainable accuracy of the pipelined Conjugate Gradient method
Pipelined Krylov subspace methods typically offer improved strong scaling on
parallel HPC hardware compared to standard Krylov subspace methods for large
and sparse linear systems. In pipelined methods the traditional synchronization
bottleneck is mitigated by overlapping time-consuming global communications
with useful computations. However, to achieve this communication hiding
strategy, pipelined methods introduce additional recurrence relations for a
number of auxiliary variables that are required to update the approximate
solution. This paper aims at studying the influence of local rounding errors
that are introduced by the additional recurrences in the pipelined Conjugate
Gradient method. Specifically, we analyze the impact of local round-off effects
on the attainable accuracy of the pipelined CG algorithm and compare to the
traditional CG method. Furthermore, we estimate the gap between the true
residual and the recursively computed residual used in the algorithm. Based on
this estimate we suggest an automated residual replacement strategy to reduce
the loss of attainable accuracy on the final iterative solution. The resulting
pipelined CG method with residual replacement improves the maximal attainable
accuracy of pipelined CG, while maintaining the efficient parallel performance
of the pipelined method. This conclusion is substantiated by numerical results
for a variety of benchmark problems. Comment: 26 pages, 6 figures, 2 tables, 4 algorithms
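The gap between the true residual b - Ax and the recursively updated residual can be illustrated with a minimal sketch. The code below uses classic (not pipelined) CG and, as a stand-in assumption, replaces the recursive residual by the true one every `replace_every` iterations; the paper's automated criterion is based on a rounding-error estimate and is not reproduced here. All function names (`laplacian_1d`, `cg_with_replacement`, and so on) are hypothetical, and the true residual is recomputed in every iteration purely for monitoring.

```python
def laplacian_1d(n):
    """Dense 1D Laplacian (tridiagonal 2, -1), a small SPD test matrix."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(v):
    return dot(v, v) ** 0.5

def cg_with_replacement(A, b, iters=50, replace_every=10, tol=1e-12):
    """Classic CG for SPD A that records the residual gap ||(b - A x) - r||.

    Assumption for illustration: the recursive residual r is overwritten by
    the true residual every `replace_every` iterations (fixed period).
    """
    x = [0.0] * len(b)
    r = b[:]                      # recursively updated residual (x0 = 0)
    p = r[:]
    rr = dot(r, r)
    gaps = []
    for k in range(1, iters + 1):
        if rr ** 0.5 <= tol:      # converged according to recursive residual
            break
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        # Explicitly computed residual, used here only to measure the gap.
        true_r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
        gaps.append(norm([t - ri for t, ri in zip(true_r, r)]))
        if k % replace_every == 0:
            r = true_r            # residual replacement step
        rr_new = dot(r, r)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, gaps
```

Calling `cg_with_replacement(laplacian_1d(8), [1.0] * 8)` returns the approximate solution together with the per-iteration gap history. On such a small, well-conditioned system the gap stays near machine precision; the loss of attainable accuracy discussed above concerns the additional recurrences of the pipelined method.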
On choice of preconditioner for minimum residual methods for nonsymmetric matrices
Existing convergence bounds for Krylov subspace methods such as GMRES for nonsymmetric linear systems give little mathematical guidance for the choice of preconditioner. Here, we establish a desirable mathematical property of a preconditioner which guarantees that convergence of a minimum residual method will essentially depend only on the eigenvalues of the preconditioned system, as is true in the symmetric case. Our theory covers only a subset of nonsymmetric coefficient matrices, but computations indicate that it might be more generally applicable.
A framework for deflated and augmented Krylov subspace methods
We consider deflation and augmentation techniques for accelerating the
convergence of Krylov subspace methods for the solution of nonsingular linear
algebraic systems. Despite some formal similarity, the two techniques are
conceptually different from preconditioning. Deflation (in the sense the term
is used here) "removes" certain parts from the operator making it singular,
while augmentation adds a subspace to the Krylov subspace (often the one that
is generated by the singular operator); in contrast, preconditioning changes
the spectrum of the operator without making it singular. Deflation and
augmentation have been used in a variety of methods and settings. Typically,
deflation is combined with augmentation to compensate for the singularity of
the operator, but both techniques can be applied separately.
We introduce a framework of Krylov subspace methods that satisfy a Galerkin
condition. It includes the families of orthogonal residual (OR) and minimal
residual (MR) methods. We show that in this framework augmentation can be
achieved either explicitly or, equivalently, implicitly by projecting the
residuals appropriately and correcting the approximate solutions in a final
step. We study conditions for a breakdown of the deflated methods, and we show
several possibilities to avoid such breakdowns for the deflated MINRES method.
Numerical experiments illustrate properties of different variants of deflated
MINRES analyzed in this paper. Comment: 24 pages, 3 figures
Numerically Stable Recurrence Relations for the Communication Hiding Pipelined Conjugate Gradient Method
Pipelined Krylov subspace methods (also referred to as communication-hiding
methods) have been proposed in the literature as a scalable alternative to
classic Krylov subspace algorithms for iteratively computing the solution to a
large linear system in parallel. For symmetric and positive definite system
matrices the pipelined Conjugate Gradient method outperforms its classic
Conjugate Gradient counterpart on large scale distributed memory hardware by
overlapping global communication with essential computations like the
matrix-vector product, thus hiding global communication. A well-known drawback
of the pipelining technique is the (possibly significant) loss of numerical
stability. In this work a numerically stable variant of the pipelined Conjugate
Gradient algorithm is presented that avoids the propagation of local rounding
errors in the finite precision recurrence relations that construct the Krylov
subspace basis. The multi-term recurrence relation for the basis vector is
replaced by two-term recurrences, improving stability without increasing the
overall computational cost of the algorithm. The proposed modification ensures
that the pipelined Conjugate Gradient method is able to attain a highly
accurate solution independently of the pipeline length. Numerical experiments
demonstrate a combination of excellent parallel performance and improved
maximal attainable accuracy for the new pipelined Conjugate Gradient algorithm.
This work thus resolves one of the major practical restrictions for the
usability of pipelined Krylov subspace methods. Comment: 15 pages, 5 figures, 1 table, 2 algorithms
Predict-and-recompute conjugate gradient variants
The standard implementation of the conjugate gradient algorithm suffers from
communication bottlenecks on parallel architectures, due primarily to the two
global reductions required every iteration. In this paper, we introduce several
predict-and-recompute type conjugate gradient variants, which decrease the
runtime per iteration by overlapping global synchronizations, and in the case
of our pipelined variants, matrix-vector products. Through the use of a
predict-and-recompute scheme, whereby recursively updated quantities are first
used as a predictor for their true values and then recomputed exactly at a
later point in the iteration, our variants are observed to have convergence
properties nearly as good as those of the standard conjugate gradient
implementation on every problem we tested. It is also verified experimentally
that our variants do indeed reduce runtime per iteration in practice, and that
they scale similarly to previously studied communication hiding variants.
Finally, because our variants achieve good convergence without the use of any
additional input parameters, they have the potential to be used in place of the
standard conjugate gradient implementation in a range of applications. Comment: This material is based upon work supported by the NSF GRFP. Code for
reproducing all figures and tables in this paper can be found here:
https://github.com/tchen01/new_cg_variant
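To illustrate the bottleneck of two global reductions per iteration, the sketch below implements the classical Chronopoulos-Gear rearrangement of CG, in which both inner products of an iteration are available at the same point and can be fused into one reduction. This is not one of the predict-and-recompute variants from the paper; it is shown only as a well-known example of restructuring CG around its reductions. The function names are hypothetical, the code is serial (the "reduction" is just a pair of dot products), and it assumes A is symmetric positive definite and b is nonzero.

```python
def laplacian_1d(n):
    """Dense 1D Laplacian (tridiagonal 2, -1), a small SPD test matrix."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cg_single_reduction(A, b, iters=50, tol=1e-12):
    """Chronopoulos-Gear CG: one fused reduction per iteration.

    Uses the identity (p, A p) = (r, A r) - beta * (r, r) / alpha_prev,
    so only (r, r) and (r, A r) are needed, and both depend on the same r.
    """
    n = len(b)
    x = [0.0] * n
    r = b[:]                        # residual for x0 = 0
    w = matvec(A, r)                # w = A r
    gamma = dot(r, r)
    delta = dot(r, w)
    alpha = gamma / delta           # first step: p = r, so (p, A p) = delta
    beta = 0.0
    p = [0.0] * n
    s = [0.0] * n                   # s tracks A p via a recurrence
    reductions = 1
    for _ in range(iters):
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        s = [wi + beta * si for wi, si in zip(w, s)]
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * si for ri, si in zip(r, s)]
        w = matvec(A, r)
        # Both inner products use the same r: a single global reduction
        # in a distributed-memory implementation.
        gamma_new, delta = dot(r, r), dot(r, w)
        reductions += 1
        if gamma_new ** 0.5 <= tol:
            break
        beta = gamma_new / gamma
        alpha = gamma_new / (delta - beta * gamma_new / alpha)
        gamma = gamma_new
    return x, reductions
```

For example, `cg_single_reduction(laplacian_1d(8), [1.0] * 8)` returns the approximate solution and the number of fused reductions performed. The price of the rearrangement is the extra recurrence for `s`, which is the kind of recursively updated quantity whose rounding-error behaviour the abstracts above are concerned with.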