Predict-and-recompute conjugate gradient variants
The standard implementation of the conjugate gradient algorithm suffers from
communication bottlenecks on parallel architectures, due primarily to the two
global reductions required every iteration. In this paper, we introduce several
predict-and-recompute type conjugate gradient variants, which decrease the
runtime per iteration by overlapping global synchronizations and, in the case
of our pipelined variants, matrix-vector products. Through the use of a
predict-and-recompute scheme, whereby recursively updated quantities are first
used as a predictor for their true values and then recomputed exactly at a
later point in the iteration, our variants are observed to have convergence
properties nearly as good as those of the standard conjugate gradient
implementation on every problem we tested. It is also verified experimentally
that our variants do indeed reduce runtime per iteration in practice, and that
they scale similarly to previously studied communication-hiding variants.
Finally, because our variants achieve good convergence without the use of any
additional input parameters, they have the potential to be used in place of the
standard conjugate gradient implementation in a range of applications.Comment: This material is based upon work supported by the NSF GRFP. Code for
reproducing all figures and tables in the this paper can be found here:
https://github.com/tchen01/new_cg_variant
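
The abstract describes the predict-and-recompute pattern only in general terms. As a rough illustration, the plain NumPy sketch below (written for this summary, not taken from the paper or the linked repository; the function name and tolerance are made up) shows where the pattern can appear in a sequential conjugate gradient loop: the scalar nu = <r, r> is first predicted from a scalar recurrence so that beta can be formed immediately, and is then recomputed exactly from the current residual before its next use. The variants in the paper reorganize the iteration so that these steps overlap global reductions (and, in the pipelined variants, matrix-vector products) on parallel machines; this sketch does none of that and only illustrates the predict-then-recompute ordering.

    import numpy as np

    def predict_and_recompute_cg_sketch(A, b, max_iter=200, tol=1e-10):
        # Illustrative sketch of the predict-and-recompute idea, not the
        # paper's exact variants. A is assumed symmetric positive definite.
        x = np.zeros_like(b)
        r = b - A @ x               # residual
        p = r.copy()                # search direction
        nu = r @ r                  # exact <r, r>
        for _ in range(max_iter):
            s = A @ p
            mu = p @ s              # <p, A p>   (a global reduction in parallel)
            sigma = s @ s           # <A p, A p> (can share the same reduction)
            alpha = nu / mu
            x = x + alpha * p
            r = r - alpha * s
            # Predict: a recurrence gives the new <r, r> from scalars already
            # in hand, so the next direction can be formed without waiting.
            nu_pred = alpha * alpha * sigma - nu
            if nu_pred < tol * tol:
                break
            beta = nu_pred / nu
            p = r + beta * p
            # Recompute: overwrite the predicted value with the exact inner
            # product before it is used again in the next iteration.
            nu = r @ r
        return x

In exact arithmetic the predicted and recomputed values coincide (since <r_new, r_new> = nu - 2*alpha*<r, A p> + alpha^2*<A p, A p> and <r, A p> = <p, A p> by A-conjugacy); recomputing matters only in finite precision, which is why the observed convergence stays close to that of standard CG.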