
    Adaptive precision in block-Jacobi preconditioning for iterative sparse linear system solvers

    This is the peer reviewed version of the following article: Anzt, H, Dongarra, J, Flegar, G, Higham, NJ, Quintana-Ortí, ES. Adaptive precision in block-Jacobi preconditioning for iterative sparse linear system solvers. Concurrency Computat Pract Exper. 2019; 31:e4460, which has been published in final form at https://doi.org/10.1002/cpe.4460. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving.

    We propose an adaptive scheme to reduce communication overhead caused by data movement by selectively storing the diagonal blocks of a block-Jacobi preconditioner in different precision formats (half, single, or double). This specialized preconditioner can then be combined with any Krylov subspace method for the solution of sparse linear systems to perform all arithmetic in double precision. We assess the effects of the adaptive precision preconditioner on the iteration count and data transfer cost of a preconditioned conjugate gradient solver. A preconditioned conjugate gradient method is, in general, a memory bandwidth-bound algorithm, and therefore its execution time and energy consumption are largely dominated by the costs of accessing the problem's data in memory. Given this observation, we propose a model that quantifies the time and energy savings of our approach based on the assumption that these two costs depend linearly on the bit length of a floating point number.
Furthermore, we use a number of test problems from the SuiteSparse matrix collection to estimate the potential benefits of the adaptive block-Jacobi preconditioning scheme.

    Funding: Impuls und Vernetzungsfond of the Helmholtz Association, Grant/Award Number: VH-NG-1241; MINECO and FEDER, Grant/Award Number: TIN2014-53495-R; H2020 EU FETHPC Project, Grant/Award Number: 732631; MathWorks; Engineering and Physical Sciences Research Council, Grant/Award Number: EP/P020720/1; Exascale Computing Project, Grant/Award Number: 17-SC-20-SC.

    Anzt, H.; Dongarra, J.; Flegar, G.; Higham, NJ.; Quintana Ortí, ES. (2019). Adaptive precision in block-Jacobi preconditioning for iterative sparse linear system solvers. Concurrency and Computation Practice and Experience. 31(6):1-12. https://doi.org/10.1002/cpe.4460
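
The linear cost assumption described in this abstract can be illustrated with a small calculation. The sketch below is not the authors' model: the function name and the example entry counts are hypothetical, and it covers only the preconditioner data, assuming that transfer cost is proportional to the number of bits stored.

```python
# Bits per stored value for the three formats named in the abstract.
BITS = {"half": 16, "single": 32, "double": 64}


def relative_preconditioner_traffic(values_per_format):
    """Return preconditioner data volume relative to all-double storage.

    values_per_format maps a format name to the number of block entries
    stored in that format. Cost is assumed linear in the bit length.
    """
    total_values = sum(values_per_format.values())
    if total_values == 0:
        raise ValueError("no values given")
    adaptive_bits = sum(BITS[fmt] * n for fmt, n in values_per_format.items())
    return adaptive_bits / (BITS["double"] * total_values)


# Hypothetical split: 100 entries in half, 200 in single, 100 in double.
ratio = relative_preconditioner_traffic(
    {"half": 100, "single": 200, "double": 100}
)
print(ratio)
```

Under this assumption, any saving applies only to the share of solver traffic spent on the preconditioner. The matrix and vector data still move in double precision.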

    Adaptive Precision Block-Jacobi for High Performance Preconditioning in the Ginkgo Linear Algebra Software

    © ACM, 2021. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Transactions on Mathematical Software, Volume 47, Issue , June 2021, http://doi.acm.org/10.1145/3441850

    The use of mixed precision in numerical algorithms is a promising strategy for accelerating scientific applications. In particular, the adoption of specialized hardware and data formats for low-precision arithmetic in high-end GPUs (graphics processing units) has motivated numerous efforts aiming at carefully reducing the working precision in order to speed up the computations. For algorithms whose performance is bound by the memory bandwidth, the idea of compressing their data before (and after) memory accesses has received considerable attention. One idea is to store an approximate operator, such as a preconditioner, in lower than working precision, ideally without impacting the algorithm output. We realize the first high-performance implementation of an adaptive precision block-Jacobi preconditioner which selects the precision format used to store the preconditioner data on the fly, taking into account the numerical properties of the individual preconditioner blocks. We implement the adaptive block-Jacobi preconditioner as production-ready functionality in the Ginkgo linear algebra library, considering not only the precision formats that are part of the IEEE standard, but also customized formats which optimize the length of the exponent and significand to the characteristics of the preconditioner blocks. Experiments run on a state-of-the-art GPU accelerator show that our implementation offers attractive runtime savings.

    H. Anzt and T. Cojean were supported by the "Impuls und Vernetzungsfond of the Helmholtz Association" under grant VH-NG-1241. G. Flegar and E. S. Quintana-Ortí were supported by project TIN2017-82972-R of the MINECO and FEDER and the H2020 EU FETHPC Project 732631 "OPRECOMP". This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. The authors want to acknowledge the access to the Piz Daint supercomputer at the Swiss National Supercomputing Centre (CSCS) granted under the project #d100 and the Summit supercomputer at the Oak Ridge National Lab (ORNL).

    Flegar, G.; Anzt, H.; Cojean, T.; Quintana-Ortí, ES. (2021). Adaptive Precision Block-Jacobi for High Performance Preconditioning in the Ginkgo Linear Algebra Software. ACM Transactions on Mathematical Software. 47(2):1-28. https://doi.org/10.1145/3441850
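
To make the idea of choosing a storage format from the numerical properties of each block concrete, here is a minimal sketch in plain Python. It is not Ginkgo's API and not necessarily the criterion used in the work: the rule (pick the narrowest format whose unit roundoff times the block's condition number stays below a tolerance), the tolerance value, and all function names are assumptions made for illustration.

```python
# Unit roundoff of each IEEE format: 2**-p for p significand bits.
UNIT_ROUNDOFF = {"half": 2.0**-11, "single": 2.0**-24, "double": 2.0**-53}


def invert(block):
    """Invert a small dense block by Gauss-Jordan with partial pivoting."""
    n = len(block)
    # Augment the block with the identity: [A | I].
    aug = [
        [float(v) for v in row] + [float(i == j) for j in range(n)]
        for i, row in enumerate(block)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("singular block")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    # The right half now holds the inverse.
    return [row[n:] for row in aug]


def inf_norm(matrix):
    """Infinity norm: the largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in matrix)


def choose_format(block, tol=1e-2):
    """Pick the narrowest format with cond(block) * unit_roundoff <= tol.

    The rule and the default tolerance are illustrative assumptions.
    """
    cond = inf_norm(block) * inf_norm(invert(block))
    for fmt in ("half", "single", "double"):
        if cond * UNIT_ROUNDOFF[fmt] <= tol:
            return fmt
    return "double"  # fall back to working precision


print(choose_format([[4.0, 1.0], [1.0, 3.0]]))     # well conditioned
print(choose_format([[1.0, 1.0], [1.0, 1.0001]]))  # nearly singular
```

A practical implementation would also have to check that the block entries fit the narrower exponent range of the chosen format, since half precision overflows and underflows far sooner than double.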

    Mixed Precision Iterative Refinement with Adaptive Precision Sparse Approximate Inverse Preconditioning

    Hardware trends have motivated the development of mixed precision algorithms in numerical linear algebra, which aim to decrease runtime while maintaining acceptable accuracy. One recent advance is an adaptive precision sparse matrix-vector product routine, which may be used to accelerate the solution of sparse linear systems by iterative methods. This approach also applies to inexact preconditioners, such as the sparse approximate inverse preconditioners used in Krylov subspace methods. In this work, we develop an adaptive precision sparse approximate inverse preconditioner and demonstrate its use within a five-precision GMRES-based iterative refinement method. We call this algorithm variant BSPAI-GMRES-IR. We then analyze the conditions for the convergence of BSPAI-GMRES-IR, and determine settings under which BSPAI-GMRES-IR will produce backward and forward errors similar to those of the existing SPAI-GMRES-IR method, which does not use adaptive precision in preconditioning. Our numerical experiments show that this approach can potentially reduce the cost of storing and applying sparse approximate inverse preconditioners, although a significant reduction in cost may come at the expense of an increase in the number of GMRES iterations required for convergence.
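
The adaptive precision idea for sparse operators mentioned in this abstract can be sketched as a per-entry bucketing step: entries that are small relative to the largest entry in their row contribute proportionally less rounding error, so they may tolerate a narrower format. The rule below, the target accuracy `eps`, and the function name are assumptions for illustration only, not the criterion analyzed in the work.

```python
# Candidate formats from narrowest to widest, with their unit roundoffs.
FORMATS = [("half", 2.0**-11), ("single", 2.0**-24), ("double", 2.0**-53)]


def bucket_row(row, eps=2.0**-24):
    """Assign each nonzero of a sparse row to a storage format.

    row maps column index -> value. An entry goes to the narrowest format
    whose rounding error bound u * |value| does not exceed
    eps * (largest magnitude in the row). This rule is an assumption.
    """
    if not row:
        return {}
    scale = max(abs(v) for v in row.values())
    assignment = {}
    for col, value in row.items():
        chosen = "double"  # widest format if nothing narrower qualifies
        for name, u in FORMATS:
            if u * abs(value) <= eps * scale:
                chosen = name
                break
        assignment[col] = chosen
    return assignment


# Hypothetical row with one dominant, one moderate, and one tiny entry.
print(bucket_row({0: 1.0, 3: 1e-5, 7: 0.5}))
```

With `eps` set to the single precision unit roundoff, the dominant entries stay in single while the tiny entry drops to half. Tightening `eps` pushes more entries into wider formats, which mirrors the trade-off between storage cost and iteration count that the abstract describes.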