9,369 research outputs found

    Numerically Stable Recurrence Relations for the Communication Hiding Pipelined Conjugate Gradient Method

    Full text link
    Pipelined Krylov subspace methods (also referred to as communication-hiding methods) have been proposed in the literature as a scalable alternative to classic Krylov subspace algorithms for iteratively computing the solution to a large linear system in parallel. For symmetric and positive definite system matrices the pipelined Conjugate Gradient method outperforms its classic Conjugate Gradient counterpart on large scale distributed memory hardware by overlapping global communication with essential computations like the matrix-vector product, thus hiding global communication. A well-known drawback of the pipelining technique is the (possibly significant) loss of numerical stability. In this work a numerically stable variant of the pipelined Conjugate Gradient algorithm is presented that avoids the propagation of local rounding errors in the finite precision recurrence relations that construct the Krylov subspace basis. The multi-term recurrence relation for the basis vector is replaced by two-term recurrences, improving stability without increasing the overall computational cost of the algorithm. The proposed modification ensures that the pipelined Conjugate Gradient method is able to attain a highly accurate solution independently of the pipeline length. Numerical experiments demonstrate a combination of excellent parallel performance and improved maximal attainable accuracy for the new pipelined Conjugate Gradient algorithm. This work thus resolves one of the major practical restrictions for the useability of pipelined Krylov subspace methods.Comment: 15 pages, 5 figures, 1 table, 2 algorithm

    Analyzing the effect of local rounding error propagation on the maximal attainable accuracy of the pipelined Conjugate Gradient method

    Get PDF
    Pipelined Krylov subspace methods typically offer improved strong scaling on parallel HPC hardware compared to standard Krylov subspace methods for large and sparse linear systems. In pipelined methods the traditional synchronization bottleneck is mitigated by overlapping time-consuming global communications with useful computations. However, to achieve this communication hiding strategy, pipelined methods introduce additional recurrence relations for a number of auxiliary variables that are required to update the approximate solution. This paper aims at studying the influence of local rounding errors that are introduced by the additional recurrences in the pipelined Conjugate Gradient method. Specifically, we analyze the impact of local round-off effects on the attainable accuracy of the pipelined CG algorithm and compare to the traditional CG method. Furthermore, we estimate the gap between the true residual and the recursively computed residual used in the algorithm. Based on this estimate we suggest an automated residual replacement strategy to reduce the loss of attainable accuracy on the final iterative solution. The resulting pipelined CG method with residual replacement improves the maximal attainable accuracy of pipelined CG, while maintaining the efficient parallel performance of the pipelined method. This conclusion is substantiated by numerical results for a variety of benchmark problems.Comment: 26 pages, 6 figures, 2 tables, 4 algorithm

    A Lanczos Method for Approximating Composite Functions

    Full text link
    We seek to approximate a composite function h(x) = g(f(x)) with a global polynomial. The standard approach chooses points x in the domain of f and computes h(x) at each point, which requires an evaluation of f and an evaluation of g. We present a Lanczos-based procedure that implicitly approximates g with a polynomial of f. By constructing a quadrature rule for the density function of f, we can approximate h(x) using many fewer evaluations of g. The savings is particularly dramatic when g is much more expensive than f or the dimension of x is large. We demonstrate this procedure with two numerical examples: (i) an exponential function composed with a rational function and (ii) a Navier-Stokes model of fluid flow with a scalar input parameter that depends on multiple physical quantities

    Improved Accuracy and Parallelism for MRRR-based Eigensolvers -- A Mixed Precision Approach

    Get PDF
    The real symmetric tridiagonal eigenproblem is of outstanding importance in numerical computations; it arises frequently as part of eigensolvers for standard and generalized dense Hermitian eigenproblems that are based on a reduction to tridiagonal form. For its solution, the algorithm of Multiple Relatively Robust Representations (MRRR) is among the fastest methods. Although fast, the solvers based on MRRR do not deliver the same accuracy as competing methods like Divide & Conquer or the QR algorithm. In this paper, we demonstrate that the use of mixed precisions leads to improved accuracy of MRRR-based eigensolvers with limited or no performance penalty. As a result, we obtain eigensolvers that are not only equally or more accurate than the best available methods, but also -in most circumstances- faster and more scalable than the competition

    Raman scattering from fractals. Simulation on large structures by the method of moments

    Full text link
    We have employed the method of spectral moments to study the density of vibrational states and the Raman coupling coefficient of large 2- and 3- dimensional percolators at threshold and at higher concentration. We first discuss the over-and under-flow problems of the procedure which arise when -like in the present case- it is necessary to calculate a few thousand moments. Then we report on the numerical results; these show that different scattering mechanisms, all {\it a priori} equally probable in real systems, produce largely different coupling coefficients with different frequency dependence. Our results are compared with existing scaling theories of Raman scattering. The situation that emerges is complex; on the one hand, there is indication that the existing theory is not satisfactory; on the other hand, the simulations above threshold show that in this case the coupling coefficients have very little resemblance, if any, with the same quantities at threshold.Comment: 26 pages, RevTex, 8 figures available on reques

    Preconditioned Locally Harmonic Residual Method for Computing Interior Eigenpairs of Certain Classes of Hermitian Matrices

    Full text link
    We propose a Preconditioned Locally Harmonic Residual (PLHR) method for computing several interior eigenpairs of a generalized Hermitian eigenvalue problem, without traditional spectral transformations, matrix factorizations, or inversions. PLHR is based on a short-term recurrence, easily extended to a block form, computing eigenpairs simultaneously. PLHR can take advantage of Hermitian positive definite preconditioning, e.g., based on an approximate inverse of an absolute value of a shifted matrix, introduced in [SISC, 35 (2013), pp. A696-A718]. Our numerical experiments demonstrate that PLHR is efficient and robust for certain classes of large-scale interior eigenvalue problems, involving Laplacian and Hamiltonian operators, especially if memory requirements are tight
    • …
    corecore