
    Evaluating the Impact of SDC on the GMRES Iterative Solver

    Increasing parallelism and transistor density, along with increasingly tight energy and peak-power constraints, may force exposure of occasionally incorrect computation or storage to application codes. Silent data corruption (SDC) will likely be infrequent, yet a single SDC suffices to make numerical algorithms such as iterative linear solvers cease progressing towards the correct answer. We therefore focus on the resilience of the iterative linear solver GMRES to a single transient SDC. We derive inexpensive checks to detect the effects of an SDC in GMRES that work for a more general SDC model than an assumed bit flip. Our experiments show that when GMRES is used as the inner solver of an inner-outer iteration, it can "run through" SDC of almost any magnitude in the computationally intensive orthogonalization phase; that is, it gets the right answer using faulty data without any roll back being required. Those SDCs it cannot run through are caught by our detection scheme.
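    For illustration only, and not the paper's detection scheme: a minimal Python/SciPy sketch of GMRES used as the inner solver of an outer iterative-refinement loop, where the outer loop recomputes the true residual as a cheap sanity check on each inner result. The test matrix, tolerances, and rejection rule are assumptions made for this sketch.

        # Illustrative sketch only: GMRES (SciPy) as the inner solver of an
        # outer iterative-refinement loop. The outer loop recomputes the true
        # residual, so an inner solve spoiled by a transient fault is either
        # absorbed or rejected and retried. This is NOT the paper's check.
        import numpy as np
        from scipy.sparse import random as sprandom, eye
        from scipy.sparse.linalg import gmres

        n = 500
        A = sprandom(n, n, density=0.01, format="csr") + 10.0 * eye(n, format="csr")
        b = np.ones(n)

        x = np.zeros(n)
        r = b - A @ x
        tol, outer_max = 1e-10, 25
        for k in range(outer_max):
            # inner solve of the correction equation A d = r (a few restart cycles)
            d, info = gmres(A, r, restart=20, maxiter=2)
            x_trial = x + d
            r_trial = b - A @ x_trial          # cheap check: recompute the true residual
            if np.linalg.norm(r_trial) > np.linalg.norm(r):
                continue                       # suspect inner result: reject and retry
            x, r = x_trial, r_trial
            if np.linalg.norm(r) <= tol * np.linalg.norm(b):
                break
        print(k, np.linalg.norm(b - A @ x))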

    Domain decomposition methods for the parallel computation of reacting flows

    Domain decomposition is a natural route to parallel computing for partial differential equation solvers. The subdomains comprising the original domain of definition are assigned to independent processors, at the price of periodic coordination between processors to compute global parameters and to maintain the requisite degree of continuity of the solution at the subdomain interfaces. In the domain-decomposed solution of steady multidimensional systems of PDEs by finite difference methods using a pseudo-transient version of Newton iteration, the only portion of the computation that generally stands in the way of efficient parallelization is the solution of the large, sparse linear systems arising at each Newton step. For Jacobian matrices drawn from an actual two-dimensional reacting flow problem, comparisons are made between relaxation-based linear solvers and preconditioned iterative methods of conjugate gradient and Chebyshev type, focusing on both iteration count and global inner product count. The generalized minimum residual method with block-ILU preconditioning is judged the best serial method among those considered, and parallel numerical experiments on the Encore Multimax demonstrate approximately 10-fold speedup for it on 16 processors.
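    A minimal serial sketch of the combination judged best above, GMRES with an ILU-type preconditioner, assuming SciPy's gmres, spilu, and LinearOperator. The paper's block-ILU acts on domain-decomposed reacting-flow Jacobians; the random test matrix and ILU parameters below are placeholders for illustration.

        # Illustrative serial sketch: GMRES with an incomplete-LU preconditioner
        # via SciPy. The paper's block-ILU acts on domain-decomposed reacting-flow
        # Jacobians; the random test matrix and ILU parameters here are placeholders.
        import numpy as np
        from scipy.sparse import random as sprandom, eye
        from scipy.sparse.linalg import gmres, spilu, LinearOperator

        n = 1000
        A = (sprandom(n, n, density=0.005, format="csc") + 5.0 * eye(n, format="csc")).tocsc()
        b = np.random.default_rng(0).standard_normal(n)

        ilu = spilu(A, drop_tol=1e-4, fill_factor=10)   # incomplete LU factors of A
        M = LinearOperator((n, n), matvec=ilu.solve)    # preconditioner: M x ~ A^{-1} x

        x, info = gmres(A, b, M=M, restart=30, maxiter=20)
        print("info =", info, " residual =", np.linalg.norm(b - A @ x))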

    The Anderson model of localization: a challenge for modern eigenvalue methods

    We present a comparative study of the application of modern eigenvalue algorithms to an eigenvalue problem arising in quantum physics, namely, the computation of a few interior eigenvalues and their associated eigenvectors for the large, sparse, real, symmetric, and indefinite matrices of the Anderson model of localization. We compare the Lanczos algorithm in the 1987 implementation of Cullum and Willoughby with the implicitly restarted Arnoldi method coupled with polynomial and several shift-and-invert convergence accelerators, as well as with a sparse hybrid tridiagonalization method. We demonstrate that for our problem the Lanczos implementation is faster and more memory efficient than the other approaches. This seemingly innocuous problem presents a major challenge for all modern eigenvalue algorithms.
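    A small-scale sketch of the problem class, assuming SciPy: assemble a 3D Anderson Hamiltonian (nearest-neighbour hopping plus random diagonal disorder) and compute a few interior eigenpairs near the band centre with ARPACK's shift-and-invert mode through eigsh. The lattice size and disorder strength are illustrative choices, and this is not the Cullum-Willoughby Lanczos implementation the study found fastest.

        # Small-scale illustration of the eigenvalue problem: a 3D Anderson
        # Hamiltonian (nearest-neighbour hopping plus random diagonal disorder),
        # with a few interior eigenpairs near E = 0 computed by ARPACK's
        # shift-and-invert mode through SciPy. Lattice size and disorder strength
        # are illustrative; this is not the Cullum-Willoughby Lanczos code.
        import numpy as np
        import scipy.sparse as sp
        from scipy.sparse.linalg import eigsh

        L, W = 12, 16.5                 # linear lattice size, disorder strength
        rng = np.random.default_rng(0)
        hop1d = sp.diags([np.ones(L - 1), np.ones(L - 1)], [-1, 1], format="csr")
        I = sp.identity(L, format="csr")
        H = (sp.kron(sp.kron(hop1d, I), I)    # hopping along x
             + sp.kron(sp.kron(I, hop1d), I)  # hopping along y
             + sp.kron(sp.kron(I, I), hop1d)  # hopping along z
             + sp.diags(W * (rng.random(L**3) - 0.5))).tocsc()

        # shift-and-invert about sigma = 0 targets interior eigenvalues near the band centre
        vals, vecs = eigsh(H, k=5, sigma=0.0, which="LM")
        print(np.sort(vals))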

    A biconjugate gradient type algorithm on massively parallel architectures

    The biconjugate gradient (BCG) method is the natural generalization of the classical conjugate gradient algorithm for Hermitian positive definite matrices to general non-Hermitian linear systems. Unfortunately, the original BCG algorithm is susceptible to possible breakdowns and numerical instabilities. Recently, Freund and Nachtigal have proposed a novel BCG-type approach, the quasi-minimal residual method (QMR), which overcomes the problems of BCG. Here, an implementation of QMR is presented based on an s-step version of the nonsymmetric look-ahead Lanczos algorithm. The main feature of the s-step Lanczos algorithm is that, in general, all inner products except one can be computed in parallel at the end of each block, unlike the standard Lanczos process, where inner products are generated sequentially. The resulting implementation of QMR is particularly attractive on massively parallel SIMD architectures, such as the Connection Machine.
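    A minimal sketch of the underlying solver, assuming SciPy: its qmr routine is the ordinary sequential QMR algorithm, shown here on a generic nonsymmetric sparse system. The s-step, look-ahead Lanczos formulation that batches the inner products for SIMD machines is not reproduced in this sketch.

        # Illustrative sketch of the underlying solver: SciPy's qmr is the ordinary
        # sequential QMR algorithm, applied here to a generic nonsymmetric sparse
        # system. The s-step, look-ahead Lanczos formulation that batches the inner
        # products for SIMD machines is not reproduced here.
        import numpy as np
        from scipy.sparse import random as sprandom, eye
        from scipy.sparse.linalg import qmr

        n = 800
        A = sprandom(n, n, density=0.005, format="csr") + 4.0 * eye(n, format="csr")
        b = np.random.default_rng(1).standard_normal(n)

        x, info = qmr(A, b, maxiter=500)
        print("info =", info, " relative residual =",
              np.linalg.norm(b - A @ x) / np.linalg.norm(b))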