Exploiting Data Representation for Fault Tolerance

Elliott, James; Hoemmen, Mark; Mueller, Frank

Authors: James Elliott, Mark Hoemmen, Frank Mueller
Publication date: 9 December 2013
Publisher: 'Elsevier BV'

Abstract

We explore the link between data representation and soft errors in dot products. We present an analytic model for the absolute error introduced when a soft error corrupts a single bit in an IEEE-754 floating-point number. We show how this model relates to the fundamental linear algebra concepts of normalization and matrix equilibration. We present a case study illustrating that the probability of experiencing a large error in a dot product is minimized when both vectors are normalized. Furthermore, we show that when the data are normalized, the absolute error is either less than one or very large, which allows large errors to be detected. We demonstrate how this finding can be used by instrumenting the GMRES iterative solver. We count all possible errors that can be introduced through faults in arithmetic in the computationally intensive orthogonalization phase, and show that when scaling is used the absolute error can be bounded above by one.
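
The effect of a single bit flip on a stored value can be explored with a small experiment. The sketch below is not the authors' analytic model; it is a minimal illustration, assuming IEEE-754 binary64 values and using hypothetical helper names (`flip_bit`, `bit_flip_errors`), that enumerates all 64 possible single-bit corruptions of one number and records the absolute error each one causes.

```python
import struct


def flip_bit(x: float, bit: int) -> float:
    """Return x with one bit of its IEEE-754 binary64 encoding flipped.

    Bit 0 is the least significant mantissa bit, bits 52-62 hold the
    exponent, and bit 63 is the sign. (Helper name is illustrative.)
    """
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    (corrupted,) = struct.unpack("<d", struct.pack("<Q", bits ^ (1 << bit)))
    return corrupted


def bit_flip_errors(x: float) -> list:
    """Absolute error |x - x'| for each of the 64 single-bit flips of x.

    Note: for some inputs a flip can produce inf or NaN, in which case
    the corresponding entry is inf or NaN as well.
    """
    return [abs(x - flip_bit(x, b)) for b in range(64)]


if __name__ == "__main__":
    for value in (0.5, 3.0):
        errors = bit_flip_errors(value)
        # Errors that are neither small nor enormous are the hard ones to detect.
        ambiguous = [e for e in errors if 1.0 < e < 1e300]
        print(value, "largest error:", max(errors), "ambiguous flips:", len(ambiguous))
```

For a value with magnitude below one, such as 0.5, every flip in this experiment yields an error of at most one except the flip of the highest exponent bit, which yields an enormous error. For a value such as 3.0, several flips produce errors of intermediate size, which is the situation that normalization is meant to avoid.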
