Reduced Precision Checking to Detect Errors in Floating Point Arithmetic
In this paper, we use reduced precision checking (RPC) to detect errors in
floating point arithmetic. Prior work explored RPC for addition and
multiplication. In this work, we extend RPC to a complete floating point unit
(FPU), including division and square root, and we present precise analyses of
the errors undetectable with RPC that show bounds that are smaller than prior
work. We implement RPC for a complete FPU in RTL and experimentally evaluate
its error coverage and cost.
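
The general idea of reduced precision checking can be illustrated in software. The following is a minimal sketch, not the RTL design evaluated in the paper: it assumes a binary64 multiplication checked against a binary32 recomputation, and the names `to_f32`, `rpc_check_mul`, `flip_bit` and the tolerance `2**-20` are hypothetical choices made only for this illustration.

```python
import struct


def to_f32(x: float) -> float:
    """Round a binary64 value to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def rpc_check_mul(a: float, b: float, result: float,
                  rel_tol: float = 2.0 ** -20) -> bool:
    """Return True if `result` agrees with a reduced-precision a*b.

    The checker rounds the operands and the product to binary32, so it
    only sees about 24 significant bits. The tolerance is an assumed
    value chosen to sit above the checker's own rounding error.
    """
    check = to_f32(to_f32(a) * to_f32(b))
    if check == 0.0:
        return abs(result) <= rel_tol
    return abs(result - check) <= rel_tol * abs(check)


def flip_bit(x: float, bit: int) -> float:
    """Flip one bit (0 = least significant) of a binary64 value."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    return struct.unpack("<d", struct.pack("<Q", bits ^ (1 << bit)))[0]


a, b = 1.2345, 6.789
product = a * b
print(rpc_check_mul(a, b, product))                 # fault-free result
print(rpc_check_mul(a, b, flip_bit(product, 50)))   # high mantissa bit flipped
print(rpc_check_mul(a, b, flip_bit(product, 5)))    # low mantissa bit flipped
```

In this sketch a flip in a high-order mantissa bit falls outside the tolerance and is flagged, while a flip in a low-order bit falls below what the reduced-precision checker can resolve and passes. That second case is the kind of undetectable error whose magnitude the abstract says the paper bounds.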
FPDetect: Efficient Reasoning About Stencil Programs Using Selective Direct Evaluation
We present FPDetect, a low overhead approach for detecting logical errors and
soft errors affecting stencil computations without generating false positives.
We develop an offline analysis that tightly estimates the number of
floating-point bits preserved across stencil applications. This estimate
rigorously bounds the values expected in the data space of the computation.
Violations of this bound can be attributed with certainty to errors. FPDetect
helps synthesize error detectors customized for user-specified levels of
accuracy and coverage. FPDetect also enables overhead reduction techniques
based on deploying these detectors coarsely in space and time. Experimental
evaluations demonstrate the practicality of our approach.
Comment: Accepted in ACM Transactions on Architecture and Code Optimization
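
The bound-checking idea described in the FPDetect abstract can be sketched as follows. This is an illustration under stated assumptions, not FPDetect's actual analysis: it uses a hypothetical 3-point periodic stencil with weights `W`, checks a single stencil application rather than several, and takes the textbook rounding-error bound for a 3-term dot product, gamma_3 = 3u / (1 - 3u) with u = 2^-53, in place of the paper's offline estimate. The names `stencil_step` and `detect`, and the `stride` parameter that deploys detectors coarsely in space, are invented for this example.

```python
from fractions import Fraction

W = (0.25, 0.5, 0.25)              # hypothetical stencil weights
GAMMA3 = Fraction(3, 2 ** 53 - 3)  # 3u / (1 - 3u) with u = 2**-53


def stencil_step(x):
    """Apply the 3-point stencil once with periodic boundaries."""
    n = len(x)
    return [W[0] * x[i - 1] + W[1] * x[i] + W[2] * x[(i + 1) % n]
            for i in range(n)]


def detect(x_old, x_new, stride=4):
    """Return the sampled indices whose value violates the error bound.

    Every `stride`-th point is re-evaluated exactly with rational
    arithmetic. A fault-free floating-point result must lie within
    GAMMA3 * sum(|w * v|) of the exact value, so any violation cannot
    be explained by rounding alone.
    """
    n = len(x_old)
    flagged = []
    for i in range(0, n, stride):
        nbrs = (x_old[i - 1], x_old[i], x_old[(i + 1) % n])
        terms = [Fraction(w) * Fraction(v) for w, v in zip(W, nbrs)]
        exact = sum(terms)
        bound = GAMMA3 * sum(abs(t) for t in terms)
        if abs(Fraction(x_new[i]) - exact) > bound:
            flagged.append(i)
    return flagged


x = [0.1 * i * i for i in range(16)]
y = stencil_step(x)
print(detect(x, y))   # fault-free sweep
y[8] += 1e-3          # inject an error at a sampled point
y[5] += 1e-3          # inject an error at an unsampled point
print(detect(x, y))
```

Because the bound is a rigorous worst case for this stencil, a fault-free sweep is never flagged. The error injected at index 5 goes unnoticed because that point is not sampled, which illustrates the trade-off between overhead and coverage that coarse detector placement implies.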