3 research outputs found

    Temporal memoization for energy-efficient timing error recovery in GPGPUs

    No full text

    Temporal memoization for energy-efficient timing error recovery in GPGPUs

    No full text
    Abstract—Manufacturing and environmental variability lead to timing errors in computing systems that are typically corrected by error detection and correction mechanisms at the circuit level. The cost and speed of recovery can be improved by memoization-based optimization methods that exploit spatial or temporal parallelisms in suitable computing fabrics such as general-purpose graphics processing units (GPGPUs). We propose here a temporal memoization technique for use in floating-point units (FPUs) in GPGPUs that uses value locality inside data-parallel programs. The technique recalls (memorizes) the context of error-free execution of an instruction on a FPU. To enable scalable and independent recovery, a single-cycle lookup table (LUT) is tightly coupled to every FPU to maintain contexts of recent error-free executions. The LUT reuses these memorized contexts to exactly, or approximately, correct errant FP instructions based on application needs. In real-world applications, the temporal memoization technique achieves an average energy saving of 8%–28 % for a wide range of timing error rates (0%–4%) and outperforms recent advances in resilient architectures. This technique also enhances robustness in the voltage overscaling regime and achieves relative average energy saving of 66 % with 11 % voltage overscaling. Keywords—Variability; timing errors; error recovery; temporal memoization; value locality; GPGPU I
    corecore