High-fidelity error injection and acceleration techniques

Abstract

As technology scales down, hardware errors that silently corrupt application results are becoming more likely. Evaluating the resilience of applications against hardware errors is thus of significant concern. Current evaluation techniques based on error injection are either low-fidelity or inefficient in their use of computing resources. This dissertation demonstrates that a sophisticated integration of injectors across abstraction layers, combined with novel sampling algorithms, can significantly improve both fidelity and efficiency. Specifically, this dissertation describes an open-source instruction-level error injector that generates high-fidelity hardware errors caused by particle strikes and voltage droops. Two acceleration techniques, nested Monte Carlo and Injection-Point Overprovisioning, are proposed to speed up error injection campaigns by one to two orders of magnitude. This dissertation also answers the question of when high fidelity is needed to evaluate the impact of hardware errors on applications and the effectiveness of error detectors.

Electrical and Computer Engineering
