research

An Approach for the Assessment of System Upset Resilience

Abstract

This report describes an approach for the assessment of upset resilience that is applicable to systems in general, including safety-critical, real-time systems. For this work, resilience is defined as the ability to preserve and restore service availability and integrity under stated conditions of configuration, functional inputs and environmental conditions. To enable a quantitative approach, we define novel system service degradation metrics and propose a new mathematical definition of resilience. These behavioral-level metrics are based on the fundamental service classification criteria of correctness, detectability, symmetry and persistence. This approach consists of a Monte-Carlo-based stimulus injection experiment, on a physical implementation or an error-propagation model of a system, to generate a system response set that can be characterized in terms of dimensional error metrics and integrated to form an overall measure of resilience. We expect this approach to be helpful in gaining insight into the error containment and repair capabilities of systems for a wide range of conditions

    Similar works