
A Game-Theoretic Foundation for the Maximum Software Resilience against Dense Errors

Abstract

Safety-critical systems need to maintain their functionality in the presence of multiple errors caused by component failures or disastrous environment events. We propose a game-theoretic foundation for synthesizing control strategies that maximize the resilience of a software system in defense against a realistic error model. The new control objective of such a game is called k-resilience. To be k-resilient, a system needs to recover rapidly from infinitely many waves of a small number (up to k) of closely spaced errors, provided that these blocks of up to k errors are separated by short time intervals that the system can use to recover. We first argue why we believe this to be the right level of abstraction for safety-critical systems when local faults are few and far between. We then show how the analysis of k-resilience problems can be formulated as a model-checking problem for a mild extension of the alternating-time μ-calculus (AMC). The witness for k-resilience, which the model checker can provide, can be used to derive control strategies that are optimal with respect to resilience. We show that the computational complexity of constructing such optimal control strategies is low and demonstrate the feasibility of our approach through an implementation and experimental results.
