
Are your lights off? Using problem frames to diagnose system failures

Abstract

This paper reports on our experience of investigating the role of software systems in the power blackout that affected parts of the United States and Canada on 14 August 2003. Based on a detailed study of the official report on the blackout, our investigation aimed to bring out requirements engineering lessons that can inform development practices for dependable software systems. Since the causes of failures are typically rooted in the complex structures of software systems and their world contexts, we deployed and evaluated a framework that looks beyond the scope of software and into its physical context, directing attention to places in the system structures where failures are likely to occur. We report that (i) Problem Frames were effective in diagnosing the causes of failures and documenting them in a schematic and accessible way, and (ii) errors in addressing the concerns of biddable domains, model building problems, and monitoring problems contributed to the blackout.
