thesis

A framework for dynamic safety and risk management modeling in complex engineering systems

Abstract

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, June 2007. "February 2007." Includes bibliographical references (p. 328-338).

Almost all traditional hazard analysis or risk assessment techniques, such as failure modes and effects analysis (FMEA), fault tree analysis (FTA), and probabilistic risk analysis (PRA), rely on a chain-of-event paradigm of accident causation. Event-based techniques have some limitations for the study of modern engineering systems. Specifically, they are not suited to handle complex software-intensive systems, complex human-machine interactions, and systems-of-systems with distributed decision-making that cut across both physical and organizational boundaries. STAMP (System-Theoretic Accident Model and Processes) is a comprehensive accident model created by Nancy Leveson that is based on systems theory. It draws on concepts from engineering, mathematics, cognitive and social psychology, organizational theory, political science, and economics. The general notion in STAMP is that accidents result from inadequate enforcement of safety constraints in design, development, and operation. STAMP includes traditional failure-based models as a subset, but goes beyond physical failures to include causal factors involving dysfunctional interactions among non-failing components; software and logic design errors; errors in complex human decision-making; various organizational characteristics such as workforce, safety processes and standards, and contracting; and other managerial, social, organizational, and cultural factors. The main contribution of this thesis is the augmentation of STAMP with a dynamic executable modeling framework in order to further improve safety in the development and operation of complex engineering systems.
This executable modeling framework: 1) enables the dynamic analysis of safety-related decision-making in complex systems, 2) assists with the design and testing of non-intuitive policies and processes to better mitigate risks and prevent time-dependent risk increase, and 3) enables the identification of technical and organizational factors to detect and monitor states of increasing risk before an accident occurs. The modeling framework is created by combining STAMP safety control structures with system dynamics modeling principles. A component-based model-building methodology is proposed to facilitate the building of customized STAMP-based dynamic risk management models and make them accessible to managers and engineers with limited simulation experience. A library of generic executable components is provided as a basis for model creation, refinement, and validation. A toolset is assembled to identify risk increase patterns, analyze time-dependent risks, assist engineers and managers in safety-related decision-making, create and test risk mitigation actions and policies, and monitor the system for states of increasing risk. The usefulness of the new framework is demonstrated in two independent projects: 1) a risk analysis of the NASA Independent Technical Authority (ITA), an organization mandated by the Columbia Accident Investigation Board (CAIB) to provide independent safety oversight of space shuttle operations, and 2) a risk management study for the Exploration Systems Mission Directorate (ESMD) at NASA. For these two projects, model refinement, validation, and analysis required extensive data collection and interactions with the NASA workforce. Over 45 interviews were conducted at five NASA centers (HQ, MSFC, KSC, JSC, and LaRC). Interviewees included representatives from the Office of the Administrator, the Office of the Chief Engineer, the Office of Safety and Mission Assurance, ESMD Directorate Offices, Program/Project Offices, and many others.
Among other data sources, 200 pages of interview transcripts were compiled and used for model creation and validation activities. Specific risks analyzed include: 1) NASA workforce and knowledge management issues, 2) the impact of various levels of outsourcing, 3) the impact of safety priority on design, and 4) the impact of requirements change on safety and schedule during development.

by Nicolas Dulac.
Ph.D.
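
The abstract's idea of time-dependent risk increase can be illustrated with a very small stock-and-flow simulation of the kind used in system dynamics. The sketch below is not taken from the thesis: the single `risk` stock, the `erosion_rate` and `mitigation_rate` parameters, and the function name `simulate_risk` are all hypothetical, chosen only to show how an executable model lets a mitigation policy be tested over time.

```python
def simulate_risk(steps=400, dt=0.25, erosion_rate=0.05,
                  mitigation_rate=0.0, initial_risk=0.1):
    """Euler-integrate one hypothetical stock, `risk`, bounded in [0, 1].

    Assumed dynamics (illustrative only, not the thesis model):
        d(risk)/dt = erosion_rate * (1 - risk) - mitigation_rate * risk
    """
    risk = initial_risk
    history = [risk]
    for _ in range(steps):
        inflow = erosion_rate * (1.0 - risk)   # safety margins erode over time
        outflow = mitigation_rate * risk       # effect of a mitigation policy
        risk += dt * (inflow - outflow)        # explicit Euler step
        history.append(risk)
    return history


if __name__ == "__main__":
    baseline = simulate_risk()                      # no mitigation policy
    mitigated = simulate_risk(mitigation_rate=0.15)  # hypothetical policy
    print(f"final risk without mitigation: {baseline[-1]:.3f}")
    print(f"final risk with mitigation:    {mitigated[-1]:.3f}")
```

Under these assumed dynamics the unmitigated stock drifts steadily upward, while the mitigated run settles at the balance point `erosion_rate / (erosion_rate + mitigation_rate)`. Comparing the two trajectories is a toy version of creating and testing a risk mitigation policy in simulation.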
