    Software plays an increasingly important role in modern safety-critical systems. Although research has been done to integrate software into the classical Probabilistic Risk Assessment (PRA) framework, current PRA practice overwhelmingly neglects the contribution of software to system risk. The objective of this research is to develop a methodology to integrate software contributions in the Dynamic Probabilistic Risk Assessment (DPRA) environment. DPRA is considered to be the next generation of PRA techniques. It is a set of methods and techniques in which simulation models that represent the behavior of the elements of a system are exercised in order to identify risks and vulnerabilities of the system. DPRA allows consideration of dynamic interactions of system elements and physical variables. The fact remains, however, that modeling software for use in the DPRA framework is also quite complex, and very little has been done to address the question directly and comprehensively. This dissertation describes a framework and a set of techniques to extend the DPRA approach to allow consideration of the software contributions to system risk. The framework includes a software representation, an approach to incorporate the software representation into the DPRA environment SimPRA, and an experimental demonstration of the methodology. This dissertation also proposes a framework to simulate multi-level objects in the simulation-based DPRA environment. This is a new methodology to address the state explosion problem. The results indicate that the DPRA simulation performance is improved using the new approach. The entire methodology is implemented in the SimPRA software. An easy-to-use tool is developed to help the analyst build the software model. This study is the first systematic effort to integrate software risk contributions into the dynamic PRA environment

    Use of limited data to construct Bayesian networks for probabilistic risk assessment.

    Probabilistic Risk Assessment (PRA) is a fundamental part of safety/quality assurance for nuclear power and nuclear weapons. Traditional PRA very effectively models complex hardware system risks using binary probabilistic models. However, traditional PRA models are not flexible enough to accommodate non-binary soft-causal factors, such as digital instrumentation & control, passive components, aging, common cause failure, and human errors. Bayesian Networks offer the opportunity to incorporate these risks into the PRA framework. This report describes the results of an early-career LDRD project titled “Use of Limited Data to Construct Bayesian Networks for Probabilistic Risk Assessment”. The goal of the work was to establish the capability to develop Bayesian Networks from sparse data, and to demonstrate this capability by producing a data-informed Bayesian Network for use in Human Reliability Analysis (HRA) as part of nuclear power plant Probabilistic Risk Assessment (PRA). This report summarizes the research goal and major products of the research
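
    To illustrate what building network parameters from sparse data can involve, the sketch below combines a handful of observations with a prior to fill one conditional probability table entry per parent state. This is an assumed, generic Beta-binomial update, not necessarily the method used in the report; the stress levels, counts, and prior weights are all hypothetical.

```python
from typing import Dict, Tuple


def posterior_mean(errors: int, trials: int, alpha: float, beta: float) -> float:
    """Posterior mean of an error probability under a Beta(alpha, beta) prior."""
    return (errors + alpha) / (trials + alpha + beta)


# Hypothetical sparse observations: (errors, trials) for each stress level.
observations: Dict[str, Tuple[int, int]] = {"low": (0, 4), "high": (2, 5)}

# Hypothetical expert prior: mean 0.05, weighted like 10 pseudo-observations.
ALPHA, BETA = 0.5, 9.5

# One entry of the conditional probability table P(error | stress) per level.
cpt = {
    level: posterior_mean(errors, trials, ALPHA, BETA)
    for level, (errors, trials) in observations.items()
}

for level, p in cpt.items():
    print(f"P(error | stress={level}) = {p:.4f}")
```

    With so few trials, the prior keeps the "low" estimate away from exactly zero, which is the practical reason a prior is often paired with sparse data.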

    Method and system for dynamic probabilistic risk assessment

    The DEFT methodology, system, and computer-readable medium extend the applicability of the Probabilistic Risk Assessment (PRA) methodology to computer-based systems by allowing Dynamic Fault Tree (DFT) nodes as pivot nodes in the Event Tree (ET) model. DEFT includes a mathematical model and solution algorithm, and supports all common PRA analysis functions and cutsets. Additional capabilities enabled by the DFT include modularization, phased mission analysis, sequence dependencies, and imperfect coverage
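
    The idea of using fault trees as pivot nodes in an event tree can be sketched in a much simplified, static form. The example below is not the DEFT solution algorithm: it assumes independent basic events and independent pivot nodes, uses only static AND/OR gates (no dynamic gates), and all pivot names and probabilities are hypothetical.

```python
from itertools import product
from math import prod


def or_gate(*probs: float) -> float:
    """Failure probability of an OR gate over independent inputs."""
    return 1.0 - prod(1.0 - p for p in probs)


def and_gate(*probs: float) -> float:
    """Failure probability of an AND gate over independent inputs."""
    return prod(probs)


# Hypothetical pivot nodes, each quantified by a small static fault tree.
pivot_failure = {
    # Power fails if the bus fails OR both generators fail.
    "power": or_gate(0.01, and_gate(0.1, 0.1)),
    # Cooling fails only if both pumps fail.
    "cooling": and_gate(0.05, 0.05),
}


def sequence_probabilities(pivots: dict) -> dict:
    """Enumerate event tree sequences; keys are per-pivot 'failed' flags."""
    names = list(pivots)
    sequences = {}
    for outcome in product((False, True), repeat=len(names)):
        p = 1.0
        for name, failed in zip(names, outcome):
            p *= pivots[name] if failed else 1.0 - pivots[name]
        sequences[outcome] = p
    return sequences


sequences = sequence_probabilities(pivot_failure)
for outcome, p in sequences.items():
    print(dict(zip(pivot_failure, outcome)), f"{p:.6f}")
```

    Each key is a tuple of failure flags in pivot order, so `sequences[(True, True)]` is the conditional probability that both pivots fail given the initiating event.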

    "Making Safety Happen" Through Probabilistic Risk Assessment at NASA

    NASA is using Probabilistic Risk Assessment (PRA) as one of the tools in its Safety & Mission Assurance (S&MA) tool belt to identify and quantify risks associated with human spaceflight. This paper discusses some of the challenges and benefits associated with developing and using PRA for NASA human space programs. Some programs have entered operation prior to developing a PRA, while some have implemented PRA from the start of the program. It has been observed that the earlier a design change is made in the concept or design phase, the less impact it has on cost and schedule. Not finding risks until the operation phase yields much costlier design changes and major delays, which can result in discussions of just accepting the risk. Risk contributors identified by PRA are not just associated with hardware failures. They include but are not limited to crew fatality due to medical causes, the environment the vehicle and crew are exposed to, the software being used, and the reliability of the crew performing required actions. For programs that entered operation before developing a PRA, PRA can still provide a benefit for operations and future design trades, but implementing PRA from the start of the program provides the added benefit of informing design and reducing risk early in program development. Currently, NASA's International Space Station (ISS) program is in its 20th year of on-orbit operations around the Earth, and NASA has several new programs in the design phase preparing to enter the operation phase, all of which have active (or living) PRAs. These programs incorporate PRA as part of their Risk-Informed Decision Making (RIDM) process. For new NASA human spaceflight programs, discussion begins with mission concept, establishing requirements, and forming the PRA team, and continues through the design cycles into the operational phase. Several examples of PRA-related applications and observed lessons are included

    Probabilistic Risk Assessment Procedures Guide for NASA Managers and Practitioners (Second Edition)

    Probabilistic Risk Assessment (PRA) is a comprehensive, structured, and logical analysis method aimed at identifying and assessing risks in complex technological systems for the purpose of cost-effectively improving their safety and performance. NASA's objective is to better understand and effectively manage risk, and thus more effectively ensure mission and programmatic success, and to achieve and maintain high safety standards at NASA. NASA intends to use risk assessment in its programs and projects to support optimal management decision making for the improvement of safety and program performance. In addition to using quantitative/probabilistic risk assessment to improve safety and enhance the safety decision process, NASA has incorporated quantitative risk assessment into its system safety assessment process, which until now has relied primarily on a qualitative representation of risk. Also, NASA has recently adopted the Risk-Informed Decision Making (RIDM) process [1-1] as a valuable addition to supplement existing deterministic and experience-based engineering methods and tools. Over the years, NASA has been a leader in most of the technologies it has employed in its programs. One would think that PRA should be no exception. In fact, it would be natural for NASA to be a leader in PRA because, as a technology pioneer, NASA uses risk assessment and management implicitly or explicitly on a daily basis. NASA has probabilistic safety requirements (thresholds and goals) for crew transportation system missions to the International Space Station (ISS) [1-2]. NASA intends to have probabilistic requirements for any new human spaceflight transportation system acquisition. Methods to perform risk and reliability assessment in the early 1960s originated in U.S. aerospace and missile programs. Fault tree analysis (FTA) is an example. It would have been a reasonable extrapolation to expect that NASA would also become the world leader in the application of PRA. 
That was, however, not to happen. Early in the Apollo program, estimates of the probability for a successful roundtrip human mission to the moon yielded disappointingly low (and suspect) values and NASA became discouraged from further performing quantitative risk analyses until some two decades later when the methods were more refined, rigorous, and repeatable. Instead, NASA decided to rely primarily on the Hazard Analysis (HA) and Failure Modes and Effects Analysis (FMEA) methods for system safety assessment

    A Study of Software Input Failure Propagation Mechanisms

    Probabilistic Risk Assessment (PRA) is a well-established technique to assess the probability of failure or success of a system. Classical PRA does not consider the contributions of software to risk. Dr. B. Li and C. Smidts have established a framework to integrate software into PRA which recognizes the existence of four classes of risk contributors: functional, input, output and support failures. Input/Output failures have been shown to make up 57.4% of the failures experienced during software development of major aerospace systems and have been at the origin of a number of major accidents such as the Mars Polar Lander. This research quantifies the contribution of the input failures. More specifically, this dissertation 1) defines the concept of input failure, 2) studies the related propagation mechanisms, 3) estimates the propagation probability for different types of input failures, and 4) applies the fault propagation analysis to the framework of integrating software into PRA. The dissertation defines the concept of artifact as a reference point to identify expected inputs and consequently input failures (inputs which differ from the expected ones). Input failures are divided into value-related failures (including value, range, type and amount failures) and time-related failures (including time, rate and duration failures). Value failures are examined first. The concept of masking areas and flat parts is defined, and the dissertation proposes an Image Reconstruction Method (IRM) to estimate the propagation probability of input value failures. This method is proven to require fewer test cases than one based on random testing to reach the same relative error. For the other input failure modes, the dissertation reveals how they transform to the data state error and formalizes their propagation criteria so that the IRM can be applied to estimate the propagation probability. The contributions are thus: 1. Clear definition of the concept of input failure; 2. Definition of a systematic process of identification and quantification of the contributions of input failures to risk; 3. Systematic analysis of the propagation mechanisms of each type of input failure
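
    For orientation, the random-testing baseline that the abstract compares against can be sketched as follows. This is not the IRM itself, only a plain Monte Carlo estimate of how often an input value failure changes the output. The function `controller`, the input range, and the fixed offset used as the failure are hypothetical; the constant output bands play the role of "flat parts" in which an input failure is masked.

```python
import random


def controller(x: float) -> int:
    """Hypothetical software function whose output is constant within each band."""
    if x < 10.0:
        return 0
    if x < 20.0:
        return 1
    return 2


def propagation_probability(func, perturb, sample_input, n_trials=10_000, seed=0):
    """Estimate P(output changes | input failure) by random testing."""
    rng = random.Random(seed)
    propagated = 0
    for _ in range(n_trials):
        x = sample_input(rng)        # expected (correct) input
        x_failed = perturb(x, rng)   # input after the value failure
        if func(x_failed) != func(x):
            propagated += 1          # the failure was not masked
    return propagated / n_trials


# Hypothetical value failure: the input is offset by +1.0.
estimate = propagation_probability(
    controller,
    perturb=lambda x, rng: x + 1.0,
    sample_input=lambda rng: rng.uniform(0.0, 30.0),
)
print(f"estimated propagation probability: {estimate:.4f}")
```

    In this toy setup the failure propagates only when the offset pushes the input across a band boundary, so most failures are masked and many random tests are needed for a tight estimate.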

    Integrated Scenario-Based Methodology for Project Risk Management

    Project risk management is currently used in several industries and mandated by government acquisition agencies around the world to manage uncertainty in an effort to improve a project's probability of success. Common practice involves developing a list of risk items scored with probability and consequence ordinal scales by committee, usually focusing on cost and schedule issues. A scenario-based process modeling construct is introduced using a hybrid Probabilistic Risk Assessment and Decision Analysis framework integrating project development risks with operational system risks. Project management's decisions are explicitly modeled and ranked based on risk importance to the project. Multiple consequence attributes are unified, providing a basis for computing total project risk. This study shows that such an approach leads to an analysis system where scenarios tracing risk items to many possible consequences are explicitly understood; the interaction between cost, schedule, and performance models drives the analysis; probabilities for overruns, delays, and increased system hazards are determined directly; and state-of-the-art quantification techniques are directly applicable. All these enhance project management's capability to respond with more effective decisions

    Johnson Space Center's Risk and Reliability Analysis Group 2008 Annual Report

    The Johnson Space Center (JSC) Safety & Mission Assurance (S&MA) Directorate's Risk and Reliability Analysis Group provides both mathematical and engineering analysis expertise in the areas of Probabilistic Risk Assessment (PRA), Reliability and Maintainability (R&M) analysis, and data collection and analysis. The fundamental goal of this group is to provide National Aeronautics and Space Administration (NASA) decisionmakers with the necessary information to make informed decisions when evaluating personnel, flight hardware, and public safety concerns associated with current operating systems as well as with any future systems. The Analysis Group includes a staff of statistical and reliability experts with valuable backgrounds in the statistical, reliability, and engineering fields. This group includes JSC S&MA Analysis Branch personnel as well as S&MA support services contractors, such as Science Applications International Corporation (SAIC) and SoHaR. The Analysis Group's experience base includes nuclear power (both commercial and navy), manufacturing, Department of Defense, chemical, and shipping industries, as well as significant aerospace experience specifically in the Shuttle, International Space Station (ISS), and Constellation Programs. The Analysis Group partners with project and program offices, other NASA centers, NASA contractors, and universities to provide additional resources or information to the group when performing various analysis tasks. The JSC S&MA Analysis Group is recognized as a leader in risk and reliability analysis within the NASA community. Therefore, the Analysis Group is in high demand to help the Space Shuttle Program (SSP) continue to fly safely, assist in designing the next generation spacecraft for the Constellation Program (CxP), and promote advanced analytical techniques.
The Analysis Section's tasks include teaching classes and instituting personnel qualification processes to enhance the professional abilities of our analysts as well as performing major probabilistic assessments used to support flight rationale and help establish program requirements. During 2008, the Analysis Group performed more than 70 assessments. Although all these assessments were important, some were instrumental in the decisionmaking processes for the Shuttle and Constellation Programs. Two of the more significant tasks were the Space Transportation System (STS)-122 Low Level Cutoff PRA for the SSP and the Orion Pad Abort One (PA-1) PRA for the CxP. These two activities, along with the numerous other tasks the Analysis Group performed in 2008, are summarized in this report. This report also highlights several ongoing and upcoming efforts to provide crucial statistical and probabilistic assessments, such as the Extravehicular Activity (EVA) PRA for the Hubble Space Telescope service mission and the first fully integrated PRAs for the CxP's Lunar Sortie and ISS missions

    Managing multi-module issues in SMR PRA

    Review of Quantitative Software Reliability Methods

    The current U.S. Nuclear Regulatory Commission (NRC) licensing process for digital systems rests on deterministic engineering criteria. In its 1995 probabilistic risk assessment (PRA) policy statement, the Commission encouraged the use of PRA technology in all regulatory matters to the extent supported by the state-of-the-art in PRA methods and data. Although many activities have been completed in the area of risk-informed regulation, the risk-informed analysis process for digital systems has not yet been satisfactorily developed. Since digital instrumentation and control (I&C) systems are expected to play an increasingly important role in nuclear power plant (NPP) safety, the NRC established a digital system research plan that defines a coherent set of research programs to support its regulatory needs. One of the research programs included in the NRC's digital system research plan addresses risk assessment methods and data for digital systems. Digital I&C systems have some unique characteristics, such as using software, and may have different failure causes and/or modes than analog I&C systems; hence, their incorporation into NPP PRAs entails special challenges. The objective of the NRC's digital system risk research is to identify and develop methods, analytical tools, and regulatory guidance for (1) including models of digital systems into NPP PRAs, and (2) using information on the risks of digital systems to support the NRC's risk-informed licensing and oversight activities. For several years, Brookhaven National Laboratory (BNL) has worked on NRC projects to investigate methods and tools for the probabilistic modeling of digital systems, as documented mainly in NUREG/CR-6962 and NUREG/CR-6997. However, the scope of this research principally focused on hardware failures, with limited reviews of software failure experience and software reliability methods. 
NRC also sponsored research at the Ohio State University investigating the modeling of digital systems using dynamic PRA methods. These efforts, documented in NUREG/CR-6901, NUREG/CR-6942, and NUREG/CR-6985, included a functional representation of the system's software but did not explicitly address failure modes caused by software defects or by inadequate design requirements. An important identified research need is to establish a commonly accepted basis for incorporating the behavior of software into digital I&C system reliability models for use in PRAs. To address this need, BNL is exploring the inclusion of software failures into the reliability models of digital I&C systems, such that their contribution to the risk of the associated NPP can be assessed