
    The art of fault-tolerant system reliability modeling

    A step-by-step tutorial of the methods and tools used for the reliability analysis of fault-tolerant systems is presented. Emphasis is on the representation of architectural features in mathematical models. Details of the mathematical solution of complex reliability models are not presented. Instead the use of several recently developed computer programs--SURE, ASSIST, STEM, PAWS--which automate the generation and solution of these models is described

    SURE reliability analysis: Program and mathematics

    The SURE program is a new reliability analysis tool for ultrareliable computer system architectures. The computational methods on which the program is based provide an efficient means for computing accurate upper and lower bounds for the death state probabilities of a large class of semi-Markov models. Once a semi-Markov model is described using a simple input language, the SURE program automatically computes the upper and lower bounds on the probability of system failure. A parameter of the model can be specified as a variable over a range of values directing the SURE program to perform a sensitivity analysis automatically. This feature, along with the speed of the program, makes it especially useful as a design tool
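
    The parameter sweep described above can be illustrated with a much simpler stand-in. The sketch below is not SURE's bounding method and does not use its input language: it assumes a hypothetical triplex (triple modular redundant) system with no repair, in which each unit fails independently at a constant rate and the system fails once two units have failed. For that pure Markov model the failure probability has a closed form, which is evaluated over a range of failure rates. The function names and the example rates are illustrative assumptions.

```python
import math


def tmr_failure_probability(failure_rate, mission_time):
    """Exact failure probability of a triplex system without repair.

    Assumes three units with independent, exponentially distributed
    lifetimes; the system has failed once fewer than two units work.
    """
    r = math.exp(-failure_rate * mission_time)  # reliability of one unit
    system_reliability = 3.0 * r**2 - 2.0 * r**3  # at least 2 of 3 working
    return 1.0 - system_reliability


def sweep_failure_rate(failure_rates, mission_time):
    """Evaluate the model once per failure rate (a basic sensitivity analysis)."""
    return [(lam, tmr_failure_probability(lam, mission_time))
            for lam in failure_rates]


if __name__ == "__main__":
    # Hypothetical per-hour failure rates and a 10-hour mission.
    for lam, p_fail in sweep_failure_rate([1e-6, 1e-5, 1e-4, 1e-3], 10.0):
        print(f"lambda={lam:.0e}/h  P(system failure)={p_fail:.3e}")
```

    A tool such as SURE addresses the harder case in which recovery transitions are not exponential and an exact closed form is unavailable, which is why it reports upper and lower bounds instead of a single value.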

    The fault-tree compiler

    The Fault Tree Compiler Program is a new reliability tool used to predict the top event probability for a fault tree. Five different gate types are allowed in the fault tree: AND, OR, EXCLUSIVE OR, INVERT, and M OF N gates. The high level input language is easy to understand and use when describing the system tree. In addition, the use of the hierarchical fault tree capability can simplify the tree description and decrease program execution time. The current solution technique provides an answer that is precise to five digits (within the limits of double precision floating point arithmetic). The user may vary one failure rate or failure probability over a range of values and plot the results for sensitivity analyses. The solution technique is implemented in FORTRAN; the remaining program code is implemented in Pascal. The program is written to run on a Digital Equipment Corporation VAX with the VMS operating system

    The Fault Tree Compiler (FTC): Program and mathematics

    The Fault Tree Compiler Program is a new reliability tool used to predict the top-event probability for a fault tree. Five different gate types are allowed in the fault tree: AND, OR, EXCLUSIVE OR, INVERT, and m OF n gates. The high-level input language is easy to understand and use when describing the system tree. In addition, the use of the hierarchical fault tree capability can simplify the tree description and decrease program execution time. The current solution technique provides an answer that is precise to a user-specified number of digits (within the limits of double precision floating point arithmetic). The user may vary one failure rate or failure probability over a range of values and plot the results for sensitivity analyses. The solution technique is implemented in FORTRAN; the remaining program code is implemented in Pascal. The program is written to run on a Digital Equipment Corporation (DEC) VAX computer with the VMS operating system
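
    To make the five gate types concrete, here is a minimal sketch of how a top-event probability could be computed by hand. It is not the FTC input language or its solution technique. It assumes that every gate input is statistically independent (no basic event is shared between branches), that EXCLUSIVE OR takes two inputs and means "exactly one occurs", and that m OF n means "at least m of the n inputs occur". The example tree and its probabilities are hypothetical.

```python
from math import prod


def gate_and(probs):
    """Probability that all independent input events occur."""
    return prod(probs)


def gate_or(probs):
    """Probability that at least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probs)


def gate_invert(p):
    """Probability that the input event does not occur."""
    return 1.0 - p


def gate_xor(p, q):
    """Probability that exactly one of two independent events occurs."""
    return p * (1.0 - q) + q * (1.0 - p)


def gate_m_of_n(m, probs):
    """Probability that at least m of the independent input events occur."""
    dist = [1.0]  # dist[k] = probability that exactly k inputs have occurred
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)  # this input does not occur
            nxt[k + 1] += mass * p      # this input occurs
        dist = nxt
    return sum(dist[m:])


def top_event(p_sensor):
    """Hypothetical tree: both power supplies fail, OR 2 of 3 sensors fail."""
    power_loss = gate_and([1e-3, 2e-3])
    sensors_lost = gate_m_of_n(2, [p_sensor] * 3)
    return gate_or([power_loss, sensors_lost])


if __name__ == "__main__":
    # Vary one failure probability over a range, as in a sensitivity analysis.
    for p in (1e-4, 1e-3, 1e-2):
        print(f"p_sensor={p:.0e}  top={top_event(p):.5e}")
```

    Composing gate functions in this way mirrors a hierarchical tree description: each subtree is evaluated once and its probability is passed upward as a single input.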

    Hardware proofs using EHDM and the RSRE verification methodology

    Examined is a methodology for hardware verification developed by Royal Signals and Radar Establishment (RSRE) in the context of the SRI International's Enhanced Hierarchical Design Methodology (EHDM) specification/verification system. The methodology utilizes a four-level specification hierarchy with the following levels: functional level, finite automata model, block model, and circuit level. The properties of a level are proved as theorems in the level below it. This methodology is applied to a 6-bit counter problem and is critically examined. The specifications are written in EHDM's specification language, Extended Special, and the proofs are improving both the RSRE methodology and the EHDM system

    A preliminary transient-fault experiment on the SIFT computer system

    This paper presents the results of a preliminary experiment to study the effectiveness of a fault-tolerant system's ability to handle transient faults. The primary goal of the experiment was to develop the techniques to measure the parameters needed for a reliability analysis of the SIFT computer system that includes the effects of transient faults. A key aspect of such an analysis is the determination of the effectiveness of the operating system's ability to discriminate between transient and permanent faults. A detailed description of the preliminary transient fault experiment along with the results from 297 transient fault injections are given. Although not enough data was obtained to draw statistically significant conclusions, the foundation has been laid for a large-scale transient fault experiment

    A Primer on Architectural Level Fault Tolerance

    This paper introduces the fundamental concepts of fault tolerant computing. Key topics covered are voting, fault detection, clock synchronization, Byzantine Agreement, diagnosis, and reliability analysis. Low level mechanisms such as Hamming codes or low level communications protocols are not covered. The paper is tutorial in nature and does not cover any topic in detail. The focus is on rationale and approach rather than detailed exposition
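
    As a small illustration of the first topic listed, voting, the sketch below shows an exact-match majority voter over the outputs of redundant channels. It is a generic textbook-style example, not code from the paper; the function name and the choice to return None when no strict majority exists are assumptions.

```python
from collections import Counter


def majority_vote(channel_outputs):
    """Return the value reported by a strict majority of channels, else None.

    Uses exact-match voting: outputs must be identical to count as agreeing.
    """
    if not channel_outputs:
        return None
    value, count = Counter(channel_outputs).most_common(1)[0]
    # A strict majority is needed to mask a minority of faulty channels.
    return value if count > len(channel_outputs) / 2 else None


if __name__ == "__main__":
    print(majority_vote([7, 7, 9]))  # one faulty channel is outvoted
    print(majority_vote([1, 2, 3]))  # no majority, so no value is selected
```

    With three channels this masks any single faulty output; when no value has a strict majority, the caller has to treat the result as a detected but unmasked fault.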

    Design for validation, based on formal methods

    Validation of ultra-reliable systems decomposes into two subproblems: (1) quantifying the probability of system failure due to physical failure; (2) establishing that design errors are not present. Methods of design, testing, and analysis of ultra-reliable software are discussed. It is concluded that a design-for-validation approach based on formal methods is needed for the digital flight control systems problem, and also that formal methods will play a major role in the development of future high reliability digital systems

    NASA Formal Methods Workshop, 1990

    The workshop brought together researchers involved in the NASA formal methods research effort for detailed technical interchange and provided a mechanism for interaction with representatives from the FAA and the aerospace industry. The workshop also included speakers from industry to debrief the formal methods researchers on the current state of practice in flight critical system design, verification, and certification. The goals were to define and characterize the verification problem for ultra-reliable, life-critical flight control systems and the current state of practice in industry; to determine the proper role of formal methods in addressing these problems; and to assess the state of the art and recent progress toward applying formal methods to this area

    The Second NASA Formal Methods Workshop 1992

    The primary goal of the workshop was to bring together formal methods researchers and aerospace industry engineers to investigate new opportunities for applying formal methods to aerospace problems. The first part of the workshop was tutorial in nature. The second part of the workshop explored the potential of formal methods to address current aerospace design and verification problems. The third part of the workshop involved on-line demonstrations of state-of-the-art formal verification tools. Also, a detailed survey was filled in by the attendees; the results of the survey are compiled