59,066 research outputs found

Fault-free performance validation of fault-tolerant multiprocessors

Author: Czeck Edward W.
Feather Frank E.
Grizzaffi Ann Marie
Segall Zary Z.
Siewiorek Daniel P.

A validation methodology for testing the performance of fault-tolerant computer systems was developed and applied to the Fault-Tolerant Multiprocessor (FTMP) at NASA-Langley's AIRLAB facility. This methodology was claimed to be general enough to apply to any ultrareliable computer system. The goal of this research was to extend the validation methodology and to demonstrate its robustness by applying it more extensively to NASA's Fault-Tolerant Multiprocessor System (FTMP) and to the Software Implemented Fault-Tolerance (SIFT) Computer System. Furthermore, the performance of these two multiprocessors was compared by conducting similar experiments. An analysis of the results shows that high-level-language instruction execution times for both SIFT and FTMP were consistent and predictable, with SIFT having greater throughput. At the operating system level, FTMP consumes 60% of its throughput for its real-time dispatcher and 5% on fault-handling tasks. In contrast, SIFT consumes 16% of its throughput for the dispatcher, but consumes 66% in fault-handling software overhead.

NASA Technical Reports Server
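
The operating-system overhead figures quoted in the abstract above can be turned into a rough estimate of the throughput left over on each machine. This is only a back-of-the-envelope sketch: it assumes that the dispatcher and fault-handling percentages are non-overlapping shares of the same total and that everything else is available to application work, neither of which the abstract states. The names `OVERHEAD` and `remaining_fraction` are made up for this illustration.

```python
# Overhead shares of total throughput, as quoted in the abstract.
OVERHEAD = {
    "FTMP": {"dispatcher": 0.60, "fault_handling": 0.05},
    "SIFT": {"dispatcher": 0.16, "fault_handling": 0.66},
}


def remaining_fraction(system: str) -> float:
    """Throughput share left after dispatcher and fault-handling overhead.

    Assumes the two shares are additive and do not overlap (an assumption
    of this sketch, not a statement from the report).
    """
    return 1.0 - sum(OVERHEAD[system].values())


if __name__ == "__main__":
    for name in OVERHEAD:
        print(f"{name}: {remaining_fraction(name):.0%} left for application work")
```

Under these assumptions FTMP would keep roughly 35% of its throughput and SIFT roughly 18%, which says nothing about absolute speed, since the abstract also reports that SIFT has the greater raw throughput.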

Tutorial: Advanced fault tree applications using HARP

Author: Bavuso Salvatore J.
Boyd Mark A.
Dugan Joanne Bechta

Reliability analysis of fault-tolerant computer systems for critical applications is complicated by several factors. These modeling difficulties are discussed, and dynamic fault tree modeling techniques for handling them are described and demonstrated. Several advanced fault-tolerant computer systems are described, and fault tree models for their analysis are presented. HARP (Hybrid Automated Reliability Predictor) is a software package developed at Duke University and NASA Langley Research Center that is capable of solving the fault tree models presented.

NASA Technical Reports Server

Fault tolerant architectures for integrated aircraft electronics systems, task 2

Author: Levitt K. N.
Melliar-Smith P. M.
Schwartz R. L.

The architectural basis for an advanced fault-tolerant on-board computer to succeed the current generation of fault-tolerant computers is examined. The network error-tolerant system architecture is studied with particular attention to intercluster configurations and communication protocols, and to refined reliability estimates. The diagnosis of faults, so that appropriate choices for reconfiguration can be made, is discussed. The analysis relates particularly to the recognition of transient faults in a system with tasks at many levels of priority. The demand-driven data-flow architecture, which appears to have possible application in fault-tolerant systems, is described, and work investigating the feasibility of automatic generation of aircraft flight control programs from abstract specifications is reported.

NASA Technical Reports Server

Study of fault-tolerant software technology

Author: Broglio C.
Goldberg J.
Hitt E.
Levitt K.
Slivinski T.
Webb J.
Wild C.

Presented is an overview of the current state of the art of fault-tolerant software and an analysis of quantitative techniques and models developed to assess its impact. It examines research efforts as well as experience gained from commercial application of these techniques. The paper also addresses the implications that using fault-tolerant software in real-time aerospace applications has for computer architecture and for the design of hardware, operating systems, and programming languages (including Ada). It concludes that fault-tolerant software has progressed beyond the pure research state. The paper also finds that, although not perfectly matched, newer architectural and language capabilities provide many of the notations and functions needed to effectively and efficiently implement software fault-tolerance.

NASA Technical Reports Server

Reliability estimation procedures and CARE: The Computer-Aided Reliability Estimation Program

Author: Mathur F. P.

Ultrareliable fault-tolerant onboard digital systems for spacecraft intended for long mission life exploration of the outer planets are under development. The design of systems involving self-repair and fault-tolerance leads to the companion problem of quantifying and evaluating the survival probability of the system for the mission under consideration and the constraints imposed upon the system. Methods have been developed to (1) model self-repair and fault-tolerant organizations; (2) compute survival probability, mean life, and many other reliability predictive functions with respect to various systems and mission parameters; (3) perform sensitivity analysis of the system with respect to mission parameters; and (4) quantitatively compare competitive fault-tolerant systems. Various measures of comparison are offered. To automate the procedures of reliability mathematical modeling and evaluation, the CARE (computer-aided reliability estimation) program was developed. CARE is an interactive program residing on the UNIVAC 1108 system, which makes the above calculations and facilitates report preparation by providing output in tabular form, graphical 2-dimensional plots, and 3-dimensional projections. The reliability estimation of fault-tolerant organizations by means of the CARE program is described.

NASA Technical Reports Server
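
To make the quantities named in the abstract above concrete (survival probability, mean life, and a quantitative comparison of competing organizations), the following minimal sketch compares a single unit with a triple modular redundant (TMR) arrangement. It is not taken from CARE: the exponential failure law, the perfect voter, the absence of repair, and all function names are assumptions chosen for illustration.

```python
import math
from typing import Callable


def simplex_reliability(lam: float, t: float) -> float:
    """Survival probability of one unit with constant failure rate lam."""
    return math.exp(-lam * t)


def tmr_reliability(lam: float, t: float) -> float:
    """Survival probability of 2-out-of-3 voting over identical units.

    Assumes a perfect voter and no repair (assumptions of this sketch).
    """
    r = simplex_reliability(lam, t)
    return 3 * r**2 - 2 * r**3


def mean_life(
    reliability: Callable[[float, float], float],
    lam: float,
    horizon_multiple: float = 50.0,
    steps: int = 200_000,
) -> float:
    """Mean life as the trapezoidal integral of R(t) over [0, horizon_multiple / lam]."""
    upper = horizon_multiple / lam
    dt = upper / steps
    total = 0.5 * (reliability(lam, 0.0) + reliability(lam, upper))
    for i in range(1, steps):
        total += reliability(lam, i * dt)
    return total * dt


if __name__ == "__main__":
    lam = 1e-4  # hypothetical failure rate per hour
    for mission_hours in (1_000.0, 10_000.0):
        print(
            mission_hours,
            simplex_reliability(lam, mission_hours),
            tmr_reliability(lam, mission_hours),
        )
    print(mean_life(simplex_reliability, lam), mean_life(tmr_reliability, lam))
```

Under these assumptions TMR gives the higher survival probability for short missions (while the unit reliability is above one half) yet has the shorter mean life, 5/(6λ) against 1/λ for a single unit. This is one reason why a comparison of competing organizations depends on the mission parameters rather than on a single figure of merit.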

FAULT INJECTION BASED DEPENDABILITY ANALYSIS

Author: Benyó B.
Pataricza A.
Publication venue: Periodica Polytechnica Electrical Engineering (Archives)
Publication date: 01/01/1993

In recent years there has been a rapid increase in the use of fault-tolerant systems. The majority of computer systems, even those which are not labeled as fault tolerant, have some built-in fault-tolerant features. Accordingly, the need for dependability evaluation tools is increasing. These tools may help system designers validate the fault tolerance specification of their systems. A portable, general-purpose evaluation environment (called DEEP, Dependability Evaluation Experimental Package) was developed for the dependability analysis of fault-tolerant systems. Our objective was to design a tool that is general purpose with respect to both the target machine type and the fault conditions. Special emphasis was given to a realistic fault injection scheme. The test environment was implemented for the dependability analysis of the Modular Expandable Multiprocessor SYstem (MEMSY), developed at the Friedrich-Alexander University of Erlangen-Nuremberg. The paper describes the DEEP environment, introducing the system structure and giving a detailed description of the modules. It also describes the reimplementation of the portable system for master-checker simulation. Experimental results of the evaluation of the MEMSY system are presented.

Periodica Polytechnica (Budapest University of Technology and Economics)

The art of fault-tolerant system reliability modeling

Author: Butler Ricky W.
Johnson Sally C.

A step-by-step tutorial on the methods and tools used for the reliability analysis of fault-tolerant systems is presented. Emphasis is on the representation of architectural features in mathematical models. Details of the mathematical solution of complex reliability models are not presented. Instead, the use of several recently developed computer programs (SURE, ASSIST, STEM, PAWS) which automate the generation and solution of these models is described.

NASA Technical Reports Server

Building a fault-tolerant quantum computer using concatenated cat codes

Author: Arrangoiz-Arriola Patricio
Bohdanowicz Thomas C.
Brandão Fernando G. S. L.
Campbell Earl T.
Chamberland Christopher
Flammia Steven T.
Hann Connor T.
Iverson Joseph K.
Jiang Liang
Keller Andrew J.
Noh Kyungjoo
Painter Oskar
Preskill John
Putterman Harald
Refael Gil
Safavi-Naeini Amir H.
Publication date: 07/12/2020

We present a comprehensive architectural analysis for a fault-tolerant quantum computer based on cat codes concatenated with outer quantum error-correcting codes. For the physical hardware, we propose a system of acoustic resonators coupled to superconducting circuits with a two-dimensional layout. Using estimated near-term physical parameters for electro-acoustic systems, we perform a detailed error analysis of measurements and gates, including CNOT and Toffoli gates. Having built a realistic noise model, we numerically simulate quantum error correction when the outer code is either a repetition code or a thin rectangular surface code. Our next step toward universal fault-tolerant quantum computation is a protocol for fault-tolerant Toffoli magic state preparation that significantly improves upon the fidelity of physical Toffoli gates at very low qubit cost. To achieve even lower overheads, we devise a new magic-state distillation protocol for Toffoli states. Combining these results, we obtain realistic full-resource estimates of the physical error rates and overheads needed to run useful fault-tolerant quantum algorithms. We find that with around 1,000 superconducting circuit components, one could construct a fault-tolerant quantum computer that can run circuits which are intractable for classical supercomputers. Hardware with 32,000 superconducting circuit components, in turn, could simulate the Hubbard model in a regime beyond the reach of classical computing.

Performability analysis of fault-tolerant computer systems

Author: Nabli Hédi
Sericola Bruno
Publication venue: HAL CCSD
Publication date: 01/01/1994

Available in the files attached to this document.

INRIA a CCSD electronic archive server