Search CORE

3,310 research outputs found

Fault tolerant architectures for integrated aircraft electronics systems

Author: Levitt K. N.
Melliar-Smith P. M.
Schwartz R. L.
Publication venue
Publication date
Field of study

Work into possible architectures for future flight control computer systems is described. Ada for Fault-Tolerant Systems, the NETS Network Error-Tolerant System architecture, and voting in asynchronous systems are covered

NASA Technical Reports Server

Critical fault patterns determination in fault-tolerant computer systems

Author: Losq J.
Mccluskey E. J.
Publication venue
Publication date
Field of study

The method proposed tries to enumerate all the critical fault-patterns (successive occurrences of failures) without analyzing every single possible fault. The conditions for the system to be operating in a given mode can be expressed in terms of the static states. Thus, one can find all the system states that correspond to a given critical mode of operation. The next step consists in analyzing the fault-detection mechanisms, the diagnosis algorithm and the process of switch control. From them, one can find all the possible system configurations that can result from a failure occurrence. Thus, one can list all the characteristics, with respect to detection, diagnosis, and switch control, that failures must have to constitute critical fault-patterns. Such an enumeration of the critical fault-patterns can be directly used to evaluate the overall system tolerance to failures. Present research is focused on how to efficiently make use of these system-level characteristics to enumerate all the failures that verify these characteristics

NASA Technical Reports Server

Evaluation of fault-tolerant parallel-processor architectures over long space missions

Author: Johnson Sally C.
Publication venue
Publication date
Field of study

The impact of a five year space mission environment on fault-tolerant parallel processor architectures is examined. The target application is a Strategic Defense Initiative (SDI) satellite requiring 256 parallel processors to provide the computation throughput. The reliability requirements are that the system still be operational after five years with .99 probability and that the probability of system failure during one-half hour of full operation be less than 10(-7). The fault tolerance features an architecture must possess to meet these reliability requirements are presented, many potential architectures are briefly evaluated, and one candidate architecture, the Charles Stark Draper Laboratory's Fault-Tolerant Parallel Processor (FTPP) is evaluated in detail. A methodology for designing a preliminary system configuration to meet the reliability and performance requirements of the mission is then presented and demonstrated by designing an FTPP configuration

NASA Technical Reports Server

Reliability analysis of triple modular redundancy system with spare

Author: Al-Kofahi Khalid A.
Publication venue: RIT Scholar Works
Publication date: 01/12/1993
Field of study

Hardware redundant fault-tolerant systems and the different design approaches are discussed. The reliability analysis of fault-tolerant systems is usually done under permanent fault conditions. With statistical data suggesting that up to 90% of system failures are caused by intermittent faults, the reliability analysis of fault-tolerant systems must concentrate more on this class of faults. In this work, a reconfigurable Triple Modular Redundancy (TMR) with spare system that differentiates between permanent and intermittent faults has been built. The reconfiguration process of this system depends on both the current status of its modules and their history. Based on this, a different approach for reliability analysis under intermittent fault conditions using Markov models is presented. This approach shows a much higher system reliability compared to other redundant and non-redundant configurations

RIT Scholar Works

On Fault Tolerance Methods for Networks-on-Chip

Author: Lehtonen Teijo
Publication venue: Turku Centre for Computer Science
Publication date: 13/11/2009
Field of study

Technology scaling has proceeded into dimensions in which the reliability of manufactured devices is becoming endangered. The reliability decrease is a consequence of physical limitations, relative increase of variations, and decreasing noise margins, among others. A promising solution for bringing the reliability of circuits back to a desired level is the use of design methods which introduce tolerance against possible faults in an integrated circuit. This thesis studies and presents fault tolerance methods for network-onchip (NoC) which is a design paradigm targeted for very large systems-onchip. In a NoC resources, such as processors and memories, are connected to a communication network; comparable to the Internet. Fault tolerance in such a system can be achieved at many abstraction levels. The thesis studies the origin of faults in modern technologies and explains the classification to transient, intermittent and permanent faults. A survey of fault tolerance methods is presented to demonstrate the diversity of available methods. Networks-on-chip are approached by exploring their main design choices: the selection of a topology, routing protocol, and flow control method. Fault tolerance methods for NoCs are studied at different layers of the OSI reference model. The data link layer provides a reliable communication link over a physical channel. Error control coding is an efficient fault tolerance method especially against transient faults at this abstraction level. Error control coding methods suitable for on-chip communication are studied and their implementations presented. Error control coding loses its effectiveness in the presence of intermittent and permanent faults. Therefore, other solutions against them are presented. The introduction of spare wires and split transmissions are shown to provide good tolerance against intermittent and permanent errors and their combination to error control coding is illustrated. At the network layer positioned above the data link layer, fault tolerance can be achieved with the design of fault tolerant network topologies and routing algorithms. Both of these approaches are presented in the thesis together with realizations in the both categories. The thesis concludes that an optimal fault tolerance solution contains carefully co-designed elements from different abstraction levelsSiirretty Doriast

UTUPub

A fault-tolerant multiprocessor architecture for aircraft, volume 1

Author: Ausrotas R. A.
Hanley L. D.
Hopkins A. L.
Lala J. H.
Martin J. H.
Smith T. B.
Taylor W.
Publication venue
Publication date
Field of study

A fault-tolerant multiprocessor architecture is reported. This architecture, together with a comprehensive information system architecture, has important potential for future aircraft applications. A preliminary definition and assessment of a suitable multiprocessor architecture for such applications is developed

NASA Technical Reports Server

Design of a fault tolerant airborne digital computer. Volume 1: Architecture

Author: Goldberg J.
Green M. W.
Levitt K. N.
Neumann P. G.
Wensley J. H.
Publication venue
Publication date
Field of study

This volume is concerned with the architecture of a fault tolerant digital computer for an advanced commercial aircraft. All of the computations of the aircraft, including those presently carried out by analogue techniques, are to be carried out in this digital computer. Among the important qualities of the computer are the following: (1) The capacity is to be matched to the aircraft environment. (2) The reliability is to be selectively matched to the criticality and deadline requirements of each of the computations. (3) The system is to be readily expandable. contractible, and (4) The design is to appropriate to post 1975 technology. Three candidate architectures are discussed and assessed in terms of the above qualities. Of the three candidates, a newly conceived architecture, Software Implemented Fault Tolerance (SIFT), provides the best match to the above qualities. In addition SIFT is particularly simple and believable. The other candidates, Bus Checker System (BUCS), also newly conceived in this project, and the Hopkins multiprocessor are potentially more efficient than SIFT in the use of redundancy, but otherwise are not as attractive

NASA Technical Reports Server