Search CORE

1,526 research outputs found

A fault-tolerant multiprocessor architecture for aircraft, volume 1

Author: Ausrotas R. A.
Hanley L. D.
Hopkins A. L.
Lala J. H.
Martin J. H.
Smith T. B.
Taylor W.
Publication venue
Publication date
Field of study

A fault-tolerant multiprocessor architecture is reported. This architecture, together with a comprehensive information system architecture, has important potential for future aircraft applications. A preliminary definition and assessment of a suitable multiprocessor architecture for such applications is developed

NASA Technical Reports Server

Development and evaluation of a fault-tolerant multiprocessor (FTMP) computer. Volume 4: FTMP executive summary

Author: Lala J. H.
Smith T. B., III
Publication venue
Publication date
Field of study

The FTMP architecture is a high reliability computer concept modeled after a homogeneous multiprocessor architecture. Elements of the FTMP are operated in tight synchronism with one another and hardware fault-detection and fault-masking is provided which is transparent to the software. Operating system design and user software design is thus greatly simplified. Performance of the FTMP is also comparable to that of a simplex equivalent due to the efficiency of fault handling hardware. The FTMP project constructed an engineering module of the FTMP, programmed the machine and extensively tested the architecture through fault injection and other stress testing. This testing confirmed the soundness of the FTMP concepts

NASA Technical Reports Server

Problems related to the integration of fault tolerant aircraft electronic systems

Author: Adlakha V.
Alspaugh T. A., Jr.
Bannister J. A.
Triyedi K.
Publication venue
Publication date
Field of study

Problems related to the design of the hardware for an integrated aircraft electronic system are considered. Taxonomies of concurrent systems are reviewed and a new taxonomy is proposed. An informal methodology intended to identify feasible regions of the taxonomic design space is described. Specific tools are recommended for use in the methodology. Based on the methodology, a preliminary strawman integrated fault tolerant aircraft electronic system is proposed. Next, problems related to the programming and control of inegrated aircraft electronic systems are discussed. Issues of system resource management, including the scheduling and allocation of real time periodic tasks in a multiprocessor environment, are treated in detail. The role of software design in integrated fault tolerant aircraft electronic systems is discussed. Conclusions and recommendations for further work are included

NASA Technical Reports Server

Fault-tolerant computer study

Author: Avizienis A. A.
Ercegovac M. D.
Rennels D. A.
Publication venue
Publication date
Field of study

A set of building block circuits is described which can be used with commercially available microprocessors and memories to implement fault tolerant distributed computer systems. Each building block circuit is intended for VLSI implementation as a single chip. Several building blocks and associated processor and memory chips form a self checking computer module with self contained input output and interfaces to redundant communications buses. Fault tolerance is achieved by connecting self checking computer modules into a redundant network in which backup buses and computer modules are provided to circumvent failures. The requirements and design methodology which led to the definition of the building block circuits are discussed

NASA Technical Reports Server

Lessons learned in creating spacecraft computer systems: Implications for using Ada (R) for the space station

Author: Tomayko James E.
Publication venue
Publication date
Field of study

Twenty-five years of spacecraft onboard computer development have resulted in a better understanding of the requirements for effective, efficient, and fault tolerant flight computer systems. Lessons from eight flight programs (Gemini, Apollo, Skylab, Shuttle, Mariner, Voyager, and Galileo) and three reserach programs (digital fly-by-wire, STAR, and the Unified Data System) are useful in projecting the computer hardware configuration of the Space Station and the ways in which the Ada programming language will enhance the development of the necessary software. The evolution of hardware technology, fault protection methods, and software architectures used in space flight in order to provide insight into the pending development of such items for the Space Station are reviewed

NASA Technical Reports Server

Autonomous spacecraft maintenance study group

Author: Low G. D.
Marshall M. H.
Publication venue
Publication date
Field of study

A plan to incorporate autonomous spacecraft maintenance (ASM) capabilities into Air Force spacecraft by 1989 is outlined. It includes the successful operation of the spacecraft without ground operator intervention for extended periods of time. Mechanisms, along with a fault tolerant data processing system (including a nonvolatile backup memory) and an autonomous navigation capability, are needed to replace the routine servicing that is presently performed by the ground system. The state of the art fault handling capabilities of various spacecraft and computers are described, and a set conceptual design requirements needed to achieve ASM is established. Implementations for near term technology development needed for an ASM proof of concept demonstration by 1985, and a research agenda addressing long range academic research for an advanced ASM system for 1990s are established

NASA Technical Reports Server

Analysis of checkpointing schemes for multiprocessor systems

Author: Bruck Jehoshua
Ziv Avi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1993
Field of study

Parallel computing systems provide hardware redundancy that helps to achieve low cost fault-tolerance, by duplicating the task into more than a single processor, and comparing the states of the processors at checkpoints. This paper suggests a novel technique, based on a Markov Reward Model (MRM), for analyzing the performance of checkpointing schemes with task duplication. We show how this technique can be used to derive the average execution time of a task and other important parameters related to the performance of checkpointing schemes. Our analytical results match well the values we obtained using a simulation program. We compare the average task execution time and total work of four checkpointing schemes, and show that generally increasing the number of processors reduces the average execution time, but increases the total work done by the processors. However, in cases where there is a big difference between the time it takes to perform different operations, those results can change

CiteSeerX

Caltech Authors

Performance optimization of checkpointing schemes with task duplication

Author: Bruck Jehoshua
Ziv Avi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/12/1997
Field of study

In checkpointing schemes with task duplication, checkpointing serves two purposes: detecting faults by comparing the processors' states at checkpoints, and reducing fault recovery time by supplying a safe point to rollback to. In this paper, we show that, by tuning the checkpointing schemes to a given architecture, a significant reduction in the execution time can be achieved. The main idea is to use two types of checkpoints: compare-checkpoints (comparing the states of the redundant processes to detect faults) and store-checkpoints (storing the states to reduce recovery time). With two types of checkpoints, we can use both the comparison and storage operations in an efficient way and improve the performance of checkpointing schemes. Results we obtained show that, in some cases, using compare and store checkpoints can reduce the overhead of DMR checkpointing schemes by as much as 30 percent

Caltech Authors