4 research outputs found

Fault-tolerant software: dependability/performance trade-offs, concurrency and system support

Author: Xu Jie
Publication venue: Newcastle University
Publication date: 01/01/1999

PhD Thesis

As the use of computer systems becomes more and more widespread in applications that demand high levels of dependability, these applications themselves are growing in complexity at a rapid rate, especially in areas that require concurrent and distributed computing. Such complex systems are very prone to faults and errors. No matter how rigorously fault avoidance and fault removal techniques are applied, software design faults often remain in systems when they are delivered to customers. In fact, residual software faults are becoming a significant underlying cause of system failures and the lack of dependability. There is a tremendous need for systematic techniques for building dependable software, including fault tolerance techniques that enable software-based systems to operate dependably even when potential faults are present. However, although there has been a large amount of research in the area of fault-tolerant software, existing techniques are not yet sufficiently mature as a practical engineering discipline for realistic applications. In particular, they are often inadequate when applied to highly concurrent and distributed software. This thesis develops new techniques for building fault-tolerant software, addresses the problem of achieving high levels of dependability in concurrent and distributed object systems, and studies system-level support for implementing dependable software. Two schemes are developed: the t/(n-1)-VP approach is aimed at increasing software reliability and controlling additional complexity, while the SCOP approach presents an adaptive way of dynamically adjusting software reliability and efficiency aspects. As a more general framework for constructing dependable concurrent and distributed software, the Coordinated Atomic (CA) Action scheme is examined thoroughly.
Key properties of CA actions are formalized, a conceptual model and mechanisms for handling application-level exceptions are devised, and object-based diversity techniques are introduced to cope with potential software faults. These three schemes are evaluated analytically and validated by controlled experiments. System-level support is also addressed with a multi-level system architecture. An architectural pattern for implementing fault-tolerant objects is documented in detail to capture existing solutions and our previous experience. An industrial safety-critical application, the Fault-Tolerant Production Cell, is used as a case study to examine most of the concepts and techniques developed in this research.

ESPRIT

Newcastle University eTheses

A system architecture for fault tolerance in concurrent software

Author: Ancona M.
Clematis A.
Consiglio Nazionale delle Ricerche Genoa (Italy). Istituto per la Matematica Applicata
Dodero G.
Fernandez E.B.
Giannuzzi V.
Publication date: 01/01/1989

SIGLEITItal

OpenGrey Repository

A System Architecture for Fault Tolerance in Concurrent Software

Author: A. Clematis
Ancona Massimo
E. B. Fernandez
G. Dodero
V. Gianuzzi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1990

A system architecture called the recovery metaprogram (RMP) is proposed. It separates the application from the recovery software, giving programmers a single environment that lets them use the most appropriate fault-tolerance scheme. To simplify the presentation of the RMP approach, it is assumed that the fault model is limited to faults originating in the application software, and that the hardware and kernel layers can mask their own faults from the RMP. Also, relationships between backward and forward error recovery are not considered. Some RMP examples are given, and a particular RMP implementation is described.

Archivio istituzionale della ricerca - Università di Genova

A system architecture for fault tolerance in concurrent software

Author: Ancona M.
Clematis A.
Dodero G.
Fernandez E.B.
Giannuzzi V.
Consiglio Nazionale delle Ricerche Genoa (Italy). Istituto per la Matematica Applicata
Publication date: 01/01/1989

SIGLEITItal

Crossref

Repository of the Academy's Library

OpenGrey Repository