Garbage collection in distributed systems
PhD Thesis
The provision of system-wide heap storage has a number of advantages.
However, when the technique is applied to distributed systems,
automatically recovering inaccessible variables becomes a serious problem.
This thesis presents a survey of such garbage collection techniques but
finds that no existing algorithm is entirely suitable. A new, general
purpose algorithm is developed and presented which allows individual
systems to garbage collect largely independently. The effects of these
garbage collections are combined, using recursively structured control
mechanisms, to achieve garbage collection of the entire heap with the
minimum of overheads. Experimental results show that the new algorithm
recovers most inaccessible variables more quickly than a straightforward
garbage collection, giving improved memory utilisation.
Fault injection testing of software implemented fault tolerance mechanisms of distributed systems
PhD Thesis
One way of gaining confidence in the adequacy of fault tolerance mechanisms of a
system is to test the system by injecting faults and see how the system performs under
faulty conditions. This thesis investigates the issues of testing software-implemented
fault tolerance mechanisms of distributed systems through fault injection.
A fault injection method has been developed. The method requires that the target
software system be structured as a collection of objects interacting via messages. This
enables easy insertion of fault injection objects into the target system to emulate
incorrect behaviour of faulty processors by manipulating messages. This approach
allows one to inject specific classes of faults while not requiring any significant changes
to the target system. The method differs from the previous work in that it exploits an
object oriented approach of software implementation to support the injection of specific
classes of faults at the system level.
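
As an illustration of the idea described above, the following is a minimal sketch of a fault injection object placed between two message-passing objects. It is not the thesis's implementation: the names `Message`, `Node`, `FaultInjector`, `omission` and `corruption` are hypothetical, and the two fault classes shown are chosen only as examples.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Message:
    """Hypothetical message format; the thesis does not specify one here."""
    sender: str
    receiver: str
    payload: int


class Node:
    """A target-system object that only reacts to delivered messages."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.inbox: List[Message] = []

    def receive(self, msg: Message) -> None:
        self.inbox.append(msg)


# A fault maps a message to a (possibly altered) message, or to None to drop it.
Fault = Callable[[Message], Optional[Message]]


class FaultInjector:
    """Exposes the same receive() interface as Node, so it can be inserted
    in front of a node without changing the node itself."""

    def __init__(self, target: Node, fault: Fault) -> None:
        self.target = target
        self.fault = fault

    def receive(self, msg: Message) -> None:
        faulty = self.fault(msg)
        if faulty is not None:  # None models an omission fault
            self.target.receive(faulty)


def omission(msg: Message) -> Optional[Message]:
    """Drop every message, emulating a sender that has gone silent."""
    return None


def corruption(msg: Message) -> Optional[Message]:
    """Alter the payload, emulating a processor that sends wrong values."""
    return replace(msg, payload=msg.payload + 1)
```

Because the injector and the node share one interface, a test harness can hand senders a `FaultInjector(node, corruption)` in place of `node` and then observe how the fault tolerance mechanism reacts.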
The proposed fault injection method has been applied to test software-implemented
reliable node systems: a TMR (triple modular redundant) node and a fail-silent node.
The nodes have integrated fault tolerance mechanisms and are expected to exhibit
certain behaviour in the presence of a failure. The thesis describes how various such
mechanisms (for example, clock synchronisation protocol, and atomic broadcast
protocol) were tested. The testing revealed flaws in implementation that had not been
discovered before, thereby demonstrating the usefulness of the method. Application of
the approach to other distributed systems is also described in the thesis.
CEC ESPRIT programme,
UK Engineering and Physical Sciences Research Council (EPSRC)
Selective transparency in distributed transaction processing
PhD Thesis
Object-oriented programming languages provide a powerful interface for
programmers to access the mechanisms necessary for reliable distributed
computing. Using inheritance and polymorphism provided by the object model, it
is possible to develop a hierarchy of classes to capture the semantics and
inter-relationships of various levels of functionality required for distributed
transaction processing. Using multiple inheritance, application developers can
selectively apply transaction properties to suit the requirements of the application
objects.
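
The selective use of multiple inheritance described above might look roughly like the sketch below. This is an assumption-laden illustration, not the class hierarchy developed in the thesis: `Recoverable`, `Lockable`, `Account` and `Counter` are hypothetical names, and recovery is reduced to restoring an in-memory snapshot.

```python
import copy
import threading


class Recoverable:
    """Hypothetical mixin: snapshot named fields so an abort can restore them."""

    recoverable_fields: tuple = ()

    def begin(self) -> None:
        self._snapshot = {
            name: copy.deepcopy(getattr(self, name))
            for name in self.recoverable_fields
        }

    def abort(self) -> None:
        for name, value in self._snapshot.items():
            setattr(self, name, value)


class Lockable:
    """Hypothetical mixin: serialise access with a per-object lock."""

    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        self._lock = threading.RLock()

    def locked(self) -> threading.RLock:
        return self._lock


class Account(Lockable, Recoverable):
    """Needs both concurrency control and recovery, so inherits both mixins."""

    recoverable_fields = ("balance",)

    def __init__(self, balance: int) -> None:
        super().__init__()
        self.balance = balance

    def withdraw(self, amount: int) -> None:
        with self.locked():
            self.balance -= amount


class Counter(Recoverable):
    """Needs recovery only, so it does not inherit Lockable."""

    recoverable_fields = ("value",)

    def __init__(self) -> None:
        self.value = 0
```

Each application class picks only the properties it needs: `Account` pays for locking and recovery, while `Counter` takes recovery alone.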
In addition to the specific problems of (distributed) transaction processing in an
environment of persistent objects, there is a need for a unified framework, or
architecture in which to place this system. To be truly effective, not only the
transaction manager, but the entire transaction support environment must be
described, designed and implemented in terms of objects.
This thesis presents an architecture for reliable distributed processing in which
the management of persistence, provision of transaction properties (e.g.,
concurrency control), and organisation of support services (e.g., RPC) are all
gathered into a unified design based on the object model.
UK Science and Engineering Council:
ESPRIT project
Design and development of algorithms for fault tolerant distributed systems
PhD Thesis
This thesis describes the design and development of algorithms for fault
tolerant distributed systems. The development of such algorithms requires
making assumptions about the types of component faults for which tolerance
is to be provided. Such assumptions must be specified accurately. To
this end, this thesis develops a classification of faults in systems. This fault
classification identifies a range of fault types from the most restricted to the
least restricted. For each fault type, an algorithm for reaching distributed
agreement in the presence of a bounded number of faulty processors is
developed, and thus a family of agreement algorithms is presented. The
influence of the various fault types on the complexities of these algorithms
is discussed. Early stopping algorithms are also developed for selected fault
types and the influence of fault types on the early stopping conditions of the
respective algorithms is analysed. The problem of evaluating the performance
of distributed replicated systems which will require agreement algorithms
is considered next. As a first step in the direction of meeting this
challenging task, a pipeline triple modular redundant system is considered
and analytical methods are derived to evaluate the performance of such a
system. Finally, the accuracy of these methods is examined using computer
simulations.
UK Science and Engineering Research Council (SERC),
DELTA-4 consortium of ESPRIT
Integrating safety analysis techniques, supporting identification of common cause failures.
When we apply safety analysis techniques to a new design, our primary objective is to [...] malfunctions. The ultimate aim is to identify weak areas of the design and stimulate design iterations that improve the safety of the system under examination. Unfortunately, current industrial practice shows that this aim is seriously hindered by the lack of appropriate techniques for the analysis of complex hierarchical designs.
A characterisation of faults in systems
SIGLE
Available from British Library Document Supply Centre (BLDSC), DSC:8724.9(206)
GB, United Kingdom