7 research outputs found
An automated wrapper-based approach to the design of dependable software
The design of dependable software systems invariably comprises two main activities: (i) the design of dependability mechanisms, and (ii) the location of dependability mechanisms. It has been shown that these activities are intrinsically difficult. In this paper we propose an automated wrapper-based methodology to circumvent the problems associated with the design and location of dependability mechanisms. To achieve this we replicate important variables so that they can be used as part of standard, efficient dependability mechanisms. These well-understood mechanisms are then deployed in all relevant locations. To validate the proposed methodology we apply it to three complex software systems, evaluating the dependability enhancement and execution overhead in each case. The results generated demonstrate that the system failure rate of a wrapped software system can be several orders of magnitude lower than that of an unwrapped equivalent
The Degree of Masking Fault Tolerance vs. Temporal Redundancy
Abstract-Self-stabilizing systems, intended to run for a long time, commonly have to cope with transient faults during their mission. We model the behavior of a distributed self-stabilizing system under such a fault model as a Markov chain. Adding fault detection to a self-correcting non-masking fault tolerant system is required to progress from non-masking systems towards their masking fault tolerant functional equivalents. We introduce a novel measure, called limiting window availability (LWA) and apply it on self-stabilizing systems in order to quantify the probability of (masked) stabilization against the time that is needed for stabilization. We show how to calculate LWA based on Markov chains: first, by a straightforward Markov chain modeling and second, by using a suitable abstraction resulting in a space-reduced Markov chain. The proposed abstraction can in particular be applied to spot fault tolerance bottlenecks in the system design
Preserving Stabilization while Practically Bounding State Space
Stabilization is a key dependability property for dealing with unanticipated
transient faults, as it guarantees that even in the presence of such faults,
the system will recover to states where it satisfies its specification. One of
the desirable attributes of stabilization is the use of bounded space for each
variable. In this paper, we present an algorithm that transforms a stabilizing
program that uses variables with unbounded domain into a stabilizing program
that uses bounded variables and (practically bounded) physical time. While
non-stabilizing programs (that do not handle transient faults) can deal with
unbounded variables by assigning large enough but bounded space, stabilizing
programs that need to deal with arbitrary transient faults cannot do the same
since a transient fault may corrupt the variable to its maximum value. We show
that our transformation algorithm is applicable to several problems including
logical clocks, vector clocks, mutual exclusion, leader election, diffusing
computations, Paxos based consensus, and so on. Moreover, our approach can also
be used to bound counters used in an earlier work by Katz and Perry for adding
stabilization to a non-stabilizing program. By combining our algorithm with
that earlier work by Katz and Perry, it would be possible to provide
stabilization for a rich class of problems, by assigning large enough but
bounded space for variables.Comment: Moved some content from the Appendix to the main paper, added some
details to the transformation algorithm and to its descriptio
On Systematic Design of Fast and Perfect Detectors
We present a theory of fast and perfect detector components that extends the theory of detectors and correctors of Arora and Kulkarni, and based on which, we develop an algorithm that automatically transforms a fault-intolerant program into a fail-safe fault-tolerant program. Apart from presenting novel insights into the working principles of detectors, the theory also allows the definition of a detection latency efficiency metric for a fail-safe fault-tolerant program. We prove that in contrast to an earlier algorithm by Kulkarni and Arora, our algorithm produces fail-safe fault-tolerant programs with optimal detection latency. The application area of our results is in the domain of distributed embedded applications
Designing Masking Fault-tolerance via Nonmasking Fault-tolerance
Masking fault-tolerance guarantees that programs continually satisfy their specification in the presence of faults. By way of contrast, nonmasking fault-tolerance does not guarantee as much: it merely guarantees that when faults stop occurring, program executions converge to states from where programs continually (re)satisfy their specification. We present in this paper a component based method for the design of masking faulttolerant programs. In this method, components are added to a fault-intolerant program in a stepwise manner, first, to transform the fault-intolerant program into a nonmasking fault-tolerant one and, then, to enhance the fault-tolerance from nonmasking to masking. We illustrate the method by designing programs for agreement in the presence of Byzantine faults, data transfer in the presence of message loss, triple modular redundancy in the presence of input corruption, and mutual exclusion in the presence of process fail-stops. These examples also serve t..
On the Limits and Practice of Automatically Designing Self-Stabilization
A protocol is said to be self-stabilizing when the distributed system executing it is guaranteed to recover from any fault that does not cause permanent damage. Designing such protocols is hard since they must recover from all possible states, therefore we investigate how feasible it is to synthesize them automatically. We show that synthesizing stabilization on a fixed topology is NP-complete in the number of system states. When a solution is found, we further show that verifying its correctness on a general topology (with any number of processes) is undecidable, even for very simple unidirectional rings. Despite these negative results, we develop an algorithm to synthesize a self-stabilizing protocol given its desired topology, legitimate states, and behavior. By analogy to shadow puppetry, where a puppeteer may design a complex puppet to cast a desired shadow, a protocol may need to be designed in a complex way that does not even resemble its specification. Our shadow/puppet synthesis algorithm addresses this concern and, using a complete backtracking search, has automatically designed 4 new self-stabilizing protocols with minimal process space requirements: 2-state maximal matching on bidirectional rings, 5-state token passing on unidirectional rings, 3-state token passing on bidirectional chains, and 4-state orientation on daisy chains