7 research outputs found

    An automated wrapper-based approach to the design of dependable software

    Get PDF
    The design of dependable software systems invariably comprises two main activities: (i) the design of dependability mechanisms, and (ii) the location of dependability mechanisms. It has been shown that these activities are intrinsically difficult. In this paper we propose an automated wrapper-based methodology to circumvent the problems associated with the design and location of dependability mechanisms. To achieve this we replicate important variables so that they can be used as part of standard, efficient dependability mechanisms. These well-understood mechanisms are then deployed in all relevant locations. To validate the proposed methodology we apply it to three complex software systems, evaluating the dependability enhancement and execution overhead in each case. The results generated demonstrate that the system failure rate of a wrapped software system can be several orders of magnitude lower than that of an unwrapped equivalent

    The Degree of Masking Fault Tolerance vs. Temporal Redundancy

    Get PDF
    Abstract-Self-stabilizing systems, intended to run for a long time, commonly have to cope with transient faults during their mission. We model the behavior of a distributed self-stabilizing system under such a fault model as a Markov chain. Adding fault detection to a self-correcting non-masking fault tolerant system is required to progress from non-masking systems towards their masking fault tolerant functional equivalents. We introduce a novel measure, called limiting window availability (LWA) and apply it on self-stabilizing systems in order to quantify the probability of (masked) stabilization against the time that is needed for stabilization. We show how to calculate LWA based on Markov chains: first, by a straightforward Markov chain modeling and second, by using a suitable abstraction resulting in a space-reduced Markov chain. The proposed abstraction can in particular be applied to spot fault tolerance bottlenecks in the system design

    Preserving Stabilization while Practically Bounding State Space

    Full text link
    Stabilization is a key dependability property for dealing with unanticipated transient faults, as it guarantees that even in the presence of such faults, the system will recover to states where it satisfies its specification. One of the desirable attributes of stabilization is the use of bounded space for each variable. In this paper, we present an algorithm that transforms a stabilizing program that uses variables with unbounded domain into a stabilizing program that uses bounded variables and (practically bounded) physical time. While non-stabilizing programs (that do not handle transient faults) can deal with unbounded variables by assigning large enough but bounded space, stabilizing programs that need to deal with arbitrary transient faults cannot do the same since a transient fault may corrupt the variable to its maximum value. We show that our transformation algorithm is applicable to several problems including logical clocks, vector clocks, mutual exclusion, leader election, diffusing computations, Paxos based consensus, and so on. Moreover, our approach can also be used to bound counters used in an earlier work by Katz and Perry for adding stabilization to a non-stabilizing program. By combining our algorithm with that earlier work by Katz and Perry, it would be possible to provide stabilization for a rich class of problems, by assigning large enough but bounded space for variables.Comment: Moved some content from the Appendix to the main paper, added some details to the transformation algorithm and to its descriptio

    On Systematic Design of Fast and Perfect Detectors

    Get PDF
    We present a theory of fast and perfect detector components that extends the theory of detectors and correctors of Arora and Kulkarni, and based on which, we develop an algorithm that automatically transforms a fault-intolerant program into a fail-safe fault-tolerant program. Apart from presenting novel insights into the working principles of detectors, the theory also allows the definition of a detection latency efficiency metric for a fail-safe fault-tolerant program. We prove that in contrast to an earlier algorithm by Kulkarni and Arora, our algorithm produces fail-safe fault-tolerant programs with optimal detection latency. The application area of our results is in the domain of distributed embedded applications

    Designing Masking Fault-tolerance via Nonmasking Fault-tolerance

    No full text
    Masking fault-tolerance guarantees that programs continually satisfy their specification in the presence of faults. By way of contrast, nonmasking fault-tolerance does not guarantee as much: it merely guarantees that when faults stop occurring, program executions converge to states from where programs continually (re)satisfy their specification. We present in this paper a component based method for the design of masking faulttolerant programs. In this method, components are added to a fault-intolerant program in a stepwise manner, first, to transform the fault-intolerant program into a nonmasking fault-tolerant one and, then, to enhance the fault-tolerance from nonmasking to masking. We illustrate the method by designing programs for agreement in the presence of Byzantine faults, data transfer in the presence of message loss, triple modular redundancy in the presence of input corruption, and mutual exclusion in the presence of process fail-stops. These examples also serve t..

    On the Limits and Practice of Automatically Designing Self-Stabilization

    Get PDF
    A protocol is said to be self-stabilizing when the distributed system executing it is guaranteed to recover from any fault that does not cause permanent damage. Designing such protocols is hard since they must recover from all possible states, therefore we investigate how feasible it is to synthesize them automatically. We show that synthesizing stabilization on a fixed topology is NP-complete in the number of system states. When a solution is found, we further show that verifying its correctness on a general topology (with any number of processes) is undecidable, even for very simple unidirectional rings. Despite these negative results, we develop an algorithm to synthesize a self-stabilizing protocol given its desired topology, legitimate states, and behavior. By analogy to shadow puppetry, where a puppeteer may design a complex puppet to cast a desired shadow, a protocol may need to be designed in a complex way that does not even resemble its specification. Our shadow/puppet synthesis algorithm addresses this concern and, using a complete backtracking search, has automatically designed 4 new self-stabilizing protocols with minimal process space requirements: 2-state maximal matching on bidirectional rings, 5-state token passing on unidirectional rings, 3-state token passing on bidirectional chains, and 4-state orientation on daisy chains
    corecore