308 research outputs found
Recommended from our members
Assessing Asymmetric Fault-Tolerant Software
The most popular forms of fault tolerance against design faults use "asymmetric" architectures in which a "primary" part performs the computation and a "secondary" part is in charge of detecting errors and performing some kind of error processing and recovery. In contrast, the most studied forms of software fault tolerance are "symmetric" ones, e.g. N-version programming. The latter are often controversial, the former are not. We discuss how to assess the dependability gains achieved by these methods. Substantial difficulties have been shown to exist for symmetric schemes, but we show that the same difficulties affect asymmetric schemes. Indeed, the latter present somewhat subtler problems. In both cases, to predict the dependability of the fault-tolerant system it is not enough to know the dependability of the individual components. We extend to asymmetric architectures the style of probabilistic modeling that has been useful for describing the dependability of "symmetric" architectures, to highlight factors that complicate the assessment. In the light of these models, we finally discuss fault injection approaches to estimating coverage factors. We highlight the limits of what can be predicted and some useful research directions towards clarifying and extending the range of situations in which estimates of coverage of fault tolerance mechanisms can be trusted
Recommended from our members
An Empirical Study of the Effectiveness of 'Forcing Diversity' Based on a Large Population of Diverse Programs
Use of diverse software components is a viable defence against common-mode failures in redundant softwarebased systems. Various forms of "Diversity-Seeking Decisions" (“DSDs”) can be applied to the process of developing, or procuring, redundant components, to improve the chances of the resulting components not failing on the same demands. An open question is how effective these decisions, and their combinations, are for achieving large enough reliability gains. Using a large population of software programs, we studied experimentally the effectiveness of specific "DSDs" (and their combinations) mandating differences between redundant components. Some of these combinations produced much better improvements in system probability of failure per demand (PFD) than "uncontrolled" diversity did. Yet, our findings suggest that the gains from such "DSDs" vary significantly between them and between the application problems studied. The relationship between DSDs and system PFD is complex and does not allow for simple universal rules
(e.g. "the more diversity the better") to apply
Recommended from our members
Assessing the reliability of diverse fault-tolerant software-based systems
We discuss a problem in the safety assessment of automatic control and protection systems. There is an increasing dependence on software for performing safety-critical functions, like the safety shut-down of dangerous plants. Software brings increased risk of design defects and thus systematic failures; redundancy with diversity between redundant channels is a possible defence. While diversity techniques can improve the dependability of software-based systems, they do not alleviate the difficulties of assessing whether such a system is safe enough for operation. We study this problem for a simple safety protection system consisting of two diverse channels performing the same function. The problem is evaluating its probability of failure in demand. Assuming failure independence between dangerous failures of the channels is unrealistic. One can instead use evidence from the observation of the whole system's behaviour under realistic test conditions. Standard inference procedures can then estimate system reliability, but they take no advantage of a system’s fault-tolerant structure. We show how to extend these techniques to take account of fault tolerance by a conceptually straightforward application of Bayesian inference. Unfortunately, the method is computationally complex and requires the conceptually difficult step of specifying 'prior' distributions for the parameters of interest. This paper presents the correct inference procedure, exemplifies possible pitfalls in its application and clarifies some non-intuitive issues about reliability assessment for fault-tolerant software
Recommended from our members
A note on reliability estimation of functionally diverse systems
It has been argued that functional diversity might be a plausible means of claiming independence of failures between two versions of a system. We present a model of functional diversity, in the spirit of earlier models of diversity such as those of Eckhardt and Lee, and Hughes. In terms of the model, we show that the claims for independence between functionally diverse systems seem rather unrealistic. Instead, it seems likely that functionally diverse systems will exhibit positively correlated failures, and thus will be less reliable than an assumption of independence would suggest. The result does not, of course, suggest that functional diversity is not worthwhile; instead, it places upon the evaluator of such a system the onus to estimate the degree of dependence so as to evaluate the reliability of the system
Choosing effective methods for design diversity - How to progress from intuition to science
Design diversity is a popular defence against design faults in safety critical systems. Design diversity is at times pursued by simply isolating the development teams of the different versions, but it is presumably better to "force" diversity, by appropriate prescriptions to the teams. There are many ways of forcing diversity. Yet, managers who have to choose a cost-effective combination of these have little guidance except their own intuition. We argue the need for more scientifically based recommendations, and outline the problems with producing them. We focus on what we think is the standard basis for most recommendations: the belief that, in order to produce failure diversity among versions, project decisions should aim at causing "diversity" among the faults in the versions. We attempt to clarify what these beliefs mean, in which cases they may be justified and how they can be checked or disproved experimentally
Assessing the Reliability of Diverse Fault-Tolerant Systems
Design diversity between redundant channels is a way of improving the dependability of software-based systems, but it does not alleviate the difficulties of dependability assessment
Recommended from our members
How to discriminate between computer-aided and computer-hindered decisions: a case study in mammography
Background. Computer aids can affect decisions in complex ways, potentially even making them worse; common assessment methods may miss these effects. We developed a method for estimating the quality of decisions, as well as how computer aids affect it, and applied it to computer-aided detection (CAD) of cancer, reanalyzing data from a published study where 50 professionals (“readers”) interpreted 180 mammograms, both with and without computer support.
Method. We used stepwise regression to estimate how CAD affected the probability of a reader making a correct screening decision on a patient with cancer (sensitivity), thereby taking into account the effects of the difficulty of the cancer (proportion of readers who missed it) and the reader’s discriminating ability (Youden’s determinant). Using regression estimates, we obtained thresholds for classifying a posteriori the cases (by difficulty) and the readers (by discriminating ability).
Results. Use of CAD was associated with a 0.016 increase in sensitivity (95% confidence interval [CI], 0.003–0.028) for the 44 least discriminating radiologists for 45 relatively easy, mostly CAD-detected cancers. However, for the 6 most discriminating radiologists, with CAD, sensitivity decreased by 0.145 (95% CI, 0.034–0.257) for the 15 relatively difficult cancers.
Conclusions. Our exploratory analysis method reveals unexpected effects. It indicates that, despite the original study detecting no significant average effect, CAD helped the less discriminating readers but hindered the more discriminating readers. Such differential effects, although subtle, may be clinically significant and important for improving both computer algorithms and protocols for their use. They should be assessed when evaluating CAD and similar warning systems
On Systematic Design of Protectors for Employing OTS Items
Off-the-shelf (OTS) components are increasingly used in application areas with stringent dependability requirements. Component wrapping is a well known structuring technique used in many areas. We propose a general approach to developing protective wrappers that assist in integrating OTS items with a focus on the overall system dependability. The wrappers are viewed as redundant software used to detect errors or suspicious activity and to execute appropriate recovery when possible; wrapper development is considered as a part of system integration activities. Wrappers are to be rigorously specified and executed at run time as a means of protecting OTS items against faults in the rest of the system, and the system against the OTS item's faults. Possible symptoms of erroneous behaviour to be detected by a protective wrapper and possible actions to be undertaken in response are listed and discussed. The information required for wrapper development is provided by traceability analysis. Possible approaches to implementing “protectors” in the standard current component technologies are briefly outline
Recommended from our members
Modeling the effects of combining diverse software fault detection techniques
The software engineering literature contains many studies of the efficacy of fault finding techniques. Few of these, however, consider what happens when several different techniques are used together. We show that the effectiveness of such multitechnique approaches depends upon quite subtle interplay between their individual efficacies and dependence between them. The modelling tool we use to study this problem is closely related to earlier work on software design diversity. The earliest of these results showed that, under quite plausible assumptions, it would be unreasonable even to expect software versions that were developed ‘truly independently’ to fail independently of one another. The key idea here was a ‘difficulty function’ over the input space. Later work extended these ideas to introduce a notion of ‘forced’ diversity, in which it became possible to obtain system failure behaviour better even than could be expected if the versions failed independently. In this paper we show that many of these results for design diversity have counterparts in diverse fault detection in a single software version. We define measures of fault finding effectiveness, and of diversity, and show how these might be used to give guidance for the optimal application of different fault finding procedures to a particular program. We show that the effects upon reliability of repeated applications of a particular fault finding procedure are not statistically independent - in fact such an incorrect assumption of independence will always give results that are too optimistic. For diverse fault finding procedures, on the other hand, things are different: here it is possible for effectiveness to be even greater than it would be under an assumption of statistical independence. We show that diversity of fault finding procedures is, in a precisely defined way, ‘a good thing’, and should be applied as widely as possible. The new model and its results are illustrated using some data from an experimental investigation into diverse fault finding on a railway signalling application
- …
