59 research outputs found
Software reliability through fault-avoidance and fault-tolerance
Twenty independently developed but functionally equivalent software versions were used to investigate and compare empirically some properties of N-version programming, Recovery Block, and Consensus Recovery Block, using the majority and consensus voting algorithms. This was also compared with another hybrid fault-tolerant scheme called Acceptance Voting, using dynamic versions of consensus and majority voting. Consensus voting provides adaptation of the voting strategy to varying component reliability, failure correlation, and output space characteristics. Since failure correlation among versions effectively reduces the cardinality of the space in which the voter make decisions, consensus voting is usually preferable to simple majority voting in any fault-tolerant system. When versions have considerably different reliabilities, the version with the best reliability will perform better than any of the fault-tolerant techniques
Software reliability through fault-avoidance and fault-tolerance
The use of back-to-back, or comparison, testing for regression test or porting is examined. The efficiency and the cost of the strategy is compared with manual and table-driven single version testing. Some of the key parameters that influence the efficiency and the cost of the approach are the failure identification effort during single version program testing, the extent of implemented changes, the nature of the regression test data (e.g., random), and the nature of the inter-version failure correlation and fault-masking. The advantages and disadvantages of the technique are discussed, together with some suggestions concerning its practical use
Multiversion software reliability through fault-avoidance and fault-tolerance
In this project we have proposed to investigate a number of experimental and theoretical issues associated with the practical use of multi-version software in providing dependable software through fault-avoidance and fault-elimination, as well as run-time tolerance of software faults. In the period reported here we have working on the following: We have continued collection of data on the relationships between software faults and reliability, and the coverage provided by the testing process as measured by different metrics (including data flow metrics). We continued work on software reliability estimation methods based on non-random sampling, and the relationship between software reliability and code coverage provided through testing. We have continued studying back-to-back testing as an efficient mechanism for removal of uncorrelated faults, and common-cause faults of variable span. We have also been studying back-to-back testing as a tool for improvement of the software change process, including regression testing. We continued investigating existing, and worked on formulation of new fault-tolerance models. In particular, we have partly finished evaluation of Consensus Voting in the presence of correlated failures, and are in the process of finishing evaluation of Consensus Recovery Block (CRB) under failure correlation. We find both approaches far superior to commonly employed fixed agreement number voting (usually majority voting). We have also finished a cost analysis of the CRB approach
Recommended from our members
Conservative bounds for the pfd of a 1-out-of-2 software-based system based on an assessor’s subjective probability of "not worse than independence"
We consider the problem of assessing the reliability of a 1-out-of-2 software-based system, in which failures of the two channels cannot be assumed to be independent with certainty. An informal approach to this problem assesses the channel pfds (probabilities of failure on demand) conservatively and then multiplies these together in the hope that the conservatism will be sufficient to overcome any possible dependence between the channel failures. Our intention here is to place this kind of reasoning on a formal footing. We introduce a notion of “not worse than independence” and assume that an assessor has a prior belief about this, expressed as a probability. We obtain a conservative prior system pfd, and show how a conservative posterior system pfd can be obtained following the observation of a number of demands without system failure. We present some illustrative numerical examples, discuss some of the difficulties involved in this way of reasoning, and suggest some avenues of future research
Recommended from our members
Conservative bounds for the pfd of a 1-out-of-2 software-based system based on an assessor’s subjective probability of “not worse than independence”
We consider the problem of assessing the reliability of a 1-out-of-2 software-based system, in which failures of the two channels cannot be assumed to be independent with certainty. An informal approach to this problem assesses the channel pfds (probabilities of failure on demand) conservatively and then multiplies these together in the hope that the conservatism will be sufficient to overcome any possible dependence between the channel failures. Our intention here is to place this kind of reasoning on a formal footing. We introduce a notion of “not worse than independence” and assume that an assessor has a prior belief about this, expressed as a probability. We obtain a conservative prior system pfd, and show how a conservative posterior system pfd can be obtained following the observation of a number of demands without system failure. We present some illustrative numerical examples, discuss some of the difficulties involved in this way of reasoning, and suggest some avenues of future research
Recommended from our members
An Empirical Study of the Effectiveness of 'Forcing Diversity' Based on a Large Population of Diverse Programs
Use of diverse software components is a viable defence against common-mode failures in redundant softwarebased systems. Various forms of "Diversity-Seeking Decisions" (“DSDs”) can be applied to the process of developing, or procuring, redundant components, to improve the chances of the resulting components not failing on the same demands. An open question is how effective these decisions, and their combinations, are for achieving large enough reliability gains. Using a large population of software programs, we studied experimentally the effectiveness of specific "DSDs" (and their combinations) mandating differences between redundant components. Some of these combinations produced much better improvements in system probability of failure per demand (PFD) than "uncontrolled" diversity did. Yet, our findings suggest that the gains from such "DSDs" vary significantly between them and between the application problems studied. The relationship between DSDs and system PFD is complex and does not allow for simple universal rules
(e.g. "the more diversity the better") to apply
Reasoning About the Reliability of Multi-version, Diverse Real-Time Systems
This paper is concerned with the development of reliable real-time systems for use in high integrity applications. It advocates the use of diverse replicated channels, but does not require the dependencies between the channels to be evaluated. Rather it develops and extends the approach of Little wood and Rush by (for general systems) by investigating a two channel system in which one channel, A, is produced to a high level of reliability (i.e. has a very low failure rate), while the other, B, employs various forms of static analysis to sustain an argument that it is perfect (i.e. it will never miss a deadline). The first channel is fully functional, the second contains a more restricted computational model and contains only the critical computations. Potential dependencies between the channels (and their verification) are evaluated in terms of aleatory and epistemic uncertainty. At the aleatory level the events ''A fails" and ''B is imperfect" are independent. Moreover, unlike the general case, independence at the epistemic level is also proposed for common forms of implementation and analysis for real-time systems and their temporal requirements (deadlines). As a result, a systematic approach is advocated that can be applied in a real engineering context to produce highly reliable real-time systems, and to support numerical claims about the level of reliability achieved
SOFTWARE RELIABILITY SIMULATION: PROCESS, APPROACHES AND METHODOLOGY
Reliability is probably the most crucial factor to put ones hand up for in any engineering process. Quantitatively, reliability gives a measure (quantity) of quality, and the quantity can be properly engineered using appropriate reliability engineering process. Software Reliability Modeling has been one of the much-attracted research domains in Software Reliability Engineering, to estimate the current state as well as predict the future state of the software system reliability. This paper aims to raise awareness about the usefulness and importance of simulation in support of software reliability modeling and engineering. Simulation can be applied in many critical and touchy areas and enables one to address issues before they these issues become problems. This paper brings to fore some key concepts in simulation-based software reliability modeling. This paper suggests that the software engineering community could exploit simulation to much greater advantage which include cutting down on software development costs, improving reliability, narrowing down the gestation period of software development, fore-seeing the software development process and the software product itself and so on
Recommended from our members
Conservative reasoning about epistemic uncertainty for the probability of failure on demand of a 1-out-of-2 software-based system in which one channel is “possibly perfect”
In earlier work, (Littlewood and Rushby 2012) (henceforth LR), an analysis was presented of a 1-out-of-2 software-based system in which one channel was “possibly perfect”. It was shown that, at the aleatory level, the system pfd (probability of failure on demand) could be bounded above by the product of the pfd of channel A and the pnp (probability of non-perfection) of channel B. This result was presented as a way of avoiding the well-known difficulty that for two certainly-fallible channels, failures of the two will be dependent, i.e. the system pfd cannot be expressed simply as a product of the channel pfds. A price paid in this new approach for avoiding the issue of failure dependence is that the result is conservative. Furthermore, a complete analysis requires that account be taken of epistemic uncertainty – here concerning the numeric values of the two parameters pfdA and pnpB. Unfortunately this introduces a different difficult problem of dependence: estimating the dependence between an assessor’s beliefs about the parameters. The work reported here avoids this problem by obtaining results that require only an assessor’s marginal beliefs about the individual channels, i.e. they do not require knowledge of the dependence between these beliefs. The price paid is further conservatism in the results
Reasoning about the Reliability of Diverse Two-Channel Systems in which One Channel is "Possibly Perfect"
This paper considers the problem of reasoning about the reliability of fault-tolerant systems with two "channels" (i.e., components) of which one, A, supports only a claim of reliability, while the other, B, by virtue of extreme simplicity and extensive analysis, supports a plausible claim of "perfection." We begin with the case where either channel can bring the system to a safe state. We show that, conditional upon knowing pA (the probability that A fails on a randomly selected demand) and pB (the probability that channel B is imperfect), a conservative bound on the probability that the system fails on a randomly selected demand is simply pA.pB. That is, there is conditional independence between the events "A fails" and "B is imperfect." The second step of the reasoning involves epistemic uncertainty about (pA, pB) and we show that under quite plausible assumptions, a conservative bound on system pfd can be constructed from point estimates for just three parameters. We discuss the feasibility of establishing credible estimates for these parameters. We extend our analysis from faults of omission to those of commission, and then combine these to yield an analysis for monitored architectures of a kind proposed for aircraft
- …