
    Software reliability through fault-avoidance and fault-tolerance

    Twenty independently developed but functionally equivalent software versions were used to empirically investigate and compare properties of N-version programming, Recovery Block, and Consensus Recovery Block, using the majority and consensus voting algorithms. These were also compared with another hybrid fault-tolerant scheme, Acceptance Voting, using dynamic versions of consensus and majority voting. Consensus voting adapts the voting strategy to varying component reliability, failure correlation, and output space characteristics. Since failure correlation among versions effectively reduces the cardinality of the space in which the voter makes decisions, consensus voting is usually preferable to simple majority voting in any fault-tolerant system. When versions have considerably different reliabilities, the version with the best reliability will perform better than any of the fault-tolerant techniques.
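    A minimal sketch of the two voters compared above, assuming version outputs are hashable values (the function names and the five-version example are illustrative, not from the study): majority voting returns an answer only when more than half of the versions agree, while consensus voting settles for the largest agreeing group, breaking ties at random.

```python
from collections import Counter
import random

def majority_vote(outputs):
    """Return the value produced by a strict majority (> N/2) of the
    N versions, or None when no such majority exists."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None

def consensus_vote(outputs):
    """Return the value of the largest agreeing group of versions
    (plurality); ties between equally large groups break at random."""
    counts = Counter(outputs)
    best = max(counts.values())
    return random.choice([v for v, c in counts.items() if c == best])

outputs = ["a", "a", "b", "c", "c"]   # five hypothetical version outputs
print(majority_vote(outputs))   # None: no value has more than 2 of 5 votes
print(consensus_vote(outputs))  # "a" or "c": consensus still decides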

    Study of a unified hardware and software fault-tolerant architecture

    A unified architectural concept, called the Fault Tolerant Processor Attached Processor (FTP-AP), that can tolerate hardware as well as software faults is proposed for applications requiring ultrareliable computation capability. An emulation of the FTP-AP architecture, consisting of a breadboard Motorola 68010-based quadruply redundant Fault Tolerant Processor, four VAX 750s as attached processors, and four versions of a transport aircraft yaw damper control law, is used as a testbed in the AIRLAB to examine a number of critical issues. Solutions to several basic problems associated with N-Version software are proposed and implemented on the testbed. These include a confidence voter to resolve coincident errors in N-Version software. A reliability model of N-Version software, based upon the recent understanding of software failure mechanisms, is also developed. The basic FTP-AP architectural concept appears suitable for hosting N-Version application software while at the same time tolerating hardware failures. Architectural enhancements for greater efficiency, software reliability modeling, and N-Version issues that merit further research are identified.
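    The abstract does not specify the confidence voter's internals. As one plausible illustration of the idea only, and not the FTP-AP voter itself, the sketch below weights each version's output by a per-version confidence estimate, so that several low-confidence versions producing a coincident error need not outvote a single trusted one; all names and numbers are hypothetical.

```python
def weighted_vote(outputs, weights):
    """Pick the output whose supporters carry the largest total
    confidence weight. Illustrative only: the weights stand in for
    per-version reliability estimates; this is not the FTP-AP
    confidence voter, whose details the abstract does not give."""
    totals = {}
    for out, w in zip(outputs, weights):
        totals[out] = totals.get(out, 0.0) + w
    return max(totals, key=totals.get)

# Three low-confidence versions coincide on "x"; one trusted version
# says "y". Plain majority would pick "x"; the weighted voter does not.
print(weighted_vote(["x", "x", "x", "y"], [0.2, 0.2, 0.2, 0.9]))  # -> y
```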

    Multiversion software reliability through fault-avoidance and fault-tolerance

    In this project we have proposed to investigate a number of experimental and theoretical issues associated with the practical use of multi-version software in providing dependable software through fault-avoidance and fault-elimination, as well as run-time tolerance of software faults. In the period reported here we have been working on the following. We have continued collecting data on the relationships between software faults and reliability, and on the coverage provided by the testing process as measured by different metrics (including data flow metrics). We have continued work on software reliability estimation methods based on non-random sampling, and on the relationship between software reliability and the code coverage provided through testing. We have continued studying back-to-back testing as an efficient mechanism for the removal of uncorrelated faults and of common-cause faults of variable span, and as a tool for improving the software change process, including regression testing. We have continued investigating existing fault-tolerance models and worked on formulating new ones. In particular, we have partly finished an evaluation of Consensus Voting in the presence of correlated failures, and are in the process of finishing an evaluation of Consensus Recovery Block (CRB) under failure correlation. We find both approaches far superior to commonly employed fixed-agreement-number voting (usually majority voting). We have also finished a cost analysis of the CRB approach.
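    A sketch of the back-to-back testing idea discussed above (the function names and the seeded bug are illustrative): two versions run on the same inputs, and any disagreement flags a fault in at least one of them. Identical wrong answers, i.e. correlated faults, escape the check, which is why the abstract treats them separately.

```python
def back_to_back_test(version_a, version_b, test_inputs):
    """Run two independently developed versions on the same inputs and
    report every input on which their outputs disagree; each
    disagreement indicates a fault in at least one version."""
    discrepancies = []
    for x in test_inputs:
        out_a, out_b = version_a(x), version_b(x)
        if out_a != out_b:
            discrepancies.append((x, out_a, out_b))
    return discrepancies

# Hypothetical second version of an absolute-value routine with a
# seeded bug: it forgets to negate negative inputs.
buggy_abs = lambda x: x if x >= 0 else x
print(back_to_back_test(abs, buggy_abs, range(-2, 3)))
# [(-2, 2, -2), (-1, 1, -1)]
```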

    Self-stabilising Byzantine Clock Synchronisation is Almost as Easy as Consensus

    We give fault-tolerant algorithms for establishing synchrony in distributed systems in which each of the $n$ nodes has its own clock. Our algorithms operate in a very strong fault model: we require self-stabilisation, i.e., the initial state of the system may be arbitrary, and there can be up to $f < n/3$ ongoing Byzantine faults, i.e., nodes that deviate from the protocol in an arbitrary manner. Furthermore, we assume that the local clocks of the nodes may progress at different speeds (clock drift) and communication has bounded delay. In this model, we study the pulse synchronisation problem, where the task is to guarantee that eventually all correct nodes generate well-separated local pulse events (i.e., unlabelled logical clock ticks) in a synchronised manner. Compared to prior work, we achieve exponential improvements in stabilisation time and the number of communicated bits, and give the first sublinear-time algorithm for the problem:
    - In the deterministic setting, the state-of-the-art solutions stabilise in time $\Theta(f)$ and have each node broadcast $\Theta(f \log f)$ bits per time unit. We exponentially reduce the number of bits broadcast per time unit to $\Theta(\log f)$ while retaining the same stabilisation time.
    - In the randomised setting, the state-of-the-art solutions stabilise in time $\Theta(f)$ and have each node broadcast $O(1)$ bits per time unit. We exponentially reduce the stabilisation time to $\log^{O(1)} f$ while each node broadcasts $\log^{O(1)} f$ bits per time unit.
    These results are obtained by means of a recursive approach reducing the above task of self-stabilising pulse synchronisation in the bounded-delay model to non-self-stabilising binary consensus in the synchronous model. In general, our approach introduces at most logarithmic overheads in terms of stabilisation time and broadcast bits over the underlying consensus routine.
    Comment: 54 pages. To appear in JACM; a preliminary version of this work appeared in DISC 201
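    The reduction above can plug in any non-self-stabilising synchronous binary consensus routine. As a taste of what such a routine looks like, here is a simulation of the simple phase-king algorithm; note this simple variant tolerates only $f < n/4$ Byzantine nodes, weaker than the optimal $f < n/3$ resilience the paper assumes, and the simulated faulty nodes send random bits rather than worst-case ones.

```python
import random
from collections import Counter

def phase_king(n, f, inputs, byzantine):
    """Simple phase-king binary consensus over a synchronous
    all-to-all network, tolerating f Byzantine nodes when n > 4f.
    `inputs` maps each correct node id to its input bit; `byzantine`
    is the set of faulty ids, modelled as random-bit senders."""
    v = dict(inputs)                     # current value per correct node
    correct = [i for i in range(n) if i not in byzantine]
    for king in range(f + 1):            # f+1 phases: some king is correct
        sent = dict(v)                   # snapshot: round-1 broadcasts
        strong = {}
        for i in correct:                # round 1: all-to-all exchange
            heard = [sent[j] if j not in byzantine else random.randint(0, 1)
                     for j in range(n)]
            (maj, cnt), = Counter(heard).most_common(1)
            v[i], strong[i] = maj, cnt > n // 2 + f
        for i in correct:                # round 2: weak nodes obey the king
            if not strong[i]:
                v[i] = v[king] if king not in byzantine else random.randint(0, 1)
    return {i: v[i] for i in correct}

random.seed(0)
n, f = 9, 2                              # n > 4f
byz = {3, 7}
bits = {i: random.randint(0, 1) for i in range(n) if i not in byz}
print(phase_king(n, f, bits, byz))       # all correct nodes hold one bit
```

    After any phase whose king is correct, all correct nodes agree, and agreement persists; with at most $f$ faulty nodes, one of the $f+1$ kings must be correct.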

    Quantum computing and the entanglement frontier - Rapporteur talk at the 25th Solvay Conference

    Quantum information science explores the frontier of highly complex quantum states, the "entanglement frontier". This study is motivated by the observation (widely believed but unproven) that classical systems cannot simulate highly entangled quantum systems efficiently, and we hope to hasten the day when well-controlled quantum systems can perform tasks surpassing what can be done in the classical world. One way to achieve such "quantum supremacy" would be to run an algorithm on a quantum computer which solves a problem with a super-polynomial speedup relative to classical computers, but there may be other ways that can be achieved sooner, such as simulating exotic quantum states of strongly correlated matter. To operate a large-scale quantum computer reliably we will need to overcome the debilitating effects of decoherence, which might be done using "standard" quantum hardware protected by quantum error-correcting codes, by exploiting the nonabelian quantum statistics of anyons realized in solid state systems, or by combining both methods. Only by challenging the entanglement frontier will we learn whether Nature provides extravagant resources far beyond what the classical world would allow.
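    The quantum error-correcting codes invoked above rest on the same redundancy-plus-decoding idea as classical codes. A classical caricature, not a quantum simulation: the 3-bit repetition code with majority decoding turns a physical flip probability $p$ into a logical error rate of $3p^2(1-p) + p^3$, an improvement whenever $p < 1/2$.

```python
import random

def logical_error_rate(p, trials=100_000):
    """Monte-Carlo estimate of the logical error rate of a 3-bit
    repetition code under independent bit flips with probability p:
    majority decoding fails only when 2 or 3 of the copies flip."""
    failures = 0
    for _ in range(trials):
        flips = sum(random.random() < p for _ in range(3))
        failures += flips >= 2
    return failures / trials

for p in (0.01, 0.1, 0.3):
    # analytic logical rate: 3p^2(1-p) + p^3, below p whenever p < 1/2
    print(p, logical_error_rate(p), 3 * p**2 * (1 - p) + p**3)
```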

    Are Condorcet and minimax voting systems the best?

    For decades, the minimax voting system was well known to experts on voting systems, but it was not widely considered to be one of the best. In recent years, however, two important experts, Nicolaus Tideman and Andrew Myers, have both recognized minimax as one of the best systems, and I agree. This paper presents my own reasons for preferring minimax. The paper explicitly discusses about 20 systems, though over 50 are known to exist.
    Comment: 41 pages, no figures. The Introduction has been changed. Also fixed some version 6 errors in referencing subsection numbers in section
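    For concreteness, a hedged sketch of the minimax (Simpson-Kramer) count in the margins variant, which the abstract does not fix: each candidate is scored by its worst pairwise defeat, and the smallest worst defeat wins, so a Condorcet winner (no defeats) always prevails. The ballot profile is invented to show a Condorcet cycle being resolved.

```python
def minimax_winner(ballots, candidates):
    """Minimax winner from ranked ballots (margins variant): score
    each candidate by its largest pairwise defeat margin and return
    the candidate whose worst defeat is smallest."""
    # prefer[a][b] = number of ballots ranking a above b
    prefer = {a: {b: 0 for b in candidates if b != a} for a in candidates}
    for ballot in ballots:                  # ballot: list, best first
        for i, a in enumerate(ballot):
            for b in ballot[i + 1:]:
                prefer[a][b] += 1
    worst = {a: max(prefer[b][a] - prefer[a][b]
                    for b in candidates if b != a)
             for a in candidates}
    return min(candidates, key=lambda a: worst[a])

# A beats B 6-3, B beats C 7-2, C beats A 5-4: a cycle with no
# Condorcet winner. A's worst defeat (by 1) is smallest, so A wins.
ballots = [["A", "B", "C"]] * 4 + [["B", "C", "A"]] * 3 + [["C", "A", "B"]] * 2
print(minimax_winner(ballots, ["A", "B", "C"]))   # -> A
```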

    Fault tolerant architectures for integrated aircraft electronics systems, task 2

    The architectural basis for an advanced fault-tolerant on-board computer to succeed the current generation of fault-tolerant computers is examined. The network error-tolerant system architecture is studied, with particular attention to intercluster configurations and communication protocols, and to refined reliability estimates. The diagnosis of faults, so that appropriate choices for reconfiguration can be made, is discussed. The analysis relates particularly to the recognition of transient faults in a system with tasks at many levels of priority. The demand-driven data-flow architecture, which appears to have possible application in fault-tolerant systems, is described, and work investigating the feasibility of automatic generation of aircraft flight control programs from abstract specifications is reported.
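    One standard way to recognise transient faults, in the spirit of count-and-threshold (alpha-count) schemes though not taken from this report: score each unit's errors with decay, and reconfigure a unit out only when its score shows persistence. All parameters below are illustrative.

```python
def make_fault_filter(decay=0.5, threshold=3.0):
    """Count-and-threshold transient-fault filter: each error bumps a
    score, each clean cycle decays it, and the unit is declared
    permanently faulty only once the score crosses the threshold."""
    score = 0.0
    def observe(error_this_cycle):
        nonlocal score
        score = score + 1.0 if error_this_cycle else score * decay
        return score >= threshold        # True -> treat as permanent
    return observe

f = make_fault_filter()
# Isolated transients never trip the filter...
print([f(e) for e in (True, False, True, False, True, False)])
# ...but a persistent fault does, on its third consecutive error here.
print([f(e) for e in (True, True, True)])
```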