20,792 research outputs found

    Fault-tolerant Resource Reasoning.

    Get PDF
    Abstract. Separation logic has been successful at verifying that programs do not crash due to illegal use of resources. The underlying assumption, however, is that machines do not fail. In practice, machines can fail unpredictably for various reasons, e.g. power loss, corrupting resources. Critical software, e.g. file systems, employ recovery methods to mitigate these effects. We introduce an extension of the Views framework to reason about such methods. We use concurrent separation logic as an instance of the framework to illustrate our reasoning, and explore programs using write-ahead logging, e.g. an ARIES recovery algorithm

    Unattended network operations technology assessment study. Technical support for defining advanced satellite systems concepts

    Get PDF
    The results are summarized of an unattended network operations technology assessment study for the Space Exploration Initiative (SEI). The scope of the work included: (1) identified possible enhancements due to the proposed Mars communications network; (2) identified network operations on Mars; (3) performed a technology assessment of possible supporting technologies based on current and future approaches to network operations; and (4) developed a plan for the testing and development of these technologies. The most important results obtained are as follows: (1) addition of a third Mars Relay Satellite (MRS) and MRS cross link capabilities will enhance the network's fault tolerance capabilities through improved connectivity; (2) network functions can be divided into the six basic ISO network functional groups; (3) distributed artificial intelligence technologies will augment more traditional network management technologies to form the technological infrastructure of a virtually unattended network; and (4) a great effort is required to bring the current network technology levels for manned space communications up to the level needed for an automated fault tolerance Mars communications network

    Expert System for UNIX System Reliability and Availability Enhancement

    Get PDF
    Highly reliable and available systems are critical to the airline industry. However, most off-the-shelf computer operating systems and hardware do not have built-in fault tolerant mechanisms, the UNIX workstation is one example. In this research effort, we have developed a rule-based Expert System (ES) to monitor, command, and control a UNIX workstation system with hot-standby redundancy. The ES on each workstation acts as an on-line system administrator to diagnose, report, correct, and prevent certain types of hardware and software failures. If a primary station is approaching failure, the ES coordinates the switch-over to a hot-standby secondary workstation. The goal is to discover and solve certain fatal problems early enough to prevent complete system failure from occurring and therefore to enhance system reliability and availability. Test results show that the ES can diagnose all targeted faulty scenarios and take desired actions in a consistent manner regardless of the sequence of the faults. The ES can perform designated system administration tasks about ten times faster than an experienced human operator. Compared with a single workstation system, our hot-standby redundancy system downtime is predicted to be reduced by more than 50 percent by using the ES to command and control the system

    Distributed Adaptive Fault-Tolerant Control of Uncertain Multi-Agent Systems

    Get PDF
    This paper presents an adaptive fault-tolerant control (FTC) scheme for a class of nonlinear uncertain multi-agent systems. A local FTC scheme is designed for each agent using local measurements and suitable information exchanged between neighboring agents. Each local FTC scheme consists of a fault diagnosis module and a reconfigurable controller module comprised of a baseline controller and two adaptive fault-tolerant controllers activated after fault detection and after fault isolation, respectively. Under certain assumptions, the closed-loop system's stability and leader-follower consensus properties are rigorously established under different modes of the FTC system, including the time-period before possible fault detection, between fault detection and possible isolation, and after fault isolation

    Constraint integration and violation handling for BPEL processes

    Get PDF
    Autonomic, i.e. dynamic and fault-tolerant Web service composition is a requirement resulting from recent developments such as on-demand services. In the context of planning-based service composition, multi-agent planning and dynamic error handling are still unresolved problems. Recently, business rule and constraint management has been looked at for enterprise SOA to add business flexibility. This paper proposes a constraint integration and violation handling technique for dynamic service composition. Higher degrees of reliability and fault-tolerance, but also performance for autonomously composed WS-BPEL processes are the objectives

    Quantitative Robustness Analysis of Quantum Programs (Extended Version)

    Full text link
    Quantum computation is a topic of significant recent interest, with practical advances coming from both research and industry. A major challenge in quantum programming is dealing with errors (quantum noise) during execution. Because quantum resources (e.g., qubits) are scarce, classical error correction techniques applied at the level of the architecture are currently cost-prohibitive. But while this reality means that quantum programs are almost certain to have errors, there as yet exists no principled means to reason about erroneous behavior. This paper attempts to fill this gap by developing a semantics for erroneous quantum while-programs, as well as a logic for reasoning about them. This logic permits proving a property we have identified, called ϵ\epsilon-robustness, which characterizes possible "distance" between an ideal program and an erroneous one. We have proved the logic sound, and showed its utility on several case studies, notably: (1) analyzing the robustness of noisy versions of the quantum Bernoulli factory (QBF) and quantum walk (QW); (2) demonstrating the (in)effectiveness of different error correction schemes on single-qubit errors; and (3) analyzing the robustness of a fault-tolerant version of QBF.Comment: 34 pages, LaTeX; v2: fixed typo

    On fault-tolerance with noisy and slow measurements

    Full text link
    It is not so well-known that measurement-free quantum error correction protocols can be designed to achieve fault-tolerant quantum computing. Despite the potential advantages of using such protocols in terms of the relaxation of accuracy, speed and addressing requirements on the measurement process, they have usually been overlooked because they are expected to yield a very bad threshold as compared to error correction protocols which use measurements. Here we show that this is not the case. We design fault-tolerant circuits for the 9 qubit Bacon-Shor code and find a threshold for gates and preparation of p(p,g)thresh=3.76×105p_{(p,g) thresh}=3.76 \times 10^{-5} (30% of the best known result for the same code using measurement based error correction) while admitting up to 1/3 error rates for measurements and allocating no constraints on measurement speed. We further show that demanding gate error rates sufficiently below the threshold one can improve the preparation threshold to p(p)thresh=1/3p_{(p)thresh} = 1/3. We also show how these techniques can be adapted to other Calderbank-Shor-Steane codes.Comment: 11 pages, 7 figures. v3 has an extended exposition and several simplifications that provide for an improved threshold value and resource overhea
    corecore