Fault-tolerant Resource Reasoning.

Author: Gian Ntzik
Pedro da Rocha Pinto
Philippa Gardner
Publication date: 01/01/2015

Abstract. Separation logic has been successful at verifying that programs do not crash due to illegal use of resources. The underlying assumption, however, is that machines do not fail. In practice, machines can fail unpredictably for various reasons, e.g. power loss, corrupting resources. Critical software, e.g. file systems, employs recovery methods to mitigate these effects. We introduce an extension of the Views framework to reason about such methods. We use concurrent separation logic as an instance of the framework to illustrate our reasoning, and explore programs using write-ahead logging, e.g. an ARIES recovery algorithm.

CiteSeerX

Unattended network operations technology assessment study. Technical support for defining advanced satellite systems concepts

Author: Holdridge Mark
Jaworski Allan
Morgan Herbert K.
Odubiyi Jide
Price Kent M.

The results of an unattended network operations technology assessment study for the Space Exploration Initiative (SEI) are summarized. The scope of the work included: (1) identifying possible enhancements due to the proposed Mars communications network; (2) identifying network operations on Mars; (3) performing a technology assessment of possible supporting technologies based on current and future approaches to network operations; and (4) developing a plan for the testing and development of these technologies. The most important results obtained are as follows: (1) addition of a third Mars Relay Satellite (MRS) and MRS cross-link capabilities will enhance the network's fault tolerance capabilities through improved connectivity; (2) network functions can be divided into the six basic ISO network functional groups; (3) distributed artificial intelligence technologies will augment more traditional network management technologies to form the technological infrastructure of a virtually unattended network; and (4) a great effort is required to bring the current network technology levels for manned space communications up to the level needed for an automated, fault-tolerant Mars communications network.

NASA Technical Reports Server

Expert System for UNIX System Reliability and Availability Enhancement

Author: Xu Catherine Q.
Publication date: 01/02/1993

Highly reliable and available systems are critical to the airline industry. However, most off-the-shelf computer operating systems and hardware do not have built-in fault-tolerant mechanisms; the UNIX workstation is one example. In this research effort, we have developed a rule-based Expert System (ES) to monitor, command, and control a UNIX workstation system with hot-standby redundancy. The ES on each workstation acts as an on-line system administrator to diagnose, report, correct, and prevent certain types of hardware and software failures. If a primary station is approaching failure, the ES coordinates the switch-over to a hot-standby secondary workstation. The goal is to discover and solve certain fatal problems early enough to prevent complete system failure and thereby enhance system reliability and availability. Test results show that the ES can diagnose all targeted fault scenarios and take the desired actions consistently, regardless of the sequence of the faults. The ES can perform designated system administration tasks about ten times faster than an experienced human operator. Compared with a single workstation system, the downtime of our hot-standby redundancy system is predicted to be reduced by more than 50 percent when the ES is used to command and control the system.

NASA Technical Reports Server

Distributed Adaptive Fault-Tolerant Control of Uncertain Multi-Agent Systems

Author: Cao Yongcan
Khalili Mohsen
Parisini Thomas
Polycarpou Marios M.
Zhang Xiaodong
Publication date: 01/01/2015

This paper presents an adaptive fault-tolerant control (FTC) scheme for a class of nonlinear uncertain multi-agent systems. A local FTC scheme is designed for each agent using local measurements and suitable information exchanged between neighboring agents. Each local FTC scheme consists of a fault diagnosis module and a reconfigurable controller module comprising a baseline controller and two adaptive fault-tolerant controllers, activated after fault detection and after fault isolation, respectively. Under certain assumptions, the closed-loop system's stability and leader-follower consensus properties are rigorously established under different modes of the FTC system, including the time period before possible fault detection, between fault detection and possible isolation, and after fault isolation.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Trieste

Crossref

Constraint integration and violation handling for BPEL processes

Author: Bandara Kosala Yapa
Pahl Claus
Wang MingXue
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2009

Autonomic, i.e. dynamic and fault-tolerant, Web service composition is a requirement resulting from recent developments such as on-demand services. In the context of planning-based service composition, multi-agent planning and dynamic error handling are still unresolved problems. Recently, business rule and constraint management has been considered for enterprise SOA to add business flexibility. This paper proposes a constraint integration and violation handling technique for dynamic service composition. The objectives are higher degrees of reliability and fault tolerance, as well as performance, for autonomously composed WS-BPEL processes.

CiteSeerX

Crossref

DCU Online Research Access Service

Quantitative Robustness Analysis of Quantum Programs (Extended Version)

Author: Hicks Michael
Hietala Kesha
Hung Shih-Han
Wu Xiaodi
Ying Mingsheng
Zhu Shaopeng
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018

Quantum computation is a topic of significant recent interest, with practical advances coming from both research and industry. A major challenge in quantum programming is dealing with errors (quantum noise) during execution. Because quantum resources (e.g., qubits) are scarce, classical error correction techniques applied at the level of the architecture are currently cost-prohibitive. But while this reality means that quantum programs are almost certain to have errors, there as yet exists no principled means to reason about erroneous behavior. This paper attempts to fill this gap by developing a semantics for erroneous quantum while-programs, as well as a logic for reasoning about them. This logic permits proving a property we have identified, called $\epsilon$-robustness, which characterizes possible "distance" between an ideal program and an erroneous one. We have proved the logic sound, and showed its utility on several case studies, notably: (1) analyzing the robustness of noisy versions of the quantum Bernoulli factory (QBF) and quantum walk (QW); (2) demonstrating the (in)effectiveness of different error correction schemes on single-qubit errors; and (3) analyzing the robustness of a fault-tolerant version of QBF. Comment: 34 pages, LaTeX; v2: fixed typo

arXiv.org e-Print Archive

OPUS - University of Technology Sydney

On fault-tolerance with noisy and slow measurements

Author: B. Reichardt
F. M. Spedalieri
Gavin K. Brennen
Gerardo A. Paz-Silva
Jason Twamley
K. M. Svore
P. O. Boykin
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2010

It is not so well-known that measurement-free quantum error correction protocols can be designed to achieve fault-tolerant quantum computing. Despite the potential advantages of using such protocols in terms of the relaxation of accuracy, speed and addressing requirements on the measurement process, they have usually been overlooked because they are expected to yield a very bad threshold as compared to error correction protocols which use measurements. Here we show that this is not the case. We design fault-tolerant circuits for the 9 qubit Bacon-Shor code and find a threshold for gates and preparation of $p_{(p,g)\,\mathrm{thresh}} = 3.76 \times 10^{-5}$ (30% of the best known result for the same code using measurement based error correction) while admitting up to 1/3 error rates for measurements and allocating no constraints on measurement speed. We further show that demanding gate error rates sufficiently below the threshold one can improve the preparation threshold to $p_{(p)\,\mathrm{thresh}} = 1/3$. We also show how these techniques can be adapted to other Calderbank-Shor-Steane codes. Comment: 11 pages, 7 figures. v3 has an extended exposition and several simplifications that provide for an improved threshold value and resource overhead

arXiv.org e-Print Archive

Crossref

Macquarie University ResearchOnline