    Incorporating component-based design in the category-theoretic framework for composition of fault-tolerant software

    With the increasing use of software in systems such as telecommunications, e-commerce, and manufacturing, and the need for reliable services in these systems, there is an ever-growing demand for fault tolerance. Software is generally built without much attention to fault tolerance, which is typically added as an extra feature to ensure reliability once a failure has been encountered. However, many legacy software systems are being deployed in highly critical applications where fault tolerance is essential. Various methods have been put forth in the literature for designing fault tolerance, including a component-based methodology in which fault tolerance is separated from functionality, and fault-tolerant components, such as correctors and detectors, are added to achieve the desired reliability. Using the concepts of component-based design, we propose a category-theoretic framework for composing these fault-tolerant components with a fault-intolerant program. We illustrate the approach, which composes the fault-tolerant components with a fault-intolerant program to yield a final fault-tolerant program, through two case studies. In the first case study, we show the feasibility of the approach by composing the fault-tolerant components for a distributed mutual exclusion algorithm. In the second case study, we decompose the fault-tolerant Label Distribution Protocol and prove the correctness of the design of the fault-tolerant components. The formal specification and verification of these case studies have been conducted using Specware. The benefits of the proposed approach include (a) traceability of all the sorts, operations and properties used to derive the composed program, (b) well-defined interfaces that allow components to interact in a well-specified manner, and (c) reuse of specifications for the design of subsequent similar systems.
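The separation of detectors and correctors from a fault-intolerant program can be sketched informally as follows. This is only an illustration in Python using ordinary function composition; it is not the category-theoretic construction or the Specware specifications described in the abstract. The token-ring state, the invariant "exactly one token", and all function names are assumptions made for the example.

```python
from typing import Callable, List

State = List[int]  # hypothetical state: state[i] == 1 means process i holds the token


def pass_token(state: State) -> State:
    """Fault-intolerant program: shift every token one position around the ring."""
    n = len(state)
    return [state[(i - 1) % n] for i in range(n)]


def exactly_one_token(state: State) -> bool:
    """Detector: checks the (assumed) invariant that exactly one token exists."""
    return sum(state) == 1


def regenerate_token(state: State) -> State:
    """Corrector: restores the invariant by placing a single token at process 0."""
    return [1] + [0] * (len(state) - 1)


def compose(
    program: Callable[[State], State],
    detector: Callable[[State], bool],
    corrector: Callable[[State], State],
) -> Callable[[State], State]:
    """Wrap a fault-intolerant program with a detector and a corrector."""

    def tolerant(state: State) -> State:
        # Repair the state first if the detector reports a violated invariant.
        if not detector(state):
            state = corrector(state)
        return program(state)

    return tolerant


tolerant_pass = compose(pass_token, exactly_one_token, regenerate_token)
```

With a lost token (`[0, 0, 0]`) or a duplicated token (`[1, 1, 0]`), `tolerant_pass` first restores a single token and then runs the unchanged program, so the functional code never needs to know about the fault.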

    Formal and Fault Tolerant Design

    Software quality and reliability were verified for a long time at the post-implementation level (test, fault scenario ...). The design of embedded systems and digital circuits is more and more complex because of integration density and heterogeneity. Now almost ¾ of digital circuits contain at least one processor, that is, can execute software code. In other words, co-design is the most usual case and traditional verification by simulation is no longer practical. Moreover, the increase in integration density comes with a decrease in the reliability of the components. So fault detection, diagnostic techniques, and introspection are essential for defect tolerance, fault tolerance and self-repair of safety-critical systems. The use of a formal specification language is considered the foundation of a real validation. What we would like to emphasize is that refinement (from an abstract model to the point where the system will be implemented) could and should be formal too, in order to ensure the traceability of requirements, to manage such development projects, and so to design fault-tolerant systems that are correct by proven construction. Such a thorough approach can be achieved by the automation or semi-automation of the refinement process. We have studied how to ensure the traceability of these requirements in a component-based approach. Reliability and fault tolerance can be seen here as particular refinement steps. For instance, a given formal specification of a system/component may be refined by adding redundancy (data, computation, component) and be verified to be fault-tolerant w.r.t. some given fault scenarios. A self-repair component can be defined as the refinement of its original form enhanced with error detection. We describe in this paper the PCSI project (Zero Defect Systems), based on the B Method, VHDL and PSL. The three modeling approaches can work together to guarantee the co-design of embedded systems in which the requirements and the fault-tolerant aspects are taken into account from the beginning and formally verified throughout the implementation process.
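The idea of refining a component by adding computational redundancy can be illustrated with a majority voter over replicated computations. This is a minimal Python sketch under stated assumptions; it does not reflect the B Method, VHDL, or PSL models of the project, and the names `majority_vote` and `with_redundancy` are hypothetical.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Hashable:
    """Return the value produced by a strict majority of replicas."""
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        # No strict majority: the fault scenario exceeds what the voter tolerates.
        raise ValueError("no majority among replica outputs")
    return value


def with_redundancy(
    replicas: Sequence[Callable[[int], Hashable]],
) -> Callable[[int], Hashable]:
    """Refine a component into a redundant one that votes over its replicas."""

    def refined(x: int) -> Hashable:
        return majority_vote([replica(x) for replica in replicas])

    return refined


def double(x: int) -> int:
    """Original (assumed) component behaviour."""
    return 2 * x


def stuck_at_minus_one(x: int) -> int:
    """Hypothetical faulty replica whose output is stuck at -1."""
    return -1


tmr_double = with_redundancy([double, double, stuck_at_minus_one])
```

Here `tmr_double(21)` still returns `42` despite one faulty replica, which is the kind of property that a formal refinement step would be expected to prove against a given fault scenario rather than merely test.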

    Incorporating faults and fault-tolerance into real-time networks: a graph-transformational approach

    PhD Thesis. The introduction of fault tolerance into real-time systems presents particular challenges because of the price of redundancy and the added complexity of verification and validation on these redundant structures. This thesis brings structural and formal design techniques to bear on this problem. Verification of fault tolerance properties in such systems has received only limited attention; in particular, the design methodologies are in their infancy. We propose a transformational design methodology specific to a real-time systems architecture. We then reason about the compositional addition of fault-tolerant components and templates of the derived designs. This requires that we show the existing axiomatic semantics for our chosen architecture to be sound with respect to a more constructive semantic model. The issues of presenting an operational model for a real-time architecture are discussed and a model is proposed. The extension of the existing semantics to allow for faulty behaviour is shown to preserve the existing semantic properties, and the methodology is shown to be usable through a sizeable study. The contribution of this thesis is to define a transformational design methodology in which components can be extracted from a design and replaced by another component, preserving functionality while providing fault tolerance. This approach requires precise modelling of the faults we consider, the transformational method, and verification of the transformed design with respect to faults. BAE Systems: EPSRC

    Restart-Based Fault-Tolerance: System Design and Schedulability Analysis

    Embedded systems in safety-critical environments are continuously required to deliver more performance and functionality, while being expected to provide verified safety guarantees. Nonetheless, platform-wide software verification (required for safety) is often expensive. Therefore, design methods that enable the use of components such as real-time operating systems (RTOS), without requiring their correctness to guarantee safety, are necessary. In this paper, we propose a design approach to deploy safe-by-design embedded systems. To attain this goal, we rely on a small core of verified software to handle faults in applications and the RTOS and to recover from them, while ensuring that timing constraints of safety-critical tasks are always satisfied. Faults are detected by monitoring the application timing, and fault recovery is achieved via full platform restart and software reload, enabled by the short restart time of embedded systems. Schedulability analysis is used to ensure that the timing constraints of critical plant control tasks are always satisfied in spite of faults and consequent restarts. We derive schedulability results for four restart-tolerant task models. We use a simulator to evaluate and compare the performance of the considered scheduling models.

    A Scalable System Architecture for High-Performance Fault Tolerant Machine Drives

    When targeting mission-critical applications, the design of electronic actuation systems needs to consider many requirements and constraints not typical in standard industrial applications. One of these is tolerance to faults, as the unplanned shutdown of a critical subsystem, if not handled correctly, could lead to financial harm, environmental disaster, or even loss of life. One way this can be avoided is through the design of electric drive systems based on multi-phase machines that can keep operating, albeit with degraded performance, in a partial configuration under fault conditions. Distributed architectures are uniquely suited to meet these challenges, by providing a large degree of isolation between the various components. This paper presents a system architecture suitable for scalable and high-performance fault-tolerant machine drive systems. The effectiveness of this system is demonstrated through theoretical analysis and experimental verification on a six-phase machine.

    AFTI/F-16 digital flight control system experience

    The Advanced Fighter Technology Integration (AFTI) F-16 program is investigating the integration of emerging technologies into an advanced fighter aircraft. The three major technologies involved are the triplex digital flight control system; decoupled aircraft flight control; and integration of avionics, pilot displays, and flight control. In addition to investigating improvements in fighter performance, the AFTI/F-16 program provides a look at generic problems facing highly integrated, flight-crucial digital controls. An overview of the AFTI/F-16 systems is followed by a summary of flight test experience and recommendations.

    Verification of the FtCayuga fault-tolerant microprocessor system. Volume 1: A case study in theorem prover-based verification

    The design and formal verification of a hardware system for a task that is an important component of a fault-tolerant computer architecture for flight control systems is presented. The hardware system implements an algorithm for obtaining interactive consistency (Byzantine agreement) among four microprocessors as a special instruction on the processors. The property verified ensures that an execution of the special instruction by the processors correctly accomplishes interactive consistency, provided certain preconditions hold. An assumption is made that the processors execute synchronously. For verification, the authors used a computer-aided hardware design verification tool, Spectool, and the theorem prover Clio. A major contribution of the work is the demonstration of a significant fault-tolerant hardware design that is mechanically verified by a theorem prover.

    Rapid Recovery for Systems with Scarce Faults

    Our goal is to achieve a high degree of fault tolerance through the control of safety-critical systems. This reduces to solving a game between a malicious environment that injects failures and a controller who tries to establish correct behavior. We suggest a new control objective for such systems that offers a better balance between complexity and precision: we seek systems that are k-resilient. In order to be k-resilient, a system needs to be able to rapidly recover from a small number, up to k, of local faults infinitely many times, provided that blocks of up to k faults are separated by short recovery periods in which no fault occurs. k-resilience is a simple but powerful abstraction from the precise distribution of local faults, but much more refined than the traditional objective to maximize the number of local faults. We argue why we believe this to be the right level of abstraction for safety-critical systems when local faults are few and far between. We show that the computational complexity of constructing optimal control with respect to resilience is low, and demonstrate the feasibility through an implementation and experimental results. Comment: In Proceedings GandALF 2012, arXiv:1210.202
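The environment assumption behind k-resilience (blocks of at most k faults, separated by fault-free recovery periods) can be made concrete on a finite trace. The following Python sketch is only an illustration and not the authors' game-based construction; the function name and the `recovery` parameter (the minimum length of a fault-free period, which the abstract does not quantify) are assumptions introduced here.

```python
from typing import Sequence


def respects_fault_assumption(trace: Sequence[bool], k: int, recovery: int) -> bool:
    """Check that faults in `trace` come in blocks of at most k faults,
    where consecutive blocks are separated by at least `recovery`
    fault-free steps. trace[i] is True when a local fault occurs at step i.
    """
    faults_in_block = 0  # faults counted since the last completed recovery period
    quiet_steps = 0      # consecutive fault-free steps seen so far

    for fault in trace:
        if fault:
            if quiet_steps >= recovery:
                # A full recovery period has elapsed: start a new block.
                faults_in_block = 0
            faults_in_block += 1
            quiet_steps = 0
            if faults_in_block > k:
                return False
        else:
            quiet_steps += 1
    return True
```

For example, with `k=2` the trace `[True, True, False, False, True]` satisfies the assumption when `recovery=2` but violates it when `recovery=3`, because the third fault arrives before the recovery period has completed.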

    Moving formal methods into practice. Verifying the FTPP Scoreboard: Results, phase 1

    This report documents the Phase 1 results of an effort aimed at formally verifying a key hardware component, called Scoreboard, of a Fault-Tolerant Parallel Processor (FTPP) being built at Charles Stark Draper Laboratory (CSDL). The Scoreboard is part of the FTPP virtual bus that guarantees reliable communication between processors in the presence of Byzantine faults in the system. The Scoreboard implements a piece of control logic that approves and validates a message before it can be transmitted. The goal of Phase 1 was to lay the foundation of the Scoreboard verification. A formal specification of the functional requirements and a high-level hardware design for the Scoreboard were developed. The hardware design was based on a preliminary Scoreboard design developed at CSDL. A main correctness theorem, from which the functional requirements can be established as corollaries, was proved for the Scoreboard design. The goal of Phase 2 is to verify the final detailed design of Scoreboard. This task is being conducted as part of a NASA-sponsored effort to explore integration of formal methods in the development cycle of current fault-tolerant architectures being built in the aerospace industry