864 research outputs found

    The Watchdog Task: Concurrent error detection using assertions

    Get PDF
    The Watchdog Task, a software abstraction of the Watchdog-processor, is shown to be a powerful error detection tool with a great deal of flexibility and the advantages of watchdog techniques. A Watchdog Task system in Ada is presented; issues of recovery, latency, efficiency (communication) and preprocessing are discussed. Different applications, one of which is error detection on a single processor, are examined

    Study of fault-tolerant software technology

    Get PDF
    Presented is an overview of the current state of the art of fault-tolerant software and an analysis of quantitative techniques and models developed to assess its impact. It examines research efforts as well as experience gained from commercial application of these techniques. The paper also addresses the computer architecture and design implications on hardware, operating systems and programming languages (including Ada) of using fault-tolerant software in real-time aerospace applications. It concludes that fault-tolerant software has progressed beyond the pure research state. The paper also finds that, although not perfectly matched, newer architectural and language capabilities provide many of the notations and functions needed to effectively and efficiently implement software fault-tolerance

    CONTROL FLOW CHECKING IN MULTITASKING SYSTEMS

    Get PDF
    The control flow checking technique presented in our paper is based on the new watchdog- processor method SEIS1 (Signature Encoded Instruction Stream). This method is in- tended to check the still uncovered area of state-of-the-art microprocessors using on-chip caches or instruction pipelines, since the processor instruction bus needs not be monitored. The control flow is checked using assigned actual signatures and embedded reference sig- natures. Since the actual and reference signatures are embedded in the checked program, the usual reference database and the time-consuming search/ compare engine in the watch- dog can be omitted. The evaluation of the actual signature is a simple combinatorial task allowing high speed and thus the sharing of the watchdog between different tasks and processors. The checking method has been extended to higher levels of the application like simultaneous check of different processes and their synchronization in multitasking systems

    Software dependability techniques validated via fault injection experiments

    Get PDF
    The present paper proposes a C/C++ source-to-source compiler able to increase the dependability properties of a given application. The adopted strategy is based on two main techniques: variable duplication/triplication and control flow checking. The validation of these techniques is based on the emulation of fault appearance by software fault injection. The chosen test case is a client-server application in charge of calculating and drawing a Mandelbrot fracta

    SOFTWARE DIAGNOSIS USING COMPRESSED SIGNATURE SEQUENCES

    Get PDF
    Software monitoring and debugging can be efficiently supported by one of the concurrent error detection methods, the application of watchdog processors. A watchdog processor, as a co-processor, receives and evaluates signatures assigned to the states of the program execution. After the checking, it stores the run-time sequence of signatures which identify the statements of the program. In this way, a trace of the statements executed before the error is available. The signature buffer can be efficiently utilized if the signature sequence is compressed. In the paper, two real-time compression methods are presented and compared. The first one uses predefined dictionaries, while the other one utilizes the structural information encoded in the signatures

    Graph-tree-based software control flow checking for COTS processors on pico-satellites

    Get PDF
    AbstractThis paper proposes a generic high-performance and low-time-overhead software control flow checking solution, graph-tree-based control flow checking (GTCFC) for space-borne commercial-off-the-shelf (COTS) processors. A graph tree data structure with a topology similar to common trees is introduced to transform the control flow graphs of target programs. This together with design of IDs and signatures of its vertices and edges allows for an easy check of legality of actual branching during target program execution. As a result, the algorithm not only is capable of detecting all single and multiple branching errors with low latency and time overheads along with a linear-complexity space overhead, but also remains generic among arbitrary instruction sets and independent of any specific hardware. Tests of the algorithm using a COTS-processor-based on-board computer (OBC) of in-service ZDPS-1A pico-satellite products show that GTCFC can detect over 90% of the randomly injected and all-pattern-covering branching errors for different types of target programs, with performance and overheads consistent with the theoretical analysis; and beats well-established preeminent control flow checking algorithms in these dimensions. Furthermore, it is validated that GTCGC not only can be accommodated in pico-satellites conveniently with still sufficient system margins left, but also has the ability to minimize the risk of control flow errors being undetected in their space missions. Therefore, due to its effectiveness, efficiency, and compatibility, the GTCFC solution is ready for applications on COTS processors on pico-satellites in their real space missions

    Airborne Advanced Reconfigurable Computer System (ARCS)

    Get PDF
    A digital computer subsystem fault-tolerant concept was defined, and the potential benefits and costs of such a subsystem were assessed when used as the central element of a new transport's flight control system. The derived advanced reconfigurable computer system (ARCS) is a triple-redundant computer subsystem that automatically reconfigures, under multiple fault conditions, from triplex to duplex to simplex operation, with redundancy recovery if the fault condition is transient. The study included criteria development covering factors at the aircraft's operation level that would influence the design of a fault-tolerant system for commercial airline use. A new reliability analysis tool was developed for evaluating redundant, fault-tolerant system availability and survivability; and a stringent digital system software design methodology was used to achieve design/implementation visibility

    Software implemented fault tolerance for microprocessor controllers: fault tolerance for microprocessor controllers

    Get PDF
    It is generally accepted that transient faults are a major cause of failure in micro processor systems. Industrial controllers with embedded microprocessors are particularly at risk from this type of failure because their working environments are prone to transient disturbances which can generate transient faults. In order to improve the reliability of processor systems for industrial applications within a limited budget, fault tolerant techniques for uniprocessors are implemented. These techniques aim to identify characteristics of processor operation which are attributed to erroneous behaviour. Once detection is achieved, a programme of restoration activity can be initiated. This thesis initially develops a previous model of erroneous microprocessor behaviour from which characteristics particular to mal-operation are identified. A new technique is proposed, based on software implemented fault tolerance which, by recognizing a particular behavioural characteristic, facilitates the self-detection of erroneous execution. The technique involves inserting detection mechanisms into the target software. This can be quite a complex process and so a prototype software tool called Post-programming Automated Recovery UTility (PARUT) is developed to automate the technique's application. The utility can be used to apply the proposed behavioural fault tolerant technique for a selection of target processors. Fault injection and emulation experiments assess the effectiveness of the proposed fault tolerant technique for three application programs implemented on an 8, 16, and 32- bit processors respectively. The modified application programs are shown to have an improved detection capability and hence reliability when the proposed fault tolerant technique is applied. General assessment of the technique cannot be made, however, because its effectiveness is application specific. The thesis concludes by considering methods of generating non-hazardous application programs at the compilation stage, and design features for incorporation into the architecture of a microprocessor which inherently reduce the hazard, and increase the detection capability of the target software. Particular suggestions are made to add a 'PARUT' phase to the translation process, and to orientate microprocessor design towards the instruction opcode map
    • …
    corecore