277 research outputs found

    Efficient diagnosis of multiprocessor systems under probabilistic models

    Get PDF
    The problem of fault diagnosis in multiprocessor systems is considered under a probabilistic fault model. The focus is on minimizing the number of tests that must be conducted in order to correctly diagnose the state of every processor in the system with high probability. A diagnosis algorithm that can correctly diagnose the state of every processor with probability approaching one in a class of systems performing slightly greater than a linear number of tests is presented. A nearly matching lower bound on the number of tests required to achieve correct diagnosis in arbitrary systems is also proven. Lower and upper bounds on the number of tests required for regular systems are also presented. A class of regular systems which includes hypercubes is shown to be correctly diagnosable with high probability. In all cases, the number of tests required under this probabilistic model is shown to be significantly less than under a bounded-size fault set model. Because the number of tests that must be conducted is a measure of the diagnosis overhead, these results represent a dramatic improvement in the performance of system-level diagnosis techniques

    Theory of reliable systems

    Get PDF
    An attempt was made to refine the current notion of system reliability by identifying and investigating attributes of a system which are important to reliability considerations. Techniques which facilitate analysis of system reliability are included. Special attention was given to fault tolerance, diagnosability, and reconfigurability characteristics of systems

    Formal Design of Asynchronous Fault Detection and Identification Components using Temporal Epistemic Logic

    Get PDF
    Autonomous critical systems, such as satellites and space rovers, must be able to detect the occurrence of faults in order to ensure correct operation. This task is carried out by Fault Detection and Identification (FDI) components, that are embedded in those systems and are in charge of detecting faults in an automated and timely manner by reading data from sensors and triggering predefined alarms. The design of effective FDI components is an extremely hard problem, also due to the lack of a complete theoretical foundation, and of precise specification and validation techniques. In this paper, we present the first formal approach to the design of FDI components for discrete event systems, both in a synchronous and asynchronous setting. We propose a logical language for the specification of FDI requirements that accounts for a wide class of practical cases, and includes novel aspects such as maximality and trace-diagnosability. The language is equipped with a clear semantics based on temporal epistemic logic, and is proved to enjoy suitable properties. We discuss how to validate the requirements and how to verify that a given FDI component satisfies them. We propose an algorithm for the synthesis of correct-by-construction FDI components, and report on the applicability of the design approach on an industrial case-study coming from aerospace.Comment: 33 pages, 20 figure

    Causality and Temporal Dependencies in the Design of Fault Management Systems

    Get PDF
    Reasoning about causes and effects naturally arises in the engineering of safety-critical systems. A classical example is Fault Tree Analysis, a deductive technique used for system safety assessment, whereby an undesired state is reduced to the set of its immediate causes. The design of fault management systems also requires reasoning on causality relationships. In particular, a fail-operational system needs to ensure timely detection and identification of faults, i.e. recognize the occurrence of run-time faults through their observable effects on the system. Even more complex scenarios arise when multiple faults are involved and may interact in subtle ways. In this work, we propose a formal approach to fault management for complex systems. We first introduce the notions of fault tree and minimal cut sets. We then present a formal framework for the specification and analysis of diagnosability, and for the design of fault detection and identification (FDI) components. Finally, we review recent advances in fault propagation analysis, based on the Timed Failure Propagation Graphs (TFPG) formalism.Comment: In Proceedings CREST 2017, arXiv:1710.0277

    Diagnosis in Infinite-State Probabilistic Systems

    Get PDF
    In a recent work, we introduced four variants of diagnosability (FA, IA, FF, IF) in (finite) probabilistic systems (pLTS) depending whether one considers (1) finite or infinite runs and (2) faulty or all runs. We studied their relationship and established that the corresponding decision problems are PSPACE-complete. A key ingredient of the decision procedures was a characterisation of diagnosability by the fact that a random run almost surely lies in an open set whose specification only depends on the qualitative behaviour of the pLTS. Here we investigate similar issues for infinite pLTS. We first show that this characterisation still holds for FF-diagnosability but with a G-delta set instead of an open set and also for IF- and IA-diagnosability when pLTS are finitely branching. We also prove that surprisingly FA-diagnosability cannot be characterised in this way even in the finitely branching case. Then we apply our characterisations for a partially observable probabilistic extension of visibly pushdown automata (POpVPA), yielding EXPSPACE procedures for solving diagnosability problems. In addition, we establish some computational lower bounds and show that slight extensions of POpVPA lead to undecidability

    Discrete event approach to network fault management

    Get PDF
    Failure diagnosis in large and complex systems such as a communication network is a critical task. An important aspect of network management is fault management, i.e.,determining, locating, isolation, and correcting faults in the network. In the realm of discrete event systems Sampath et al proposed a failure diagnosis approach, and Jiang et al proposed an efficient algorithm for testing diagnosability. In this work, we adopt the framework of the communicating finite state machine (CFSM) of Miller et al for modeling networks and to investigate fault detection, fault identification and fault location using Sampath et al and Jiang et al methods. Our approach provides a systematic way of performing fault diagnosis aspects of network fault management

    RULES BASED MODELING OF DISCRETE EVENT SYSTEMS WITH FAULTS AND THEIR DIAGNOSIS

    Get PDF
    Failure diagnosis in large and complex systems is a critical task. In the realm of discrete event systems, Sampath et al. proposed a language based failure diagnosis approach. They introduced the diagnosability for discrete event systems and gave a method for testing the diagnosability by first constructing a diagnoser for the system. The complexity of this method of testing diagnosability is exponential in the number of states of the system and doubly exponential in the number of failure types. In this thesis, we give an algorithm for testing diagnosability that does not construct a diagnoser for the system, and its complexity is of 4th order in the number of states of the system and linear in the number of the failure types. In this dissertation we also study diagnosis of discrete event systems (DESs) modeled in the rule-based modeling formalism introduced in [12] to model failure-prone systems. The results have been represented in [43]. An attractive feature of rule-based model is it\u27s compactness (size is polynomial in number of signals). A motivation for the work presented is to develop failure diagnosis techniques that are able to exploit this compactness. In this regard, we develop symbolic techniques for testing diagnosability and computing a diagnoser. Diagnosability test is shown to be an instance of 1st order temporal logic model-checking. An on-line algorithm for diagnosersynthesis is obtained by using predicates and predicate transformers. We demonstrate our approach by applying it to modeling and diagnosis of a part of the assembly-line. When the system is found to be not diagnosable, we use sensor refinement and sensor augmentation to make the system diagnosable. In this dissertation, a controller is also extracted from the maximally permissive supervisor for the purpose of implementing the control by selecting, when possible, only one controllable event from among the ones allowed by the supervisor for the assembly line in automaton models
    • …
    corecore