277 research outputs found
Efficient diagnosis of multiprocessor systems under probabilistic models
The problem of fault diagnosis in multiprocessor systems is considered under a probabilistic fault model. The focus is on minimizing the number of tests that must be conducted in order to correctly diagnose the state of every processor in the system with high probability. A diagnosis algorithm is presented that, for a class of systems, correctly diagnoses the state of every processor with probability approaching one while performing only slightly more than a linear number of tests. A nearly matching lower bound on the number of tests required to achieve correct diagnosis in arbitrary systems is also proven. Lower and upper bounds on the number of tests required for regular systems are also presented. A class of regular systems which includes hypercubes is shown to be correctly diagnosable with high probability. In all cases, the number of tests required under this probabilistic model is shown to be significantly less than under a bounded-size fault set model. Because the number of tests that must be conducted is a measure of the diagnosis overhead, these results represent a dramatic improvement in the performance of system-level diagnosis techniques.
Theory of reliable systems
An attempt was made to refine the current notion of system reliability by identifying and investigating the attributes of a system that are important to reliability considerations. Techniques which facilitate analysis of system reliability are included. Special attention was given to the fault tolerance, diagnosability, and reconfigurability characteristics of systems.
Formal Design of Asynchronous Fault Detection and Identification Components using Temporal Epistemic Logic
Autonomous critical systems, such as satellites and space rovers, must be
able to detect the occurrence of faults in order to ensure correct operation.
This task is carried out by Fault Detection and Identification (FDI)
components, that are embedded in those systems and are in charge of detecting
faults in an automated and timely manner by reading data from sensors and
triggering predefined alarms. The design of effective FDI components is an
extremely hard problem, also due to the lack of a complete theoretical
foundation, and of precise specification and validation techniques. In this
paper, we present the first formal approach to the design of FDI components for
discrete event systems, both in a synchronous and asynchronous setting. We
propose a logical language for the specification of FDI requirements that
accounts for a wide class of practical cases, and includes novel aspects such
as maximality and trace-diagnosability. The language is equipped with a clear
semantics based on temporal epistemic logic, and is proved to enjoy suitable
properties. We discuss how to validate the requirements and how to verify that
a given FDI component satisfies them. We propose an algorithm for the synthesis
of correct-by-construction FDI components, and report on the applicability of
the design approach on an industrial case-study coming from aerospace.
Comment: 33 pages, 20 figures
Causality and Temporal Dependencies in the Design of Fault Management Systems
Reasoning about causes and effects naturally arises in the engineering of
safety-critical systems. A classical example is Fault Tree Analysis, a
deductive technique used for system safety assessment, whereby an undesired
state is reduced to the set of its immediate causes. The design of fault
management systems also requires reasoning on causality relationships. In
particular, a fail-operational system needs to ensure timely detection and
identification of faults, i.e. recognize the occurrence of run-time faults
through their observable effects on the system. Even more complex scenarios
arise when multiple faults are involved and may interact in subtle ways.
In this work, we propose a formal approach to fault management for complex
systems. We first introduce the notions of fault tree and minimal cut sets. We
then present a formal framework for the specification and analysis of
diagnosability, and for the design of fault detection and identification (FDI)
components. Finally, we review recent advances in fault propagation analysis,
based on the Timed Failure Propagation Graphs (TFPG) formalism.
Comment: In Proceedings CREST 2017, arXiv:1710.0277
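The notion of minimal cut sets mentioned in this abstract can be illustrated with a small sketch. The fault tree below, its gate names, and its basic events are hypothetical and are not taken from the paper; the brute-force expansion is only one simple way to compute cut sets, not the method the authors use.

```python
from itertools import product

# Hypothetical fault tree (an assumption for illustration).
# Gates map a name to (kind, children); any name that is not a key is a basic event.
FAULT_TREE = {
    "TOP": ("OR", ["G1", "sensor_fail"]),
    "G1": ("AND", ["pump_a_fail", "G2"]),
    "G2": ("OR", ["pump_b_fail", "power_loss"]),
}


def cut_sets(node, tree):
    """Return every cut set of `node` as a set of frozensets of basic events."""
    if node not in tree:
        return {frozenset([node])}  # a basic event is its own cut set
    kind, children = tree[node]
    child_sets = [cut_sets(child, tree) for child in children]
    if kind == "OR":
        # Any cut set of any child triggers an OR gate.
        return set().union(*child_sets)
    if kind == "AND":
        # An AND gate needs one cut set from every child at the same time.
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unknown gate kind: {kind}")


def minimal_cut_sets(top, tree):
    """Keep only the cut sets that have no proper subset which is also a cut set."""
    sets = cut_sets(top, tree)
    return {s for s in sets if not any(other < s for other in sets)}


if __name__ == "__main__":
    for cs in sorted(minimal_cut_sets("TOP", FAULT_TREE), key=sorted):
        print(sorted(cs))
```

For this hypothetical tree the minimal cut sets are {sensor_fail}, {pump_a_fail, pump_b_fail}, and {pump_a_fail, power_loss}. The expansion is exponential in the worst case, so it is suitable only for small trees.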
Diagnosis in Infinite-State Probabilistic Systems
In a recent work, we introduced four variants of diagnosability
(FA, IA, FF, IF) in (finite) probabilistic
systems (pLTS) depending on whether one considers (1) finite or
infinite runs and (2) faulty or all runs. We studied their
relationship and established that the corresponding decision
problems are PSPACE-complete. A key ingredient of the decision
procedures was a characterisation of diagnosability by the fact that
a random run almost surely lies in an open set whose specification
only depends on the qualitative behaviour of the pLTS. Here we
investigate similar issues for infinite pLTS. We first show that
this characterisation still holds for FF-diagnosability but
with a G-delta set instead of an open set and also for IF-
and IA-diagnosability when pLTS are finitely branching. We also
prove that, surprisingly, FA-diagnosability cannot be
characterised in this way even in the finitely branching case. Then
we apply our characterisations to a partially observable
probabilistic extension of visibly pushdown automata (POpVPA),
yielding EXPSPACE procedures for solving diagnosability problems.
In addition, we establish some computational lower bounds and show
that slight extensions of POpVPA lead to undecidability.
Discrete event approach to network fault management
Failure diagnosis in large and complex systems such as a communication network is a critical task. An important aspect of network management is fault management, i.e., determining, locating, isolating, and correcting faults in the network. In the realm of discrete event systems, Sampath et al. proposed a failure diagnosis approach, and Jiang et al. proposed an efficient algorithm for testing diagnosability. In this work, we adopt the communicating finite state machine (CFSM) framework of Miller et al. to model networks and to investigate fault detection, fault identification, and fault location using the methods of Sampath et al. and Jiang et al. Our approach provides a systematic way of performing the fault diagnosis aspects of network fault management.
RULES BASED MODELING OF DISCRETE EVENT SYSTEMS WITH FAULTS AND THEIR DIAGNOSIS
Failure diagnosis in large and complex systems is a critical task. In the realm of discrete event systems, Sampath et al. proposed a language-based failure diagnosis approach. They introduced the notion of diagnosability for discrete event systems and gave a method for testing diagnosability by first constructing a diagnoser for the system. The complexity of this method of testing diagnosability is exponential in the number of states of the system and doubly exponential in the number of failure types. In this thesis, we give an algorithm for testing diagnosability that does not construct a diagnoser for the system; its complexity is of fourth order in the number of states of the system and linear in the number of failure types. In this dissertation we also study diagnosis of discrete event systems (DESs) modeled in the rule-based modeling formalism introduced in [12] to model failure-prone systems. The results have been presented in [43]. An attractive feature of the rule-based model is its compactness (its size is polynomial in the number of signals). A motivation for the work presented is to develop failure diagnosis techniques that are able to exploit this compactness. In this regard, we develop symbolic techniques for testing diagnosability and computing a diagnoser. The diagnosability test is shown to be an instance of first-order temporal logic model-checking. An on-line algorithm for diagnoser synthesis is obtained by using predicates and predicate transformers. We demonstrate our approach by applying it to the modeling and diagnosis of a part of an assembly line. When the system is found to be not diagnosable, we use sensor refinement and sensor augmentation to make the system diagnosable. In this dissertation, a controller is also extracted from the maximally permissive supervisor for the assembly line in automaton models, in order to implement the control by selecting, when possible, only one controllable event from among those allowed by the supervisor.
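As an illustration of how diagnosability can be tested without building a diagnoser, the following is a minimal sketch of a twin-plant (verifier) style check. It is not claimed to be the thesis's algorithm. It assumes a single failure type, unobservable fault events, a live system, and no cycles of unobservable events. The function name, the data layout, and the two toy systems are hypothetical.

```python
def is_diagnosable(transitions, initial, observable, faults):
    """Twin-plant check. `transitions` maps a state to a list of (event, next_state)."""

    def successors(v):
        q1, f1, q2, f2 = v  # states of both copies plus their "fault seen" flags
        for e, n1 in transitions.get(q1, []):
            if e in observable:
                # Observable events must be executed by both copies together.
                for e2, n2 in transitions.get(q2, []):
                    if e2 == e:
                        yield (n1, f1, n2, f2)
            else:
                # Unobservable events (including faults) move copy 1 alone.
                yield (n1, f1 or e in faults, q2, f2)
        for e, n2 in transitions.get(q2, []):
            if e not in observable:
                yield (q1, f1, n2, f2 or e in faults)

    # Explore the reachable part of the product.
    start = (initial, False, initial, False)
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in successors(v):
            if w not in seen:
                seen.add(w)
                stack.append(w)

    # Ambiguous states: copy 1 is faulty, copy 2 is not, and observations agree.
    ambiguous = {v for v in seen if v[1] and not v[3]}
    graph = {v: {w for w in successors(v) if w in ambiguous} for v in ambiguous}

    # Repeatedly prune states with no successor; whatever survives lies on a cycle
    # or leads to one, which means the ambiguity can persist forever.
    while True:
        dead = {v for v, succ in graph.items() if not succ}
        if not dead:
            break
        for v in dead:
            del graph[v]
        for succ in graph.values():
            succ -= dead
    return not graph


# Toy systems (hypothetical): "f" is the fault, "u" an unobservable normal event.
GOOD = {0: [("f", 1), ("u", 2)], 1: [("a", 1)], 2: [("b", 2)]}
BAD = {0: [("f", 1), ("u", 2)], 1: [("a", 1)], 2: [("a", 2)]}

if __name__ == "__main__":
    print(is_diagnosable(GOOD, 0, {"a", "b"}, {"f"}))
    print(is_diagnosable(BAD, 0, {"a", "b"}, {"f"}))
```

In GOOD, the faulty branch produces only `a` and the normal branch only `b`, so the fault is eventually revealed. In BAD, both branches produce `a` forever, so the fault can never be distinguished. The product has on the order of the square of the number of system states, and its transition relation is correspondingly larger, which gives polynomial rather than exponential cost.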