96 research outputs found

    A Game-Theoretic approach to Fault Diagnosis of Hybrid Systems

    Physical systems can fail. For this reason, the problem of identifying and reacting to faults has received considerable attention in the control and computer science communities. In this paper we study the fault diagnosis problem for hybrid systems from a game-theoretic point of view. A hybrid system is a system mixing continuous and discrete behaviours that can be faithfully modeled neither by a formalism with continuous dynamics only nor by a formalism with discrete dynamics only. We use the well-known framework of hybrid automata for modeling hybrid systems, and we define a Fault Diagnosis Game on them with two players: the environment and the diagnoser. The environment controls the evolution of the system and chooses whether and when a fault occurs. The diagnoser observes the external behaviour of the system and announces whether a fault has occurred or not. The existence of a winning strategy for the diagnoser implies that faults can be detected correctly, while computing such a winning strategy corresponds to implementing a diagnoser for the system. We show how to determine the existence of a winning strategy, and how to compute it, for some decidable classes of hybrid automata such as o-minimal hybrid automata. Comment: In Proceedings GandALF 2011, arXiv:1106.081

    Designing Diagnosable Distributed Programs.

    The difficulty of debugging distributed programs motivates the development of formal methods for designing distributed programs that are easier to debug and maintain. We address the state identification problem for distributed systems using the finite-state I/O automaton model. A state S is identified by the unique event sequences starting at S, called distinguishing sequences. An automaton is diagnosable if every state has a distinguishing sequence. A distributed program may not be diagnosable even if its components are diagnosable. Non-diagnosable automata can, in some cases, be converted to a diagnosable form by relabelling some of their transitions in a way that preserves the semantics of the program. Not all automata can be converted to a diagnosable form in this way, owing to the inherent ill-posedness of the specification. Two algorithms for converting a non-diagnosable automaton to a diagnosable form are presented. Debugging is the controlled execution of one program by another; the latter is called the supervisor of the former. The supervision operation is defined so that the debugging of a distributed program by distributed debuggers reduces to the debugging of a single program by a single debugger. An algorithm for constructing a debugger for a diagnosable program is developed. Every diagnosable program has a unique debugger associated with it. This leads to the introduction of the notion of debugging complexity of programs
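The definition of a distinguishing sequence can be made concrete with a small sketch. It is not the algorithm from the work above; it assumes a deterministic automaton given as a hypothetical `transitions` dictionary (state to `{event: next_state}`), reads "unique event sequence" as "a sequence executable from that state and from no other", and bounds the search with an assumed `max_len`.

```python
from collections import deque

def distinguishing_sequence(transitions, state, max_len=6):
    """Return a shortest event sequence executable only from `state`, or None."""
    # `alive` maps each start state that can still follow the sequence
    # to the state it has reached so far.
    start = {s: s for s in transitions}
    queue = deque([((), start)])
    seen = {frozenset(start.items())}
    while queue:
        seq, alive = queue.popleft()
        if set(alive) == {state}:
            return list(seq)  # no other start state can execute seq
        if len(seq) == max_len:
            continue
        # Only events enabled at the target's current state keep it alive.
        for event in sorted(transitions[alive[state]]):
            nxt = {s: transitions[cur][event]
                   for s, cur in alive.items() if event in transitions[cur]}
            key = frozenset(nxt.items())
            if key not in seen:
                seen.add(key)
                queue.append((seq + (event,), nxt))
    return None

def is_diagnosable(transitions, max_len=6):
    """True if every state has a distinguishing sequence (within max_len)."""
    return all(distinguishing_sequence(transitions, s, max_len) is not None
               for s in transitions)

# Hypothetical three-state example: A -a-> B -a-> C -b-> A
chain = {"A": {"a": "B"}, "B": {"a": "C"}, "C": {"b": "A"}}
# Two states that behave identically can never be told apart.
twins = {"A": {"a": "B"}, "B": {"a": "A"}}
```

In `chain`, the sequence `a a` can only be executed from `A`, so it identifies `A`; in `twins`, every sequence is executable from both states, so the automaton is not diagnosable under this reading.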

    Fault diagnosis of distributed systems : analysis, simulation and performance measurement.

    Fault diagnosis forms an essential component in the design of highly reliable distributed computing systems. Early models for diagnosis require a global observer, whereas in later models the diagnosis is shared between the system's nodes. These models are reviewed and their different diagnosability properties reconciled. The design of improved fault diagnosis algorithms for systems without a global observer provides the main motivation for the thesis. The modified algorithm SELF3 [Hoss88] is taken as a starting point. A number of communication architectures used in distributed systems are reviewed. The properties of diagnosis algorithms depend strongly on the testing graph. A general class of testing graphs, designated H-graphs (a generalization of the D_δt graphs introduced in [Prep67]), is investigated and its diagnostic properties determined. A software simulator for distributed systems has been written as the main investigative tool for diagnosis algorithms. The design and structure of the simulator are described. The diagnosis process is measured in terms of diagnostic time and number of messages produced, and the factors upon which these quantities depend are identified. The results of simulating a number of systems under various fault conditions are given. A modified way of routing diagnosis messages is presented which, especially in large systems, reduces both the number of diagnosis messages and the time required to perform diagnosis. The thesis also contains a number of specific recommendations for improving existing self-diagnosis algorithms

    Intermittent/transient fault phenomena in digital systems

    An overview of the intermittent/transient (IT) fault study is presented. An interval survivability evaluation of digital systems for IT faults is discussed along with a method for detecting and diagnosing IT faults in digital systems

    Autonomous Recovery Of Reconfigurable Logic Devices Using Priority Escalation Of Slack

    Field Programmable Gate Array (FPGA) devices offer a suitable platform for survivable hardware architectures in mission-critical systems. In this dissertation, active dynamic redundancy-based fault-handling techniques are proposed which exploit the dynamic partial reconfiguration capability of SRAM-based FPGAs. Self-adaptation is realized by employing reconfiguration in the detection, diagnosis, and recovery phases. To extend these concepts to semiconductor aging and process variation in the deep submicron era, resilient adaptable processing systems are sought to maintain quality and throughput requirements despite the vulnerabilities of the underlying computational devices. A new approach to autonomous fault-handling which addresses these goals is developed using only a uniplex hardware arrangement. It operates by observing a health metric to achieve Fault Demotion using Reconfigurable Slack (FaDReS). Here an autonomous fault isolation scheme is employed which neither requires test vectors nor suspends the computational throughput, but instead observes the value of a health metric based on runtime input. The deterministic flow of the fault isolation scheme guarantees success in a bounded number of reconfigurations of the FPGA fabric. FaDReS is then extended to the Priority Using Resource Escalation (PURE) online redundancy scheme, which considers fault-isolation latency and throughput trade-offs under a dynamic spare arrangement. While deep-submicron designs introduce new challenges, the use of adaptive techniques is seen to provide several promising avenues for improving resilience. The scheme developed is demonstrated by the hardware design of various signal processing circuits and their implementation on a Xilinx Virtex-4 FPGA device. These include a Discrete Cosine Transform (DCT) core, Motion Estimation (ME) engine, Finite Impulse Response (FIR) Filter, Support Vector Machine (SVM), and Advanced Encryption Standard (AES) blocks in addition to MCNC benchmark circuits. A significant reduction in power consumption is achieved, ranging from 83% for low motion-activity scenes to 12.5% for high motion-activity video scenes in a novel ME engine configuration. For a typical benchmark video sequence, PURE is shown to maintain a PSNR baseline near 32dB. The diagnosability, reconfiguration latency, and resource overhead of each approach are analyzed. Compared to previous alternatives, PURE maintains a PSNR within a difference of 4.02dB to 6.67dB from the fault-free baseline by escalating healthy resources to higher-priority signal processing functions. The results indicate the benefits of priority-aware resiliency over conventional redundancy approaches in terms of fault recovery, power consumption, and resource-area requirements. Together, these provide a broad range of strategies to achieve autonomous recovery of reconfigurable logic devices under a variety of constraints, operating conditions, and optimization criteria

    Design and Evaluation of Online Fault Diagnosis Protocols for Wireless Networks

    Any node in a network, or a component of it, may fail and show undesirable behavior due to physical defects, imperfections, or hardware and/or software glitches. The presence of faulty hosts in the network affects computational efficiency and quality of service (QoS). This calls for the development of efficient fault diagnosis protocols to detect and handle faulty hosts. Fault diagnosis protocols designed for wired networks cannot be applied directly to wireless networks, owing to differences in characteristics and requirements. This thesis investigates system-level fault diagnosis protocols for wireless networks, particularly Mobile ad hoc Networks (MANETs) and Wireless Sensor Networks (WSNs), considering faults based on their persistence (permanent, intermittent, and transient) and node mobility. Based on comparing the outcomes of the same tasks (the comparison model), a distributed diagnosis protocol is proposed for static-topology MANETs in which a node needs to respond to only one test request from its neighbors, which reduces the communication complexity of the diagnosis process. A novel approach to handling the more intractable intermittent faults in dynamic-topology MANETs is also discussed. Based on the spatial correlation of sensor measurements, a distributed fault diagnosis protocol is developed to classify the nodes in WSNs as fault-free, permanently faulty, or intermittently faulty. Nodes affected by transient faults are often considered fault-free and should not be isolated from the network. With this objective in mind, we have developed a diagnosis algorithm for WSNs to discriminate transient faults from intermittent and permanent faults. After each node finds the status of all its 1-hop neighbors (local diagnostic view), these views are disseminated among the fault-free nodes to deduce the fault status of all nodes in the network (global diagnostic view). A spanning-tree-based dissemination strategy is adopted, instead of conventional flooding, to reduce communication complexity. The proposed protocols are shown analytically to be correct and complete. The protocols are implemented using INET-20111118 (for MANETs) and Castalia-3.2 (for WSNs) on the OMNeT++ 4.2 platform. The simulation results obtained for accuracy and false alarm rate vouch for the feasibility and efficiency of the proposed algorithms over existing landmark protocols
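The saving from tree-based dissemination over flooding can be illustrated with a rough message count. The sketch below is not the thesis's protocol. It assumes a connected, fault-free network given as a hypothetical `adjacency` dictionary, a single view sent from one root, and a simple flooding rule in which each node forwards once to every neighbour except the one it first heard from.

```python
from collections import deque

def bfs_spanning_tree(adjacency, root):
    """Return {node: parent} for a BFS spanning tree of a connected graph."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return parent

def tree_messages(adjacency, root):
    """One message per tree edge: every non-root node receives exactly once."""
    return len(bfs_spanning_tree(adjacency, root)) - 1

def flooding_messages(adjacency, root):
    """The root sends to all neighbours; every other reached node forwards
    once to all of its neighbours except the one it first heard from."""
    reached = bfs_spanning_tree(adjacency, root)
    return sum(len(adjacency[n]) - (0 if n == root else 1) for n in reached)

# Hypothetical 4-node network with 5 links
graph = {1: [2, 3, 4], 2: [1, 3], 3: [1, 2, 4], 4: [1, 3]}
print(tree_messages(graph, 1), flooding_messages(graph, 1))
```

Under these assumptions the tree uses n - 1 messages while flooding uses 2|E| - n + 1, so the gap widens as the network becomes denser.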

    A More General Theory of Diagnosis from First Principles

    Model-based diagnosis has been an active research topic in different communities including artificial intelligence, formal methods, and control. This has led to a set of disparate approaches addressing different classes of systems and seeking different forms of diagnoses. In this paper, we resolve such disparities by generalising Reiter's theory to be agnostic to the types of systems and diagnoses considered. This more general theory of diagnosis from first principles defines the minimal diagnosis as the set of preferred diagnosis candidates in a search space of hypotheses. Computing the minimal diagnosis is achieved by exploring the space of diagnosis hypotheses, testing sets of hypotheses for consistency with the system's model and the observation, and generating conflicts that rule out successors and other portions of the search space. Under relatively mild assumptions, our algorithms correctly compute the set of preferred diagnosis candidates. The main difficulty here is that the search space is no longer a powerset as in Reiter's theory, and that, as a consequence, many of the implicit properties (such as finiteness of the search space) no longer hold. The notion of conflict also needs to be generalised, and we present such a more general notion. We present two implementations of these algorithms, using test solvers based on satisfiability and heuristic search, respectively, which we evaluate on instances from two real-world discrete event problems. Despite the greater generality of our theory, these implementations surpass the special-purpose algorithms designed for discrete event systems, and enable solving instances that were out of reach of existing diagnosis approaches
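For orientation, the classical powerset setting that the paper generalises can be sketched briefly. In Reiter's original theory, a minimal diagnosis is a minimal set of components that intersects every conflict (a minimal hitting set). The code below is a brute-force illustration of that classical case only, not the paper's generalised algorithm; the component names and conflicts are a hypothetical textbook-style example.

```python
from itertools import combinations

def minimal_diagnoses(components, conflicts):
    """Enumerate the minimal hitting sets of `conflicts` by increasing size."""
    found = []
    for size in range(len(components) + 1):
        for candidate in combinations(sorted(components), size):
            cand = set(candidate)
            if any(d <= cand for d in found):
                continue  # contains a smaller diagnosis, so not minimal
            if all(cand & conflict for conflict in conflicts):
                found.append(cand)  # hits every conflict
    return found

# Hypothetical circuit with two adders and three multipliers.
components = {"A1", "A2", "M1", "M2", "M3"}
# Each conflict is a set of components that cannot all be healthy.
conflicts = [{"A1", "M1", "M2"}, {"A1", "A2", "M1", "M3"}]
diagnoses = minimal_diagnoses(components, conflicts)
```

Enumerating by increasing size relies on the finite powerset structure, which is exactly the property the abstract says is lost in the more general setting.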

    Advances in Robotics, Automation and Control

    The book presents an overview of recent developments in the different areas of Robotics, Automation and Control. Through its 24 chapters, it presents topics related to control and robot design; it also introduces new mathematical tools and techniques devoted to improving system modeling and control. An important point is the use of rational agents and heuristic techniques to cope with the computational complexity required for controlling complex systems. The book also covers navigation and vision algorithms, automatic handwriting comprehension, and speech recognition systems that will be included in the next generation of productive systems