96 research outputs found
A Game-Theoretic approach to Fault Diagnosis of Hybrid Systems
Physical systems can fail. For this reason the problem of identifying and
reacting to faults has received considerable attention in the control and computer
science communities. In this paper we study the fault diagnosis problem for
hybrid systems from a game-theoretical point of view. A hybrid system is a
system mixing continuous and discrete behaviours that can be faithfully
modeled neither by a formalism with continuous dynamics only nor by a
formalism including only discrete dynamics. We use the well known framework of
hybrid automata for modeling hybrid systems, and we define a Fault Diagnosis
Game on them, using two players: the environment and the diagnoser. The
environment controls the evolution of the system and chooses whether and when a
fault occurs. The diagnoser observes the external behaviour of the system and
announces whether a fault has occurred or not. Existence of a winning strategy
for the diagnoser implies that faults can be detected correctly, while
computing such a winning strategy corresponds to implementing a diagnoser for the
system. We will show how to determine the existence of a winning strategy, and
how to compute it, for some decidable classes of hybrid automata like o-minimal
hybrid automata. Comment: In Proceedings GandALF 2011, arXiv:1106.081
Designing Diagnosable Distributed Programs.
The difficulty in debugging distributed programs motivates the development of formal methods for designing distributed programs that are easier to debug and maintain. We address the state identification problem for distributed systems using the finite state I/O automaton model. A state S is identified based on the unique event sequences starting at S, called distinguishing sequences. An automaton is diagnosable if every state has a distinguishing sequence. A distributed program may not be diagnosable even if its components are diagnosable. Non-diagnosable automata can, in some cases, be converted to a diagnosable form by relabelling some of their transitions in a way that preserves the semantics of the program. Not all automata can be converted to a diagnosable form in this way, owing to the inherent ill-posedness of the specification. Two algorithms to convert a non-diagnosable automaton to a diagnosable form are presented. Debugging is the controlled execution of one program by another; the latter is called the supervisor of the former. The supervision operation is defined so that the debugging of a distributed program by distributed debuggers reduces to the debugging of a single program by a single debugger. An algorithm to construct a debugger for a diagnosable program is developed. Every diagnosable program has a unique debugger associated with it. This leads to the introduction of the notion of debugging complexity of programs
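The idea of a distinguishing sequence can be illustrated with a small sketch. It is not the algorithm from the work above; it assumes a deterministic automaton stored as a hypothetical `transitions` dictionary (state to `{event: next_state}`), a bounded search depth `max_len`, and the reading that a distinguishing sequence is an event sequence executable from exactly one state.

```python
def traces(transitions, state, max_len):
    """Return all event sequences of length 1..max_len executable from `state`."""
    result = set()
    frontier = [((), state)]
    for _ in range(max_len):
        next_frontier = []
        for seq, current in frontier:
            for event, target in transitions.get(current, {}).items():
                new_seq = seq + (event,)
                result.add(new_seq)
                next_frontier.append((new_seq, target))
        frontier = next_frontier
    return result


def distinguishing_sequences(transitions, max_len):
    """Map each state to a shortest sequence no other state can execute, or None."""
    states = set(transitions) | {
        target for out in transitions.values() for target in out.values()
    }
    all_traces = {s: traces(transitions, s, max_len) for s in states}
    result = {}
    for s in states:
        others = set().union(*(all_traces[o] for o in states if o != s))
        unique = all_traces[s] - others
        result[s] = min(unique, key=lambda q: (len(q), q)) if unique else None
    return result


def is_diagnosable(transitions, max_len):
    """True if every state has a distinguishing sequence within the bound."""
    found = distinguishing_sequences(transitions, max_len)
    return all(seq is not None for seq in found.values())


# Both states emit the same event, so neither can be told apart.
ambiguous = {"A": {"a": "B"}, "B": {"a": "A"}}
# Relabelling one transition makes the two states distinguishable.
relabelled = {"A": {"a": "B"}, "B": {"b": "A"}}

print(is_diagnosable(ambiguous, 3))   # False
print(is_diagnosable(relabelled, 3))  # True
```

The second automaton mirrors the relabelling step mentioned in the abstract: the structure is unchanged, and only one event label differs.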
Fault diagnosis of distributed systems: analysis, simulation and performance measurement.
Fault diagnosis forms an essential component in the design of highly reliable distributed
computing systems. Early models for diagnosis require a global observer, whereas the
diagnosis is shared between the system's nodes in later models. These models are reviewed and their different diagnosability properties reconciled. The design of improved fault diagnosis algorithms for systems without a global observer provides the main motivation for the thesis. The modified algorithm SELF3 [Hoss88] is taken as a starting point.
A number of communication architectures used in distributed systems are reviewed. The
properties of diagnosis algorithms depend strongly on the testing graph. A general class
of testing graphs, designated as H-graphs (which are a generalization of the Dδt graphs
introduced in [Prep67]), is investigated and its diagnostic properties determined.
A software simulator for distributed systems has been written as the main investigative
tool for diagnosis algorithms. The design and structure of the simulator are described.
The diagnosis process is measured in terms of diagnostic time and number of messages
produced, and the factors upon which these quantities depend are identified. The results
of simulation of a number of systems are given under various fault conditions. A modified
way of routing diagnosis messages, which, especially in large systems, results in a
reduction in both the number of diagnosis messages and the time required to perform
diagnosis, is presented. The thesis also contains a number of specific recommendations
for improving existing self-diagnosis algorithms
Intermittent/transient fault phenomena in digital systems
An overview of the intermittent/transient (IT) fault study is presented. An interval survivability evaluation of digital systems for IT faults is discussed along with a method for detecting and diagnosing IT faults in digital systems
Autonomous Recovery Of Reconfigurable Logic Devices Using Priority Escalation Of Slack
Field Programmable Gate Array (FPGA) devices offer a suitable platform for survivable hardware architectures in mission-critical systems. In this dissertation, active dynamic redundancy-based fault-handling techniques are proposed which exploit the dynamic partial reconfiguration capability of SRAM-based FPGAs. Self-adaptation is realized by employing reconfiguration in detection, diagnosis, and recovery phases. To extend these concepts to semiconductor aging and process variation in the deep submicron era, resilient adaptable processing systems are sought to maintain quality and throughput requirements despite the vulnerabilities of the underlying computational devices. A new approach to autonomous fault-handling which addresses these goals is developed using only a uniplex hardware arrangement. It operates by observing a health metric to achieve Fault Demotion using Reconfigurable Slack (FaDReS). Here an autonomous fault isolation scheme is employed which neither requires test vectors nor suspends the computational throughput, but instead observes the value of a health metric based on runtime input. The deterministic flow of the fault isolation scheme guarantees success in a bounded number of reconfigurations of the FPGA fabric. FaDReS is then extended to the Priority Using Resource Escalation (PURE) online redundancy scheme, which considers fault-isolation latency and throughput trade-offs under a dynamic spare arrangement. While deep-submicron designs introduce new challenges, the use of adaptive techniques is seen to provide several promising avenues for improving resilience. The scheme developed is demonstrated by hardware design of various signal processing circuits and their implementation on a Xilinx Virtex-4 FPGA device. These include a Discrete Cosine Transform (DCT) core, Motion Estimation (ME) engine, Finite Impulse Response (FIR) Filter, Support Vector Machine (SVM), and Advanced Encryption Standard (AES) blocks in addition to MCNC benchmark circuits. A significant reduction in power consumption is achieved, ranging from 83% for low motion-activity scenes to 12.5% for high motion-activity video scenes, in a novel ME engine configuration. For a typical benchmark video sequence, PURE is shown to maintain a PSNR baseline near 32 dB. The diagnosability, reconfiguration latency, and resource overhead of each approach are analyzed. Compared to previous alternatives, PURE maintains a PSNR within a difference of 4.02 dB to 6.67 dB from the fault-free baseline by escalating healthy resources to higher-priority signal processing functions. The results indicate the benefits of priority-aware resiliency over conventional redundancy approaches in terms of fault-recovery, power consumption, and resource-area requirements. Together, these provide a broad range of strategies to achieve autonomous recovery of reconfigurable logic devices under a variety of constraints, operating conditions, and optimization criteria
Design and Evaluation of Online Fault Diagnosis Protocols for Wireless Networks
Any node in a network, or a component of it, may fail and show undesirable behavior due to physical defects, imperfections, or hardware and/or software related glitches. The presence of faulty hosts in the network affects computational efficiency and quality of service (QoS). This calls for the development of efficient fault diagnosis protocols to detect and handle faulty hosts. Fault diagnosis protocols designed for wired networks cannot be applied directly to wireless networks, due to differences in characteristics and requirements. This thesis investigates system-level fault diagnosis protocols for wireless networks, particularly for Mobile Ad hoc Networks (MANETs) and Wireless Sensor Networks (WSNs), considering faults based on their persistence (permanent, intermittent, and transient) and node mobility. Based on comparisons of the outcomes of the same tasks (the comparison model), a distributed diagnosis protocol is proposed for static-topology MANETs, where a node is required to respond to only one test request from its neighbors, reducing the communication complexity of the diagnosis process. A novel approach to handling the more intractable intermittent faults in dynamic-topology MANETs is also discussed. Based on the spatial correlation of sensor measurements, a distributed fault diagnosis protocol is developed to classify the nodes in WSNs as fault-free, permanently faulty, or intermittently faulty. Nodes affected by transient faults are often considered fault-free and should not be isolated from the network. With this objective in mind, we have developed a diagnosis algorithm for WSNs to discriminate transient faults from intermittent and permanent faults. After each node finds the status of all its 1-hop neighbors (local diagnostic view), these views are disseminated among the fault-free nodes to deduce the fault status of all nodes in the network (global diagnostic view). A spanning-tree-based dissemination strategy is adopted, instead of conventional flooding, to reduce communication complexity. Analytically, the proposed protocols are shown to be correct and complete. The protocols are implemented using INET-20111118 (for MANETs) and Castalia-3.2 (for WSNs) on the OMNeT++ 4.2 platform. The simulation results obtained for accuracy and false alarm rate vouch for the feasibility and efficiency of the proposed algorithms over existing landmark protocols
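To give a feel for why a spanning tree can lower the dissemination cost compared with flooding, here is a minimal sketch under simplifying assumptions that are not taken from the thesis: a static, connected, undirected network given as a hypothetical `adjacency` dictionary, a single source, one message per link traversal, and a flooding rule in which each node forwards once to every neighbour except the one it first heard from.

```python
from collections import deque


def bfs_spanning_tree(adjacency, root):
    """Return a parent map describing a BFS spanning tree rooted at `root`."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return parent


def tree_broadcast_messages(adjacency, root):
    """One message per tree edge: each reachable non-root node receives exactly one."""
    return len(bfs_spanning_tree(adjacency, root)) - 1


def flooding_messages(adjacency, root):
    """Each node forwards once to all neighbours except its first sender."""
    reachable = bfs_spanning_tree(adjacency, root)  # used only for reachability
    total = 0
    for node in reachable:
        degree = len(adjacency[node])
        total += degree if node == root else degree - 1
    return total


# Hypothetical 4-node network with 5 links.
adjacency = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}

print(tree_broadcast_messages(adjacency, 0))  # 3
print(flooding_messages(adjacency, 0))        # 7
```

Under these assumptions the tree needs n - 1 messages per broadcast while flooding needs 2E - (n - 1), so the gap widens as the network becomes denser.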
A More General Theory of Diagnosis from First Principles
Model-based diagnosis has been an active research topic in different
communities including artificial intelligence, formal methods, and control.
This has led to a set of disparate approaches addressing different classes of
systems and seeking different forms of diagnoses. In this paper, we resolve
such disparities by generalising Reiter's theory to be agnostic to the types of
systems and diagnoses considered. This more general theory of diagnosis from
first principles defines the minimal diagnosis as the set of preferred
diagnosis candidates in a search space of hypotheses. Computing the minimal
diagnosis is achieved by exploring the space of diagnosis hypotheses, testing
sets of hypotheses for consistency with the system's model and the observation,
and generating conflicts that rule out successors and other portions of the
search space. Under relatively mild assumptions, our algorithms correctly
compute the set of preferred diagnosis candidates. The main difficulty here is
that the search space is no longer a powerset as in Reiter's theory, and that,
as a consequence, many of the implicit properties (such as finiteness of the
search space) no longer hold. The notion of conflict also needs to be
generalised and we present such a more general notion. We present two
implementations of these algorithms, using test solvers based on satisfiability
and heuristic search, respectively, which we evaluate on instances from two
real-world discrete event problems. Despite the greater generality of our
theory, these implementations surpass the special-purpose algorithms designed
for discrete event systems, and enable solving instances that were out of reach
of existing diagnosis approaches
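For orientation, the classical powerset setting that the abstract generalises can be sketched briefly. In Reiter's original theory, minimal diagnoses are the minimal hitting sets of the conflict sets. The brute-force enumeration below illustrates only that classical relationship; it is not the generalised algorithm of the paper, and the component names are hypothetical.

```python
from itertools import combinations


def minimal_hitting_sets(conflicts):
    """Enumerate minimal sets that intersect every conflict (brute force)."""
    components = sorted(set().union(*conflicts)) if conflicts else []
    found = []
    # Candidates are visited by increasing size, so any candidate that
    # contains an already accepted hitting set cannot be minimal.
    for size in range(len(components) + 1):
        for candidate in combinations(components, size):
            cand = set(candidate)
            if any(hit <= cand for hit in found):
                continue
            if all(cand & conflict for conflict in conflicts):
                found.append(cand)
    return found


# Two hypothetical conflicts: each says "at least one of these is faulty".
conflicts = [{"M1", "M2", "A1"}, {"M1", "M3", "A1", "A2"}]

for diagnosis in minimal_hitting_sets(conflicts):
    print(sorted(diagnosis))
```

The enumeration is exponential in the number of components and relies on the search space being a finite powerset, which is precisely the property the abstract says no longer holds in the general theory.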
Advances in Robotics, Automation and Control
The book presents an excellent overview of recent developments in the different areas of Robotics, Automation and Control. Through its 24 chapters, this book presents topics related to control and robot design; it also introduces new mathematical tools and techniques devoted to improving system modeling and control. An important point is the use of rational agents and heuristic techniques to cope with the computational complexity required for controlling complex systems. The book also covers navigation and vision algorithms, automatic handwriting comprehension and speech recognition systems that will be included in the next generation of productive systems developed by man