82,536 research outputs found

    Runtime Enforcement for Component-Based Systems

    Get PDF
    Runtime enforcement is an increasingly popular and effective dynamic validation technique aiming to ensure the correct runtime behavior (w.r.t. a formal specification) of systems using a so-called enforcement monitor. In this paper we introduce runtime enforcement of specifications on component-based systems (CBS) modeled in the BIP (Behavior, Interaction and Priority) framework. BIP is a powerful and expressive component-based framework for formal construction of heterogeneous systems. However, because of BIP expressiveness, it remains difficult to enforce at design-time complex behavioral properties. First we propose a theoretical runtime enforcement framework for CBS where we delineate a hierarchy of sets of enforceable properties (i.e., properties that can be enforced) according to the number of observational steps a system is allowed to deviate from the property (i.e., the notion of k-step enforceability). To ensure the observational equivalence between the correct executions of the initial system and the monitored system, we show that i) only stutter-invariant properties should be enforced on CBS with our monitors, ii) safety properties are 1-step enforceable. Given an abstract enforcement monitor (as a finite-state machine) for some 1-step enforceable specification, we formally instrument (at relevant locations) a given BIP system to integrate the monitor. At runtime, the monitor observes and automatically avoids any error in the behavior of the system w.r.t. the specification. Our approach is fully implemented in an available tool that we used to i) avoid deadlock occurrences on a dining philosophers benchmark, and ii) ensure the correct placement of robots on a map.Comment: arXiv admin note: text overlap with arXiv:1109.5505 by other author

    A Survey of Fault-Tolerance and Fault-Recovery Techniques in Parallel Systems

    Full text link
    Supercomputing systems today often come in the form of large numbers of commodity systems linked together into a computing cluster. These systems, like any distributed system, can have large numbers of independent hardware components cooperating or collaborating on a computation. Unfortunately, any of this vast number of components can fail at any time, resulting in potentially erroneous output. In order to improve the robustness of supercomputing applications in the presence of failures, many techniques have been developed to provide resilience to these kinds of system faults. This survey provides an overview of these various fault-tolerance techniques.Comment: 11 page

    Rapid Recovery for Systems with Scarce Faults

    Full text link
    Our goal is to achieve a high degree of fault tolerance through the control of a safety critical systems. This reduces to solving a game between a malicious environment that injects failures and a controller who tries to establish a correct behavior. We suggest a new control objective for such systems that offers a better balance between complexity and precision: we seek systems that are k-resilient. In order to be k-resilient, a system needs to be able to rapidly recover from a small number, up to k, of local faults infinitely many times, provided that blocks of up to k faults are separated by short recovery periods in which no fault occurs. k-resilience is a simple but powerful abstraction from the precise distribution of local faults, but much more refined than the traditional objective to maximize the number of local faults. We argue why we believe this to be the right level of abstraction for safety critical systems when local faults are few and far between. We show that the computational complexity of constructing optimal control with respect to resilience is low and demonstrate the feasibility through an implementation and experimental results.Comment: In Proceedings GandALF 2012, arXiv:1210.202

    Real-time and fault tolerance in distributed control software

    Get PDF
    Closed loop control systems typically contain multitude of spatially distributed sensors and actuators operated simultaneously. So those systems are parallel and distributed in their essence. But mapping this parallelism onto the given distributed hardware architecture, brings in some additional requirements: safe multithreading, optimal process allocation, real-time scheduling of bus and network resources. Nowadays, fault tolerance methods and fast even online reconfiguration are becoming increasingly important. All those often conflicting requirements, make design and implementation of real-time distributed control systems an extremely difficult task, that requires substantial knowledge in several areas of control and computer science. Although many design methods have been proposed so far, none of them had succeeded to cover all important aspects of the problem at hand. [1] Continuous increase of production in embedded market, makes a simple and natural design methodology for real-time systems needed more then ever

    Self-repair ability of evolved self-assembling systems in cellular automata

    Get PDF
    Self-repairing systems are those that are able to reconfigure themselves following disruptions to bring them back into a defined normal state. In this paper we explore the self-repair ability of some cellular automata-like systems, which differ from classical cellular automata by the introduction of a local diffusion process inspired by chemical signalling processes in biological development. The update rules in these systems are evolved using genetic programming to self-assemble towards a target pattern. In particular, we demonstrate that once the update rules have been evolved for self-assembly, many of those update rules also provide a self-repair ability without any additional evolutionary process aimed specifically at self-repair
    • …
    corecore