
    A Short Counterexample Property for Safety and Liveness Verification of Fault-tolerant Distributed Algorithms

    Distributed algorithms have many mission-critical applications ranging from embedded systems and replicated databases to cloud computing. Due to asynchronous communication, process faults, or network failures, these algorithms are difficult to design and verify. Many algorithms achieve fault tolerance by using threshold guards that, for instance, ensure that a process waits until it has received an acknowledgment from a majority of its peers. Consequently, domain-specific languages for fault-tolerant distributed systems offer language support for threshold guards. We introduce an automated method for model checking of safety and liveness of threshold-guarded distributed algorithms in systems where the number of processes and the fraction of faulty processes are parameters. Our method is based on a short counterexample property: if a distributed algorithm violates a temporal specification (in a fragment of LTL), then there is a counterexample whose length is bounded and independent of the parameters. We prove this property by (i) characterizing executions depending on the structure of the temporal formula, and (ii) using commutativity of transitions to accelerate and shorten executions. We extended the ByMC toolset (Byzantine Model Checker) with our technique, and verified liveness and safety of 10 prominent fault-tolerant distributed algorithms, most of which were out of reach for existing techniques. Comment: 16 pages, 11 pages appendi

    Automatic Generation of Minimal Cut Sets

    A cut set is a collection of component failure modes that could lead to a system failure. Cut Set Analysis (CSA) is applied to critical systems to identify and rank system vulnerabilities at design time. Model checking tools have been used to automate the generation of minimal cut sets but are generally based on checking reachability of system failure states. This paper describes a new approach to CSA using a Linear Temporal Logic (LTL) model checker called BT Analyser that supports the generation of multiple counterexamples. The approach enables a broader class of system failures to be analysed, by generalising from failure state formulae to failure behaviours expressed in LTL. The traditional approach to CSA using model checking requires the model or system failure to be modified, usually by hand, to eliminate already-discovered cut sets, and the model checker to be rerun, at each step. By contrast, the new approach works incrementally and fully automatically, thereby removing the tedious and error-prone manual process and resulting in significantly reduced computation time. This in turn enables larger models to be checked. Two different strategies for using BT Analyser for CSA are presented. There is generally no single best strategy for model checking: their relative efficiency depends on the model and property being analysed. Comparative results are given for the A320 hydraulics case study in the Behavior Tree modelling language. Comment: In Proceedings ESSS 2015, arXiv:1506.0325

    Development of a framework for automated systematic testing of safety-critical embedded systems

    “This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.” “Copyright IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.” In this paper we introduce the development of a framework for testing safety-critical embedded systems based on the concepts of model-based testing. In model-based testing, the test cases are derived from a model of the system under test. In our approach, the model is an automaton model that is automatically extracted from the C source code of the system under test. Besides random test data generation, the test case generation uses formal methods, specifically model checking techniques. To find appropriate test cases, we use the requirements defined in the system specification. To cover further execution paths, we developed an additional method, novel to the best of our knowledge, based on special structural coverage criteria. We present preliminary results on the model extraction using a concrete industrial case study from the automotive domain.

    Embedding runtime verification post-deployment for real-time health management of safety-critical systems

    As cyber-physical systems increase in both complexity and criticality, formal methods have gained traction for design-time verification of safety properties. A lightweight formal method, runtime verification (RV), embeds checks necessary for safety-critical system health management; however, these techniques have been slow to appear in practice despite repeated calls by both industry and academia to leverage them. Additionally, the state of the art in RV lacks a best-practice approach when a deployed system requires increased flexibility due to a change in mission, or in response to an emergent condition not accounted for at design time. Human-robot interaction necessitates stringent safety guarantees to protect humans sharing the workspace, particularly in hazardous environments. For example, Robonaut2 (R2) developed an emergent fault while deployed to the International Space Station. Possibly inaccurate actuator readings trigger the R2 safety system, preventing further motion of a joint until a ground-control operator determines the root cause and initiates proper corrective action. Operator time is scarce and expensive; when waiting, R2 is an obstacle instead of an asset. We adapt the Realizable, Responsive, Unobtrusive Unit (R2U2) RV framework for resource-constrained environments. We retrofit the R2 motor controller, embedding R2U2 within the remaining resources of the Field-Programmable Gate Array (FPGA) controlling the joint actuator. We add online, stream-based, real-time system health monitoring in a provably unobtrusive way that does not interfere with the control of the joint. We design and embed formal temporal logic specifications that disambiguate the emergent faults and enable automated corrective actions. We give an overview of the challenges and techniques for formally specifying behaviors of an existing command and data bus. We present our specification debugging, validation, and refinement steps. We demonstrate success in the Robonaut2 case study, then detail effective techniques and lessons learned from adding RV with real-time fault disambiguation under the constraints of a deployed system.

    Fault-injection through model checking via naive assumptions about state machine synchrony semantics

    Software behavior can be defined as the action or reaction of software to external and/or internal conditions. Software behavior is an important characteristic in determining software quality. Fault-injection is a method to assess software quality through its behavior. Our research involves a fault-injection process combined with model checking. We introduce a concept of naive assumptions which exploits the assumptions of execution order, synchrony and fairness. Naive assumptions are applied to inject faults into our models. We use linear temporal logic to examine the model for anomalous behaviors. This method shows us the benefits of using fault-injection and model checking and the advantage of the counter-examples generated by model checkers. We illustrate this technique on a fuel injection Sensor Failure Detection system and discuss the anomalies in detail.