229 research outputs found

    A Short Counterexample Property for Safety and Liveness Verification of Fault-tolerant Distributed Algorithms

    Distributed algorithms have many mission-critical applications ranging from embedded systems and replicated databases to cloud computing. Due to asynchronous communication, process faults, or network failures, these algorithms are difficult to design and verify. Many algorithms achieve fault tolerance by using threshold guards that, for instance, ensure that a process waits until it has received an acknowledgment from a majority of its peers. Consequently, domain-specific languages for fault-tolerant distributed systems offer language support for threshold guards. We introduce an automated method for model checking of safety and liveness of threshold-guarded distributed algorithms in systems where the number of processes and the fraction of faulty processes are parameters. Our method is based on a short counterexample property: if a distributed algorithm violates a temporal specification (in a fragment of LTL), then there is a counterexample whose length is bounded and independent of the parameters. We prove this property by (i) characterizing executions depending on the structure of the temporal formula, and (ii) using commutativity of transitions to accelerate and shorten executions. We extended the ByMC toolset (Byzantine Model Checker) with our technique, and verified liveness and safety of 10 prominent fault-tolerant distributed algorithms, most of which were out of reach for existing techniques. Comment: 16 pages, 11 pages appendix

    ByMC: Byzantine Model Checker

    In recent work [12,10], we have introduced a technique for automatic verification of threshold-guarded distributed algorithms that have the following features: (1) up to t processes may crash or behave Byzantine; (2) the correct processes count messages and progress when they receive sufficiently many messages, e.g., at least t + 1; (3) the number n of processes in the system is a parameter, as well as t; and (4) the parameters are restricted by a resilience condition, e.g., n > 3t. In this paper, we present Byzantine Model Checker, which implements the above-mentioned technique. It takes two kinds of inputs, namely, (i) threshold automata (the framework of our verification techniques) or (ii) Parametric Promela (which is similar to the way in which the distributed algorithms were described in the literature). We introduce a parallel extension of the tool, which exploits the parallelism enabled by our technique on an MPI cluster. We compare performance of the original technique and of the extensions by verifying 10 benchmarks that model fault-tolerant distributed algorithms from the literature. For each benchmark algorithm we check two encodings: a manual encoding in threshold automata vs. a Promela encoding.

    Specification and verification of network algorithms using temporal logic

    In software engineering, formal methods are mathematically based techniques used in the specification, development and verification of algorithms and programs in order to provide reliability and robustness of systems. One of the most difficult challenges for software engineering is to tackle the complexity of algorithms and software found in concurrent systems. Networked systems have come to prominence in many aspects of modern life, and therefore software engineering techniques for treating concurrency in such systems have acquired a particular importance. Algorithms in the software of concurrent systems are used to accomplish certain tasks which need to comply with the properties required of the system as a whole. These properties can be broadly subdivided into 'safety properties', where the requirement is that 'nothing bad will happen', and 'liveness properties', where the requirement is that 'something good will happen'. As such, specifying network algorithms and their safety and liveness properties through formal methods is the aim of the research presented in this thesis. Since temporal logic has proved to be a successful technique in formal methods, with various practical applications due to the availability of powerful model-checking tools such as the NuSMV model checker, we investigate the specification and verification of network algorithms using temporal logic and model checking. In the first part of the thesis, we specify and verify safety properties for network algorithms. We use temporal logic to prove the safety property of data consistency, or serializability, for a model of the execution of an unbounded number of concurrent transactions over time, which could represent software schedulers for an unknown number of transactions being present in a network. In the second part of the thesis, we specify and verify the liveness properties of networked flooding algorithms.
Considering the above in more detail, the first part of this thesis specifies a model of the execution of an unbounded number of concurrent transactions over time in propositional Linear Temporal Logic (LTL) in order to prove serializability. This is made possible by assuming that data items are ordered and that the transactions accessing these data items respect this order, as then there is a bound on the number of transactions that need to be considered to prove serializability. In particular, we make use of recent work which places such bounds on the number of transactions needed when data items are accessed in order, but do not have to be accessed contiguously, i.e., there may be 'gaps' in the data items being accessed by individual transactions. Our aim is to specify the concurrent modification of data held on routers in a network as a transactional model. The correctness of the routing protocol, and with it safety and reliability, then corresponds to the serializability of the transactions. We specify an example of routing in a network and the corresponding serializability condition in LTL. This is then encoded in the NuSMV model checker and proofs are performed. The novelty of this part is that no previous research has provided a method for detecting serializability and cycles for an unlimited number of transactions accessing the data on routers where the transactions' accesses to the data items may have gaps. In addition, linear temporal logic has not previously been used in this scenario to prove correctness of the network system. This part is helpful for network administrative protocols, where it is critical to maintain correctness of the system. This safety property can be maintained using the presented work, in which cycles in transactions accessing the data items are detected by checking only a limited number of cycles rather than all possible cycles that can be caused by the network transactions.
The second part of the thesis offers two contributions. Firstly, we specify the basic synchronous network flooding algorithm, for any fixed size of network, in LTL. The specification can be customized to any single network topology or class of topologies. A specification for the termination problem is formulated and used to compare different topologies with regard to earlier termination. We give a worked example of one topology resulting in earlier termination than another, for which we perform a formal verification using the NuSMV model checker. The novelty of the second part lies in using linear temporal logic and the NuSMV model checker to specify and verify the liveness property of the flooding algorithm. The presented work addresses a very difficult scenario in which the network nodes are memoryless. This makes detecting the termination of network flooding very complicated, especially in networks with complex topologies. In the literature, researchers focussed on using testing and simulations to detect flooding termination. In this work, we used a robust technique and a rigorous method to specify and verify the synchronous flooding algorithm and its termination. We also showed that linear temporal logic and the NuSMV model checker can be used to compare synchronous flooding termination between topologies. Adding to the novelty of the second contribution, beyond the synchronous form of the network flooding algorithm, we provide a formal model of bounded asynchronous network flooding by extending the synchronous flooding model to allow a sent message, non-deterministically, to either be received instantaneously or enter a transit phase prior to being received. A generalization of 'rounds' from synchronous flooding to the asynchronous case is used as a unit of time to provide a measure of time to termination, as the number of rounds taken, for a run of an asynchronous system.
The model is encoded into temporal logic and a proof obligation is given for comparing the termination times of asynchronous and synchronous systems. Worked examples are formally verified using the NuSMV model checker. This work offers a constraint-based methodology for the verification of liveness properties of software algorithms distributed across the nodes in a network.
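
As a rough illustration of synchronous flooding with memoryless nodes, the following Python sketch counts the rounds until no message is in transit. The forwarding rule used here (a node forwards to every neighbour it did not just receive from, and remembers nothing between rounds) is an assumption, since the abstract does not state the exact rule. The names `flood_rounds` and `adjacency` are hypothetical. This is a simulation of single runs, not the LTL and NuSMV verification the thesis describes.

```python
from typing import Dict, Hashable, Set, Tuple

Node = Hashable


def flood_rounds(adjacency: Dict[Node, Set[Node]], source: Node,
                 max_rounds: int = 10_000) -> int:
    """Count the rounds in which at least one message is sent.

    Assumed rule: a node that receives the message in a round forwards it,
    in the next round, to all neighbours except those it just received it
    from. Nodes keep no state between rounds (memoryless).
    """
    # Messages sent in the current round, as (sender, receiver) pairs.
    in_flight: Set[Tuple[Node, Node]] = {(source, v) for v in adjacency[source]}
    rounds = 0
    while in_flight:
        if rounds >= max_rounds:
            raise RuntimeError("no termination within max_rounds")
        rounds += 1
        # Group the senders of this round's messages by receiver.
        senders: Dict[Node, Set[Node]] = {}
        for sender, receiver in in_flight:
            senders.setdefault(receiver, set()).add(sender)
        # Each receiver forwards to neighbours it did not just hear from.
        in_flight = {
            (receiver, v)
            for receiver, heard_from in senders.items()
            for v in adjacency[receiver]
            if v not in heard_from
        }
    return rounds


# Two small topologies on three nodes, both flooded from node "a".
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
path_rounds = flood_rounds(path, "a")
triangle_rounds = flood_rounds(triangle, "a")
```

Under the assumed rule, the path terminates in fewer rounds than the triangle, which is the kind of topology comparison the abstract refers to. A model checker would establish such a result for all runs of the specification rather than for one simulated execution.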

    Model Checking of Fault-Tolerant Distributed Algorithms: from Classics towards Contemporary

    Fault-tolerant distributed algorithms, such as agreement, reliable broadcast, and consensus, lie at the heart of distributed systems. Although these algorithms are tiny in comparison to the rest of the system code, they are hard to design and verify. In this short research statement, we discuss the Byzantine model checker, which was developed for automatic verification of asynchronous fault-tolerant distributed algorithms. Further, we discuss the challenges that are posed by contemporary protocols for blockchain consensus.

    Survey on Parameterized Verification with Threshold Automata and the Byzantine Model Checker

    Threshold guards are a basic primitive of many fault-tolerant algorithms that solve classical problems in distributed computing, such as reliable broadcast, two-phase commit, and consensus. Moreover, threshold guards can be found in recent blockchain algorithms such as Tendermint consensus. In this article, we give an overview of techniques for automated verification of threshold-guarded fault-tolerant distributed algorithms, implemented in the Byzantine Model Checker (ByMC). These threshold-guarded algorithms have the following features: (1) up to t processes may crash or behave Byzantine; (2) the correct processes count messages and make progress when they receive sufficiently many messages, e.g., at least t + 1; (3) the number n of processes in the system is a parameter, as well as the number t of faults; and (4) the parameters are restricted by a resilience condition, e.g., n > 3t. Traditionally, these algorithms were implemented in distributed systems with up to ten participating processes. Nowadays, they are implemented in distributed systems that involve hundreds or thousands of processes. To make sure that these algorithms are still correct for that scale, it is imperative to verify them for all possible values of the parameters.
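
The four features listed above can be made concrete with a minimal Python sketch of a threshold guard and a resilience condition. The names `Params`, `guard_enabled`, and `check_small_instances` are hypothetical and are not part of ByMC. The check enumerates small parameter values only, whereas the techniques surveyed here verify all parameter values at once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Params:
    """System parameters: n processes in total, at most t of them faulty."""
    n: int
    t: int

    def resilient(self) -> bool:
        # Resilience condition used as an example in the abstract: n > 3t.
        return self.n > 3 * self.t


def guard_enabled(received: int, params: Params) -> bool:
    """Example threshold guard: at least t + 1 messages have been received."""
    return received >= params.t + 1


def check_small_instances(max_n: int) -> bool:
    """Brute-force two sanity properties for every resilient (n, t), n <= max_n.

    1. Messages from the t faulty processes alone never enable the guard.
    2. Messages from the n - t correct processes alone always enable it.
    """
    for n in range(1, max_n + 1):
        for t in range(0, n):
            params = Params(n, t)
            if not params.resilient():
                continue
            if guard_enabled(params.t, params):
                return False
            if not guard_enabled(params.n - params.t, params):
                return False
    return True


all_ok = check_small_instances(max_n=50)
```

Enumeration like this covers only finitely many instances, so it says nothing about systems with hundreds or thousands of processes. That gap is what motivates parameterized verification over all values of n and t that satisfy the resilience condition.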

    Guard Automata for the Verification of Safety and Liveness of Distributed Algorithms

    Distributed algorithms typically run over arbitrarily many processes and may involve unboundedly many rounds, making the automated verification of their correctness challenging. Building on domain theory, we introduce a framework that abstracts infinite-state distributed systems that represent distributed algorithms into finite-state guard automata. The soundness of the approach corresponds to the Scott-continuity of the abstraction, which relies on the assumption that the distributed algorithms are layered. Guard automata thus enable the verification of safety and liveness properties of distributed algorithms.

    A Formal Approach to Verify Parameterized Protocols in Mobile Cyber-Physical Systems
