8,817 research outputs found

    Separation of Circulating Tokens

    Full text link
    Self-stabilizing distributed control is often modeled by token abstractions. A system with a single token may implement mutual exclusion; a system with multiple tokens may ensure that immediate neighbors do not simultaneously enjoy a privilege. For a cyber-physical system, tokens may represent physical objects whose movement is controlled. The problem studied in this paper is to ensure that a synchronous system with m circulating tokens has at least d distance between tokens. This problem is first considered in a ring where d is given whilst m and the ring size n are unknown. The protocol solving this problem can be uniform, with all processes running the same program, or it can be non-uniform, with some processes acting only as token relays. The protocol for this first problem is simple, and can be expressed with Petri net formalism. A second problem is to maximize d when m is given, and n is unknown. For the second problem, the paper presents a non-uniform protocol with a single corrective process.Comment: 22 pages, 7 figures, epsf and pstricks in LaTe

    Automated Synthesis of Distributed Self-Stabilizing Protocols

    Full text link
    In this paper, we introduce an SMT-based method that automatically synthesizes a distributed self-stabilizing protocol from a given high-level specification and network topology. Unlike existing approaches, where synthesis algorithms require the explicit description of the set of legitimate states, our technique only needs the temporal behavior of the protocol. We extend our approach to synthesize ideal-stabilizing protocols, where every state is legitimate. We also extend our technique to synthesize monotonic-stabilizing protocols, where during recovery, each process can execute an most once one action. Our proposed methods are fully implemented and we report successful synthesis of well-known protocols such as Dijkstra's token ring, a self-stabilizing version of Raymond's mutual exclusion algorithm, ideal-stabilizing leader election and local mutual exclusion, as well as monotonic-stabilizing maximal independent set and distributed Grundy coloring

    Asynchronous neighborhood task synchronization

    Full text link
    Faults are likely to occur in distributed systems. The motivation for designing self-stabilizing system is to be able to automatically recover from a faulty state. As per Dijkstra\u27s definition, a system is self-stabilizing if it converges to a desired state from an arbitrary state in a finite number of steps. The paradigm of self-stabilization is considered to be the most unified approach to designing fault-tolerant systems. Any type of faults, e.g., transient, process crashes and restart, link failures and recoveries, and byzantine faults, can be handled by a self-stabilizing system; Many applications in distributed systems involve multiple phases. Solving these applications require some degree of synchronization of phases. In this thesis research, we introduce a new problem, called asynchronous neighborhood task synchronization ( NTS ). In this problem, processes execute infinite instances of tasks, where a task consists of a set of steps. There are several requirements for this problem. Simultaneous execution of steps by the neighbors is allowed only if the steps are different. Every neighborhood is synchronized in the sense that all neighboring processes execute the same instance of a task. Although the NTS problem is applicable in nonfaulty environments, it is more challenging to solve this problem considering various types of faults. In this research, we will present a self-stabilizing solution to the NTS problem. The proposed solution is space optimal, fault containing, fully localized, and fully distributed. One of the most desirable properties of our algorithm is that it works under any (including unfair) daemon. We will discuss various applications of the NTS problem

    Self-Stabilization in the Distributed Systems of Finite State Machines

    Get PDF
    The notion of self-stabilization was first proposed by Dijkstra in 1974 in his classic paper. The paper defines a system as self-stabilizing if, starting at any, possibly illegitimate, state the system can automatically adjust itself to eventually converge to a legitimate state in finite amount of time and once in a legitimate state it will remain so unless it incurs a subsequent transient fault. Dijkstra limited his attention to a ring of finite-state machines and provided its solution for self-stabilization. In the years following his introduction, very few papers were published in this area. Once his proposal was recognized as a milestone in work on fault tolerance, the notion propagated among the researchers rapidly and many researchers in the distributed systems diverted their attention to it. The investigation and use of self-stabilization as an approach to fault-tolerant behavior under a model of transient failures for distributed systems is now undergoing a renaissance. A good number of works pertaining to self-stabilization in the distributed systems were proposed in the yesteryears most of which are very recent. This report surveys all previous works available in the literature of self-stabilizing systems

    Computing in the RAIN: a reliable array of independent nodes

    Get PDF
    The RAIN project is a research collaboration between Caltech and NASA-JPL on distributed computing and data-storage systems for future spaceborne missions. The goal of the project is to identify and develop key building blocks for reliable distributed systems built with inexpensive off-the-shelf components. The RAIN platform consists of a heterogeneous cluster of computing and/or storage nodes connected via multiple interfaces to networks configured in fault-tolerant topologies. The RAIN software components run in conjunction with operating system services and standard network protocols. Through software-implemented fault tolerance, the system tolerates multiple node, link, and switch failures, with no single point of failure. The RAIN-technology has been transferred to Rainfinity, a start-up company focusing on creating clustered solutions for improving the performance and availability of Internet data centers. In this paper, we describe the following contributions: 1) fault-tolerant interconnect topologies and communication protocols providing consistent error reporting of link failures, 2) fault management techniques based on group membership, and 3) data storage schemes based on computationally efficient error-control codes. We present several proof-of-concept applications: a highly-available video server, a highly-available Web server, and a distributed checkpointing system. Also, we describe a commercial product, Rainwall, built with the RAIN technology

    Structural Invariants for the Verification of Systems with Parameterized Architectures

    Full text link
    We consider parameterized concurrent systems consisting of a finite but unknown number of components, obtained by replicating a given set of finite state automata. Components communicate by executing atomic interactions whose participants update their states simultaneously. We introduce an interaction logic to specify both the type of interactions (e.g.\ rendez-vous, broadcast) and the topology of the system (e.g.\ pipeline, ring). The logic can be easily embedded in monadic second order logic of finitely many successors, and is therefore decidable. Proving safety properties of such a parameterized system, like deadlock freedom or mutual exclusion, requires to infer an inductive invariant that contains all reachable states of all system instances, and no unsafe state. We present a method to automatically synthesize inductive invariants directly from the formula describing the interactions, without costly fixed point iterations. We experimentally prove that this invariant is strong enough to verify safety properties of a large number of systems including textbook examples (dining philosophers, synchronization schemes), classical mutual exclusion algorithms, cache-coherence protocols and self-stabilization algorithms, for an arbitrary number of components.Comment: preprint; to be published in the proceedings of TACAS2

    Exclusion and Object Tracking in a Network of Processes

    Get PDF
    This paper concerns two fundamental problems in distributed computing---mutual exclusion and mobile object tracking. For a variant of the mutual exclusion problem where the network topology is taken into account, all existing distributed solutions make use of tokens. It turns out that these token-based solutions for mutual exclusion can also be adapted for object tracking, as the token behaves very much like a mobile object. To handle objects with replication, we go further to consider the more general kk-exclusion problem which has not been as well studied in a network setting. A strong fairness property for kk-exclusion requires that a process trying to enter the critical section will eventually succeed even if \emph{up to} k1k-1 processes stay in the critical section indefinitely. We present a comparative survey of existing token-based mutual exclusion algorithms, which have provided much inspiration for later kk-exclusion algorithms. We then propose two solutions to the kk-exclusion problem, the second of which meets the strong fairness requirement. Fault-tolerance issues are also discussed along with the suggestion of a third algorithm that is also strongly fair. Performances of the three algorithms are compared by simulation. Finally, we show how the various exclusion algorithms can be adapted for tracking mobile objects

    Exploiting replication in distributed systems

    Get PDF
    Techniques are examined for replicating data and execution in directly distributed systems: systems in which multiple processes interact directly with one another while continuously respecting constraints on their joint behavior. Directly distributed systems are often required to solve difficult problems, ranging from management of replicated data to dynamic reconfiguration in response to failures. It is shown that these problems reduce to more primitive, order-based consistency problems, which can be solved using primitives such as the reliable broadcast protocols. Moreover, given a system that implements reliable broadcast primitives, a flexible set of high-level tools can be provided for building a wide variety of directly distributed application programs

    A Taxonomy of Daemons in Self-stabilization

    Full text link
    We survey existing scheduling hypotheses made in the literature in self-stabilization, commonly referred to under the notion of daemon. We show that four main characteristics (distribution, fairness, boundedness, and enabledness) are enough to encapsulate the various differences presented in existing work. Our naming scheme makes it easy to compare daemons of particular classes, and to extend existing possibility or impossibility results to new daemons. We further examine existing daemon transformer schemes and provide the exact transformed characteristics of those transformers in our taxonomy.Comment: 26 page
    corecore