Separation of Circulating Tokens
Self-stabilizing distributed control is often modeled by token abstractions.
A system with a single token may implement mutual exclusion; a system with
multiple tokens may ensure that immediate neighbors do not simultaneously enjoy
a privilege. For a cyber-physical system, tokens may represent physical objects
whose movement is controlled. The problem studied in this paper is to ensure
that a synchronous system with m circulating tokens has at least d distance
between tokens. This problem is first considered in a ring where d is given
whilst m and the ring size n are unknown. The protocol solving this problem can
be uniform, with all processes running the same program, or it can be
non-uniform, with some processes acting only as token relays. The protocol for
this first problem is simple, and can be expressed with Petri net formalism. A
second problem is to maximize d when m is given, and n is unknown. For the
second problem, the paper presents a non-uniform protocol with a single
corrective process. Comment: 22 pages, 7 figures, epsf and pstricks in LaTe
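The separation property described in this abstract can be illustrated with a small check. The following is a minimal sketch, not the paper's protocol: it only tests whether a given placement of tokens on a ring of size n keeps every pair of consecutive tokens at least d apart. The function names and the representation of token positions as integers are assumptions made for illustration.

```python
def min_token_gap(token_positions, n):
    """Smallest clockwise gap between consecutive tokens on a ring of size n."""
    pos = sorted(p % n for p in token_positions)
    if len(pos) < 2:
        return n  # a lone token is n steps away from itself around the ring
    # Gap from each token to the next one clockwise, wrapping at the end.
    gaps = [(pos[(i + 1) % len(pos)] - pos[i]) % n for i in range(len(pos))]
    return min(gaps)


def is_separated(token_positions, n, d):
    """True if every pair of consecutive tokens is at least d apart."""
    return min_token_gap(token_positions, n) >= d


def best_possible_separation(n, m):
    """Upper bound on d for m tokens on a ring of size n (even spacing)."""
    return n // m
```

For the second problem in the abstract (maximizing d for a given m), `best_possible_separation` gives the value an evenly spaced placement would reach; how a protocol converges to it is the subject of the paper and is not modeled here.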
Automated Synthesis of Distributed Self-Stabilizing Protocols
In this paper, we introduce an SMT-based method that automatically
synthesizes a distributed self-stabilizing protocol from a given high-level
specification and network topology. Unlike existing approaches, where synthesis
algorithms require the explicit description of the set of legitimate states,
our technique only needs the temporal behavior of the protocol. We extend our
approach to synthesize ideal-stabilizing protocols, where every state is
legitimate. We also extend our technique to synthesize monotonic-stabilizing
protocols, where during recovery, each process can execute at most one
action. Our proposed methods are fully implemented and we report successful
synthesis of well-known protocols such as Dijkstra's token ring, a
self-stabilizing version of Raymond's mutual exclusion algorithm,
ideal-stabilizing leader election and local mutual exclusion, as well as
monotonic-stabilizing maximal independent set and distributed Grundy coloring
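For readers unfamiliar with Dijkstra's token ring mentioned above, the following is a minimal simulation sketch of the classic K-state algorithm. It is hand-written for illustration and is not output of the synthesis method described in the abstract. It assumes a central daemon (one privileged process fires per step), modeled here by a seeded random choice, and K larger than the ring size; all function names are hypothetical.

```python
import random


def privileged(x, i):
    """Process 0 holds a token when it equals its predecessor (the last
    process); every other process holds one when it differs from it."""
    if i == 0:
        return x[0] == x[len(x) - 1]
    return x[i] != x[i - 1]


def fire(x, i, K):
    """Execute the action of privileged process i, in place."""
    if i == 0:
        x[0] = (x[0] + 1) % K   # the distinguished process increments
    else:
        x[i] = x[i - 1]         # the others copy their predecessor


def tokens(x):
    """Indices of all currently privileged processes."""
    return [i for i in range(len(x)) if privileged(x, i)]


def run_until_single_token(x, K, rng, max_steps=10_000):
    """Fire randomly chosen privileged processes until exactly one
    privilege remains; return the number of steps taken."""
    for step in range(max_steps):
        holders = tokens(x)
        if len(holders) == 1:
            return step
        fire(x, rng.choice(holders), K)
    raise RuntimeError("did not stabilize within max_steps")
```

Starting from an arbitrary assignment of values, for example `x = [3, 0, 5, 5, 1, 2]` with `K = 7`, calling `run_until_single_token(x, 7, random.Random(0))` drives the ring to a state with exactly one privilege, after which firing the sole token holder keeps the count at one.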
Asynchronous neighborhood task synchronization
Faults are likely to occur in distributed systems. The motivation for designing self-stabilizing systems is to be able to recover automatically from a faulty state. As per Dijkstra's definition, a system is self-stabilizing if it converges to a desired state from an arbitrary state in a finite number of steps. The paradigm of self-stabilization is considered to be the most unified approach to designing fault-tolerant systems. Any type of fault, e.g., transient faults, process crashes and restarts, link failures and recoveries, and Byzantine faults, can be handled by a self-stabilizing system. Many applications in distributed systems involve multiple phases. Solving these applications requires some degree of synchronization of phases. In this thesis research, we introduce a new problem, called asynchronous neighborhood task synchronization (NTS). In this problem, processes execute infinite instances of tasks, where a task consists of a set of steps. There are several requirements for this problem. Simultaneous execution of steps by the neighbors is allowed only if the steps are different. Every neighborhood is synchronized in the sense that all neighboring processes execute the same instance of a task. Although the NTS problem is applicable in nonfaulty environments, it is more challenging to solve this problem considering various types of faults. In this research, we will present a self-stabilizing solution to the NTS problem. The proposed solution is space optimal, fault containing, fully localized, and fully distributed. One of the most desirable properties of our algorithm is that it works under any (including unfair) daemon. We will discuss various applications of the NTS problem.
Self-Stabilization in the Distributed Systems of Finite State Machines
The notion of self-stabilization was first proposed by Dijkstra in 1974 in his classic paper. The paper defines a system as self-stabilizing if, starting at any, possibly illegitimate, state, the system can automatically adjust itself to eventually converge to a legitimate state in a finite amount of time, and once in a legitimate state it will remain so unless it incurs a subsequent transient fault. Dijkstra limited his attention to a ring of finite-state machines and provided a self-stabilizing solution for it. In the years following his introduction, very few papers were published in this area. Once his proposal was recognized as a milestone in work on fault tolerance, the notion spread rapidly among researchers, and many researchers in distributed systems turned their attention to it. The investigation and use of self-stabilization as an approach to fault-tolerant behavior under a model of transient failures for distributed systems is now undergoing a renaissance. A good number of works pertaining to self-stabilization in distributed systems have been proposed over the years, most of which are very recent. This report surveys all previous works available in the literature on self-stabilizing systems.
Computing in the RAIN: a reliable array of independent nodes
The RAIN project is a research collaboration between Caltech and NASA-JPL on distributed computing and data-storage systems for future spaceborne missions. The goal of the project is to identify and develop key building blocks for reliable distributed systems built with inexpensive off-the-shelf components. The RAIN platform consists of a heterogeneous cluster of computing and/or storage nodes connected via multiple interfaces to networks configured in fault-tolerant topologies. The RAIN software components run in conjunction with operating system services and standard network protocols. Through software-implemented fault tolerance, the system tolerates multiple node, link, and switch failures, with no single point of failure. The RAIN-technology has been transferred to Rainfinity, a start-up company focusing on creating clustered solutions for improving the performance and availability of Internet data centers. In this paper, we describe the following contributions: 1) fault-tolerant interconnect topologies and communication protocols providing consistent error reporting of link failures, 2) fault management techniques based on group membership, and 3) data storage schemes based on computationally efficient error-control codes. We present several proof-of-concept applications: a highly-available video server, a highly-available Web server, and a distributed checkpointing system. Also, we describe a commercial product, Rainwall, built with the RAIN technology
Structural Invariants for the Verification of Systems with Parameterized Architectures
We consider parameterized concurrent systems consisting of a finite but
unknown number of components, obtained by replicating a given set of finite
state automata. Components communicate by executing atomic interactions whose
participants update their states simultaneously. We introduce an interaction
logic to specify both the type of interactions (e.g., rendez-vous, broadcast)
and the topology of the system (e.g., pipeline, ring). The logic can be easily
embedded in monadic second order logic of finitely many successors, and is
therefore decidable.
Proving safety properties of such a parameterized system, like deadlock
freedom or mutual exclusion, requires inferring an inductive invariant that
contains all reachable states of all system instances, and no unsafe state. We
present a method to automatically synthesize inductive invariants directly from
the formula describing the interactions, without costly fixed point iterations.
We show experimentally that this invariant is strong enough to verify safety
properties of a large number of systems including textbook examples (dining
philosophers, synchronization schemes), classical mutual exclusion algorithms,
cache-coherence protocols and self-stabilization algorithms, for an arbitrary
number of components. Comment: preprint; to be published in the proceedings of TACAS2
Exclusion and Object Tracking in a Network of Processes
This paper concerns two fundamental problems in distributed computing: mutual exclusion and mobile object tracking. For a variant of the mutual exclusion problem where the network topology is taken into account, all existing distributed solutions make use of tokens. It turns out that these token-based solutions for mutual exclusion can also be adapted for object tracking, as the token behaves very much like a mobile object. To handle objects with replication, we go further to consider the more general k-exclusion problem, which has not been as well studied in a network setting. A strong fairness property for k-exclusion requires that a process trying to enter the critical section will eventually succeed even if up to k-1 processes stay in the critical section indefinitely. We present a comparative survey of existing token-based mutual exclusion algorithms, which have provided much inspiration for later k-exclusion algorithms. We then propose two solutions to the k-exclusion problem, the second of which meets the strong fairness requirement. Fault-tolerance issues are also discussed, along with the suggestion of a third algorithm that is also strongly fair. The performance of the three algorithms is compared by simulation. Finally, we show how the various exclusion algorithms can be adapted for tracking mobile objects.
Exploiting replication in distributed systems
Techniques are examined for replicating data and execution in directly distributed systems: systems in which multiple processes interact directly with one another while continuously respecting constraints on their joint behavior. Directly distributed systems are often required to solve difficult problems, ranging from management of replicated data to dynamic reconfiguration in response to failures. It is shown that these problems reduce to more primitive, order-based consistency problems, which can be solved using primitives such as the reliable broadcast protocols. Moreover, given a system that implements reliable broadcast primitives, a flexible set of high-level tools can be provided for building a wide variety of directly distributed application programs
A Taxonomy of Daemons in Self-stabilization
We survey existing scheduling hypotheses made in the self-stabilization
literature, commonly referred to under the notion of daemon. We show
that four main characteristics (distribution, fairness, boundedness, and
enabledness) are enough to encapsulate the various differences presented in
existing work. Our naming scheme makes it easy to compare daemons of particular
classes, and to extend existing possibility or impossibility results to new
daemons. We further examine existing daemon transformer schemes and provide the
exact transformed characteristics of those transformers in our taxonomy. Comment: 26 page