Separation of Circulating Tokens
Self-stabilizing distributed control is often modeled by token abstractions.
A system with a single token may implement mutual exclusion; a system with
multiple tokens may ensure that immediate neighbors do not simultaneously enjoy
a privilege. For a cyber-physical system, tokens may represent physical objects
whose movement is controlled. The problem studied in this paper is to ensure
that a synchronous system with m circulating tokens has at least d distance
between tokens. This problem is first considered in a ring where d is given
whilst m and the ring size n are unknown. The protocol solving this problem can
be uniform, with all processes running the same program, or it can be
non-uniform, with some processes acting only as token relays. The protocol for
this first problem is simple, and can be expressed with Petri net formalism. A
second problem is to maximize d when m is given, and n is unknown. For the
second problem, the paper presents a non-uniform protocol with a single
corrective process. Comment: 22 pages, 7 figures, epsf and pstricks in LaTe
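The separation property in the abstract above can be made concrete with a small checker. This is only an illustrative sketch, not the paper's protocol: the function names (`min_separation`, `is_separated`, `max_feasible_d`, `evenly_spaced`) and the choice to measure distance as the clockwise gap between consecutive tokens are assumptions made for this example. It also shows why, when m and n are both known, no placement can do better than d = floor(n / m).

```python
def min_separation(positions, n):
    """Smallest clockwise gap between consecutive tokens on a ring of n nodes.

    positions: iterable of node indices holding a token (hypothetical encoding).
    A single token is treated as being at distance n from itself.
    """
    pts = sorted(p % n for p in positions)
    if len(pts) < 2:
        return n
    return min((pts[(i + 1) % len(pts)] - pts[i]) % n for i in range(len(pts)))


def is_separated(positions, n, d):
    """True if every pair of consecutive tokens is at least d apart."""
    return min_separation(positions, n) >= d


def max_feasible_d(n, m):
    """Upper bound on d: the m gaps sum to n, so the smallest is at most n // m."""
    return n // m


def evenly_spaced(n, m):
    """One placement that attains the bound above."""
    return [(i * n) // m for i in range(m)]


if __name__ == "__main__":
    n, m = 10, 3
    tokens = evenly_spaced(n, m)
    print(tokens, min_separation(tokens, n), max_feasible_d(n, m))
```

The checker only verifies a configuration; how a self-stabilizing protocol reaches such a configuration without knowing n is the subject of the paper itself.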
Black Hole Search with Finite Automata Scattered in a Synchronous Torus
We consider the problem of locating a black hole in synchronous anonymous
networks using finite state agents. A black hole is a harmful node in the
network that destroys any agent visiting that node without leaving any trace.
The objective is to locate the black hole without destroying too many agents.
This is difficult to achieve when the agents are initially scattered in the
network and are unaware of the location of each other. Previous studies for
black hole search used more powerful models where the agents had non-constant
memory, were labelled with distinct identifiers and could either write messages
on the nodes of the network or mark the edges of the network. In contrast, we
solve the problem using a small team of finite-state agents each carrying a
constant number of identical tokens that could be placed on the nodes of the
network. Thus, all resources used in our algorithms are independent of the
network size. We restrict our attention to oriented torus networks and first
show that no finite team of finite state agents can solve the problem in such
networks, when the tokens are not movable. In case the agents are equipped with
movable tokens, we determine lower bounds on the number of agents and tokens
required for solving the problem in torus networks of arbitrary size. Further,
we present a deterministic solution to the black hole search problem for
oriented torus networks, using the minimum number of agents and tokens.
Computing in the RAIN: a reliable array of independent nodes
The RAIN project is a research collaboration between Caltech and NASA-JPL on distributed computing and data-storage systems for future spaceborne missions. The goal of the project is to identify and develop key building blocks for reliable distributed systems built with inexpensive off-the-shelf components. The RAIN platform consists of a heterogeneous cluster of computing and/or storage nodes connected via multiple interfaces to networks configured in fault-tolerant topologies. The RAIN software components run in conjunction with operating system services and standard network protocols. Through software-implemented fault tolerance, the system tolerates multiple node, link, and switch failures, with no single point of failure. The RAIN technology has been transferred to Rainfinity, a start-up company focusing on creating clustered solutions for improving the performance and availability of Internet data centers. In this paper, we describe the following contributions: 1) fault-tolerant interconnect topologies and communication protocols providing consistent error reporting of link failures, 2) fault management techniques based on group membership, and 3) data storage schemes based on computationally efficient error-control codes. We present several proof-of-concept applications: a highly-available video server, a highly-available Web server, and a distributed checkpointing system. We also describe a commercial product, Rainwall, built with the RAIN technology.
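The third contribution listed above, storage based on error-control codes, can be illustrated with the simplest such code: a single XOR parity block. This is a minimal sketch under stated assumptions and is not the coding scheme used by RAIN; the helper names (`xor_blocks`, `encode`, `recover`) are hypothetical, and the example tolerates the loss of only one block, all blocks being of equal length.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stored, missing_index):
    """Rebuild the block at missing_index from all surviving blocks.

    Works because XOR-ing every block (data plus parity) yields zero,
    so any one block equals the XOR of the others.
    """
    survivors = [b for i, b in enumerate(stored) if i != missing_index]
    return xor_blocks(survivors)


if __name__ == "__main__":
    stored = encode([b"abcd", b"efgh", b"ijkl"])
    # Pretend the node holding block 1 has failed.
    print(recover(stored, 1))
```

Each block would live on a different node, so a single node failure leaves enough information to reconstruct the lost data; tolerating several simultaneous failures requires stronger codes than this sketch.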
Tight Bounds for Black Hole Search with Scattered Agents in Synchronous Rings
We study the problem of locating a particularly dangerous node, the so-called
black hole, in a synchronous anonymous ring network with mobile agents. A black
hole is a harmful stationary process residing in a node of the network and
destroying all mobile agents visiting that node without leaving any
trace. We consider the more challenging scenario when the agents are identical
and initially scattered within the network. Moreover, we solve the problem with
agents that have constant-sized memory and carry a constant number of identical
tokens, which can be placed at nodes of the network. In contrast, the only
known solutions for the case of scattered agents searching for a black hole
use stronger models where the agents have non-constant memory, can write
messages in whiteboards located at nodes or are allowed to mark both the edges
and nodes of the network with tokens. This paper solves the problem for ring
networks containing a single black hole. We are interested in the minimum
resources (number of agents and tokens) necessary for locating all links
incident to the black hole. We present deterministic algorithms for ring
topologies and provide matching lower and upper bounds for the number of agents
and the number of tokens required for deterministic solutions to the black hole
search problem, in oriented or unoriented rings, using movable or unmovable
tokens.
Distributed computing system with dual independent communications paths between computers and employing split tokens
This is a distributed computing system providing flexible fault tolerance; ease of software design and concurrency specification; and dynamic balance of the loads. The system comprises a plurality of computers, each having a first input/output interface and a second input/output interface for interfacing to communications networks, each second input/output interface including a bypass for bypassing the associated computer. A global communications network interconnects the first input/output interfaces, providing each computer the ability to broadcast messages simultaneously to the remainder of the computers. A meshwork communications network interconnects the second input/output interfaces, providing each computer with the ability to establish a communications link with another of the computers, bypassing the remainder of the computers. Each computer is controlled by a resident copy of a common operating system. Communication between computers is by means of split tokens, each having a moving first portion which is sent from computer to computer and a resident second portion which is disposed in the memory of at least one of the computers, and wherein the location of the second portion is part of the first portion. The split tokens represent both functions to be executed by the computers and data to be employed in the execution of the functions. The first input/output interfaces each include logic for detecting a collision between messages and for terminating the broadcasting of a message, whereby collisions between messages are detected and avoided.
Crux: Locality-Preserving Distributed Services
Distributed systems achieve scalability by distributing load across many
machines, but wide-area deployments can introduce worst-case response latencies
proportional to the network's diameter. Crux is a general framework to build
locality-preserving distributed systems, by transforming an existing scalable
distributed algorithm A into a new locality-preserving algorithm ALP, which
guarantees for any two clients u and v interacting via ALP that their
interactions exhibit worst-case response latencies proportional to the network
latency between u and v. Crux builds on compact-routing theory, but generalizes
these techniques beyond routing applications. Crux provides weak and strong
consistency flavors, and shows latency improvements for localized interactions
in both cases, specifically up to several orders of magnitude for
weakly-consistent Crux (from roughly 900ms to 1ms). We deployed on PlanetLab
locality-preserving versions of a Memcached distributed cache, a Bamboo
distributed hash table, and a Redis publish/subscribe. Our results indicate
that Crux is effective and applicable to a variety of existing distributed
algorithms. Comment: 11 figure