101,568 research outputs found

    Simple and Efficient Local Codes for Distributed Stable Network Construction

    Full text link
    In this work, we study protocols so that populations of distributed processes can construct networks. In order to highlight the basic principles of distributed network construction we keep the model minimal in all respects. In particular, we assume finite-state processes that all begin from the same initial state and all execute the same protocol (i.e. the system is homogeneous). Moreover, we assume pairwise interactions between the processes that are scheduled by an adversary. The only constraint on the adversary scheduler is that it must be fair. In order to allow processes to construct networks, we let them activate and deactivate their pairwise connections. When two processes interact, the protocol takes as input the states of the processes and the state of the their connection and updates all of them. Initially all connections are inactive and the goal is for the processes, after interacting and activating/deactivating connections for a while, to end up with a desired stable network. We give protocols (optimal in some cases) and lower bounds for several basic network construction problems such as spanning line, spanning ring, spanning star, and regular network. We provide proofs of correctness for all of our protocols and analyze the expected time to convergence of most of them under a uniform random scheduler that selects the next pair of interacting processes uniformly at random from all such pairs. Finally, we prove several universality results by presenting generic protocols that are capable of simulating a Turing Machine (TM) and exploiting it in order to construct a large class of networks.Comment: 43 pages, 7 figure

    Simple and efficient local codes for distributed stable network construction

    Get PDF
    In this work, we study protocols so that populations of distributed processes can construct networks. In order to highlight the basic principles of distributed network construction, we keep the model minimal in all respects. In particular, we assume finite-state processes that all begin from the same initial state and all execute the same protocol. Moreover, we assume pairwise interactions between the processes that are scheduled by a fair adversary. In order to allow processes to construct networks, we let them activate and deactivate their pairwise connections. When two processes interact, the protocol takes as input the states of the processes and the state of their connection and updates all of them. Initially all connections are inactive and the goal is for the processes, after interacting and activating/deactivating connections for a while, to end up with a desired stable network. We give protocols (optimal in some cases) and lower bounds for several basic network construction problems such as spanning line, spanning ring, spanning star, and regular network. The expected time to convergence of our protocols is analyzed under a uniform random scheduler. Finally, we prove several universality results by presenting generic protocols that are capable of simulating a Turing Machine (TM) and exploiting it in order to construct a large class of networks. We additionally show how to partition the population into k supernodes, each being a line of logk nodes, for the largest such k. This amount of local memory is sufficient for the supernodes to obtain unique names and exploit their names and their memory to realize nontrivial constructions

    Computing in the RAIN: a reliable array of independent nodes

    Get PDF
    The RAIN project is a research collaboration between Caltech and NASA-JPL on distributed computing and data-storage systems for future spaceborne missions. The goal of the project is to identify and develop key building blocks for reliable distributed systems built with inexpensive off-the-shelf components. The RAIN platform consists of a heterogeneous cluster of computing and/or storage nodes connected via multiple interfaces to networks configured in fault-tolerant topologies. The RAIN software components run in conjunction with operating system services and standard network protocols. Through software-implemented fault tolerance, the system tolerates multiple node, link, and switch failures, with no single point of failure. The RAIN-technology has been transferred to Rainfinity, a start-up company focusing on creating clustered solutions for improving the performance and availability of Internet data centers. In this paper, we describe the following contributions: 1) fault-tolerant interconnect topologies and communication protocols providing consistent error reporting of link failures, 2) fault management techniques based on group membership, and 3) data storage schemes based on computationally efficient error-control codes. We present several proof-of-concept applications: a highly-available video server, a highly-available Web server, and a distributed checkpointing system. Also, we describe a commercial product, Rainwall, built with the RAIN technology

    Storage and Search in Dynamic Peer-to-Peer Networks

    Full text link
    We study robust and efficient distributed algorithms for searching, storing, and maintaining data in dynamic Peer-to-Peer (P2P) networks. P2P networks are highly dynamic networks that experience heavy node churn (i.e., nodes join and leave the network continuously over time). Our goal is to guarantee, despite high node churn rate, that a large number of nodes in the network can store, retrieve, and maintain a large number of data items. Our main contributions are fast randomized distributed algorithms that guarantee the above with high probability (whp) even under high adversarial churn: 1. A randomized distributed search algorithm that (whp) guarantees that searches from as many as no(n)n - o(n) nodes (nn is the stable network size) succeed in O(logn){O}(\log n)-rounds despite O(n/log1+δn){O}(n/\log^{1+\delta} n) churn, for any small constant δ>0\delta > 0, per round. We assume that the churn is controlled by an oblivious adversary (that has complete knowledge and control of what nodes join and leave and at what time, but is oblivious to the random choices made by the algorithm). 2. A storage and maintenance algorithm that guarantees (whp) data items can be efficiently stored (with only Θ(logn)\Theta(\log{n}) copies of each data item) and maintained in a dynamic P2P network with churn rate up to O(n/log1+δn){O}(n/\log^{1+\delta} n) per round. Our search algorithm together with our storage and maintenance algorithm guarantees that as many as no(n)n - o(n) nodes can efficiently store, maintain, and search even under O(n/log1+δn){O}(n/\log^{1+\delta} n) churn per round. Our algorithms require only polylogarithmic in nn bits to be processed and sent (per round) by each node. To the best of our knowledge, our algorithms are the first-known, fully-distributed storage and search algorithms that provably work under highly dynamic settings (i.e., high churn rates per step).Comment: to appear at SPAA 201

    Network Information Flow with Correlated Sources

    Full text link
    In this paper, we consider a network communications problem in which multiple correlated sources must be delivered to a single data collector node, over a network of noisy independent point-to-point channels. We prove that perfect reconstruction of all the sources at the sink is possible if and only if, for all partitions of the network nodes into two subsets S and S^c such that the sink is always in S^c, we have that H(U_S|U_{S^c}) < \sum_{i\in S,j\in S^c} C_{ij}. Our main finding is that in this setup a general source/channel separation theorem holds, and that Shannon information behaves as a classical network flow, identical in nature to the flow of water in pipes. At first glance, it might seem surprising that separation holds in a fairly general network situation like the one we study. A closer look, however, reveals that the reason for this is that our model allows only for independent point-to-point channels between pairs of nodes, and not multiple-access and/or broadcast channels, for which separation is well known not to hold. This ``information as flow'' view provides an algorithmic interpretation for our results, among which perhaps the most important one is the optimality of implementing codes using a layered protocol stack.Comment: Final version, to appear in the IEEE Transactions on Information Theory -- contains (very) minor changes based on the last round of review
    corecore