
    Hyperswitch communication network

    The Hyperswitch Communication Network (HCN) is a large-scale parallel computer prototype being developed at JPL; commercial versions of the HCN computer are planned. The HCN is a message-passing multiple-instruction, multiple-data (MIMD) computer and offers many advantages over traditional uniprocessors and bus-based multiprocessors in price-performance ratio, reliability and availability, and manufacturability. The HCN operating system provides a uniquely flexible environment that combines parallel and distributed processing, a programming paradigm that can balance the competing factors of processing and communication performance, user friendliness, and fault tolerance. The prototype is designed to accommodate a maximum of 64 state-of-the-art microprocessors, and the HCN is classified as a distributed supercomputer. The HCN system is described, and the performance/cost analysis and other competing factors within the system design are reviewed.

    Broadcasting in Noisy Radio Networks

    The widely studied radio network model [Chlamtac and Kutten, 1985] is a graph-based description that captures the inherent impact of collisions in wireless communication. In this model, the strong assumption is made that node v receives a message from a neighbor if and only if exactly one of its neighbors broadcasts. We relax this assumption by introducing a new noisy radio network model in which random faults occur at senders or receivers. Specifically, for a constant noise parameter p ∈ [0,1), either every sender has probability p of transmitting noise, or every receiver of a single transmission in its neighborhood has probability p of receiving noise. We first study single-message broadcast algorithms in noisy radio networks and show that the Decay algorithm [Bar-Yehuda et al., 1992] remains robust in the noisy model, while the diameter-linear algorithm of Gasieniec et al., 2007 does not. We give a modified version of the algorithm of Gasieniec et al., 2007 that is robust to sender and receiver faults, and extend both this modified algorithm and the Decay algorithm to robust multi-message broadcast algorithms. We next investigate the extent to which (network) coding improves throughput in noisy radio networks. We address the previously perplexing result of Alon et al., 2014 that worst-case coding throughput is no better than worst-case routing throughput up to constants: we show that the worst-case throughput performance of coding is, in fact, superior to that of routing -- by a Θ(log n) gap -- provided receiver faults are introduced. However, we show that any coding or routing scheme for the noiseless setting can be transformed to be robust to sender faults with only a constant throughput overhead. These transformations imply that the results of Alon et al., 2014 carry over to noisy radio networks with sender faults.
    Comment: Principles of Distributed Computing 201
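
    To make the Decay primitive and the receiver-noise model concrete, here is a minimal Python simulation sketch of one Decay phase. The adjacency-list representation, the function name decay_phase, and the way noise is injected (each collision-free reception is independently lost with probability p) are illustrative assumptions, not code from the paper.

        import math
        import random

        def decay_phase(adj, informed, p=0.0, seed=None):
            # adj: dict mapping each node to its list of neighbors.
            # informed: set of nodes that already hold the message.
            # p: receiver-noise probability; a collision-free reception
            #    is independently lost with probability p.
            rng = random.Random(seed)
            steps = max(1, math.ceil(math.log2(len(adj))))
            active = set(informed)            # nodes still broadcasting
            newly_informed = set()
            for _ in range(steps):
                broadcasters = set(active)
                for v in adj:
                    if v in informed or v in newly_informed:
                        continue
                    hits = sum(1 for u in adj[v] if u in broadcasters)
                    if hits == 1 and rng.random() >= p:
                        newly_informed.add(v)  # exactly one neighbor spoke
                # Each broadcaster stays active with probability 1/2, so
                # the expected number of simultaneous senders decays.
                active = {u for u in active if rng.random() < 0.5}
            return newly_informed

    A few repeated phases on a small path graph, e.g. {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]} starting from informed={0}, illustrate the robustness claim: noise scales each reception probability by (1 - p) but does not break the collision-avoidance structure.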

    Energy Efficient Ant Colony Algorithms for Data Aggregation in Wireless Sensor Networks

    In this paper, we present DAACA, a family of ant colony algorithms for data aggregation, which consists of three phases: initialization, packet transmission, and operations on pheromones. After initialization, each node estimates its remaining energy and the amount of pheromones in order to compute the probabilities used for dynamically selecting the next hop. After a certain number of rounds of transmissions, pheromone adjustment is performed periodically, combining the advantages of both global and local pheromone adjustment for evaporating or depositing pheromones. Four different pheromone adjustment strategies are designed to approach the globally optimal network lifetime, namely Basic-DAACA, ES-DAACA, MM-DAACA and ACS-DAACA. Compared with some other data aggregation algorithms, DAACA performs better in terms of average node degree, energy efficiency, network lifetime, computational complexity, and the success ratio of one-hop transmission. Finally, we analyze the robustness, fault tolerance, and scalability of DAACA.
    Comment: To appear in Journal of Computer and System Sciences
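
    The next-hop rule sketched below follows the standard ant-colony pattern the abstract describes: selection probabilities are driven by pheromone levels and residual energy. The exponents alpha and beta, the multiplicative combination, and the evaporation rate rho are generic ACO conventions assumed here for illustration; DAACA's exact formulas may differ.

        import random

        def choose_next_hop(neighbors, pheromone, energy, alpha=1.0, beta=1.0):
            # Weight each candidate by pheromone**alpha * residual_energy**beta,
            # then sample one neighbor in proportion to its weight.
            weights = [pheromone[j] ** alpha * energy[j] ** beta
                       for j in neighbors]
            return random.choices(neighbors, weights=weights, k=1)[0]

        def evaporate(pheromone, rho=0.1):
            # Local adjustment: every entry loses a fraction rho of its
            # pheromone, so unused paths gradually fade away.
            for j in pheromone:
                pheromone[j] *= 1.0 - rho

    A deposit step on links that carried packets, alternated with periodic evaporation, is one simple way to realize the evaporate-or-deposit adjustment the abstract mentions.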

    Reliability Analysis of the Hypercube Architecture.

    This dissertation presents improved techniques for analyzing network-connected (NCF), 2-connected (2CF), task-based (TBF), and subcube (SF) functionality measures in a hypercube multiprocessor with faulty processing elements (PEs) and/or communication elements (CEs). These measures help study system-level fault tolerance issues and relate to various application modes in the hypercube. Solutions discussed in the text fall into probabilistic and deterministic models. The probabilistic model assumes a stochastic graph of the hypercube in which PEs and/or CEs may fail with certain probabilities, while the deterministic model considers that some system components have already failed and aims to determine the system functionality. For the probabilistic model, MIL-HDBK-217F is used to predict PE and CE failure rates for an Intel iPSC system. First, a technique called CAREL is presented; a proof of its correctness is included in an appendix. Using the shelling ordering concept, CAREL is shown to solve the exact probabilistic NCF measure for a hypercube in time polynomial in the number of spanning trees. However, this number increases exponentially with the hypercube dimension. This dissertation therefore aims to obtain lower and upper bounds on the measures more efficiently. Algorithms presented in the text generate tighter bounds than had been obtained previously and run in time polynomial in the cube dimension. The proposed algorithms for the probabilistic 2CF measure consider PE and/or CE failures. For the deterministic measures, a hybrid method for fault-tolerant broadcasting in the hypercube is proposed that combines the favorable features of redundant and non-redundant techniques. A generalized result on the deterministic TBF measure for the hypercube is then described. Two distributed algorithms are proposed to identify the largest operational subcubes in a hypercube C_n with faulty PEs. Method 1, called LOS1, requires a list of faulty components and utilizes the CMB operator of CAREL to solve the problem. In case the number of unavailable nodes (faulty or busy) increases, an alternative distributed approach, called LOS2, processes m available nodes in O(mn) time. The proposed techniques are simple and efficient.
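
    As a quick intuition for the probabilistic NCF measure, the sketch below estimates, by Monte Carlo simulation, the probability that the fault-free PEs of an n-cube still induce a connected subgraph. This is not the dissertation's CAREL technique (which computes exact values and analytic bounds); it assumes perfect links and independent PE failures, and all names are illustrative.

        import random

        def neighbors(v, n):
            # In an n-cube, neighbors differ in exactly one address bit.
            return [v ^ (1 << i) for i in range(n)]

        def ncf_estimate(n, p_fail, trials=10_000, seed=0):
            # Estimate Pr[surviving PEs induce a connected subgraph] when
            # each of the 2**n PEs fails independently with probability
            # p_fail and all links are assumed perfect.
            rng = random.Random(seed)
            connected = 0
            for _ in range(trials):
                alive = {v for v in range(1 << n) if rng.random() >= p_fail}
                if not alive:
                    continue          # an empty system is non-functional
                start = next(iter(alive))
                seen, stack = {start}, [start]
                while stack:
                    v = stack.pop()
                    for u in neighbors(v, n):
                        if u in alive and u not in seen:
                            seen.add(u)
                            stack.append(u)
                connected += seen == alive
            return connected / trials

    For example, ncf_estimate(4, 0.05) approximates the NCF of a 16-node cube whose PEs each fail with probability 0.05; exact methods like CAREL avoid the sampling error but face the exponential growth in spanning trees noted above.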

    Simple and Optimal Randomized Fault-Tolerant Rumor Spreading

    We revisit the classic problem of spreading a piece of information in a group of n fully connected processors. By suitably adding a small dose of randomness to the protocol of Gasieniec and Pelc (1996), we derive for the first time protocols that (i) use a linear number of messages, (ii) are correct even when an arbitrary number of adversarially chosen processors do not participate in the process, and (iii) with high probability have the asymptotically optimal runtime of O(log n) when at least an arbitrarily small constant fraction of the processors are working. In addition, our protocols do not require that the system be synchronized or that all processors be simultaneously woken up at time zero; they are fully based on push operations, and they do not need an a priori estimate of the number of failed nodes. Our protocols thus overcome the typical disadvantages of the two known approaches: algorithms based on random gossip (which typically need a large number of messages due to their unorganized nature) and algorithms based on fair workload splitting (which are either not time-efficient or require intricate preprocessing steps plus synchronization).
    Comment: This is the author-generated version of a paper which is to appear in Distributed Computing, Springer, DOI: 10.1007/s00446-014-0238-z. It is available online from http://link.springer.com/article/10.1007/s00446-014-0238-z This version contains some new results (Section 6
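
    For contrast, here is the plain push protocol that underlies the "random gossip" approach the abstract criticizes: every informed, working processor calls one uniformly random processor per round. The sketch (interface and names are illustrative) shows the push operation and the tolerance to non-participating processors, but it spends Θ(n log n) messages in expectation, which is precisely the overhead the paper's linear-message protocols remove.

        import random

        def push_gossip(n, failed, seed=None):
            # Round-based push: each informed working processor calls one
            # processor chosen uniformly at random and forwards the rumor.
            # Failed processors neither call nor answer.  Returns the
            # number of rounds until all working processors are informed.
            rng = random.Random(seed)
            working = set(range(n)) - set(failed)
            informed = {min(working)}   # an arbitrary working initiator
            rounds = 0
            while informed != working:
                calls = [rng.randrange(n) for _ in informed]
                informed |= {c for c in calls if c in working}
                rounds += 1
            return rounds

    With n = 1000 and a third of the processors failed, push_gossip still finishes in O(log n) rounds with high probability; the cost is the total message count, since every informed processor keeps calling in every round.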

    Message and time efficient multi-broadcast schemes

    We consider message- and time-efficient broadcasting and multi-broadcasting in wireless ad-hoc networks, where a subset of nodes, each with a unique rumor, wish to broadcast their rumors to all destinations while minimizing the total number of transmissions and the total time until all rumors arrive at their destinations. Under centralized settings, we introduce a novel approximation algorithm that provides almost optimal results with respect to the number of transmissions and the total time, separately. We then show how to efficiently implement this algorithm under distributed settings, where nodes have only local information about their surroundings. In addition, we present multiple approximation techniques based on the network's collision detection capabilities and explain how to calibrate the algorithms' parameters to produce optimal results for time and messages.
    Comment: In Proceedings FOMC 2013, arXiv:1310.459

    ShallowForest: Optimizing All-to-All Data Transmission in WANs

    All-to-all data transmission is a typical data transmission pattern in both consensus protocols and blockchain systems. Developing an optimization scheme that provides high-throughput and low-latency data transmission can significantly benefit the performance of those systems. This thesis investigates the problem of optimizing all-to-all data transmission in a wide area network (WAN) using overlay multicast. I first prove that, in a congestion-free core network model, using shallow tree overlays of height at most two is sufficient for all-to-all data transmission to achieve the optimal throughput allowed by the available network resources. Based on this finding, I build ShallowForest, a data-plane optimization for consensus protocols and blockchain systems. The goal of ShallowForest is to improve consensus protocols' resilience to skewed client load distribution. Experiments with skewed client load across replicas in the Amazon cloud demonstrate that ShallowForest can improve the commit throughput of the EPaxos consensus protocol by up to 100% with up to a 60% reduction in commit latency.
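
    To make the height-two idea concrete, here is a toy sketch of one such overlay tree and the resulting all-to-all forest. The function name, the relay choice, and the even split of leaves are illustrative assumptions; ShallowForest itself selects relays and transmission rates to maximize throughput against the measured WAN capacities.

        def height_two_tree(nodes, source, relays):
            # Build a multicast tree of height <= 2 rooted at `source`:
            # source -> relays -> remaining nodes, leaves split evenly.
            leaves = [v for v in nodes if v != source and v not in relays]
            tree = {source: list(relays)}
            for r in relays:
                tree[r] = []
            for i, v in enumerate(leaves):
                tree[relays[i % len(relays)]].append(v)
            return tree

        # One tree per source gives the all-to-all overlay ("forest").
        nodes = ["us-east", "us-west", "eu", "ap", "sa"]
        forest = {s: height_two_tree(nodes, s,
                                     relays=[r for r in nodes if r != s][:2])
                  for s in nodes}

    Each message is forwarded at most twice on its way to any receiver; the thesis's result is that restricting to such shallow trees still permits the optimal all-to-all throughput in a congestion-free core.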