
    Dependability in Aggregation by Averaging

    Aggregation is an important building block of modern distributed applications, allowing the determination of meaningful properties (e.g. network size, total storage capacity, average load, majorities, etc.) that are used to direct the execution of the system. However, the majority of existing aggregation algorithms exhibit relevant dependability issues when considering their use in real application environments. In this paper, we reveal some dependability issues of aggregation algorithms based on iterative averaging techniques, and give some directions to solve them. This class of algorithms is considered robust (when compared to common tree-based approaches), being independent of the routing topology used and providing an aggregation result at all nodes. However, their robustness is strongly challenged, and their correctness often compromised, when the assumptions about their working environment are changed to more realistic ones. The correctness of this class of algorithms relies on the maintenance of a fundamental invariant, commonly designated as "mass conservation". We argue that this invariant is often broken in practical settings, and that additional mechanisms and modifications are required to maintain it, incurring some degradation of the algorithms' performance. In particular, we discuss the behavior of three representative algorithms (the Push-Sum Protocol, the Push-Pull Gossip protocol, and Distributed Random Grouping) under asynchronous and faulty (with message loss and node crashes) environments. More specifically, we propose and evaluate two new versions of the Push-Pull Gossip protocol, which solve its message interleaving problem (evidenced even in a synchronous operation mode). Comment: 14 pages. Presented in Inforum 200
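    To make the "mass conservation" invariant concrete, here is a minimal Python sketch of synchronous Push-Sum; the complete-graph toy setup and the loss_rate parameter are illustrative assumptions, added only to show how a single dropped message destroys the invariant, not the paper's exact experimental setting.

```python
import random

def push_sum(values, neighbors, rounds=50, loss_rate=0.0):
    """Synchronous Push-Sum sketch. Each node i holds a pair (s_i, w_i),
    initialised to (v_i, 1). Every round it keeps half of its pair and
    sends the other half to a uniformly random neighbour; its estimate
    is s_i / w_i. Mass conservation means sum(s) and sum(w) stay constant;
    dropping a message (loss_rate > 0) visibly breaks this."""
    n = len(values)
    s = list(values)          # value mass per node
    w = [1.0] * n             # weight mass per node
    for _ in range(rounds):
        inbox_s = [0.0] * n
        inbox_w = [0.0] * n
        for i in range(n):
            j = random.choice(neighbors[i])
            half_s, half_w = s[i] / 2, w[i] / 2
            inbox_s[i] += half_s          # half is kept locally
            inbox_w[i] += half_w
            if random.random() >= loss_rate:
                inbox_s[j] += half_s      # half is pushed to j
                inbox_w[j] += half_w
            # on a loss, (half_s, half_w) simply vanishes from the system
        s, w = inbox_s, inbox_w
    return [si / wi for si, wi in zip(s, w)]

# Complete graph on 4 nodes; with loss_rate=0 all estimates converge to
# the true average 25.0 and sum(s) stays equal to sum(values).
nbrs = {i: [j for j in range(4) if j != i] for i in range(4)}
print(push_sum([10.0, 20.0, 30.0, 40.0], nbrs, loss_rate=0.0))
```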

    Geographic Gossip: Efficient Averaging for Sensor Networks

    Gossip algorithms for distributed computation are attractive due to their simplicity, distributed nature, and robustness in noisy and uncertain environments. However, using standard gossip algorithms can lead to a significant waste of energy by repeatedly recirculating redundant information. For realistic sensor network topologies like grids and random geometric graphs, the inefficiency of gossip schemes is related to the slow mixing times of random walks on the communication graph. We propose and analyze an alternative gossiping scheme that exploits geographic information. By utilizing geographic routing combined with a simple resampling method, we demonstrate substantial gains over previously proposed gossip protocols. For regular graphs such as the ring or grid, our algorithm improves standard gossip by factors of $n$ and $\sqrt{n}$ respectively. For the more challenging case of random geometric graphs, our algorithm computes the true average to accuracy $\epsilon$ using $O(\frac{n^{1.5}}{\sqrt{\log n}} \log \epsilon^{-1})$ radio transmissions, which yields a $\sqrt{\frac{n}{\log n}}$ factor improvement over standard gossip algorithms. We illustrate these theoretical results with experimental comparisons between our algorithm and standard methods as applied to various classes of random fields. Comment: To appear, IEEE Transactions on Signal Processing
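    The route-then-average idea can be sketched as follows; this is a simplified illustration assuming greedy geographic routing toward a uniformly random point in the unit square, not the paper's exact resampling procedure, and all names below are illustrative.

```python
import math, random

def geographic_gossip(positions, values, edges, rounds=2000):
    """Toy sketch of geographic gossip: instead of averaging with a direct
    neighbour, a node draws a random target point, greedily routes towards
    it over the communication graph, and pairwise-averages with the node
    closest to the target. Assumes greedy routing succeeds, as it does on
    well-connected geometric graphs; real deployments need dead-end recovery."""
    vals = list(values)

    def dist(a, b):
        return math.dist(positions[a], b)

    def greedy_route(src, target):
        cur = src
        while True:
            best = min(edges[cur], key=lambda u: dist(u, target), default=cur)
            if dist(best, target) >= dist(cur, target):
                return cur            # local minimum: node closest to target
            cur = best

    for _ in range(rounds):
        i = random.randrange(len(vals))
        target = (random.random(), random.random())  # random point, unit square
        j = greedy_route(i, target)
        vals[i] = vals[j] = (vals[i] + vals[j]) / 2  # pairwise averaging step
    return vals

# Tiny driver: 3 nodes on a line, path-connected; estimates approach 6.0.
pos = [(0.1, 0.5), (0.5, 0.5), (0.9, 0.5)]
edg = {0: [1], 1: [0, 2], 2: [1]}
print(geographic_gossip(pos, [3.0, 6.0, 9.0], edg, rounds=500))
```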

    Global Computation in a Poorly Connected World: Fast Rumor Spreading with No Dependence on Conductance

    In this paper, we study the question of how efficiently a collection of interconnected nodes can perform a global computation in the widely studied GOSSIP model of communication. In this model, nodes do not know the global topology of the network, and they may only initiate contact with a single neighbor in each round. This model contrasts with the much less restrictive LOCAL model, where a node may simultaneously communicate with all of its neighbors in a single round. A basic question in this setting is how many rounds of communication are required for the information dissemination problem, in which each node has some piece of information and is required to collect all others. In this paper, we give an algorithm that solves the information dissemination problem in at most $O(D + \mathrm{polylog}(n))$ rounds in a network of diameter $D$, with no dependence on the conductance. This is at most an additive polylogarithmic factor from the trivial lower bound of $D$, which applies even in the LOCAL model. In fact, we prove that something stronger is true: any algorithm that requires $T$ rounds in the LOCAL model can be simulated in $O(T + \mathrm{polylog}(n))$ rounds in the GOSSIP model. We thus prove that these two models of distributed computation are essentially equivalent.
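    For contrast, here is a minimal sketch of the GOSSIP model's one-contact-per-round constraint, using plain uniform push-pull dissemination; this is the baseline whose round complexity can depend on conductance, not the paper's algorithm, and the setup is an illustrative assumption.

```python
import random

def gossip_disseminate(neighbors, rounds):
    """Sketch of the GOSSIP constraint: per round, every node may initiate
    contact with exactly ONE neighbour (chosen uniformly here), and the
    contacted pair exchanges everything they know (push-pull). Returns the
    number of rounds until every node knows every rumor, or None."""
    n = len(neighbors)
    known = [{i} for i in range(n)]     # node i starts knowing only rumor i
    for r in range(1, rounds + 1):
        for i in range(n):
            j = random.choice(neighbors[i])
            merged = known[i] | known[j]    # bidirectional exchange
            known[i], known[j] = set(merged), set(merged)
        if all(len(k) == n for k in known):
            return r
    return None

# Ring of 8 nodes: dissemination completes within the round budget.
ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
print(gossip_disseminate(ring, rounds=100))
```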

    What you can do with Coordinated Samples

    Sample coordination, where similar instances have similar samples, was proposed by statisticians four decades ago as a way to maximize overlap in repeated surveys. Coordinated sampling has since been used for summarizing massive data sets. The usefulness of a sampling scheme hinges on the scope and accuracy within which queries posed over the original data can be answered from the sample. We aim here to gain a fundamental understanding of the limits and potential of coordination. Our main result is a precise characterization, in terms of simple properties of the estimated function, of the queries for which estimators with desirable properties exist. We consider unbiasedness, nonnegativity, finite variance, and bounded estimates. Since in general a single estimator cannot be optimal (minimize variance simultaneously) for all data, we propose variance competitiveness, which means that the expectation of the square on any data is not too far from the minimum possible for that data. Surprisingly perhaps, we show how to construct, for any function for which an unbiased nonnegative estimator exists, a variance competitive estimator. Comment: 4 figures, 21 pages, Extended Abstract appeared in RANDOM 201
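    A minimal sketch of how coordination is commonly realized: give every key a shared pseudo-random rank, so any two instances sample consistently. Bottom-k sampling with shared hash ranks is one standard construction; the helper names below are illustrative, not the paper's notation.

```python
import hashlib

def _rank(key):
    """Deterministic pseudo-random rank in [0, 1) shared across all
    instances: the same key always gets the same rank, which is exactly
    what coordinates the samples."""
    h = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(h[:8], "big") / 2**64

def bottom_k_sample(keys, k):
    """Coordinated bottom-k sample: keep the k keys of smallest shared rank.
    Instances with similar key sets therefore get similar samples,
    maximising overlap as in coordinated repeated surveys."""
    return set(sorted(keys, key=_rank)[:k])

day1 = range(0, 100)
day2 = range(5, 105)          # heavily overlapping instance
s1, s2 = bottom_k_sample(day1, 10), bottom_k_sample(day2, 10)
print(len(s1 & s2))           # large overlap, unlike independent samples
```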

    Fault-tolerant aggregation by flow updating

    Data aggregation plays an important role in the design of scalable systems, allowing the determination of meaningful system-wide properties to direct the execution of distributed applications. In the particular case of wireless sensor networks, data collection is often only practicable if aggregation is performed. Several aggregation algorithms have been proposed in the last few years, exhibiting different properties in terms of accuracy, speed and communication tradeoffs. Nonetheless, existing approaches are found lacking in terms of fault tolerance. In this paper, we introduce a novel fault-tolerant averaging-based data aggregation algorithm. It tolerates substantial message loss (link failures), while competing algorithms in the same class can be affected by a single lost message. The algorithm is based on manipulating flows (in the graph-theoretical sense) that are updated using idempotent messages, providing it with unique robustness capabilities. Furthermore, evaluation results obtained by comparing it with other averaging approaches reveal that it outperforms them in terms of time and message complexity.
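    A simplified synchronous sketch of the flow-based idea follows: each node keeps a flow per adjacent edge, its estimate is its input minus its outgoing flows, and messages carry (flow, estimate) pairs that receivers simply overwrite, so a lost message only delays convergence instead of destroying mass. The round structure below is our own simplification of the general scheme, under a reliable-round, undirected-graph assumption.

```python
def flow_updating_round(values, neighbors, flows):
    """One synchronous round of a Flow Updating sketch. Node i keeps a flow
    flows[i][j] per neighbour j and estimates e_i = v_i - sum_j flows[i][j].
    Receivers overwrite state with the negated reported flow (idempotent:
    re-delivering the same message changes nothing), then adjust flows so
    their own estimate moves to the neighbourhood average."""
    est = {i: values[i] - sum(flows[i].values()) for i in flows}
    msgs = {(i, j): (flows[i][j], est[i]) for i in flows for j in neighbors[i]}
    new_flows = {}
    for i in flows:
        f = {j: -msgs[(j, i)][0] for j in neighbors[i]}   # symmetrise flows
        e_i = values[i] - sum(f.values())
        nbr_est = {j: msgs[(j, i)][1] for j in neighbors[i]}
        a = (e_i + sum(nbr_est.values())) / (len(neighbors[i]) + 1)
        for j in neighbors[i]:
            f[j] += a - nbr_est[j]   # node i's estimate now equals a
        new_flows[i] = f
    return new_flows

# Path graph 0-1-2: repeated rounds drive all estimates to the mean 20.0,
# because the conserved quantity is the inputs, not per-round messages.
vals = {0: 10.0, 1: 20.0, 2: 30.0}
nbrs = {0: [1], 1: [0, 2], 2: [1]}
fl = {i: {j: 0.0 for j in nbrs[i]} for i in nbrs}
for _ in range(50):
    fl = flow_updating_round(vals, nbrs, fl)
print({i: round(vals[i] - sum(fl[i].values()), 3) for i in fl})
```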