Dependability in Aggregation by Averaging
Aggregation is an important building block of modern distributed
applications, allowing the determination of meaningful properties (e.g. network
size, total storage capacity, average load, majorities, etc.) that are used to
direct the execution of the system. However, the majority of existing
aggregation algorithms exhibit significant dependability issues when considered
for use in real application environments. In this paper, we expose some
dependability issues of aggregation algorithms based on iterative averaging
techniques and give some directions to solve them. This class of algorithms is
considered robust (when compared to common tree-based approaches), being
independent from the used routing topology and providing an aggregation result
at all nodes. However, their robustness is strongly challenged and their
correctness often compromised, when changing the assumptions of their working
environment to more realistic ones. The correctness of this class of algorithms
relies on the maintenance of a fundamental invariant, commonly designated as
"mass conservation". We will argue that this main invariant is often broken in
practical settings, and that additional mechanisms and modifications are
required to maintain it, incurring in some degradation of the algorithms
performance. In particular, we discuss the behavior of three representative
algorithms Push-Sum Protocol, Push-Pull Gossip protocol and Distributed Random
Grouping under asynchronous and faulty (with message loss and node crashes)
environments. More specifically, we propose and evaluate two new versions of
the Push-Pull Gossip protocol, which solve its message interleaving problem
(evidenced even in a synchronous operation mode).Comment: 14 pages. Presented in Inforum 200
Geographic Gossip: Efficient Averaging for Sensor Networks
Gossip algorithms for distributed computation are attractive due to their
simplicity, distributed nature, and robustness in noisy and uncertain
environments. However, using standard gossip algorithms can lead to a
significant waste in energy by repeatedly recirculating redundant information.
For realistic sensor network model topologies like grids and random geometric
graphs, the inefficiency of gossip schemes is related to the slow mixing times
of random walks on the communication graph. We propose and analyze an
alternative gossiping scheme that exploits geographic information. By utilizing
geographic routing combined with a simple resampling method, we demonstrate
substantial gains over previously proposed gossip protocols. For regular graphs
such as the ring or grid, our algorithm improves standard gossip by factors of
n and sqrt(n), respectively. For the more challenging case of random
geometric graphs, our algorithm computes the true average to accuracy 1/n^a
using O(n^{3/2} sqrt(log n)) radio
transmissions, which yields a sqrt(n / log n) factor improvement over
standard gossip algorithms. We illustrate these theoretical results with
experimental comparisons between our algorithm and standard methods as applied
to various classes of random fields.
Comment: To appear, IEEE Transactions on Signal Processing
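The core idea above, averaging with a node near a randomly chosen location instead of a direct neighbour, can be sketched in a few lines. This is a simplified illustration under stated assumptions: greedy geographic routing is abstracted to "the node nearest a random target point" (a real sensor network would forward hop by hop over the communication graph), and the `geographic_gossip` name and parameters are hypothetical.

```python
import random

def geographic_gossip(coords, values, rounds=2000, seed=7):
    """Toy sketch of geographic gossip: a random node picks a uniformly
    random target location in the unit square, the node closest to that
    target is found (standing in for greedy geographic routing), and the
    two nodes perform a pairwise average."""
    rng = random.Random(seed)
    vals = list(values)
    n = len(coords)
    for _ in range(rounds):
        i = rng.randrange(n)
        tx, ty = rng.random(), rng.random()   # random target location
        j = min(range(n),
                key=lambda k: (coords[k][0] - tx) ** 2
                            + (coords[k][1] - ty) ** 2)
        avg = (vals[i] + vals[j]) / 2.0       # pairwise average preserves the sum
        vals[i] = vals[j] = avg
    return vals

# 4x4 grid on the unit square, node values 0..15 (true average 7.5)
grid = [(x / 3.0, y / 3.0) for x in range(4) for y in range(4)]
vals = geographic_gossip(grid, [float(v) for v in range(16)])
```

Because each exchange replaces two values by their mean, the global average is preserved exactly; the long-range pairings (rather than only adjacent ones) are what avoid the slow mixing of random walks on grids that the abstract mentions.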
Global Computation in a Poorly Connected World: Fast Rumor Spreading with No Dependence on Conductance
In this paper, we study the question of how efficiently a collection of
interconnected nodes can perform a global computation in the widely studied
GOSSIP model of communication. In this model, nodes do not know the global
topology of the network, and they may only initiate contact with a single
neighbor in each round. This model contrasts with the much less restrictive
LOCAL model, where a node may simultaneously communicate with all of its
neighbors in a single round. A basic question in this setting is how many
rounds of communication are required for the information dissemination problem,
in which each node has some piece of information and is required to collect all
others. In this paper, we give an algorithm that solves the information
dissemination problem in at most O(D + polylog(n)) rounds in a network
of diameter D, with no dependence on the conductance. This is at most an
additive polylogarithmic factor from the trivial lower bound of D, which
applies even in the LOCAL model. In fact, we prove that something stronger is
true: any algorithm that requires T rounds in the LOCAL model can be
simulated in O(T + polylog(n)) rounds in the GOSSIP model. We thus
prove that these two models of distributed computation are essentially
equivalent.
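The defining restriction of the GOSSIP model, one contact per node per round, can be made concrete with a small PUSH rumor-spreading simulation. This is only a sketch of the communication model, not the paper's algorithm (which handles arbitrary topologies without conductance dependence); the `gossip_broadcast` name, the complete-graph instance, and the round cap are assumptions for illustration.

```python
import random

def gossip_broadcast(adj, source, max_rounds=10_000, seed=3):
    """PUSH rumor spreading in the GOSSIP model: in each round, every
    informed node contacts ONE uniformly random neighbour (unlike the
    LOCAL model, it cannot talk to all neighbours at once) and passes
    the rumor on.  Returns the number of rounds until all nodes are
    informed, or None if the cap is hit."""
    rng = random.Random(seed)
    informed = {source}
    for r in range(1, max_rounds + 1):
        newly = set()
        for u in informed:
            v = rng.choice(adj[u])        # the single allowed contact
            if v not in informed:
                newly.add(v)
        informed |= newly
        if len(informed) == len(adj):
            return r
    return None

# complete graph on 64 nodes: the rumor spreads in O(log n) rounds
n = 64
adj = {u: [v for v in range(n) if v != u] for u in range(n)}
rounds = gossip_broadcast(adj, source=0)
```

On poorly connected graphs (low conductance) this naive push protocol slows down dramatically, which is exactly the dependence the abstract's algorithm removes.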
What you can do with Coordinated Samples
Sample coordination, where similar instances have similar samples, was
proposed by statisticians four decades ago as a way to maximize overlap in
repeated surveys. Coordinated sampling has since been used for summarizing
massive data sets.
The usefulness of a sampling scheme hinges on the scope and accuracy within
which queries posed over the original data can be answered from the sample. We
aim here to gain a fundamental understanding of the limits and potential of
coordination. Our main result is a precise characterization, in terms of simple
properties of the estimated function, of queries for which estimators with
desirable properties exist. We consider unbiasedness, nonnegativity, finite
variance, and bounded estimates.
Since generally a single estimator cannot be optimal (minimize variance
simultaneously) for all data, we propose {\em variance competitiveness}, which
means that the expectation of the square of the estimate on any data is not too
far from the minimum possible for that data. Surprisingly perhaps, we show how to
construct, for any function for which an unbiased nonnegative estimator exists,
a variance competitive estimator.
Comment: 4 figures, 21 pages, Extended Abstract appeared in RANDOM 201
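The coordination idea itself, "similar instances have similar samples", is commonly realized by ranking keys with one shared hash function and keeping the bottom-k ranks. The sketch below shows only that mechanism, not the paper's estimators; the `rank` and `bottom_k_sample` names are illustrative assumptions.

```python
import hashlib

def rank(key):
    """Shared pseudo-random rank in [0, 1).  Using the SAME hash for
    every instance is what coordinates the samples: a key's rank does
    not depend on which data set it appears in."""
    h = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(h[:8], "big") / 2**64

def bottom_k_sample(keys, k):
    """Bottom-k sample: keep the k keys with the smallest shared ranks."""
    return set(sorted(keys, key=rank)[:k])

a = set(range(100))
b = set(range(5, 105))        # a similar instance: 95 of 100 keys shared
sa = bottom_k_sample(a, 10)
sb = bottom_k_sample(b, 10)
# coordination at work: a shared key with a small rank lands in BOTH
# samples, so sa and sb overlap far more than two independent samples would
```

With independent samples the expected overlap of two 10-element samples here would be about one key; with coordinated ranks most of the sample is shared, which is the overlap-maximization property the abstract attributes to coordinated surveys.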
Fault-tolerant aggregation by flow updating
Data aggregation plays an important role in the design of scalable systems, allowing the determination of meaningful system-wide properties to direct the execution of distributed applications. In the particular case of wireless sensor networks, data collection is often only practicable if aggregation is performed. Several aggregation algorithms have been proposed in the last few years, exhibiting different properties in terms of accuracy, speed and communication tradeoffs. Nonetheless, existing approaches are found lacking in terms of fault tolerance. In this paper, we introduce a novel fault-tolerant averaging-based data aggregation algorithm. It tolerates substantial message loss (link failures), while competing algorithms in the same class can be affected by a single lost message. The algorithm is based on manipulating flows (in the graph-theoretical sense) that are updated using idempotent messages, providing it with unique robustness capabilities. Furthermore, evaluation results obtained by comparing it with other averaging approaches have revealed that it outperforms them in terms of time and message complexity.
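Why idempotent flow messages tolerate loss can be seen in a much-simplified sketch of the flow-updating idea. This is not the paper's algorithm verbatim: the pairwise update, the `flow_updating` name, and the ring topology are simplifying assumptions. Node i keeps a flow f[i][j] per neighbour and estimates the average as e_i = v_i - sum_j f[i][j]; because a receiver overwrites (rather than increments) its flow, a lost or duplicated message never destroys mass.

```python
import random

def flow_updating(adj, values, rounds=3000, drop_prob=0.3, seed=5):
    """Simplified flow-updating sketch.  A message from i to j carries
    (f[i][j], e_i); the receiver OVERWRITES f[j][i] = -f[i][j]
    (idempotent: resending the same message has no further effect) and
    then shifts its own flow so the two estimates meet halfway."""
    rng = random.Random(seed)
    f = {i: {j: 0.0 for j in adj[i]} for i in adj}

    def est(i):
        return values[i] - sum(f[i].values())

    for _ in range(rounds):
        i = rng.choice(list(adj))
        j = rng.choice(adj[i])
        msg = (f[i][j], est(i))
        if rng.random() >= drop_prob:     # message loss merely delays updates
            fij, ei = msg
            f[j][i] = -fij                # restore skew-symmetry (overwrite!)
            target = (ei + est(j)) / 2.0  # averaged estimate
            f[j][i] += est(j) - target    # shift flow so est(j) == target
    return [est(i) for i in adj]

# ring of 8 nodes with values 0..7 (true average 3.5), 30% message loss
n = 8
adj = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
ests = flow_updating(adj, {i: float(i) for i in range(n)})
```

The key property is that when flows are skew-symmetric (f[j][i] == -f[i][j] on every edge) the flow terms cancel in the sum of estimates, so sum_i e_i == sum_i v_i; overwriting restores that invariant after any loss, instead of permanently leaking mass the way additive (sum/weight) exchanges do.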