Spectra: Robust Estimation of Distribution Functions in Networks
Distributed aggregation allows the derivation of a given global aggregate
property from many individual local values in nodes of an interconnected
network system. Simple aggregates such as minima/maxima, counts, sums and
averages have been thoroughly studied in the past and are important tools for
distributed algorithms and network coordination. Nonetheless, aggregates of this
kind may not be comprehensive enough to characterize biased data distributions
or data containing outliers, making the case for richer estimates of the values
on the network. This work presents Spectra, a
distributed algorithm for the estimation of distribution functions over large
scale networks. The estimate is available at all nodes, and the technique
exhibits important properties, namely robustness to high levels of message
loss, fast convergence and fine precision in the estimate. It can also cope
dynamically with changes in the sampled local property without requiring
algorithm restarts, and it is highly resilient to node churn. The proposed
approach is experimentally evaluated and contrasted with a competing
state-of-the-art distribution aggregation technique.
Comment: Full version of the paper published at 12th IFIP International
Conference on Distributed Applications and Interoperable Systems (DAIS),
Stockholm (Sweden), June 201
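As a rough illustration of the quantity being estimated (not of Spectra itself), the sketch below computes an empirical distribution function over a set of node values. It also shows that the value at each threshold is simply the average of per-node 0/1 indicators, which is one reason averaging-style aggregation can be relevant here. The function names, the sample values and the thresholds are assumptions made for this example.

```python
from bisect import bisect_right


def empirical_cdf(values, thresholds):
    """Fraction of values that are <= each threshold (global view)."""
    ordered = sorted(values)
    n = len(ordered)
    return [bisect_right(ordered, t) / n for t in thresholds]


def indicator_vector(value, thresholds):
    """Per-node vector: 1.0 where the node's value is <= the threshold."""
    return [1.0 if value <= t else 0.0 for t in thresholds]


def cdf_by_averaging(values, thresholds):
    """Same result, obtained as the average of the per-node indicator vectors."""
    vectors = [indicator_vector(v, thresholds) for v in values]
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Hypothetical node values with one outlier (100).
node_values = [1, 2, 2, 3, 100]
thresholds = [0, 2, 3, 50, 100]

mean = sum(node_values) / len(node_values)
cdf = empirical_cdf(node_values, thresholds)
```

Here the mean is pulled far above four of the five values by the single outlier, while the distribution function still shows that 80% of the nodes hold a value of at most 3.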
Fault-Tolerant Aggregation: Flow-Updating Meets Mass-Distribution
Flow-Updating (FU) is a fault-tolerant technique that has proved to be
efficient in practice for the distributed computation of aggregate functions in
communication networks where individual processors do not have access to global
information. Previous distributed aggregation protocols, based on repeated
sharing of input values (or mass) among processors, sometimes called
Mass-Distribution (MD) protocols, are not resilient to communication failures
(or message loss) because such failures yield a loss of mass. In this paper, we
present a protocol which we call Mass-Distribution with Flow-Updating (MDFU).
We obtain MDFU by applying FU techniques to classic MD. We analyze the
convergence time of MDFU showing that stochastic message loss produces low
overhead. This is the first convergence proof of an FU-based algorithm. We
evaluate MDFU experimentally, comparing it with previous MD and FU protocols,
and verifying the behavior predicted by the analysis. Finally, given that MDFU
incurs a fixed deviation proportional to the message-loss rate, we adjust the
accuracy of MDFU heuristically in a new protocol called MDFU with Linear
Prediction (MDFU-LP). The evaluation shows that both MDFU and MDFU-LP behave
very well in practice, even under high rates of message loss and even when the
input values change dynamically.
Comment: 18 pages, 5 figures, To appear in OPODIS 201
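The claim that message loss yields a loss of mass can be seen in a toy simulation. The sketch below is a deliberately simplified mass-sharing round on a fully connected network; it is not MDFU, Flow-Updating or any published MD protocol. The function name, the halving rule and the uniform choice of target are all assumptions made for illustration.

```python
import random


def mass_distribution(values, rounds, loss_rate, seed=0):
    """Toy mass sharing: every round each node sends half of its mass to a
    uniformly chosen node; a message is dropped with probability loss_rate."""
    rng = random.Random(seed)
    mass = list(values)
    n = len(mass)
    for _ in range(rounds):
        inbox = [0.0] * n
        for i in range(n):
            share = mass[i] / 2.0
            mass[i] -= share              # the sender always gives the share away
            target = rng.randrange(n)
            if rng.random() >= loss_rate:
                inbox[target] += share    # delivered
            # otherwise the share is lost and leaves the system for good
        for i in range(n):
            mass[i] += inbox[i]
    return mass


initial = [1.0, 2.0, 3.0, 4.0]
reliable = mass_distribution(initial, rounds=5, loss_rate=0.0)
lossy = mass_distribution(initial, rounds=5, loss_rate=1.0)
```

With reliable delivery the total mass stays equal to the initial sum. When every message is dropped, half of the remaining mass disappears each round, so any aggregate derived from the total becomes wrong.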
A survey of distributed data aggregation algorithms
Distributed data aggregation is an important task, allowing the decentralized determination of meaningful global properties, which can then be used to direct the execution of other applications. The resulting values are derived by the distributed computation of functions like COUNT, SUM, and AVERAGE. Application examples include the determination of the network size, total storage capacity, average load, majorities and many others. In the last decade, many different approaches have been proposed, with different trade-offs in terms of accuracy, reliability, message and time complexity. Due to the considerable number and variety of aggregation algorithms, it can be difficult and time-consuming to determine which techniques are most appropriate in specific settings, justifying the existence of a survey to aid in this task. This work reviews the state of the art on distributed data aggregation algorithms, providing three main contributions. First, it formally defines the concept of aggregation, characterizing the different types of aggregation functions. Second, it succinctly describes the main aggregation techniques, organizing them in a taxonomy. Finally, it provides some guidelines toward the selection and use of the most relevant techniques, summarizing their principal characteristics.
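A minimal sketch of how COUNT, SUM and AVERAGE can be derived from small partial states that are merged pairwise is shown below. It is not taken from the survey; the `(sum, count)` state, the function names and the sample load values are assumptions made for illustration.

```python
from functools import reduce


def local_state(value):
    """Partial aggregate held by a single node: (sum, count)."""
    return (value, 1)


def merge(a, b):
    """Combine two partial aggregates; the grouping of merges does not matter."""
    return (a[0] + b[0], a[1] + b[1])


def summarize(state):
    """Derive the three aggregates from a merged (sum, count) state."""
    total, count = state
    return {"COUNT": count, "SUM": total, "AVERAGE": total / count}


# Hypothetical per-node load values.
loads = [4, 8, 6, 2]
states = [local_state(v) for v in loads]

left_to_right = reduce(merge, states)
pairwise = merge(merge(states[0], states[1]), merge(states[2], states[3]))
result = summarize(left_to_right)
```

Because `merge` is associative and commutative, partial states can be combined along any tree over the nodes. This simple scheme is not safe when the same state is merged more than once, since the duplicate would be counted twice.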