6 research outputs found

    Dependability in Aggregation by Averaging

    Aggregation is an important building block of modern distributed applications, allowing the determination of meaningful properties (e.g., network size, total storage capacity, average load, or majorities) that are used to direct the execution of the system. However, most existing aggregation algorithms exhibit significant dependability issues when considered for use in real application environments. In this paper, we reveal some dependability issues of aggregation algorithms based on iterative averaging techniques and give some directions for solving them. This class of algorithms is considered robust (when compared to common tree-based approaches), being independent of the routing topology used and providing an aggregation result at all nodes. However, their robustness is strongly challenged, and their correctness often compromised, when the assumptions about their working environment are changed to more realistic ones. The correctness of this class of algorithms relies on the maintenance of a fundamental invariant, commonly designated as "mass conservation". We argue that this invariant is often broken in practical settings, and that additional mechanisms and modifications are required to maintain it, incurring some degradation of the algorithms' performance. In particular, we discuss the behavior of three representative algorithms (the Push-Sum Protocol, the Push-Pull Gossip protocol and Distributed Random Grouping) under asynchronous and faulty environments, with message loss and node crashes. More specifically, we propose and evaluate two new versions of the Push-Pull Gossip protocol, which solve its message interleaving problem (evidenced even in a synchronous operation mode). Comment: 14 pages. Presented in Inforum 200
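
The "mass conservation" invariant named in the abstract above can be illustrated with a minimal Python sketch of a Push-Sum style averaging simulation. The function name `push_sum`, the `loss_rate` parameter and the synchronous round model are assumptions made for illustration and are not taken from the paper.

```python
import random

def push_sum(values, rounds, loss_rate=0.0, seed=0):
    """Simulate a Push-Sum style protocol in synchronous rounds.

    Each node holds a pair (s, w); its estimate of the average is s / w.
    Every round a node keeps half of its pair and pushes the other half
    to one random peer. `loss_rate` is a hypothetical knob: the probability
    that a pushed message is dropped.
    """
    rng = random.Random(seed)
    n = len(values)
    s = [float(v) for v in values]  # per-node sums
    w = [1.0] * n                   # per-node weights
    for _ in range(rounds):
        next_s = [0.0] * n
        next_w = [0.0] * n
        for i in range(n):
            half_s, half_w = s[i] / 2, w[i] / 2
            # The node always keeps one half for itself.
            next_s[i] += half_s
            next_w[i] += half_w
            # Pick a random peer other than i.
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            if rng.random() >= loss_rate:
                next_s[j] += half_s
                next_w[j] += half_w
            # Otherwise the message is lost and its "mass" leaves the system.
        s, w = next_s, next_w
    return s, w

if __name__ == "__main__":
    vals = [float(i) for i in range(20)]
    for loss in (0.0, 0.2):
        s, w = push_sum(vals, rounds=100, loss_rate=loss)
        print(f"loss={loss}: total s={sum(s):.6f}, total w={sum(w):.6f}")
```

Without loss, the totals of `s` and `w` stay equal to their initial values, so every `s[i] / w[i]` can converge to the true average. With loss, both totals shrink, which is one way the invariant is broken in the faulty settings the abstract discusses.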

    Spectra: Robust Estimation of Distribution Functions in Networks

    Distributed aggregation allows the derivation of a global aggregate property from many individual local values held at the nodes of an interconnected network system. Simple aggregates such as minima/maxima, counts, sums and averages have been thoroughly studied in the past and are important tools for distributed algorithms and network coordination. Nonetheless, aggregates of this kind may not be comprehensive enough to characterize biased data distributions or data containing outliers, making the case for richer estimates of the values on the network. This work presents Spectra, a distributed algorithm for the estimation of distribution functions over large-scale networks. The estimate is available at all nodes, and the technique exhibits important properties, namely robustness when exposed to high levels of message loss, fast convergence and fine precision in the estimate. It can also cope dynamically with changes of the sampled local property, without requiring algorithm restarts, and is highly resilient to node churn. The proposed approach is experimentally evaluated and contrasted with a competing state-of-the-art distribution aggregation technique. Comment: Full version of the paper published at 12th IFIP International Conference on Distributed Applications and Interoperable Systems (DAIS), Stockholm (Sweden), June 201
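
One common way to reduce distribution estimation to averaging is to note that the fraction of nodes whose value is at most a threshold equals the network-wide average of a 0/1 indicator. The Python sketch below shows only that reduction; the abstract does not give Spectra's internals, so the function names, the fixed `thresholds` list and the centralized average (standing in for a distributed averaging protocol) are all assumptions.

```python
def local_indicators(value, thresholds):
    """Vector a single node would contribute: 1.0 where value <= threshold."""
    return [1.0 if value <= t else 0.0 for t in thresholds]

def cdf_by_averaging(values, thresholds):
    """Estimate the CDF at each threshold as an average of indicators.

    The plain average here is a stand-in (assumption) for whatever
    distributed averaging protocol the nodes would actually run.
    """
    n = len(values)
    vectors = [local_indicators(v, thresholds) for v in values]
    return [sum(vec[k] for vec in vectors) / n for k in range(len(thresholds))]

if __name__ == "__main__":
    print(cdf_by_averaging([1, 2, 3, 4], [0, 2, 4]))
```

With this framing, any averaging algorithm that tolerates message loss and churn carries those properties over to the estimated distribution.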

    Robust distributed data aggregation

    Doctoral thesis, MAP-i Doctoral Programme in Informatics. Distributed aggregation algorithms are an important building block of modern large-scale systems, as they allow the determination of meaningful system-wide properties (e.g., network size, total storage capacity, average load, or majorities) which are required to direct the execution of distributed applications. In the last decade, several algorithms have been proposed to address the distributed computation of aggregation functions (e.g., COUNT, SUM, AVERAGE, and MAX/MIN), exhibiting different properties in terms of accuracy, speed and communication tradeoffs. However, existing approaches exhibit many issues when challenged in faulty and dynamic environments, lacking fault tolerance and support for churn. This study details a novel distributed aggregation approach, named Flow Updating, which is fault-tolerant and able to operate on dynamic networks. The algorithm is based on manipulating flows (inspired by the concept from graph theory) that are updated using idempotent messages, providing it with unique robustness capabilities. Experimental results showed that Flow Updating outperforms previous averaging algorithms in terms of time and message complexity and, unlike them, self-adapts to churn and to changes of the initial input values without requiring any periodic restart, supporting node crashes and high levels of message loss. In addition to this main contribution, others can also be found in this research work, namely: a definition of the aggregation problem is proposed; existing distributed aggregation algorithms are surveyed and classified into a comprehensive taxonomy; and a novel algorithm based on Flow Updating is introduced to estimate the Cumulative Distribution Function (CDF) of a global system attribute.
    It is expected that this work will constitute a relevant contribution to the area of distributed computing, in particular to the robust distributed computation of aggregation functions in dynamic networks.
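
The idea of flows updated by idempotent messages can be sketched as a small round-based Python simulation. This is a simplified illustration written under stated assumptions, not the thesis's reference implementation: the function name `flow_updating`, the synchronous rounds, the `neighbors` adjacency structure and the `loss_rate` parameter are all hypothetical.

```python
import random

def flow_updating(values, neighbors, rounds, loss_rate=0.0, seed=0):
    """Flow-based averaging sketch (assumed details, synchronous rounds).

    Node i keeps, per neighbor j, a flow flow[i][j] and the last estimate
    heard from j. Messages carry the sender's full flow and estimate, so
    receiving the same message twice, or missing one, does not corrupt state.
    """
    rng = random.Random(seed)
    n = len(values)
    flow = [{j: 0.0 for j in neighbors[i]} for i in range(n)]
    est = [{j: 0.0 for j in neighbors[i]} for i in range(n)]
    for _ in range(rounds):
        outbox = []
        for i in range(n):
            # Average own (flow-adjusted) value with neighbors' last estimates.
            e_i = (values[i] - sum(flow[i].values()) + sum(est[i].values())) \
                / (len(neighbors[i]) + 1)
            for j in neighbors[i]:
                # Adjust the flow so that neighbor j would reach e_i as well.
                flow[i][j] += e_i - est[i][j]
                est[i][j] = e_i
                outbox.append((i, j, flow[i][j], e_i))
        for i, j, f, e in outbox:
            if rng.random() < loss_rate:
                continue  # dropped message: the receiver keeps its old state
            # Idempotent update: overwrite with the symmetric flow.
            flow[j][i] = -f
            est[j][i] = e
    # A node's current estimate is its input minus its outgoing flows.
    return [values[i] - sum(flow[i].values()) for i in range(n)]

if __name__ == "__main__":
    # Two connected nodes holding 0 and 2; the average is 1.
    print(flow_updating([0, 2], [[1], [0]], rounds=5))
```

Because each message overwrites the receiver's view of the flow rather than adding to it, a lost message only delays the update; the next message from the same neighbor carries the full state again.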