164 research outputs found

    Robust epidemic aggregation under churn

    In large-scale distributed systems, data aggregation is a fundamental task that provides a global synopsis over a distributed set of data values. Epidemic protocols are based on a randomised communication paradigm inspired by biological systems and have been proposed as decentralised, scalable and fault-tolerant solutions to the data aggregation problem. However, in epidemic aggregation, node failure and churn have a detrimental effect on the accuracy of the local estimates of the global aggregation target. In this paper, a novel approach, the Robust Epidemic Aggregation Protocol (REAP), is proposed to provide robustness in the presence of churn by detecting three distinct phases in the aggregation process. An analysis of the impact of each phase on the estimation accuracy is provided. In particular, a novel mechanism is introduced to improve the phase that is most critical for the protocol's accuracy. REAP is validated by means of simulations and is shown to achieve convergence with a good level of accuracy for a reasonable range of node churn rates.
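    The local estimates that REAP protects are of the kind produced by push-sum style epidemic averaging, where each node's ratio of a running sum to a running weight converges to the global mean. The following is a minimal synchronous sketch of that baseline, under the assumption of a static, failure-free network; the churn detection and mass restoration that REAP adds are deliberately omitted, and all names are illustrative.

```python
import random

def push_sum(values, rounds=30):
    """Minimal synchronous push-sum averaging: every node halves its
    (sum, weight) pair and pushes one half to a random peer. The global
    'mass' is conserved, and each local estimate sum/weight converges
    to the global average. Illustrative baseline, not REAP itself."""
    n = len(values)
    s = list(values)              # per-node running sums
    w = [1.0] * n                 # per-node weights
    for _ in range(rounds):
        inbox = [(0.0, 0.0)] * n
        for i in range(n):
            j = random.randrange(n)                 # random gossip target
            s[i], w[i] = s[i] / 2, w[i] / 2         # keep half of the mass
            ds, dw = inbox[j]
            inbox[j] = (ds + s[i], dw + w[i])       # push the other half
        for i in range(n):
            s[i] += inbox[i][0]
            w[i] += inbox[i][1]
    return [s[i] / w[i] for i in range(n)]          # local estimates

print(push_sum([10.0, 20.0, 30.0, 40.0]))  # all estimates approach 25.0
```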

    Fault tolerant decentralised K-Means clustering for asynchronous large-scale networks

    The K-Means algorithm for cluster analysis is one of the most influential and popular data mining methods. Its straightforward parallel formulation is well suited to distributed-memory systems with reliable interconnection networks, such as massively parallel processors and clusters of workstations. However, in large-scale geographically distributed systems the straightforward parallel algorithm can be rendered useless by a single communication failure or by high latency in communication paths. The lack of scalable and fault-tolerant global communication and synchronisation methods in large-scale systems has hindered the adoption of the K-Means algorithm for applications in large networked systems such as wireless sensor networks, peer-to-peer systems and mobile ad hoc networks. This work proposes a fully distributed K-Means algorithm (EpidemicK-Means) which does not require global communication and is intrinsically fault tolerant. The proposed distributed K-Means algorithm provides a clustering solution which can approximate the solution of an ideal centralised algorithm over the aggregated data as closely as desired. A comparative performance analysis is carried out against state-of-the-art sampling methods and shows that the proposed method overcomes the limitations of sampling-based approaches for skewed cluster distributions. The experimental analysis confirms that the proposed algorithm is very accurate and fault tolerant under unreliable network conditions (message loss and node failures) and is suitable for asynchronous networks of very large and extreme scale.
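    The core idea can be pictured as each node computing local per-cluster sums and counts, then obtaining the global statistics by epidemic averaging instead of a global reduction. A compact sketch follows; for brevity the gossip step is replaced here by the exact global average it converges to, and all names and the initialisation choice are illustrative rather than the paper's implementation.

```python
import random

def nearest(point, centroids):
    """Index of the closest centroid by squared Euclidean distance."""
    return min(range(len(centroids)),
               key=lambda k: sum((p - c) ** 2
                                 for p, c in zip(point, centroids[k])))

def decentralised_kmeans(node_data, k, iters=10):
    """Sketch of a gossip-style distributed K-Means: each node builds
    local per-cluster (sum, count) statistics; epidemic aggregation
    would then average them network-wide. Here the exact average the
    gossip converges to is computed directly, for brevity."""
    dim = len(node_data[0][0])
    centroids = [list(p) for p in random.sample(node_data[0], k)]
    for _ in range(iters):
        sums = [[0.0] * dim for _ in range(k)]
        counts = [0] * k
        for points in node_data:            # local statistics per node
            for p in points:
                c = nearest(p, centroids)
                counts[c] += 1
                sums[c] = [s + x for s, x in zip(sums[c], p)]
        for c in range(k):                  # stand-in for gossip averaging
            if counts[c]:
                centroids[c] = [s / counts[c] for s in sums[c]]
    return centroids

# Three nodes, each holding one skewed cluster of 2-D points:
data = [[(random.gauss(m, 1), random.gauss(m, 1)) for _ in range(50)]
        for m in (0.0, 5.0, 10.0)]
print(decentralised_kmeans(data, k=3))
```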

    Agreement in epidemic data aggregation

    Computing and spreading global information in large-scale distributed systems pose significant challenges when scalability, parallelism, resilience and consistency are demanded. Epidemic protocols are a robust and scalable computing and communication paradigm that can be effectively used for information dissemination and data aggregation in a fully decentralised context, where each network node requires the local computation of a global synopsis function. Theoretical analysis of epidemic protocols for synchronous and static network models provides guarantees on the convergence to a global target and on the consistency among the network nodes. However, practical applications in real-world networks may require the explicit detection of both local convergence and global agreement (consensus). This work introduces the Epidemic Consensus Protocol (ECP) for the determination of consensus on the convergence of a decentralised data aggregation task. ECP adopts a heuristic method to locally detect convergence of the aggregation task and stochastic phase transitions to detect global agreement and reach consensus. The performance of ECP has been investigated by means of simulations and compared to a tree-based Three-Phase Commit protocol (3PC). Although, as expected, ECP exhibits total communication costs greater than those of the optimal tree-based protocol, it is shown to have better performance and scalability properties: ECP can achieve faster convergence to consensus for large system sizes and inherits the intrinsic decentralisation, fault tolerance and robustness of epidemic protocols.
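    A local convergence heuristic of the kind ECP relies on can be sketched as a counter over consecutive small changes of the estimate: a node declares local convergence once its estimate has moved by less than some epsilon for a fixed number of consecutive exchanges. The thresholds below are illustrative, not ECP's actual parameters.

```python
class ConvergenceDetector:
    """Local convergence heuristic: report convergence once the estimate
    has changed by less than `eps` for `window` consecutive gossip
    exchanges. Parameter values are illustrative, not ECP's."""

    def __init__(self, eps=1e-4, window=10):
        self.eps = eps
        self.window = window
        self.last = None
        self.stable = 0

    def update(self, estimate):
        if self.last is not None and abs(estimate - self.last) < self.eps:
            self.stable += 1      # estimate barely moved
        else:
            self.stable = 0       # significant change resets the counter
        self.last = estimate
        return self.stable >= self.window   # True => locally converged
```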

    Rapid and Round-free Multi-pair Asynchronous Push-Pull Aggregation

    As various distributed algorithms and services demand overall information on large-scale networks, protocols that aggregate data over networks are essential, and the quality of the aggregation determines the quality of those distributed algorithms and services. Although a variety of aggregation protocols have been proposed, gossip-based iterative aggregations have outstanding advantages, especially in accuracy, result distribution, topology independence, and resilience to network churn. However, most iterative aggregations, and push-pull style aggregations in particular, suffer from two synchronization constraints: synchronized rounds and synchronized communication. Namely, iterative protocols generally need prior configuration to synchronize rounds over all nodes, and messages must be exchanged in a synchronous way to ensure accurate estimates in push-pull or push-sum protocols. This paper proposes multi-pair asynchronous push-pull aggregation (MAPPA), which liberates push-pull aggregation from these synchronization constraints and seeks to accelerate the aggregation. MAPPA considerably reduces aggregation times and shows an improvement in fault tolerance. Thanks to the topology independence inherited from gossip mechanisms and to its rapidness, MAPPA is resilient to network churn and thus suitable for dynamic networks.
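    The synchronized-communication constraint is easiest to see in the basic push-pull update itself: two nodes exchange estimates and both adopt the midpoint, which conserves the global sum only if the exchange is atomic. A toy synchronous version is sketched below for reference; it is not MAPPA, which is precisely about removing this atomicity and round requirement.

```python
import random

def push_pull_average(values, rounds=30):
    """Toy synchronous push-pull averaging: each node pairs with a
    random peer and both take the midpoint of their estimates. The
    pairwise update conserves the global sum ('mass'); if the two
    writes were not atomic, mass could be lost or duplicated, which
    is the constraint asynchronous schemes like MAPPA must address."""
    est = list(values)
    n = len(est)
    for _ in range(rounds):
        for i in range(n):
            j = random.randrange(n)
            mid = (est[i] + est[j]) / 2   # atomic pairwise exchange
            est[i] = est[j] = mid
    return est

print(push_pull_average([0.0, 0.0, 100.0, 100.0]))  # approaches 50.0
```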

    The Bedrock of Byzantine Fault Tolerance: A Unified Platform for BFT Protocol Design and Implementation

    Byzantine Fault-Tolerant (BFT) protocols have recently been used extensively by decentralized data management systems with non-trustworthy infrastructures, e.g., permissioned blockchains. BFT protocols cover a broad spectrum of design dimensions, from infrastructure settings such as the communication topology, to more technical features such as the commitment strategy, and even fundamental social choice properties such as order-fairness. The proliferation of different BFT protocols has made it difficult to navigate the BFT landscape, let alone determine the protocol that best meets application needs. This paper presents Bedrock, a unified platform for the design, analysis, implementation, and experimental evaluation of BFT protocols. Bedrock proposes a design space consisting of a set of design choices that captures the trade-offs between design dimensions and provides fundamentally new insights into the strengths and weaknesses of BFT protocols. Bedrock enables users to analyze and experiment with BFT protocols within the space of plausible choices, evolve current protocols to design new ones, and even uncover previously unknown protocols. Our experimental results demonstrate the capability of Bedrock to uniformly evaluate BFT protocols in new ways that were not possible before due to the diverse assumptions made by these protocols. The results validate Bedrock's ability to analyze and derive BFT protocols.
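    One way to picture the design-space idea is as a record whose fields are design dimensions, so that a concrete protocol is one assignment of values. The dimensions below are paraphrased from the abstract and every field name is hypothetical; this is not Bedrock's actual API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BFTDesignPoint:
    """Hypothetical point in a BFT design space; field names are
    illustrative only and do not reflect Bedrock's implementation."""
    topology: str          # communication topology, e.g. "clique", "star"
    commitment: str        # commitment strategy, e.g. "linear", "speculative"
    order_fairness: bool   # whether fair transaction ordering is enforced
    view_change: str       # leader-replacement strategy

# A PBFT-like protocol as one assignment of the dimensions:
pbft_like = BFTDesignPoint(topology="clique", commitment="linear",
                           order_fairness=False, view_change="view-change")
```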

    Asynchronous epidemic algorithms for consistency in large-scale systems

    Achieving and detecting a globally consistent state is essential to many services in large and extreme-scale distributed systems, especially when the desired consistent state is critical to service operation. Centralised and deterministic approaches to synchronisation and distributed consistency are neither scalable nor fault-tolerant. Alternatively, epidemic-based paradigms are decentralised computations based on randomised communication. They are scalable, resilient, fault-tolerant, and converge to the desired target in logarithmic time with respect to system size. Thus, many distributed services have adopted epidemic protocols to achieve consensus and a consistent state, mainly due to scalability concerns. The convergence of epidemic protocols is stochastically guaranteed; however, the detection of convergence is probabilistic and non-explicit. In a real-world environment, systems are unreliable and epidemic protocols may fail to converge to the desired state. Thus, achieving convergence by itself does not ensure a system-wide consistent state under dynamic conditions. The research work presented in this thesis introduces the Phase Transition Algorithm (PTA) to achieve a distributed consistent state based on the explicit detection of convergence. Each phase in PTA is a decentralised decision-making process that implements epidemic data aggregation, in which the detection of convergence implies achieving a global agreement. The phases in PTA can be cascaded to achieve higher certainty as desired. Following the PTA, two epidemic protocols, namely PTP and ECP, are proposed to achieve consensus, i.e. consistency in data dissemination and data aggregation. The protocols are examined through simulations, and experimental results have validated their ability to achieve, and explicitly detect, consensus among system nodes. The research work has also studied epidemic data aggregation under node churn and network failures, in which the analysis has identified three phases of the aggregation process. The investigations have shown that node churn has a different impact on each phase. The phase that is critical for the aggregation process has been studied further, which led to the proposal of two new robust data aggregation protocols, REAP and REAP+. Each protocol has a different decentralised replication method, and both implement distributed failure detection and instantaneous mass restoration mechanisms. Simulations have validated the protocols, and results have shown their ability to converge, detect convergence, and produce competitive accuracy under various levels of node churn. Furthermore, distributed consistency in continuous systems is addressed in the research. The work has proposed a novel continuous epidemic protocol with an adaptive restart mechanism; the protocol restarts either upon the detection of system convergence or upon the detection of divergence. The protocol also introduces a seed selection method for peak data distribution in decentralised approaches, a challenge that previously required single-point initialisation and a leader-election step. The simulations validated the performance of the algorithm under static and dynamic conditions and confirmed that convergence and divergence detection accuracy can be tuned as desired.
    Finally, the research work shows that combining and integrating the proposed protocols enables extreme-scale distributed systems to achieve and detect globally consistent states even under realistic and dynamic conditions.
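    The cascading idea in PTA can be sketched as a chain of aggregation phases, where explicitly detecting convergence in one phase triggers the next, so that completing all phases raises the certainty of global agreement. The sketch below uses a simple push-pull aggregation and an illustrative stability test with a global view, which a real decentralised node would not have; parameters and names are assumptions, not the thesis implementation.

```python
import random

def run_phase(values, eps=1e-4, window=10, max_rounds=500):
    """One aggregation phase: gossip until estimates stay within `eps`
    of each other for `window` consecutive rounds. The global spread
    check is a simulation convenience; a real node would use only its
    local history, as in PTA's convergence detection."""
    est = list(values)
    n = len(est)
    stable = 0
    for r in range(1, max_rounds + 1):
        for i in range(n):
            j = random.randrange(n)
            est[i] = est[j] = (est[i] + est[j]) / 2   # push-pull step
        spread = max(est) - min(est)
        stable = stable + 1 if spread < eps else 0
        if stable >= window:       # convergence detected: phase complete
            return r
    return max_rounds

def phase_transition(values, phases=3):
    """Cascade phases: each detected convergence starts the next phase,
    so completing all of them gives higher certainty of agreement."""
    return [run_phase(values) for _ in range(phases)]

print(phase_transition([random.random() for _ in range(64)]))
```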

    Towards practicalization of blockchain-based decentralized applications

    Blockchain can be defined as an immutable ledger for recording transactions, maintained in a distributed network of mutually untrusting peers. Blockchain technology has been widely applied to fields far beyond its initial use in cryptocurrency. However, blockchain by itself is insufficient to meet all the desired security or efficiency requirements of diverse application scenarios. This dissertation focuses on two core functionalities that blockchain provides, i.e., robust storage and reliable computation. Three concrete application scenarios, Internet of Things (IoT), cybersecurity management (CSM), and peer-to-peer (P2P) content delivery networks (CDN), are used to elaborate the general design principles for these two main functionalities. Among them, the IoT and CSM applications involve the design of blockchain-based robust storage and management, while the P2P CDN requires reliable computation. Such general design principles, derived from disparate application scenarios, have the potential to realize the practicalization of many other blockchain-enabled decentralized applications. In the IoT application, blockchain-based decentralized data management is capable of handling faulty nodes, as designed in the cybersecurity application. An important issue, however, lies in the interaction between the external network and the blockchain network: external clients must rely on a relay node to communicate with the full nodes in the blockchain. Compromise of such relay nodes may result in a security breach and may even cut IoT sensors off from the network. Therefore, a censorship-resistant blockchain-based decentralized IoT management system is proposed. Experimental results from a proof-of-concept implementation deployed in a real distributed environment show its feasibility and effectiveness in achieving censorship resistance. The CSM application incorporates blockchain to provide robust storage of historical cybersecurity data so that, given a certain level of cyber intelligence, a defender can determine whether a network has been compromised and to what extent. The CSM functions can be categorized into three classes: network-centric (N-CSM), tools-centric (T-CSM) and application-centric (A-CSM). The cyber intelligence identifies new attackers, victims, or defense capabilities. Moreover, a decentralized storage network (DSN) is integrated to reduce on-chain storage costs without undermining robustness. Experiments with the prototype implementation and real-world cyber datasets show that the blockchain-based CSM solution is effective and efficient. The P2P CDN application explores and utilizes the reliable computation that blockchain empowers. In particular, a P2P CDN promises benefits, including cost savings and scalable peak-demand handling, compared with centralized CDNs. However, reliable P2P delivery requires proper enforcement of delivery fairness. Unfortunately, most existing studies on delivery fairness rest on non-cooperative game-theoretic assumptions that are arguably unrealistic in the ad hoc P2P setting. To address this issue, an expressive security requirement for fair P2P content delivery is defined, and two efficient blockchain-based approaches for P2P downloading and P2P streaming are proposed. The proposed system guarantees fairness for each party even when all others collude and arbitrarily misbehave, and it achieves asymptotically optimal on-chain costs and optimal delivery communication.
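    A standard building block for enforcing delivery fairness of this kind is to commit to the content's Merkle root on-chain so that every delivered chunk can be verified against that commitment and disputes can be settled objectively. The sketch below shows only that verification step, as an assumed simplification; it is not the dissertation's actual protocol.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_chunk(chunk: bytes, index: int, proof: list,
                 root: bytes) -> bool:
    """Verify a delivered chunk against a Merkle root that is assumed
    to be committed on-chain. `proof` holds the sibling hashes on the
    path from the chunk's leaf to the root. Illustrative building block
    only, not the dissertation's full fairness protocol."""
    node = sha256(chunk)
    for sibling in proof:
        if index % 2 == 0:             # current node is a left child
            node = sha256(node + sibling)
        else:                          # current node is a right child
            node = sha256(sibling + node)
        index //= 2
    return node == root
```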

    Albatross: An optimistic consensus algorithm

    The area of distributed ledgers is a vast and quickly developing landscape. At the heart of most distributed ledgers is their consensus protocol, which describes the way participants in a distributed network interact with each other to obtain and agree on a shared state. While classical Byzantine fault-tolerant (BFT) consensus algorithms are designed to work only in closed, size-limited networks, modern distributed ledgers -- and blockchains in particular -- often target open, permissionless networks. In this paper, we present a novel blockchain consensus algorithm, called Albatross, inspired by speculative BFT algorithms. Transactions in Albatross benefit from strong probabilistic finality. We describe the technical specification of Albatross in detail and analyse its security and performance. We conclude that the protocol is secure under standard PBFT security assumptions and that its performance is close to the theoretical maximum for single-chain Proof-of-Stake consensus algorithms.

    Project Final Report: HPC-Colony II

    • …