514 research outputs found

    Scalable Byzantine Reliable Broadcast

    Get PDF
    Byzantine reliable broadcast is a powerful primitive that allows a set of processes to agree on a message from a designated sender, even if some processes (including the sender) are Byzantine. Existing broadcast protocols for this setting scale poorly, as they typically build on quorum systems with strong intersection guarantees, which results in linear per-process communication and computation complexity. We generalize the Byzantine reliable broadcast abstraction to the probabilistic setting, allowing each of its properties to be violated with a fixed, arbitrarily small probability. We leverage these relaxed guarantees in a protocol where we replace quorums with stochastic samples. Compared to quorums, samples are significantly smaller in size, leading to a more scalable design. We obtain the first Byzantine reliable broadcast protocol with logarithmic per-process communication and computation complexity. We conduct a complete and thorough analysis of our protocol, deriving bounds on the probability of each of its properties being compromised. During our analysis, we introduce a novel general technique that we call adversary decorators. Adversary decorators allow us to make claims about the optimal strategy of the Byzantine adversary without imposing any additional assumptions. We also introduce Threshold Contagion, a model of message propagation through a system with Byzantine processes. To the best of our knowledge, this is the first formal analysis of a probabilistic broadcast protocol in the Byzantine fault model. We show numerically that practically negligible failure probabilities can be achieved with realistic security parameters

    Observing the clouds : a survey and taxonomy of cloud monitoring

    Get PDF
    This research was supported by a Royal Society Industry Fellowship and an Amazon Web Services (AWS) grant. Date of Acceptance: 10/12/2014Monitoring is an important aspect of designing and maintaining large-scale systems. Cloud computing presents a unique set of challenges to monitoring including: on-demand infrastructure, unprecedented scalability, rapid elasticity and performance uncertainty. There are a wide range of monitoring tools originating from cluster and high-performance computing, grid computing and enterprise computing, as well as a series of newer bespoke tools, which have been designed exclusively for cloud monitoring. These tools express a number of common elements and designs, which address the demands of cloud monitoring to various degrees. This paper performs an exhaustive survey of contemporary monitoring tools from which we derive a taxonomy, which examines how effectively existing tools and designs meet the challenges of cloud monitoring. We conclude by examining the socio-technical aspects of monitoring, and investigate the engineering challenges and practices behind implementing monitoring strategies for cloud computing.Publisher PDFPeer reviewe

    NEEM: network-friendly epidemic multicast

    Get PDF
    Epidemic, or probabilistic, multicast protocols have emerged as a viable mechanism to circumvent the scalabil- ity problems of reliable multicast protocols. However, most existing epidemic approaches use connectionless transport protocols to exchange messages and rely on the intrinsic robustness of the epidemic dissemination to mask network omissions. Unfortunately, such an approach is not network- friendly, since the epidemic protocol makes no effort to re- duce the load imposed on the network when the system is congested. In this paper, we propose a novel epidemic protocol whose main characteristic is to be network-friendly. This property is achieved by relying on connection-oriented transport connections, such as TCP/IP, to support the com- munication among peers. Since during congestion mes- sages accumulate in the border of the network, the pro- tocol uses an innovative buffer management scheme, that combines different selection techniques to discard messages upon overflow. This technique improves the quality of the information delivered to the application during periods of network congestion. The protocol has been implemented and the benefits of the approach are illustrated using a com- bination of experimental and simulation results

    Epidemic broadcast trees

    Get PDF
    There is an inherent trade-off between epidemic and deterministic tree-based broadcast primitives. Tree-based approaches have a small message complexity in steady-state but are very fragile in the presence of faults. Gossip, or epidemic, protocols have a higher message complexity but also offer much higher resilience. This paper proposes an integrated broadcast scheme that combines both approaches. We use a low cost scheme to build and maintain broadcast trees embedded on a gossip-based overlay. The protocol sends the message payload preferably via tree branches but uses the remaining links of the gossip overlay for fast recovery and expedite tree healing. Experimental evaluation presented in the paper shows that our new strategy has a low overhead and that is able to support large number of faults while maintaining a high reliability.This work was partially supported by project P-SON: Probabilistically Structured Overlay Networks (POSC/EIA/60941/2004)

    Hybrid Dissemination: Adding Determinism to Probabilistic Multicasting in Large-Scale P2P Systems

    Get PDF
    Abstract. Epidemic protocols have demonstrated remarkable scalability and robustness in disseminating information on internet-scale, dynamic P2P systems. However, popular instances of such protocols suffer from a number of significant drawbacks, such as increased message overhead in push-based systems, or low dissemination speed in pull-based ones. In this paper we study push-based epidemic dissemination algorithms, in terms of hit ratio, communication overhead, dissemination speed, and resilience to failures and node churn. We devise a hybrid push-based dissemination algorithm, combining probabilistic with deterministic properties, which limits message overhead to an order of magnitude lower than that of the purely probabilistic dissemination model, while retaining strong probabilistic guarantees for complete dissemination of messages. Our extensive experimentation shows that our proposed algorithm outperforms that model both in static and dynamic network scenarios, as well as in the face of large-scale catastrophic failures. Moreover, the proposed algorithm distributes the dissemination load uniformly on all participating nodes. Keywords: Epidemic/Gossip protocols, Information Dissemination, Peer-to-Peer

    Emergent structure in unstructured epidemic multicast

    Get PDF
    In epidemic or gossip-based multicast protocols, each node simply relays each message to some random neighbors, such that all destinations receive it at least once with high proba- bility. In sharp contrast, structured multicast protocols explicitly build and use a spanning tree to take advantage of efficient paths, and aim at having each message received exactly once. Unfortunately, when failures occur, the tree must be rebuilt. Gossiping thus provides simplicity and resilience at the expense of performance and resource efficiency. In this paper we propose a novel technique that exploits knowledge about the environment to schedule payload transmission when gossiping. The resulting protocol retains the desirable qualities of gossip, but approximates the performance of structured multicast. In some sense, instead of imposing structure by construction, we let it emerge from the operation of the gossip protocol. Experimental evaluation shows that this approach is effective even when knowledge about the environment is only approximate.(undefined

    GOSSIPKIT: A Unified Component Framework for Gossip

    Get PDF
    International audienceAlthough the principles of gossip protocols are relatively easy to grasp, their variety can make their design and evaluation highly time consuming. This problem is compounded by the lack of a unified programming framework for gossip, which means developers cannot easily reuse, compose, or adapt existing solutions to fit their needs, and have limited opportunities to share knowledge and ideas. In this paper, we consider how component frameworks, which have been widely applied to implement middleware solutions, can facilitate the development of gossip-based systems in a way that is both generic and simple. We show how such an approach can maximise code reuse, simplify the implementation of gossip protocols, and facilitate dynamic evolution and re-deployment

    Mobile Computing in Digital Ecosystems: Design Issues and Challenges

    Full text link
    In this paper we argue that the set of wireless, mobile devices (e.g., portable telephones, tablet PCs, GPS navigators, media players) commonly used by human users enables the construction of what we term a digital ecosystem, i.e., an ecosystem constructed out of so-called digital organisms (see below), that can foster the development of novel distributed services. In this context, a human user equipped with his/her own mobile devices, can be though of as a digital organism (DO), a subsystem characterized by a set of peculiar features and resources it can offer to the rest of the ecosystem for use from its peer DOs. The internal organization of the DO must address issues of management of its own resources, including power consumption. Inside the DO and among DOs, peer-to-peer interaction mechanisms can be conveniently deployed to favor resource sharing and data dissemination. Throughout this paper, we show that most of the solutions and technologies needed to construct a digital ecosystem are already available. What is still missing is a framework (i.e., mechanisms, protocols, services) that can support effectively the integration and cooperation of these technologies. In addition, in the following we show that that framework can be implemented as a middleware subsystem that enables novel and ubiquitous forms of computation and communication. Finally, in order to illustrate the effectiveness of our approach, we introduce some experimental results we have obtained from preliminary implementations of (parts of) that subsystem.Comment: Proceedings of the 7th International wireless Communications and Mobile Computing conference (IWCMC-2011), Emergency Management: Communication and Computing Platforms Worksho
    • …
    corecore