158 research outputs found

    Practical impact of group communication theory

    Andre Schiper
    Group communication is an important topic in fault-tolerant distributed applications. The paper summarizes the main contributions of practical importance to our current understanding of group communication. These contributions are classified into "abstractions" and "specifications", "paradigms", "system models", "algorithms", and "theoretical results". Some open issues are discussed at the end of the paper.

    Failure Detection vs. Group Membership in Fault-Tolerant Distributed Systems: Hidden Trade-Offs

    Failure detection and group membership are two important components of fault-tolerant distributed systems. Understanding their role is essential when developing efficient solutions, not only in failure-free runs, but also in runs in which processes do crash. While group membership provides consistent information about the status of processes in the system, failure detectors provide inconsistent information. This paper discusses the trade-offs related to the use of these two components, and clarifies their roles using three examples. The first example shows a case where group membership may favourably be replaced by a failure detection mechanism. The second example illustrates a case where group membership is mandatory. Finally, the third example shows a case where neither group membership nor failure detectors are needed (they may be replaced by weak ordering oracles).
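    To make the phrase "inconsistent information" concrete, here is a minimal sketch of a timeout-based failure detector. It is not taken from the paper: the class name `TimeoutFailureDetector`, the heartbeat scheme, and the timeout value are all assumptions chosen for illustration. Each observer runs its own detector, so two observers can disagree about whether the same process is suspected.

```python
from typing import Dict, Set


class TimeoutFailureDetector:
    """Hypothetical local failure detector: suspects a process whose
    last heartbeat is older than `timeout` time units."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.last_heartbeat: Dict[str, float] = {}

    def heartbeat(self, process: str, now: float) -> None:
        # Record the arrival time of a heartbeat from `process`.
        self.last_heartbeat[process] = now

    def suspects(self, now: float) -> Set[str]:
        # A process is suspected if its latest heartbeat is too old.
        return {
            p for p, t in self.last_heartbeat.items()
            if now - t > self.timeout
        }


# Two observers monitor the same process "q" with the same timeout.
observer_a = TimeoutFailureDetector(timeout=5.0)
observer_b = TimeoutFailureDetector(timeout=5.0)

# The heartbeat sent at time 0 reaches both observers.
observer_a.heartbeat("q", now=0.0)
observer_b.heartbeat("q", now=0.0)

# The heartbeat sent at time 4 reaches only observer A (assumed delayed to B).
observer_a.heartbeat("q", now=4.0)

# At time 6 the two local views differ.
view_a = observer_a.suspects(now=6.0)
view_b = observer_b.suspects(now=6.0)
```

    In this run observer A does not suspect `q` while observer B does, although `q` has not crashed. A group membership service, by contrast, would have all members agree on a single view of who is in the group.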

    Handling message semantics with Generic Broadcast protocols


    Optimistic atomic broadcast: a pragmatic viewpoint

    F. Pedone and A. Schiper
    This paper presents the Optimistic Atomic Broadcast algorithm (OPT-ABcast), which exploits the spontaneous total-order property experienced in local-area networks in order to allow fast delivery of messages. The OPT-ABcast algorithm is based on a sequence of stages, and messages can be delivered during a stage or at the end of a stage. During a stage, processes deliver messages fast. Whenever the spontaneous total-order property does not hold, processes terminate the current stage and start a new one by solving a Consensus problem, which may lead to the delivery of some messages. We evaluate the efficiency of the OPT-ABcast algorithm using the notion of delivery latency.
    Keywords: Optimistic algorithms; Atomic broadcast; Efficient algorithms; Consensus; Asynchronous systems
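    The stage idea can be illustrated with a toy, centralized simulation. This is only a sketch under strong assumptions and not the OPT-ABcast algorithm itself: the function `simulate_stage` is hypothetical, it sees every process's receive order at once, and the Consensus step is replaced by a stub that orders the leftover messages by sorting them.

```python
from typing import Dict, List, Tuple


def simulate_stage(received: Dict[str, List[str]]) -> Tuple[List[str], List[str]]:
    """Toy model of one stage (hypothetical, not the paper's protocol).

    `received` maps each process to the order in which it received messages.
    Returns (fast, decided):
      fast    - messages delivered during the stage, i.e. the prefix on
                which all receive orders agree (spontaneous total order);
      decided - messages ordered at the end of the stage by a stubbed
                "consensus" (here: plain sorting of the leftovers).
    """
    orders = list(received.values())

    fast: List[str] = []
    for position in zip(*orders):
        # The optimistic assumption fails at the first position where
        # the processes received different messages.
        if len(set(position)) != 1:
            break
        fast.append(position[0])

    # Stub for Consensus: agree on a deterministic order for every
    # message that was received but not delivered fast.
    pending = set()
    for order in orders:
        pending.update(order[len(fast):])
    decided = sorted(pending)

    return fast, decided


received = {
    "p1": ["m1", "m2", "m3", "m4"],
    "p2": ["m1", "m2", "m4", "m3"],  # m3 and m4 arrive swapped at p2
    "p3": ["m1", "m2", "m3", "m4"],
}
fast, decided = simulate_stage(received)
```

    Here `m1` and `m2` are delivered during the stage because every process received them in the same order, while `m3` and `m4` are ordered by the stubbed agreement step that closes the stage. A real implementation would run distributed Consensus among the processes instead of sorting.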

    From Group Communication to Transactions in Distributed Systems


    Fault-Tolerant Mobile Agent Execution


    A new algorithm to implement causal ordering


    Performance Analysis of a Consensus Algorithm Combining Stochastic Activity Networks and Measurements

    A. Coccoli, P. Urban, A. Bondavalli, and A. Schiper. Performance analysis of a consensus algorithm combining Stochastic Activity Networks and measurements. In Proc. Int'l Conf. on Dependable Systems and Networks (DSN), pages 551-560, Washington, DC, USA, June 2002.
    Protocols which solve agreement problems are essential building blocks for fault-tolerant distributed applications. While many protocols have been published, little has been done to analyze their performance. This paper represents a starting point for such studies, by focusing on the consensus problem, a problem related to most other agreement problems. The paper analyzes the latency of a consensus algorithm designed for the asynchronous model with failure detectors, by combining experiments on a cluster of PCs and simulation using Stochastic Activity Networks. We evaluated the latency in runs (1) with neither failures nor failure suspicions, (2) with failures but no wrong suspicions, and (3) with no failures but with (wrong) failure suspicions. We validated the adequacy and the usability of the Stochastic Activity Network model by comparing experimental results with those obtained from the model. This has led us to identify limitations of the model and the measurements, and suggests new directions for evaluating the performance of agreement protocols.
    Keywords: quantitative analysis, distributed consensus, failure detectors, Stochastic Activity Networks, measurement