21 research outputs found

    Performance comparison between the Paxos and Chandra-Toueg consensus algorithms

    Protocols that solve agreement problems are essential building blocks for fault-tolerant distributed applications. While many protocols have been published, little has been done to analyze their performance. This paper represents a starting point for such studies, by focusing on the consensus problem, a problem related to most other agreement problems. The paper compares the latency of two consensus algorithms designed for the asynchronous model with failure detectors: the Paxos algorithm and the Chandra-Toueg algorithm. We varied the number of processes that take part in the execution. Moreover, we evaluated the latency in different classes of runs: (1) runs with neither failures nor failure suspicions, and (2) runs with failures but no wrong suspicions. We determined the latency by measurements on a cluster of PCs interconnected with a 100 Mbps Ethernet network. We found that the Paxos algorithm is more efficient than the Chandra-Toueg algorithm when the process that coordinates the first round of the protocol crashes. The two algorithms have almost the same performance in all other cases.

    Student mini-kernel project based on an FPGA board

    Solving atomic multicast when groups crash

    In this paper, we study the atomic multicast problem, a fundamental abstraction for building fault-tolerant systems. In the atomic multicast problem, the system is divided into non-empty and disjoint groups of processes. Multicast messages may be addressed to any subset of groups, each message possibly being multicast to a different subset. Several papers previously studied this problem either in local area networks [3, 9, 20] or wide area networks [13, 21]. However, none of them considered atomic multicast when groups may crash. We present two atomic multicast algorithms that tolerate the crash of groups. The first algorithm tolerates an arbitrary number of failures, is genuine (i.e., to deliver a message m, only addressees of m are involved in the protocol), and uses the perfect failure detector P. We show that among realistic failure detectors, i.e., those that do not predict the future, P is necessary to solve genuine atomic multicast if we do not bound the number of processes that may fail. Thus, P is the weakest realistic failure detector for solving genuine atomic multicast when an arbitrary number of processes may crash. Our second algorithm is non-genuine and less resilient to process failures than the first algorithm but has several advantages: (i) it requires perfect failure detection within groups only, and not across the system; (ii) as we show in the paper, it can be modified to rely on unreliable failure detection at the cost of a weaker liveness guarantee; and (iii) it is fast: messages addressed to multiple groups may be delivered within only two inter-group message delays.

    Semi-Passive Replication

    This paper presents the semi-passive replication technique -- a variant of passive replication -- that can be implemented in the asynchronous system model without requiring agreement on a primary (usually done using a membership service). Passive replication is a popular replication technique since it can tolerate non-deterministic servers (e.g., multi-threaded servers) and uses little processing power when compared to other replication techniques. However, passive replication suffers from a high reconfiguration cost in case of the failure of the primary (e.g., the cost of running the membership service to define a new view). The semi-passive replication technique presented in the paper offers the same advantages as passive replication. However, since it does not require a group membership service, the semi-passive replication technique has a considerably lower cost in case of failure. As explained in the paper, this technique can benefit from an aggressive time-out value that is significantly lower than what a group membership allows. As a result, the reaction to crashes is greatly improved. The semi-passive replication algorithm uses failure detectors. The algorithm given in the paper is analysed in the failure-free case and in the case of one server crash. The response time (for the client) in these two scenarios is analysed through simulation.