4 research outputs found
State Machine Replication Is More Expensive Than Consensus
Consensus and State Machine Replication (SMR) are generally considered to be equivalent problems. In certain system models, indeed, the two problems are computationally equivalent: any solution to the former problem leads to a solution to the latter, and vice versa.
In this paper, we study the relation between consensus and SMR from a complexity perspective. We find that, surprisingly, completing an SMR command can be more expensive than solving a consensus instance. Specifically, given a synchronous system model where every instance of consensus always terminates in constant time, completing an SMR command does not necessarily terminate in constant time. This result naturally extends to partially synchronous models. Besides theoretical interest, our result also corresponds to practical phenomena we identify empirically. We experiment with two well-known SMR implementations (Multi-Paxos and Raft) and show that, indeed, SMR is more expensive than consensus in practice. One important implication of our result is that - even under synchrony conditions - no SMR algorithm can ensure bounded response times
A Dual Digraph Approach for Leaderless Atomic Broadcast (Extended Version)
Many distributed systems work on a common shared state; in such systems,
distributed agreement is necessary for consistency. With an increasing number
of servers, these systems become more susceptible to single-server failures,
increasing the relevance of fault-tolerance. Atomic broadcast enables
fault-tolerant distributed agreement, yet it is costly to solve. Most practical
algorithms entail linear work per broadcast message. AllConcur -- a leaderless
approach -- reduces the work, by connecting the servers via a sparse resilient
overlay network; yet, this resiliency entails redundancy, limiting the
reduction of work. In this paper, we propose AllConcur+, an atomic broadcast
algorithm that lifts this limitation: During intervals with no failures, it
achieves minimal work by using a redundancy-free overlay network. When failures
do occur, it automatically recovers by switching to a resilient overlay
network. In our performance evaluation of non-failure scenarios, AllConcur+
achieves comparable throughput to AllGather -- a non-fault-tolerant distributed
agreement algorithm -- and outperforms AllConcur, LCR and Libpaxos both in
terms of throughput and latency. Furthermore, our evaluation of failure
scenarios shows that AllConcur+'s expected performance is robust with regard to
occasional failures. Thus, for realistic use cases, leveraging redundancy-free
distributed agreement during intervals with no failures improves performance
significantly.Comment: Overview: 24 pages, 6 sections, 3 appendices, 8 figures, 3 tables.
Modifications from previous version: extended the evaluation of AllConcur+
with a simulation of a multiple datacenters deploymen
Low cost consensus-based atomic broadcast
Theme 1 - Reseaux et systemes - Projet AdpSIGLEAvailable from INIST (FR), Document Supply Service, under shelf-number : 22588, issue : a.2000 n.1341 / INIST-CNRS - Institut de l'Information Scientifique et TechniqueFRFranc