Better Sooner Rather Than Later
This article unifies and generalizes fundamental results related to
n-process asynchronous crash-prone distributed computing. More precisely, it
proves that for every 0 ≤ k ≤ n, assuming that process failures occur
only before the number of participating processes bypasses a predefined
threshold that equals n − k (a participating process is a process that has
executed at least one statement of its code), an asynchronous algorithm exists
that solves consensus for n processes in the presence of f crash failures
if and only if f ≤ k. In a very simple and interesting way, the "extreme"
case k = 0 boils down to the celebrated FLP impossibility result (1985, 1987).
Moreover, the second extreme case, namely k = n, captures the celebrated mutual
exclusion result by E.W. Dijkstra (1965) that states that mutual exclusion can
be solved for n processes in an asynchronous read/write shared memory system
where any number of processes may crash (but only) before starting to
participate in the algorithm (that is, participation is not required, but once
a process starts participating it may not fail). More generally, the
possibility/impossibility stated above demonstrates that more failures can be
tolerated when they occur earlier in the computation (hence the title).
Comment: 10 page
The eventual leadership in dynamic mobile networking environments
2007-2008 > Academic research: refereed > Refereed conference paper. Version of Record. Publishe
A Look at Basics of Distributed Computing
This paper presents concepts and basics of distributed computing that are important (at least from the author's point of view) and should be known and mastered by Master's students and engineers. Those include: (a) a characterization of distributed computing (which is too often confused with parallel computing); (b) the notion of a synchronous system and its associated notions of a local algorithm and message adversaries; (c) the notion of an asynchronous shared memory system and its associated notions of universality and progress conditions; and (d) the notion of an asynchronous message-passing system with its associated broadcast and agreement abstractions, its impossibility results, and approaches to circumvent them. Hence, the paper can be seen as a guided tour of the key elements that constitute the basics of distributed computing.
Liveness and Latency of Byzantine State-Machine Replication
Byzantine state-machine replication (SMR) ensures the consistency of replicated state in the presence of malicious replicas and lies at the heart of the modern blockchain technology. Byzantine SMR protocols often guarantee safety under all circumstances and liveness only under synchrony. However, guaranteeing liveness even under this assumption is nontrivial. So far we have lacked systematic ways of incorporating liveness mechanisms into Byzantine SMR protocols, which often led to subtle bugs. To close this gap, we introduce a modular framework to facilitate the design of provably live and efficient Byzantine SMR protocols. Our framework relies on a view abstraction generated by a special SMR synchronizer primitive to drive the agreement on command ordering. We present a simple formal specification of an SMR synchronizer and its bounded-space implementation under partial synchrony. We also apply our specification to prove liveness and analyze the latency of three Byzantine SMR protocols via a uniform methodology. In particular, one of these results yields what we believe is the first rigorous liveness proof for the algorithmic core of the seminal PBFT protocol
One for All and All for One: Scalable Consensus in a Hybrid Communication Model
How to solve consensus in the smallest window of synchrony
This paper addresses the following question: what is the minimum-sized synchronous window needed to solve consensus in an otherwise asynchronous system? In answer to this question, we present the first optimally-resilient algorithm ASAP that solves consensus as soon as possible in an eventually synchronous system, i.e., a system that from some time GST onwards, delivers messages in a timely fashion. ASAP guarantees that, in an execution with at most f failures, every process decides no later than round GST + f + 2, which is optimal
Fast Genuine Generalized Consensus
Consensus (agreeing on a sequence of commands) is central to the operation and performance of distributed systems. A well-known solution to consensus is Fast Paxos. In a recent paper, Lamport enhances Fast Paxos by leveraging the commutativity of concurrent commands. The new primitive, called Generalized Paxos, reduces the collision rate, and thus the latency, of Fast Paxos. However, if a collision occurs, Generalized Paxos needs four communication steps to recover, which is slower than Fast Paxos. This paper presents FGGC, a novel consensus algorithm that reduces the recovery delay when a collision occurs to one communication step. FGGC tolerates f < n/2 replica crashes, and during failure-free runs, processes learn commands in two steps if all commands commute, and three steps otherwise; this is optimal. Moreover, as long as no fault occurs, FGGC needs only f + 1 replicas to progress.