In Search for an Optimal Authenticated Byzantine Agreement
In this paper, we challenge the conventional approach in state machine replication systems of designing deterministic agreement protocols in the eventually synchronous communication model. We first prove that no such protocol can guarantee bounded communication cost before the global stabilization time, and we propose a different approach that hopes for the best (synchrony) but prepares for the worst (asynchrony). Accordingly, we design an optimistic Byzantine agreement protocol that first tries an efficient deterministic algorithm that relies on synchrony for termination only. Then, only if agreement was not reached due to asynchrony, the protocol falls back to a randomized asynchronous protocol that guarantees termination with probability 1.
We formally prove that our protocol achieves optimal communication complexity under all network conditions and failure scenarios. We first prove a lower bound of Ω(ft + t) for synchronous deterministic Byzantine agreement protocols, where t is the failure threshold, and f is the actual number of failures. Then, we present a tight upper bound and use it for the synchronous part of the optimistic protocol. Finally, for the asynchronous fallback, we use a variant of the (optimal) VABA protocol, which we reconstruct to safely combine it with the synchronous part.
We believe that our failure-adaptive synchronous Byzantine agreement protocol is of independent interest, since it is the first protocol we are aware of whose communication complexity depends optimally on the actual number of failures.
A Peered Bulletin Board for Robust Use in Verifiable Voting Systems
The Web Bulletin Board (WBB) is a key component of verifiable election
systems. It is used in the context of election verification to publish evidence
of voting and tallying that voters and officials can check, and where
challenges can be launched in the event of malfeasance. In practice, the
election authority has responsibility for implementing the web bulletin board
correctly and reliably, and will wish to ensure that it behaves correctly even
in the presence of failures and attacks. To ensure robustness, an
implementation will typically use a number of peers to be able to provide a
correct service even when some peers go down or behave dishonestly. In this
paper we propose a new protocol to implement such a Web Bulletin Board,
motivated by the needs of the vVote verifiable voting system. Using a
distributed algorithm increases the complexity of the protocol and requires
careful reasoning in order to establish correctness. Here we use the Event-B
modelling and refinement approach to establish correctness of the peered design
against an idealised specification of the bulletin board behaviour. In
particular we show that for n peers, a threshold of t > 2n/3 peers behaving
correctly is sufficient to ensure correct behaviour of the bulletin board
distributed design. The algorithm also behaves correctly even if honest or
dishonest peers temporarily drop out of the protocol and then return. The
verification approach also establishes that the protocols used within the
bulletin board do not interfere with each other. This is the first time a
peered web bulletin board suite of protocols has been formally verified.
Comment: 49 pages
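The threshold condition stated in the abstract above (t > 2n/3 correctly behaving peers out of n) can be illustrated with a small arithmetic sketch. This is not taken from the paper: the function names `min_correct_peers` and `max_faulty_peers` are hypothetical, and the code only works out what the inequality implies for a given n.

```python
def min_correct_peers(n: int) -> int:
    """Smallest integer t satisfying t > 2n/3 (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be a positive number of peers")
    # floor(2n/3) + 1 is the least integer strictly greater than 2n/3.
    return (2 * n) // 3 + 1


def max_faulty_peers(n: int) -> int:
    """Largest number of peers that may misbehave while t > 2n/3 still holds."""
    return n - min_correct_peers(n)


if __name__ == "__main__":
    for n in (3, 4, 6, 7):
        print(n, min_correct_peers(n), max_faulty_peers(n))
```

Under this reading, a board with 4 peers needs at least 3 to behave correctly and therefore tolerates 1 faulty peer, while a board with 3 peers tolerates none.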
Optimistic Parallel State-Machine Replication
State-machine replication, a fundamental approach to fault tolerance,
requires replicas to execute commands deterministically, which usually results
in sequential execution of commands. Sequential execution limits performance
and underuses servers, which are increasingly parallel (i.e., multicore). To
narrow the gap between state-machine replication requirements and the
characteristics of modern servers, researchers have recently come up with
alternative execution models. This paper surveys existing approaches to
parallel state-machine replication and proposes a novel optimistic protocol
that inherits the scalable features of previous techniques. Using a replicated
B+-tree service, we demonstrate in the paper that our protocol outperforms the
most efficient techniques by a factor of 2.4.
A Survey of Fault-Tolerance and Fault-Recovery Techniques in Parallel Systems
Supercomputing systems today often come in the form of large numbers of
commodity systems linked together into a computing cluster. These systems, like
any distributed system, can have large numbers of independent hardware
components cooperating or collaborating on a computation. Unfortunately, any of
this vast number of components can fail at any time, resulting in potentially
erroneous output. In order to improve the robustness of supercomputing
applications in the presence of failures, many techniques have been developed
to provide resilience to these kinds of system faults. This survey provides an
overview of these various fault-tolerance techniques.
Comment: 11 pages