REDUCING OVERHEAD OF SELF-STABILIZING BYZANTINE AGREEMENT PROTOCOLS FOR BLOCKCHAIN USING HTTP/3 PROTOCOL: A PERSPECTIVE VIEW
Today, there is a tendency to reduce the dependence on local computation in favor of cloud computing. However, this inadvertently increases the reliance upon distributed fault-tolerant systems. When forced to work together, these systems often need to reach an agreement on some state or task, possibly even in the presence of misbehaving Byzantine nodes. Although non-trivial, Byzantine Agreement (BA) protocols now exist that are resilient to these types of faults. However, there is still a risk of inconsistencies in the application state in practice, even if a BA protocol is used. A single transient fault may put a node into an illegal state, creating a need for new self-stabilizing BA protocols that recover from illegal states. As self-stabilization often comes with a cost, primarily in the form of communication overhead, lowering latency (the cost of each message) could significantly affect how fast the protocol behaves overall. Hence, there is a need for new network protocols such as QUIC, which, among other things, aims to reduce latency. In this paper, we survey current state-of-the-art agreement protocols. In previous work, some researchers have tried to implement pseudocode for a QUIC-like protocol on the Ethereum blockchain to secure the network, resulting in slightly slower performance than the IP-based blockchain. We focus on consensus in the context of blockchain, as it has prompted the development and usage of new open-source BA solutions that are related to proof of stake. We also discuss extensions to some of these protocols, such as PoS and PBFT, specifically the possibility of achieving self-stabilization and the potential integration of the QUIC protocol. Finally, we discuss further challenges faced in the field and how they might be overcome.
Self-stabilizing Byzantine Multivalued Consensus
Consensus, abstracting a myriad of problems in which processes have to agree
on a single value, is one of the most celebrated problems of fault-tolerant
distributed computing. Consensus applications include fundamental services for
the environments of the Cloud and Blockchain, and in such challenging
environments, malicious behaviors are often modeled as adversarial Byzantine
faults.
At OPODIS 2010, Mostefaoui and Raynal (in short MR) presented a
Byzantine-tolerant solution to consensus in which the decided value cannot be a
value proposed only by Byzantine processes. MR has optimal resilience coping
with up to t < n/3 Byzantine nodes over n processes. MR provides this
multivalued consensus object (which accepts proposals taken from a finite set
of values) assuming the availability of a single Binary consensus object (which
accepts proposals taken from the set {0,1}).
This work, which focuses on multivalued consensus, aims at the design of an
even more robust solution than MR. Our proposal expands MR's fault-model with
self-stabilization, a vigorous notion of fault-tolerance. In addition to
tolerating Byzantine faults, self-stabilizing systems can automatically recover after
the occurrence of arbitrary transient-faults. These faults represent any
violation of the assumptions according to which the system was designed to
operate (provided that the algorithm code remains intact).
To the best of our knowledge, we propose the first self-stabilizing solution
for intrusion-tolerant multivalued consensus for asynchronous message-passing
systems prone to Byzantine failures. Our solution has an O(t) stabilization time
from arbitrary transient faults.
Comment: arXiv admin note: text overlap with arXiv:2110.0859
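The resilience bound t < n/3 and the validity property (the decided value is never one proposed only by Byzantine processes) rest on a simple counting argument, sketched below. This is an illustration only, not the MR algorithm or the self-stabilizing protocol; the function names `max_byzantine` and `backed_by_correct_process` and the dictionary layout are assumptions made for this example.

```python
from typing import Dict, Hashable


def max_byzantine(n: int) -> int:
    """Largest integer t that satisfies t < n/3 for a system of n processes."""
    if n < 1:
        raise ValueError("n must be a positive number of processes")
    return (n - 1) // 3


def backed_by_correct_process(
    proposals: Dict[Hashable, Hashable], value: Hashable, t: int
) -> bool:
    """Counting argument (hypothetical helper, not taken from the paper).

    `proposals` maps each distinct sender to the value it reported.
    If more than t distinct senders reported `value`, and at most t of
    them are Byzantine, at least one reporter is a correct process.
    """
    supporters = sum(1 for v in proposals.values() if v == value)
    return supporters > t


if __name__ == "__main__":
    n = 4
    t = max_byzantine(n)
    received = {"p1": "a", "p2": "a", "p3": "b", "p4": "a"}
    print(t, backed_by_correct_process(received, "a", t))
```

With n = 4 the bound allows a single Byzantine process, so a value reported by only one sender cannot be trusted to originate from a correct process, while a value reported by two or more can.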
LIPIcs
Fault-tolerant distributed algorithms play an important role in many critical/high-availability applications. These algorithms are notoriously difficult to implement correctly, due to asynchronous communication and the occurrence of faults, such as the network dropping messages or computers crashing. Nonetheless, there is surprisingly little language and verification support to build distributed systems based on fault-tolerant algorithms. In this paper, we present some of the challenges that a designer has to overcome to implement a fault-tolerant distributed system. Then we review different models that have been proposed to reason about distributed algorithms and sketch how such a model can form the basis for a domain-specific programming language. Adopting a high-level programming model can simplify the programmer's life and make the code amenable to automated verification, while still compiling to efficiently executable code. We conclude by summarizing the current status of an ongoing language design and implementation project that is based on this idea.
The Weakest Failure Detector for Eventual Consistency
In its classical form, a consistent replicated service requires all replicas
to witness the same evolution of the service state. Assuming a message-passing
environment with a majority of correct processes, the necessary and sufficient
information about failures for implementing a general state machine replication
scheme ensuring consistency is captured by the Ω failure detector. This
paper shows that in such a message-passing environment, Ω is also the
weakest failure detector to implement an eventually consistent replicated
service, where replicas are expected to agree on the evolution of the service
state only after some (a priori unknown) time. In fact, we show that Ω
is the weakest to implement eventual consistency in any message-passing
environment, i.e., under any assumption on when and where failures might occur.
Ensuring (strong) consistency in any environment requires, in addition to
Ω, the quorum failure detector Σ. Our paper thus captures, for
the first time, an exact computational difference between building a
replicated state machine that ensures consistency and one that only ensures
eventual consistency.
Distributed eventual leader election in the crash-recovery and general omission failure models.
102 p. Distributed applications are present in many aspects of everyday life. Banking, healthcare and transportation are examples of such applications. These applications are built on top of distributed systems. Roughly speaking, a distributed system is composed of a set of processes that collaborate to achieve a common goal. When building such systems, designers have to cope with several issues, such as different synchrony assumptions and the occurrence of failures. Distributed systems must ensure that the delivered service is trustworthy. Agreement problems compose a fundamental class of problems in distributed systems. All agreement problems follow the same pattern: all processes must agree on some common decision. Most agreement problems can be considered a particular instance of the Consensus problem. Hence, they can be solved by reduction to consensus. However, a fundamental impossibility result, known as FLP, states that in an asynchronous distributed system it is impossible to achieve consensus deterministically when at least one process may fail. A way to circumvent this obstacle is to use unreliable failure detectors. A failure detector encapsulates the synchrony assumptions of the system, providing (possibly incorrect) information about process failures. A particular failure detector, called Omega, has been shown to be the weakest failure detector for solving consensus with a majority of correct processes. Informally, Omega provides an eventual leader election mechanism.
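The eventual leader election that Omega provides can be pictured with a minimal heartbeat-and-timeout sketch. It assumes a simple crash-stop setting with a logical clock, not the crash-recovery or general omission models the thesis addresses, and it does not treat the local process specially; the class name `OmegaSketch` and its methods are hypothetical names chosen for this example.

```python
from typing import Dict, Iterable, Optional, Set


class OmegaSketch:
    """Toy eventual leader elector: trust the smallest unsuspected process id."""

    def __init__(self, process_ids: Iterable[int], initial_timeout: int = 2):
        ids = list(process_ids)
        self.timeout: Dict[int, int] = {p: initial_timeout for p in ids}
        self.last_heard: Dict[int, int] = {p: 0 for p in ids}
        self.suspected: Set[int] = set()

    def on_heartbeat(self, sender: int, now: int) -> None:
        """Record a heartbeat; a wrongly suspected sender gets a longer timeout."""
        self.last_heard[sender] = now
        if sender in self.suspected:
            self.suspected.discard(sender)
            self.timeout[sender] += 1  # back off after a false suspicion

    def tick(self, now: int) -> None:
        """Suspect every process whose last heartbeat is older than its timeout."""
        for p, heard in self.last_heard.items():
            if now - heard > self.timeout[p]:
                self.suspected.add(p)

    def leader(self) -> Optional[int]:
        """Current leader estimate; it may be wrong for a while (unreliable)."""
        trusted = [p for p in self.last_heard if p not in self.suspected]
        return min(trusted) if trusted else None


if __name__ == "__main__":
    fd = OmegaSketch([1, 2, 3])
    fd.on_heartbeat(2, now=3)
    fd.on_heartbeat(3, now=3)
    fd.tick(now=3)          # process 1 has been silent for too long
    print(fd.leader())
    fd.on_heartbeat(1, now=4)  # process 1 was only slow, not crashed
    print(fd.leader())
```

Growing the timeout after each false suspicion is one common way to capture the "eventual" part: once message delays stabilize, a correct process stops being suspected and all processes converge on the same leader.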
SoK: Understanding BFT Consensus in the Age of Blockchains
Blockchain, as an enabler of current Internet infrastructure, has provided many unique features and moved distributed systems into a new era. Its decentralization, immutability, and transparency have led many applications to adopt the design philosophy of blockchain and customize various replicated solutions. Under the hood of blockchain, consensus protocols play the most important role in achieving distributed replication. The distributed systems community has extensively studied the technical components of consensus to reach agreement among a group of nodes. Due to trust issues and the existence of various faults, it is hard to design a resilient system in practical situations. Byzantine fault-tolerant (BFT) state machine replication (SMR) is regarded as an ideal candidate that can tolerate arbitrary faulty behaviors. However, the inherent complexity of BFT consensus protocols and their rapid evolution make it hard to adapt them to application domains in practice. Many excellent Byzantine-based replicated solutions and ideas have been contributed to improve performance, availability, or resource efficiency. This paper conducts a systematic and comprehensive study of BFT consensus protocols with a specific focus on the blockchain era. We explore both general principles and practical schemes to achieve consensus under Byzantine settings. We then survey, compare, and categorize the state-of-the-art solutions to understand BFT consensus in detail. For each representative protocol, we conduct an in-depth discussion of its most important architectural building blocks as well as the key techniques it uses. We aim for this paper to give system researchers and developers a concrete view of the current design landscape and help them find solutions to concrete problems. Finally, we present several critical challenges and some potential research directions to advance the research on BFT consensus protocols in the age of blockchains.
An Analysis of Distributed Systems Syllabi With a Focus on Performance-Related Topics
We analyze a dataset of 51 current (2019-2020) Distributed Systems syllabi
from top Computer Science programs, focusing on finding the prevalence and
context in which topics related to performance are being taught in these
courses. We also study the scale of the infrastructure mentioned in DS courses,
from small client-server systems to cloud-scale, peer-to-peer, global-scale
systems. We make eight main findings, covering goals such as performance, and
scalability and its variant elasticity; activities such as performance
benchmarking and monitoring; eight selected performance-enhancing techniques
(replication, caching, sharding, load balancing, scheduling, streaming,
migrating, and offloading); and control issues such as trade-offs that include
performance and performance variability.
Comment: Accepted for publication at WEPPE 2021, to be held in conjunction
with ACM/SPEC ICPE 2021: https://doi.org/10.1145/3447545.3451197 This article
is a follow-up of our prior ACM SIGCSE publication, arXiv:2012.0055