66 research outputs found
Randomized protocols for asynchronous consensus
The famous Fischer, Lynch, and Paterson impossibility proof shows that it is
impossible to solve the consensus problem in a natural model of an asynchronous
distributed system if even a single process can fail. Since its publication,
two decades of work on fault-tolerant asynchronous consensus algorithms have
evaded this impossibility result by using extended models that provide (a)
randomization, (b) additional timing assumptions, (c) failure detectors, or (d)
stronger synchronization mechanisms than are available in the basic model.
Concentrating on the first of these approaches, we illustrate the history and
structure of randomized asynchronous consensus protocols by giving detailed
descriptions of several such protocols.Comment: 29 pages; survey paper written for PODC 20th anniversary issue of
Distributed Computin
Byzantine fault-tolerant agreement protocols for wireless Ad hoc networks
Tese de doutoramento, Informática (Ciências da Computação), Universidade de Lisboa, Faculdade de Ciências, 2010.The thesis investigates the problem of fault- and intrusion-tolerant consensus
in resource-constrained wireless ad hoc networks. This is a fundamental
problem in distributed computing because it abstracts the need
to coordinate activities among various nodes. It has been shown to be a
building block for several other important distributed computing problems
like state-machine replication and atomic broadcast.
The thesis begins by making a thorough performance assessment of existing
intrusion-tolerant consensus protocols, which shows that the performance
bottlenecks of current solutions are in part related to their system
modeling assumptions. Based on these results, the communication failure
model is identified as a model that simultaneously captures the reality
of wireless ad hoc networks and allows the design of efficient protocols.
Unfortunately, the model is subject to an impossibility result stating that
there is no deterministic algorithm that allows n nodes to reach agreement
if more than n2 omission transmission failures can occur in a communication
step. This result is valid even under strict timing assumptions (i.e.,
a synchronous system).
The thesis applies randomization techniques in increasingly weaker variants
of this model, until an efficient intrusion-tolerant consensus protocol
is achieved. The first variant simplifies the problem by restricting the
number of nodes that may be at the source of a transmission failure at
each communication step. An algorithm is designed that tolerates f dynamic
nodes at the source of faulty transmissions in a system with a total
of n 3f + 1 nodes.
The second variant imposes no restrictions on the pattern of transmission
failures. The proposed algorithm effectively circumvents the Santoro-
Widmayer impossibility result for the first time. It allows k out of n nodes
to decide despite dn
2 e(nk)+k2 omission failures per communication
step. This algorithm also has the interesting property of guaranteeing
safety during arbitrary periods of unrestricted message loss.
The final variant shares the same properties of the previous one, but relaxes
the model in the sense that the system is asynchronous and that a
static subset of nodes may be malicious. The obtained algorithm, called
Turquois, admits f < n
3 malicious nodes, and ensures progress in communication
steps where dnf
2 e(n k f) + k 2. The algorithm is
subject to a comparative performance evaluation against other intrusiontolerant
protocols. The results show that, as the system scales, Turquois
outperforms the other protocols by more than an order of magnitude.Esta tese investiga o problema do consenso tolerante a faltas acidentais
e maliciosas em redes ad hoc sem fios. Trata-se de um problema fundamental
que captura a essência da coordenação em actividades envolvendo
vários nós de um sistema, sendo um bloco construtor de outros importantes
problemas dos sistemas distribuídos como a replicação de máquina
de estados ou a difusão atómica.
A tese começa por efectuar uma avaliação de desempenho a protocolos
tolerantes a intrusões já existentes na literatura. Os resultados mostram
que as limitações de desempenho das soluções existentes estão em parte
relacionadas com o seu modelo de sistema. Baseado nestes resultados, é
identificado o modelo de falhas de comunicação como um modelo que simultaneamente
permite capturar o ambiente das redes ad hoc sem fios e
projectar protocolos eficientes. Todavia, o modelo é restrito por um resultado
de impossibilidade que afirma não existir algoritmo algum que permita
a n nós chegaram a acordo num sistema que admita mais do que n2
transmissões omissas num dado passo de comunicação. Este resultado é
válido mesmo sob fortes hipóteses temporais (i.e., em sistemas síncronos)
A tese aplica técnicas de aleatoriedade em variantes progressivamente
mais fracas do modelo até ser alcançado um protocolo eficiente e tolerante
a intrusões. A primeira variante do modelo, de forma a simplificar
o problema, restringe o número de nós que estão na origem de transmissões
faltosas. É apresentado um algoritmo que tolera f nós dinâmicos na
origem de transmissões faltosas em sistemas com um total de n 3f + 1
nós.
A segunda variante do modelo não impõe quaisquer restrições no padrão
de transmissões faltosas. É apresentado um algoritmo que contorna efectivamente
o resultado de impossibilidade Santoro-Widmayer pela primeira
vez e que permite a k de n nós efectuarem progresso nos passos de comunicação
em que o número de transmissões omissas seja dn
2 e(n
k) + k 2. O algoritmo possui ainda a interessante propriedade de tolerar
períodos arbitrários em que o número de transmissões omissas seja
superior a .
A última variante do modelo partilha das mesmas características da variante
anterior, mas com pressupostos mais fracos sobre o sistema. Em particular,
assume-se que o sistema é assíncrono e que um subconjunto estático
dos nós pode ser malicioso. O algoritmo apresentado, denominado
Turquois, admite f < n
3 nós maliciosos e assegura progresso nos passos
de comunicação em que dnf
2 e(n k f) + k 2. O algoritmo é
sujeito a uma análise de desempenho comparativa com outros protocolos
na literatura. Os resultados demonstram que, à medida que o número de
nós no sistema aumenta, o desempenho do protocolo Turquois ultrapassa
os restantes em mais do que uma ordem de magnitude.FC
PACE: Fully Parallelizable BFT from Reproposable Byzantine Agreement
The classic asynchronous Byzantine fault tolerance (BFT) framework of Ben-Or, Kemler, and Rabin (BKR) and its descendants rely on reliable broadcast (RBC) and asynchronous binary agreement (ABA). However, BKR does not allow all ABA instances to run in parallel, a well-known performance bottleneck. We propose PACE, a generic framework that removes the bottleneck, allowing fully parallelizable ABA instances. PACE is built on RBC and reproposable ABA (RABA). Different from the conventional ABA, RABA allows a replica to change its mind and vote twice. We show how to efficiently build RABA protocols from existing ABA protocols and a new ABA protocol that we introduce.
We implement six new BFT protocols: three in the BKR framework, and three in the PACE framework. Via a deployment using 91 replicas on Amazon EC2 across five continents, we show that all PACE instantiations, in both failure-free and failure scenarios, significantly outperform their BKR counterparts, and prior BFT protocols such as BEAT and Dumbo, in terms of latency, throughput, latency vs. throughput, and scalability
SoK: Understanding BFT Consensus in the Age of Blockchains
Blockchain as an enabler to current Internet infrastructure has provided many unique features and revolutionized current distributed systems into a new era. Its decentralization, immutability, and transparency have attracted many applications to adopt the design philosophy of blockchain and customize various replicated solutions. Under the hood of blockchain, consensus protocols play the most important role to achieve distributed replication systems. The distributed system community has extensively studied the technical components of consensus to reach agreement among a group of nodes. Due to trust issues, it is hard to design a resilient system in practical situations because of the existence of various faults. Byzantine fault-tolerant (BFT) state machine replication (SMR) is regarded as an ideal candidate that can tolerate arbitrary faulty behaviors. However, the inherent complexity of BFT consensus protocols and their rapid evolution makes it hard to practically adapt themselves into application domains. There are many excellent Byzantine-based replicated solutions and ideas that have been contributed to improving performance, availability, or resource efficiency. This paper conducts a systematic and comprehensive study on BFT consensus protocols with a specific focus on the blockchain era. We explore both general principles and practical schemes to achieve consensus under Byzantine settings. We then survey, compare, and categorize the state-of-the-art solutions to understand BFT consensus in detail. For each representative protocol, we conduct an in-depth discussion of its most important architectural building blocks as well as the key techniques they used. We aim that this paper can provide system researchers and developers a concrete view of the current design landscape and help them find solutions to concrete problems. Finally, we present several critical challenges and some potential research directions to advance the research on exploring BFT consensus protocols in the age of blockchains
Trustless communication across distributed ledgers: impossibility and practical solutions
Since the advent of Bitcoin as the first decentralized digital currency in 2008, a plethora of distributed ledgers has been created, differing in design and purpose. Considering the heterogeneous nature of these systems, it is safe to say there shall not be ``one coin to rule them all". However, despite the growing and thriving ecosystem, blockchains continue to operate almost exclusively in complete isolation from one another: by design, blockchain protocols provide no means by which to communicate or exchange data with external systems. To this date, centralized providers hence remain the preferred route to exchange assets and information across blockchains~-- undermining the very nature of decentralized currencies.
The contribution of this thesis is threefold.
First, we critically evaluate the (im)possibilty, requirements, and challenges of cross-chain communication by contributing the first systematization of this field. We formalize the problem of Cross-Chain Communication (CCC) and show it is impossible without a trusted third party by relating CCC to the Fair Exchange problem. With this impossibility result in mind, we develop a framework to design new and evaluate existing CCC protocols, focusing on the inherent trust assumptions thereof, and derive a classification covering the field of cross-chain communication to date.
We then present XCLAIM, the first generic framework for transferring assets and information across permissionless distributed ledgers without relying on a centralized third party.
XCLAIM leverages so-called cryptocurrency-backed assets, blockchain-based assets one-to-one backed by other cryptocurrencies, such as Bitcoin-backed tokens on Ethereum. Through the secure issuance, transfer, and redemption of these assets, users can perform cross-chain exchanges in a financially trustless and non-interactive manner, overcoming the limitations of existing solutions.
To ensure the security of user funds, XCLAIM relies on collateralization of intermediaries and a proof-or-punishment approach, enforced via smart contracts equipped with cross-chain light clients, so-called chain relays.
XCLAIM has been adopted in practice, among others by the Polkadot blockchain, as a bridge to Bitcoin and other cryptocurrencies.
Finally, we contribute to advancing the state of the art in cross-chain light clients.
We develop TxChain, a novel mechanism to significantly reduce storage and bandwidth costs of modern blockchain light clients using contingent transaction aggregation, and apply our scheme to Bitcoin and Ethereum individually, as well as in the cross-chain setting.Open Acces
Notes on Theory of Distributed Systems
Notes for the Yale course CPSC 465/565 Theory of Distributed Systems
Tamper-Resistant Peer-to-Peer Storage for File Integrity Checking.
“... oba es gibt kan Kompromiß, zwischen ehrlich sein und link, a wann’s no so afoch ausschaut, und wann’s noch so üblich is...” — Wolfgang Ambros, 1975 One of the activities of most successful intruders of a computer system is to modify data on the victim, either to hide his/her presence and to destroy the evidence of the break-in, or to subvert the system completely and make it accessible for further abuse without triggering alarms. File integrity checking is one common method to mitigate the effects of successful intrusions by detecting the changes an intruder makes to files on a computer system. Historically file integrity checking has been implemented using tools that operate locally on a single system, which imposes quite some restrictions regarding maintenance and scalability. Recent improvements for large scale environments have introduced trusted central servers which provide secure fingerprint storage and logging facilities, but such centralism presents some new shortcomings
Computer Aided Verification
This open access two-volume set LNCS 10980 and 10981 constitutes the refereed proceedings of the 30th International Conference on Computer Aided Verification, CAV 2018, held in Oxford, UK, in July 2018. The 52 full and 13 tool papers presented together with 3 invited papers and 2 tutorials were carefully reviewed and selected from 215 submissions. The papers cover a wide range of topics and techniques, from algorithmic and logical foundations of verification to practical applications in distributed, networked, cyber-physical, and autonomous systems. They are organized in topical sections on model checking, program analysis using polyhedra, synthesis, learning, runtime verification, hybrid and timed systems, tools, probabilistic systems, static analysis, theory and security, SAT, SMT and decisions procedures, concurrency, and CPS, hardware, industrial applications
Computer Aided Verification
This open access two-volume set LNCS 10980 and 10981 constitutes the refereed proceedings of the 30th International Conference on Computer Aided Verification, CAV 2018, held in Oxford, UK, in July 2018. The 52 full and 13 tool papers presented together with 3 invited papers and 2 tutorials were carefully reviewed and selected from 215 submissions. The papers cover a wide range of topics and techniques, from algorithmic and logical foundations of verification to practical applications in distributed, networked, cyber-physical, and autonomous systems. They are organized in topical sections on model checking, program analysis using polyhedra, synthesis, learning, runtime verification, hybrid and timed systems, tools, probabilistic systems, static analysis, theory and security, SAT, SMT and decisions procedures, concurrency, and CPS, hardware, industrial applications
- …