48 research outputs found
Optimistic Parallel State-Machine Replication
State-machine replication, a fundamental approach to fault tolerance,
requires replicas to execute commands deterministically, which usually results
in sequential execution of commands. Sequential execution limits performance
and underuses servers, which are increasingly parallel (i.e., multicore). To
narrow the gap between state-machine replication requirements and the
characteristics of modern servers, researchers have recently come up with
alternative execution models. This paper surveys existing approaches to
parallel state-machine replication and proposes a novel optimistic protocol
that inherits the scalable features of previous techniques. Using a replicated
B+-tree service, we demonstrate in the paper that our protocol outperforms the
most efficient techniques by a factor of 2.4 times
Rethinking State-Machine Replication for Parallelism
State-machine replication, a fundamental approach to designing fault-tolerant
services, requires commands to be executed in the same order by all replicas.
Moreover, command execution must be deterministic: each replica must produce
the same output upon executing the same sequence of commands. These
requirements usually result in single-threaded replicas, which hinders service
performance. This paper introduces Parallel State-Machine Replication (P-SMR),
a new approach to parallelism in state-machine replication. P-SMR scales better
than previous proposals since no component plays a centralizing role in the
execution of independent commands---those that can be executed concurrently, as
defined by the service. The paper introduces P-SMR, describes a "commodified
architecture" to implement it, and compares its performance to other proposals
using a key-value store and a networked file system
Byzantine Fault Tolerance for Distributed Systems
The growing reliance on online services imposes a high dependability requirement on the computer systems that provide these services. Byzantine fault tolerance (BFT) is a promising technology to solidify such systems for the much needed high dependability. BFT employs redundant copies of the servers and ensures that a replicated system continues providing correct services despite the attacks on a small portion of the system. In this dissertation research, I developed novel algorithms and mechanisms to control various types of application nondeterminism and to ensure the long-term reliability of BFT systems via a migration-based proactive recovery scheme. I also investigated a new approach to significantly improve the overall system throughput by enabling concurrent processing using Software Transactional Memory (STM). Controlling application nondeterminism is essential to achieve strong replica consistency because the BFT technology is based on state-machine replication, which requires deterministic operation of each replica. Proactive recovery is necessary to ensure that the fundamental assumption of using the BFT technology is not violated over long term, i.e., less than one-third of replicas remain correct. Without proactive recovery, more and more replicas will be compromised under continuously attacks, which would render BFT ineffective. STM based concurrent processing maximized the system throughput by utilizing the power of multi-core CPUs while preserving strong replication consistenc
Byzantine Fault Tolerance for Distributed Systems
The growing reliance on online services imposes a high dependability requirement on the computer systems that provide these services. Byzantine fault tolerance (BFT) is a promising technology to solidify such systems for the much needed high dependability. BFT employs redundant copies of the servers and ensures that a replicated system continues providing correct services despite the attacks on a small portion of the system. In this dissertation research, I developed novel algorithms and mechanisms to control various types of application nondeterminism and to ensure the long-term reliability of BFT systems via a migration-based proactive recovery scheme. I also investigated a new approach to significantly improve the overall system throughput by enabling concurrent processing using Software Transactional Memory (STM). Controlling application nondeterminism is essential to achieve strong replica consistency because the BFT technology is based on state-machine replication, which requires deterministic operation of each replica. Proactive recovery is necessary to ensure that the fundamental assumption of using the BFT technology is not violated over long term, i.e., less than one-third of replicas remain correct. Without proactive recovery, more and more replicas will be compromised under continuously attacks, which would render BFT ineffective. STM based concurrent processing maximized the system throughput by utilizing the power of multi-core CPUs while preserving strong replication consistenc
Security and Privacy Issues in Wireless Mesh Networks: A Survey
This book chapter identifies various security threats in wireless mesh
network (WMN). Keeping in mind the critical requirement of security and user
privacy in WMNs, this chapter provides a comprehensive overview of various
possible attacks on different layers of the communication protocol stack for
WMNs and their corresponding defense mechanisms. First, it identifies the
security vulnerabilities in the physical, link, network, transport, application
layers. Furthermore, various possible attacks on the key management protocols,
user authentication and access control protocols, and user privacy preservation
protocols are presented. After enumerating various possible attacks, the
chapter provides a detailed discussion on various existing security mechanisms
and protocols to defend against and wherever possible prevent the possible
attacks. Comparative analyses are also presented on the security schemes with
regards to the cryptographic schemes used, key management strategies deployed,
use of any trusted third party, computation and communication overhead involved
etc. The chapter then presents a brief discussion on various trust management
approaches for WMNs since trust and reputation-based schemes are increasingly
becoming popular for enforcing security in wireless networks. A number of open
problems in security and privacy issues for WMNs are subsequently discussed
before the chapter is finally concluded.Comment: 62 pages, 12 figures, 6 tables. This chapter is an extension of the
author's previous submission in arXiv submission: arXiv:1102.1226. There are
some text overlaps with the previous submissio
Intrusion-Tolerant Middleware: the MAFTIA approach
The pervasive interconnection of systems all over the world has given computer services a significant socio-economic value, which can be affected both by accidental faults and by malicious activity. It would be appealing to address both problems in a seamless manner, through a common approach to security and dependability. This is the proposal of intrusion tolerance, where it is assumed that systems remain to some extent faulty and/or vulnerable and subject to attacks that can be successful, the idea being to ensure that the overall system nevertheless remains secure and operational. In this paper, we report some of the advances made in the European project MAFTIA, namely in what concerns a basis of concepts unifying security and dependability, and a modular and versatile architecture, featuring several intrusion-tolerant middleware building blocks. We describe new architectural constructs and algorithmic strategies, such as: the use of trusted components at several levels of abstraction; new randomization techniques; new replica control and access control algorithms. The paper concludes by exemplifying the construction of intrusion-tolerant applications on the MAFTIA middleware, through a transaction support servic
Intrusion Tolerant Routing Protocols for Wireless Sensor Networks
This MSc thesis is focused in the study, solution proposal and experimental evaluation of security solutions for Wireless Sensor Networks (WSNs). The objectives are centered on intrusion tolerant routing services, adapted for the characteristics and requirements of WSN nodes and operation behavior.
The main contribution addresses the establishment of pro-active intrusion tolerance properties at the network level, as security mechanisms for the proposal of a reliable and secure routing protocol. Those properties and mechanisms will augment a secure communication base layer supported by light-weigh cryptography methods, to improve the global network resilience capabilities against possible intrusion-attacks on the WSN nodes. Adapting to WSN characteristics, the design of the intended security services also pushes complexity away from resource-poor sensor nodes towards resource-rich and trustable base stations.
The devised solution will construct, securely and efficiently, a secure tree-structured routing service for data-dissemination in large scale deployed WSNs. The purpose is to tolerate the damage caused by adversaries modeled according with the Dolev-Yao threat model and ISO X.800 attack typology and framework, or intruders that can compromise maliciously the deployed sensor nodes, injecting, modifying, or blocking packets, jeopardizing the correct behavior of internal network routing processing and topology management.
The proposed enhanced mechanisms, as well as the design and implementation of a new intrusiontolerant routing protocol for a large scale WSN are evaluated by simulation. For this purpose, the evaluation is based on a rich simulation environment, modeling networks from hundreds to tens of thousands of wireless sensors, analyzing different dimensions: connectivity conditions, degree-distribution patterns, latency and average short-paths, clustering, reliability metrics and energy cost
Byzantine fault-tolerant agreement protocols for wireless Ad hoc networks
Tese de doutoramento, Informática (Ciências da Computação), Universidade de Lisboa, Faculdade de Ciências, 2010.The thesis investigates the problem of fault- and intrusion-tolerant consensus
in resource-constrained wireless ad hoc networks. This is a fundamental
problem in distributed computing because it abstracts the need
to coordinate activities among various nodes. It has been shown to be a
building block for several other important distributed computing problems
like state-machine replication and atomic broadcast.
The thesis begins by making a thorough performance assessment of existing
intrusion-tolerant consensus protocols, which shows that the performance
bottlenecks of current solutions are in part related to their system
modeling assumptions. Based on these results, the communication failure
model is identified as a model that simultaneously captures the reality
of wireless ad hoc networks and allows the design of efficient protocols.
Unfortunately, the model is subject to an impossibility result stating that
there is no deterministic algorithm that allows n nodes to reach agreement
if more than n2 omission transmission failures can occur in a communication
step. This result is valid even under strict timing assumptions (i.e.,
a synchronous system).
The thesis applies randomization techniques in increasingly weaker variants
of this model, until an efficient intrusion-tolerant consensus protocol
is achieved. The first variant simplifies the problem by restricting the
number of nodes that may be at the source of a transmission failure at
each communication step. An algorithm is designed that tolerates f dynamic
nodes at the source of faulty transmissions in a system with a total
of n 3f + 1 nodes.
The second variant imposes no restrictions on the pattern of transmission
failures. The proposed algorithm effectively circumvents the Santoro-
Widmayer impossibility result for the first time. It allows k out of n nodes
to decide despite dn
2 e(nk)+k2 omission failures per communication
step. This algorithm also has the interesting property of guaranteeing
safety during arbitrary periods of unrestricted message loss.
The final variant shares the same properties of the previous one, but relaxes
the model in the sense that the system is asynchronous and that a
static subset of nodes may be malicious. The obtained algorithm, called
Turquois, admits f < n
3 malicious nodes, and ensures progress in communication
steps where dnf
2 e(n k f) + k 2. The algorithm is
subject to a comparative performance evaluation against other intrusiontolerant
protocols. The results show that, as the system scales, Turquois
outperforms the other protocols by more than an order of magnitude.Esta tese investiga o problema do consenso tolerante a faltas acidentais
e maliciosas em redes ad hoc sem fios. Trata-se de um problema fundamental
que captura a essência da coordenação em actividades envolvendo
vários nós de um sistema, sendo um bloco construtor de outros importantes
problemas dos sistemas distribuÃdos como a replicação de máquina
de estados ou a difusão atómica.
A tese começa por efectuar uma avaliação de desempenho a protocolos
tolerantes a intrusões já existentes na literatura. Os resultados mostram
que as limitações de desempenho das soluções existentes estão em parte
relacionadas com o seu modelo de sistema. Baseado nestes resultados, é
identificado o modelo de falhas de comunicação como um modelo que simultaneamente
permite capturar o ambiente das redes ad hoc sem fios e
projectar protocolos eficientes. Todavia, o modelo é restrito por um resultado
de impossibilidade que afirma não existir algoritmo algum que permita
a n nós chegaram a acordo num sistema que admita mais do que n2
transmissões omissas num dado passo de comunicação. Este resultado é
válido mesmo sob fortes hipóteses temporais (i.e., em sistemas sÃncronos)
A tese aplica técnicas de aleatoriedade em variantes progressivamente
mais fracas do modelo até ser alcançado um protocolo eficiente e tolerante
a intrusões. A primeira variante do modelo, de forma a simplificar
o problema, restringe o número de nós que estão na origem de transmissões
faltosas. É apresentado um algoritmo que tolera f nós dinâmicos na
origem de transmissões faltosas em sistemas com um total de n 3f + 1
nós.
A segunda variante do modelo não impõe quaisquer restrições no padrão
de transmissões faltosas. É apresentado um algoritmo que contorna efectivamente
o resultado de impossibilidade Santoro-Widmayer pela primeira
vez e que permite a k de n nós efectuarem progresso nos passos de comunicação
em que o número de transmissões omissas seja dn
2 e(n
k) + k 2. O algoritmo possui ainda a interessante propriedade de tolerar
perÃodos arbitrários em que o número de transmissões omissas seja
superior a .
A última variante do modelo partilha das mesmas caracterÃsticas da variante
anterior, mas com pressupostos mais fracos sobre o sistema. Em particular,
assume-se que o sistema é assÃncrono e que um subconjunto estático
dos nós pode ser malicioso. O algoritmo apresentado, denominado
Turquois, admite f < n
3 nós maliciosos e assegura progresso nos passos
de comunicação em que dnf
2 e(n k f) + k 2. O algoritmo é
sujeito a uma análise de desempenho comparativa com outros protocolos
na literatura. Os resultados demonstram que, à medida que o número de
nós no sistema aumenta, o desempenho do protocolo Turquois ultrapassa
os restantes em mais do que uma ordem de magnitude.FC
SoK: Understanding BFT Consensus in the Age of Blockchains
Blockchain as an enabler to current Internet infrastructure has provided many unique features and revolutionized current distributed systems into a new era. Its decentralization, immutability, and transparency have attracted many applications to adopt the design philosophy of blockchain and customize various replicated solutions. Under the hood of blockchain, consensus protocols play the most important role to achieve distributed replication systems. The distributed system community has extensively studied the technical components of consensus to reach agreement among a group of nodes. Due to trust issues, it is hard to design a resilient system in practical situations because of the existence of various faults. Byzantine fault-tolerant (BFT) state machine replication (SMR) is regarded as an ideal candidate that can tolerate arbitrary faulty behaviors. However, the inherent complexity of BFT consensus protocols and their rapid evolution makes it hard to practically adapt themselves into application domains. There are many excellent Byzantine-based replicated solutions and ideas that have been contributed to improving performance, availability, or resource efficiency. This paper conducts a systematic and comprehensive study on BFT consensus protocols with a specific focus on the blockchain era. We explore both general principles and practical schemes to achieve consensus under Byzantine settings. We then survey, compare, and categorize the state-of-the-art solutions to understand BFT consensus in detail. For each representative protocol, we conduct an in-depth discussion of its most important architectural building blocks as well as the key techniques they used. We aim that this paper can provide system researchers and developers a concrete view of the current design landscape and help them find solutions to concrete problems. Finally, we present several critical challenges and some potential research directions to advance the research on exploring BFT consensus protocols in the age of blockchains