6,290 research outputs found
Byzantine Fault Tolerant Coordination for Web Services Atomic Transactions
This thesis describes a Byzantine fault tolerant coordination framework for Web services atomic transactions. In the framework, all core services, including transaction activation, registration, completion, and distributed commit, are replicated and protected by Byzantine fault tolerance mechanisms. The traditional two-phase commit protocol is extended by a Byzantine fault tolerant version that can tolerate arbitrary faults on the coordinator and the initiator sides, and some types of malicious faults on the participant side. To achieve Byzantine fault tolerance in an efficient manner, and to limit the types of malicious behaviors of the coordinator, a novel decision certificate is introduced. The decision certificate includes a signed copy of the participants\u27 vote records, and it is piggybacked with all decision notifications to the participants for each participant to verify the legitimacy of the decision. The Byzantine fault tolerance mechanisms, together with the extended two-phase commit protocol, have been incorporated into an open-source framework supporting the standard Web services atomic transactions specification. Performance characterizations of the framework show that the implementation is fairly efficient. Such a Byzantine fault tolerant coordination framework can be useful for many transactional Web services that require a high degree of security and dependabilit
Byzantine Fault Tolerant Coordination for Web Services Atomic Transactions
This thesis describes a Byzantine fault tolerant coordination framework for Web services atomic transactions. In the framework, all core services, including transaction activation, registration, completion, and distributed commit, are replicated and protected by Byzantine fault tolerance mechanisms. The traditional two-phase commit protocol is extended by a Byzantine fault tolerant version that can tolerate arbitrary faults on the coordinator and the initiator sides, and some types of malicious faults on the participant side. To achieve Byzantine fault tolerance in an efficient manner, and to limit the types of malicious behaviors of the coordinator, a novel decision certificate is introduced. The decision certificate includes a signed copy of the participants\u27 vote records, and it is piggybacked with all decision notifications to the participants for each participant to verify the legitimacy of the decision. The Byzantine fault tolerance mechanisms, together with the extended two-phase commit protocol, have been incorporated into an open-source framework supporting the standard Web services atomic transactions specification. Performance characterizations of the framework show that the implementation is fairly efficient. Such a Byzantine fault tolerant coordination framework can be useful for many transactional Web services that require a high degree of security and dependabilit
Brief Announcement: Communication-Efficient BFT Using Small Trusted Hardware to Tolerate Minority Corruption
Small trusted hardware primitives can improve fault tolerance of Byzantine Fault Tolerant (BFT) protocols to one-half faults. However, existing works achieve this at the cost of increased communication complexity. In this work, we explore the design of communication-efficient BFT protocols that can boost fault tolerance to one-half without worsening communication complexity. Our results include a version of HotStuff that retains linear communication complexity in each view and a version of the VABA protocol with quadratic communication, both leveraging trusted hardware to tolerate a minority of corruptions. As a building block, we present communication-efficient provable broadcast, a core broadcast primitive with increased fault tolerance. Our results use expander graphs to achieve efficient communication in a manner that may be of independent interest
Parallelizing Deadlock Resolution in Symbolic Synthesis of Distributed Programs
Previous work has shown that there are two major complexity barriers in the
synthesis of fault-tolerant distributed programs: (1) generation of fault-span,
the set of states reachable in the presence of faults, and (2) resolving
deadlock states, from where the program has no outgoing transitions. Of these,
the former closely resembles with model checking and, hence, techniques for
efficient verification are directly applicable to it. Hence, we focus on
expediting the latter with the use of multi-core technology.
We present two approaches for parallelization by considering different design
choices. The first approach is based on the computation of equivalence classes
of program transitions (called group computation) that are needed due to the
issue of distribution (i.e., inability of processes to atomically read and
write all program variables). We show that in most cases the speedup of this
approach is close to the ideal speedup and in some cases it is superlinear. The
second approach uses traditional technique of partitioning deadlock states
among multiple threads. However, our experiments show that the speedup for this
approach is small. Consequently, our analysis demonstrates that a simple
approach of parallelizing the group computation is likely to be the effective
method for using multi-core computing in the context of deadlock resolution
Coordination-Free Byzantine Replication with Minimal Communication Costs
State-of-the-art fault-tolerant and federated data management systems rely on fully-replicated designs in which all participants have equivalent roles. Consequently, these systems have only limited scalability and are ill-suited for high-performance data management. As an alternative, we propose a hierarchical design in which a Byzantine cluster manages data, while an arbitrary number of learners can reliable learn these updates and use the corresponding data.
To realize our design, we propose the delayed-replication algorithm, an efficient solution to the Byzantine learner problem that is central to our design. The delayed-replication algorithm is coordination-free, scalable, and has minimal communication cost for all participants involved. In doing so, the delayed-broadcast algorithm opens the door to new high-performance fault-tolerant and federated data management systems. To illustrate this, we show that the delayed-replication algorithm is not only useful to support specialized learners, but can also be used to reduce the overall communication cost of permissioned blockchains and to improve their storage scalability
Fault Tolerant Adaptive Parallel and Distributed Simulation through Functional Replication
This paper presents FT-GAIA, a software-based fault-tolerant parallel and
distributed simulation middleware. FT-GAIA has being designed to reliably
handle Parallel And Distributed Simulation (PADS) models, which are needed to
properly simulate and analyze complex systems arising in any kind of scientific
or engineering field. PADS takes advantage of multiple execution units run in
multicore processors, cluster of workstations or HPC systems. However, large
computing systems, such as HPC systems that include hundreds of thousands of
computing nodes, have to handle frequent failures of some components. To cope
with this issue, FT-GAIA transparently replicates simulation entities and
distributes them on multiple execution nodes. This allows the simulation to
tolerate crash-failures of computing nodes. Moreover, FT-GAIA offers some
protection against Byzantine failures, since interaction messages among the
simulated entities are replicated as well, so that the receiving entity can
identify and discard corrupted messages. Results from an analytical model and
from an experimental evaluation show that FT-GAIA provides a high degree of
fault tolerance, at the cost of a moderate increase in the computational load
of the execution units.Comment: arXiv admin note: substantial text overlap with arXiv:1606.0731
- …