Search CORE

532 research outputs found

Byzantine Fault Tolerance for Nondeterministic Applications

Author: Zhao Wenbing
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2007
Field of study

All practical applications contain some degree of nondeterminism. When such applications are replicated to achieve Byzantine fault tolerance (BFT), their nondeterministic operations must be controlled to ensure replica consistency. To the best of our knowledge, only the most simplistic types of replica nondeterminism have been dealt with. Furthermore, there lacks a systematic approach to handling common types of nondeterminism. In this paper, we propose a classification of common types of replica nondeterminism with respect to the requirement of achieving Byzantine fault tolerance, and describe the design and implementation of the core mechanisms necessary to handle such nondeterminism within a Byzantine fault tolerance framework.Comment: To appear in the proceedings of the 3rd IEEE International Symposium on Dependable, Autonomic and Secure Computing, 200

arXiv.org e-Print Archive

CiteSeerX

Crossref

Replica determinism and flexible scheduling in hard real-time dependable systems

Author: Barrett P
Burns A
Poledna S
Wellings A
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2000
Field of study

Fault-tolerant real-time systems are typically based on active replication where replicated entities are required to deliver their outputs in an identical order within a given time interval. Distributed scheduling of replicated tasks, however, violates this requirement if on-line scheduling, preemptive scheduling, or scheduling of dissimilar replicated task sets is employed. This problem of inconsistent task outputs has been solved previously by coordinating the decisions of the local schedulers such that replicated tasks are executed in an identical order. Global coordination results either in an extremely high communication effort to agree on each schedule decision or in an overly restrictive execution model where on-line scheduling, arbitrary preemptions, and nonidentically replicated task sets are not allowed. To overcome these restrictions, a new method, called timed messages, is introduced. Timed messages guarantee deterministic operation by presenting consistent message versions to the replicated tasks. This approach is based on simulated common knowledge and a sparse time base. Timed messages are very effective since they neither require communication between the local scheduler nor do they restrict usage of on-line flexible scheduling, preemptions and nonidentically replicated task sets

CiteSeerX

Crossref

White Rose Research Online

Byzantine Fault Tolerance for Nondeterministic Applications

Author: Chen Bo
Publication venue: EngagedScholarship@CSU
Publication date: 01/01/2008
Field of study

The growing reliance on online services accessible on the Internet demands highly reliable system that would not be interrupted when encountering faults. A number of Byzantine fault tolerance (BFT) algorithms have been developed to mask the most complicated type of faults - Byzantine faults such as software bugs,operator mistakes, and malicious attacks, which are usually the major cause of service interruptions. However, it is often difficult to apply these algorithms to practical applications because such applications often exhibit sophisticated non-deterministic behaviors that the existing BFT algorithms could not cope with. In this thesis, we propose a classification of common types of replica nondeterminism with respect to the requirement of achieving Byzantine fault tolerance, and describe the design and implementation of the core mechanisms necessary to handle such replica nondeterminism within a Byzantine fault tolerance framework. In addition, we evaluated the performance of our BFT library, referred to as ND-BFT using both a micro-benchmark application and a more realistic online porker game application. The performance results show that the replicated online poker game performs approximately 13 slower than its nonreplicated counterpart in the presence of small number of player

Cleveland-Marshall College of Law

Byzantine Fault Tolerance for Nondeterministic Applications

Author: Chen Bo
Publication venue: EngagedScholarship@CSU
Publication date: 01/01/2008
Field of study

OhioLINK Electronic Thesis and Dissertation Center

Cleveland-Marshall College of Law

Byzantine Fault Tolerance for Nondeterministic Applications

Author: Chen Bo
Publication venue: EngagedScholarship@CSU
Publication date: 01/01/2008
Field of study

Application Agreement and Integration Services

Author: Driscoll Kevin R.
Hall Brendan
Schweiker Kevin
Publication venue
Publication date
Field of study

Application agreement and integration services are required by distributed, fault-tolerant, safety critical systems to assure required performance. An analysis of distributed and hierarchical agreement strategies are developed against the backdrop of observed agreement failures in fielded systems. The documented work was performed under NASA Task Order NNL10AB32T, Validation And Verification of Safety-Critical Integrated Distributed Systems Area 2. This document is intended to satisfy the requirements for deliverable 5.2.11 under Task 4.2.2.3. This report discusses the challenges of maintaining application agreement and integration services. A literature search is presented that documents previous work in the area of replica determinism. Sources of non-deterministic behavior are identified and examples are presented where system level agreement failed to be achieved. We then explore how TTEthernet services can be extended to supply some interesting application agreement frameworks. This document assumes that the reader is familiar with the TTEthernet protocol. The reader is advised to read the TTEthernet protocol standard [1] before reading this document. This document does not re-iterate the content of the standard

NASA Technical Reports Server

Fault Tolerant Adaptive Parallel and Distributed Simulation through Functional Replication

Author: D'Angelo Gabriele
Ferretti Stefano
Marzolla Moreno
Publication venue: 'Elsevier BV'
Publication date: 01/01/2019
Field of study

This paper presents FT-GAIA, a software-based fault-tolerant parallel and distributed simulation middleware. FT-GAIA has being designed to reliably handle Parallel And Distributed Simulation (PADS) models, which are needed to properly simulate and analyze complex systems arising in any kind of scientific or engineering field. PADS takes advantage of multiple execution units run in multicore processors, cluster of workstations or HPC systems. However, large computing systems, such as HPC systems that include hundreds of thousands of computing nodes, have to handle frequent failures of some components. To cope with this issue, FT-GAIA transparently replicates simulation entities and distributes them on multiple execution nodes. This allows the simulation to tolerate crash-failures of computing nodes. Moreover, FT-GAIA offers some protection against Byzantine failures, since interaction messages among the simulated entities are replicated as well, so that the receiving entity can identify and discard corrupted messages. Results from an analytical model and from an experimental evaluation show that FT-GAIA provides a high degree of fault tolerance, at the cost of a moderate increase in the computational load of the execution units.Comment: arXiv admin note: substantial text overlap with arXiv:1606.0731

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Urbino

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Recommended from our members

Improvements Relating to Database Replication Protocols

Author: Popov P. T.
Stankovic V.
Publication venue
Publication date: 09/10/2013
Field of study

The present invention concerns improvements relating to database replication. More specifically, aspects of the present invention relate to a fault-tolerant node and a method for avoiding non-deterministic behaviour in the management of synchronous database systems

City Research Online

Optimistic Parallel State-Machine Replication

Author: Marandi Parisa Jalili
Pedone Fernando
Publication venue
Publication date: 27/04/2014
Field of study

State-machine replication, a fundamental approach to fault tolerance, requires replicas to execute commands deterministically, which usually results in sequential execution of commands. Sequential execution limits performance and underuses servers, which are increasingly parallel (i.e., multicore). To narrow the gap between state-machine replication requirements and the characteristics of modern servers, researchers have recently come up with alternative execution models. This paper surveys existing approaches to parallel state-machine replication and proposes a novel optimistic protocol that inherits the scalable features of previous techniques. Using a replicated B+-tree service, we demonstrate in the paper that our protocol outperforms the most efficient techniques by a factor of 2.4 times

arXiv.org e-Print Archive

Crossref