9,733 research outputs found

Low cost management of replicated data in fault-tolerant distributed systems

Author: Birman Kenneth P.
Joseph Thomas A.

Many distributed systems replicate data for fault tolerance or availability. In such systems, a logical update on a data item results in a physical update on a number of copies. The synchronization and communication required to keep the copies of replicated data consistent introduce a delay when operations are performed. A technique is described that relaxes the usual degree of synchronization, permitting replicated data items to be updated concurrently with other operations, while at the same time ensuring that correctness is not violated. The additional concurrency thus obtained results in better response time when performing operations on replicated data. How this technique performs in conjunction with a roll-back and a roll-forward failure recovery mechanism is also discussed.

NASA Technical Reports Server

Coordination-Free Byzantine Replication with Minimal Communication Costs

Author: Hellings Jelle
Sadoghi Mohammad
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 23rd International Conference on Database Theory (ICDT 2020)
Publication date: 01/01/2020

State-of-the-art fault-tolerant and federated data management systems rely on fully-replicated designs in which all participants have equivalent roles. Consequently, these systems have only limited scalability and are ill-suited for high-performance data management. As an alternative, we propose a hierarchical design in which a Byzantine cluster manages data, while an arbitrary number of learners can reliably learn these updates and use the corresponding data. To realize our design, we propose the delayed-replication algorithm, an efficient solution to the Byzantine learner problem that is central to our design. The delayed-replication algorithm is coordination-free, scalable, and has minimal communication cost for all participants involved. In doing so, the delayed-replication algorithm opens the door to new high-performance fault-tolerant and federated data management systems. To illustrate this, we show that the delayed-replication algorithm is not only useful to support specialized learners, but can also be used to reduce the overall communication cost of permissioned blockchains and to improve their storage scalability.

Dagstuhl Research Online Publication Server

Analysis of operating system diversity for intrusion tolerance

Author: Abd-El-Malek
Alhazmi
Anbalagan
Bessani
Castro
Castro
Chou
Clement
Correia
Deswarte
Garcia
Gashi
Gorbenko
Hatton
Hofmeyr
Joseph
Kapitza
Kim
Knight
Koopman
Kotla
Lamport
Littlewood
Littlewood
Miller
Moniz
Nikora
Ozment
Randell
Rescorla
Roeder
Sousa
Veríssimo
Wood
Yin
Publication venue: Wiley
Publication date: 15/01/2013

One of the key benefits of using intrusion-tolerant systems is the possibility of ensuring correct behavior in the presence of attacks and intrusions. These security gains are directly dependent on the components exhibiting failure diversity. The extent to which failure diversity is observed in practical deployments depends on how diverse the components that constitute the system are. In this paper, we present a study of operating system (OS) vulnerability data from the NIST National Vulnerability Database (NVD). We have analyzed the vulnerabilities of 11 different OSs over a period of 18 years, to check how many of these vulnerabilities occur in more than one OS. We found this number to be low for several combinations of OSs. Hence, although there are a few caveats on the use of NVD data to support definitive conclusions, our analysis shows that by selecting appropriate OSs, one can preclude (or reduce substantially) common vulnerabilities from occurring in the replicas of the intrusion-tolerant system.
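
The kind of check this abstract describes, counting vulnerabilities that occur in more than one OS, might be sketched roughly as below. This is an illustration only and not the authors' method: the OS names, the vulnerability IDs, and the helper names `shared_vulnerabilities` and `pairs_without_overlap` are all hypothetical, and real NVD records would need parsing and cleaning first.

```python
from itertools import combinations

# Hypothetical data: each OS maps to the set of vulnerability IDs affecting it.
# The OS names and IDs are placeholders, not real NVD entries.
vulns_by_os = {
    "os_a": {"V-1", "V-2", "V-3"},
    "os_b": {"V-3", "V-4"},
    "os_c": {"V-5"},
}


def shared_vulnerabilities(vulns_by_os):
    """Return {(os1, os2): number of vulnerability IDs present in both}."""
    return {
        (a, b): len(vulns_by_os[a] & vulns_by_os[b])
        for a, b in combinations(sorted(vulns_by_os), 2)
    }


def pairs_without_overlap(vulns_by_os):
    """Return the OS pairs that share no vulnerability in this data set."""
    counts = shared_vulnerabilities(vulns_by_os)
    return [pair for pair, n in counts.items() if n == 0]


if __name__ == "__main__":
    for pair, n in shared_vulnerabilities(vulns_by_os).items():
        print(pair, n)
```

Pairs with a count of zero would be candidates for diverse replicas under this toy model; the abstract's caveats about NVD data apply to any such count.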

City Research Online

Crossref

Fault tolerant architectures for integrated aircraft electronics systems

Author: Levitt K. N.
Melliar-Smith P. M.
Schwartz R. L.

Work on possible architectures for future flight control computer systems is described. Ada for Fault-Tolerant Systems, the NETS Network Error-Tolerant System architecture, and voting in asynchronous systems are covered.

NASA Technical Reports Server

CATS: linearizability and partition tolerance in scalable and self-organizing key-value stores

Author: Arad Cosmin
Haridi Seif
Shafaat Tallat M.
Publication venue: Swedish Institute of Computer Science
Publication date: 01/01/2012

Distributed key-value stores provide scalable, fault-tolerant, and self-organizing storage services, but fall short of guaranteeing linearizable consistency in partially synchronous, lossy, partitionable, and dynamic networks, when data is distributed and replicated automatically by the principle of consistent hashing. This paper introduces consistent quorums as a solution for achieving atomic consistency. We present the design and implementation of CATS, a distributed key-value store which uses consistent quorums to guarantee linearizability and partition tolerance in such adverse and dynamic network conditions. CATS is scalable, elastic, and self-organizing: key properties for modern cloud storage middleware. Our system shows that consistency can be achieved with practical performance and modest throughput overhead (5%) for read-intensive workloads.

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database
