
146 research outputs found

Parallel Deferred Update Replication

Authors: Pacheco Leandro; Pedone Fernando; Sciascia Daniele
Publication date: 03/12/2013

Deferred update replication (DUR) is an established approach to implementing highly efficient and available storage. While the throughput of read-only transactions scales linearly with the number of deployed replicas in DUR, the throughput of update transactions experiences limited improvements as replicas are added. This paper presents Parallel Deferred Update Replication (P-DUR), a variation of classical DUR that scales both read-only and update transactions with the number of cores available in a replica. In addition to introducing the new approach, we describe its full implementation and compare its performance to classical DUR and to Berkeley DB, a well-known standalone database
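For readers unfamiliar with deferred update replication, the certification step that classical DUR is generally understood to rely on can be sketched as follows. This is a minimal illustration, not the P-DUR implementation from the paper: the `Replica` class, its fields, and the snapshot counter are hypothetical names, and the sketch assumes that each transaction arrives at every replica in the same order together with its read set, write set, and the commit count it observed when it started.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical replica state: item values plus the commit number that last wrote each item."""
    values: dict = field(default_factory=dict)
    versions: dict = field(default_factory=dict)
    commit_counter: int = 0

    def certify_and_apply(self, snapshot, read_set, write_set):
        """Commit only if no item in read_set was overwritten after `snapshot`."""
        for key in read_set:
            if self.versions.get(key, 0) > snapshot:
                return False  # the transaction read stale data, so it aborts
        self.commit_counter += 1
        for key, value in write_set.items():
            self.values[key] = value
            self.versions[key] = self.commit_counter
        return True


replica = Replica()
# Two transactions that both started from snapshot 0 and both read x.
first = replica.certify_and_apply(snapshot=0, read_set={"x"}, write_set={"x": 1})
second = replica.certify_and_apply(snapshot=0, read_set={"x"}, write_set={"x": 2})
# A later transaction that started after the first commit.
third = replica.certify_and_apply(snapshot=1, read_set={"x"}, write_set={"y": 5})
```

Because every replica runs the same deterministic check on the same sequence of transactions, all replicas reach the same commit or abort decision without further coordination. In this sketch the check runs sequentially, which is one way to picture why update throughput improves little as replicas are added.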

arXiv.org e-Print Archive

Crossref

The ISIS project: Fault-tolerance in large distributed systems

Authors: Birman Kenneth P.; Marzullo Keith

The semi-annual status report covers activities of the ISIS project during the second half of 1989. The project had several independent objectives: (1) At the level of the ISIS Toolkit, ISIS release V2.0 was completed, containing bypass communication protocols. Performance of the system is greatly enhanced by this change, but the initial software release is limited in some respects. (2) The Meta project focused on the definition of the Lomita programming language for specifying rules that monitor sensors for conditions of interest and trigger appropriate reactions. This design was completed, and implementation of Lomita is underway on the Meta 2.0 platform. (3) The Deceit file system effort completed a prototype. It is planned to make Deceit available for use in two hospital information systems. (4) A long-haul communication subsystem project was completed and can be used as part of ISIS. This effort resulted in tools for linking ISIS systems on different LANs together over long-haul communications lines. (5) Magic Lantern, a graphical tool for building application monitoring and control interfaces, is included as part of the general ISIS releases

NASA Technical Reports Server

Process membership in asynchronous environments

Authors: Birman Kenneth P.; Ricciardi Aleta M.
Publication date: 01/02/1993

The development of reliable distributed software is simplified by the ability to assume a fail-stop failure model. The emulation of such a model in an asynchronous distributed environment is discussed. The solution proposed, called Strong-GMP, can be supported through a highly efficient protocol, and was implemented as part of a distributed systems software project at Cornell University. The precise definition of the problem, the protocol, correctness proofs, and an analysis of costs are addressed

NASA Technical Reports Server

eCommons@Cornell

Report on the Second European SIGOPS Workshop "making distributed systems work"

Author: Mullender Sape
Publication venue: ACM
Publication date: 01/01/1987

University of Twente Research Information

Fault-Tolerant Partial Replication in Large-Scale Database Systems

Authors: Shapiro Marc; Sutra Pierre
Publication date: 01/01/2008

We investigate a decentralised approach to committing transactions in a replicated database, under partial replication. Previous protocols either re-execute transactions entirely and/or compute a total order of transactions. In contrast, ours applies update values, and orders only conflicting transactions. As a result, transactions execute faster, and distributed databases commit in small committees. Both effects help preserve scalability as the number of databases and transactions increases. Our algorithm ensures serializability, and is live and safe in spite of faults
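The idea of ordering only conflicting transactions can be illustrated with a small sketch. It is not the authors' protocol: the `conflicts` and `pairs_to_order` helpers are hypothetical, and the conflict rule used here (one transaction writes an item the other reads or writes) is a common textbook definition assumed for illustration.

```python
from itertools import combinations


def conflicts(t1, t2):
    """Two transactions conflict if one writes an item the other reads or writes."""
    r1, w1 = t1
    r2, w2 = t2
    return bool(w1 & (r2 | w2)) or bool(w2 & r1)


def pairs_to_order(transactions):
    """Return the pairs of transaction names that need a relative order."""
    return [
        (a, b)
        for a, b in combinations(sorted(transactions), 2)
        if conflicts(transactions[a], transactions[b])
    ]


# Each transaction is a (read_set, write_set) pair; names and items are made up.
txns = {
    "t1": ({"x"}, {"x"}),
    "t2": ({"y"}, {"y"}),
    "t3": ({"x"}, {"z"}),
}
ordered_pairs = pairs_to_order(txns)
```

Here only `t1` and `t3` touch a common item in a conflicting way, so only that pair needs to be ordered. `t2` can commit independently of both, whereas a total order would sequence all three transactions.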

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

On the many faces of atomic multicast

Authors: Coelho Paulo; Pedone Fernando
Publication date: 09/07/2019

Many current online services need to serve clients distributed across geographic areas. Coordinating highly available and scalable geographically distributed replicas, however, is challenging. While State Machine Replication is the most direct way of achieving availability, no scalability comes from the traditional approach. Typically, scalability is obtained by partitioning the original application state among groups of servers, which leads to further challenges. Atomic multicast is a group communication abstraction that groups processes, providing reliability and ordering guarantees, and can be explored to provide partially replicated applications a scalable and consistent alternative. This work confronts the challenges of providing practical group communication abstractions for crash fault-tolerant and Byzantine fault-tolerant (BFT) models. Although there are plenty of atomic multicast algorithms that tolerate crash failures, they suffer from two major issues: (a) high latency for messages addressed to multiple groups, and (b) low performance when the proportion of messages to multiple groups is high. To solve the first problem and reduce the latency of multi-group messages, this work presents FastCast, an algorithm with unprecedented four communication delays. The second problem can be addressed by maximizing the proportion of single-group messages and eliminating additional communication among groups to execute operations. In this direction, this document introduces GeoPaxos, a protocol that partitions the ordering of operations like atomic multicast while still keeping the state fully replicated. In the BFT model, the task is more challenging, since servers can behave arbitrarily. This thesis presents ByzCast, the first algorithm that tolerates Byzantine failures. ByzCast is hierarchical and introduces a new class of atomic multicast defined as partially genuine.
Lastly, since a consensus protocol resides at the very core of most strongly consistent replicated systems, the thesis concludes with Kernel Paxos, a Paxos implementation provided as a loadable kernel module that delivers high performance while abstracting ordering from the application execution

RERO DOC Digital Library

Evaluating certification protocols in the partial database state machine

Authors: Correia Júnior Alfrânio Tavares; Moura Francisco Coelho Soares; Oliveira Rui Carlos Mendes de; Pereira José; Sousa António Luís Pinto Ferreira de
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2006

Partial replication is an alluring technique to ensure the reliability of very large and geographically distributed databases while, at the same time, offering good performance. By correctly exploiting access locality, most transactions become confined to a small subset of the database replicas, thus reducing the processing, storage access and communication overhead associated with replication. The advantages of partial replication have, however, to be weighed against the added complexity that is required to manage it. In fact, if the chosen replica configuration prevents the local execution of transactions or if the overhead of consistency protocols offsets the savings of locality, potential gains cannot be realized. These issues are heavily dependent on the application used for evaluation and render simplistic benchmarks useless. In this paper, we present a detailed analysis of Partial Database State Machine (PDBSM) replication by comparing alternative partial replication protocols with full replication. This is done using a realistic scenario based on a detailed network simulator and access patterns from an industry standard database benchmark. The results obtained allow us to identify the best configuration for typical on-line transaction processing applications.
Funding: European Union - GORDA Project (FP6-IST/004758)

Universidade do Minho: RepositoriUM

HT-Paxos: High Throughput State-Machine Replication Protocol for Large Clustered Data Centers

Authors: Agarwal Ajay; Kumar Vinit
Publication date: 04/07/2014

Paxos is a prominent theory of state machine replication. Recent data-intensive systems that implement state machine replication generally require high throughput. Earlier versions of Paxos, such as classical Paxos, Fast Paxos and Generalized Paxos, focus mainly on fault tolerance and latency but lack throughput and scalability. A major reason for this is the heavyweight leader. By offloading the leader, we can further increase the throughput of the system. Ring Paxos, Multi Ring Paxos and S-Paxos are a few prominent attempts in this direction for clustered data centers. In this paper, we propose HT-Paxos, a variant of Paxos that is best suited to any large clustered data center. HT-Paxos offloads the leader significantly further and hence increases the throughput and scalability of the system. At the same time, among high-throughput state-machine replication protocols, HT-Paxos provides reasonably low latency and response time

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

PubMed Central