Exploiting replication in distributed systems

Authors: Birman Kenneth P.; Joseph T. A.

Techniques are examined for replicating data and execution in directly distributed systems: systems in which multiple processes interact directly with one another while continuously respecting constraints on their joint behavior. Directly distributed systems are often required to solve difficult problems, ranging from management of replicated data to dynamic reconfiguration in response to failures. It is shown that these problems reduce to more primitive, order-based consistency problems, which can be solved using primitives such as reliable broadcast protocols. Moreover, given a system that implements reliable broadcast primitives, a flexible set of high-level tools can be provided for building a wide variety of directly distributed application programs.

NASA Technical Reports Server

Achieving Robust Self-Management for Large-Scale Distributed Applications

Authors: Al-Shishtawy Ahmad; Asif Fayyaz Muhammad; Popov Konstantin; Vlassov Vladimir
Publication venue: Swedish Institute of Computer Science
Publication date: 01/01/2010

Autonomic managers are the main architectural building blocks for constructing self-management capabilities of computing systems and applications. One of the major challenges in developing self-managing applications is the robustness of the management elements that form autonomic managers. We believe that transparent handling of the effects of resource churn (joins/leaves/failures) on management should be an essential feature of a platform for self-managing large-scale dynamic distributed applications, because it facilitates the development of robust autonomic managers and hence improves the robustness of self-managing applications. This feature can be achieved by providing a robust management element abstraction that hides churn from the programmer. In this paper, we present a generic approach to achieving robust services that is based on finite state machine replication with dynamic reconfiguration of replica sets. We contribute a decentralized algorithm that maintains the set of nodes hosting service replicas in the presence of churn. We use this approach to implement robust management elements as robust services that can operate despite churn. Our proposed decentralized algorithm uses peer-to-peer replica placement schemes to automate replicated state machine migration in order to tolerate churn. Our algorithm exploits the lookup and failure detection facilities of a structured overlay network to manage the set of active replicas. Using the proposed approach, we can achieve a long-running and highly available service, without human intervention, in the presence of resource churn. In order to validate and evaluate our approach, we have implemented a prototype that includes the proposed algorithm.

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Incremental Consistency Guarantees for Replicated Objects

Authors: Guerraoui Rachid; Pavlovic Matej; Seredinschi Dragos-Adrian
Publication date: 08/09/2016

Programming with replicated objects is difficult. Developers must face the fundamental trade-off between consistency and performance head on, while struggling with the complexity of distributed storage stacks. We introduce Correctables, a novel abstraction that hides most of this complexity, allowing developers to focus on the task of balancing consistency and performance. To aid developers with this task, Correctables provide incremental consistency guarantees, which capture successive refinements on the result of an ongoing operation on a replicated object. In short, applications receive both a preliminary (fast, possibly inconsistent) result and a final (consistent) result that arrives later. We show how to leverage incremental consistency guarantees by speculating on preliminary values, trading throughput and bandwidth for improved latency. We experiment with two popular storage systems (Cassandra and ZooKeeper) and three applications: a Twissandra-based microblogging service, an ad serving system, and a ticket selling system. Our evaluation on the Amazon EC2 platform with YCSB workloads A, B, and C shows that we can reduce the latency of strongly consistent operations by up to 40% (from 100ms to 60ms) at little cost (10% bandwidth increase, 6% throughput drop) in the ad system. Even if the preliminary result is frequently inconsistent (25% of accesses), incremental consistency incurs a bandwidth overhead of only 27%.

Comment: 16 total pages, 12 figures. OSDI'16 (to appear)
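The speculation idea in this abstract can be illustrated with a minimal sketch. It is not the Correctables API: the function `speculate` and its parameters `weak_read`, `strong_read`, and `compute` are hypothetical names chosen for illustration, and the two reads run sequentially here, whereas a real system would presumably issue them concurrently.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def speculate(
    weak_read: Callable[[], T],    # hypothetical: fast, possibly inconsistent read
    strong_read: Callable[[], T],  # hypothetical: slower, consistent read
    compute: Callable[[T], R],     # application work that depends on the value
) -> Tuple[R, bool]:
    """Start work on the preliminary value; keep it only if the final value agrees.

    Returns (result, speculation_succeeded).
    """
    preliminary = weak_read()
    speculative_result = compute(preliminary)  # done before the final value is known

    final = strong_read()
    if final == preliminary:
        # The preliminary value was already consistent: reuse the speculative work.
        return speculative_result, True

    # The preliminary value was stale: discard the speculative work and redo it.
    return compute(final), False
```

When the preliminary and final values match, the work done on the preliminary value is kept; when they differ, the work is repeated on the final value, which is the extra cost that speculation trades for lower latency.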

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Majority Quorum Protocol Dedicated to General Threshold Schemes

Authors: Jorda Jacques; M'zoughi Abdelaziz; Relaza Théodore Jean Richard
Publication venue: HAL CCSD
Publication date: 01/01/2015

In this paper, we introduce a majority quorum system dedicated to p-m-n general threshold schemes, where p, n and m are respectively the minimal number of chunks that provide some information (but not necessarily all) on the original data, the total number of nodes on which the chunks of an object are stored, and the minimal number of nodes needed to retrieve the original data using this protocol. In other words, fewer than p chunks reveal absolutely no information about the original data, and fewer than m chunks cannot reconstruct the original data. The p-m-n general threshold schemes optimize the usage of storage resources by reducing the total size of the data to write, and they ensure fault tolerance for up to (n - m) node failures. With such a data distribution, a specific value of m can be set to obtain a good tradeoff between resource utilization and fault tolerance. The only drawback of such schemes is the lack of any consistency protocol. In fact, consistency protocols like the classical majority quorum are based on full replication. To successfully read or write data using the majority quorum protocol, an absolute majority of replicas must be read or written correctly. This condition ensures that any read and write operations will have at least one replica in common, which guarantees their consistency. However, when a threshold scheme is used, an adaptation is needed. In fact, the classical majority quorum protocol can no longer ensure that m chunks will have the latest version. In this paper, we introduce a new majority quorum protocol dedicated to general threshold schemes. As for the classical majority quorum protocol, the quorum size of our protocol is O(n), but the utilization of storage resources is greatly optimized.
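The overlap argument in this abstract can be checked with a small brute-force sketch. It illustrates only the standard pigeonhole bound (a read quorum of size r and a write quorum of size w over n nodes share at least r + w - n nodes); it is not the protocol proposed in the paper, and the helper names `majority` and `min_overlap` are assumptions made for illustration.

```python
from itertools import combinations


def majority(n: int) -> int:
    """Size of an absolute majority among n replicas."""
    return n // 2 + 1


def min_overlap(n: int, read_size: int, write_size: int) -> int:
    """Smallest intersection over all read/write quorum pairs (brute force).

    Intended for small n only; the result equals max(0, read_size + write_size - n).
    """
    nodes = range(n)
    return min(
        len(set(r) & set(w))
        for r in combinations(nodes, read_size)
        for w in combinations(nodes, write_size)
    )


if __name__ == "__main__":
    n = 5
    q = majority(n)  # 3
    # Two majority quorums over 5 nodes are guaranteed only 1 common node,
    # which is enough for full replication but not for rebuilding data
    # that needs m > 1 up-to-date chunks.
    print(min_overlap(n, q, q))
    # Under the pigeonhole bound, an overlap of m = 3 requires r + w >= n + m,
    # for example r = w = 4.
    print(min_overlap(n, 4, 4))
```

With full replication one common replica suffices, because that replica holds the complete latest value. With a threshold scheme a reader needs m chunks of the latest version, which is why a single guaranteed common node is no longer enough.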

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte