17 research outputs found
CATS: linearizability and partition tolerance in scalable and self-organizing key-value stores
Distributed key-value stores provide scalable, fault-tolerant, and self-organizing
storage services, but fall short of guaranteeing linearizable consistency
in partially synchronous, lossy, partitionable, and dynamic networks, when data
is distributed and replicated automatically by the principle of consistent hashing.
This paper introduces consistent quorums as a solution for achieving atomic
consistency. We present the design and implementation of CATS, a distributed
key-value store which uses consistent quorums to guarantee linearizability and partition tolerance in such adverse and dynamic network conditions. CATS is
scalable, elastic, and self-organizing; key properties for modern cloud storage
middleware. Our system shows that consistency can be achieved with practical
performance and modest throughput overhead (5%) for read-intensive workloads
Recommended from our members
Optimizing and Evaluating Algorithms for Replicated Data Concurrency Control
Replication Control in Distributed File Systems
We present a replication control protocol for distributed file systems that can guarantee strict consistency or sequential consistency while imposing no performance overhead for normal reads. The protocol uses a primary-copy scheme with server redirection when concurrent writes occur. It tolerates any number of component omission and performance failures, even when these lead to network partition. Failure detection and recovery are driven by client accesses. No heartbeat messages or expensive group communication services are required. We have implemented the protocol in NFSv4, the emerging Internet standard for distributed filing.http://deepblue.lib.umich.edu/bitstream/2027.42/107880/1/citi-tr-04-1.pd
RamboNodes for the Metropolitan Ad Hoc Network
We present an algorithm to store data robustly in a large, geographically distributed network by means of localized regions of data storage that move in response to changing conditions. For example, data might migrate away from failures or toward regions of high demand. The PersistentNode algorithm provides this service robustly, but with limited safety guarantees. We use the RAMBO framework to transform PersistentNode into RamboNode, an algorithm that guarantees atomic consistency in exchange for increased cost and decreased liveness. In addition, a half-life analysis of RamboNode shows that it is robust against continuous low-rate failures. Finally, we provide experimental simulations for the algorithm on 2000 nodes, demonstrating how it services requests and examining how it responds to failures
Distributed Shared State with History Maintenance
Shared mutable state is challenging to maintain in a distributed environment. We develop a technique, based on the Operational Transform, that guides independent agents into producing consistent states through inconsistent but equivalent histories of operations. Our technique, history maintenance, extends and streamlines the Operational Transform for general distributed systems. We describe how to use history maintenance to create eventually-consistent, strongly-consistent, and hybrid systems whose correctness is easy to reason about
A replicated file system for Grid computing
To meet the rigorous demands of large-scale data sharing in global collaborations, we present a replication scheme for NFSv4 that supports mutable replication without sacrificing strong consistency guarantees. Experimental evaluation indicates a substantial performance advantage over a single-server system. With the introduction of a hierarchical replication control protocol, the overhead of replication is negligible even when applications mostly write and replication servers are widely distributed. Evaluation with the NAS Grid Benchmarks demonstrates that our system provides comparable and often better performance than GridFTP, the de facto standard for Grid data sharing. Copyright © 2008 John Wiley & Sons, Ltd.Peer Reviewedhttp://deepblue.lib.umich.edu/bitstream/2027.42/60228/1/1286_ftp.pd
CloudTPS: Scalable Transactions for Web Applications in the Cloud
NoSQL Cloud data services provide scalability and high availability properties for web applications but at the same time they sacrifice data consistency. However, many applications cannot afford any data inconsistency. CloudTPS is a scalable transaction manager to allow cloud database services to execute the ACID transactions of web applications, even in the presence of server failures and network partitions. We implement this approach on top of the two main families of scalable data layers: Bigtable and SimpleDB. Performance evaluation on top of HBase (an open-source version of Bigtable) in our local cluster and Amazon SimpleDB in the Amazon cloud shows that our system scales linearly at least up to 40 nodes in our local cluster and 80 nodes in the Amazon cloud
Prinzipien der Replikationskontrolle in verteilten Datenbanksystemen?
Durch Datenreplikation können prinzipiell schnellere Zugriffszeiten und eine beliebig hohe Fehlertoleranz
in verteilten Datenbanksystemen erreicht werden.
Anderseits erhöht Replikation die Gefahr von Inkonsistenzen
und den Aufwand von Änderungsoperationen. Zur Lösung dieses Zielkonflikts wurden in der Literatur viele unterschiedliche Replikationsverfahren vorgeschlagen. Dieser Überblicksartikel beschreibt die den einzelnen Verfahren
zugrundeliegenden Prinzipien zur Replikationskontrolle. Dazu
werden die durch den Kopieneinsatz resultierenden Probleme
erläutert, daraus Kriterien zur anschließenden Klassifikation abgeleitet und danach ausgewählte Replikationsverfahren näher vorgestellt
The Bedrock of Byzantine Fault Tolerance: A Unified Platform for BFT Protocol Design and Implementation
Byzantine Fault-Tolerant (BFT) protocols have recently been extensively used
by decentralized data management systems with non-trustworthy infrastructures,
e.g., permissioned blockchains. BFT protocols cover a broad spectrum of design
dimensions from infrastructure settings such as the communication topology, to
more technical features such as commitment strategy and even fundamental social
choice properties like order-fairness. The proliferation of different BFT
protocols has rendered it difficult to navigate the BFT landscape, let alone
determine the protocol that best meets application needs. This paper presents
Bedrock, a unified platform for BFT protocols design, analysis, implementation,
and experiments. Bedrock proposes a design space consisting of a set of design
choices capturing the trade-offs between different design space dimensions and
providing fundamentally new insights into the strengths and weaknesses of BFT
protocols. Bedrock enables users to analyze and experiment with BFT protocols
within the space of plausible choices, evolve current protocols to design new
ones, and even uncover previously unknown protocols. Our experimental results
demonstrate the capability of Bedrock to uniformly evaluate BFT protocols in
new ways that were not possible before due to the diverse assumptions made by
these protocols. The results validate Bedrock's ability to analyze and derive
BFT protocols