
    CATS: linearizability and partition tolerance in scalable and self-organizing key-value stores

    Distributed key-value stores provide scalable, fault-tolerant, and self-organizing storage services, but fall short of guaranteeing linearizable consistency in partially synchronous, lossy, partitionable, and dynamic networks when data is distributed and replicated automatically according to the principle of consistent hashing. This paper introduces consistent quorums as a solution for achieving atomic consistency. We present the design and implementation of CATS, a distributed key-value store that uses consistent quorums to guarantee linearizability and partition tolerance under such adverse and dynamic network conditions. CATS is scalable, elastic, and self-organizing, key properties for modern cloud storage middleware. Our system shows that consistency can be achieved with practical performance and modest throughput overhead (5%) for read-intensive workloads.
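    The core idea a quorum system rests on is the standard intersection argument: any read quorum overlaps any write quorum, so a majority read always observes the latest majority write. The following minimal sketch of a quorum-replicated register illustrates this (illustrative Python, not the CATS implementation; CATS additionally handles membership churn and consistent hashing via its consistent quorums):

```python
# Minimal quorum-replicated register (illustrative sketch only).
# For brevity, reads and writes contact the first majority of replicas;
# any majority would do, since any two majorities of n intersect.

class Replica:
    def __init__(self):
        self.value, self.ts = None, 0   # stored value and its timestamp

def write(replicas, value, ts):
    """Apply the write at a majority of replicas."""
    majority = len(replicas) // 2 + 1
    for r in replicas[:majority]:
        if ts > r.ts:
            r.value, r.ts = value, ts

def read(replicas):
    """Read a majority and return the freshest value seen."""
    majority = len(replicas) // 2 + 1
    freshest = max(replicas[:majority], key=lambda r: r.ts)
    return freshest.value, freshest.ts

replicas = [Replica() for _ in range(5)]
write(replicas, "v1", ts=1)
print(read(replicas))   # ('v1', 1): the read quorum overlaps the write quorum
```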

    Replication Control in Distributed File Systems

    We present a replication control protocol for distributed file systems that can guarantee strict consistency or sequential consistency while imposing no performance overhead on normal reads. The protocol uses a primary-copy scheme with server redirection when concurrent writes occur. It tolerates any number of component omission and performance failures, even when these lead to network partition. Failure detection and recovery are driven by client accesses; no heartbeat messages or expensive group communication services are required. We have implemented the protocol in NFSv4, the emerging Internet standard for distributed filing.
    http://deepblue.lib.umich.edu/bitstream/2027.42/107880/1/citi-tr-04-1.pd
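    The primary-copy-with-redirection idea can be sketched as follows (hypothetical Python, not the paper's NFSv4 code; replica propagation is omitted): reads are served locally with no extra messages, while a write arriving at a non-primary server redirects the client to the primary.

```python
# Primary-copy sketch with server redirection (illustrative only).

class FileServer:
    def __init__(self, name, primary=None):
        self.name = name
        self.primary = primary or self   # default: this server is the primary
        self.data = {}

    def read(self, path):
        return self.data.get(path)       # normal reads: no coordination

    def write(self, path, data):
        if self.primary is not self:
            return ("REDIRECT", self.primary.name)  # client retries there
        self.data[path] = data           # primary applies (propagation omitted)
        return ("OK", self.name)

primary = FileServer("srv-a")
secondary = FileServer("srv-b", primary=primary)
print(secondary.write("/f", b"x"))  # ('REDIRECT', 'srv-a')
print(primary.write("/f", b"x"))    # ('OK', 'srv-a')
```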

    RamboNodes for the Metropolitan Ad Hoc Network

    We present an algorithm to store data robustly in a large, geographically distributed network by means of localized regions of data storage that move in response to changing conditions. For example, data might migrate away from failures or toward regions of high demand. The PersistentNode algorithm provides this service robustly, but with limited safety guarantees. We use the RAMBO framework to transform PersistentNode into RamboNode, an algorithm that guarantees atomic consistency in exchange for increased cost and decreased liveness. In addition, a half-life analysis of RamboNode shows that it is robust against continuous low-rate failures. Finally, we provide experimental simulations of the algorithm on 2000 nodes, demonstrating how it services requests and examining how it responds to failures.
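    The migration behavior can be pictured with a toy heuristic (illustrative Python only; the actual PersistentNode/RamboNode algorithms are far more involved and carry atomicity guarantees this sketch does not): the stored region drifts away from failed neighbors and toward the node currently seeing the most demand.

```python
# Toy region-migration heuristic in the spirit of the abstract (hypothetical).

def next_host(current, nodes, failed, demand):
    """Pick the live candidate with the highest observed demand."""
    live = [n for n in nodes if n not in failed]
    return max(live, key=lambda n: demand.get(n, 0), default=current)

nodes = ["n1", "n2", "n3", "n4"]
# The region leaves failed node n1 and moves toward high-demand node n3.
print(next_host("n1", nodes, failed={"n1"}, demand={"n3": 7, "n2": 2}))  # n3
```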

    Distributed Shared State with History Maintenance

    Shared mutable state is challenging to maintain in a distributed environment. We develop a technique, based on the Operational Transform, that guides independent agents into producing consistent states through inconsistent but equivalent histories of operations. Our technique, history maintenance, extends and streamlines the Operational Transform for general distributed systems. We describe how to use history maintenance to create eventually-consistent, strongly-consistent, and hybrid systems whose correctness is easy to reason about
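    The flavor of the underlying Operational Transform can be seen in its classic insert/insert case (a minimal Python sketch, not the paper's full history-maintenance machinery): two sites apply the same pair of concurrent inserts in opposite orders, yet converge to the same document.

```python
# Minimal operational-transform step for concurrent text inserts.

def transform(op, against):
    """Shift op's position if a concurrent insert lands at or before it."""
    pos, text = op
    apos, atext = against
    if apos < pos or (apos == pos and atext < text):  # deterministic tie-break
        return (pos + len(atext), text)
    return op

def apply(doc, op):
    pos, text = op
    return doc[:pos] + text + doc[pos:]

a, b = (1, "X"), (1, "Y")            # two concurrent inserts at position 1
site1 = apply(apply("ab", a), transform(b, a))   # applies a, then b'
site2 = apply(apply("ab", b), transform(a, b))   # applies b, then a'
print(site1, site2, site1 == site2)  # aXYb aXYb True: histories differ, states agree
```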

    A replicated file system for Grid computing

    To meet the rigorous demands of large-scale data sharing in global collaborations, we present a replication scheme for NFSv4 that supports mutable replication without sacrificing strong consistency guarantees. Experimental evaluation indicates a substantial performance advantage over a single-server system. With the introduction of a hierarchical replication control protocol, the overhead of replication is negligible even when applications mostly write and replication servers are widely distributed. Evaluation with the NAS Grid Benchmarks demonstrates that our system provides comparable and often better performance than GridFTP, the de facto standard for Grid data sharing.
    http://deepblue.lib.umich.edu/bitstream/2027.42/60228/1/1286_ftp.pd

    CloudTPS: Scalable Transactions for Web Applications in the Cloud

    NoSQL cloud data services provide scalability and high availability for web applications, but they sacrifice data consistency, which many applications cannot afford. CloudTPS is a scalable transaction manager that allows cloud database services to execute the ACID transactions of web applications, even in the presence of server failures and network partitions. We implement this approach on top of the two main families of scalable data layers: Bigtable and SimpleDB. Performance evaluation on top of HBase (an open-source version of Bigtable) in our local cluster and on Amazon SimpleDB in the Amazon cloud shows that our system scales linearly, at least up to 40 nodes in our local cluster and 80 nodes in the Amazon cloud.
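    At its simplest, atomically committing a transaction whose data spans several partitions can be sketched as two-phase commit (illustrative Python; CloudTPS's actual protocol additionally replicates transaction state across transaction managers to survive the failures and partitions the abstract mentions):

```python
# Bare-bones two-phase commit: commit only if every participant votes yes.

class Participant:
    def __init__(self, healthy=True):
        self.healthy = healthy
        self.committed = False

    def prepare(self):
        return self.healthy          # vote: can this partition commit its part?

    def commit(self):
        self.committed = True

    def abort(self):
        self.committed = False

def two_phase_commit(participants):
    if all(p.prepare() for p in participants):   # phase 1: collect votes
        for p in participants:
            p.commit()                           # phase 2: commit everywhere
        return "COMMIT"
    for p in participants:
        p.abort()                                # any "no" vote aborts all
    return "ABORT"

print(two_phase_commit([Participant(), Participant()]))               # COMMIT
print(two_phase_commit([Participant(), Participant(healthy=False)]))  # ABORT
```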

    Principles of Replication Control in Distributed Database Systems

    In principle, data replication can achieve faster access times and arbitrarily high fault tolerance in distributed database systems. On the other hand, replication increases the risk of inconsistencies and the cost of update operations. To resolve this trade-off, many different replication schemes have been proposed in the literature. This survey describes the replication-control principles underlying these schemes. To that end, it explains the problems that arise from maintaining copies, derives classification criteria from them, and then presents selected replication schemes in more detail.
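    A classical principle that such replication-control schemes build on is quorum consensus: with n copies, a read quorum r and a write quorum w must satisfy r + w > n (every read overlaps the latest write) and 2w > n (two writes cannot both succeed on disjoint copies). A small sketch of this standard rule (illustrative Python):

```python
# Enumerate the (r, w) quorum pairs that satisfy the classical
# quorum-consensus conditions for n copies.

def valid_quorums(n):
    return [(r, w) for r in range(1, n + 1) for w in range(1, n + 1)
            if r + w > n and 2 * w > n]

print(valid_quorums(3))
# [(1, 3), (2, 2), (2, 3), (3, 2), (3, 3)]
# (1, 3) is read-one/write-all (ROWA): read any copy, write all copies.
```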

    The Bedrock of Byzantine Fault Tolerance: A Unified Platform for BFT Protocol Design and Implementation

    Byzantine Fault-Tolerant (BFT) protocols have recently been used extensively by decentralized data management systems with non-trustworthy infrastructures, e.g., permissioned blockchains. BFT protocols cover a broad spectrum of design dimensions, from infrastructure settings such as the communication topology, to technical features such as the commitment strategy, and even fundamental social-choice properties such as order-fairness. The proliferation of different BFT protocols has made it difficult to navigate the BFT landscape, let alone determine the protocol that best meets application needs. This paper presents Bedrock, a unified platform for BFT protocol design, analysis, implementation, and experimentation. Bedrock defines a design space of choices that captures the trade-offs among these dimensions and provides fundamentally new insights into the strengths and weaknesses of BFT protocols. Bedrock enables users to analyze and experiment with BFT protocols within the space of plausible choices, evolve current protocols to design new ones, and even uncover previously unknown protocols. Our experimental results demonstrate Bedrock's capability to uniformly evaluate BFT protocols in ways that were not possible before, due to the diverse assumptions these protocols make. The results validate Bedrock's ability to analyze and derive BFT protocols.
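    As background for this design space, the standard sizing arithmetic under partial synchrony (a well-known result, not Bedrock's API) is that tolerating f Byzantine replicas requires n >= 3f + 1 replicas, with quorums of 2f + 1:

```python
# Standard BFT replica and quorum sizes for f Byzantine faults.

def bft_sizes(f):
    n = 3 * f + 1          # minimum number of replicas
    quorum = 2 * f + 1     # any two quorums intersect in >= f + 1 replicas,
    return n, quorum       # hence in at least one correct replica

for f in (1, 2, 3):
    n, q = bft_sizes(f)
    print(f"f={f}: n={n}, quorum={q}")
# f=1: n=4, quorum=3
# f=2: n=7, quorum=5
# f=3: n=10, quorum=7
```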