33 research outputs found

    Composable consistency for large-scale peer replication

    Get PDF
    technical reportThe lack of a flexible consistency management solution hinders P2P implementation of applications involving updates, such as directory services, online auctions and collaboration. Managing shared data in a P2P setting requires a consistency solution that can operate in a heterogenous network, support pervasive replication for scaling, and give peers autonomy to tune consistency to their sharing needs and resource constraints. Existing solutions lack one or more of these features. In this paper, we propose a new way to structure consistency management for P2P sharing of mutable data called composable consistency. It lets applications compose a rich variety of consistency solutions appropriate for their sharing needs, out of a small set of primitive options. Our approach splits consistency management into design choices along five orthogonal aspects, namely, concurrency, consistency, availability, update visibility and isolation. Various combinations of these choices can be employed to yield numerous consistency semantics and to fine-tune resource use at each replica. Our experience with a prototype implementation suggests that composable consistency can effectively support diverse P2P applications

    Khazana: a flexible wide area data store

    Get PDF
    technical reportKhazana is a peer-to-peer data service that supports efficient sharing and aggressive caching of mutable data across the wide area while giving clients significant control over replica divergence. Previous work on wide-area replicated services focussed on at most two of the following three properties: aggressive replication, customizable consistency, and generality. In contrast, Khazana provides scalable support for large numbers of replicas while giving applications considerable flexibility in trading off consistency for availability and performance. Its flexibility enables applications to effectively exploit inherent data locality while meeting consistency needs. Khazana exports a file system-like interface with a small set of consistency controls which can be combined to yield a broad spectrum of consistency flavors ranging from strong consistency to best-effort eventual consistency. Khazana servers form failure-resilient dynamic replica hierarchies to manage replicas across variable quality network links. In this report, we outline Khazana?s design and show how its flexibility enables three diverse network services built on top of it to meet their individual consistency and performance needs: (i) a wide-area replicated file system that supports serializable writes as well as traditional file sharing across wide area, (ii) an enterprise data service that exploits locality by caching enterprise data closer to end-users while ensuring strong consistency for data integrity, and (iii) a replicated database that reaps order of magnitude gains in throughput by relaxing consistency

    Flexible consistency for wide area peer replication

    Get PDF
    technical reportThe lack of a flexible consistency management solution hinders P2P implementation of applications involving updates, such as read-write file sharing, directory services, online auctions and wide area collaboration. Managing mutable shared data in a P2P setting requires a consistency solution that can operate efficiently over variable-quality failure-prone networks, support pervasive replication for scaling, and give peers autonomy to tune consistency to their sharing needs and resource constraints. Existing solutions lack one or more of these features. In this paper, we describe a new consistency model for P2P sharing of mutable data called composable consistency, and outline its implementation in a wide area middleware file service called Swarm1. Composable consistency lets applications compose consistency semantics appropriate for their sharing needs by combining a small set of primitive options. Swarm implements these options efficiently to support scalable, pervasive, failure-resilient, wide-area replication behind a simple yet flexible interface. We present two applications to demonstrate the expressive power and effectiveness of composable consistency: a wide area file system that outperforms Coda in providing close-to-open consistency overWANs, and a replicated BerkeleyDB database that reaps order-of-magnitude performance gains by relaxing consistency for queries and updates

    Middleware support for locality-aware wide area replication

    Get PDF
    technical reportCoherent wide-area data caching can improve the scalability and responsiveness of distributed services such as wide-area file access, database and directory services, and content distribution. However, distributed services differ widely in the frequency of read/write sharing, the amount of contention between clients for the same data, and their ability to make tradeoffs between consistency and availability. Aggressive replication enhances the scalability and availability of services with read-mostly data or data that need not be kept strongly consistent. However, for applications that require strong consistency of writeshared data, you must throttle replication to achieve reasonable performance. We have developed a middleware data store called Swarm designed to support the widearea data sharing needs of distributed services. To support the needs of diverse distributed services, Swarm provides: (i) a failure-resilient proximity-aware data replication mechanism that adjusts the replication hierarchy based on observed network characteristics and node availability, (ii) a customizable consistency mechanism that allows applications to specify allowable consistency-availability tradeoffs, and (iii) a contention-aware caching mechanism that monitors contention between replicas and adjusts its replication policies accordingly. On a 240-node P2P file sharing system, Swarm's proximity-aware caching and replica hierarchy maintenance mechanisms improve latency by 80%, reduce WAN bandwidth consumed by 80%, and limit the impact of high node churn (5 node deaths/sec) to roughly one-fifth that of random replication. In addition, Swarm's contention-aware caching mechanism outperforms RPCs and static caching mechanisms at all levels of contention on an enterprise service workload

    ORLease: Optimistically Replicated Lease Using Lease Version Vector For Higher Replica Consistency in Optimistic Replication Systems

    Get PDF
    There is a tradeoff between the availability and consistency properties of any distributed replication system. Optimistic replication favors high availability over strong consistency so that the replication system can support disconnected replicas as well as high network latency between replicas. Optimistic replication improves the availability of these systems by allowing data updates to be committed at their originating replicas first before they are asynchronously replicated out and committed later at the rest of the replicas. This leads the whole system to suffer from a relaxed data consistency. This is due to the lack of any locking mechanism to synchronize access to the replicated data resources in order to mutually exclude one another. When consistency is relaxed, there is a potential of reading from stale data as well as introducing data conflicts due to the concurrent data updates that might have been introduced at different replicas. These issues could be ameliorated if the optimistic replication system is aggressively propagating the data updates at times of good network connectivity between replicas. However, aggressive propagation for data updates does not scale well in write intensive environments and leads to communication overhead in order to keep all replicas in sync. In pursuance of a solution to mitigate the relaxed consistency drawback, a new technique has been developed that improves the consistency of optimistic replication systems without sacrificing its availability and with minimal communication overhead. This new methodology is based on applying the concurrency control technique of leasing in an optimistic way. The optimistic lease technique is built on top of a replication framework that prioritizes metadata replication over data replication. The framework treats the lease requests as replication metadata updates and replicates them aggressively in order to optimistically acquire leases on replicated data resources. The technique is demonstrating a best effort semi-locking semantics that improves the overall system consistency while avoiding any locking issues that could arise in optimistic replication systems

    Replication Control in Distributed File Systems

    Full text link
    We present a replication control protocol for distributed file systems that can guarantee strict consistency or sequential consistency while imposing no performance overhead for normal reads. The protocol uses a primary-copy scheme with server redirection when concurrent writes occur. It tolerates any number of component omission and performance failures, even when these lead to network partition. Failure detection and recovery are driven by client accesses. No heartbeat messages or expensive group communication services are required. We have implemented the protocol in NFSv4, the emerging Internet standard for distributed filing.http://deepblue.lib.umich.edu/bitstream/2027.42/107880/1/citi-tr-04-1.pd

    A replicated file system for Grid computing

    Full text link
    To meet the rigorous demands of large-scale data sharing in global collaborations, we present a replication scheme for NFSv4 that supports mutable replication without sacrificing strong consistency guarantees. Experimental evaluation indicates a substantial performance advantage over a single-server system. With the introduction of a hierarchical replication control protocol, the overhead of replication is negligible even when applications mostly write and replication servers are widely distributed. Evaluation with the NAS Grid Benchmarks demonstrates that our system provides comparable and often better performance than GridFTP, the de facto standard for Grid data sharing. Copyright © 2008 John Wiley & Sons, Ltd.Peer Reviewedhttp://deepblue.lib.umich.edu/bitstream/2027.42/60228/1/1286_ftp.pd

    Exploring heterogeneity of unreliable machines for p2p backup

    Full text link
    P2P architecture is a viable option for enterprise backup. In contrast to dedicated backup servers, nowadays a standard solution, making backups directly on organization's workstations should be cheaper (as existing hardware is used), more efficient (as there is no single bottleneck server) and more reliable (as the machines are geographically dispersed). We present the architecture of a p2p backup system that uses pairwise replication contracts between a data owner and a replicator. In contrast to standard p2p storage systems using directly a DHT, the contracts allow our system to optimize replicas' placement depending on a specific optimization strategy, and so to take advantage of the heterogeneity of the machines and the network. Such optimization is particularly appealing in the context of backup: replicas can be geographically dispersed, the load sent over the network can be minimized, or the optimization goal can be to minimize the backup/restore time. However, managing the contracts, keeping them consistent and adjusting them in response to dynamically changing environment is challenging. We built a scientific prototype and ran the experiments on 150 workstations in the university's computer laboratories and, separately, on 50 PlanetLab nodes. We found out that the main factor affecting the quality of the system is the availability of the machines. Yet, our main conclusion is that it is possible to build an efficient and reliable backup system on highly unreliable machines (our computers had just 13% average availability)
    corecore