
    Khazana: a flexible wide area data store

    Technical report
    Khazana is a peer-to-peer data service that supports efficient sharing and aggressive caching of mutable data across the wide area while giving clients significant control over replica divergence. Previous work on wide-area replicated services focused on at most two of the following three properties: aggressive replication, customizable consistency, and generality. In contrast, Khazana provides scalable support for large numbers of replicas while giving applications considerable flexibility in trading off consistency for availability and performance. This flexibility enables applications to effectively exploit inherent data locality while meeting their consistency needs. Khazana exports a file system-like interface with a small set of consistency controls that can be combined to yield a broad spectrum of consistency flavors, ranging from strong consistency to best-effort eventual consistency. Khazana servers form failure-resilient dynamic replica hierarchies to manage replicas across network links of variable quality. In this report, we outline Khazana's design and show how its flexibility enables three diverse network services built on top of it to meet their individual consistency and performance needs: (i) a wide-area replicated file system that supports serializable writes as well as traditional file sharing across the wide area, (ii) an enterprise data service that exploits locality by caching enterprise data close to end users while ensuring strong consistency for data integrity, and (iii) a replicated database that reaps order-of-magnitude gains in throughput by relaxing consistency.
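    The claim that a small set of consistency controls can be combined into a broad spectrum of flavors can be pictured with a short sketch. The option names, preset values, and open_region call below are illustrative assumptions for this listing, not Khazana's actual interface.

        from dataclasses import dataclass

        # Hypothetical per-open consistency controls; names are illustrative only.
        @dataclass(frozen=True)
        class ConsistencyOptions:
            strength: str               # "strong" or "eventual"
            push_updates: bool          # eagerly propagate writes to other replicas
            max_staleness_secs: float   # how stale a cached replica may be on a read

        # A few flavors obtained by recombining the same primitive knobs.
        STRONG        = ConsistencyOptions("strong",   push_updates=True,  max_staleness_secs=0.0)
        CLOSE_TO_OPEN = ConsistencyOptions("strong",   push_updates=False, max_staleness_secs=0.0)
        BEST_EFFORT   = ConsistencyOptions("eventual", push_updates=False, max_staleness_secs=30.0)

        def open_region(region_id: str, opts: ConsistencyOptions) -> None:
            """Open a shared data region with the requested consistency flavor."""
            print(f"open {region_id} with {opts}")

        open_region("/db/accounts", STRONG)              # e.g. the replicated database
        open_region("/files/report.txt", CLOSE_TO_OPEN)  # e.g. the replicated file system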

    Composable consistency for large-scale peer replication

    Technical report
    The lack of a flexible consistency management solution hinders P2P implementation of applications involving updates, such as directory services, online auctions, and collaboration. Managing shared data in a P2P setting requires a consistency solution that can operate in a heterogeneous network, support pervasive replication for scaling, and give peers autonomy to tune consistency to their sharing needs and resource constraints. Existing solutions lack one or more of these features. In this paper, we propose a new way to structure consistency management for P2P sharing of mutable data, called composable consistency. It lets applications compose a rich variety of consistency solutions appropriate for their sharing needs out of a small set of primitive options. Our approach splits consistency management into design choices along five orthogonal aspects: concurrency, consistency, availability, update visibility, and isolation. Various combinations of these choices can be employed to yield numerous consistency semantics and to fine-tune resource use at each replica. Our experience with a prototype implementation suggests that composable consistency can effectively support diverse P2P applications.
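    A minimal sketch of the composition idea, assuming one primitive choice per aspect. The option vocabulary and the two example compositions are invented for illustration and do not reproduce the paper's actual primitives.

        from dataclasses import dataclass

        @dataclass(frozen=True)
        class Composition:
            concurrency: str        # e.g. exclusive writer vs. concurrent writers
            consistency: str        # e.g. pull updates on demand vs. push immediately
            availability: str       # e.g. block when disconnected vs. proceed optimistically
            update_visibility: str  # e.g. expose each write vs. only at session end
            isolation: str          # e.g. none vs. snapshot reads

        # Two very different semantics expressed by recombining the same aspects.
        auction_bids = Composition("concurrent-writers", "push-immediately",
                                   "proceed-optimistically", "per-write", "none")
        directory_update = Composition("exclusive-writer", "pull-on-demand",
                                       "block-when-disconnected", "session-end", "snapshot")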

    Flexible consistency for wide area peer replication

    Technical report
    The lack of a flexible consistency management solution hinders P2P implementation of applications involving updates, such as read-write file sharing, directory services, online auctions, and wide-area collaboration. Managing mutable shared data in a P2P setting requires a consistency solution that can operate efficiently over variable-quality, failure-prone networks, support pervasive replication for scaling, and give peers autonomy to tune consistency to their sharing needs and resource constraints. Existing solutions lack one or more of these features. In this paper, we describe a new consistency model for P2P sharing of mutable data called composable consistency, and outline its implementation in a wide-area middleware file service called Swarm. Composable consistency lets applications compose consistency semantics appropriate for their sharing needs by combining a small set of primitive options. Swarm implements these options efficiently to support scalable, pervasive, failure-resilient, wide-area replication behind a simple yet flexible interface. We present two applications to demonstrate the expressive power and effectiveness of composable consistency: a wide-area file system that outperforms Coda in providing close-to-open consistency over WANs, and a replicated BerkeleyDB database that reaps order-of-magnitude performance gains by relaxing consistency for queries and updates.
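    The database gains come from relaxing consistency for queries and updates. The sketch below illustrates that general idea under assumed semantics (bounded-staleness reads served locally, writes applied locally and propagated asynchronously); it is not Swarm's implementation, and the class and method names are hypothetical.

        import time

        class RelaxedReplica:
            def __init__(self, max_staleness: float):
                self.max_staleness = max_staleness
                self.store = {}            # key -> (value, last_sync_time)
                self.pending_updates = []  # local writes awaiting propagation to peers

            def read(self, key):
                value, synced_at = self.store.get(key, (None, 0.0))
                if time.time() - synced_at > self.max_staleness:
                    self._sync(key)        # only refresh when the staleness bound is exceeded
                    value, _ = self.store[key]
                return value

            def write(self, key, value):
                self.store[key] = (value, time.time())
                self.pending_updates.append((key, value))  # pushed to peers asynchronously

            def _sync(self, key):
                # Placeholder: fetch the latest value from a peer or home replica.
                value, _ = self.store.get(key, (None, 0.0))
                self.store[key] = (value, time.time())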

    Middleware support for locality-aware wide area replication

    Technical report
    Coherent wide-area data caching can improve the scalability and responsiveness of distributed services such as wide-area file access, database and directory services, and content distribution. However, distributed services differ widely in the frequency of read/write sharing, the amount of contention between clients for the same data, and their ability to make tradeoffs between consistency and availability. Aggressive replication enhances the scalability and availability of services with read-mostly data or data that need not be kept strongly consistent. However, for applications that require strong consistency of write-shared data, replication must be throttled to achieve reasonable performance. We have developed a middleware data store called Swarm designed to support the wide-area data sharing needs of distributed services. To support the needs of diverse distributed services, Swarm provides: (i) a failure-resilient, proximity-aware data replication mechanism that adjusts the replication hierarchy based on observed network characteristics and node availability, (ii) a customizable consistency mechanism that allows applications to specify allowable consistency-availability tradeoffs, and (iii) a contention-aware caching mechanism that monitors contention between replicas and adjusts its replication policies accordingly. On a 240-node P2P file sharing system, Swarm's proximity-aware caching and replica hierarchy maintenance mechanisms improve latency by 80%, reduce WAN bandwidth consumption by 80%, and limit the impact of high node churn (5 node deaths/sec) to roughly one-fifth that of random replication. In addition, Swarm's contention-aware caching mechanism outperforms RPCs and static caching mechanisms at all levels of contention on an enterprise service workload.
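    A hedged sketch of what a contention-aware caching policy of this kind might look like: each object tracks how often other replicas write it, and above a threshold the replica stops caching and forwards operations to the object's master copy instead. The threshold, counters, and class name are illustrative assumptions, not Swarm's mechanism.

        class ContentionAwarePolicy:
            def __init__(self, switch_threshold: float = 0.5):
                self.switch_threshold = switch_threshold
                self.local_accesses = 0
                self.remote_writes = 0   # invalidations observed from other replicas

            def record_local_access(self):
                self.local_accesses += 1

            def record_remote_write(self):
                self.remote_writes += 1

            def should_cache(self) -> bool:
                total = self.local_accesses + self.remote_writes
                if total == 0:
                    return True
                contention = self.remote_writes / total
                # Low contention: keep a local cached replica.
                # High contention: forward operations to the master (RPC-like mode).
                return contention < self.switch_threshold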

    DataStations: ubiquitous transient storage for mobile users

    Technical report
    In this paper, we describe DataStations, an architecture that provides ubiquitous transient storage to arbitrary mobile applications. Mobile users can utilize a nearby DataStation as a proxy cache for their remote home file servers, as a file server to meet transient storage needs, and as a platform to share data and collaborate with other users over the wide area. A user can roam among DataStations, creating, updating, and sharing files via a native file interface using a uniform file name space throughout. Our architecture provides transparent migration of file ownership and responsibility among DataStations and a user's home file server. This design not only ensures file permanence, but also allows DataStations to reclaim their resources autonomously, allowing the system to scale incrementally to a large number of DataStations and users. The unique aspects of our DataStation design are its decentralized but uniform name space, its locality-aware peer replication mechanism, and its highly flexible consistency framework that lets users select the appropriate consistency mechanism on a per-file-replica basis. Our evaluation demonstrates that DataStations can support low-latency access to remote files as well as ad hoc data sharing and collaboration by mobile users, without compromising consistency or data safety.
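    A toy sketch of the ownership-migration idea: a file created at a transient DataStation is later handed off to the user's home server so the station can reclaim its space without losing the file. The classes, paths, and method names below are invented for illustration and are not the DataStations implementation.

        class FileReplica:
            def __init__(self, path: str, data: bytes, is_owner: bool):
                self.path, self.data, self.is_owner = path, data, is_owner

        class StorageNode:
            """Either a DataStation or a home file server in this toy model."""
            def __init__(self, name: str):
                self.name = name
                self.files = {}

            def create(self, path: str, data: bytes) -> None:
                # A newly created file is owned by the node it was created on.
                self.files[path] = FileReplica(path, data, is_owner=True)

            def migrate_ownership(self, path: str, target: "StorageNode") -> None:
                # Hand the authoritative copy to the target, then reclaim local space.
                replica = self.files.pop(path)
                replica.is_owner = True
                target.files[path] = replica

        station = StorageNode("airport-datastation")
        home = StorageNode("home-file-server")
        station.create("/u/alice/notes.txt", b"draft")
        station.migrate_ownership("/u/alice/notes.txt", home)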

    Flexible multi-policy scheduling based on CPU inheritance

    Journal article
    Traditional processor scheduling mechanisms in operating systems are fairly rigid, often supporting only one fixed scheduling policy or, at most, a few "scheduling classes" whose implementations are closely tied together in the OS kernel. This paper presents CPU inheritance scheduling, a novel processor scheduling framework in which arbitrary threads can act as schedulers for other threads. Widely different scheduling policies can be implemented under the framework, and many different policies can coexist in a single system, providing much greater scheduling flexibility. Modular, hierarchical control can be provided over the processor utilization of arbitrary administrative domains, such as processes, jobs, users, and groups, and the CPU resources consumed can be accounted for and attributed accurately. Applications as well as the OS can implement customized local scheduling policies; the framework ensures that all the different policies work together logically and predictably. As a side effect, the framework also cleanly addresses priority inversion by providing a generalized form of priority inheritance that automatically works within and among multiple diverse scheduling policies. CPU inheritance scheduling extends naturally to multiprocessors and supports processor management techniques such as processor affinity [7] and scheduler activations [1]. Experimental results and simulations indicate that this framework can be provided with negligible overhead in typical situations, and with fairly small (5-10%) performance degradation even in scheduling-intensive situations.
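    The core idea, threads donating the CPU to the threads they schedule, can be pictured with a small simulation. The round-robin policy, quantum handling, and class names below are simplifying assumptions for illustration, not the paper's framework.

        class Thread:
            def __init__(self, name):
                self.name = name

            def run(self, quantum):
                print(f"{self.name} runs for {quantum} ticks")

        class SchedulerThread(Thread):
            """A thread that, when given the CPU, donates it to one of its clients."""
            def __init__(self, name, clients):
                super().__init__(name)
                self.clients = clients
                self._next = 0

            def run(self, quantum):
                # Round-robin here; another SchedulerThread could implement
                # fixed-priority, lottery, etc., and the policies compose freely.
                client = self.clients[self._next % len(self.clients)]
                self._next += 1
                client.run(quantum)   # donate the CPU slice to the chosen client

        # A root scheduler whose clients are an application thread and a second
        # scheduler managing two background threads, forming a hierarchy.
        background = SchedulerThread("bg-sched", [Thread("logger"), Thread("indexer")])
        root = SchedulerThread("root-sched", [Thread("editor"), background])
        for _ in range(4):
            root.run(quantum=10)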