
    Collaboration and Document Editing on Bandwidth-Limited Devices

    This paper presents the design of CoFi, a novel architecture for supporting document editing and collaborative work on bandwidth-limited clients. CoFi combines the previously disjoint notions of consistency and fidelity in a unified architecture. CoFi enables bandwidth-limited clients to edit documents that are only partially present at the client (because parts of the document were lossily transcoded, or only a portion of the document was fetched), and to propagate modifications incrementally by progressively increasing their fidelity.
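
    As a rough illustration of editing partially present content and shipping updates at increasing fidelity, the Python sketch below models a document whose parts are cached at different fidelity levels. All names here (Part, PartialDocument, propagate) are hypothetical and are not CoFi's actual interface.

        # Toy sketch, not CoFi's API: parts of a document are cached at some
        # fidelity; edits on partial content are queued and shipped incrementally.
        from dataclasses import dataclass, field

        @dataclass
        class Part:
            content: bytes    # possibly lossily transcoded or truncated
            fidelity: int     # 0 = absent, 100 = full fidelity

        @dataclass
        class PartialDocument:
            parts: dict = field(default_factory=dict)    # part id -> Part
            pending: list = field(default_factory=list)  # queued (part id, fidelity) edits

            def fetch(self, pid, content, fidelity):
                self.parts[pid] = Part(content, fidelity)

            def edit(self, pid, new_content):
                part = self.parts[pid]            # edit whatever is locally available
                part.content = new_content
                self.pending.append((pid, part.fidelity))

            def propagate(self, min_fidelity):
                """Ship queued edits once their fidelity reaches the threshold."""
                ready = [u for u in self.pending if u[1] >= min_fidelity]
                self.pending = [u for u in self.pending if u[1] < min_fidelity]
                return ready

        doc = PartialDocument()
        doc.fetch("sec-2", b"low-res text", fidelity=40)   # lossily transcoded fetch
        doc.edit("sec-2", b"edited low-res text")
        print(doc.propagate(min_fidelity=25))              # edit propagates despite partial fidelity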

    Using Permuted States of Validated Simulation to Analyze Conflict Rates in Optimistic Replication

    Optimistic replication provides high data availability in the presence of network outages. Although widely deployed, this relaxed consistency model introduces concurrent updates, whose behavior is poorly understood due to the vast state space. This paper introduces the notion of permuted states to eliminate system states that are redundant and unreachable, which can constitute the majority of states (4069 out of 4096 for four replicas). With the aid of permuted states, we are for the first time able to construct analytical models beyond the two-replica case. By examining the analyses for 2 to 4 replicas, we can demystify the process of forming identical conflicts, the most common conflict type at high replication factors. Additionally, we have automated and optimized the generation of permuted states, which allows us to explore higher replication factors (up to 10 replicas) using hybrid techniques. It also allows us to validate our results with existing simulations based on actual replication mechanisms, which previously were analytically validated with only one pair of replicas. Finally, we have discovered that update locality and bimodal access patterns are the primary factors contributing to the formation of identical conflicts.
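
    To make the state-space reduction concrete, here is a small Python illustration (not the paper's construction) that collapses replica states differing only in which replica holds which version into a single canonical state. The paper's permuted states additionally prune unreachable states, which this toy does not model.

        # Count raw states vs. states collapsed under permutation of replica identities.
        from itertools import product

        def canonical(state):
            """Two states with the same multiset of versions are treated as one."""
            return tuple(sorted(state))

        def count_states(replicas, versions):
            raw = list(product(range(versions), repeat=replicas))
            collapsed = {canonical(s) for s in raw}
            return len(raw), len(collapsed)

        for n in (2, 3, 4):
            raw, collapsed = count_states(replicas=n, versions=n)
            print(f"{n} replicas: {raw} raw states collapse to {collapsed}")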

    A Privacy-Aware Distributed Storage and Replication Middleware for Heterogeneous Computing Platform

    Cloud computing is an emerging research area that has drawn considerable interest in recent years. However, the current infrastructure raises significant concerns about how to protect users' privacy, in part because users store their data on the cloud vendors' servers. In this paper, we address this challenge by proposing and implementing a novel middleware, called Uno, which separates the storage of physical data and their associated metadata. In our design, users' physical data are stored locally on devices under a user's full control, while their metadata can be uploaded to the commercial cloud. To ensure the reliability of users' data, we develop a novel fine-grained file replication algorithm that exploits both data access patterns and device state patterns. Based on a quantitative analysis of the data set from Rice University, this algorithm replicates data intelligently in different time slots, so that it not only significantly improves data availability but also achieves satisfactory performance in load balancing and storage diversification. We implement the Uno system on a heterogeneous testbed composed of both host servers and mobile devices, and demonstrate the programmability of Uno through the implementation and evaluation of two sample applications, Uno@Home and Uno@Sense.
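
    The sketch below illustrates the data/metadata split described in the abstract: file bytes stay on a user-controlled device while only descriptive metadata reaches a cloud-side index. The class and field names are assumptions for illustration and do not reflect Uno's real API.

        import hashlib, time

        class LocalDevice:
            def __init__(self, device_id):
                self.device_id = device_id
                self.blobs = {}                      # content hash -> bytes (stays local)

            def store(self, data: bytes) -> str:
                digest = hashlib.sha256(data).hexdigest()
                self.blobs[digest] = data
                return digest

        class CloudMetadataIndex:
            def __init__(self):
                self.entries = {}                    # path -> metadata only, no content

            def publish(self, path, digest, size, device_id):
                self.entries[path] = {
                    "hash": digest, "size": size,
                    "device": device_id, "mtime": time.time(),
                }

        phone, index = LocalDevice("phone-1"), CloudMetadataIndex()
        data = b"private photo bytes"
        digest = phone.store(data)                              # physical data stays on the device
        index.publish("/photos/img001.jpg", digest, len(data), phone.device_id)
        print(index.entries["/photos/img001.jpg"])              # the cloud sees metadata only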

    DataStations: ubiquitous transient storage for mobile users

    In this paper, we describe DataStations, an architecture that provides ubiquitous transient storage to arbitrary mobile applications. Mobile users can utilize a nearby DataStation as a proxy cache for their remote home file servers, as a file server to meet transient storage needs, and as a platform to share data and collaborate with other users over the wide area. A user can roam among DataStations, creating, updating and sharing files via a native file interface using a uniform file name space throughout. Our architecture provides transparent migration of file ownership and responsibility among DataStations and a user's home file server. This design not only ensures file permanence, but also allows DataStations to reclaim their resources autonomously, allowing the system to scale incrementally to a large number of DataStations and users. The unique aspects of our DataStation design are its decentralized but uniform name space, its locality-aware peer replication mechanism, and its highly flexible consistency framework that lets users select the appropriate consistency mechanism on a per-file-replica basis. Our evaluation demonstrates that DataStations can support low-latency access to remote files as well as ad-hoc data sharing and collaboration by mobile users, without compromising consistency or data safety.
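
    A minimal sketch of the ownership-migration idea, assuming hypothetical names: the uniform path never changes while responsibility for a file moves from the home server to a nearby DataStation and back when the station reclaims its resources.

        class FileHome:
            def __init__(self, path, owner):
                self.path = path          # uniform name, unchanged as ownership moves
                self.owner = owner        # node currently responsible for the file

            def migrate(self, new_owner):
                old, self.owner = self.owner, new_owner
                return f"{self.path}: ownership {old} -> {new_owner}"

        record = FileHome("/home/alice/paper.tex", owner="home-server")
        print(record.migrate("datastation-airport"))   # user roams to a nearby DataStation
        print(record.migrate("home-server"))           # station reclaims space, ownership returns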

    Flexible consistency for wide area peer replication

    The lack of a flexible consistency management solution hinders P2P implementation of applications involving updates, such as read-write file sharing, directory services, online auctions and wide-area collaboration. Managing mutable shared data in a P2P setting requires a consistency solution that can operate efficiently over variable-quality, failure-prone networks, support pervasive replication for scaling, and give peers the autonomy to tune consistency to their sharing needs and resource constraints. Existing solutions lack one or more of these features. In this paper, we describe a new consistency model for P2P sharing of mutable data called composable consistency, and outline its implementation in a wide-area middleware file service called Swarm. Composable consistency lets applications compose consistency semantics appropriate for their sharing needs by combining a small set of primitive options. Swarm implements these options efficiently to support scalable, pervasive, failure-resilient, wide-area replication behind a simple yet flexible interface. We present two applications to demonstrate the expressive power and effectiveness of composable consistency: a wide-area file system that outperforms Coda in providing close-to-open consistency over WANs, and a replicated BerkeleyDB database that reaps order-of-magnitude performance gains by relaxing consistency for queries and updates.
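
    The following sketch shows what composing a per-file policy from a handful of primitive options could look like; the option names are illustrative assumptions, not Swarm's actual primitives.

        from dataclasses import dataclass

        @dataclass(frozen=True)
        class ConsistencyPolicy:
            read_mode: str        # e.g. "latest" or "any-replica"
            write_mode: str       # e.g. "ordered" or "eventual"
            max_staleness_s: int  # bound on how stale a replica may be

        # Applications combine primitives per file to suit their sharing needs.
        CLOSE_TO_OPEN = ConsistencyPolicy("latest", "ordered", max_staleness_s=0)
        RELAXED_QUERY = ConsistencyPolicy("any-replica", "eventual", max_staleness_s=30)

        policies = {
            "/wanfs/src/main.c": CLOSE_TO_OPEN,    # file sharing wants close-to-open semantics
            "/db/reports.table": RELAXED_QUERY,    # read-mostly queries trade freshness for speed
        }
        print(policies["/db/reports.table"])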

    Khazana: a flexible wide area data store

    Khazana is a peer-to-peer data service that supports efficient sharing and aggressive caching of mutable data across the wide area while giving clients significant control over replica divergence. Previous work on wide-area replicated services focused on at most two of the following three properties: aggressive replication, customizable consistency, and generality. In contrast, Khazana provides scalable support for large numbers of replicas while giving applications considerable flexibility in trading off consistency for availability and performance. Its flexibility enables applications to effectively exploit inherent data locality while meeting their consistency needs. Khazana exports a file-system-like interface with a small set of consistency controls which can be combined to yield a broad spectrum of consistency flavors, ranging from strong consistency to best-effort eventual consistency. Khazana servers form failure-resilient dynamic replica hierarchies to manage replicas across variable-quality network links. In this report, we outline Khazana's design and show how its flexibility enables three diverse network services built on top of it to meet their individual consistency and performance needs: (i) a wide-area replicated file system that supports serializable writes as well as traditional file sharing across the wide area, (ii) an enterprise data service that exploits locality by caching enterprise data closer to end users while ensuring strong consistency for data integrity, and (iii) a replicated database that reaps order-of-magnitude gains in throughput by relaxing consistency.
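
    As a rough sketch of the dynamic replica hierarchies mentioned above (hypothetical structure, not Khazana's implementation), the toy below attaches replicas into a tree and floods an update along its edges.

        class ReplicaNode:
            def __init__(self, name):
                self.name, self.parent, self.children = name, None, []

            def attach(self, parent):
                self.parent = parent
                parent.children.append(self)

            def push_update(self, update, source=None):
                """Propagate an update to all neighbours except the one it came from."""
                for peer in ([self.parent] if self.parent else []) + self.children:
                    if peer is not source:
                        print(f"{self.name} -> {peer.name}: {update}")
                        peer.push_update(update, source=self)

        root = ReplicaNode("home-site")
        east, west = ReplicaNode("east-proxy"), ReplicaNode("west-proxy")
        east.attach(root); west.attach(root)
        client = ReplicaNode("branch-office"); client.attach(east)
        client.push_update("write(/enterprise/orders.db, v42)")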

    An Analysis of Merge Conflicts and Resolutions in Git-based Open Source Projects

    Version control systems such as Git support parallel collaborative work and have become very widespread in the open-source community. While Git offers some very interesting features, resolving conflicts that arise during synchronization of parallel changes is a time-consuming task. In this paper we present an analysis of concurrency and conflicts in the official Git repositories of four projects: Rails, IkiWiki, Samba and the Linux Kernel. We analyse the collaboration process of these projects at specific periods, revealing how change integration and conflict rates vary during the project development life-cycle. We also analyse how often users decide to roll back to a previous document version when the integration process generates conflicts. Finally, we discuss the mechanism adopted by Git to consider changes made on two adjacent lines as conflicting.
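
    To illustrate the line-adjacency behaviour discussed at the end of the abstract, the sketch below flags two concurrent hunks as conflicting when their line ranges overlap or touch adjacent lines. This is a simplified model for illustration, not Git's merge implementation.

        def hunks_conflict(a, b):
            """a and b are (start_line, end_line) ranges of concurrent edits."""
            (a_start, a_end), (b_start, b_end) = a, b
            # Overlapping ranges conflict; so do ranges on directly adjacent lines.
            return a_start <= b_end + 1 and b_start <= a_end + 1

        print(hunks_conflict((10, 12), (13, 15)))  # True: edits touch adjacent lines
        print(hunks_conflict((10, 12), (20, 25)))  # False: well-separated hunks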

    Merging Semantics for Conflict Updates in Geo-Distributed File Systems

    We present our model of file systems and our merging semantics for resolving conflicting updates in geo-distributed file systems. The system model fully describes a file system with all of its components, including hard links. This model is able to identify all conflict cases, which are classified into direct conflicts, such as concurrent updates to the same file, and indirect conflicts, such as cycles in the namespace of the file system. The merging semantics resolve all types of conflicts while preserving the effect of all conflicting updates. Our implementation of the system and the merging semantics outperforms existing systems in terms of feature completeness.
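
    As a small illustration of an indirect conflict (names and structure assumed here, not the paper's model), the check below detects a cycle in the directory namespace produced by two concurrent moves.

        def has_namespace_cycle(parent):
            """parent maps each directory to its parent directory ('/' is the root)."""
            for node in parent:
                seen, cur = set(), node
                while cur != "/":
                    if cur in seen:
                        return True          # a directory is (transitively) inside itself
                    seen.add(cur)
                    cur = parent[cur]
            return False

        # Replica 1 moved /a under /b; replica 2 concurrently moved /b under /a.
        merged = {"/a": "/b", "/b": "/a"}
        print(has_namespace_cycle(merged))   # True: indirect conflict, needs resolution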