33 research outputs found

    Building a collaborative peer-to-peer wiki system on a structured overlay

    The ever-growing demand for digital information raises the need for content distribution architectures providing high storage capacity, data availability and good performance. While many simple solutions for scalable distribution of quasi-static content exist, there are still no approaches that can ensure both scalability and consistency for the case of highly dynamic content, such as the data managed inside wikis. We propose a peer-to-peer solution for distributing and managing dynamic content that combines two widely studied technologies: Distributed Hash Tables (DHT) and optimistic replication. In our “universal wiki” engine architecture (UniWiki), on top of a reliable, inexpensive and consistent DHT-based storage, any number of front-ends can be added, ensuring both read and write scalability, as well as suitability for large-scale scenarios. The implementation is based on Damon, a distributed AOP middleware, thus separating distribution, replication, and consistency responsibilities, and also making our system transparently usable by third-party wiki engines. Finally, UniWiki has been shown to be viable and fairly efficient in large-scale scenarios.

    UniWiki: A Reliable and Scalable Peer-to-Peer System for Distributing Wiki Applications

    The ever-growing demand for digital information raises the need for content distribution architectures providing high storage capacity, data availability and good performance. While many simple solutions for scalable distribution of quasi-static content exist, there are still no approaches that can ensure both scalability and consistency for the case of highly dynamic content, such as the data managed inside wikis. In this paper, we propose a peer-to-peer solution for distributing and managing dynamic content that combines two widely studied technologies: distributed hash tables (DHT) and optimistic replication. In our “universal wiki” engine architecture (UniWiki), on top of a reliable, inexpensive and consistent DHT-based storage, any number of front-ends can be added, ensuring both read and write scalability. The implementation is based on a Distributed Interception Middleware, thus separating distribution, replication, and consistency responsibilities, and also making our system transparently usable by third-party wiki engines. UniWiki has been shown to be viable and fairly efficient in large-scale scenarios.

    Remove-Win: a Design Framework for Conflict-free Replicated Data Collections

    Internet-scale distributed systems often replicate data within and across data centers to provide low latency and high availability despite node and network failures. Replicas are required to accept updates without coordinating with each other, and the updates are then propagated asynchronously. This raises the issue of conflict resolution among concurrent updates, which is often challenging and error-prone. The Conflict-free Replicated Data Type (CRDT) framework provides a principled approach to address this challenge. This work focuses on a special type of CRDT, namely the Conflict-free Replicated Data Collection (CRDC), e.g. list and queue. A CRDC can have complex and compound data items, which are organized in structures with rich semantics. Complex CRDCs can greatly ease the development of upper-layer applications, but they also make conflict resolution notoriously difficult. This explains why existing CRDC designs are tricky and hard to generalize to other data types. A design framework is greatly needed to guide the systematic design of new CRDCs. To address the challenges above, we propose the Remove-Win Design Framework. The remove-win strategy for conflict resolution is simple but powerful. The remove operation just wipes out the data item, no matter how complex the value is. The user of the CRDC only needs to specify conflict resolution for non-remove operations. This resolution is decomposed into three basic cases, which are left as open terms in the CRDC design skeleton. Stubs containing user-specified conflict resolution logic are plugged into the skeleton to obtain concrete CRDC designs. We demonstrate the effectiveness of our design framework via a case study of designing a conflict-free replicated priority queue. Performance measurements also show the efficiency of the design derived from our design framework. Comment: revised after submission

    Shelfaware: Accelerating Collaborative Awareness with Shelf CRDT

    Collaboration has become a key feature of modern software, allowing teams to work together effectively in real-time while in different locations. In order for a user to communicate their intention to several distributed peers, computing devices must exchange high-frequency updates with transient metadata like mouse position, text range highlights, and temporary comments. Current peer-to-peer awareness solutions have high time and space complexity due to the ever-expanding logs that each client must maintain in order to ensure robust collaboration in eventually consistent environments. This paper proposes an awareness Conflict-Free Replicated Data Type (CRDT) library that provides the tooling to support an eventually consistent, decentralized, and robust multi-user collaborative environment. Our library is tuned for rapid iterative updates that communicate fine-grained user actions across a network of collaborators. Our approach holds memory constant for subsequent writes to an existing key on a shared resource and completely prunes stale data from shared documents. These features allow us to keep the CRDT's memory footprint small, making it a feasible solution for memory-constrained applications. Results show that our CRDT implementation is comparable to or exceeds the performance of similar data structures in high-frequency read/write scenarios.
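
The constant-memory property described in this abstract can be illustrated with a generic last-writer-wins map, where each key holds exactly one stamped value instead of an update log. This is only a minimal sketch under that assumption: the class `LWWMap`, its methods, and the `(counter, replica_id)` stamp are hypothetical names chosen for illustration and are not the Shelf library's API or its actual algorithm.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple

Stamp = Tuple[int, str]  # (per-key counter, replica id); hypothetical ordering scheme


@dataclass
class LWWMap:
    """Illustrative last-writer-wins map; NOT the Shelf library's API."""
    replica_id: str
    # key -> (stamp, value): one entry per key, so rewriting a key never grows memory
    entries: Dict[str, Tuple[Stamp, Any]] = field(default_factory=dict)

    def set(self, key: str, value: Any) -> None:
        # Bump the counter past whatever this replica has seen for the key.
        counter = self.entries[key][0][0] + 1 if key in self.entries else 1
        self.entries[key] = ((counter, self.replica_id), value)

    def get(self, key: str) -> Any:
        return self.entries[key][1]

    def merge(self, other: "LWWMap") -> None:
        # Keep the entry with the larger stamp; replica id breaks counter ties.
        for key, (stamp, value) in other.entries.items():
            if key not in self.entries or stamp > self.entries[key][0]:
                self.entries[key] = (stamp, value)


if __name__ == "__main__":
    alice, bob = LWWMap("alice"), LWWMap("bob")
    alice.set("cursor", 10)
    alice.set("cursor", 11)   # overwrites in place, no log entry added
    bob.set("cursor", 42)
    alice.merge(bob)
    bob.merge(alice)
    print(alice.get("cursor"), bob.get("cursor"), len(alice.entries))
```

Because the merge keeps the maximum stamp per key, it is commutative and idempotent, so replicas converge regardless of the order in which they exchange state.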

    XWiki Concerto: A P2P Wiki System Supporting Disconnected Work

    This paper presents the XWiki Concerto system, the P2P version of the XWiki server. This system is based on replicating wiki pages on a network of wiki servers. The approach, based on the Woot algorithm, has been designed to be scalable and to support the dynamic nature of P2P networks as well as network partitions. These characteristics make our system capable of supporting disconnected editing and sub-groups, making it very flexible and usable.

    Logoot: A Scalable Optimistic Replication Algorithm for Collaborative Editing on P2P Networks

    Massive collaborative editing has become a reality through leading projects such as Wikipedia. This massive collaboration is currently supported with a costly central service. In order to avoid such costs, we aim to provide a peer-to-peer collaborative editing system. Existing approaches to building distributed collaborative editing systems do not scale either in terms of number of users or in terms of number of edits. We present the Logoot approach, which scales in both of these dimensions while ensuring the causality, consistency and intention preservation criteria. We evaluate the Logoot approach and compare it to others using a corpus of all the edits applied to a set of the most edited and the biggest pages of Wikipedia.

    A Comparison of Optimistic Approaches to Collaborative Editing of Wiki Pages

    Wikis, a popular tool for sharing knowledge, are basically collaborative editing systems. However, existing wiki systems offer limited support for co-operative authoring, and they do not scale well, because they are based on a centralised architecture. This paper compares the well-known centralised MediaWiki system with several peer-to-peer approaches to editing of wiki pages: an operational transformation approach (MOT2), a commutativity-oriented approach (WOOTO) and a conflict resolution approach (ACF). We evaluate and compare them according to a number of qualitative and quantitative metrics.

    The Art of the Fugue: Minimizing Interleaving in Collaborative Text Editing

    Most existing algorithms for replicated lists, which are widely used in collaborative text editors, suffer from a problem: when two users concurrently insert text at the same position in the document, the merged outcome may interleave the inserted text passages, resulting in corrupted and potentially unreadable text. The problem has gone unnoticed for decades, and it affects both CRDTs and Operational Transformation. This paper defines maximal non-interleaving, our new correctness property for replicated lists. We introduce two related CRDT algorithms, Fugue and FugueMax, and prove that FugueMax satisfies maximal non-interleaving. We also implement our algorithms and demonstrate that Fugue offers performance comparable to state-of-the-art CRDT libraries for text editing. Comment: 16 pages, 10 figures

    Verifying Strong Eventual Consistency in Distributed Systems

    Data replication is used in distributed systems to maintain up-to-date copies of shared data across multiple computers in a network. However, despite decades of research, algorithms for achieving consistency in replicated systems are still poorly understood. Indeed, many published algorithms have later been shown to be incorrect, even some that were accompanied by supposed mechanised proofs of correctness. In this work, we focus on the correctness of Conflict-free Replicated Data Types (CRDTs), a class of algorithms that provides strong eventual consistency guarantees for replicated data. We develop a modular and reusable framework in the Isabelle/HOL interactive proof assistant for verifying the correctness of CRDT algorithms. We avoid correctness issues that have dogged previous mechanised proofs in this area by including a network model in our formalisation, and proving that our theorems hold in all possible network behaviours. Our axiomatic network model is a standard abstraction that accurately reflects the behaviour of real-world computer networks. Moreover, we identify an abstract convergence theorem, a property of order relations, which provides a formal definition of strong eventual consistency. We then obtain the first machine-checked correctness theorems for three concrete CRDTs: the Replicated Growable Array, the Observed-Remove Set, and an Increment-Decrement Counter. We find that our framework is highly reusable, developing proofs of correctness for the latter two CRDTs in a few hours and with relatively little CRDT-specific code.
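
To make the convergence property concrete, here is a minimal sketch of an increment-decrement counter, the simplest of the three CRDTs named in this abstract. It assumes a state-based design with per-replica tallies merged by element-wise maximum, which is a common textbook formulation and not the paper's Isabelle/HOL formalisation; the class name `PNCounter` and its methods are illustrative choices.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PNCounter:
    """Illustrative state-based increment-decrement counter (assumed design)."""
    replica_id: str
    incs: Dict[str, int] = field(default_factory=dict)  # replica id -> total increments
    decs: Dict[str, int] = field(default_factory=dict)  # replica id -> total decrements

    def increment(self, n: int = 1) -> None:
        # A replica only ever raises its own slot, so slots grow monotonically.
        self.incs[self.replica_id] = self.incs.get(self.replica_id, 0) + n

    def decrement(self, n: int = 1) -> None:
        self.decs[self.replica_id] = self.decs.get(self.replica_id, 0) + n

    def value(self) -> int:
        return sum(self.incs.values()) - sum(self.decs.values())

    def merge(self, other: "PNCounter") -> None:
        # Element-wise max is commutative, associative and idempotent,
        # which is what lets replicas converge in any delivery order.
        for rid, count in other.incs.items():
            self.incs[rid] = max(self.incs.get(rid, 0), count)
        for rid, count in other.decs.items():
            self.decs[rid] = max(self.decs.get(rid, 0), count)


if __name__ == "__main__":
    a, b = PNCounter("a"), PNCounter("b")
    a.increment(3)
    b.decrement(1)
    a.merge(b)
    b.merge(a)
    print(a.value(), b.value())
```

Once both replicas have exchanged state, they report the same value, and merging the same state again changes nothing. That is the behaviour strong eventual consistency requires of replicas that have seen the same updates.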

    Key-CRDT stores

    Dissertation submitted for the degree of Master in Computer Engineering. The Internet has opened opportunities to create world-scale services. These systems require high availability and fault tolerance, while preserving low latency. Replication is a widely adopted technique to provide these properties. Different replication techniques have been proposed through the years, but to support these properties for world-scale services it is necessary to trade consistency for availability, fault tolerance and low latency. In weak consistency models, it is necessary to deal with possible conflicts arising from concurrent updates. We propose the use of conflict-free replicated data types (CRDTs) to address this issue. Cloud computing systems support world-scale services, often relying on Key-Value stores for storing data. These systems partition and replicate data over multiple nodes that can be geographically dispersed over the network. To handle conflicts, these systems either rely on solutions that lose updates (e.g. last-write-wins) or require the application to handle concurrent updates. Additionally, these systems provide little support for transactions, a widely used abstraction for data access. In this dissertation, we present the design and implementation of SwiftCloud, a Key-CRDT store that extends a Key-Value store by incorporating CRDTs in the system’s data model. The system provides automatic conflict resolution relying on properties of CRDTs. We also present a version of SwiftCloud that supports transactions. Unlike traditional transactional systems, transactions never abort due to write/write conflicts, as the system leverages CRDT properties to merge concurrent transactions. For implementing SwiftCloud, we have introduced a set of new techniques, including versioned CRDTs, composition of CRDTs and alternative serialization methods. The evaluation of the system, with both micro-benchmarks and the TPC-W benchmark, shows that SwiftCloud imposes little overhead over a key-value store. Allowing clients to access a datacenter close to them with SwiftCloud can reduce latency without requiring any complex reconciliation mechanism. The experience of using SwiftCloud has shown that adapting an existing application to use SwiftCloud requires low effort. Project PTDC/EIA-EIA/108963/200