A Study of Implementation Methodologies for Distributed Real Time Collaboration
Collaboration drives our world and is almost unavoidable in the programming industry. From higher education to the top technology companies, people work together to drive discovery and innovation, and software engineers collaborate with their peers daily to accomplish their goals. Many collaborative tools are available, such as Google Docs, Google Colab and Overleaf. Each of these tools uses the Operational Transform (OT) technique to implement its real-time collaboration functionality, and OT is found in most, if not all, major collaborative tools in the industry today. However, real-time collaboration can also be implemented with a data structure called the Conflict-free Replicated Data Type (CRDT), which has been claimed to be superior to OT. Previous studies have compared the theory behind OT and CRDTs but, as far as we know, none has compared the real-time collaboration performance of an OT implementation against a CRDT implementation in a widely used product such as Google Docs or Overleaf.
Our work focuses on comparing the real-time collaborative performance of OT and CRDTs in Overleaf, an academic authorship tool that allows easy collaboration on academic and professional papers. Overleaf's current published version implements real-time collaboration using operational transform. This thesis contributes an analysis of the current real-time collaboration performance of operational transform in Overleaf, an implementation of CRDTs for real-time collaboration in Overleaf, and an analysis of the performance of that CRDT implementation. It describes the main advantages and disadvantages of OT versus CRDTs and presents, to our knowledge, the first results of a non-theoretical attempt at implementing CRDTs to handle document edits in a collaborative environment that originally operated on an OT implementation.
A Web Component for Real-Time Collaborative Text Editing
Real-time collaborative software allows physically distinct people to co-operate by working on a shared application state, receiving updates from each other in real-time. The goal of this thesis was to create a developer tool, which would allow web application developers to easily integrate a collaborative text editor into their applications. In order to remain technology agnostic and to utilize the latest web standards, this product was implemented as a web component, a reusable user interface component built with native web browser features.
The main challenge in developing a real-time collaboration tool is the handling of concurrent updates, which might conflict with one another. To tackle this issue, many consistency maintenance algorithms have been presented in the academic literature. Most of these techniques are variations of two main approaches: operational transformation and commutative replicated data types. In this thesis, we reviewed some of these methods and chose the GOTO operational transformation algorithm to be implemented in our component.
Besides selecting and implementing an appropriate consistency maintenance technique, the contributions of this thesis include the design of an easy-to-use application programming interface (API). Our solution also fulfills some practical requirements of group editors not covered by the consistency maintenance theory, such as session management and cleaning of the message queue. The created web component succeeds in encapsulating the complexity related to concurrency control and handling of joining peers in the client-side implementation, which allows the application logic to remain simple. This open-source product enables software developers to add a collaborative text editor to their web applications by broadcasting the updates provided by an event-based API to participating peers.
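To illustrate the kind of conflict that operational transformation resolves, the following is a minimal sketch of transforming two concurrent insert operations so that both replicas converge. It is not the GOTO algorithm chosen in the thesis (GOTO also reorders operation histories); the `Insert` class, the `site` tie-breaker, and the function names are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # index at which the text is inserted
    text: str   # inserted string
    site: int   # hypothetical replica id, used only to break position ties


def apply(doc: str, op: Insert) -> str:
    """Apply an insert operation to a document string."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, other: Insert) -> Insert:
    """Rewrite `op` so it can be applied after the concurrent `other`.

    If `other` inserted at an earlier position (or at the same position
    with a lower site id), `op` is shifted right by the inserted length.
    """
    if other.pos < op.pos or (other.pos == op.pos and other.site < op.site):
        return Insert(op.pos + len(other.text), op.text, op.site)
    return op


base = "abc"
a = Insert(1, "X", site=1)   # generated at replica 1
b = Insert(2, "Y", site=2)   # generated concurrently at replica 2

# Each replica applies its own operation first, then the transformed remote one.
doc_a = apply(apply(base, a), transform(b, a))
doc_b = apply(apply(base, b), transform(a, b))
print(doc_a, doc_b)
```

Both replicas end with the same text even though they applied the operations in different orders, which is the convergence property a consistency maintenance algorithm has to guarantee.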
Collabs: A Flexible and Performant CRDT Collaboration Framework
A collaboration framework is a distributed system that serves as the data
layer for a collaborative app. Conflict-free Replicated Data Types (CRDTs) are
a promising theoretical technique for implementing collaboration frameworks.
However, existing frameworks are inflexible: they are often one-off
implementations of research papers or only permit a restricted set of CRDT
semantics, and they do not allow app-specific optimizations. Until now, there
was no general framework that lets programmers mix, match, and modify CRDTs.
We solve this with Collabs, a CRDT-based collaboration framework that lets
programmers implement their own CRDTs, either from-scratch or by composing
existing building blocks. Collabs prioritizes both semantic flexibility and
performance flexibility: it allows arbitrary app-specific CRDT behaviors and
optimizations, while still providing strong eventual consistency. We
demonstrate Collabs's capabilities and programming model with example apps and
CRDT implementations. We then show that a collaborative rich-text editor using
Collabs's built-in CRDTs can scale to over 100 simultaneous users, unlike
existing CRDT frameworks and Google Docs. Collabs also has lower end-to-end
latency and server CPU usage than a popular Operational Transformation
framework, with acceptable CRDT metadata overhead.
Comment: 18 pages, 19 figures
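As a rough illustration of the strong eventual consistency that CRDTs provide, the sketch below implements a grow-only counter, one of the simplest state-based CRDTs. It does not use the Collabs API; the `GCounter` class and its method names are assumptions made for this example.

```python
class GCounter:
    """Grow-only counter: each replica increments only its own slot."""

    def __init__(self, replica_id: str):
        self.replica_id = replica_id
        self.counts: dict[str, int] = {}

    def increment(self, n: int = 1) -> None:
        # Local update: only this replica's entry ever changes here.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def merge(self, other: "GCounter") -> None:
        # Element-wise maximum is commutative, associative, and idempotent,
        # so replicas converge regardless of merge order or repetition.
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())


alice, bob = GCounter("alice"), GCounter("bob")
alice.increment(2)   # concurrent local updates, no coordination
bob.increment(3)

alice.merge(bob)     # states exchanged in either order
bob.merge(alice)
bob.merge(alice)     # a duplicate merge changes nothing
print(alice.value(), bob.value())
```

Because the merge needs no central server to order updates, replicas that have received the same set of updates hold the same state, which is the guarantee the abstract refers to.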
Privacy-preserving decentralised collaborative applications
Cloud-based applications are problematic from a privacy perspective because they typically have access to large amounts of user data and metadata. This centralisation of user data creates an attractive target for actors such as criminals, suppressive governments, and companies selling the data. At the same time, the popularity of mobile and web applications has led to a growing amount of sensitive data being stored in the cloud.
This dissertation focuses on collaborative applications, such as Google Docs and Microsoft Office Online, where users currently rely on cloud-based solutions. It explores decentralised alternatives that allow the use of end-to-end encryption and anonymous communication systems to improve both information privacy and communication privacy.
One approach for a collaborative application to synchronise data in a privacy-preserving way is to use Tor hidden services, providing end-to-end encrypted communication, while also hiding collaborators’ identity. However, running Tor comes at a cost. We explore the costs of running a hidden service on a smartphone. Smartphones are nowadays the most frequently used computing devices, but they are also relatively resource-constrained. We build an empirical model of monthly cellular data traffic, and estimate a median 198 MiB for a typical user. We further estimate that the network activity would cost at least 9.6% of daily battery capacity on a Nexus One using 3G Internet. We explore four optimisations that, in combination, reduce the estimated median data cost to 61 MiB.
We also consider the security and privacy properties of decentralised collaborative applications, and explore a challenge that is introduced by a decentralised design – the lack of a trusted server guaranteeing consistency between collaborators. We present a novel snapshot protocol that ensures consistency, whilst allowing the past edit history to be hidden from new collaborators, and without relying on a consensus mechanism.
Lastly, we evaluate the overhead of the snapshot protocol by replaying editing histories from 270 Wikipedia articles, and demonstrate how its correctness and security properties are achieved. Assuming the number of collaborators remains small, the protocol is scalable in terms of CPU, memory, and network usage. It substantially reduces the amount of data transferred to a new collaborator compared to a basic protocol that transmits the full history. The computational cost is in the order of milliseconds per operation, indicating the protocol is suitable for applications where the rate of edits is relatively low.
Funding was provided by Microsoft Research, The Boeing Company, and the Computer Laboratory.