
1,646 research outputs found

Merging Semantics for Conflict Updates in Geo-Distributed File Systems

Author: Rancurel Vianney
Shapiro Marc
Tao Thanh Vinh
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2015

We present our file system model and our merging semantics for resolving conflicting updates in geo-distributed file systems. The system model fully describes a file system with all of its components, including hard links. The model identifies all conflict cases, which are classified as direct, such as concurrent updates to the same file, or indirect, such as cycles in the file system namespace. The merging semantics resolve all types of conflicts while preserving the effect of all conflicting updates. Our implementation of the system and the merging semantics outperforms existing systems in terms of feature completeness.
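The indirect conflict mentioned in the abstract above (a cycle in the namespace) can be illustrated with a small sketch. It is not the authors' model or merging semantics: the parent-pointer dictionary, the `has_cycle` helper, and the naive merge are all assumptions made for illustration. Two directory moves that are each valid on their own replica produce a cycle when both are applied.

```python
def has_cycle(parent, start):
    """Return True if following parent pointers from `start` revisits a directory."""
    seen = set()
    node = start
    while node is not None:
        if node in seen:
            return True
        seen.add(node)
        node = parent.get(node)
    return False


# Hypothetical namespace: directories "a" and "b" both sit under the root "/".
initial = {"/": None, "a": "/", "b": "/"}

# Replica 1 moves a under b; replica 2 concurrently moves b under a.
replica1 = dict(initial, a="b")
replica2 = dict(initial, b="a")

# Naive merge (an assumption for illustration): apply both moves as they are.
naive_merge = dict(initial)
naive_merge["a"] = replica1["a"]
naive_merge["b"] = replica2["b"]

# Each replica is cycle-free in isolation; the merged namespace is not.
each_ok = not has_cycle(replica1, "a") and not has_cycle(replica2, "b")
merged_cycle = has_cycle(naive_merge, "a")
```

A merge procedure therefore has to inspect the combined namespace, because neither update is a conflict when looked at alone.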

INRIA a CCSD electronic archive server

SwiftCloud: Fault-Tolerant Geo-Replication Integrated all the Way to the Client Machine

Author: Balegas Valter
Baquero Carlos
Bieniusa Annette
Duarte Sérgio
Preguiça Nuno
Shapiro Marc
Zawirski Marek
Publication date: 06/10/2013

Client-side logic and storage are increasingly used in web and mobile applications to improve response time and availability. Current approaches tend to be ad hoc and poorly integrated with the server-side logic. We present a principled approach to integrating client- and server-side storage. We support mergeable and strongly consistent transactions that target either client or server replicas and provide efficient access to causally consistent snapshots. In the presence of infrastructure faults, a client-assisted failover solution allows client execution to resume immediately and to access consistent snapshots seamlessly, without waiting. We implement this approach in SwiftCloud, the first transactional system to bring geo-replication all the way to the client machine. Example applications show that our programming model is useful across a range of application areas. Our experimental evaluation shows that SwiftCloud provides better fault tolerance and, at the same time, can improve both latency and throughput by up to an order of magnitude compared to classical geo-replication techniques.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

CRDTs for truly concurrent file systems

Author: Nguyen Thuy Linh
Shapiro Marc
Vaillant Romain
Vasilas Dimitrios
Publication venue: HAL CCSD
Publication date: 27/07/2021

Building scalable and highly available geo-replicated file systems is hard. These systems need to resolve conflicts that emerge from concurrent operations in a way that maintains file system invariants, is meaningful to the user, and does not depart from the traditional file system interface. Conflict resolution in existing systems often leads to unexpected or inconsistent results. This paper introduces ElmerFS, a geo-replicated, truly concurrent file system designed to address these challenges. ElmerFS is based on two key ideas: (1) the use of Conflict-Free Replicated Data Types (CRDTs) to represent file system structures, which ensures that replicas converge to a correct state, and (2) conflict resolution rules, determined by the choice of CRDT types and their composition, that are designed to be intuitive to the user. We argue that if the state of the file system after resolving a conflict conveys the resolved conflict to the user in an intuitive way, the user can complement or reverse it using traditional file system operations. We discuss the challenges in the design of geo-replicated, weakly consistent file systems, and present the design of ElmerFS.
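As a rough illustration of how a CRDT can make replicas of a directory converge, the following sketch implements a generic add-wins observed-remove set. The abstract does not say which CRDT types ElmerFS uses, so the `ORSet` class, its method names, and the file name are assumptions made for illustration only.

```python
import uuid


class ORSet:
    """Minimal state-based observed-remove set (add wins over concurrent remove)."""

    def __init__(self):
        self.adds = set()     # (name, unique tag) pairs
        self.removed = set()  # tags that some replica has observed and removed

    def add(self, name):
        # Every add gets a fresh tag, so a later remove cannot cancel it by accident.
        self.adds.add((name, uuid.uuid4().hex))

    def remove(self, name):
        # Remove only the tags this replica has seen for the name.
        self.removed |= {tag for (n, tag) in self.adds if n == name}

    def merge(self, other):
        # Set union is commutative, associative, and idempotent.
        self.adds |= other.adds
        self.removed |= other.removed

    def names(self):
        return {n for (n, tag) in self.adds if tag not in self.removed}


# Both replicas start with the same directory entry.
a = ORSet()
a.add("report.txt")
b = ORSet()
b.merge(a)

# Concurrently: replica a deletes the entry, replica b creates it again.
a.remove("report.txt")
b.add("report.txt")

# After exchanging state, both replicas agree and the concurrent add survives.
a.merge(b)
b.merge(a)
converged = a.names() == b.names() == {"report.txt"}
```

In this sketch the entry stays visible after the merge, so a user who wanted the deletion can simply delete it again with an ordinary file system operation.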

INRIA a CCSD electronic archive server

Eventual Consistency: Origin and Support

Author: Bernabéu-Aubán José M.
García-Escrivá José-Ramón
González de Mendívil José Ramón
Muñoz-Escoí Francesc D.
Sendra-Roig Juan Salvador
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 21/11/2018

Eventual consistency is nowadays in demand in geo-replicated services that need to be highly scalable and available. According to the CAP constraints, when network partitions may arise, a distributed service must choose between being strongly consistent and being highly available. Since scalable services should be available, a relaxed consistency (while the network is partitioned) is the preferred choice. Eventual consistency is not a common data-centric consistency model, but only a state convergence condition to be added to a relaxed consistency model. Several aspects of eventual consistency have not been analysed in depth in previous work: (1) which are the oldest replication proposals providing eventual consistency, (2) which replica consistency models provide the best basis for building eventually consistent services, (3) which mechanisms should be considered for implementing an eventually consistent service, and (4) which are the best combinations of those mechanisms for achieving different concrete goals. This paper provides some notes on these important topics.
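The idea of eventual consistency as a state convergence condition can be made concrete with a small sketch. It uses a grow-only counter, a standard textbook CRDT that is not taken from the paper; the function names and replica identifiers are assumptions chosen for illustration.

```python
def increment(state, replica_id, amount=1):
    """Return a new state in which `replica_id` has counted `amount` more events."""
    new_state = dict(state)
    new_state[replica_id] = new_state.get(replica_id, 0) + amount
    return new_state


def merge(a, b):
    """Combine two states by taking the per-replica maximum."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}


def value(state):
    return sum(state.values())


# Two replicas accept updates independently, for example during a partition.
r1 = increment({}, "r1", 2)
r2 = increment({}, "r2", 3)

# Once they exchange state, the result does not depend on the merge order,
# and merging the same state again changes nothing.
merged_12 = merge(r1, r2)
merged_21 = merge(r2, r1)
same_regardless_of_order = merged_12 == merged_21
idempotent = merge(merged_12, r1) == merged_12
total = value(merged_12)
```

Convergence here says nothing about what a replica may observe before the states are exchanged, which is why it has to be combined with some underlying consistency model.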

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Nested Pure Operation-Based CRDTs

Author: Bauwens Jim
Gonzalez Boix Elisa
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 37th European Conference on Object-Oriented Programming (ECOOP 2023)
Publication date: 01/01/2023

Modern distributed applications increasingly replicate data to guarantee high availability and optimal user experience. Conflict-free Replicated Data Types (CRDTs) are a family of data types specially designed for highly available systems that guarantee some form of eventual consistency. Designing CRDTs is very difficult because it requires devising designs that guarantee convergence in the presence of conflicting operations. Even though design patterns and structured frameworks have emerged to aid developers with this problem, they mostly focus on statically structured data; nesting and dynamically changing the structure of a CRDT remains an open issue. This paper explores support for nested CRDTs in a structured and systematic way. To this end, we define an approach for building nested CRDTs based on the work on pure operation-based CRDTs, resulting in nested pure operation-based CRDTs. We add constructs to control the nesting of CRDTs into a pure operation-based CRDT framework and show how several well-known CRDT designs can be defined in our framework. We provide an implementation of nested pure operation-based CRDTs as an extension to Flec, an existing TypeScript-based framework for pure operation-based CRDTs. We validate our approach (1) by implementing a portfolio of nested data structures, (2) by implementing and verifying our approach in the VeriFx language, and (3) by implementing a real-world application scenario and comparing its network usage against an implementation in the closest related work, Automerge. We show that the framework is general enough to nest well-known CRDT designs like maps and lists, and that its performance in terms of network traffic is comparable to the state of the art.

Dagstuhl Research Online Publication Server

A Semantic Consistency Model to Reduce Coordination in Replicated Systems

Author: Gomes Nuno Filipe Estêvão
Publication date: 01/02/2021

Large-scale distributed applications need to be available and responsive to satisfy millions of users, which can be achieved by geo-replicating data across multiple replicas. However, a partitioned system cannot fully sustain both availability and consistency. The use of weak consistency models might lead to data integrity violations, triggered by problematic concurrent updates, such as selling the last ticket twice in an airline's booking service. To overcome possible conflicts, programmers might opt to apply strong consistency, which guarantees a total order between operations while preserving data integrity. Nevertheless, maintaining the illusion of a non-replicated system affects availability. In contrast, weaker notions might be used, such as eventual consistency, which boosts responsiveness, as operations are executed directly at the source replica and their effects are propagated to remote replicas in the background. However, this approach might put data integrity at risk. Current protocols that preserve invariants rely on, at least, causal consistency, a consistency model that maintains causal dependencies between operations. In this dissertation, we propose a protocol that includes a semantic consistency model. This consistency model stands between eventual consistency and causal consistency. We guarantee better performance compared with causal consistency, and ensure data integrity. Through semantic analysis, relying on the static analysis tool CISE3, we manage to limit the maximum number of dependencies that each operation will have. To support the protocol, we developed a communication algorithm for a cluster of replicas. Additionally, we present an architecture that uses Akka, an actor-based middleware in which actors communicate by exchanging messages. This architecture adopts the publish/subscribe pattern and includes data persistence.
We also consider the stability of operations, as well as a dynamic cluster environment, ensuring the convergence of the replicated state. Finally, we perform an experimental evaluation of the performance of the algorithm using standard case studies. The evaluation confirms that, by relying on semantic analysis, the system requires less coordination between the replicas than causal consistency, while ensuring data integrity.
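The ticket example from the abstract above can be sketched in a few lines. This is not the protocol proposed in the dissertation: the `Replica` class, the replica names, and the buyers are hypothetical, and the sketch only shows how an invariant that holds locally (tickets never negative) can break once concurrent updates are merged without coordination.

```python
class Replica:
    """Hypothetical replica that sells tickets against its local state only."""

    def __init__(self, tickets):
        self.tickets = tickets
        self.sold_log = []  # local sales, propagated to other replicas later

    def sell(self, buyer):
        # The invariant check sees only the local state.
        if self.tickets > 0:
            self.tickets -= 1
            self.sold_log.append(buyer)
            return True
        return False

    def apply_remote(self, remote_sales):
        # Background propagation: apply the effects of sales made elsewhere.
        self.tickets -= len(remote_sales)


# Both replicas know about the single remaining ticket.
lisbon = Replica(tickets=1)
paris = Replica(tickets=1)

# Concurrent sales: each local check passes.
sold_in_lisbon = lisbon.sell("alice")
sold_in_paris = paris.sell("bob")

# The replicas then exchange their effects and converge.
lisbon.apply_remote(paris.sold_log)
paris.apply_remote(lisbon.sold_log)

# Both replicas agree on the state, but the state violates tickets >= 0.
invariant_holds = lisbon.tickets >= 0 and paris.tickets >= 0
```

The replicas converge, so eventual consistency is satisfied, yet the ticket was sold twice; preventing this requires some coordination for the conflicting pair of operations.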

Repositório da Universidade Nova de Lisboa

Invariant Safety for Distributed Applications

Author: Nair Sreeja
Petri Gustavo
Shapiro Marc
Publication date: 01/01/2019

We study a proof methodology for verifying the safety of data invariants of highly-available distributed applications that replicate state. The proof is (1) modular: one can reason about each individual operation separately, and (2) sequential: one can reason about a distributed application as if it were sequential. We automate the methodology and illustrate the use of the tool with a representative example.

Comment: Workshop on Principles and Practice of Consistency for Distributed Data (PaPoC), Mar 2019, Dresden, Germany. https://novasys.di.fct.unl.pt/conferences/papoc19

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server