136 research outputs found

    Invariant Safety for Distributed Applications

    Get PDF
    We study a proof methodology for verifying the safety of data invariants of highly-available distributed applications that replicate state. The proof is (1) modular: one can reason about each individual operation separately, and (2) sequential: one can reason about a distributed application as if it were sequential. We automate the methodology and illustrate the use of the tool with a representative example.Comment: Workshop on Principles and Practice of Consistency for Distributed Data (PaPoC), Mar 2019, Dresden, Germany. https://novasys.di.fct.unl.pt/conferences/papoc19

    Conflict-Free Replicated Data Types in Dynamic Environments

    Get PDF
    Over the years, mobile devices have become increasingly popular and gained improved computation capabilities allowing them to perform more complex tasks such as collaborative applications. Given the weak characteristic properties of mobile networks, which represent highly dynamic environments where users may experience regular involuntary disconnection periods, the big question arises of how to maintain data consistency. This issue is most pronounced in collaborative environments where multiple users interact with each other, sharing a replicated state that may diverge due to concurrency conflicts and loss of updates. To maintain consistency, one of today’s best solutions is Conflict-Free Replicated Data Types (CRDTs), which ensure low latency values and automatic conflict resolution, guaranteeing eventual consistency of the shared data. However, a limitation often found on CRDTs and the systems that employ them is the need for the knowledge of the replicas whom the state changes must be disseminated to. This constitutes a problem since it is inconceivable to maintain said knowledge in an environment where clients may leave and join at any given time and consequently get disconnected due to mobile network communications unreliability. In this thesis, we present the study and extension of the CRDT concept to dynamic environments by introducing the developed P/S-CRDTs model, where CRDTs are coupled with the publisher/subscriber interaction scheme and additional mechanisms to ensure users are able to cooperate and maintain consistency whilst accounting for the consequent volatile behaviors of mobile networks. The experimental results show that in volatile scenarios of disconnection, mobile users in collaborative activity maintain consistency among themselves and when compared to other available CRDT models, the P/S-CRDTs model is able to decouple the required knowledge of whom the updates must be disseminated to, while ensuring appropriate network traffic values

    Key-CRDT stores

    Get PDF
    Dissertação para obtenção do Grau de Mestre em Engenharia InformáticaThe Internet has opened opportunities to create world scale services. These systems require highavailability and fault tolerance, while preserving low latency. Replication is a widely adopted technique to provide these properties. Different replication techniques have been proposed through the years, but to support these properties for world scale services it is necessary to trade consistency for availability, fault-tolerance and low latency. In weak consistency models, it is necessary to deal with possible conflicts arising from concurrent updates. We propose the use of conflict free replicated data types (CRDTs) to address this issue. Cloud computing systems support world scale services, often relying on Key-Value stores for storing data. These systems partition and replicate data over multiple nodes, that can be geographically disperse over the network. For handling conflict, these systems either rely on solutions that lose updates (e.g. last-write-wins) or require application to handle concurrent updates. Additionally, these systems provide little support for transactions, a widely used abstraction for data access. In this dissertation, we present the design and implementation of SwiftCloud, a Key-CRDT store that extends a Key-Value store by incorporating CRDTs in the system’s data-model. The system provides automatic conflict resolution relying on properties of CRDTs. We also present a version of SwiftCloud that supports transactions. Unlike traditional transactional systems, transactions never abort due to write/write conflicts, as the system leverages CRDT properties to merge concurrent transactions. For implementing SwiftCloud, we have introduced a set of new techniques, including versioned CRDTs, composition of CRDTs and alternative serialization methods. The evaluation of the system, with both micro-benchmarks and the TPC-W benchmark, shows that SwiftCloud imposes little overhead over a key-value store. Allowing clients to access a datacenter close to them with SwiftCloud, can reduce latency without requiring any complex reconciliation mechanism. The experience of using SwiftCloud has shown that adapting an existing application to use SwiftCloud requires low effort.Project PTDC/EIA-EIA/108963/200

    Continious Synchronization of Conflict-Free Replicated Relations

    Get PDF
    Local-first software is an attempt to use the benefits of cloud service while reducing its drawbacks. Local-first software gives the clients ownership and control of their data and makes the service always available. It is achieved by having the primary copy of the service at the client. The most common way to implement local-first software is by utilizing Conflict-free Replicated Datatypes or CRDTs, which are conflict-free by design. Conflict-free Replicated Relations or CRR are CRDT applied to SQL databases. CRR systems require some form of communication middleware to propagate its state so that each site can converge to a common state. Earlier CRR systems have used SSH File Transfer Protocol to propagate states or SFTP, which means writing to disk many times. This thesis focuses mainly on the communication portion of a CRR system applied to an SQLite database called SynQLite. SynQLite is a service that can augment an SQLite database with CRR-support. It also includes the possibility to clone and synchronize with remote sites. The previous version of SynQLite allowed only to pull states from a remote site. We propose a solution where users can synchronize continuously with more than two sites simultaneously. The solution involves creating a centralized leader that other sites can connect to with TCP connections, and the other sites synchronize with the leader. The thesis includes an evaluation process including many experiments. These experiments are used to evaluate how well SynQLite supports local-first properties. This includes testing how SynQLite affects the offline database and how well SynQLite supports synchronizing sites

    Access Control in Weakly Consistent Systems

    Get PDF
    Eventually consistent models have become popular in the last years in data storage systems for cloud environments, allowing to give users better availability and lower latency. In this model, it is possible for replicas to be temporarily inconsistent, having been proposed various solutions to deal with this inconsistency and ensure the final convergence of data. However, defining and enforcing access control policies under this model is still an open challenge. The implementation of access control policies for these systems raises it’s own challenges, given the information about the permissions is itself kept in a weakly consistent form. In this dissertation, a solution for this problem is proposed, that allows to prevent the non authorized access and modification of data. The proposed solution allows concurrent modifications on the security policies, ensuring their convergence when they are used to verify and enforce access control the associated data. In this dissertation we present an evaluation of the proposed model, showing the solution respects the correct functioning over possible challenging situations, also discussing its application on scenarios that feature peer-to-peer communication between clients and additional replicas on the clients, with the goal of providing a lower latency and reduce the load on centralized components

    Modular Verification of State-Based CRDTs in Separation Logic

    Get PDF
    Conflict-free Replicated Datatypes (CRDTs) are a class of distributed data structures that are highly-available and weakly consistent. The CRDT taxonomy is further divided into two subclasses: state-based and operation-based (op-based). Recent prior work showed how to use separation logic to verify convergence and functional correctness of op-based CRDTs while (a) verifying implementations (as opposed to high-level protocols), (b) giving high level specifications that abstract from low-level implementation details, and (c) providing specifications that are modular (i.e. allow client code to use the CRDT like an abstract data type). We extend this separation logic approach to verification of CRDTs to handle state-based CRDTs, while respecting the desiderata (a)-(c). The key idea is to track the state of a CRDT as a function of the set of operations that produced that state. Using the observation that state-based CRDTs are automatically causally-consistent, we obtain CRDT specifications that are agnostic to whether a CRDT is state- or op-based. When taken together with prior work, our technique thus provides a unified approach to specification and verification of op- and state-based CRDTs. We have tested our approach by verifying StateLib, a library for building state-based CRDTs. Using StateLib, we have further verified convergence and functional correctness of multiple example CRDTs from the literature. Our proofs are written in the Aneris distributed separation logic and are mechanized in Coq

    syncope: Automatic Enforcement of Distributed Consistency Guarantees

    Get PDF

    A Semantic Consistency Model to Reduce Coordination in Replicated Systems

    Get PDF
    Large-scale distributed applications need to be available and responsive to satisfy millions of users, which can be achieved by having data geo-replicated in multiple replicas. However, a partitioned system cannot sustain availability and consistency at fully. The usage of weak consistency models might lead to data integrity violations, triggered by problematic concurrent updates, such as selling twice the last ticket on a flight company service. To overcome possible conflicts, programmers might opt to apply strong consistency, which guarantees a total order between operations, while preserving data integrity. Nevertheless, the illusion of being a non-replicated system affects its availability. In contrast, weaker notions might be used, such as eventual consistency, that boosts responsiveness, as operations are executed directly at the source replica and their effects are propagated to remote replicas in the background. However, this approach might put data integrity at risk. Current protocols that preserve invariants rely on, at least, causal consistency, a consistency model that maintains causal dependencies between operations. In this dissertation, we propose a protocol that includes a semantic consistency model. This consistency model stands between eventual consistency and causal consistency. We guarantee better performance comparing with causal consistency, and ensure data integrity. Through semantic analysis, relying on the static analysis tool CISE3, we manage to limit the maximum number of dependencies that each operation will have. To support the protocol, we developed a communication algorithm in a cluster. Additionally, we present an architecture that uses Akka, an actor-based middleware in which actors communicate by exchanging messages. This architecture adopts the publish/subscribe pattern and includes data persistence. We also consider the stability of operations, as well as a dynamic cluster environment, ensuring the convergence of the replicated state. Finally, we perform an experimental evaluation regarding the performance of the algorithm using standard case studies. The evaluation confirms that by relying on semantic analysis, the system requires less coordination between the replicas than causal consistency, ensuring data integrity.Aplicações distribuídas em larga escala necessitam de estar disponíveis e de serem responsivas para satisfazer milhões de utilizadores, o que pode ser alcançado através da geo-replicação dos dados em múltiplas réplicas. No entanto, um sistema particionado não consegue garantir disponibilidade e consistência na sua totalidade. O uso de modelos de consistência fraca pode levar a violações da integridade dos dados, originadas por escritas concorrentes problemáticas. Para superar possíveis conflitos, os programadores podem optar por aplicar modelos de consistência forte, originando uma ordem total das operações, assegurando a integridade dos dados. Em contrapartida, podem ser utilizadas noções mais fracas, como a consistência eventual, que aumenta a capacidade de resposta, uma vez que as operações são executadas diretamente na réplica de origem e os seus efeitos são propagados para réplicas remotas. No entanto, esta abordagem pode colocar em risco a integridade dos dados. Os protocolos existentes que preservam as invariantes dependem, pelo menos, da consistência causal, um modelo de consistência que mantém as dependências causais entre operações. Nesta dissertação propomos um protocolo que inclui um modelo de consistência semântica. Este modelo situa-se entre a consistência eventual e a consistência causal. Garantimos um melhor desempenho em comparação com a consistência causal, e asseguramos a integridade dos dados. Através de uma análise semântica, obtida através da ferramenta de análise estática CISE3, conseguimos limitar o número de dependências de cada operação. Para suportar o protocolo, desenvolvemos um algoritmo de comunicação entre um aglomerado de réplicas. Adicionalmente, apresentamos uma arquitetura que utiliza Akka, um middleware baseado em atores que trocam mensagens entre si. Esta arquitetura utiliza o padrão publish/subscribe e inclui a persistência dos dados. Consideramos também a estabilidade das operações, bem como um ambiente dinâmico de réplicas, assegurando a convergência do estado. Por último, apresentamos a avaliação do desempenho do algoritmo desenvolvido, que confirma que a análise semântica das operações requer menos coordenação entre as réplicas que a consistência causal
    corecore