
    Eventual Consistency: Origin and Support

    Get PDF
    Eventual consistency is demanded nowadays in geo-replicated services that need to be highly scalable and available. According to the CAP constraints, when network partitions may arise, a distributed service must choose between being strongly consistent and being highly available. Since scalable services should be available, relaxed consistency (while the network is partitioned) is the preferred choice. Eventual consistency is not a common data-centric consistency model, but only a state-convergence condition to be added to a relaxed consistency model. Several aspects of eventual consistency have not been analysed in depth in previous works: (1) which are the oldest replication proposals providing eventual consistency, (2) which replica consistency models provide the best basis for building eventually consistent services, (3) which mechanisms should be considered for implementing an eventually consistent service, and (4) which combinations of those mechanisms best achieve different concrete goals. This paper provides some notes on these important topics.
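    The abstract frames eventual consistency as a state-convergence condition layered on top of a relaxed consistency model. As a minimal illustration of that framing (not taken from the paper), the following last-writer-wins register converges because its merge is deterministic, so replicas agree once all writes have been exchanged in both directions:

```python
import time

class LWWRegister:
    """Last-writer-wins register: a minimal convergence mechanism.

    Eventual consistency here is just the guarantee that replicas
    that have seen the same set of writes hold the same value.
    """
    def __init__(self):
        self.value = None
        self.timestamp = 0.0

    def write(self, value):
        self.value = value
        self.timestamp = time.time()

    def merge(self, other):
        # Deterministic merge: compare (timestamp, value) so ties
        # break the same way on every replica.
        if (other.timestamp, str(other.value)) > (self.timestamp, str(self.value)):
            self.value, self.timestamp = other.value, other.timestamp

# Two replicas diverge during a partition...
a, b = LWWRegister(), LWWRegister()
a.write("x"); b.write("y")
# ...and converge once they exchange state in both directions.
a.merge(b); b.merge(a)
assert a.value == b.value
```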

    Coordination Protocols for Verifiable Consistency in Distributed Storage Systems

    Get PDF
    Achieving consistency in a highly available distributed storage system has been formally proven to be an impossible task when the system faces network partitions and faulty processes. The complexity is exacerbated when the system allows concurrent processes to send transactions to all the other servers and to coordinate the consistent commitment of different transactions. In the event of a partition, each server may allow clients to request updates involving the current state of the data, which makes achieving replicated consistency challenging. To solve the inconsistency problems, several consensus protocols are used, but they have strict requirements in order to make progress and are not guaranteed to ever converge to a single value. Additionally, the coordination required to achieve consistency after a partition is extremely high, as each node must compare transaction times and conflicting data with all other servers in the system. To address inconsistency in distributed systems, this thesis proposes a new coordination protocol that relies on four ideas that let clients verify the consistency of data: (1) a universal timestamp signatory to certify the global order of events, (2) a relative consistency indicator to determine relative consistency during partitions, (3) an operation-based recency-weighted conflict resolution algorithm to simplify coordination for achieving global consistency, and (4) a rejection-oriented distributed transaction commit protocol to eliminate the guarantees required by atomic commit protocols and verify local consistency. This thesis evaluates and analyzes various issues related to coordination and concurrency under network partitions. The experimental results demonstrate that the proposed methods provide verifiable consistency to the servers of distributed storage systems.
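    The abstract does not spell out the recency-weighted algorithm, so the following is only a plausible sketch of idea (3): conflicting values reported by replicas are scored with a weight that decays with the age of each operation, and the highest-scoring value wins. The function name `resolve` and the `decay` parameter are assumptions for illustration:

```python
import math, time

def resolve(conflicting_ops, decay=0.1, now=None):
    """Pick a winner among conflicting updates by giving each
    replica's vote a weight that decays with the age of its op.

    conflicting_ops: list of (value, unix_timestamp) pairs reported
    by replicas. Returns the value with the highest total weight.
    """
    now = now if now is not None else time.time()
    scores = {}
    for value, ts in conflicting_ops:
        age = max(0.0, now - ts)
        scores[value] = scores.get(value, 0.0) + math.exp(-decay * age)
    # Tie-break deterministically on the value itself so every
    # replica running this function picks the same winner.
    return max(sorted(scores), key=lambda v: scores[v])

# Two replicas report conflicting writes; the newer one carries more weight.
assert resolve([("blue", 100.0), ("red", 105.0)], now=110.0) == "red"
```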

    Efficient middleware for database replication

    Get PDF
    Master's dissertation in Informatics Engineering. Database systems are used to store data in the most varied applications, from web applications and enterprise applications to scientific research and even personal applications. Given the wide use of databases in systems that are fundamental to their users, database systems must be efficient and reliable. Additionally, for these systems to serve a large number of users, databases must be scalable, able to process large numbers of transactions. To achieve this, it is necessary to resort to data replication. In a replicated system, all nodes contain a copy of the database. To guarantee that replicas converge, write operations must be executed on all replicas. The way updates are propagated leads to two different replication strategies. The first is known as asynchronous or optimistic replication, where updates are propagated asynchronously after the conclusion of an update transaction. The second is known as synchronous or pessimistic replication, where updates are broadcast synchronously during the transaction. In pessimistic replication, contrary to optimistic replication, the replicas remain consistent. This approach simplifies the programming of applications, since the replication of the data is transparent to them. However, it presents scalability issues, caused by the number of messages exchanged during synchronization, which delays the termination of the transaction. This leads the user to experience much higher latency with the pessimistic approach. This work presents the design and implementation of a database replication system, with snapshot isolation semantics, using a synchronous replication approach. The system is composed of a primary replica and a set of secondary replicas that fully replicate the database. The primary replica executes the read-write transactions, while the remaining replicas execute the read-only transactions. After the conclusion of a read-write transaction on the primary replica, the updates are propagated to the remaining replicas. This approach suits a model where the fraction of read operations is considerably higher than that of write operations, allowing the read load to be distributed over the multiple replicas. To improve the performance of the system, clients execute some operations speculatively, to avoid waiting during the execution of a database operation. Thus, the client may continue its execution while the operation is executed on the database. If the result returned to the client is found to be incorrect, the transaction is aborted, ensuring the correctness of the execution of the transactions.
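    The speculative execution described at the end of the abstract can be sketched as follows; `execute_on_db` and `predicted_result` are hypothetical names, and the real system's validation logic is certainly more involved:

```python
from concurrent.futures import ThreadPoolExecutor

class SpeculativeClient:
    """Sketch of speculative execution: the client continues with a
    predicted result while the database executes the operation, and
    the transaction aborts if the prediction turns out to be wrong."""

    def __init__(self, execute_on_db):
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._execute_on_db = execute_on_db   # hypothetical DB call
        self._pending = []

    def run(self, operation, predicted_result):
        # Start the real operation asynchronously and optimistically
        # hand the predicted result back to the application.
        future = self._pool.submit(self._execute_on_db, operation)
        self._pending.append((future, predicted_result))
        return predicted_result

    def commit(self):
        # Validate every speculation before committing; any mismatch
        # forces an abort, preserving transactional correctness.
        for future, predicted in self._pending:
            if future.result() != predicted:
                self._pending.clear()
                raise RuntimeError("speculation failed: abort transaction")
        self._pending.clear()
```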

    SwiftCloud: Fault-Tolerant Geo-Replication Integrated all the Way to the Client Machine

    Get PDF
    Client-side logic and storage are increasingly used in web and mobile applications to improve response time and availability. Current approaches tend to be ad hoc and poorly integrated with the server-side logic. We present a principled approach to integrating client- and server-side storage. We support mergeable and strongly consistent transactions that target either client or server replicas, and we provide efficient access to causally consistent snapshots. In the presence of infrastructure faults, a client-assisted failover solution allows client execution to resume immediately and seamlessly access consistent snapshots without waiting. We implement this approach in SwiftCloud, the first transactional system to bring geo-replication all the way to the client machine. Example applications show that our programming model is useful across a range of application areas. Our experimental evaluation shows that SwiftCloud provides better fault tolerance and at the same time can improve both latency and throughput by up to an order of magnitude, compared to classical geo-replication techniques.
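    The abstract does not describe SwiftCloud's failover mechanism in detail; one common way to realize "resume immediately on a causally consistent snapshot" is for the client to track a vector clock of everything it has observed and accept only replacement servers whose snapshot dominates it. The sketch below assumes that reading (`FailoverClient` and its fields are illustrative names, not SwiftCloud's API):

```python
def dominates(clock_a, clock_b):
    """True if vector clock A causally includes everything in B."""
    keys = set(clock_a) | set(clock_b)
    return all(clock_a.get(k, 0) >= clock_b.get(k, 0) for k in keys)

class FailoverClient:
    """On failover, accept only a replacement server whose snapshot
    causally includes what the client already observed, so the
    session continues without waiting or going back in time."""

    def __init__(self):
        self.observed = {}   # highest vector clock seen so far

    def read(self, server_clock, value):
        assert dominates(server_clock, self.observed), "snapshot too old"
        # Fold the snapshot's clock into what we have observed.
        for k, v in server_clock.items():
            self.observed[k] = max(self.observed.get(k, 0), v)
        return value

c = FailoverClient()
c.read({"dc1": 3}, "v1")            # first read pins the session's floor
c.read({"dc1": 3, "dc2": 1}, "v2")  # failover target is causally newer: ok
```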

    Cloud-edge hybrid applications

    Get PDF
    Many modern applications are designed to provide interactions among users, including multi-user games, social networks, and collaborative tools. Users expect application response times on the order of milliseconds, to foster interaction and interactivity. The design of these applications typically adopts a client-server model, where all interactions are mediated by a centralized component. This approach introduces availability and fault-tolerance issues, which can be mitigated by replicating the server component, and even by relying on geo-replicated solutions in cloud computing infrastructures. Even in this case, the client-server communication model leads to unnecessary latency penalties for geographically close clients and high operational costs for the application provider. This dissertation proposes a cloud-edge hybrid model with secure and efficient propagation and consistency mechanisms. This model combines client-side replication and client-to-client propagation to provide low latency and minimize the dependency on the server infrastructure, fostering availability and fault tolerance. To realize this model, this work makes the following key contributions. First, the cloud-edge hybrid model is materialized by a system design where clients maintain replicas of the data and synchronize in a peer-to-peer fashion, and servers assist the clients' operation. We study how to bring most of the application logic to the client side, using the centralized service primarily for durability, access control, discovery, and overcoming internetwork limitations. Second, we define protocols for weakly consistent data replication, including a novel CRDT model (∆-CRDTs). We provide a study on partial replication, exploring the challenges and fundamental limitations in providing causal consistency, and the difficulty in supporting client-side replicas due to their ephemeral nature. Third, we study how client misbehaviour can impact the guarantees of causal consistency. We propose new secure weak consistency models for insecure settings, and algorithms to enforce such consistency models. The experimental evaluation of our contributions has shown their specific benefits and limitations compared with the state of the art. In general, the cloud-edge hybrid model leads to faster application response times, lower client-to-client latency, higher system scalability as fewer clients need to connect to servers at the same time, the possibility to work offline or disconnected from the server, and reduced server bandwidth usage. In summary, we propose a hybrid of cloud and edge which provides lower user-to-user latency, availability under server disconnections, and improved server scalability, while being efficient, reliable, and secure.
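    ∆-CRDTs are defined in the dissertation itself; the generic delta-state counter below (not the dissertation's construction) illustrates the core idea that small deltas, rather than full states, can be exchanged peer-to-peer, with an idempotent join making duplicated or reordered deliveries harmless:

```python
class DeltaGCounter:
    """Delta-state grow-only counter: increments return a small delta
    that can be shipped peer-to-peer instead of the full state."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.entries = {}          # replica_id -> count

    def increment(self, n=1):
        self.entries[self.replica_id] = self.entries.get(self.replica_id, 0) + n
        # The delta is just the updated entry; joining it elsewhere
        # has the same effect as joining the whole state.
        return {self.replica_id: self.entries[self.replica_id]}

    def join(self, delta):
        # Pointwise max makes joins idempotent, commutative, and
        # associative, so deltas can arrive late or duplicated.
        for rid, count in delta.items():
            self.entries[rid] = max(self.entries.get(rid, 0), count)

    def value(self):
        return sum(self.entries.values())

# Client replicas synchronize peer-to-peer by exchanging deltas.
a, b = DeltaGCounter("a"), DeltaGCounter("b")
d = a.increment()
b.join(d)                 # b learns a's increment from the delta alone
b.join(d)                 # a duplicated delivery changes nothing
assert b.value() == 1
```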

    Constructing provenance-aware distributed systems with data propagation

    Get PDF
    Thesis (S.M.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from the student-submitted PDF version of the thesis. Includes bibliographical references (p. 93-96). Is it possible to construct a heterogeneous distributed computing architecture capable of solving interesting complex problems? Can we easily use this architecture to maintain a detailed history, or provenance, of the data it processes? Most existing distributed architectures can perform only one operation at a time. While they are capable of tracing possession of data, these architectures do not always track the network of operations used to synthesize new data. This thesis presents a distributed implementation of data propagation, a computational model that provides for concurrent processing that is not constrained to a single distributed operation. This system is capable of distributing computation across a heterogeneous network and allows for the division of multiple simultaneous operations in a single distributed system. I also identify four constraints that may be placed on general-purpose data propagation to allow for deterministic computation in such a distributed propagation network. This thesis also presents an application of distributed propagation by illustrating how a generic transformation may be applied to existing propagator networks to allow for the maintenance of data provenance. I show that the modular structure of data propagation permits the simple modification of a propagator network design to maintain the histories of data. By Ian Campbell Jacobi. S.M.
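    A propagator network extended with provenance, in the spirit of (but not copied from) the thesis, can be sketched in a few lines: each cell records which function and which input cells produced its content:

```python
class Cell:
    """A propagator-network cell that records the provenance of its
    content: which function produced it and from which inputs."""

    def __init__(self, name):
        self.name = name
        self.content = None
        self.provenance = None
        self.listeners = []       # propagators to re-run on change

    def add_content(self, value, provenance=None):
        if self.content is None:
            self.content = value
            self.provenance = provenance
            for run in self.listeners:
                run()

def propagator(fn, inputs, output):
    """Wire a function as a propagator: whenever all inputs are
    filled, compute the output and record how it was derived."""
    def run():
        if all(c.content is not None for c in inputs):
            output.add_content(
                fn(*[c.content for c in inputs]),
                provenance=(fn.__name__, [c.name for c in inputs]),
            )
    for c in inputs:
        c.listeners.append(run)
    run()

# fahrenheit = celsius * 9/5 + 32, with provenance on the result
c, f = Cell("celsius"), Cell("fahrenheit")
propagator(lambda x: x * 9 / 5 + 32, [c], f)
c.add_content(100)
assert f.content == 212 and f.provenance[1] == ["celsius"]
```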

    Integration of browser-to-browser architectures with third party legacy cloud storage

    Get PDF
    An increasing number of web applications run totally or partially on client machines, from collaborative editing tools to multi-user games. Avoiding continuous contact with the server reduces latency between clients and minimizes the load on the centralized component. Novel implementation techniques, such as WebRTC, make it possible to address network problems that previously prevented these systems from being deployed in practice. Legion is a newly developed framework that exploits these mechanisms, allowing client web applications to replicate data from servers and synchronize these replicas directly among themselves. This work aims to extend the current Legion framework with the integration of an additional legacy storage system that can be used to support web applications: Antidote. We study the best way to support Legion's data model in Antidote, design a synchronization mechanism, and measure the performance cost of such an integration.
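    The Legion and Antidote APIs are not shown in the abstract, so the following adapter is purely hypothetical: it sketches one way a client-side replica could persist its merged state into a legacy key-value store and let new clients bootstrap from it. `DictStore` stands in for the real storage backend, and plain dict union stands in for a proper CRDT join:

```python
class DictStore:
    """Stand-in for a legacy key-value store with get/put."""
    def __init__(self):
        self.d = {}
    def get(self, key):
        return self.d.get(key)
    def put(self, key, value):
        self.d[key] = value

class LegacyStoreAdapter:
    """Hypothetical sketch: persist the merged client-side replica
    state into a legacy store so new clients can bootstrap from it
    instead of from a peer."""

    def __init__(self, store, key, merge):
        self.store = store
        self.key = key
        self.merge = merge        # CRDT join: (state, state) -> state

    def push(self, local_state):
        # Merge-on-write keeps the stored state a join of all client
        # replicas ever pushed, never losing concurrent updates.
        current = self.store.get(self.key)
        merged = local_state if current is None else self.merge(current, local_state)
        self.store.put(self.key, merged)
        return merged

    def bootstrap(self):
        # A new browser client starts from the last persisted state.
        return self.store.get(self.key)

adapter = LegacyStoreAdapter(DictStore(), "doc:1", merge=lambda a, b: a | b)
adapter.push({"x": 1})
adapter.push({"y": 2})
assert adapter.bootstrap() == {"x": 1, "y": 2}
```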

    Coda: a highly available file system for a distributed workstation environment

    Full text link