Invariant preservation in geo-replicated data stores

Abstract

The Internet has enabled people from all around the globe to communicate with each other in a matter of milliseconds. This possibility has a great impact in the way we work, behave and communicate, while the full extent of possibilities are yet to be known. As we become more dependent of Internet services, the more important is to ensure that these systems operate correctly, with low latency and high availability for millions of clients scattered all around the globe. To be able to provide service to a large number of clients, and low access latency for clients in different geographical locations, Internet services typically rely on georeplicated storage systems. Replication comes with costs that may affect service quality. To propagate updates between replicas, systems either choose to lose consistency in favor of better availability and latency (weak consistency), or maintain consistency, but the system might become unavailable during partitioning (strong consistency). In practice, many production systems rely on weak consistency storage systems to enhance user experience, overlooking that applications can become incorrect due to the weaker consistency assumptions. In this thesis, we study how to exploit application’s semantics to build correct applications without affecting the availability and latency of operations. We propose a new consistency model that breaks apart from traditional knowledge that applications consistency is dependent on coordinating the execution of operations across replicas. We show that it is possible to execute most operations with low latency and in an highly available way, while preserving application’s correctness. Our approach consists in specifying the fundamental properties that define the correctness of applications, i.e. the application invariants, and identify and prevent concurrent executions that potentially can make the state of the database inconsistent, i.e. that may violate some invariant. We explore different, complementary, approaches to implement this model. The Indigo approach consists in preventing conflicting operations from executing concurrently, by restricting the operations that each replica can execute at each moment to maintain application’s correctness. The IPA approach does not preclude the execution of any operation, ensuring high availability. To maintain application correctness, operations are modified to prevent invariant violations during replica reconciliation, or, if modifying operations provides an unsatisfactory semantics, it is possible to correct any invariant violations before a client can read an inconsistent state, by executing compensations. Evaluation shows that our approaches can ensure both low latency and high availability for most operations in common Internet application workloads, with small execution overhead in comparison to unmodified weak consistency systems, while enforcing application invariants, as in strong consistency systems

    Similar works