    Sequentially consistent versus linearizable counting networks

    We compare the impact of timing conditions on implementing sequentially consistent and linearizable counters using (uniform) counting networks in distributed systems. For counting problems in application domains which do not require linearizability but will run correctly if only sequential consistency is provided, the results of our investigation, and their potential payoffs, are threefold:
    • First, we show that sequential consistency and linearizability cannot be distinguished by the timing conditions previously considered in the context of counting networks; thus, in contexts where these constraints apply, it is possible to rely on the stronger semantics of linearizability, which simplifies proofs and enhances compositionality.
    • Second, we identify local timing conditions that support sequential consistency but not linearizability; thus, we suggest weaker, easily implementable timing conditions that are likely to be sufficient in many applications.
    • Third, we show that any kind of synchronization that is too weak to support even sequential consistency may violate it significantly for some counting networks; hence
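As a rough illustration of the object under discussion, the following is a minimal sketch of a counter built on a bitonic counting network, simulated sequentially. The class names (`Balancer`, `Merger`, `Bitonic`, `NetworkCounter`) are illustrative assumptions, not taken from the paper, and the sketch does not model the timing conditions the abstract studies: every token traverses the whole network before the next one enters.

```python
import random


class Balancer:
    """Toggle that routes successive tokens to output 0, 1, 0, 1, ..."""

    def __init__(self):
        self.toggle = 0

    def traverse(self):
        out = self.toggle
        self.toggle ^= 1
        return out


class Merger:
    """Merges two step-shaped halves into one step-shaped output (width is a power of 2)."""

    def __init__(self, width):
        self.width = width
        self.layer = [Balancer() for _ in range(width // 2)]
        self.half = [Merger(width // 2), Merger(width // 2)] if width > 2 else None

    def traverse(self, wire):
        if self.width == 2:
            return self.layer[0].traverse()
        if wire < self.width // 2:
            # Even wires of the top half go to half[0], odd wires to half[1].
            out = self.half[wire % 2].traverse(wire // 2)
        else:
            # For the bottom half the assignment is swapped.
            out = self.half[1 - wire % 2].traverse(wire // 2)
        return 2 * out + self.layer[out].traverse()


class Bitonic:
    """Bitonic counting network: two half-width networks followed by a merger."""

    def __init__(self, width):
        self.width = width
        self.merger = Merger(width)
        self.half = [Bitonic(width // 2), Bitonic(width // 2)] if width > 2 else None

    def traverse(self, wire):
        if self.width == 2:
            return self.merger.traverse(wire)
        h = self.width // 2
        subnet = wire // h
        out = self.half[subnet].traverse(wire % h)
        return self.merger.traverse(subnet * h + out)


class NetworkCounter:
    """Counter: output wire i hands out the values i, i + width, i + 2*width, ..."""

    def __init__(self, width):
        self.width = width
        self.network = Bitonic(width)
        self.next_value = list(range(width))

    def get_and_increment(self, input_wire):
        wire = self.network.traverse(input_wire)
        value = self.next_value[wire]
        self.next_value[wire] += self.width
        return value


# Tokens enter on arbitrary input wires (hypothetical workload).
rng = random.Random(0)
counter = NetworkCounter(4)
values = [counter.get_and_increment(rng.randrange(4)) for _ in range(10)]
print(values)
```

In this sequential setting each token should receive the next integer in order, whichever input wire it enters on. In a real concurrent execution tokens overlap inside the network, which is where the distinction between sequential consistency and linearizability becomes relevant.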

    Quantifiability: Concurrent Correctness from First Principles

    Architectural imperatives due to the slowing of Moore's Law, the broad acceptance of relaxed semantics and the O(n!) worst-case verification complexity of sequential histories motivate a new approach to concurrent correctness. Desiderata for a new correctness condition are that it be independent of sequential histories, compositional over objects, flexible as to timing, modular as to semantics and free of inherent locking or waiting. This dissertation proposes Quantifiability, a novel correctness condition based on intuitive first principles. Quantifiability is formally defined with its system model. Useful properties of quantifiability such as compositionality, measurability and observational refinement are demonstrated. Quantifiability models a system in vector space to launch a new mathematical analysis of concurrency. The vector space model is suitable for a wide range of concurrent systems and their associated data structures. Proof of correctness is facilitated with linear algebra, better supported and of more efficient time complexity than traditional combinatorial methods. Experimental results are presented showing that quantifiable data structures are highly scalable due to their use of relaxed semantics, an implementation trade-off that is explicitly permitted by quantifiability. The speedups attainable are theoretically analyzed. Because previous work lacked a metric for evaluating such trade-offs, a new measure is proposed here that applies communication theory to the disordered results of concurrent data structures. This entropy measure opens the way to analyze degrees of concurrent correctness across implementations to engineer system scalability and evaluate data structure quality under different workloads. With all its innovation, quantifiability is presented in the context of previous work and existing correctness conditions.

    Randomized versus Deterministic Implementations of Concurrent Data Structures

    One of the key trends in computing over the past two decades has been increased distribution, both at the processor level, where multi-core architectures are now the norm, and at the system level, where many key services are currently distributed over multiple machines. Thus, understanding the power and limitations of computing in a concurrent, distributed setting is one of the major challenges in Computer Science. In this thesis, we analyze the complexity of implementing concurrent data structures in asynchronous shared memory systems. We focus on the complexity of a classic distributed coordination task called renaming, in which a set of processes need to pick distinct names from a small set of identifiers. We present the first tight bounds for the time complexity of this problem, both for deterministic and randomized implementations, solving a long-standing open problem in the field. For deterministic algorithms, we prove a tight linear lower bound; for randomized solutions, we provide logarithmic upper and lower bounds on time complexity. Together, these results show an exponential separation between deterministic and randomized renaming solutions. Importantly, the lower bounds extend to implementations of practical shared-memory data structures, such as queues, stacks, and counters. From a technical perspective, this thesis highlights new connections between the distributed renaming problem and other fundamental objects, such as sorting networks, mutual exclusion, and counters. In particular, we show that sorting networks can be used to obtain optimal randomized solutions to renaming, and that, in turn, the existence of these solutions implies a linear lower bound on the complexity of the problem. In sum, the results in this thesis suggest that deterministic implementations of shared-memory data structures do not scale well in terms of worst-case time complexity.
On the positive side, we emphasize randomization as a natural alternative, which can circumvent the deterministic lower bounds with high probability. Thus, a promising direction for future work is to extend our randomized renaming techniques to obtain efficient implementations of concurrent data structures.
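To make the renaming task concrete, here is a minimal sketch of a simple randomized approach: each process repeatedly probes a random slot and claims it with a test-and-set. This is a swapped-in textbook-style technique, not the sorting-network construction the thesis describes, and it makes no claim about matching the stated bounds. The names `TestAndSet`, `rename`, and `run_renaming` are hypothetical, and the atomic test-and-set is emulated with a lock.

```python
import random
import threading


class TestAndSet:
    """One-shot test-and-set object; atomicity is emulated with a lock (an assumption)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._taken = False

    def test_and_set(self):
        """Return True for exactly one caller: the first one to arrive."""
        with self._lock:
            won = not self._taken
            self._taken = True
            return won


def rename(slots, rng):
    """Probe random slots until one is won; the slot index becomes the new name."""
    while True:
        name = rng.randrange(len(slots))
        if slots[name].test_and_set():
            return name


def run_renaming(n_processes, namespace_size, seed=0):
    """Run n_processes threads that each acquire a distinct name in [0, namespace_size)."""
    if namespace_size < n_processes:
        raise ValueError("namespace must hold at least one name per process")
    slots = [TestAndSet() for _ in range(namespace_size)]
    names = [None] * n_processes

    def worker(pid):
        # Each process uses its own generator so the probes are independent.
        names[pid] = rename(slots, random.Random(seed + pid))

    threads = [threading.Thread(target=worker, args=(pid,)) for pid in range(n_processes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return names


names = run_renaming(n_processes=8, namespace_size=16)
print(sorted(names))
```

Because a slot can be won only once, the names are distinct regardless of how the threads interleave. Choosing a namespace larger than the number of processes (here twice as large, an arbitrary choice) keeps the expected number of probes per process small.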

    WACCO and LOKO: Strong Consistency at Global Scale

    Motivated by a vision for future global-scale services supporting frequent updates and widespread concurrent reads, we propose a scalable object-sharing system called WACCO offering strong consistency semantics. WACCO propagates read responses on a tree-based topology to satisfy broad demand and migrates objects dynamically to place them close to that demand. To demonstrate WACCO, we use it to develop a service called LOKO that could roughly encompass the current duties of the DNS and simultaneously support granular status updates (e.g., currently preferred routes) in a future Internet. We evaluate LOKO, including the performance impact of updates, migration, and fault tolerance, using both traces of DNS queries served by Akamai and traces of NFS traffic on the UNC campus. WACCO uses a novel consistency model that is both stronger than sequential consistency and more scalable than linearizability. Our results show that this model performs better in the DNS case than the NFS case because the former represents a global, shared-object system which better fits the design goals of WACCO. We evaluate two different migration techniques, one of which considers not just client-visible latency but also the budget for the network (e.g., for public and hybrid clouds) among other factors.

    Planetary Scale Data Storage

    The success of virtualization and container-based application deployment has fundamentally changed computing infrastructure from dedicated hardware provisioning to on-demand, shared clouds of computational resources. One of the most interesting effects of this shift is the opportunity to localize applications in multiple geographies and support mobile users around the globe. With relatively few steps, an application and its data systems can be deployed and scaled across continents and oceans, leveraging the existing data centers of much larger cloud providers. The novelty and ease of a global computing context means that we are closer to the advent of an Oceanstore, an Internet-like revolution in personalized, persistent data that securely travels with its users. At a global scale, however, data systems suffer from physical limitations that significantly impact their consistency and performance. Even with modern telecommunications technology, the latency in communication from Brazil to Japan results in noticeable synchronization delays that violate user expectations. Moreover, the required scale of such systems means that failure is routine. To address these issues, we explore consistency in the implementation of distributed logs, key/value databases and file systems that are replicated across wide areas. At the core of our system is hierarchical consensus, a geographically-distributed consensus algorithm that provides strong consistency, fault tolerance, durability, and adaptability to varying user access patterns. Using hierarchical consensus as a backbone, we further extend our system from data centers to edge regions using federated consistency, an adaptive consistency model that gives satellite replicas high availability at a stronger global consistency than existing weak consistency models.
In a deployment of 105 replicas in 15 geographic regions across 5 continents, we show that our implementation provides high throughput, strong consistency, and resiliency in the face of failure. From our experimental validation, we conclude that planetary-scale data storage systems can be implemented algorithmically without sacrificing consistency or performance.

    Synchronization Costs in Parallel Programs and Concurrent Data Structures

    To use the computational power of modern computing machines, we have to deal with concurrent programs. Writing efficient concurrent programs is notoriously difficult, primarily due to the need to manage synchronization costs. In this thesis, we focus on synchronization costs in parallel programs and concurrent data structures. First, we present a novel granularity control technique for parallel programs designed for the dynamic multithreading environment. Then, in the context of concurrent data structures, we consider the notion of concurrency-optimality and propose the first implementation of a concurrency-optimal binary search tree that, intuitively, accepts a concurrent schedule if and only if the schedule is correct. Also, we propose parallel combining, a technique that enables efficient implementations of concurrent data structures from their parallel batched counterparts. We validate the proposed techniques via experimental evaluations showing superior or comparable performance with respect to state-of-the-art algorithms. From a more formal perspective, we consider the phenomenon of helping in concurrent data structures. Intuitively, helping is observed when the order of some operation in a linearization is fixed by a step of another process. We show that no wait-free linearizable implementation of a stack using read, write, compare&swap and fetch&add primitives can be help-free, correcting a mistake in an earlier proof by Censor-Hillel et al. Finally, we propose a simple way to analytically predict the throughput of data structures based on coarse-grained locking.

    Contention management for distributed data replication

    PhD Thesis. Optimistic replication schemes provide distributed applications with access to shared data at lower latencies and greater availability. This is achieved by allowing clients to replicate shared data and execute actions locally. A consequence of this scheme is that it raises issues regarding shared data consistency. Sometimes an action executed by a client may produce conflicting shared data, which in turn may conflict with subsequent actions that are caused by the conflicting action. This requires a client to roll back to the action that caused the conflicting data, and to execute some exception handling. This can be achieved by relying on the application layer to either ignore or handle shared data inconsistencies when they are discovered during the reconciliation phase of an optimistic protocol. Inconsistency of shared data has an impact on the causality relationship across client actions. In protocol design, it is desirable to preserve the property of causality between different actions occurring across a distributed application. Without application level knowledge, we assume an action causes all the subsequent actions at the same client. With application knowledge, we can significantly ease the protocol burden of provisioning causal ordering, as we can identify which actions do not cause other actions (even if they precede them). This, in turn, makes it possible for the client to roll back to past actions and to change them, without having to alter subsequent actions. Unfortunately, increased instances of application level causal relations between actions lead to a significant overhead in the protocol. Therefore, minimizing the rollback associated with conflicting actions, while preserving causality, is seen as desirable for reducing exception handling in the application layer.
In this thesis, we present a framework that utilizes causality to create a scheduler that can inform a contention management scheme to reduce the rollback associated with the conflicting access of shared data. Our framework uses a backoff contention management scheme to preserve causality for those optimistic replication systems with high causality requirements, without the need for application layer knowledge. We present experiments which demonstrate that our framework reduces clients’ rollback and, more importantly, that the overall throughput of the system is improved when the contention management is used with applications that require causality to be preserved across all actions.
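The general idea of backoff-based contention management for optimistic updates can be sketched as follows. This is a generic randomized exponential backoff loop over a version-checked store, not the causality-aware scheduler the thesis proposes. All names (`VersionedStore`, `backoff_delay`, `update_with_backoff`) and the parameter values are assumptions made for illustration, and the rival writer is simulated inside a single thread.

```python
import random


class VersionedStore:
    """Shared datum with a version number used to detect conflicting writes."""

    def __init__(self, value=0):
        self.value = value
        self.version = 0

    def read(self):
        return self.value, self.version

    def try_commit(self, expected_version, new_value):
        """Commit only if nobody else has committed since the caller's read."""
        if expected_version != self.version:
            return False
        self.value = new_value
        self.version += 1
        return True


def backoff_delay(attempt, base=0.01, cap=1.0, rng=random):
    """Random delay in [0, min(cap, base * 2**attempt)] seconds."""
    return rng.uniform(0, min(cap, base * 2 ** attempt))


def update_with_backoff(store, update, sleep, max_attempts=8):
    """Optimistically apply `update`; on conflict, roll back, wait, and retry.

    Returns the number of rollbacks that happened before the commit succeeded.
    """
    for attempt in range(max_attempts):
        value, version = store.read()
        new_value = update(value)  # executed locally, optimistically
        if store.try_commit(version, new_value):
            return attempt
        sleep(backoff_delay(attempt))  # conflict: discard the result and back off
    raise RuntimeError("gave up after repeated conflicts")


# Simulated contention: a hypothetical rival commits during the first two attempts.
store = VersionedStore(0)
rival_commits_left = [2]


def increment(value):
    if rival_commits_left[0] > 0:
        rival_commits_left[0] -= 1
        store.try_commit(store.version, store.value + 10)
    return value + 1


delays = []  # record the delays instead of actually sleeping
rollbacks = update_with_backoff(store, increment, sleep=delays.append)
print(rollbacks, store.value, delays)
```

Passing `sleep` as a parameter keeps the sketch testable; a real client would pass `time.sleep`. The growing random delay spreads out retries from competing clients, which is the usual motivation for backoff as a way to reduce repeated rollbacks.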
