
    Fast and Space-Efficient Queues via Relaxation

    Efficient message-passing implementations of shared data types are a vital component of practical distributed systems, enabling them to work on shared data in predictable ways, but there is a long history of results showing that many of the most useful types of access to shared data are necessarily slow. A variety of approaches attempt to circumvent these bounds, notably weakening consistency guarantees and relaxing the sequential specification of the provided data type. These trade behavioral guarantees for performance. We focus on relaxing the sequential specification of a first-in, first-out (FIFO) queue type, which has been shown to allow faster linearizable implementations than are possible for traditional FIFO queues without relaxation. The algorithms that showed these improvements in operation time tracked a complete execution history, storing complete object state at all n processes in the system, leading to n copies of every stored data element. In this paper, we consider the question of reducing the space complexity of linearizable implementations of shared data types, which provide intuitive behavior through strong consistency guarantees. We improve the existing algorithm for a relaxed queue, showing that it is possible to store only one copy of each element in a shared queue while still having a low amortized time cost. This is one of several important steps towards making these data types practical in real-world systems.
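
    As a concrete illustration of what relaxing a FIFO queue's sequential specification means, the sketch below implements a k-out-of-order relaxed queue in which a dequeue may return any of the k oldest elements. This is a minimal sequential sketch under assumed semantics; the class name, the parameter k, and the random choice are illustrative only, and the paper's actual contribution is a linearizable message-passing implementation that additionally stores a single copy of each element.

```python
import random
from collections import deque

class KRelaxedQueue:
    """Sequential sketch of a k-out-of-order relaxed FIFO queue:
    dequeue may return any of the k oldest elements rather than
    strictly the head. This slack is what permits faster
    linearizable implementations than an unrelaxed FIFO queue."""

    def __init__(self, k):
        self.k = k
        self.items = deque()  # each element is stored exactly once

    def enqueue(self, x):
        self.items.append(x)

    def dequeue(self):
        if not self.items:
            return None
        # Relaxation: any of the first k elements is a legal answer.
        i = random.randrange(min(self.k, len(self.items)))
        x = self.items[i]
        del self.items[i]
        return x
```

    With k = 1 this degenerates to an ordinary FIFO queue, which makes the trade-off explicit: larger k weakens the ordering guarantee in exchange for implementation freedom.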

    Sequentially consistent versus linearizable counting networks

    We compare the impact of timing conditions on implementing sequentially consistent and linearizable counters using (uniform) counting networks in distributed systems. For counting problems in application domains which do not require linearizability but will run correctly if only sequential consistency is provided, the results of our investigation, and their potential payoffs, are threefold:

    • First, we show that sequential consistency and linearizability cannot be distinguished by the timing conditions previously considered in the context of counting networks; thus, in contexts where these constraints apply, it is possible to rely on the stronger semantics of linearizability, which simplifies proofs and enhances compositionality.

    • Second, we identify local timing conditions that support sequential consistency but not linearizability; thus, we suggest weaker, easily implementable timing conditions that are likely to be sufficient in many applications.

    • Third, we show that any kind of synchronization that is too weak to support even sequential consistency may violate it significantly for some counting networks; hence …
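
    For readers unfamiliar with counting networks, the sketch below shows the basic building block, a balancer, and the smallest counting network (width 2) in shared-memory style. It is a toy meant only to show what a counting network computes; the paper's subject, timing conditions in message-passing systems, is not modeled here, and all names are invented.

```python
import threading

class Balancer:
    """A balancer forwards incoming tokens alternately to output
    wire 0, 1, 0, 1, ... regardless of their input wire."""

    def __init__(self):
        self._toggle = 0
        self._lock = threading.Lock()

    def traverse(self):
        with self._lock:
            out = self._toggle
            self._toggle ^= 1
            return out

class CountingNetwork2:
    """Width-2 counting network: a single balancer whose output
    wires hand out the values wire, wire + 2, wire + 4, ...
    so that tokens receive distinct counter values."""

    def __init__(self):
        self._balancer = Balancer()
        self._visits = [0, 0]
        self._wire_locks = [threading.Lock(), threading.Lock()]

    def get_and_increment(self):
        wire = self._balancer.traverse()
        with self._wire_locks[wire]:
            v = self._visits[wire]
            self._visits[wire] += 1
        return wire + 2 * v
```

    Sequential traversals yield 0, 1, 2, 3, ...; under concurrency, which consistency condition the counter satisfies (quiescent consistency, sequential consistency, or linearizability) depends on exactly the kinds of timing conditions the paper analyzes.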

    Distributed Transactions: Dissecting the Nightmare

    Many distributed storage systems are transactional, and a lot of work has been devoted to optimizing their performance, especially the performance of read-only transactions, which are considered the most frequent in practice. Yet, the results obtained so far are rather disappointing, and some of the design decisions seem contrived. This paper contributes to explaining this state of affairs by proving intrinsic limitations of transactional storage systems, even those that need not ensure strong consistency but only causality. We first consider general storage systems where some transactions are read-only and some also involve write operations. We show that even read-only transactions cannot be "fast": their operations cannot be executed within one round-trip message exchange between a client seeking an object and the server storing it. We then consider systems (as sometimes implemented today) where all transactions are read-only, i.e., updates are performed as individual operations outside transactions. In this case, read-only transactions can indeed be "fast", but we prove that they need to be "visible". They induce inherent updates on the servers, which in turn impact their overall performance.
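
    To illustrate the "fast" versus "visible" dichotomy, here is a toy single-server sketch (not the paper's protocol, and the paper's results concern multi-server systems): the read is answered within one round trip, but doing so forces the server to record per-transaction snapshot state, which is exactly the kind of inherent server-side update the paper calls visible. All names and the versioning scheme are invented for illustration.

```python
class ToyServer:
    """Single-server sketch of fast-but-visible read-only
    transactions over a multi-versioned store. Invented
    bookkeeping, not the protocol from the paper."""

    def __init__(self):
        self.versions = {}   # key -> list of (ts, value), ts ascending
        self.clock = 0
        self.snapshots = {}  # txn_id -> frozen snapshot timestamp

    def write(self, key, value):
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))

    def read(self, txn_id, key):
        # "Fast": the server decides and answers in this single step.
        # "Visible": it must persist the transaction's snapshot bound,
        # an inherent update on the server.
        ts = self.snapshots.setdefault(txn_id, self.clock)
        for vts, value in reversed(self.versions.get(key, [])):
            if vts <= ts:
                return value
        return None
```

    Freezing the snapshot at the first read keeps every later read of the same transaction consistent with it even if writes arrive in between; the abstract's impossibility result says that, when reads are to stay within one round trip, some such server-side bookkeeping cannot be avoided.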

    Time Bounds for Shared Objects in Partially Synchronous Systems

    Shared objects are a key component in today's large distributed systems. Linearizability is a popular consistency condition for such shared objects, giving the illusion that operations execute sequentially. The time bound of an operation is the worst-case time complexity from the operation's invocation to its response. Some time bounds have been proved for certain operations on linearizable shared objects in partially synchronous systems, but gaps remain between the upper and lower time bounds for each operation. In this work, the goal is to narrow or eliminate these gaps and find optimally fast implementations. To reach this goal, we prove larger lower bounds and show smaller upper bounds (compared to 2d for all operations in previous folklore implementations) by proposing an implementation for a shared object with an arbitrary data type in distributed systems of n processes in which every message delay is bounded within [d-u, d] and the maximum skew between processes' clocks is epsilon.

    • For any operation with two instances that are individually legal but not legal in sequence, we prove a lower bound of d + min{epsilon, u, d/3}, improving on the previous bound of d, and show this bound is tight when epsilon < d/3 and epsilon < u.

    • For any operation with k instances such that each instance separately is legal and any sequence of them is legal, but the state of the object differs after different sequences, we prove a lower bound of (1-1/k)u, improving on u/2, and show this bound is tight when k = n.

    • A pure mutator only modifies the object and returns nothing about it; a pure accessor does not modify the object. For a pure mutator OP1 and a pure accessor OP2 such that, given a set of instances of OP1, the state of the object reflects the order in which the instances occur and an instance of OP2 can detect whether an instance of OP1 occurs, we prove that the sum of the time bounds for OP1 and OP2 is at least d + min{epsilon, u, d/3}, improving on d. The upper bound from our implementation is d + 2*epsilon.
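
    To make the notation concrete, the snippet below plugs illustrative numbers (invented, not from the paper) into the three bounds stated above.

```python
# d   = upper bound on message delay
# u   = delay uncertainty (delays lie in [d - u, d])
# eps = maximum clock skew between processes
# k   = number of commuting-but-state-distinguishing instances
d, u, eps, k = 12.0, 3.0, 2.0, 4

lb_conflicting = d + min(eps, u, d / 3)  # lower bound, conflicting instances
lb_commuting   = (1 - 1 / k) * u         # lower bound, commuting instances
ub_sum         = d + 2 * eps             # upper bound on the OP1 + OP2 sum
                                         # from the paper's implementation

print(lb_conflicting)  # 14.0  (tightness conditions eps < d/3, eps < u hold)
print(lb_commuting)    # 2.25
print(ub_sum)          # 16.0
```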

    New Production System for Finnish Meteorological Institute

    This thesis presents the plans for replacing the production system of the Finnish Meteorological Institute (FMI). It begins with a review of the state of the art in distributed systems research and ends with a design for the replacement production system that is reliable, scalable, and maintainable. The production system in question is a framework for managing the production of different weather predictions and models. We use this framework to abstract the actual execution of work away from its description; in this way, the different production processes can be easily monitored and configured through the production system. Since the amount of data processed by this system is too large for a single computer to handle, the production system is distributed. We are thus dealing not just with a framework for production but with a distributed system, and hence a solid understanding of distributed systems theory is required to replace it. The first part of this thesis lays the groundwork for replacing the distributed production system: a review of the state of the art in distributed systems research. It is a concise document in its own right, presenting the essentials of distributed systems in a clear manner, and can be used separately from the rest of this thesis as a short introduction to distributed systems. The second part of this thesis presents the production system in question, the need for its replacement, and our design for the new production system, which is maintainable, performant, available, reliable, and scalable. We go further than simply giving a design for the replacement production system, presenting a practical plan to implement it with Kubernetes, Brigade, and Riak CS.

    The Complexity of Reliable and Secure Distributed Transactions

    The use of transactions in distributed systems dates back to the 1970s, and the last decade has seen a proliferation of transactional systems. In existing transactional systems, many protocols take a centralized approach to executing a distributed transaction, in which a single process coordinates the participants of a transaction. The centralized approach is usually straightforward and efficient in the failure-free setting, yet the coordinator becomes a single point of failure, undermining reliability and security in the failure-prone setting, and can even be a performance bottleneck in practice. In this dissertation, we explore the complexity of decentralized solutions for reliable and secure distributed transactions, which do not use a distinguished coordinator or use the coordinator as little as possible. We show that for some problems in reliable distributed transactions there are decentralized solutions that perform as efficiently as the classical centralized one, while for others we determine the complexity limitations by proving lower and upper bounds, giving a better understanding of state-of-the-art solutions.

    We first study the complexity of two aspects of reliable transactions: atomicity and consistency. More specifically, we conduct a systematic study of the time and message complexity of non-blocking atomic commit of a distributed transaction, and investigate intrinsic limitations of causally consistent transactions. Our study of distributed transaction commit focuses on the complexity of the executions most frequent in practice, i.e., failure-free and willing to commit. Through this systematic study, we close many open questions, such as the complexity of synchronous non-blocking atomic commit. We also present an effective protocol that solves what we call indulgent atomic commit; it is suited to practical distributed database systems that are synchronous "most of the time" and can perform as efficiently as the two-phase commit protocol widely used in distributed database systems.

    Our investigation of causal transactions focuses on the limitations of read-only transactions, which are considered the most frequent in practice. We consider "fast" read-only transactions, whose operations are executed within one round-trip message exchange between a client seeking an object and the server storing it (so that no process can act as a coordinator). We show two impossibility results regarding "fast" read-only transactions: when read-only transactions are "fast", they have to be "visible", i.e., they induce inherent updates on the servers. We also present a "fast" read-only transaction protocol that is "visible", as an upper bound on the complexity of the inherent updates.

    We then study the complexity of secure transactions in the model of secure multiparty computation: even in the face of malicious parties, no party obtains the computation result unless all other parties obtain the same result. As this is impossible to achieve without any trusted party, we focus on optimism, in which all-honest parties can obtain the computation result without resorting to a trusted third party, and on the complexity of every optimistic execution in which all parties are honest. We prove a tight lower bound on the message complexity by relating the number of messages to the length of a permutation sequence in combinatorics, a pattern the messages of every optimistic execution must follow.
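
    Since two-phase commit is the centralized baseline the dissertation's decentralized protocols are compared against, a minimal failure-free sketch may help fix ideas. The Participant class and its methods are invented for illustration; real participants force-log their votes and decisions to survive crashes.

```python
from enum import Enum

class Vote(Enum):
    YES = 1
    NO = 2

class Participant:
    """Toy in-memory participant; real ones would force-log votes."""

    def __init__(self, healthy=True):
        self.healthy = healthy
        self.state = "active"

    def prepare(self):
        return Vote.YES if self.healthy else Vote.NO

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

def two_phase_commit(participants):
    # Phase 1 (voting): the coordinator collects a vote from everyone.
    votes = [p.prepare() for p in participants]
    # Phase 2 (decision): commit iff every participant voted YES,
    # then broadcast the decision to all participants.
    decision = all(v is Vote.YES for v in votes)
    for p in participants:
        p.commit() if decision else p.abort()
    return decision

print(two_phase_commit([Participant(), Participant()]))       # True
print(two_phase_commit([Participant(), Participant(False)]))  # False
```

    The coordinator structure is plain to see: every vote and every decision flows through one process, which is exactly the single point of failure the abstract mentions and what the dissertation's decentralized protocols try to avoid.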