8 research outputs found

    Wait-Free and Obstruction-Free Snapshot

    Get PDF
    The snapshot problem was first proposed over a decade ago and has since been well-studied in the distributed algorithms community. The challenge is to design a data structure consisting of mm components, shared by upto nn concurrent processes, that supports two operations. The first, Update(i,v)Update(i,v), atomically writes vv to the iith component. The second, Scan()Scan(), returns an atomic snapshot of all mm components. We consider two termination properties: wait-freedom, which requires a process to always terminate in a bounded number of its own steps, and the weaker obstruction-freedom, which requires such termination only for processes that eventually execute uninterrupted. First, we present a simple, time and space optimal, obstruction-free solution to the single-writer, multi-scanner version of the snapshot problem (wherein concurrent Updates never occur on the same component). Second, we assume hardware support for compare&swap (CAS) to give a time-optimal, wait-free solution to the multi-writer, single-scanner snapshot problem (wherein concurrent Scans never occur). This algorithm uses only O(mn)O(mn) space and has optimal CAS, write and remote-reference complexities. Additionally, it can be augmented to implement a general snapshot object with the same time and space bounds, thus improving the space complexity of O(mn2)O(mn^2) of the only previously known time-optimal solution

    Polynomial-Time Verification and Testing of Implementations of the Snapshot Data Structure

    Get PDF
    We analyze correctness of implementations of the snapshot data structure in terms of linearizability. We show that such implementations can be verified in polynomial time. Additionally, we identify a set of representative executions for testing and show that the correctness of each of these executions can be validated in linear time. These results present a significant speedup considering that verifying linearizability of implementations of concurrent data structures, in general, is EXPSPACE-complete in the number of program-states, and testing linearizability is NP-complete in the length of the tested execution. The crux of our approach is identifying a class of executions, which we call simple, such that a snapshot implementation is linearizable if and only if all of its simple executions are linearizable. We then divide all possible non-linearizable simple executions into three categories and construct a small automaton that recognizes each category. We describe two implementations (one for verification and one for testing) of an automata-based approach that we develop based on this result and an evaluation that demonstrates significant improvements over existing tools. For verification, we show that restricting a state-of-the-art tool to analyzing only simple executions saves resources and allows the analysis of more complex cases. Specifically, restricting attention to simple executions finds bugs in 27 instances, whereas, without this restriction, we were only able to find 14 of the 30 bugs in the instances we examined. We also show that our technique accelerates testing performance significantly. Specifically, our implementation solves the complete set of 900 problems we generated, whereas the state-of-the-art linearizability testing tool solves only 554 problems

    Notes on Theory of Distributed Systems

    Full text link
    Notes for the Yale course CPSC 465/565 Theory of Distributed Systems

    Towards a practical snapshot algorithm

    Get PDF
    AbstractAn atomic snapshot memory is an implementation of a multiple-location shared memory that can be atomically read in its entirety without preventing concurrent writing. The design of wait-free implementations of atomic snapshot memories has been the subject of extensive theoretical research in recent years. This paper introduces the coordinated-collect algorithm, a novel wait-free atomic snapshot construction which we believe is a first step in taking snapshots from theory to practice. Unlike previous algorithms, it uses currently available multiprocessor synchronization operations to provide an algorithm that has only O(1) update complexity and O(n) scan complexity, with very small constants. We evaluated the performance of known snapshot algorithms for a collection of benchmarks on a simulated distributed shared-memory multiprocessor. Our empirical evidence suggests that coordinated-collect will outperform all known wait-free, lock-free, and locking snapshot algorithms in terms of overall throughput and latency
    corecore