7 research outputs found

    Generalized Wake-Up: Amortized Shared Memory Lower Bounds for Linearizable Data Structures

    In this work, we define the generalized wake-up problem, $GWU(s)$, for a shared memory asynchronous system with $n$ processes. Informally, the problem, which is parametrized by an increasing sequence $s = s_1,\ldots,s_p$, asks that at least $n - i + 1$ processes identify that at least $s_i$ other processes have "woken up" and taken at least one step for each $1 \le i \le n$. We prove that any solution to $GWU(s)$ that uses read/write/compare-and-swap variables requires at least $\Omega\left(\sum_{i = 1}^n \log s_i \right)$ steps to solve. The generalized wake-up lower bound serves as a technique for proving lower bounds on the amortized complexities of operations on many linearizable concurrent data types through reductions. We illustrate this with several examples: (1) We show an $\Omega(\log n)$ amortized lower bound on the complexity of implementing counters and fetch-and-increment objects which match the complexities of the algorithms given by Jayanti and Ellen & Woelfel; the lower bound even extends to a significantly relaxed version of the object. (2) We show an $\Omega(\log n)$ amortized lower bound on the complexity of the pop, dequeue, and deleteMin operations of a concurrent stack, queue, and priority queue respectively that hold even if the data type definitions are significantly relaxed; (3) In another paper, we have shown an $\Omega(\log\log(n \ell/m))$ amortized lower bound on the complexity of operations on a union-find object of size $\ell$ (when $m$ operations are performed). Comment: 8 pages, in Telug
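As a purely arithmetic illustration of how the general bound can yield a logarithmic amortized bound, suppose the sequence were $s_i = i$. This choice is an assumption made here for the example; the abstract does not say which sequence the reductions use.

```latex
\[
\sum_{i=1}^{n} \log s_i \;=\; \sum_{i=1}^{n} \log i \;=\; \log(n!) \;=\; \Theta(n \log n)
\quad\Longrightarrow\quad
\frac{1}{n}\,\Omega\!\left(\sum_{i=1}^{n} \log s_i\right) \;=\; \Omega(\log n).
\]
```

Under that assumption, spreading the total step count over $n$ operations gives $\Omega(\log n)$ steps per operation, which is the shape of the amortized bounds quoted in examples (1) and (2).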

    LL/SC and Atomic Copy: Constant Time, Space Efficient Implementations Using Only Pointer-Width CAS

    When designing concurrent algorithms, Load-Link/Store-Conditional (LL/SC) is often the ideal primitive to have because unlike Compare and Swap (CAS), LL/SC is immune to the ABA problem. However, the full semantics of LL/SC are not supported by any modern machine, so there has been a significant amount of work on simulations of LL/SC using Compare and Swap (CAS), a synchronization primitive that enjoys widespread hardware support. All of the algorithms so far that are constant time either use unbounded sequence numbers (and thus base objects of unbounded size), or require $\Omega(MP)$ space for $M$ LL/SC objects (where $P$ is the number of processes). We present a constant time implementation of $M$ LL/SC objects using $\Theta(M+kP^2)$ space, where $k$ is the maximum number of overlapping LL/SC operations per process (usually a constant), and requiring only pointer-sized CAS objects. Our implementation can also be used to implement $L$-word $LL/SC$ objects in $\Theta(L)$ time (for both $LL$ and $SC$) and $\Theta((M+kP^2)L)$ space. To achieve these bounds, we begin by implementing a new primitive called Single-Writer Copy which takes a pointer to a word sized memory location and atomically copies its contents into another object. The restriction is that only one process is allowed to write/copy into the destination object at a time. We believe this primitive will be very useful in designing other concurrent algorithms as well
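The ABA problem and the sequence-number style of LL/SC simulation mentioned in this abstract can be sketched as follows. This is a minimal illustration of the older unbounded-sequence-number approach, not the paper's constant-time, space-efficient construction. The names `CASCell`, `SeqLLSC`, and `aba_demo` are hypothetical, and because Python has no hardware CAS, a lock stands in for atomicity.

```python
import threading


class CASCell:
    """One shared word with compare-and-swap.

    Assumption: Python exposes no hardware CAS, so a lock stands in for atomicity.
    """

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False


class SeqLLSC:
    """LL/SC on top of CAS by storing (value, sequence number) in one word."""

    def __init__(self, value):
        self._cell = CASCell((value, 0))

    def load_link(self):
        pair = self._cell.read()
        return pair[0], pair  # the pair doubles as the link token

    def store_conditional(self, token, new_value):
        # Succeeds only if no successful SC happened since the matching LL.
        return self._cell.compare_and_swap(token, (new_value, token[1] + 1))


def aba_demo():
    # Plain CAS: the value goes A -> B -> A in between, yet the late CAS succeeds.
    cell = CASCell("A")
    seen = cell.read()
    cell.compare_and_swap("A", "B")
    cell.compare_and_swap("B", "A")
    plain_cas_fooled = cell.compare_and_swap(seen, "C")

    # Sequence-numbered LL/SC: same interleaving, but the stale SC is rejected.
    obj = SeqLLSC("A")
    _, token = obj.load_link()
    for nxt in ("B", "A"):
        _, fresh = obj.load_link()
        obj.store_conditional(fresh, nxt)
    stale_sc_succeeded = obj.store_conditional(token, "C")
    return plain_cas_fooled, stale_sc_succeeded
```

The sequence number here grows without bound, so on real hardware the pair would not fit in a pointer-width word indefinitely. That is the drawback the abstract attributes to earlier constant-time simulations.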

    Concurrent Disjoint Set Union

    We develop and analyze concurrent algorithms for the disjoint set union (union-find) problem in the shared memory, asynchronous multiprocessor model of computation, with CAS (compare and swap) or DCAS (double compare and swap) as the synchronization primitive. We give a deterministic bounded wait-free algorithm that uses DCAS and has a total work bound of $O(m \cdot (\log(np/m + 1) + \alpha(n, m/(np))))$ for a problem with $n$ elements and $m$ operations solved by $p$ processes, where $\alpha$ is a functional inverse of Ackermann's function. We give two randomized algorithms that use only CAS and have the same work bound in expectation. The analysis of the second randomized algorithm is valid even if the scheduler is adversarial. Our DCAS and randomized algorithms take $O(\log n)$ steps per operation, worst-case for the DCAS algorithm, high-probability for the randomized algorithms. Our work and step bounds grow only logarithmically with $p$, making our algorithms truly scalable. We prove that for a class of symmetric algorithms that includes ours, no better step or work bound is possible. Comment: 40 pages, combines ideas in two previous PODC paper
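To make the idea of a CAS-only union-find concrete, here is a rough sketch that links roots by a fixed random priority order and compacts paths with CAS during `find`. It is an illustration under stated assumptions, not the paper's algorithm, and it makes no claim to the stated work bounds. The class `ConcurrentDSU` and its methods are hypothetical names, and a lock simulates the atomicity of CAS since Python lacks the primitive.

```python
import random
import threading


class ConcurrentDSU:
    """Union-find whose only writes are (simulated) CAS operations on parent pointers."""

    def __init__(self, n, seed=0):
        self._parent = list(range(n))  # every element starts as its own root
        order = list(range(n))
        random.Random(seed).shuffle(order)
        self._priority = order  # distinct random priorities (an assumption of this sketch)
        self._lock = threading.Lock()  # stand-in for hardware CAS atomicity

    def _cas_parent(self, x, expected, new):
        with self._lock:
            if self._parent[x] == expected:
                self._parent[x] = new
                return True
            return False

    def find(self, x):
        while True:
            p = self._parent[x]
            if p == x:
                return x
            gp = self._parent[p]
            # Path splitting: point x at its grandparent; a failed CAS is harmless.
            self._cas_parent(x, p, gp)
            x = p

    def same_set(self, x, y):
        while True:
            rx, ry = self.find(x), self.find(y)
            if rx == ry:
                return True
            if self._parent[rx] == rx:  # rx was still a root when ry was found
                return False

    def unite(self, x, y):
        while True:
            rx, ry = self.find(x), self.find(y)
            if rx == ry:
                return False
            if self._priority[rx] > self._priority[ry]:
                rx, ry = ry, rx
            # Link the lower-priority root under the other, only if it is still a root.
            if self._cas_parent(rx, rx, ry):
                return True
```

Because links always go from a lower-priority root to a higher-priority one, parent pointers never form a cycle, and a failed link simply retries with freshly found roots.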

    A Wait-free Queue with Poly-logarithmic Worst-case Step Complexity

    In this work, we introduce a novel linearizable wait-free queue implementation. Linearizability and lock-freedom are standard requirements for designing shared data structures. To the best of our knowledge, all of the existing linearizable lock-free queues in the literature have a common problem in their worst case, called the CAS Retry Problem. We show that our algorithm avoids this problem with the helping mechanism which we use and has a worst-case running time better than prior lock-free queues. The amortized number of steps for an Enqueue or Dequeue in our algorithm is $O(\log^2 p + \log q)$, where $p$ is the number of processes and $q$ is the size of the queue when the operation is linearized
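The abstract does not define the CAS Retry Problem. One common reading, assumed here, is that many processes read the same shared word (for example a queue's tail pointer), all attempt a CAS on it, and only one succeeds per round while the rest must retry. The deterministic simulation below models that adversarial schedule on a single counter; `simulate_cas_retries` is a hypothetical helper and involves no real threads or hardware CAS.

```python
def simulate_cas_retries(p):
    """Count CAS attempts when p processes contend on one word in lockstep rounds."""
    shared = 0
    pending = list(range(p))  # processes that have not yet completed their update
    attempts = 0
    while pending:
        snapshot = shared  # every pending process reads the same value
        still_pending = []
        for proc in pending:
            attempts += 1
            if shared == snapshot:
                shared = snapshot + 1  # this CAS succeeds
            else:
                still_pending.append(proc)  # this CAS fails; retry next round
        pending = still_pending
    return attempts
```

Under this schedule the rounds cost p, p-1, ..., 1 attempts, so p operations take p(p+1)/2 CAS attempts in total, an amortized cost linear in p. A polylogarithmic bound such as the one quoted in the abstract avoids that linear growth.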

    Notes on Theory of Distributed Systems

    Notes for the Yale course CPSC 465/565 Theory of Distributed Systems