Generalized Wake-Up: Amortized Shared Memory Lower Bounds for Linearizable Data Structures
In this work, we define the generalized wake-up problem, , for a
shared memory asynchronous system with processes. Informally, the problem,
which is parametrized by an increasing sequence , asks that
at least processes identify that at least other processes
have "woken up" and taken at least one step for each . We prove
that any solution to that uses read/write/compare-and-swap variables
requires at least steps to solve.
The generalized wake-up lower bound serves as a technique for proving lower
bounds on the amortized complexities of operations on many linearizable
concurrent data types through reductions. We illustrate this with several
examples: (1) We show an amortized lower bound on the
complexity of implementing counters and fetch-and-increment objects, which
matches the complexities of the algorithms given by Jayanti and Ellen & Woelfel;
the lower bound even extends to a significantly relaxed version of the object.
(2) We show an amortized lower bound on the complexity of the
pop, dequeue, and deleteMin operations of a concurrent stack, queue, and
priority queue respectively that hold even if the data type definitions are
significantly relaxed; (3) In another paper, we have shown an
amortized lower bound on the complexity of
operations on a union-find object of size (when operations are
performed). Comment: 8 pages, in Telug
LL/SC and Atomic Copy: Constant Time, Space Efficient Implementations Using Only Pointer-Width CAS
When designing concurrent algorithms, Load-Link/Store-Conditional (LL/SC) is
often the ideal primitive to have because unlike Compare and Swap (CAS), LL/SC
is immune to the ABA problem. However, the full semantics of LL/SC are not
supported by any modern machine, so there has been a significant amount of work
on simulations of LL/SC using Compare and Swap (CAS), a synchronization
primitive that enjoys widespread hardware support. All of the algorithms so far
that are constant time either use unbounded sequence numbers (and thus base
objects of unbounded size), or require space for LL/SC object
(where is the number of processes). We present a constant time
implementation of LL/SC objects using space, where is
the maximum number of overlapping LL/SC operations per process (usually a
constant), and requiring only pointer-sized CAS objects. Our implementation can
also be used to implement -word objects in time (for
both and ) and space. To achieve these bounds, we
begin by implementing a new primitive called Single-Writer Copy which takes a
pointer to a word sized memory location and atomically copies its contents into
another object. The restriction is that only one process is allowed to
write/copy into the destination object at a time. We believe this primitive
will be very useful in designing other concurrent algorithms as well.
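As a rough illustration of the ABA problem and of the sequence-number approach that the abstract contrasts itself with, the sketch below simulates LL/SC on top of CAS by pairing each value with a version counter. It is not the paper's construction: `CASCell`, `VersionedLLSC`, and `aba_demo` are hypothetical names, the lock only models the atomicity of a hardware CAS, and the version counter is unbounded, which is exactly the cost the paper aims to avoid.

```python
import threading


class CASCell:
    """One shared word with compare-and-swap.

    Assumption: the lock merely models the atomicity of a hardware CAS.
    """

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def cas(self, expected, new):
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False


class VersionedLLSC:
    """LL/SC simulated by storing a (value, version) pair in one CAS word.

    Hypothetical sketch: the version grows without bound.
    """

    def __init__(self, value):
        self._cell = CASCell((value, 0))
        self._linked = {}  # process id -> pair observed by its last LL

    def ll(self, pid):
        pair = self._cell.load()
        self._linked[pid] = pair
        return pair[0]

    def sc(self, pid, new_value):
        value, version = self._linked[pid]
        # Succeeds only if no successful SC happened since this process's LL.
        return self._cell.cas((value, version), (new_value, version + 1))


def aba_demo():
    # Plain CAS: the value goes A -> B -> A, and a stale CAS still succeeds.
    plain = CASCell("A")
    seen = plain.load()
    plain.cas("A", "B")
    plain.cas("B", "A")
    plain_ok = plain.cas(seen, "C")

    # Versioned LL/SC: the same interference makes process 0's SC fail.
    obj = VersionedLLSC("A")
    obj.ll(0)
    obj.ll(1)
    obj.sc(1, "B")
    obj.ll(1)
    obj.sc(1, "A")
    llsc_ok = obj.sc(0, "C")
    return plain_ok, llsc_ok
```

Calling `aba_demo()` returns a pair of booleans: whether the stale plain CAS succeeded, and whether the stale SC succeeded.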
Concurrent Disjoint Set Union
We develop and analyze concurrent algorithms for the disjoint set union
(union-find) problem in the shared memory, asynchronous multiprocessor model of
computation, with CAS (compare and swap) or DCAS (double compare and swap) as
the synchronization primitive. We give a deterministic bounded wait-free
algorithm that uses DCAS and has a total work bound of for a problem with elements and operations
solved by processes, where is a functional inverse of Ackermann's
function. We give two randomized algorithms that use only CAS and have the same
work bound in expectation. The analysis of the second randomized algorithm is
valid even if the scheduler is adversarial. Our DCAS and randomized algorithms
take steps per operation, worst-case for the DCAS algorithm,
high-probability for the randomized algorithms. Our work and step bounds grow
only logarithmically with , making our algorithms truly scalable. We prove
that for a class of symmetric algorithms that includes ours, no better step or
work bound is possible. Comment: 40 pages, combines ideas in two previous PODC paper
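For readers unfamiliar with the underlying problem, here is a minimal sequential sketch of disjoint set union, using union by rank and path halving. It is only a single-threaded baseline: it uses no CAS or DCAS, it is not safe for concurrent use, and it does not reproduce the paper's algorithms or bounds. The class and method names are illustrative.

```python
class DisjointSets:
    """Sequential union-find over elements 0..n-1 (illustrative baseline)."""

    def __init__(self, n):
        self.parent = list(range(n))  # each element starts as its own root
        self.rank = [0] * n           # upper bound on the height of each tree

    def find(self, x):
        # Path halving: point each visited node at its grandparent.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def unite(self, x, y):
        """Merge the sets containing x and y; return False if already merged."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True

    def same_set(self, x, y):
        return self.find(x) == self.find(y)
```

A concurrent version has to replace the plain writes to `parent` with atomic updates, which is where the synchronization primitives named in the abstract come in.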
A Wait-free Queue with Poly-logarithmic Worst-case Step Complexity
In this work, we introduce a novel linearizable wait-free queue implementation. Linearizability and lock-freedom are standard requirements for designing shared data structures. To the best of our knowledge, all of the existing linearizable lock-free queues in the literature have a common problem in their worst case, called the CAS Retry Problem. We show that our algorithm avoids this problem through its helping mechanism and has a worst-case running time better than prior lock-free queues. The amortized number of steps for an Enqueue or Dequeue in our algorithm is O(log^2 p + log q), where p is the number of processes and q is the size of the queue when the operation is linearized.
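The retry behaviour behind the CAS Retry Problem can be sketched with a shared counter instead of a queue: every operation reads the shared word, attempts a CAS, and starts over whenever another process got there first. This is an assumption-laden illustration, not the paper's algorithm. `CASCell`, `cas_increment`, and `run` are hypothetical names, and the lock only models the atomicity of a hardware CAS.

```python
import threading


class CASCell:
    """One shared word with compare-and-swap (lock models hardware atomicity)."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def cas(self, expected, new):
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False


def cas_increment(cell):
    """Lock-free increment; returns how many CAS attempts failed."""
    retries = 0
    while True:
        old = cell.load()
        if cell.cas(old, old + 1):
            return retries
        # Another thread changed the word between load and CAS: retry.
        retries += 1


def run(num_threads=4, per_thread=1000):
    cell = CASCell(0)
    retries = [0] * num_threads  # one slot per thread, so no shared writes

    def worker(i):
        for _ in range(per_thread):
            retries[i] += cas_increment(cell)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cell.load(), sum(retries)
```

`run()` returns the final counter value and the total number of failed CAS attempts. The loop is lock-free because some thread always succeeds, but a single unlucky thread has no bound on its own retries, which is the gap that a helping mechanism is meant to close.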
Notes on Theory of Distributed Systems
Notes for the Yale course CPSC 465/565 Theory of Distributed Systems