7,819 research outputs found
Randomized Two-Process Wait-Free Test-and-Set
We present the first explicit, and currently simplest, randomized algorithm
for 2-process wait-free test-and-set. It is implemented with two 4-valued
single writer single reader atomic variables. A test-and-set takes at most 11
expected elementary steps, while a reset takes exactly 1 elementary step. Based
on a finite-state analysis, the proofs of correctness and expected length are
compressed into one table.Comment: 9 pages, 4 figures, LaTeX source; Submitte
A Complexity-Based Hierarchy for Multiprocessor Synchronization
For many years, Herlihy's elegant computability based Consensus Hierarchy has
been our best explanation of the relative power of various types of
multiprocessor synchronization objects when used in deterministic algorithms.
However, key to this hierarchy is treating synchronization instructions as
distinct objects, an approach that is far from the real-world, where
multiprocessor programs apply synchronization instructions to collections of
arbitrary memory locations. We were surprised to realize that, when considering
instructions applied to memory locations, the computability based hierarchy
collapses. This leaves open the question of how to better capture the power of
various synchronization instructions.
In this paper, we provide an approach to answering this question. We present
a hierarchy of synchronization instructions, classified by their space
complexity in solving obstruction-free consensus. Our hierarchy provides a
classification of combinations of known instructions that seems to fit with our
intuition of how useful some are in practice, while questioning the
effectiveness of others. We prove an essentially tight characterization of the
power of buffered read and write instructions.Interestingly, we show a similar
result for multi-location atomic assignments
Compositional competitiveness for distributed algorithms
We define a measure of competitive performance for distributed algorithms
based on throughput, the number of tasks that an algorithm can carry out in a
fixed amount of work. This new measure complements the latency measure of Ajtai
et al., which measures how quickly an algorithm can finish tasks that start at
specified times. The novel feature of the throughput measure, which
distinguishes it from the latency measure, is that it is compositional: it
supports a notion of algorithms that are competitive relative to a class of
subroutines, with the property that an algorithm that is k-competitive relative
to a class of subroutines, combined with an l-competitive member of that class,
gives a combined algorithm that is kl-competitive.
In particular, we prove the throughput-competitiveness of a class of
algorithms for collect operations, in which each of a group of n processes
obtains all values stored in an array of n registers. Collects are a
fundamental building block of a wide variety of shared-memory distributed
algorithms, and we show that several such algorithms are competitive relative
to collects. Inserting a competitive collect in these algorithms gives the
first examples of competitive distributed algorithms obtained by composition
using a general construction.Comment: 33 pages, 2 figures; full version of STOC 96 paper titled "Modular
competitiveness for distributed algorithms.
Randomized protocols for asynchronous consensus
The famous Fischer, Lynch, and Paterson impossibility proof shows that it is
impossible to solve the consensus problem in a natural model of an asynchronous
distributed system if even a single process can fail. Since its publication,
two decades of work on fault-tolerant asynchronous consensus algorithms have
evaded this impossibility result by using extended models that provide (a)
randomization, (b) additional timing assumptions, (c) failure detectors, or (d)
stronger synchronization mechanisms than are available in the basic model.
Concentrating on the first of these approaches, we illustrate the history and
structure of randomized asynchronous consensus protocols by giving detailed
descriptions of several such protocols.Comment: 29 pages; survey paper written for PODC 20th anniversary issue of
Distributed Computin
Are Lock-Free Concurrent Algorithms Practically Wait-Free?
Lock-free concurrent algorithms guarantee that some concurrent operation will
always make progress in a finite number of steps. Yet programmers prefer to
treat concurrent code as if it were wait-free, guaranteeing that all operations
always make progress. Unfortunately, designing wait-free algorithms is
generally a very complex task, and the resulting algorithms are not always
efficient. While obtaining efficient wait-free algorithms has been a long-time
goal for the theory community, most non-blocking commercial code is only
lock-free.
This paper suggests a simple solution to this problem. We show that, for a
large class of lock- free algorithms, under scheduling conditions which
approximate those found in commercial hardware architectures, lock-free
algorithms behave as if they are wait-free. In other words, programmers can
keep on designing simple lock-free algorithms instead of complex wait-free
ones, and in practice, they will get wait-free progress.
Our main contribution is a new way of analyzing a general class of lock-free
algorithms under a stochastic scheduler. Our analysis relates the individual
performance of processes with the global performance of the system using Markov
chain lifting between a complex per-process chain and a simpler system progress
chain. We show that lock-free algorithms are not only wait-free with
probability 1, but that in fact a general subset of lock-free algorithms can be
closely bounded in terms of the average number of steps required until an
operation completes.
To the best of our knowledge, this is the first attempt to analyze progress
conditions, typically stated in relation to a worst case adversary, in a
stochastic model capturing their expected asymptotic behavior.Comment: 25 page
Maintenance of Strongly Connected Component in Shared-memory Graph
In this paper, we present an on-line fully dynamic algorithm for maintaining
strongly connected component of a directed graph in a shared memory
architecture. The edges and vertices are added or deleted concurrently by fixed
number of threads. To the best of our knowledge, this is the first work to
propose using linearizable concurrent directed graph and is build using both
ordered and unordered list-based set. We provide an empirical comparison
against sequential and coarse-grained. The results show our algorithm's
throughput is increased between 3 to 6x depending on different workload
distributions and applications. We believe that there are huge applications in
the on-line graph. Finally, we show how the algorithm can be extended to
community detection in on-line graph.Comment: 29 pages, 4 figures, Accepted in the Conference NETYS-201
Lower Bounds for Shared-Memory Leader Election Under Bounded Write Contention
This paper gives tight logarithmic lower bounds on the solo step complexity of leader election in an asynchronous shared-memory model with single-writer multi-reader (SWMR) registers, for both deterministic and randomized obstruction-free algorithms. The approach extends to lower bounds for deterministic and randomized obstruction-free algorithms using multi-writer registers under bounded write concurrency, showing a trade-off between the solo step complexity of a leader election algorithm, and the worst-case number of stalls incurred by a processor in an execution
- …