Wait-Freedom with Advice
We motivate and propose a new way of thinking about failure detectors which
allows us to define, quite surprisingly, what it means to solve a distributed
task \emph{wait-free} \emph{using a failure detector}. In our model, the system
is composed of \emph{computation} processes that obtain inputs and are supposed
to output in a finite number of steps and \emph{synchronization} processes that
are subject to failures and can query a failure detector. We assume that, under
the condition that \emph{correct} synchronization processes take sufficiently
many steps, they provide the computation processes with enough \emph{advice} to
solve the given task wait-free: every computation process outputs in a finite
number of its own steps, regardless of the behavior of other computation
processes. Every task can thus be characterized by the \emph{weakest} failure
detector that allows for solving it, and we show that every such failure
detector captures a form of set agreement. We then obtain a complete
classification of tasks, including ones that evaded comprehensible
characterization so far, such as renaming or weak symmetry breaking
Relating L-Resilience and Wait-Freedom via Hitting Sets
The condition of t-resilience stipulates that an n-process program is only
obliged to make progress when at least n-t processes are correct. Put another
way, the live sets, the collection of process sets such that progress is
required if all the processes in one of these sets are correct, are all sets
with at least n-t processes.
We show that the ability of an arbitrary collection of live sets L to solve
distributed tasks is tightly related to the minimum hitting set of L, a minimum
cardinality subset of processes that has a non-empty intersection with every
live set. Thus, finding the computing power of L is NP-complete.
For the special case of colorless tasks that allow participating processes to
adopt input or output values of each other, we use a simple simulation to show
that a task can be solved L-resiliently if and only if it can be solved
(h-1)-resiliently, where h is the size of the minimum hitting set of L.
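As a small illustration of the hitting-set notion used above, the following sketch computes a minimum hitting set by brute force and checks it on the t-resilient case. The function names and the example parameters (5 processes, t = 2) are assumptions made for illustration and do not come from the paper. The exhaustive search is exponential in the number of processes, in line with the hardness noted above.

```python
from itertools import combinations

def min_hitting_set(processes, live_sets):
    """Return a smallest subset of processes that intersects every live set.

    Brute force: tries candidates in order of increasing size.
    Returns None if no hitting set exists (some live set is empty).
    """
    for size in range(len(processes) + 1):
        for candidate in combinations(processes, size):
            chosen = set(candidate)
            if all(chosen & live for live in live_sets):
                return chosen
    return None

def t_resilient_live_sets(processes, t):
    """All subsets with at least n - t processes (the t-resilient live sets)."""
    n = len(processes)
    return [set(c)
            for k in range(n - t, n + 1)
            for c in combinations(processes, k)]

# Hypothetical example: 5 processes, t = 2.
processes = list(range(5))
live_sets = t_resilient_live_sets(processes, 2)
h = len(min_hitting_set(processes, live_sets))
print(h)  # 3, so h - 1 = 2 = t
```

For the t-resilient live sets the minimum hitting set has size t + 1, so the (h-1)-resilience in the statement above coincides with ordinary t-resilience in that case.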
For general tasks, we characterize L-resilient solvability of tasks with
respect to a limited notion of weak solvability: in every execution where all
processes in some set in L are correct, outputs must be produced for every
process in some (possibly different) participating set in L. Given a task T, we
construct another task T_L such that T is solvable weakly L-resiliently if and
only if T_L is solvable weakly wait-free
Randomized protocols for asynchronous consensus
The famous Fischer, Lynch, and Paterson impossibility proof shows that it is
impossible to solve the consensus problem in a natural model of an asynchronous
distributed system if even a single process can fail. Since its publication,
two decades of work on fault-tolerant asynchronous consensus algorithms have
evaded this impossibility result by using extended models that provide (a)
randomization, (b) additional timing assumptions, (c) failure detectors, or (d)
stronger synchronization mechanisms than are available in the basic model.
Concentrating on the first of these approaches, we illustrate the history and
structure of randomized asynchronous consensus protocols by giving detailed
descriptions of several such protocols.
Comment: 29 pages; survey paper written for PODC 20th anniversary issue of
Distributed Computin
Reconfigurable Lattice Agreement and Applications
Reconfiguration is one of the central mechanisms in distributed systems. Due to failures and connectivity disruptions, the very set of service replicas (or servers) and their roles in the computation may have to be reconfigured over time. To provide the desired level of consistency and availability to applications running on top of these servers, the clients of the service should be able to reach some form of agreement on the system configuration. We observe that this agreement is naturally captured via a lattice partial order on the system states. We propose an asynchronous implementation of reconfigurable lattice agreement that implies elegant reconfigurable versions of a large class of lattice abstract data types, such as max-registers and conflict detectors, as well as popular distributed programming abstractions, such as atomic snapshot and commit-adopt
Why Extension-Based Proofs Fail
We introduce extension-based proofs, a class of impossibility proofs that
includes valency arguments. They are modelled as an interaction between a
prover and a protocol. Using proofs based on combinatorial topology, it has
been shown that it is impossible to deterministically solve k-set agreement
among n > k > 1 processes in a wait-free manner in certain asynchronous models.
However, it was unknown whether proofs based on simpler techniques were
possible. We show that this impossibility result cannot be obtained for one of
these models by an extension-based proof and, hence, extension-based proofs are
limited in power.
Comment: This version of the paper is for the NIS model. Previous versions of
the paper are for the NIIS mode