5,540 research outputs found
The Weakest Failure Detector for Eventual Consistency
In its classical form, a consistent replicated service requires all replicas
to witness the same evolution of the service state. Assuming a message-passing
environment with a majority of correct processes, the necessary and sufficient
information about failures for implementing a general state machine replication
scheme ensuring consistency is captured by the {\Omega} failure detector. This
paper shows that in such a message-passing environment, {\Omega} is also the
weakest failure detector to implement an eventually consistent replicated
service, where replicas are expected to agree on the evolution of the service
state only after some (a priori unknown) time. In fact, we show that {\Omega}
is the weakest to implement eventual consistency in any message-passing
environment, i.e., under any assumption on when and where failures might occur.
Ensuring (strong) consistency in any environment requires, in addition to
{\Omega}, the quorum failure detector {\Sigma}. Our paper thus captures, for
the first time, an exact computational difference be- tween building a
replicated state machine that ensures consistency and one that only ensures
eventual consistency
Fault-Tolerant Consensus in Unknown and Anonymous Networks
This paper investigates under which conditions information can be reliably
shared and consensus can be solved in unknown and anonymous message-passing
networks that suffer from crash-failures. We provide algorithms to emulate
registers and solve consensus under different synchrony assumptions. For this,
we introduce a novel pseudo leader-election approach which allows a
leader-based consensus implementation without breaking symmetry
Wait-Freedom with Advice
We motivate and propose a new way of thinking about failure detectors which
allows us to define, quite surprisingly, what it means to solve a distributed
task \emph{wait-free} \emph{using a failure detector}. In our model, the system
is composed of \emph{computation} processes that obtain inputs and are supposed
to output in a finite number of steps and \emph{synchronization} processes that
are subject to failures and can query a failure detector. We assume that, under
the condition that \emph{correct} synchronization processes take sufficiently
many steps, they provide the computation processes with enough \emph{advice} to
solve the given task wait-free: every computation process outputs in a finite
number of its own steps, regardless of the behavior of other computation
processes. Every task can thus be characterized by the \emph{weakest} failure
detector that allows for solving it, and we show that every such failure
detector captures a form of set agreement. We then obtain a complete
classification of tasks, including ones that evaded comprehensible
characterization so far, such as renaming or weak symmetry breaking
The weakest failure detectors to boost obstruction-freedom
It is considered good practice in concurrent computing to devise shared object implementations that ensure a minimal obstruction-free progress property and delegate the task of boosting liveness to independent generic oracles called contention managers. This paper determines necessary and sufficient conditions to implement wait-free and non-blocking contention managers, i.e., contention managers that ensure wait-freedom (resp. non-blockingness) of any associated obstruction-free object implementation. The necessary conditions hold even when universal objects (like compare-and-swap) or random oracles are available in the implementation of the contention manager. On the other hand, the sufficient conditions assume only basic read/write objects, i.e., registers. We show that failure detector \lozenge{\fancyscript{P}} is the weakest to convert any obstruction-free algorithm into a wait-free one, and Ω *, a new failure detector which we introduce in this paper, and which is strictly weaker than \lozenge\fancyscript{P} but strictly stronger than Ω, is the weakest to convert any obstruction-free algorithm into a non-blocking one. We also address the issue of minimizing the overhead imposed by contention management in low contention scenarios. We propose two intermittent failure detectors and I_{\lozenge\fancyscript{P}} that are in a precise sense equivalent to, respectively, Ω * and \lozenge\fancyscript{P} , but allow for reducing the cost of failure detection in eventually synchronous systems when there is little contention. We present two contention managers: a non-blocking one and a wait-free one, that use, respectively, and I_{\lozenge\fancyscript{P}} . When there is no contention, the first induces very little overhead whereas the second induces some non-trivial overhead. We show that wait-free contention managers, unlike their non-blocking counterparts, impose an inherent non-trivial overhead even in contention-free execution
Fairness Properties of the Trusting Failure Detector
In 1985 it was shown by Fischer et al. that consensus, a fundamental problem in distributed computing, was impossible in asynchronous distributed systems in the presence of even just one process failure. This result prompted a search for alternative system models that were capable of solving such problems and culminated in the development of two helpful constructs: partially synchronous system models and failure detectors. Partially synchronous system models seek to solve the problem of identifying process crashes by constraining the real-time behavior of the underlying system. In the resulting models, crashed processes can be detected indirectly through the use of timeouts. Failure detectors, on the other hand, address process crashes by directly providing (potentially inaccurate) information on failures. As a result, failure detectors were viewed as abstractions of real-time information. Pike et al. proposed a different perspective on failure detectors; as abstracting fairness properties. Fairness in a system imposes bounds on the relative frequencies of communication and execution between processes in a system, and it was shown that four frequently-used failure detectors from the Chandra-Toueg hierarchy (P, ♢P, S, ♢S) encapsulate these fairness properties. This discovery suggests that failure detectors may be better understood as abstractions of fairness rather than real-time properties as well as demonstrates the possibility to communicate results between systems augmented with failure detectors and partially synchronous system models. In this thesis, we will be discussing an extension of the Pike et al. result to the trusting failure detector. The trusting failure detector is the weakest failure detector to implement the problem of fault-tolerant mutual exclusion: a fundamental primitive for distributed computing
Consensus using Asynchronous Failure Detectors
The FLP result shows that crash-tolerant consensus is impossible to solve in
asynchronous systems, and several solutions have been proposed for
crash-tolerant consensus under alternative (stronger) models. One popular
approach is to augment the asynchronous system with appropriate failure
detectors, which provide (potentially unreliable) information about process
crashes in the system, to circumvent the FLP impossibility.
In this paper, we demonstrate the exact mechanism by which (sufficiently
powerful) asynchronous failure detectors enable solving crash-tolerant
consensus. Our approach, which borrows arguments from the FLP impossibility
proof and the famous result from CHT, which shows that is a weakest
failure detector to solve consensus, also yields a natural proof to as
a weakest asynchronous failure detector to solve consensus. The use of I/O
automata theory in our approach enables us to model execution in a more
detailed fashion than CHT and also addresses the latent assumptions and
assertions in the original result in CHT
- …