
    The Weakest Failure Detector for Eventual Consistency

    In its classical form, a consistent replicated service requires all replicas to witness the same evolution of the service state. Assuming a message-passing environment with a majority of correct processes, the necessary and sufficient information about failures for implementing a general state machine replication scheme ensuring consistency is captured by the Ω failure detector. This paper shows that in such a message-passing environment, Ω is also the weakest failure detector to implement an eventually consistent replicated service, where replicas are expected to agree on the evolution of the service state only after some (a priori unknown) time. In fact, we show that Ω is the weakest to implement eventual consistency in any message-passing environment, i.e., under any assumption on when and where failures might occur. Ensuring (strong) consistency in any environment requires, in addition to Ω, the quorum failure detector Σ. Our paper thus captures, for the first time, an exact computational difference between building a replicated state machine that ensures consistency and one that only ensures eventual consistency.
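
As an illustration of what an Ω-style oracle provides, the sketch below derives a leader output from per-process suspicion sets by picking the smallest identifier that is not suspected. This is a textbook-style reduction, not the construction from the paper above. It assumes that suspicion sets eventually contain exactly the crashed processes, and the names `omega_output` and `leaders` are hypothetical.

```python
def omega_output(process_ids, suspected):
    """Return the smallest process id not currently suspected, or None.

    Assumption: `suspected` eventually equals the set of crashed processes,
    so every correct process eventually returns the same correct leader.
    """
    trusted = sorted(p for p in process_ids if p not in suspected)
    return trusted[0] if trusted else None


def leaders(process_ids, suspicions):
    """Map each observing process to the leader it currently trusts."""
    return {p: omega_output(process_ids, s) for p, s in suspicions.items()}


ids = [1, 2, 3]  # process 1 is assumed to have crashed

# Early on, suspicion sets may be wrong and processes may disagree.
early = leaders(ids, {2: set(), 3: {1, 2}})

# Once suspicions stabilize on the crashed process, the outputs agree.
late = leaders(ids, {2: {1}, 3: {1}})
```

Here `early` shows two processes trusting different leaders, while `late` shows both trusting process 2. Agreement only eventually, after an unknown stabilization time, is the kind of guarantee Ω is usually defined to give.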

    Fault-Tolerant Consensus in Unknown and Anonymous Networks

    This paper investigates the conditions under which information can be reliably shared and consensus can be solved in unknown and anonymous message-passing networks that suffer from crash failures. We provide algorithms to emulate registers and solve consensus under different synchrony assumptions. For this, we introduce a novel pseudo leader-election approach which allows a leader-based consensus implementation without breaking symmetry.

    Wait-Freedom with Advice

    We motivate and propose a new way of thinking about failure detectors which allows us to define, quite surprisingly, what it means to solve a distributed task wait-free using a failure detector. In our model, the system is composed of computation processes that obtain inputs and are supposed to output in a finite number of steps and synchronization processes that are subject to failures and can query a failure detector. We assume that, under the condition that correct synchronization processes take sufficiently many steps, they provide the computation processes with enough advice to solve the given task wait-free: every computation process outputs in a finite number of its own steps, regardless of the behavior of other computation processes. Every task can thus be characterized by the weakest failure detector that allows for solving it, and we show that every such failure detector captures a form of set agreement. We then obtain a complete classification of tasks, including ones that evaded comprehensible characterization so far, such as renaming or weak symmetry breaking.

    The weakest failure detectors to boost obstruction-freedom

    It is considered good practice in concurrent computing to devise shared object implementations that ensure a minimal obstruction-free progress property and delegate the task of boosting liveness to independent generic oracles called contention managers. This paper determines necessary and sufficient conditions to implement wait-free and non-blocking contention managers, i.e., contention managers that ensure wait-freedom (resp. non-blockingness) of any associated obstruction-free object implementation. The necessary conditions hold even when universal objects (like compare-and-swap) or random oracles are available in the implementation of the contention manager. On the other hand, the sufficient conditions assume only basic read/write objects, i.e., registers. We show that failure detector ♢P is the weakest to convert any obstruction-free algorithm into a wait-free one, and Ω*, a new failure detector which we introduce in this paper, and which is strictly weaker than ♢P but strictly stronger than Ω, is the weakest to convert any obstruction-free algorithm into a non-blocking one. We also address the issue of minimizing the overhead imposed by contention management in low contention scenarios. We propose two intermittent failure detectors I_{Ω*} and I_{♢P} that are in a precise sense equivalent to, respectively, Ω* and ♢P, but allow for reducing the cost of failure detection in eventually synchronous systems when there is little contention. We present two contention managers: a non-blocking one and a wait-free one, that use, respectively, I_{Ω*} and I_{♢P}. When there is no contention, the first induces very little overhead whereas the second induces some non-trivial overhead. We show that wait-free contention managers, unlike their non-blocking counterparts, impose an inherent non-trivial overhead even in contention-free executions.

    Fairness Properties of the Trusting Failure Detector

    In 1985, Fischer et al. showed that consensus, a fundamental problem in distributed computing, is impossible in asynchronous distributed systems in the presence of even just one process failure. This result prompted a search for alternative system models in which such problems can be solved, and culminated in the development of two helpful constructs: partially synchronous system models and failure detectors. Partially synchronous system models seek to solve the problem of identifying process crashes by constraining the real-time behavior of the underlying system. In the resulting models, crashed processes can be detected indirectly through the use of timeouts. Failure detectors, on the other hand, address process crashes by directly providing (potentially inaccurate) information on failures. As a result, failure detectors were viewed as abstractions of real-time information. Pike et al. proposed a different perspective: failure detectors as abstractions of fairness properties. Fairness in a system imposes bounds on the relative frequencies of communication and execution between processes in a system, and it was shown that four frequently used failure detectors from the Chandra-Toueg hierarchy (P, ♢P, S, ♢S) encapsulate these fairness properties. This discovery suggests that failure detectors may be better understood as abstractions of fairness rather than of real-time properties, and it demonstrates that results can be transferred between systems augmented with failure detectors and partially synchronous system models. In this thesis, we discuss an extension of the Pike et al. result to the trusting failure detector. The trusting failure detector is the weakest failure detector to solve fault-tolerant mutual exclusion, a fundamental primitive for distributed computing.
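
The timeout-based detection mentioned in this abstract can be sketched as a heartbeat monitor that suspects a peer after a period of silence and doubles that peer's timeout whenever a suspicion turns out to be false. This is a minimal sketch, not the thesis's construction. The class name, the initial timeout of 2.0 time units, and the doubling rule are all assumptions, and time is passed in explicitly so the example is deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class TimeoutFailureDetector:
    """Heartbeat-based detector with adaptive timeouts (hypothetical sketch)."""
    peers: list
    initial_timeout: float = 2.0  # assumed starting timeout, in abstract time units
    last_heard: dict = field(default_factory=dict)
    timeout: dict = field(default_factory=dict)
    suspected: set = field(default_factory=set)

    def __post_init__(self):
        for p in self.peers:
            self.last_heard[p] = 0.0
            self.timeout[p] = self.initial_timeout

    def on_heartbeat(self, peer, now):
        """Record a heartbeat; undo a false suspicion and double that peer's timeout."""
        self.last_heard[peer] = now
        if peer in self.suspected:
            self.suspected.discard(peer)
            self.timeout[peer] *= 2

    def check(self, now):
        """Suspect every peer that has been silent longer than its timeout."""
        for p in self.peers:
            if now - self.last_heard[p] > self.timeout[p]:
                self.suspected.add(p)
        return set(self.suspected)


def demo():
    fd = TimeoutFailureDetector(peers=["p1", "p2"])
    fd.on_heartbeat("p1", now=1.0)
    fd.on_heartbeat("p2", now=1.0)
    first = fd.check(now=4.0)        # both silent for 3.0 > 2.0: both suspected
    fd.on_heartbeat("p1", now=4.5)   # p1 was only slow: trusted again, timeout doubled
    second = fd.check(now=8.0)       # p1 silent 3.5 <= 4.0; p2 silent 7.0 > 2.0
    return first, second, fd.timeout["p1"]
```

If message delays are eventually bounded, the doubling means a slow but correct peer is suspected only finitely often, while a crashed peer stops sending heartbeats and stays suspected. This is why such detectors give potentially inaccurate information that becomes accurate only eventually.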

    Consensus using Asynchronous Failure Detectors

    The FLP result shows that crash-tolerant consensus is impossible to solve in asynchronous systems, and several solutions have been proposed for crash-tolerant consensus under alternative (stronger) models. One popular approach is to augment the asynchronous system with appropriate failure detectors, which provide (potentially unreliable) information about process crashes in the system, to circumvent the FLP impossibility. In this paper, we demonstrate the exact mechanism by which (sufficiently powerful) asynchronous failure detectors enable solving crash-tolerant consensus. Our approach, which borrows arguments from the FLP impossibility proof and from the famous CHT result showing that Ω is a weakest failure detector to solve consensus, also yields a natural proof that Ω is a weakest asynchronous failure detector to solve consensus. The use of I/O automata theory in our approach enables us to model executions in a more detailed fashion than CHT and also addresses the latent assumptions and assertions in the original CHT result.