14 research outputs found

    The Weakest Failure Detector for Eventual Consistency

    Get PDF
    In its classical form, a consistent replicated service requires all replicas to witness the same evolution of the service state. Assuming a message-passing environment with a majority of correct processes, the necessary and sufficient information about failures for implementing a general state machine replication scheme ensuring consistency is captured by the {\Omega} failure detector. This paper shows that in such a message-passing environment, {\Omega} is also the weakest failure detector to implement an eventually consistent replicated service, where replicas are expected to agree on the evolution of the service state only after some (a priori unknown) time. In fact, we show that {\Omega} is the weakest to implement eventual consistency in any message-passing environment, i.e., under any assumption on when and where failures might occur. Ensuring (strong) consistency in any environment requires, in addition to {\Omega}, the quorum failure detector {\Sigma}. Our paper thus captures, for the first time, an exact computational difference be- tween building a replicated state machine that ensures consistency and one that only ensures eventual consistency

    Self-stabilizing virtual synchrony

    Get PDF
    Virtual synchrony (VS) is an important abstraction that is proven to be extremely useful when implemented over asynchronous, typically large, message-passing distributed systems. Fault tolerant design is critical for the success of such implementations since large distributed systems can be highly available as long as they do not depend on the full operational status of every system participant. Self-stabilizing systems can tolerate transient faults that drive the system to an arbitrary unpredictable configuration. Such systems automatically regain consistency from any such configuration, and then produce the desired system behavior ensuring it for practically infinite number of successive steps, e.g., 264 steps. We present a new multi-purpose self-stabilizing counter algorithm establishing an efficient practically unbounded counter, that can directly yield a self-stabilizing Multiple-Writer Multiple-Reader (MWMR) register emulation. We use our counter algorithm, together with a selfstabilizing group membership and a self-stabilizing multicast service to devise the first practically stabilizing VS algorithm and a self-stabilizing VS-based emulation of state machine replication (SMR). As we base the SMR implementation on VS, rather than consensus, the system progresses in more extreme asynchronous settings in relation to consensusbased SMR

    Self-stabilizing Byzantine Multivalued Consensus

    Full text link
    Consensus, abstracting a myriad of problems in which processes have to agree on a single value, is one of the most celebrated problems of fault-tolerant distributed computing. Consensus applications include fundamental services for the environments of the Cloud and Blockchain, and in such challenging environments, malicious behaviors are often modeled as adversarial Byzantine faults. At OPODIS 2010, Mostefaoui and Raynal (in short MR) presented a Byzantine-tolerant solution to consensus in which the decided value cannot be a value proposed only by Byzantine processes. MR has optimal resilience coping with up to t < n/3 Byzantine nodes over n processes. MR provides this multivalued consensus object (which accepts proposals taken from a finite set of values) assuming the availability of a single Binary consensus object (which accepts proposals taken from the set {0,1}). This work, which focuses on multivalued consensus, aims at the design of an even more robust solution than MR. Our proposal expands MR's fault-model with self-stabilization, a vigorous notion of fault-tolerance. In addition to tolerating Byzantine, self-stabilizing systems can automatically recover after the occurrence of arbitrary transient-faults. These faults represent any violation of the assumptions according to which the system was designed to operate (provided that the algorithm code remains intact). To the best of our knowledge, we propose the first self-stabilizing solution for intrusion-tolerant multivalued consensus for asynchronous message-passing systems prone to Byzantine failures. Our solution has a O(t) stabilization time from arbitrary transient faults.Comment: arXiv admin note: text overlap with arXiv:2110.0859

    Near-optimal self-stabilising counting and firing squads

    Get PDF
    Consider a fully-connected synchronous distributed system consisting of n nodes, where up to f nodes may be faulty and every node starts in an arbitrary initial state. In the synchronous C-counting problem, all nodes need to eventually agree on a counter that is increased by one modulo C in each round for given C&gt;1. In the self-stabilising firing squad problem, the task is to eventually guarantee that all non-faulty nodes have simultaneous responses to external inputs: if a subset of the correct nodes receive an external “go” signal as input, then all correct nodes should agree on a round (in the not-too-distant future) in which to jointly output a “fire” signal. Moreover, no node should generate a “fire” signal without some correct node having previously received a “go” signal as input. We present a framework reducing both tasks to binary consensus at very small cost. For example, we obtain a deterministic algorithm for self-stabilising Byzantine firing squads with optimal resilience f&lt;n/3, asymptotically optimal stabilisation and response time O(f), and message size O(log f). As our framework does not restrict the type of consensus routines used, we also obtain efficient randomised solutions

    self-stabilizing

    Get PDF
    Consider a fully-connected synchronous distributed system consisting of n nodes, where up to f nodes may be faulty and every node starts in an arbitrary initial state. In the synchronous C-counting problem, all nodes need to eventually agree on a counter that is increased by one modulo C in each round for given C&gt;1. In the self-stabilising firing squad problem, the task is to eventually guarantee that all non-faulty nodes have simultaneous responses to external inputs: if a subset of the correct nodes receive an external “go” signal as input, then all correct nodes should agree on a round (in the not-too-distant future) in which to jointly output a “fire” signal. Moreover, no node should generate a “fire” signal without some correct node having previously received a “go” signal as input. We present a framework reducing both tasks to binary consensus at very small cost. For example, we obtain a deterministic algorithm for self-stabilising Byzantine firing squads with optimal resilience f&lt;n/3, asymptotically optimal stabilisation and response time O(f), and message size O(log f). As our framework does not restrict the type of consensus routines used, we also obtain efficient randomised solutions

    Self-Stabilizing and Private Distributed Shared Atomic Memory in Seldomly Fair Message Passing Networks

    Get PDF
    We study the problem of privately emulating shared memory in message-passing networks. The system includes clients that store and retrieve replicated information on N servers, out of which e are data-corrupting malicious. When a client accesses a data-corrupting malicious server, the data field of that server response might be different from the value it originally stored. However, all other control variables in the server reply and protocol actions are according to the server algorithm. For the coded atomic storage algorithms by Cadambe et al., we present an enhancement that ensures no information leakage and data-corrupting malicious fault-tolerance. We also consider recovery after the occurrence of transient faults that violate the assumptions according to which the system was designed to operate. After their last occurrence, transient faults leave the system in an arbitrary state (while the program code stays intact). We present a self-stabilizing algorithm, which recovers after the occurrence of transient faults. This addition to Cadambe et al. considers asynchronous settings as long as no transient faults occur. The recovery from transient faults that bring the system counters (close) to their maximal values may include the use of a global reset procedure, which requires the system run to be controlled by a fair scheduler. After the recovery period, the safety properties are provided for asynchronous system runs that are not necessarily controlled by fair schedulers. Since the recovery period is bounded and the occurrence of transient faults is extremely rare, we call this design criteria self-stabilization in the presence of seldom fairness. Our self-stabilizing algorithm uses a bounded amount of storage during asynchronous executions (that are not necessarily controlled by fair schedulers). To the best of our knowledge, we are the first to address privacy, data-corrupting malicious behavior, and self-stabilization in the context of emulating atomic shared memory in message-passing systems
    corecore