49 research outputs found
Self-stabilizing virtual synchrony
Virtual synchrony (VS) is an important abstraction that has proven extremely useful when implemented over asynchronous, typically large, message-passing distributed systems. Fault-tolerant design is critical for the success of such implementations, since large distributed systems can be highly available as long as they do not depend on the full operational status of every system participant. Self-stabilizing systems can tolerate transient faults that drive the system to an arbitrary, unpredictable configuration. Such systems automatically regain consistency from any such configuration and then produce the desired system behavior, ensuring it for a practically infinite number of successive steps, e.g., 2^64 steps. We present a new multi-purpose self-stabilizing counter algorithm establishing an efficient, practically unbounded counter that can directly yield a self-stabilizing Multiple-Writer Multiple-Reader (MWMR) register emulation. We use our counter algorithm, together with a self-stabilizing group membership service and a self-stabilizing multicast service, to devise the first practically stabilizing VS algorithm and a self-stabilizing VS-based emulation of state machine replication (SMR). As we base the SMR implementation on VS rather than on consensus, the system progresses in more extreme asynchronous settings than consensus-based SMR does.
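To give a sense of why 2^64 steps can be treated as practically infinite, the following minimal sketch estimates how long a counter of that size would take to exhaust. The step rate of one billion steps per second and the helper name years_to_exhaust are assumptions chosen for illustration; neither comes from the abstract.

```python
# Rough estimate of how long a 2^64-step budget lasts.
# The step rate below is an illustrative assumption, not a figure from the paper.

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, in seconds


def years_to_exhaust(bits: int, steps_per_second: float) -> float:
    """Years needed to take 2**bits steps at the given constant rate."""
    return (2 ** bits) / steps_per_second / SECONDS_PER_YEAR


# Hypothetical system taking one step per nanosecond.
years = years_to_exhaust(64, 1e9)
print(f"{years:.1f} years")
```

Under these assumptions the budget lasts several centuries, which is the sense in which such a bound is "practically" unbounded.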
Brief Announcement: Self-stabilizing Virtual Synchrony
Systems satisfying the Virtual Synchrony (VS) [2] property provide message multicast and group membership services in which all system events, group membership changes, and incoming messages are delivered in the same order. VS is an important abstraction, proven to be extremely useful when implemented over asynchronous, typically large-scale, message-passing distributed systems, as it simplifies the design of distributed applications, e.g., State Machine Replication (SMR). The VS property ensures that two or more processors that participate in two consecutive communicating groups should have delivered the same messages. Self-stabilizing systems [1,3] can tolerate transient faults that drive the system to an unpredicted arbitrary configuration. Such systems automatically regain consistency from any such configuration, and then produce the desired system behavior, ensuring it for a practically infinite number of successive steps, e.g., 2^64 steps. We present the first, to our knowledge, self-stabilizing virtual synchrony algorithm.
Loosely-self-stabilizing Byzantine-Tolerant Binary Consensus for Signature-Free Message-Passing Systems
At PODC 2014, A. Mostéfaoui, H. Moumen, and M. Raynal presented a new and simple randomized signature-free binary consensus algorithm (denoted here as MMR) that copes with the net effect of asynchrony and Byzantine behaviors. Assuming message scheduling is fair and independent of random numbers, MMR is optimal in several respects: it tolerates up to t Byzantine processes, where t < n/3 and n is the number of processes, and it uses O(n^2) messages and O(1) expected time. The present article presents a non-trivial extension of MMR to an even more fault-prone context: in addition to Byzantine processes, it also considers that the system can experience transient failures. To this end, it uses self-stabilization techniques to cope with communication failures and arbitrary transient faults, i.e., any violation of the assumptions according to which the system was designed to operate. The proposed algorithm is the first loosely-self-stabilizing Byzantine fault-tolerant binary consensus algorithm suited to asynchronous message-passing systems. This is achieved via an instructive transformation of MMR to a loosely-self-stabilizing solution that can violate safety requirements with probability Pr = O(1/2^M), where M is a predefined constant that can be set to any positive integer at the cost of 3Mn + log M bits of local memory. In addition to making MMR resilient to transient faults, the obtained loosely-self-stabilizing algorithm preserves its properties of optimal resilience and termination, i.e., t < n/3 and O(1) expected time. Furthermore, it only requires a bounded amount of memory.
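The trade-off between M, local memory, and the safety-violation probability can be explored with a small sketch. It assumes a base-2 logarithm rounded up to whole bits, which the abstract does not specify, and it reports only the 1/2^M scale of the probability bound, since the constant hidden in the O-notation is not given. The function names are hypothetical.

```python
import math


def local_memory_bits(M: int, n: int) -> int:
    """Memory cost 3Mn + log M from the abstract.

    Assumption: log is base 2 and rounded up to a whole number of bits.
    """
    return 3 * M * n + math.ceil(math.log2(M))


def safety_violation_scale(M: int) -> float:
    """The 1/2^M factor of the bound Pr = O(1/2^M).

    The constant hidden by the O-notation is unknown and is not modelled here.
    """
    return 2.0 ** -M


# Hypothetical parameter choice: n = 10 processes, M = 64.
print(local_memory_bits(64, 10), safety_violation_scale(64))
```

Doubling M roughly doubles the memory cost while squaring the 1/2^M factor, which is why a moderate M can already make safety violations very unlikely under the stated bound.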