6,691 research outputs found
Stabilizing Server-Based Storage in Byzantine Asynchronous Message-Passing Systems
A stabilizing Byzantine single-writer single-reader (SWSR) regular register,
which stabilizes after the first invoked write operation, is first presented.
Then, new/old ordering inversions are eliminated by the use of a (bounded)
sequence number for writes, obtaining a practically stabilizing SWSR atomic
register. A practically stabilizing Byzantine single-writer multi-reader (SWMR)
atomic register is then obtained by using several copies of SWSR atomic
registers. Finally, bounded time-stamps, with a time-stamp per writer, together
with SWMR atomic registers, are used to construct a practically stabilizing
Byzantine multi-writer multi-reader (MWMR) atomic register. In a system of
servers implementing an atomic register, and in addition to transient failures,
the constructions tolerate t<n/8 Byzantine servers if communication is
asynchronous, and t<n/3 Byzantine servers if it is synchronous. The noteworthy
feature of the proposed algorithms is that (to our knowledge) these are the
first that build an atomic read/write storage on top of asynchronous servers
prone to transient failures, and where up to t of them can be Byzantine
Tight Mobile Byzantine Tolerant Atomic Storage
This paper proposes the first implementation of an atomic storage tolerant to
mobile Byzantine agents. Our implementation is designed for the round-based
synchronous model where the set of Byzantine nodes changes from round to round.
In this model we explore the feasibility of multi-writer multi-reader atomic
register prone to various mobile Byzantine behaviors. We prove upper and lower
bounds for solving the atomic storage in all the explored models. Our results,
significantly different from the static case, advocate for a deeper study of
the main building blocks of distributed computing while the system is prone to
mobile Byzantine failures
Self-stabilizing virtual synchrony
Virtual synchrony (VS) is an important abstraction that is proven to be extremely useful when implemented over asynchronous, typically large, message-passing distributed systems. Fault tolerant design is critical for the success of such implementations since large distributed systems can be highly available as long as they do not depend on the full operational status of every system participant. Self-stabilizing systems can tolerate transient faults that drive the system to an arbitrary unpredictable configuration. Such systems automatically regain consistency from any such configuration, and then produce the desired system behavior ensuring it for practically infinite number of successive steps, e.g., 264 steps. We present a new multi-purpose self-stabilizing counter algorithm establishing an efficient practically unbounded counter, that can directly yield a self-stabilizing Multiple-Writer Multiple-Reader (MWMR) register emulation. We use our counter algorithm, together with a selfstabilizing group membership and a self-stabilizing multicast service to devise the first practically stabilizing VS algorithm and a self-stabilizing VS-based emulation of state machine replication (SMR). As we base the SMR implementation on VS, rather than consensus, the system progresses in more extreme asynchronous settings in relation to consensusbased SMR
Self-Stabilizing and Private Distributed Shared Atomic Memory in Seldomly Fair Message Passing Networks
We study the problem of privately emulating shared memory in message-passing networks. The system includes clients that store and retrieve replicated information on N servers, out of which e are data-corrupting malicious. When a client accesses a data-corrupting malicious server, the data field of that server response might be different from the value it originally stored. However, all other control variables in the server reply and protocol actions are according to the server algorithm. For the coded atomic storage algorithms by Cadambe et al., we present an enhancement that ensures no information leakage and data-corrupting malicious fault-tolerance. We also consider recovery after the occurrence of transient faults that violate the assumptions according to which the system was designed to operate. After their last occurrence, transient faults leave the system in an arbitrary state (while the program code stays intact). We present a self-stabilizing algorithm, which recovers after the occurrence of transient faults. This addition to Cadambe et al. considers asynchronous settings as long as no transient faults occur. The recovery from transient faults that bring the system counters (close) to their maximal values may include the use of a global reset procedure, which requires the system run to be controlled by a fair scheduler. After the recovery period, the safety properties are provided for asynchronous system runs that are not necessarily controlled by fair schedulers. Since the recovery period is bounded and the occurrence of transient faults is extremely rare, we call this design criteria self-stabilization in the presence of seldom fairness. Our self-stabilizing algorithm uses a bounded amount of storage during asynchronous executions (that are not necessarily controlled by fair schedulers). To the best of our knowledge, we are the first to address privacy, data-corrupting malicious behavior, and self-stabilization in the context of emulating atomic shared memory in message-passing systems
Self-stabilization Overhead: an Experimental Case Study on Coded Atomic Storage
Shared memory emulation can be used as a fault-tolerant and highly available
distributed storage solution or as a low-level synchronization primitive.
Attiya, Bar-Noy, and Dolev were the first to propose a single-writer,
multi-reader linearizable register emulation where the register is replicated
to all servers. Recently, Cadambe et al. proposed the Coded Atomic Storage
(CAS) algorithm, which uses erasure coding for achieving data redundancy with
much lower communication cost than previous algorithmic solutions.
Although CAS can tolerate server crashes, it was not designed to recover from
unexpected, transient faults, without the need of external (human)
intervention. In this respect, Dolev, Petig, and Schiller have recently
developed a self-stabilizing version of CAS, which we call CASSS. As one would
expect, self-stabilization does not come as a free lunch; it introduces,
mainly, communication overhead for detecting inconsistencies and stale
information. So, one would wonder whether the overhead introduced by
self-stabilization would nullify the gain of erasure coding.
To answer this question, we have implemented and experimentally evaluated the
CASSS algorithm on PlanetLab; a planetary scale distributed infrastructure. The
evaluation shows that our implementation of CASSS scales very well in terms of
the number of servers, the number of concurrent clients, as well as the size of
the replicated object. More importantly, it shows (a) to have only a constant
overhead compared to the traditional CAS algorithm (which we also implement)
and (b) the recovery period (after the last occurrence of a transient fault) is
as fast as a few client (read/write) operations. Our results suggest that CASSS
does not significantly impact efficiency while dealing with automatic recovery
from transient faults and bounded size of needed resources
Recommended from our members
An improved connectionist activation function for energy minimization
Symmetric networks that are based on energy minimization, such as Boltzmann machines or Hopfield nets, are used extensively for optimization, constraint satisfaction, and approximation of NP-hard problems. Nevertheless, finding a global minimum for the energy function is not guaranteed, and even a local minimum may take an exponential number of steps. We propose an improvement to the standard activation function used for such networks. The improved algorithm guarantees that a global minimum is found in linear time for tree-like subnetworks. The algorithm is uniform and does not assume that the network is a tree. It performs no worse than the standard algorithms for any network topology. In the case where there are trees growing from a cyclic subnetwork, the new algorithm performs better than the standard algorithms by avoiding local minima along the trees and by optimizing the free energy of these trees in linear time. The algorithm is self-stabilizing for trees (cycle-free undirected graphs) and remains correct under various scheduling demons. However, no uniform protocol exists to optimize trees under a pure distributed demon and no such protocol exists for cyclic networks under central demon
- …