
    Sharing Memory between Byzantine Processes using Policy-enforced Tuple Spaces

    Despite the many Byzantine fault-tolerant algorithms for message-passing systems designed over the years, algorithms for coordinating processes subject to Byzantine failures using shared memory have appeared only recently. This paper presents a new computing model in which shared memory objects are protected by fine-grained access policies, and a new shared memory object, the Policy-Enforced Augmented Tuple Space (PEATS). We show the benefits of this model by providing simple and efficient consensus algorithms. These algorithms are much simpler than previous algorithms based on access control lists (ACLs) and sticky bits, and they require fewer shared memory operations and fewer memory bits. We also prove that PEATS objects are universal, i.e., that they can be used to implement any other shared memory object, and present lock-free and wait-free universal constructions. Index Terms: Byzantine fault-tolerance, shared memory algorithms, tuple spaces, consensus, universal constructions.
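    The abstract does not spell out the PEATS interface, but tuple spaces with a conditional atomic swap (cas) are a standard route from tuple spaces to consensus. Below is a minimal sketch under that assumption, ignoring both Byzantine behavior and the fine-grained policy enforcement that is the paper's actual contribution; the operation name and semantics are illustrative.

    ```python
    import threading

    class TupleSpace:
        """Toy tuple space with a conditional atomic swap (cas): insert a
        tuple only if no tuple matching the template is present, and return
        the matching tuple either way. The lock merely simulates atomicity;
        a real PEATS object would also check an access policy on each call."""
        def __init__(self):
            self._lock = threading.Lock()
            self._tuples = []

        def cas(self, template, tup):
            with self._lock:
                for t in self._tuples:
                    if self._matches(template, t):
                        return t
                self._tuples.append(tup)
                return tup

        @staticmethod
        def _matches(template, t):
            # None acts as a wildcard field in the template.
            return len(template) == len(t) and all(
                p is None or p == f for p, f in zip(template, t))

    def consensus(space, my_value):
        # Whichever process's cas runs first installs its proposal; every
        # later cas returns that same decision tuple, so all agree.
        winner = space.cas(("decision", None), ("decision", my_value))
        return winner[1]
    ```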

    Toward self-stabilizing wait-free shared memory objects

    Past research on fault-tolerant distributed systems has focused either on processor failures, ranging from benign crash failures to malicious Byzantine failure types, or on transient memory failures, which can suddenly corrupt the state of the system. An interesting question in the theory of distributed computing is whether one can devise highly fault-tolerant protocols that tolerate both processor failures and transient errors. To answer this question we consider the construction of self-stabilizing wait-free shared memory objects. These objects occur naturally in distributed systems in which both processors and memory may be faulty. Our contribution in this paper is threefold. First, we propose a general definition of a self-stabilizing wait-free shared memory object that expresses safety guarantees even in the face of processor failures. Second, we show that within this framework one cannot construct a self-stabilizing single-reader single-writer regular bit from single-reader single-writer safe bits. This result leads us to postulate a self-stabilizing dual-reader single-writer safe bit with which, as a third contribution, we construct self-stabilizing regular and atomic registers.
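    For background on the register constructions involved: the classic (non-self-stabilizing) way to build a regular bit from a safe bit is to have the writer skip writes that would not change the value, so a read overlapping a write can only return the old or the new value, which for a bit is exactly regularity. A minimal sketch of that classic idea follows; the paper's impossibility result shows, roughly, that the writer-local state this scheme relies on cannot survive transient corruption, which is what motivates the postulated dual-reader safe bit.

    ```python
    class SafeBit:
        """Single-writer safe bit: a read that overlaps a write may return
        0 or 1 arbitrarily; a read with no concurrent write returns the last
        value written. Only the sequential behavior is modeled here."""
        def __init__(self):
            self._v = 0
        def write(self, v):
            self._v = v
        def read(self):
            return self._v

    class RegularBit:
        """Classic construction of a regular bit from a safe bit: skipping
        redundant low-level writes means any read that overlaps a real write
        can only see the old or the new value. Note the writer-local cache
        self._last -- a transient fault that corrupts it breaks the scheme,
        which is the self-stabilization difficulty the paper addresses."""
        def __init__(self):
            self._bit = SafeBit()
            self._last = 0   # writer's cache of the last value written
        def write(self, v):
            if v != self._last:    # skip writes that would not change the bit
                self._bit.write(v)
                self._last = v
        def read(self):
            return self._bit.read()
    ```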

    On the Complexity of Implementing Certain Classes of Shared Objects

    We consider shared memory systems in which asynchronous processes cooperate with each other by communicating via shared data objects, such as counters, queues, stacks, and priority queues. The common approach to implementing such shared objects is based on locking: to perform an operation on a shared object, a process obtains a lock, accesses the object, and then releases the lock. Locking, however, has several drawbacks, including convoying, priority inversion, and deadlocks. Furthermore, lock-based implementations are not fault-tolerant: if a process crashes while holding a lock, other processes can end up waiting forever for the lock. Wait-free linearizable implementations were conceived to overcome most of the above drawbacks of locking. A wait-free implementation guarantees that if a process repeatedly takes steps, then its operation on the implemented data object will eventually complete, regardless of whether other processes are slow, fast, or have crashed. In this thesis, we first present an efficient wait-free linearizable implementation of a class of object types, called closed and closable types, and then prove time and space lower bounds on wait-free linearizable implementations of another class of object types, called perturbable types.

    (1) We present a wait-free linearizable implementation of n-process closed and closable types (such as swap, fetch&add, fetch&multiply, and fetch&L, where L is any of the boolean operations and, or, or complement) using registers that support load-link (LL) and store-conditional (SC) as base objects. The time complexity of the implementation grows linearly with contention, but is never more than O(log^2 n). We believe that this is the first implementation of a class of types (as opposed to a specific type) to achieve sub-linear time complexity.

    (2) We prove linear time and space lower bounds on wait-free linearizable implementations of n-process perturbable types (such as increment, fetch&add, modulo-k counter, LL/SC bit, k-valued compare&swap (for any k ≥ n), and single-writer snapshot) that use resettable consensus and historyless objects (such as registers that support read and write) as base objects. This improves on some previously known Ω(√n) space complexity lower bounds. It also shows the near space optimality of some known wait-free linearizable implementations.
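    The abstract does not include the construction itself; for orientation, the sketch below shows only the standard lock-free retry loop over an LL/SC-style base object, which is the baseline such wait-free constructions improve on. Compare-and-swap stands in for LL/SC here, as it commonly does; note the loop is lock-free, not wait-free, since a perpetually contended process may retry forever.

    ```python
    import threading

    class CASRegister:
        """Models a base object: read plays the role of load-link (LL) and
        compare_and_swap the role of store-conditional (SC). The lock only
        simulates the atomicity the hardware would provide."""
        def __init__(self, value=0):
            self._lock = threading.Lock()
            self._value = value
        def read(self):
            return self._value
        def compare_and_swap(self, expected, new):
            with self._lock:
                if self._value == expected:
                    self._value = new
                    return True
                return False

    def fetch_and_add(reg, delta):
        # Retry until no other process slipped a write in between our
        # read and our compare_and_swap; return the pre-increment value.
        while True:
            old = reg.read()
            if reg.compare_and_swap(old, old + delta):
                return old
    ```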

    Randomized protocols for asynchronous consensus

    The famous Fischer, Lynch, and Paterson impossibility proof shows that it is impossible to solve the consensus problem in a natural model of an asynchronous distributed system if even a single process can fail. Since its publication, two decades of work on fault-tolerant asynchronous consensus algorithms have evaded this impossibility result by using extended models that provide (a) randomization, (b) additional timing assumptions, (c) failure detectors, or (d) stronger synchronization mechanisms than are available in the basic model. Concentrating on the first of these approaches, we illustrate the history and structure of randomized asynchronous consensus protocols by giving detailed descriptions of several such protocols.

    Comment: 29 pages; survey paper written for the PODC 20th anniversary issue of Distributed Computing
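    As one concrete instance of approach (a), here is a deliberately compressed, lock-step toy of a Ben-Or-style local-coin protocol, one of the classic protocols such surveys describe. Real executions are asynchronous and each process waits for only n - f messages per phase; in this toy every process sees every message, which collapses the two phases but preserves the coin-flip structure that yields termination with probability 1.

    ```python
    import random

    def ben_or_round(prefs):
        """One lock-step round: propose a value only if a strict majority
        prefers it. In a real run a process decides after seeing more than
        f matching proposals; in lock-step all n proposals agree, so that
        threshold is trivially met. With no majority, flip local coins."""
        n = len(prefs)
        proposal = next((v for v in (0, 1) if prefs.count(v) > n // 2), None)
        if proposal is not None:
            return [proposal] * n, proposal
        return [random.choice((0, 1)) for _ in range(n)], None

    def randomized_consensus(prefs):
        decided = None
        while decided is None:      # terminates with probability 1
            prefs, decided = ben_or_round(prefs)
        return decided

    print(randomized_consensus([0, 1, 0, 1]))
    ```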

    The Impact of RDMA on Agreement

    Remote Direct Memory Access (RDMA) is becoming widely available in data centers. This technology allows a process to directly read and write the memory of a remote host, with a mechanism to control access permissions. In this paper, we study the fundamental power of these capabilities. We consider the well-known problem of achieving consensus despite failures, and find that RDMA can improve the inherent trade-off in distributed computing between failure resilience and performance. Specifically, we show that RDMA allows algorithms that simultaneously achieve high resilience and high performance, while traditional algorithms had to choose one or the other. With Byzantine failures, we give an algorithm that requires only n ≥ 2f_P + 1 processes (where f_P is the maximum number of faulty processes) and decides in two (network) delays in common executions. With crash failures, we give an algorithm that requires only n ≥ f_P + 1 processes and also decides in two delays. Both algorithms tolerate a minority of memory failures inherent to RDMA, and they provide safety in asynchronous systems and liveness with standard additional assumptions.

    Comment: Full version of PODC'19 paper, strengthened broadcast algorithm
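    The capability the result builds on is that RDMA lets hosts change at run time which remote processes may write a given memory region. The toy below models that permission switching and the step behind fast leader-based agreement; the class and method names are illustrative assumptions, not an RDMA API.

    ```python
    import threading

    class RDMARegion:
        """Toy model of an RDMA-exposed memory region whose write
        permissions can be changed dynamically (the lock simulates the
        NIC's atomic permission check on each access)."""
        def __init__(self):
            self._lock = threading.Lock()
            self._value = None
            self._writers = set()      # processes currently allowed to write

        def set_write_permission(self, allowed):
            with self._lock:
                self._writers = set(allowed)

        def write(self, pid, value):
            with self._lock:
                if pid not in self._writers:
                    return False       # the NIC rejects the access
                self._value = value
                return True

    # The permission-switching step behind fast leader-based agreement:
    # a new leader first revokes everyone else's write access, so once its
    # own write succeeds, no deposed leader can overwrite the decision.
    region = RDMARegion()
    region.set_write_permission({1, 2, 3})
    leader = 2
    region.set_write_permission({leader})        # revoke competitors first
    assert region.write(1, "old leader value") is False
    assert region.write(leader, "decided value") is True
    ```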