
    The Impact of RDMA on Agreement

    Remote Direct Memory Access (RDMA) is becoming widely available in data centers. This technology allows a process to directly read and write the memory of a remote host, with a mechanism to control access permissions. In this paper, we study the fundamental power of these capabilities. We consider the well-known problem of achieving consensus despite failures, and find that RDMA can improve the inherent trade-off in distributed computing between failure resilience and performance. Specifically, we show that RDMA allows algorithms that simultaneously achieve high resilience and high performance, while traditional algorithms had to choose one or another. With Byzantine failures, we give an algorithm that only requires $n \geq 2f_P + 1$ processes (where $f_P$ is the maximum number of faulty processes) and decides in two (network) delays in common executions. With crash failures, we give an algorithm that only requires $n \geq f_P + 1$ processes and also decides in two delays. Both algorithms tolerate a minority of memory failures inherent to RDMA, and they provide safety in asynchronous systems and liveness with standard additional assumptions. Comment: Full version of PODC'19 paper, strengthened broadcast algorith
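
    As a rough illustration of the two process-count bounds quoted in this abstract, the sketch below computes the smallest n for a given f_P. The function name `min_processes_rdma` and its `byzantine` flag are hypothetical and do not come from the paper; the code only evaluates the stated inequalities and does not model memory failures or the algorithms themselves.

```python
def min_processes_rdma(f_p: int, byzantine: bool) -> int:
    """Smallest n allowed by the bounds quoted in the abstract.

    Hypothetical helper, not the paper's code:
      Byzantine failures: n >= 2*f_P + 1
      crash failures:     n >= f_P + 1
    """
    if f_p < 0:
        raise ValueError("f_p must be non-negative")
    return 2 * f_p + 1 if byzantine else f_p + 1
```

    For example, under these bounds one faulty process would need at least three processes in the Byzantine case and two in the crash case.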

    Byzantine Vector Consensus in Complete Graphs

    Consider a network of n processes, each of which has a d-dimensional vector of reals as its input. Each process can communicate directly with all the processes in the system; thus the communication network is a complete graph. All the communication channels are reliable and FIFO (first-in-first-out). The problem of Byzantine vector consensus (BVC) requires agreement on a d-dimensional vector that is in the convex hull of the d-dimensional input vectors at the non-faulty processes. We obtain the following results for Byzantine vector consensus in complete graphs while tolerating up to f Byzantine failures:
    * We prove that in a synchronous system, n >= max(3f+1, (d+1)f+1) is necessary and sufficient for achieving Byzantine vector consensus.
    * In an asynchronous system, it is known that exact consensus is impossible in the presence of faulty processes. For an asynchronous system, we prove that n >= (d+2)f+1 is necessary and sufficient to achieve approximate Byzantine vector consensus.
    Our sufficiency proofs are constructive. We show sufficiency by providing explicit algorithms that solve exact BVC in synchronous systems, and approximate BVC in asynchronous systems. We also obtain tight bounds on the number of processes for achieving BVC using algorithms that are restricted to a simpler communication pattern.
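
    To make the dependence on the dimension d concrete, here is a minimal sketch that evaluates the two bounds stated in this abstract. The name `min_processes_bvc` and its parameters are assumptions chosen for illustration; the function only computes the quoted thresholds and is not one of the paper's algorithms.

```python
def min_processes_bvc(f: int, d: int, synchronous: bool) -> int:
    """Smallest n meeting the bounds quoted in the abstract.

    Hypothetical helper for illustration:
      synchronous (exact BVC):        n >= max(3f+1, (d+1)f+1)
      asynchronous (approximate BVC): n >= (d+2)f+1
    """
    if f < 0 or d < 1:
        raise ValueError("need f >= 0 and d >= 1")
    if synchronous:
        return max(3 * f + 1, (d + 1) * f + 1)
    return (d + 2) * f + 1
```

    With f = 1 and d = 3, for instance, the synchronous bound evaluates to 5 processes and the asynchronous bound to 6, and both grow linearly in d once d exceeds 2.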

    Relaxed Byzantine Vector Consensus

    The exact Byzantine consensus problem requires that non-faulty processes reach agreement on a decision (or output) that is in the convex hull of the inputs at the non-faulty processes. It is well known that exact consensus is impossible in an asynchronous system in the presence of faults, and that in a synchronous system, n >= 3f+1 is tight on the number of processes to achieve exact Byzantine consensus with scalar inputs, in the presence of up to f Byzantine faulty processes. Recent work has shown that when the inputs are d-dimensional vectors of reals, n >= max(3f+1, (d+1)f+1) is tight to achieve exact Byzantine consensus in synchronous systems, and n >= (d+2)f+1 for approximate Byzantine consensus in asynchronous systems. Because the lower bound depends on the vector dimension d, the number of processes necessary becomes large when the vector dimension is large. With the hope of reducing the lower bound on n, we consider two relaxed versions of Byzantine vector consensus: k-relaxed Byzantine vector consensus and (delta,p)-relaxed Byzantine vector consensus. In k-relaxed consensus, the validity condition requires that the output must be in the convex hull of the projection of the inputs onto any subset of k dimensions of the vectors. For (delta,p)-consensus, the validity condition requires that the output must be within distance delta of the convex hull of the inputs of the non-faulty processes, where the L_p norm is used as the distance metric. For (delta,p)-consensus, we consider two versions: in one version, delta is a constant, and in the second version, delta is a function of the inputs themselves. We show that for k-relaxed consensus and (delta,p)-consensus with constant delta >= 0, the bound on n is identical to the bound stated above for the original vector consensus problem. On the other hand, when delta depends on the inputs, we show that the bound on n is smaller when d >= 3.

    Generalized Paxos Made Byzantine (and Less Complex)

    One of the most recent members of the Paxos family of protocols is Generalized Paxos. This variant of Paxos departs from the original specification of consensus, allowing for a weaker safety condition where different processes can have different views on a sequence being agreed upon. However, much like the original Paxos counterpart, Generalized Paxos does not have a simple implementation. Furthermore, with the recent practical adoption of Byzantine fault tolerant protocols, it is timely and important to understand how Generalized Paxos can be implemented in the Byzantine model. In this paper, we make two main contributions. First, we provide a description of Generalized Paxos that is easier to understand, based on a simpler specification and the pseudocode for a solution that can be readily implemented. Second, we extend the protocol to the Byzantine fault model.

    Asynchronous Convex Consensus in the Presence of Crash Faults

    This paper defines a new consensus problem, convex consensus. Similar to vector consensus [13, 20, 19], the input at each process is a d-dimensional vector of reals (or, equivalently, a point in the d-dimensional Euclidean space). However, for convex consensus, the output at each process is a convex polytope contained within the convex hull of the inputs at the fault-free processes. We explore the convex consensus problem under crash faults with incorrect inputs, and present an asynchronous approximate convex consensus algorithm with optimal fault tolerance that reaches consensus on an optimal output polytope. Convex consensus can be used to solve other related problems. For instance, a solution for convex consensus trivially yields a solution for vector consensus. More importantly, convex consensus can potentially be used to solve other more interesting problems, such as convex function optimization [5, 4]. Comment: A version of this work is published in PODC 201

    Randomized protocols for asynchronous consensus

    The famous Fischer, Lynch, and Paterson impossibility proof shows that it is impossible to solve the consensus problem in a natural model of an asynchronous distributed system if even a single process can fail. Since its publication, two decades of work on fault-tolerant asynchronous consensus algorithms have evaded this impossibility result by using extended models that provide (a) randomization, (b) additional timing assumptions, (c) failure detectors, or (d) stronger synchronization mechanisms than are available in the basic model. Concentrating on the first of these approaches, we illustrate the history and structure of randomized asynchronous consensus protocols by giving detailed descriptions of several such protocols. Comment: 29 pages; survey paper written for PODC 20th anniversary issue of Distributed Computin