1,123 research outputs found

    A Practical Analysis of the Gorums Framework: A Case Study on Replicated Services with Raft

    Master's thesis in Computer science. Gorums is a novel RPC framework developed to make it easier to build fault-tolerant distributed systems. We want to assess whether Gorums can simplify the implementation of a practical fault-tolerant service that supports reconfiguration. The Raft consensus algorithm is implemented in Gorums with the ability to do single-server configuration changes. In addition, we perform a background study of two state-of-the-art Raft implementations. The abstractions used in these implementations are then compared to the abstractions Gorums provides and how they are used in our Raft implementation. A service is created that can be used with any of the aforementioned Raft implementations for consistency and fault tolerance. This service is then used to evaluate the different implementations through experimentation. Our evaluation shows that the Raft implementation that uses Gorums performs better with regard to latency and overall throughput during normal operation. We do, however, discover this implementation to be sensitive to omission faults, which can further lead to availability issues if not handled properly. We solve this by developing extensions to Raft and Gorums. We show that these methods perform on a similar level when compared with the state-of-the-art implementations. Results from our implementation efforts indicate that Raft's log replication process is problematic to implement with Gorums' abstractions. We discover that this is due to Raft adopting a monolithic design aimed at reducing the number of different RPC types, breaching the separation of concerns design principle.

    The Weakest Failure Detector for Eventual Consistency

    In its classical form, a consistent replicated service requires all replicas to witness the same evolution of the service state. Assuming a message-passing environment with a majority of correct processes, the necessary and sufficient information about failures for implementing a general state machine replication scheme ensuring consistency is captured by the Ω failure detector. This paper shows that in such a message-passing environment, Ω is also the weakest failure detector to implement an eventually consistent replicated service, where replicas are expected to agree on the evolution of the service state only after some (a priori unknown) time. In fact, we show that Ω is the weakest to implement eventual consistency in any message-passing environment, i.e., under any assumption on when and where failures might occur. Ensuring (strong) consistency in any environment requires, in addition to Ω, the quorum failure detector Σ. Our paper thus captures, for the first time, an exact computational difference between building a replicated state machine that ensures consistency and one that only ensures eventual consistency.

    Asymmetric Distributed Trust

    Quorum systems are a key abstraction in distributed fault-tolerant computing for capturing trust assumptions. They can be found at the core of many algorithms for implementing reliable broadcasts, shared memory, consensus and other problems. This paper introduces asymmetric Byzantine quorum systems that model subjective trust. Every process is free to choose which combinations of other processes it trusts and which ones it considers faulty. Asymmetric quorum systems strictly generalize standard Byzantine quorum systems, which have only one global trust assumption for all processes. This work also presents protocols that implement abstractions of shared memory and broadcast primitives with processes prone to Byzantine faults and asymmetric trust. The model and protocols pave the way for realizing more elaborate algorithms with asymmetric trust.

    Asynchronous Reconfiguration with Byzantine Failures

    Replicated services are inherently vulnerable to failures and security breaches. In a long-running system, it is, therefore, indispensable to maintain a reconfiguration mechanism that would replace faulty replicas with correct ones. An important challenge is to enable reconfiguration without affecting the availability and consistency of the replicated data: the clients should be able to get correct service even when the set of service replicas is being updated. In this paper, we address the problem of reconfiguration in the presence of Byzantine failures: faulty replicas or clients may arbitrarily deviate from their expected behavior. We describe a generic technique for building asynchronous and Byzantine fault-tolerant reconfigurable objects: clients can manipulate the object data and issue reconfiguration calls without reaching consensus on the current configuration. With the help of forward-secure digital signatures, our solution makes sure that superseded and possibly compromised configurations are harmless, that slow clients cannot be fooled into reading stale data, and that Byzantine clients cannot cause a denial of service by flooding the system with reconfiguration requests. Our approach is modular and based on a dynamic lattice agreement abstraction, and we discuss how to extend it to enable Byzantine fault-tolerant implementations of a large class of reconfigurable replicated services.