344 research outputs found

    Using Failure Detection and Consensus in the General Omission Failure Model to Solve Security Problems

    It has recently been shown that fair exchange, a security problem in distributed systems, can be reduced to a fault tolerance problem, namely a special form of distributed consensus. The reduction uses the concept of security modules, which reduce the type and nature of adversarial behavior to two standard fault assumptions: message omission and process crash. In this paper, we investigate the feasibility of solving consensus in asynchronous systems in which crash and message omission faults may occur. Because consensus is impossible in such systems, we follow the approach of the unreliable failure detectors of Chandra and Toueg and add to the system a distributed device that gives information about the failure of other processes. We then give an algorithm that uses this device to solve the consensus problem. Finally, we show how to implement such a device in an asynchronous untrusted environment using security modules and some weak timing assumptions.

    Chasing the Weakest Failure Detector for k-Set Agreement in Message-passing Systems

    This paper continues our quest for the weakest failure detector that allows the k-set agreement problem to be solved in asynchronous message-passing systems prone to any number of process failures. It has two main contributions which (we hope) will be instrumental in completing this quest. The first contribution is a new failure detector (denoted ΠΣ_{x,y}). This failure detector has several noteworthy properties. (a) It is stronger than Σ_x, which has been shown to be necessary. (b) It is equivalent to the pair (Σ, Ω) when x = y = 1 (from which it follows that ΠΣ_{1,1} is optimal for solving consensus). (c) It is equivalent to the pair (Σ_{n−1}, Ω_{n−1}) when x = y = n − 1 (from which it follows that ΠΣ_{n−1,n−1} is optimal for (n − 1)-set agreement). (d) It is strictly weaker than the pair (Σ_x, Ω_y) (which has been investigated in previous works) for the pairs (x, y) such that 1 < y < x < n. (e) It is operational: the paper presents a ΠΣ_{x,y}-based algorithm that solves k-set agreement for k ⩾ xy. The second contribution of the paper is a proof that, for 1 < k < n − 1, the eventual leaders failure detector Ω_k (which eventually provides each process with the same set of k process identities, this set including at least one correct process) is not necessary to solve the k-set agreement problem. More precisely, the paper shows that the weakest failure detector for k-set agreement and Ω_k cannot be compared.

    Failure detectors in homonymous distributed systems (with an application to consensus)

    This paper is on homonymous distributed systems where processes are prone to crash failures and have no initial knowledge of the system membership ("homonymous" means that several processes may have the same identifier). New classes of failure detectors suited to these systems are first defined. Among them, the classes HΩ and HΣ are introduced; they are the homonymous counterparts of the classes Ω and Σ, respectively. (Recall that the pair ⟨Ω, Σ⟩ defines the weakest failure detector to solve consensus.) Then, the paper shows how HΩ and HΣ can be implemented in homonymous systems without membership knowledge (under different synchrony requirements). Finally, two algorithms are presented that use these failure detectors to solve consensus in homonymous asynchronous systems where there is no initial knowledge of the membership. One algorithm solves consensus with ⟨HΩ, HΣ⟩, while the other uses only HΩ but needs a majority of correct processes. Observe that systems with unique identifiers and anonymous systems are extreme cases of homonymous systems, from which it follows that all these results also apply to these systems. Interestingly, the new failure detector class HΩ can be implemented with partial synchrony (i.e., all messages sent after some bounded time GST will be received within an unknown bounded latency δ), while the analogous class AΩ defined for anonymous systems cannot be implemented (even in synchronous systems). Hence, the paper provides the first consensus algorithm for anonymous systems with this model of partial synchrony and a majority of correct processes.

    The Failure Detector Abstraction

    A failure detector is a fundamental abstraction in distributed computing. This paper surveys this abstraction along two dimensions. First, we study failure detectors as building blocks to simplify the design of reliable distributed algorithms. In particular, we illustrate how failure detectors can factor out timing assumptions to detect failures in distributed agreement algorithms. Second, we study failure detectors as computability benchmarks. That is, we survey the weakest failure detector question and illustrate how failure detectors can be used to classify problems. We also highlight some limitations of the failure detector abstraction along each of the dimensions.
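    To illustrate what "factoring out timing assumptions" can look like in practice, here is a minimal Python sketch of a timeout-based failure detector. It is not taken from the surveyed paper; the class name `HeartbeatFailureDetector`, its methods, and the doubling rule for timeouts are assumptions chosen for illustration. All timing logic lives inside the detector, so an agreement algorithm built on top only needs to ask which processes are currently suspected.

```python
class HeartbeatFailureDetector:
    """Illustrative timeout-based failure detector (hypothetical API).

    Time is passed in explicitly so the behavior is deterministic.
    """

    def __init__(self, peers, initial_timeout=1.0):
        # Per-peer timeout, so a slow peer does not affect the others.
        self.timeout = {p: initial_timeout for p in peers}
        self.last_heard = {p: 0.0 for p in peers}
        self.suspected = set()

    def on_heartbeat(self, peer, now):
        """Record that a heartbeat from `peer` arrived at time `now`."""
        self.last_heard[peer] = now
        if peer in self.suspected:
            # The suspicion was wrong: trust the peer again and be more
            # patient with it from now on (assumed doubling policy).
            self.suspected.discard(peer)
            self.timeout[peer] *= 2

    def check(self, now):
        """Suspect every peer that has been silent longer than its timeout."""
        for peer, heard in self.last_heard.items():
            if now - heard > self.timeout[peer]:
                self.suspected.add(peer)
        return set(self.suspected)
```

    Under the assumption that message delays are eventually bounded, growing the timeout after each false suspicion means a correct process eventually stops being suspected, while a crashed process stays suspected because it sends no further heartbeats.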

    Fault-tolerant computing with unreliable channels

    We study implementations of basic fault-tolerant primitives, such as consensus and registers, in message-passing systems subject to process crashes and a broad range of communication failures. Our results characterize the necessary and sufficient conditions for implementing these primitives as a function of the connectivity constraints and synchrony assumptions. Our main contribution is a new algorithm for partially synchronous consensus that is resilient to process crashes and channel failures and is optimal in its connectivity requirements. In contrast to prior work, our algorithm assumes the most general model of message loss where faulty channels are flaky, i.e., can lose messages without any guarantee of fairness. This failure model is particularly challenging for consensus algorithms, as it rules out standard solutions based on leader oracles and failure detectors. To circumvent this limitation, we construct our solution using a new variant of the recently proposed view synchronizer abstraction, which we adapt to the crash-prone setting with flaky channels.

    Communication Predicates: A high-level abstraction for coping with transient and dynamic faults

    Consensus is one of the key problems in fault-tolerant distributed computing. A very popular model for solving consensus is the failure detector model defined by Chandra and Toueg. However, the failure detector model has limitations. The paper points out these limitations and suggests instead a model based on communication predicates, called the HO model. The advantage of the HO model over failure detectors is shown, and the implementation of the HO model is discussed in the context of a system that alternates between good periods and bad periods. Two definitions of a good period are considered. For both definitions, the HO model allows us to compute the duration of a good period needed for solving consensus. Specifically, the model allows us to quantify the difference between the required length of an initial good period and the length of a non-initial good period.

    Optimizing communication between failure detectors in distributed systems

    Unreliable failure detectors (FDs) are used as a basic building block in the specification and implementation of fault tolerance in asynchronous distributed systems. A typical example of an asynchronous distributed system is the Internet. In this context, traditional FDs run into problems, since they were designed for controlled networks (LANs). One problem to be addressed is message explosion, since it can compromise the performance of the FD service. This paper addresses the message explosion problem by proposing a generic and practical approach that reuses existing messages in place of FD control messages. The experiments showed a reduction in the number of messages, which also contributed to the accuracy of the FDs. Workshop de Procesamiento Distribuido y Paralelo (WPDP). Red de Universidades con Carreras en Informática (RedUNCI).
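    The general idea of reusing ordinary traffic in place of failure detector control messages can be sketched in Python as follows. This is only an illustration of the principle, not the approach evaluated in the paper; the class `PiggybackingSender`, its methods, and the fixed heartbeat interval are all hypothetical. Any application message sent to a peer counts as evidence of liveness, so an explicit heartbeat is sent only to peers that received no traffic during the last interval.

```python
class PiggybackingSender:
    """Illustrative sender that suppresses heartbeats when application
    traffic already shows it is alive (hypothetical API)."""

    def __init__(self, peers, heartbeat_interval=1.0):
        self.interval = heartbeat_interval
        self.last_sent = {p: 0.0 for p in peers}
        self.control_messages = 0  # explicit heartbeats actually sent

    def send_application(self, peer, now):
        # An application message doubles as a heartbeat for this peer.
        self.last_sent[peer] = now

    def tick(self, now):
        """Return the peers that need an explicit heartbeat at time `now`."""
        needing = [p for p, t in self.last_sent.items()
                   if now - t >= self.interval]
        for p in needing:
            self.last_sent[p] = now
            self.control_messages += 1
        return needing
```

    On the receiving side, the matching assumption is that every incoming message, whether application data or an explicit heartbeat, refreshes the sender's liveness timestamp.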
