249 research outputs found

    An Improved Approximate Consensus Algorithm in the Presence of Mobile Faults

    Full text link
    This paper explores the problem of reaching approximate consensus in synchronous point-to-point networks, where each pair of nodes is able to communicate with each other directly and reliably. We consider the mobile Byzantine fault model proposed by Garay '94 -- in the model, an omniscient adversary can corrupt up to ff nodes in each round, and at the beginning of each round, faults may "move" in the system (i.e., different sets of nodes may become faulty in different rounds). Recent work by Bonomi et al. '16 proposed a simple iterative approximate consensus algorithm which requires at least 4f+14f+1 nodes. This paper proposes a novel technique of using "confession" (a mechanism to allow others to ignore past behavior) and a variant of reliable broadcast to improve the fault-tolerance level. In particular, we present an approximate consensus algorithm that requires only ⌈7f/2⌉+1\lceil 7f/2\rceil + 1 nodes, an ⌊f/2⌋\lfloor f/2 \rfloor improvement over the state-of-the-art algorithms. Moreover, we also show that the proposed algorithm is optimal within a family of round-based algorithms

    Byzantine fault-tolerant agreement protocols for wireless Ad hoc networks

    Get PDF
    Tese de doutoramento, Informática (Ciências da Computação), Universidade de Lisboa, Faculdade de Ciências, 2010.The thesis investigates the problem of fault- and intrusion-tolerant consensus in resource-constrained wireless ad hoc networks. This is a fundamental problem in distributed computing because it abstracts the need to coordinate activities among various nodes. It has been shown to be a building block for several other important distributed computing problems like state-machine replication and atomic broadcast. The thesis begins by making a thorough performance assessment of existing intrusion-tolerant consensus protocols, which shows that the performance bottlenecks of current solutions are in part related to their system modeling assumptions. Based on these results, the communication failure model is identified as a model that simultaneously captures the reality of wireless ad hoc networks and allows the design of efficient protocols. Unfortunately, the model is subject to an impossibility result stating that there is no deterministic algorithm that allows n nodes to reach agreement if more than n2 omission transmission failures can occur in a communication step. This result is valid even under strict timing assumptions (i.e., a synchronous system). The thesis applies randomization techniques in increasingly weaker variants of this model, until an efficient intrusion-tolerant consensus protocol is achieved. The first variant simplifies the problem by restricting the number of nodes that may be at the source of a transmission failure at each communication step. An algorithm is designed that tolerates f dynamic nodes at the source of faulty transmissions in a system with a total of n 3f + 1 nodes. The second variant imposes no restrictions on the pattern of transmission failures. The proposed algorithm effectively circumvents the Santoro- Widmayer impossibility result for the first time. It allows k out of n nodes to decide despite dn 2 e(nk)+k2 omission failures per communication step. This algorithm also has the interesting property of guaranteeing safety during arbitrary periods of unrestricted message loss. The final variant shares the same properties of the previous one, but relaxes the model in the sense that the system is asynchronous and that a static subset of nodes may be malicious. The obtained algorithm, called Turquois, admits f < n 3 malicious nodes, and ensures progress in communication steps where dnf 2 e(n k f) + k 2. The algorithm is subject to a comparative performance evaluation against other intrusiontolerant protocols. The results show that, as the system scales, Turquois outperforms the other protocols by more than an order of magnitude.Esta tese investiga o problema do consenso tolerante a faltas acidentais e maliciosas em redes ad hoc sem fios. Trata-se de um problema fundamental que captura a essência da coordenação em actividades envolvendo vários nós de um sistema, sendo um bloco construtor de outros importantes problemas dos sistemas distribuídos como a replicação de máquina de estados ou a difusão atómica. A tese começa por efectuar uma avaliação de desempenho a protocolos tolerantes a intrusões já existentes na literatura. Os resultados mostram que as limitações de desempenho das soluções existentes estão em parte relacionadas com o seu modelo de sistema. Baseado nestes resultados, é identificado o modelo de falhas de comunicação como um modelo que simultaneamente permite capturar o ambiente das redes ad hoc sem fios e projectar protocolos eficientes. Todavia, o modelo é restrito por um resultado de impossibilidade que afirma não existir algoritmo algum que permita a n nós chegaram a acordo num sistema que admita mais do que n2 transmissões omissas num dado passo de comunicação. Este resultado é válido mesmo sob fortes hipóteses temporais (i.e., em sistemas síncronos) A tese aplica técnicas de aleatoriedade em variantes progressivamente mais fracas do modelo até ser alcançado um protocolo eficiente e tolerante a intrusões. A primeira variante do modelo, de forma a simplificar o problema, restringe o número de nós que estão na origem de transmissões faltosas. É apresentado um algoritmo que tolera f nós dinâmicos na origem de transmissões faltosas em sistemas com um total de n 3f + 1 nós. A segunda variante do modelo não impõe quaisquer restrições no padrão de transmissões faltosas. É apresentado um algoritmo que contorna efectivamente o resultado de impossibilidade Santoro-Widmayer pela primeira vez e que permite a k de n nós efectuarem progresso nos passos de comunicação em que o número de transmissões omissas seja dn 2 e(n k) + k 2. O algoritmo possui ainda a interessante propriedade de tolerar períodos arbitrários em que o número de transmissões omissas seja superior a . A última variante do modelo partilha das mesmas características da variante anterior, mas com pressupostos mais fracos sobre o sistema. Em particular, assume-se que o sistema é assíncrono e que um subconjunto estático dos nós pode ser malicioso. O algoritmo apresentado, denominado Turquois, admite f < n 3 nós maliciosos e assegura progresso nos passos de comunicação em que dnf 2 e(n k f) + k 2. O algoritmo é sujeito a uma análise de desempenho comparativa com outros protocolos na literatura. Os resultados demonstram que, à medida que o número de nós no sistema aumenta, o desempenho do protocolo Turquois ultrapassa os restantes em mais do que uma ordem de magnitude.FC

    Scalable Byzantine Reliable Broadcast

    Get PDF
    Byzantine reliable broadcast is a powerful primitive that allows a set of processes to agree on a message from a designated sender, even if some processes (including the sender) are Byzantine. Existing broadcast protocols for this setting scale poorly, as they typically build on quorum systems with strong intersection guarantees, which results in linear per-process communication and computation complexity. We generalize the Byzantine reliable broadcast abstraction to the probabilistic setting, allowing each of its properties to be violated with a fixed, arbitrarily small probability. We leverage these relaxed guarantees in a protocol where we replace quorums with stochastic samples. Compared to quorums, samples are significantly smaller in size, leading to a more scalable design. We obtain the first Byzantine reliable broadcast protocol with logarithmic per-process communication and computation complexity. We conduct a complete and thorough analysis of our protocol, deriving bounds on the probability of each of its properties being compromised. During our analysis, we introduce a novel general technique that we call adversary decorators. Adversary decorators allow us to make claims about the optimal strategy of the Byzantine adversary without imposing any additional assumptions. We also introduce Threshold Contagion, a model of message propagation through a system with Byzantine processes. To the best of our knowledge, this is the first formal analysis of a probabilistic broadcast protocol in the Byzantine fault model. We show numerically that practically negligible failure probabilities can be achieved with realistic security parameters

    Making Asynchronous Distributed Computations Robust to Channel Noise

    Get PDF
    We consider the problem of making distributed computations robust to noise, in particular to worst-case (adversarial) corruptions of messages. We give a general distributed interactive coding scheme which simulates any asynchronous distributed protocol while tolerating a maximal corruption level of Theta(1/n)-fraction of all messages. Our noise tolerance is optimal and is obtained with only a moderate overhead in the number of messages. Our result is the first fully distributed interactive coding scheme in which the topology of the communication network is not known in advance. Prior work required either a coordinating node to be connected to all other nodes in the network or assumed a synchronous network in which all nodes already know the complete topology of the network. Overcoming this more realistic setting of an unknown topology leads to intriguing distributed problems, in which nodes try to learn sufficient information about the network topology in order to perform efficient coding and routing operations for coping with the noise. What makes these problems hard is that these topology exploration computations themselves must already be robust to noise

    Distributed eventual leader election in the crash-recovery and general omission failure models.

    Get PDF
    102 p.Distributed applications are present in many aspects of everyday life. Banking, healthcare or transportation are examples of such applications. These applications are built on top of distributed systems. Roughly speaking, a distributed system is composed of a set of processes that collaborate among them to achieve a common goal. When building such systems, designers have to cope with several issues, such as different synchrony assumptions and failure occurrence. Distributed systems must ensure that the delivered service is trustworthy.Agreement problems compose a fundamental class of problems in distributed systems. All agreement problems follow the same pattern: all processes must agree on some common decision. Most of the agreement problems can be considered as a particular instance of the Consensus problem. Hence, they can be solved by reduction to consensus. However, a fundamental impossibility result, namely (FLP), states that in an asynchronous distributed system it is impossible to achieve consensus deterministically when at least one process may fail. A way to circumvent this obstacle is by using unreliable failure detectors. A failure detector allows to encapsulate synchrony assumptions of the system, providing (possibly incorrect) information about process failures. A particular failure detector, called Omega, has been shown to be the weakest failure detector for solving consensus with a majority of correct processes. Informally, Omega lies on providing an eventual leader election mechanism
    • …
    corecore