1,723 research outputs found

    Fault-Tolerant Distributed Services in Message-Passing Systems

    Get PDF
    Distributed systems ranging from small local area networks to large wide area networks like the Internet composed of static and/or mobile users have become increasingly popular. A desirable property for any distributed service is fault-tolerance, which means the service remains uninterrupted even if some components in the network fail. This dissertation considers weak distributed models to find either algorithms to solve certain problems or impossibility proofs to show that a problem is unsolvable. These are the main contributions of this dissertation: • Failure detectors are used as a service to solve consensus (agreement among nodes) which is otherwise impossible in failure-prone asynchronous systems. We find an algorithm for crash-failure detection that uses bounded size messages in an arbitrary, partitionable network composed of badly- behaved channels that can lose and reorder messages. • Registers are a fundamental building block for shared memory emulations on top of message passing systems. The problem has been extensively studied in static systems. However, register emulation in dynamic systems with faulty nodes is still quite hard and there are impossibility proofs that point out scenarios where change in the system composition due to nodes entering and leaving (also called churn) makes the problem unsolvable. We propose the first emulation of a crash-fault tolerant register in a system with continuous churn where consensus is unsolvable, the size of the system can grow without bound and at most a constant fraction of the number of nodes in the system can fail by crashing. We prove a lower bound that states that fault-tolerance for dynamic systems with churn is inherently lower than in static systems. • We then extend the results in the crash-fault tolerant case to a dynamic system with continuous churn and nodes that can be Byzantine faulty. It is the first emulation of an atomic register in a system that can withstand nodes continually entering and leaving, imposes no upper bound on the system size and can tolerate Byzantine nodes. However, the number of Byzantine faulty nodes that can be tolerated is upper bounded by a constant number. Although the algorithm requires that there be a constant known upper bound on the number of Byzantine nodes, this restriction is unavoidable, as we show that it is impossible to emulate an atomic register if the system size and maximum number of servers that can be Byzantine in the system is unknown

    Fault-Tolerant Distributed Services in Message-Passing Systems

    Get PDF
    Distributed systems ranging from small local area networks to large wide area networks like the Internet composed of static and/or mobile users have become increasingly popular. A desirable property for any distributed service is fault-tolerance, which means the service remains uninterrupted even if some components in the network fail. This dissertation considers weak distributed models to find either algorithms to solve certain problems or impossibility proofs to show that a problem is unsolvable. These are the main contributions of this dissertation: • Failure detectors are used as a service to solve consensus (agreement among nodes) which is otherwise impossible in failure-prone asynchronous systems. We find an algorithm for crash-failure detection that uses bounded size messages in an arbitrary, partitionable network composed of badly- behaved channels that can lose and reorder messages. • Registers are a fundamental building block for shared memory emulations on top of message passing systems. The problem has been extensively studied in static systems. However, register emulation in dynamic systems with faulty nodes is still quite hard and there are impossibility proofs that point out scenarios where change in the system composition due to nodes entering and leaving (also called churn) makes the problem unsolvable. We propose the first emulation of a crash-fault tolerant register in a system with continuous churn where consensus is unsolvable, the size of the system can grow without bound and at most a constant fraction of the number of nodes in the system can fail by crashing. We prove a lower bound that states that fault-tolerance for dynamic systems with churn is inherently lower than in static systems. • We then extend the results in the crash-fault tolerant case to a dynamic system with continuous churn and nodes that can be Byzantine faulty. It is the first emulation of an atomic register in a system that can withstand nodes continually entering and leaving, imposes no upper bound on the system size and can tolerate Byzantine nodes. However, the number of Byzantine faulty nodes that can be tolerated is upper bounded by a constant number. Although the algorithm requires that there be a constant known upper bound on the number of Byzantine nodes, this restriction is unavoidable, as we show that it is impossible to emulate an atomic register if the system size and maximum number of servers that can be Byzantine in the system is unknown

    Distributed k-ary System: Algorithms for Distributed Hash Tables

    Get PDF
    This dissertation presents algorithms for data structures called distributed hash tables (DHT) or structured overlay networks, which are used to build scalable self-managing distributed systems. The provided algorithms guarantee lookup consistency in the presence of dynamism: they guarantee consistent lookup results in the presence of nodes joining and leaving. Similarly, the algorithms guarantee that routing never fails while nodes join and leave. Previous algorithms for lookup consistency either suffer from starvation, do not work in the presence of failures, or lack proof of correctness. Several group communication algorithms for structured overlay networks are presented. We provide an overlay broadcast algorithm, which unlike previous algorithms avoids redundant messages, reaching all nodes in O(log n) time, while using O(n) messages, where n is the number of nodes in the system. The broadcast algorithm is used to build overlay multicast. We introduce bulk operation, which enables a node to efficiently make multiple lookups or send a message to all nodes in a specified set of identifiers. The algorithm ensures that all specified nodes are reached in O(log n) time, sending maximum O(log n) messages per node, regardless of the input size of the bulk operation. Moreover, the algorithm avoids sending redundant messages. Previous approaches required multiple lookups, which consume more messages and can render the initiator a bottleneck. Our algorithms are used in DHT-based storage systems, where nodes can do thousands of lookups to fetch large files. We use the bulk operation algorithm to construct a pseudo-reliable broadcast algorithm. Bulk operations can also be used to implement efficient range queries. Finally, we describe a novel way to place replicas in a DHT, called symmetric replication, that enables parallel recursive lookups. Parallel lookups are known to reduce latencies. However, costly iterative lookups have previously been used to do parallel lookups. Moreover, joins or leaves only require exchanging O(1) messages, while other schemes require at least log(f) messages for a replication degree of f. The algorithms have been implemented in a middleware called the Distributed k-ary System (DKS), which is briefly described

    Byzantine fault-tolerant agreement protocols for wireless Ad hoc networks

    Get PDF
    Tese de doutoramento, Informática (Ciências da Computação), Universidade de Lisboa, Faculdade de Ciências, 2010.The thesis investigates the problem of fault- and intrusion-tolerant consensus in resource-constrained wireless ad hoc networks. This is a fundamental problem in distributed computing because it abstracts the need to coordinate activities among various nodes. It has been shown to be a building block for several other important distributed computing problems like state-machine replication and atomic broadcast. The thesis begins by making a thorough performance assessment of existing intrusion-tolerant consensus protocols, which shows that the performance bottlenecks of current solutions are in part related to their system modeling assumptions. Based on these results, the communication failure model is identified as a model that simultaneously captures the reality of wireless ad hoc networks and allows the design of efficient protocols. Unfortunately, the model is subject to an impossibility result stating that there is no deterministic algorithm that allows n nodes to reach agreement if more than n2 omission transmission failures can occur in a communication step. This result is valid even under strict timing assumptions (i.e., a synchronous system). The thesis applies randomization techniques in increasingly weaker variants of this model, until an efficient intrusion-tolerant consensus protocol is achieved. The first variant simplifies the problem by restricting the number of nodes that may be at the source of a transmission failure at each communication step. An algorithm is designed that tolerates f dynamic nodes at the source of faulty transmissions in a system with a total of n 3f + 1 nodes. The second variant imposes no restrictions on the pattern of transmission failures. The proposed algorithm effectively circumvents the Santoro- Widmayer impossibility result for the first time. It allows k out of n nodes to decide despite dn 2 e(nk)+k2 omission failures per communication step. This algorithm also has the interesting property of guaranteeing safety during arbitrary periods of unrestricted message loss. The final variant shares the same properties of the previous one, but relaxes the model in the sense that the system is asynchronous and that a static subset of nodes may be malicious. The obtained algorithm, called Turquois, admits f < n 3 malicious nodes, and ensures progress in communication steps where dnf 2 e(n k f) + k 2. The algorithm is subject to a comparative performance evaluation against other intrusiontolerant protocols. The results show that, as the system scales, Turquois outperforms the other protocols by more than an order of magnitude.Esta tese investiga o problema do consenso tolerante a faltas acidentais e maliciosas em redes ad hoc sem fios. Trata-se de um problema fundamental que captura a essência da coordenação em actividades envolvendo vários nós de um sistema, sendo um bloco construtor de outros importantes problemas dos sistemas distribuídos como a replicação de máquina de estados ou a difusão atómica. A tese começa por efectuar uma avaliação de desempenho a protocolos tolerantes a intrusões já existentes na literatura. Os resultados mostram que as limitações de desempenho das soluções existentes estão em parte relacionadas com o seu modelo de sistema. Baseado nestes resultados, é identificado o modelo de falhas de comunicação como um modelo que simultaneamente permite capturar o ambiente das redes ad hoc sem fios e projectar protocolos eficientes. Todavia, o modelo é restrito por um resultado de impossibilidade que afirma não existir algoritmo algum que permita a n nós chegaram a acordo num sistema que admita mais do que n2 transmissões omissas num dado passo de comunicação. Este resultado é válido mesmo sob fortes hipóteses temporais (i.e., em sistemas síncronos) A tese aplica técnicas de aleatoriedade em variantes progressivamente mais fracas do modelo até ser alcançado um protocolo eficiente e tolerante a intrusões. A primeira variante do modelo, de forma a simplificar o problema, restringe o número de nós que estão na origem de transmissões faltosas. É apresentado um algoritmo que tolera f nós dinâmicos na origem de transmissões faltosas em sistemas com um total de n 3f + 1 nós. A segunda variante do modelo não impõe quaisquer restrições no padrão de transmissões faltosas. É apresentado um algoritmo que contorna efectivamente o resultado de impossibilidade Santoro-Widmayer pela primeira vez e que permite a k de n nós efectuarem progresso nos passos de comunicação em que o número de transmissões omissas seja dn 2 e(n k) + k 2. O algoritmo possui ainda a interessante propriedade de tolerar períodos arbitrários em que o número de transmissões omissas seja superior a . A última variante do modelo partilha das mesmas características da variante anterior, mas com pressupostos mais fracos sobre o sistema. Em particular, assume-se que o sistema é assíncrono e que um subconjunto estático dos nós pode ser malicioso. O algoritmo apresentado, denominado Turquois, admite f < n 3 nós maliciosos e assegura progresso nos passos de comunicação em que dnf 2 e(n k f) + k 2. O algoritmo é sujeito a uma análise de desempenho comparativa com outros protocolos na literatura. Os resultados demonstram que, à medida que o número de nós no sistema aumenta, o desempenho do protocolo Turquois ultrapassa os restantes em mais do que uma ordem de magnitude.FC

    Self-stabilizing virtual synchrony

    Get PDF
    Virtual synchrony (VS) is an important abstraction that is proven to be extremely useful when implemented over asynchronous, typically large, message-passing distributed systems. Fault tolerant design is critical for the success of such implementations since large distributed systems can be highly available as long as they do not depend on the full operational status of every system participant. Self-stabilizing systems can tolerate transient faults that drive the system to an arbitrary unpredictable configuration. Such systems automatically regain consistency from any such configuration, and then produce the desired system behavior ensuring it for practically infinite number of successive steps, e.g., 264 steps. We present a new multi-purpose self-stabilizing counter algorithm establishing an efficient practically unbounded counter, that can directly yield a self-stabilizing Multiple-Writer Multiple-Reader (MWMR) register emulation. We use our counter algorithm, together with a selfstabilizing group membership and a self-stabilizing multicast service to devise the first practically stabilizing VS algorithm and a self-stabilizing VS-based emulation of state machine replication (SMR). As we base the SMR implementation on VS, rather than consensus, the system progresses in more extreme asynchronous settings in relation to consensusbased SMR

    The Weakest Failure Detector for Solving Wait-Free, Eventually Bounded-Fair Dining Philosophers

    Get PDF
    This dissertation explores the necessary and sufficient conditions to solve a variant of the dining philosophers problem. This dining variant is defined by three properties: wait-freedom, eventual weak exclusion, and eventual bounded fairness. Wait-freedom guarantees that every correct hungry process eventually enters its critical section, regardless of process crashes. Eventual weak exclusion guarantees that every execution has an infinite suffix during which no two live neighbors execute overlapping critical sections. Eventual bounded fairness guarantees that there exists a fairness bound k such that every execution has an infinite suffix during which no correct hungry process is overtaken more than k times by any neighbor. This dining variant (WF-EBF dining for short) is important for synchronization tasks where eventual safety (i.e., eventual weak exclusion) is sufficient for correctness (e.g., duty-cycle scheduling, self-stabilizing daemons, and contention managers). Unfortunately, it is known that wait-free dining is unsolvable in asynchronous message-passing systems subject to crash faults. To circumvent this impossibility result, it is necessary to assume the existence of bounds on timing properties, such as relative process speeds and message delivery time. As such, it is of interest to characterize the necessary and sufficient timing assumptions to solve WF-EBF dining. We focus on implicit timing assumptions, which can be encapsulated by failure detectors. Failure detectors can be viewed as distributed oracles that can be queried for potentially unreliable information about crash faults. The weakest detector D for WF-EBF dining means that D is both necessary and sufficient. Necessity means that every failure detector that solves WF-EBF dining is at least as strong as D. Sufficiency means that there exists at least one algorithm that solves WF-EBF dining using D. As such, our research goal is to characterize the weakest failure detector to solve WF-EBF dining. We prove that the eventually perfect failure detector 3P is the weakest failure detector for solving WF-EBF dining. 3P eventually suspects crashed processes permanently, but may make mistakes by wrongfully suspecting correct processes finitely many times during any execution. As such, 3P eventually stops suspecting correct processes

    Design and performance study of algorithms for consensus in sparse, mobile ad-hoc networks

    Get PDF
    PhD ThesisMobile Ad-hoc Networks (MANETs) are self-organizing wireless networks that consist of mobile wireless devices (nodes). These networks operate without the aid of any form of supporting infrastructure, and thus need the participating nodes to co-operate by forwarding each other’s messages. MANETs can be deployed when urgent temporary communications are required or when installing network infrastructure is considered too costly or too slow, for example in environments such as battlefields, crisis management or space exploration. Consensus is central to several applications including collaborative ones which a MANET can facilitate for mobile users. This thesis solves the consensus problem in a sparse MANET in which a node can at times have no other node in its wireless range and useful end-to-end connectivity between nodes can just be a temporary feature that emerges at arbitrary intervals of time for any given node pair. Efficient one-to-many dissemination, essential for consensus, now becomes a challenge: enough number of destinations cannot deliver a multicast unless nodes retain the multicast message for exercising opportunistic forwarding. Seeking to keep storage and bandwidth costs low, we propose two protocols. An eventually relinquishing (}RC) protocol that does not store messages for long is used for attempting at consensus, and an eventually quiescent (}QC) one that stops forwarding messages after a while is used for concluding consensus. Use of }RC protocol poses additional challenges for consensus, when the fraction, f n, of nodes that can crash is: 1 4 f n < 1 2 . Consensus latency and packet overhead are measured through simulation indicating that they are not too high to be feasible in MANETs. They both decrease considerably even for a modest increase in network density.Damascus University

    Self-organizing Network Optimization via Placement of Additional Nodes

    Get PDF
    Das Hauptforschungsgebiet des Graduiertenkollegs "International Graduate School on Mobile Communication" (GS Mobicom) der Technischen Universität Ilmenau ist die Kommunikation in Katastrophenszenarien. Wegen eines Desasters oder einer Katastrophe können die terrestrischen Elementen der Infrastruktur eines Kommunikationsnetzwerks beschädigt oder komplett zerstört werden. Dennoch spielen verfügbare Kommunikationsnetze eine sehr wichtige Rolle während der Rettungsmaßnahmen, besonders für die Koordinierung der Rettungstruppen und für die Kommunikation zwischen ihren Mitgliedern. Ein solcher Service kann durch ein mobiles Ad-Hoc-Netzwerk (MANET) zur Verfügung gestellt werden. Ein typisches Problem der MANETs ist Netzwerkpartitionierung, welche zur Isolation von verschiedenen Knotengruppen führt. Eine mögliche Lösung dieses Problems ist die Positionierung von zusätzlichen Knoten, welche die Verbindung zwischen den isolierten Partitionen wiederherstellen können. Hauptziele dieser Arbeit sind die Recherche und die Entwicklung von Algorithmen und Methoden zur Positionierung der zusätzlichen Knoten. Der Fokus der Recherche liegt auf Untersuchung der verteilten Algorithmen zur Bestimmung der Positionen für die zusätzlichen Knoten. Die verteilten Algorithmen benutzen nur die Information, welche in einer lokalen Umgebung eines Knotens verfügbar ist, und dadurch entsteht ein selbstorganisierendes System. Jedoch wird das gesamte Netzwerk hier vor allem innerhalb eines ganz speziellen Szenarios - Katastrophenszenario - betrachtet. In einer solchen Situation kann die Information über die Topologie des zu reparierenden Netzwerks im Voraus erfasst werden und soll, natürlich, für die Wiederherstellung mitbenutzt werden. Dank der eventuell verfügbaren zusätzlichen Information können die Positionen für die zusätzlichen Knoten genauer ermittelt werden. Die Arbeit umfasst eine Beschreibung, Implementierungsdetails und eine Evaluierung eines selbstorganisierendes Systems, welche die Netzwerkwiederherstellung in beiden Szenarien ermöglicht.The main research area of the International Graduate School on Mobile Communication (GS Mobicom) at Ilmenau University of Technology is communication in disaster scenarios. Due to a disaster or an accident, the network infrastructure can be damaged or even completely destroyed. However, available communication networks play a vital role during the rescue activities especially for the coordination of the rescue teams and for the communication between their members. Such a communication service can be provided by a Mobile Ad-Hoc Network (MANET). One of the typical problems of a MANET is network partitioning, when separate groups of nodes become isolated from each other. One possible solution for this problem is the placement of additional nodes in order to reconstruct the communication links between isolated network partitions. The primary goal of this work is the research and development of algorithms and methods for the placement of additional nodes. The focus of this research lies on the investigation of distributed algorithms for the placement of additional nodes, which use only the information from the nodes’ local environment and thus form a self-organizing system. However, during the usage specifics of the system in a disaster scenario, global information about the topology of the network to be recovered can be known or collected in advance. In this case, it is of course reasonable to use this information in order to calculate the placement positions more precisely. The work provides the description, the implementation details and the evaluation of a self-organizing system which is able to recover from network partitioning in both situations
    corecore