1,723 research outputs found
Fault-Tolerant Distributed Services in Message-Passing Systems
Distributed systems ranging from small local area networks to large wide area networks like the Internet composed of static and/or mobile users have become increasingly popular. A desirable property for any distributed service is fault-tolerance, which means the service remains uninterrupted even if some components in the network fail. This dissertation considers weak distributed models to find either algorithms to solve certain problems or impossibility proofs to show that a problem is unsolvable. These are the main contributions of this dissertation:
• Failure detectors are used as a service to solve consensus (agreement among nodes) which is otherwise impossible in failure-prone asynchronous systems. We find an algorithm for crash-failure detection that uses bounded size messages in an arbitrary, partitionable network composed of badly- behaved channels that can lose and reorder messages.
• Registers are a fundamental building block for shared memory emulations on top of message passing systems. The problem has been extensively studied in static systems. However, register emulation in dynamic systems with faulty nodes is still quite hard and there are impossibility proofs that point out scenarios where change in the system composition due to nodes entering and leaving (also called churn) makes the problem unsolvable.
We propose the first emulation of a crash-fault tolerant register in a system with continuous churn where consensus is unsolvable, the size of the system can grow without bound and at most a constant fraction of the number of nodes in the system can fail by crashing. We prove a lower bound that states that fault-tolerance for dynamic systems with churn is inherently lower than in static systems.
• We then extend the results in the crash-fault tolerant case to a dynamic system with continuous churn and nodes that can be Byzantine faulty. It is the first emulation of an atomic register in a system that can withstand nodes continually entering and leaving, imposes no upper bound on the system size and can tolerate Byzantine nodes. However, the number of Byzantine faulty nodes that can be tolerated is upper bounded by a constant number. Although the algorithm requires that there be a constant known upper bound on the number of Byzantine nodes, this restriction is unavoidable, as we show that it is impossible to emulate an atomic register if the system size and maximum number of servers that can be Byzantine in the system is unknown
Fault-Tolerant Distributed Services in Message-Passing Systems
Distributed systems ranging from small local area networks to large wide area networks like the Internet composed of static and/or mobile users have become increasingly popular. A desirable property for any distributed service is fault-tolerance, which means the service remains uninterrupted even if some components in the network fail. This dissertation considers weak distributed models to find either algorithms to solve certain problems or impossibility proofs to show that a problem is unsolvable. These are the main contributions of this dissertation:
• Failure detectors are used as a service to solve consensus (agreement among nodes) which is otherwise impossible in failure-prone asynchronous systems. We find an algorithm for crash-failure detection that uses bounded size messages in an arbitrary, partitionable network composed of badly- behaved channels that can lose and reorder messages.
• Registers are a fundamental building block for shared memory emulations on top of message passing systems. The problem has been extensively studied in static systems. However, register emulation in dynamic systems with faulty nodes is still quite hard and there are impossibility proofs that point out scenarios where change in the system composition due to nodes entering and leaving (also called churn) makes the problem unsolvable.
We propose the first emulation of a crash-fault tolerant register in a system with continuous churn where consensus is unsolvable, the size of the system can grow without bound and at most a constant fraction of the number of nodes in the system can fail by crashing. We prove a lower bound that states that fault-tolerance for dynamic systems with churn is inherently lower than in static systems.
• We then extend the results in the crash-fault tolerant case to a dynamic system with continuous churn and nodes that can be Byzantine faulty. It is the first emulation of an atomic register in a system that can withstand nodes continually entering and leaving, imposes no upper bound on the system size and can tolerate Byzantine nodes. However, the number of Byzantine faulty nodes that can be tolerated is upper bounded by a constant number. Although the algorithm requires that there be a constant known upper bound on the number of Byzantine nodes, this restriction is unavoidable, as we show that it is impossible to emulate an atomic register if the system size and maximum number of servers that can be Byzantine in the system is unknown
Distributed k-ary System: Algorithms for Distributed Hash Tables
This dissertation presents algorithms for data structures called distributed hash tables (DHT) or structured overlay networks, which are used to build scalable self-managing distributed systems. The provided algorithms guarantee lookup consistency in the presence of dynamism: they guarantee consistent lookup results in the presence of nodes joining and leaving. Similarly, the algorithms guarantee that routing never fails while nodes join and leave. Previous algorithms for lookup consistency either suffer from starvation, do not work in the presence of failures, or lack proof of correctness.
Several group communication algorithms for structured overlay networks are presented. We provide an overlay broadcast algorithm, which unlike previous algorithms avoids redundant messages, reaching all nodes in O(log n) time, while using O(n) messages, where n is the number of nodes in the system. The broadcast algorithm is used to build overlay multicast.
We introduce bulk operation, which enables a node to efficiently make multiple lookups or send a message to all nodes in a specified set of identifiers. The algorithm ensures that all specified nodes are reached in O(log n) time, sending maximum O(log n) messages per node, regardless of the input size of the bulk operation. Moreover, the algorithm avoids sending redundant messages. Previous approaches required multiple lookups, which consume more messages and can render the initiator a bottleneck. Our algorithms are used in DHT-based storage systems, where nodes can do thousands of lookups to fetch large files. We use the bulk operation algorithm to construct a pseudo-reliable broadcast algorithm. Bulk operations can also be used to implement efficient range queries.
Finally, we describe a novel way to place replicas in a DHT, called symmetric replication, that enables parallel recursive lookups. Parallel lookups are known to reduce latencies. However, costly iterative lookups have previously been used to do parallel lookups. Moreover, joins or leaves only require exchanging O(1) messages, while other schemes require at least log(f) messages for a replication degree of f.
The algorithms have been implemented in a middleware called the Distributed k-ary System (DKS), which is briefly described
Byzantine fault-tolerant agreement protocols for wireless Ad hoc networks
Tese de doutoramento, Informática (Ciências da Computação), Universidade de Lisboa, Faculdade de Ciências, 2010.The thesis investigates the problem of fault- and intrusion-tolerant consensus
in resource-constrained wireless ad hoc networks. This is a fundamental
problem in distributed computing because it abstracts the need
to coordinate activities among various nodes. It has been shown to be a
building block for several other important distributed computing problems
like state-machine replication and atomic broadcast.
The thesis begins by making a thorough performance assessment of existing
intrusion-tolerant consensus protocols, which shows that the performance
bottlenecks of current solutions are in part related to their system
modeling assumptions. Based on these results, the communication failure
model is identified as a model that simultaneously captures the reality
of wireless ad hoc networks and allows the design of efficient protocols.
Unfortunately, the model is subject to an impossibility result stating that
there is no deterministic algorithm that allows n nodes to reach agreement
if more than n2 omission transmission failures can occur in a communication
step. This result is valid even under strict timing assumptions (i.e.,
a synchronous system).
The thesis applies randomization techniques in increasingly weaker variants
of this model, until an efficient intrusion-tolerant consensus protocol
is achieved. The first variant simplifies the problem by restricting the
number of nodes that may be at the source of a transmission failure at
each communication step. An algorithm is designed that tolerates f dynamic
nodes at the source of faulty transmissions in a system with a total
of n 3f + 1 nodes.
The second variant imposes no restrictions on the pattern of transmission
failures. The proposed algorithm effectively circumvents the Santoro-
Widmayer impossibility result for the first time. It allows k out of n nodes
to decide despite dn
2 e(nk)+k2 omission failures per communication
step. This algorithm also has the interesting property of guaranteeing
safety during arbitrary periods of unrestricted message loss.
The final variant shares the same properties of the previous one, but relaxes
the model in the sense that the system is asynchronous and that a
static subset of nodes may be malicious. The obtained algorithm, called
Turquois, admits f < n
3 malicious nodes, and ensures progress in communication
steps where dnf
2 e(n k f) + k 2. The algorithm is
subject to a comparative performance evaluation against other intrusiontolerant
protocols. The results show that, as the system scales, Turquois
outperforms the other protocols by more than an order of magnitude.Esta tese investiga o problema do consenso tolerante a faltas acidentais
e maliciosas em redes ad hoc sem fios. Trata-se de um problema fundamental
que captura a essência da coordenação em actividades envolvendo
vários nós de um sistema, sendo um bloco construtor de outros importantes
problemas dos sistemas distribuídos como a replicação de máquina
de estados ou a difusão atómica.
A tese começa por efectuar uma avaliação de desempenho a protocolos
tolerantes a intrusões já existentes na literatura. Os resultados mostram
que as limitações de desempenho das soluções existentes estão em parte
relacionadas com o seu modelo de sistema. Baseado nestes resultados, é
identificado o modelo de falhas de comunicação como um modelo que simultaneamente
permite capturar o ambiente das redes ad hoc sem fios e
projectar protocolos eficientes. Todavia, o modelo é restrito por um resultado
de impossibilidade que afirma não existir algoritmo algum que permita
a n nós chegaram a acordo num sistema que admita mais do que n2
transmissões omissas num dado passo de comunicação. Este resultado é
válido mesmo sob fortes hipóteses temporais (i.e., em sistemas síncronos)
A tese aplica técnicas de aleatoriedade em variantes progressivamente
mais fracas do modelo até ser alcançado um protocolo eficiente e tolerante
a intrusões. A primeira variante do modelo, de forma a simplificar
o problema, restringe o número de nós que estão na origem de transmissões
faltosas. É apresentado um algoritmo que tolera f nós dinâmicos na
origem de transmissões faltosas em sistemas com um total de n 3f + 1
nós.
A segunda variante do modelo não impõe quaisquer restrições no padrão
de transmissões faltosas. É apresentado um algoritmo que contorna efectivamente
o resultado de impossibilidade Santoro-Widmayer pela primeira
vez e que permite a k de n nós efectuarem progresso nos passos de comunicação
em que o número de transmissões omissas seja dn
2 e(n
k) + k 2. O algoritmo possui ainda a interessante propriedade de tolerar
períodos arbitrários em que o número de transmissões omissas seja
superior a .
A última variante do modelo partilha das mesmas características da variante
anterior, mas com pressupostos mais fracos sobre o sistema. Em particular,
assume-se que o sistema é assíncrono e que um subconjunto estático
dos nós pode ser malicioso. O algoritmo apresentado, denominado
Turquois, admite f < n
3 nós maliciosos e assegura progresso nos passos
de comunicação em que dnf
2 e(n k f) + k 2. O algoritmo é
sujeito a uma análise de desempenho comparativa com outros protocolos
na literatura. Os resultados demonstram que, à medida que o número de
nós no sistema aumenta, o desempenho do protocolo Turquois ultrapassa
os restantes em mais do que uma ordem de magnitude.FC
Self-stabilizing virtual synchrony
Virtual synchrony (VS) is an important abstraction that is proven to be extremely useful when implemented over asynchronous, typically large, message-passing distributed systems. Fault tolerant design is critical for the success of such implementations since large distributed systems can be highly available as long as they do not depend on the full operational status of every system participant. Self-stabilizing systems can tolerate transient faults that drive the system to an arbitrary unpredictable configuration. Such systems automatically regain consistency from any such configuration, and then produce the desired system behavior ensuring it for practically infinite number of successive steps, e.g., 264 steps. We present a new multi-purpose self-stabilizing counter algorithm establishing an efficient practically unbounded counter, that can directly yield a self-stabilizing Multiple-Writer Multiple-Reader (MWMR) register emulation. We use our counter algorithm, together with a selfstabilizing group membership and a self-stabilizing multicast service to devise the first practically stabilizing VS algorithm and a self-stabilizing VS-based emulation of state machine replication (SMR). As we base the SMR implementation on VS, rather than consensus, the system progresses in more extreme asynchronous settings in relation to consensusbased SMR
The Weakest Failure Detector for Solving Wait-Free, Eventually Bounded-Fair Dining Philosophers
This dissertation explores the necessary and sufficient conditions to solve a variant
of the dining philosophers problem. This dining variant is defined by three properties:
wait-freedom, eventual weak exclusion, and eventual bounded fairness. Wait-freedom
guarantees that every correct hungry process eventually enters its critical
section, regardless of process crashes. Eventual weak exclusion guarantees that every
execution has an infinite suffix during which no two live neighbors execute overlapping
critical sections. Eventual bounded fairness guarantees that there exists a
fairness bound k such that every execution has an infinite suffix during which no
correct hungry process is overtaken more than k times by any neighbor. This dining
variant (WF-EBF dining for short) is important for synchronization tasks where eventual
safety (i.e., eventual weak exclusion) is sufficient for correctness (e.g., duty-cycle
scheduling, self-stabilizing daemons, and contention managers).
Unfortunately, it is known that wait-free dining is unsolvable in asynchronous
message-passing systems subject to crash faults. To circumvent this impossibility
result, it is necessary to assume the existence of bounds on timing properties, such
as relative process speeds and message delivery time. As such, it is of interest to
characterize the necessary and sufficient timing assumptions to solve WF-EBF dining.
We focus on implicit timing assumptions, which can be encapsulated by failure detectors. Failure detectors can be viewed as distributed oracles that can be queried
for potentially unreliable information about crash faults. The weakest detector D for
WF-EBF dining means that D is both necessary and sufficient. Necessity means that
every failure detector that solves WF-EBF dining is at least as strong as D. Sufficiency
means that there exists at least one algorithm that solves WF-EBF dining using D.
As such, our research goal is to characterize the weakest failure detector to solve
WF-EBF dining.
We prove that the eventually perfect failure detector 3P is the weakest failure
detector for solving WF-EBF dining. 3P eventually suspects crashed processes permanently,
but may make mistakes by wrongfully suspecting correct processes finitely
many times during any execution. As such, 3P eventually stops suspecting correct
processes
Design and performance study of algorithms for consensus in sparse, mobile ad-hoc networks
PhD ThesisMobile Ad-hoc Networks (MANETs) are self-organizing wireless networks that consist
of mobile wireless devices (nodes). These networks operate without the aid
of any form of supporting infrastructure, and thus need the participating nodes to
co-operate by forwarding each other’s messages. MANETs can be deployed when
urgent temporary communications are required or when installing network infrastructure
is considered too costly or too slow, for example in environments such as
battlefields, crisis management or space exploration.
Consensus is central to several applications including collaborative ones which a
MANET can facilitate for mobile users. This thesis solves the consensus problem in
a sparse MANET in which a node can at times have no other node in its wireless
range and useful end-to-end connectivity between nodes can just be a temporary
feature that emerges at arbitrary intervals of time for any given node pair.
Efficient one-to-many dissemination, essential for consensus, now becomes a challenge:
enough number of destinations cannot deliver a multicast unless nodes retain
the multicast message for exercising opportunistic forwarding. Seeking to keep storage
and bandwidth costs low, we propose two protocols. An eventually relinquishing
(}RC) protocol that does not store messages for long is used for attempting at consensus,
and an eventually quiescent (}QC) one that stops forwarding messages
after a while is used for concluding consensus. Use of }RC protocol poses additional
challenges for consensus, when the fraction, f
n, of nodes that can crash is:
1
4 f
n < 1
2 .
Consensus latency and packet overhead are measured through simulation indicating
that they are not too high to be feasible in MANETs. They both decrease
considerably even for a modest increase in network density.Damascus University
Self-organizing Network Optimization via Placement of Additional Nodes
Das Hauptforschungsgebiet des Graduiertenkollegs "International Graduate
School on Mobile Communication" (GS Mobicom) der Technischen Universität
Ilmenau ist die Kommunikation in Katastrophenszenarien. Wegen eines
Desasters oder einer Katastrophe können die terrestrischen Elementen der
Infrastruktur eines Kommunikationsnetzwerks beschädigt oder komplett
zerstört werden. Dennoch spielen verfügbare Kommunikationsnetze eine sehr
wichtige Rolle während der Rettungsmaßnahmen, besonders für die
Koordinierung der Rettungstruppen und für die Kommunikation zwischen ihren
Mitgliedern. Ein solcher Service kann durch ein mobiles Ad-Hoc-Netzwerk
(MANET) zur Verfügung gestellt werden. Ein typisches Problem der MANETs
ist Netzwerkpartitionierung, welche zur Isolation von verschiedenen
Knotengruppen führt. Eine mögliche Lösung dieses Problems ist die
Positionierung von zusätzlichen Knoten, welche die Verbindung zwischen den
isolierten Partitionen wiederherstellen können. Hauptziele dieser Arbeit
sind die Recherche und die Entwicklung von Algorithmen und Methoden zur
Positionierung der zusätzlichen Knoten. Der Fokus der Recherche liegt auf
Untersuchung der verteilten Algorithmen zur Bestimmung der Positionen für
die zusätzlichen Knoten. Die verteilten Algorithmen benutzen nur die
Information, welche in einer lokalen Umgebung eines Knotens verfügbar ist,
und dadurch entsteht ein selbstorganisierendes System. Jedoch wird das
gesamte Netzwerk hier vor allem innerhalb eines ganz speziellen Szenarios -
Katastrophenszenario - betrachtet. In einer solchen Situation kann die
Information über die Topologie des zu reparierenden Netzwerks im Voraus
erfasst werden und soll, natürlich, für die Wiederherstellung mitbenutzt
werden. Dank der eventuell verfügbaren zusätzlichen Information können
die Positionen für die zusätzlichen Knoten genauer ermittelt werden. Die
Arbeit umfasst eine Beschreibung, Implementierungsdetails und eine
Evaluierung eines selbstorganisierendes Systems, welche die
Netzwerkwiederherstellung in beiden Szenarien ermöglicht.The main research area of the International Graduate School on Mobile
Communication (GS Mobicom) at Ilmenau University of Technology is
communication in disaster scenarios. Due to a disaster or an accident, the
network infrastructure can be damaged or even completely destroyed.
However, available communication networks play a vital role during the
rescue activities especially for the coordination of the rescue teams and
for the communication between their members. Such a communication service
can be provided by a Mobile Ad-Hoc Network (MANET). One of the typical
problems of a MANET is network partitioning, when separate groups of nodes
become isolated from each other. One possible solution for this problem is
the placement of additional nodes in order to reconstruct the communication
links between isolated network partitions. The primary goal of this work is
the research and development of algorithms and methods for the placement of
additional nodes. The focus of this research lies on the investigation of
distributed algorithms for the placement of additional nodes, which use
only the information from the nodes’ local environment and thus form a
self-organizing system. However, during the usage specifics of the system
in a disaster scenario, global information about the topology of the
network to be recovered can be known or collected in advance. In this case,
it is of course reasonable to use this information in order to calculate
the placement positions more precisely. The work provides the description,
the implementation details and the evaluation of a self-organizing system
which is able to recover from network partitioning in both situations
- …