A Dual Digraph Approach for Leaderless Atomic Broadcast (Extended Version)
Many distributed systems work on a common shared state; in such systems,
distributed agreement is necessary for consistency. With an increasing number
of servers, these systems become more susceptible to single-server failures,
increasing the relevance of fault-tolerance. Atomic broadcast enables
fault-tolerant distributed agreement, yet it is costly to solve. Most practical
algorithms entail linear work per broadcast message. AllConcur -- a leaderless
approach -- reduces the work, by connecting the servers via a sparse resilient
overlay network; yet, this resiliency entails redundancy, limiting the
reduction of work. In this paper, we propose AllConcur+, an atomic broadcast
algorithm that lifts this limitation: During intervals with no failures, it
achieves minimal work by using a redundancy-free overlay network. When failures
do occur, it automatically recovers by switching to a resilient overlay
network. In our performance evaluation of non-failure scenarios, AllConcur+
achieves comparable throughput to AllGather -- a non-fault-tolerant distributed
agreement algorithm -- and outperforms AllConcur, LCR and Libpaxos both in
terms of throughput and latency. Furthermore, our evaluation of failure
scenarios shows that AllConcur+'s expected performance is robust with regard to
occasional failures. Thus, for realistic use cases, leveraging redundancy-free
distributed agreement during intervals with no failures improves performance
significantly.
Comment: Overview: 24 pages, 6 sections, 3 appendices, 8 figures, 3 tables.
Modifications from previous version: extended the evaluation of AllConcur+
with a simulation of a multiple datacenters deploymen
Effective Edge-Fault-Tolerant Single-Source Spanners via Best (or Good) Swap Edges
Computing \emph{all best swap edges} (ABSE) of a spanning tree $T$ of a given
$n$-vertex and $m$-edge undirected and weighted graph $G$ means to select, for
each edge $e$ of $T$, a corresponding non-tree edge $f$, in such a way that the
tree obtained by replacing $e$ with $f$ enjoys some optimality criterion (which
is naturally defined according to some objective function originally addressed
by $T$). Solving efficiently an ABSE problem is by now a classic algorithmic
issue, since it conveys a very successful way of coping with a (transient)
\emph{edge failure} in tree-based communication networks: just replace the
failing edge with its respective swap edge, so that the connectivity is
promptly reestablished by minimizing the rerouting and set-up costs. In this
paper, we solve the ABSE problem for the case in which $T$ is a
\emph{single-source shortest-path tree} of $G$, and our two selected swap
criteria aim to minimize either the \emph{maximum} or the \emph{average
stretch} in the swap tree of all the paths emanating from the source. Having
these criteria in mind, the obtained structures can then be viewed as
\emph{edge-fault-tolerant single-source spanners}. For them, we propose two
efficient algorithms running in $O(mn + n^2 \log n)$ and $O(mn \log \alpha(m,n))$ time, respectively, and we show that the guaranteed (either
maximum or average, respectively) stretch factor is equal to 3, and this is
tight. Moreover, for the maximum stretch, we also propose an almost linear $O(m \log \alpha(m,n))$ time algorithm computing a set of \emph{good} swap edges,
each of which will guarantee a relative approximation factor on the maximum
stretch of $3/2$ (tight) as opposed to that provided by the corresponding BSE.
Surprisingly, no previous results were known for these two very natural swap
problems.
Comment: 15 pages, 4 figures, SIROCCO 201
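The swap-edge idea described in this abstract can be illustrated with a brute-force sketch. This is not the paper's efficient algorithm. It assumes a simple connected graph with positive weights, and it assumes that the stretch of a vertex is the ratio between its source distance in the swap tree and its source distance in the graph with the failed edge removed (the abstract does not spell this out). The function names `dijkstra`, `build_adj` and `best_swap_edges` are hypothetical.

```python
import heapq

def build_adj(n, edges):
    """Adjacency lists for an undirected weighted graph given as (u, v, w) triples."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    return adj

def dijkstra(n, adj, src):
    """Return (dist, parent) for shortest paths from src; assumes positive weights."""
    dist = [float("inf")] * n
    parent = [None] * n
    dist[src] = 0.0
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                parent[v] = u
                heapq.heappush(pq, (d + w, v))
    return dist, parent

def best_swap_edges(n, edges, src):
    """For each edge of the shortest-path tree rooted at src, pick the non-tree
    edge that reconnects the tree and minimizes the maximum stretch (brute force).
    Returns {tree_edge_key: (swap_edge, max_stretch)} or None if no swap exists."""
    key = lambda u, v: (min(u, v), max(u, v))
    _, parent = dijkstra(n, build_adj(n, edges), src)
    tree = {key(v, parent[v]) for v in range(n) if parent[v] is not None}
    tree_edges = [x for x in edges if key(x[0], x[1]) in tree]
    result = {}
    for e in tree_edges:
        rest = [x for x in edges if x != e]          # the graph without e
        d_fail, _ = dijkstra(n, build_adj(n, rest), src)
        best = None
        for f in rest:
            if key(f[0], f[1]) in tree:
                continue                              # only non-tree edges can swap in
            swap_tree = [x for x in tree_edges if x != e] + [f]
            d_swap, _ = dijkstra(n, build_adj(n, swap_tree), src)
            if any(d == float("inf") for d in d_swap):
                continue                              # f does not reconnect the tree
            stretch = max(d_swap[v] / d_fail[v] for v in range(n) if v != src)
            if best is None or stretch < best[1]:
                best = (f, stretch)
        result[key(e[0], e[1])] = best
    return result
```

Calling `best_swap_edges(4, [(0, 1, 1), (1, 2, 1), (0, 2, 3), (2, 3, 1), (1, 3, 3)], 0)` returns one entry per tree edge. Each run of Dijkstra per candidate makes this far slower than the bounds stated in the abstract; it is only meant to make the definition concrete.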
Pinwheel Scheduling for Fault-tolerant Broadcast Disks in Real-time Database Systems
The design of programs for broadcast disks which incorporate real-time and fault-tolerance requirements is considered. A generalized model for real-time fault-tolerant broadcast disks is defined. It is shown that designing programs for broadcast disks specified in this model is closely related to the scheduling of pinwheel task systems. Some new results in pinwheel scheduling theory are derived, which facilitate the efficient generation of real-time fault-tolerant broadcast disk programs.
National Science Foundation (CCR-9308344, CCR-9596282
Optimal Gossip with Direct Addressing
Gossip algorithms spread information by having nodes repeatedly forward
information to a few random contacts. By their very nature, gossip algorithms
tend to be distributed and fault tolerant. If done right, they can also be fast
and message-efficient. A common model for gossip communication is the random
phone call model, in which in each synchronous round each node can PUSH or PULL
information to or from a random other node. For example, Karp et al. [FOCS
2000] gave algorithms in this model that spread a message to all nodes in
$O(\log n)$ rounds while sending only $O(\log \log n)$ messages per node
on average.
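The random phone call model described above can be made concrete with a small simulation. The sketch below is a plain PUSH-PULL process, not the algorithm of Karp et al. or of this paper. It assumes every node calls one uniformly random other node per round and that updates take effect at the end of the round. The function name `push_pull_rounds` and its parameters are hypothetical.

```python
import random

def push_pull_rounds(n, seed=0):
    """Simulate PUSH-PULL gossip in the random phone call model.

    Node 0 starts with the message. Returns the number of synchronous
    rounds until all n nodes are informed.
    """
    rng = random.Random(seed)
    informed = [False] * n
    informed[0] = True
    rounds = 0
    while not all(informed):
        nxt = informed[:]  # changes become visible only in the next round
        for u in range(n):
            # pick a uniformly random node other than u
            v = rng.randrange(n - 1)
            if v >= u:
                v += 1
            if informed[u]:
                nxt[v] = True   # PUSH: the caller sends the message
            elif informed[v]:
                nxt[u] = True   # PULL: the caller fetches the message
        informed = nxt
        rounds += 1
    return rounds
```

Running `push_pull_rounds(n)` for growing `n` gives a rough feel for how slowly the round count grows with the number of nodes in this model.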
Recently, Avin and Elsässer [DISC 2013] studied the random phone call
model with the natural and commonly used assumption of direct addressing.
Direct addressing allows nodes to directly contact nodes whose ID (e.g., IP
address) was learned before. They show that in this setting, one can "break the
$\log n$ barrier" and achieve a gossip algorithm running in $O(\sqrt{\log n})$
rounds, albeit while using $O(\sqrt{\log n})$ messages per node.
We study the same model and give a simple gossip algorithm which spreads a
message in only $O(\log \log n)$ rounds. We also prove a matching $\Omega(\log \log n)$ lower bound which shows that this running time is best possible. In
particular we show that any gossip algorithm takes with high probability at
least $0.99 \log \log n$ rounds to terminate. Lastly, our algorithm can be
tweaked to send only $O(1)$ messages per node on average with only $O(\log n)$
bits per message. Our algorithm therefore simultaneously achieves the optimal
round-, message-, and bit-complexity for this setting. Like all prior gossip
algorithms, our algorithm is also robust against failures. In particular, if in
the beginning an oblivious adversary fails any $F$ nodes our algorithm still,
with high probability, informs all but $o(F)$ surviving nodes
- …