13,684 research outputs found
Partially ordered distributed computations on asynchronous point-to-point networks
Asynchronous executions of a distributed algorithm differ from each other due
to the nondeterminism in the order in which the messages exchanged are handled.
In many situations of interest, the asynchronous executions induced by
restricting nondeterminism are more efficient, in an application-specific
sense, than the others. In this work, we define partially ordered executions of
a distributed algorithm as the executions satisfying some restricted orders of
their actions in two different frameworks, those of the so-called event- and
pulse-driven computations. The aim of these restrictions is to characterize
asynchronous executions that are likely to be more efficient for some important
classes of applications. Also, an asynchronous algorithm that ensures the
occurrence of partially ordered executions is given for each case. Two of the
applications that we believe may benefit from the restricted nondeterminism are
backtrack search, in the event-driven case, and iterative algorithms for
systems of linear equations, in the pulse-driven case
VCube-PS: A Causal Broadcast Topic-based Publish/Subscribe System
In this work we present VCube-PS, a topic-based Publish/Subscribe system
built on the top of a virtual hypercube-like topology. Membership information
and published messages are broadcast to subscribers (members) of a topic group
over dynamically built spanning trees rooted at the publisher. For a given
topic, the delivery of published messages respects the causal order. VCube-PS
was implemented on the PeerSim simulator, and experiments are reported
including a comparison with the traditional Publish/Subscribe approach that
employs a single rooted static spanning-tree for message distribution. Results
confirm the efficiency of VCube-PS in terms of scalability, latency, number and
size of messages.Comment: Improved text and performance evaluation. Added proof for the
algorithms (Section 3.4
Update Consistency for Wait-free Concurrent Objects
In large scale systems such as the Internet, replicating data is an essential
feature in order to provide availability and fault-tolerance. Attiya and Welch
proved that using strong consistency criteria such as atomicity is costly as
each operation may need an execution time linear with the latency of the
communication network. Weaker consistency criteria like causal consistency and
PRAM consistency do not ensure convergence. The different replicas are not
guaranteed to converge towards a unique state. Eventual consistency guarantees
that all replicas eventually converge when the participants stop updating.
However, it fails to fully specify the semantics of the operations on shared
objects and requires additional non-intuitive and error-prone distributed
specification techniques. This paper introduces and formalizes a new
consistency criterion, called update consistency, that requires the state of a
replicated object to be consistent with a linearization of all the updates. In
other words, whereas atomicity imposes a linearization of all of the
operations, this criterion imposes this only on updates. Consequently some read
operations may return out-dated values. Update consistency is stronger than
eventual consistency, so we can replace eventually consistent objects with
update consistent ones in any program. Finally, we prove that update
consistency is universal, in the sense that any object can be implemented under
this criterion in a distributed system where any number of nodes may crash.Comment: appears in International Parallel and Distributed Processing
Symposium, May 2015, Hyderabad, Indi
Programming with process groups: Group and multicast semantics
Process groups are a natural tool for distributed programming and are increasingly important in distributed computing environments. Discussed here is a new architecture that arose from an effort to simplify Isis process group semantics. The findings include a refined notion of how the clients of a group should be treated, what the properties of a multicast primitive should be when systems contain large numbers of overlapping groups, and a new construct called the causality domain. A system based on this architecture is now being implemented in collaboration with the Chorus and Mach projects
A Configurable Transport Layer for CAF
The message-driven nature of actors lays a foundation for developing scalable
and distributed software. While the actor itself has been thoroughly modeled,
the message passing layer lacks a common definition. Properties and guarantees
of message exchange often shift with implementations and contexts. This adds
complexity to the development process, limits portability, and removes
transparency from distributed actor systems.
In this work, we examine actor communication, focusing on the implementation
and runtime costs of reliable and ordered delivery. Both guarantees are often
based on TCP for remote messaging, which mixes network transport with the
semantics of messaging. However, the choice of transport may follow different
constraints and is often governed by deployment. As a first step towards
re-architecting actor-to-actor communication, we decouple the messaging
guarantees from the transport protocol. We validate our approach by redesigning
the network stack of the C++ Actor Framework (CAF) so that it allows to combine
an arbitrary transport protocol with additional functions for remote messaging.
An evaluation quantifies the cost of composability and the impact of individual
layers on the entire stack
A Dual Digraph Approach for Leaderless Atomic Broadcast (Extended Version)
Many distributed systems work on a common shared state; in such systems,
distributed agreement is necessary for consistency. With an increasing number
of servers, these systems become more susceptible to single-server failures,
increasing the relevance of fault-tolerance. Atomic broadcast enables
fault-tolerant distributed agreement, yet it is costly to solve. Most practical
algorithms entail linear work per broadcast message. AllConcur -- a leaderless
approach -- reduces the work, by connecting the servers via a sparse resilient
overlay network; yet, this resiliency entails redundancy, limiting the
reduction of work. In this paper, we propose AllConcur+, an atomic broadcast
algorithm that lifts this limitation: During intervals with no failures, it
achieves minimal work by using a redundancy-free overlay network. When failures
do occur, it automatically recovers by switching to a resilient overlay
network. In our performance evaluation of non-failure scenarios, AllConcur+
achieves comparable throughput to AllGather -- a non-fault-tolerant distributed
agreement algorithm -- and outperforms AllConcur, LCR and Libpaxos both in
terms of throughput and latency. Furthermore, our evaluation of failure
scenarios shows that AllConcur+'s expected performance is robust with regard to
occasional failures. Thus, for realistic use cases, leveraging redundancy-free
distributed agreement during intervals with no failures improves performance
significantly.Comment: Overview: 24 pages, 6 sections, 3 appendices, 8 figures, 3 tables.
Modifications from previous version: extended the evaluation of AllConcur+
with a simulation of a multiple datacenters deploymen
- âŠ