A Dual Digraph Approach for Leaderless Atomic Broadcast (Extended Version)
Many distributed systems work on a common shared state; in such systems,
distributed agreement is necessary for consistency. With an increasing number
of servers, these systems become more susceptible to single-server failures,
increasing the relevance of fault-tolerance. Atomic broadcast enables
fault-tolerant distributed agreement, yet it is costly to solve. Most practical
algorithms entail linear work per broadcast message. AllConcur -- a leaderless
approach -- reduces the work, by connecting the servers via a sparse resilient
overlay network; yet, this resiliency entails redundancy, limiting the
reduction of work. In this paper, we propose AllConcur+, an atomic broadcast
algorithm that lifts this limitation: During intervals with no failures, it
achieves minimal work by using a redundancy-free overlay network. When failures
do occur, it automatically recovers by switching to a resilient overlay
network. In our performance evaluation of non-failure scenarios, AllConcur+
achieves comparable throughput to AllGather -- a non-fault-tolerant distributed
agreement algorithm -- and outperforms AllConcur, LCR and Libpaxos both in
terms of throughput and latency. Furthermore, our evaluation of failure
scenarios shows that AllConcur+'s expected performance is robust with regard to
occasional failures. Thus, for realistic use cases, leveraging redundancy-free
distributed agreement during intervals with no failures improves performance
significantly.
Comment: Overview: 24 pages, 6 sections, 3 appendices, 8 figures, 3 tables.
Modifications from previous version: extended the evaluation of AllConcur+
with a simulation of a multi-datacenter deployment.
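The leaderless agreement idea in the abstract above can be illustrated with a minimal sketch. It assumes a failure-free round and a full mesh in which every server sends directly to every other server; it does not model the sparse overlay digraphs or the recovery mechanism of AllConcur or AllConcur+. The function name `run_round` and the server ids are hypothetical.

```python
from typing import Dict, List


def run_round(pending: Dict[int, str]) -> Dict[int, List[str]]:
    """Simulate one failure-free round among the servers in `pending`.

    Every server sends its message to every server (an all-gather),
    then delivers the complete set in ascending sender-id order.
    """
    # Each server's view of the round: sender id -> message.
    views: Dict[int, Dict[int, str]] = {sid: {} for sid in pending}
    for sender, msg in pending.items():
        for receiver in views:
            views[receiver][sender] = msg
    # No leader is involved: sorting by sender id gives every server
    # the same delivery order.
    return {sid: [view[s] for s in sorted(view)] for sid, view in views.items()}


delivered = run_round({2: "c", 0: "a", 1: "b"})
```

Because every server ends the round with the same set of messages and applies the same deterministic order, all servers deliver identical sequences. In this full-mesh sketch each message is sent once per server, which is the linear work per broadcast message that the abstract describes as typical.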
Building Regular Registers with Rational Malicious Servers and Anonymous Clients
The paper addresses the problem of emulating a regular register in a synchronous distributed system where clients invoking read() and write() operations are anonymous, while server processes maintaining the state of the register may be compromised by rational adversaries (i.e., a server might behave as a rational malicious Byzantine process). We first model our problem as a Bayesian game between a client and a rational malicious server, where the equilibrium depends on the decisions of the malicious server (behaving correctly and not being detected by clients vs. returning a wrong register value to clients, with the risk of being detected and then excluded from the computation). We prove that such an equilibrium exists, and finally we design a protocol implementing the regular register that forces the rational malicious server to behave correctly.
Distributed protocols as behaviours in Erlang
We investigate the implementation of standard algorithms for three classes of Distributed Agreement problems in Erlang, an industry-strength language for programming fault-tolerant distributed systems. We develop a framework to bridge the gap between the assumptions of these standard algorithms and the network abstraction provided by Erlang, and structure our implementations as reusable behaviours within this framework. Peer-reviewed.
Improving the Scalability of DPWS-Based Networked Infrastructures
The Devices Profile for Web Services (DPWS) specification enables seamless
discovery, configuration, and interoperability of networked devices in various
settings, ranging from home automation and multimedia to manufacturing
equipment and data centers. Unfortunately, the very simplicity of its event
notification mechanisms, which makes DPWS fit for resource-constrained devices,
also makes it hard to scale to large infrastructures with more stringent
dependability requirements, which is, ironically, where self-configuration would
be most useful. In this report, we address this challenge with a proposal to integrate
gossip-based dissemination in DPWS, thus maintaining compatibility with
original assumptions of the specification, and avoiding a centralized
configuration server or custom black-box middleware components. In detail, we
show how our approach provides an evolutionary and non-intrusive solution to
the scalability limitations of DPWS and experimentally evaluate it with an
implementation based on the Web Services for Devices (WS4D) Java Multi
Edition DPWS Stack (JMEDS).
Comment: 28 pages, Technical Report
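As a rough illustration of why gossip-based dissemination scales without a central server, the following sketch simulates generic push gossip. It is not the DPWS integration described in the report: the function `gossip_rounds`, its parameters, and the uniform random peer selection are all assumptions made for the example.

```python
import random


def gossip_rounds(n_nodes: int, fanout: int, seed: int = 0,
                  max_rounds: int = 1000) -> int:
    """Return the number of rounds until a notification reaches all nodes.

    Node 0 starts with the notification. In each round, every informed
    node pushes it to `fanout` peers chosen uniformly at random.
    """
    rng = random.Random(seed)
    informed = {0}
    rounds = 0
    while len(informed) < n_nodes and rounds < max_rounds:
        rounds += 1
        newly_informed = set()
        for node in informed:
            peers = [p for p in range(n_nodes) if p != node]
            newly_informed.update(rng.sample(peers, min(fanout, len(peers))))
        informed |= newly_informed
    if len(informed) < n_nodes:
        raise RuntimeError("not all nodes informed within max_rounds")
    return rounds


rounds_needed = gossip_rounds(n_nodes=64, fanout=3)
```

Each node only ever contacts `fanout` peers per round, so the per-node load stays constant as the network grows, while the informed set can grow by up to a factor of `fanout + 1` per round.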
Strong Consistency for Shared Objects in Pervasive Grids
Recent advances in communication technology enable the emergence of a new generation of applications that integrate mobile devices with classical high-performance systems as part of a common computing environment. In such environments, keeping shared data (distributed objects, for example) coherent is a real challenge, as communications are strongly influenced by the performance and reliability of mobile devices (laptops, PDAs and cellular telephones) and wireless networks (WiFi, Bluetooth). Indeed, data incoherence may arise due to message losses or node volatility, which block the algorithms used to synchronize these data. In this paper, we analyze the main challenges concerning the manipulation of shared distributed objects in a pervasive environment. We demonstrate how a membership service can be enhanced to tolerate temporary disconnections and message losses without blocking, while reducing the number of exchanged messages.
UPC++ v1.0 Programmer’s Guide, Revision 2020.3.0
UPC++ is a C++11 library that provides Partitioned Global Address Space (PGAS) programming. It is designed for writing parallel programs that run efficiently and scale well on distributed-memory parallel computers. The PGAS model is single-program, multiple-data (SPMD), with each separate constituent process having access to local memory as it would in C++. However, PGAS also provides access to a global address space, which is allocated in shared segments that are distributed over the processes. UPC++ provides numerous methods for accessing and using global memory. In UPC++, all operations that access remote memory are explicit, which encourages programmers to be aware of the cost of communication and data movement. Moreover, all remote-memory access operations are asynchronous by default, to enable programmers to write code that scales well even on hundreds of thousands of cores.