Distributed Multi-writer Multi-reader Atomic Register with Optimistically Fast Read and Write
A distributed multi-writer multi-reader (MWMR) atomic register is an
important primitive that enables a wide range of distributed algorithms. Hence,
improving its performance can have large-scale consequences. Since the seminal
work on the ABD emulation in message-passing networks [JACM '95], many
researchers have studied fast implementations of atomic registers under various
conditions. "Fast" means that a read or a write can be completed in 1
round-trip time (RTT), by contacting a simple majority. In this work, we
explore an atomic register with optimal resilience and "optimistically fast"
read and write operations. That is, both operations can be fast if there is no
concurrent write.
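The quorum arithmetic behind "simple majority" can be sketched as follows. This is a minimal illustration, assuming that "optimal resilience" means tolerating the crash of any minority of nodes (the usual reading, which the abstract does not spell out); the function names are hypothetical and are not taken from Gus.

```python
def majority(n: int) -> int:
    """Smallest number of nodes that forms a simple majority of n."""
    return n // 2 + 1


def max_crash_faults(n: int) -> int:
    """Largest number of crashed nodes that still leaves a majority alive.

    Assumption: this is what "optimal resilience" refers to.
    """
    return (n - 1) // 2


def majorities_intersect(n: int) -> bool:
    """Any two majorities of n nodes share at least one node."""
    return 2 * majority(n) > n


if __name__ == "__main__":
    for n in (3, 5, 7):
        print(n, majority(n), max_crash_faults(n), majorities_intersect(n))
```

With 5 nodes, a fast operation in this sense waits for 3 replies in a single round trip and stays available with up to 2 crashed nodes.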
This paper has three contributions: (i) We present Gus, the emulation of an
MWMR atomic register with optimal resilience and optimistically fast reads and
writes when there are up to 5 nodes; (ii) We show that when there are > 5
nodes, it is impossible to emulate an MWMR atomic register with both
properties; and (iii) We implement Gus in the framework of EPaxos and Gryff,
and show that Gus provides lower tail latency than state-of-the-art systems
such as EPaxos, Gryff, Giza, and Tempo under various workloads in the context
of geo-replicated object storage systems.
ARES: Adaptive, Reconfigurable, Erasure coded, atomic Storage
Atomicity or strong consistency is one of the fundamental, most intuitive,
and hardest to provide primitives in distributed shared memory emulations. To
ensure survivability, scalability, and availability of a storage service in the
presence of failures, traditional approaches for atomic memory emulation, in
message passing environments, replicate the objects across multiple servers.
Compared to replication-based algorithms, erasure code-based atomic memory
algorithms have much lower storage and communication costs, but they are
usually harder to design. The difficulty of designing atomic memory algorithms
grows further when the set of servers may change to ensure survivability
of the service over software and hardware upgrades, while avoiding service
interruptions. Atomic memory algorithms that perform server reconfiguration
in replicated systems are few and complex, and remain part of an
active area of research; reconfigurations of erasure-code based algorithms are
non-existent.
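The storage saving of erasure coding over replication can be illustrated with a short calculation. This is a rough sketch, assuming an [n, k] MDS code (for example Reed-Solomon) in which each server stores one fragment of about 1/k of the value; it ignores metadata and any extra versions a concurrent algorithm may keep, and the function names are hypothetical rather than taken from ARES.

```python
import math


def replication_storage(n_servers: int, value_bytes: int) -> int:
    """Total bytes stored when every server keeps a full copy."""
    return n_servers * value_bytes


def erasure_coded_storage(n_servers: int, k: int, value_bytes: int) -> int:
    """Total bytes stored when each server keeps one coded fragment.

    Assumption: an [n, k] MDS code, so each fragment is ceil(size / k)
    bytes and any k fragments suffice to rebuild the value.
    """
    if not 1 <= k <= n_servers:
        raise ValueError("k must satisfy 1 <= k <= n_servers")
    fragment_bytes = math.ceil(value_bytes / k)
    return n_servers * fragment_bytes


if __name__ == "__main__":
    print(replication_storage(5, 1200))       # full copies on 5 servers
    print(erasure_coded_storage(5, 3, 1200))  # [5, 3] code on 5 servers
```

Under these assumptions, a 1200-byte object on 5 servers costs 6000 bytes with replication and 2000 bytes with a [5, 3] code.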
In this work, we present ARES, an algorithmic framework that allows
reconfiguration of the underlying servers, and is particularly suitable for
erasure-code based algorithms emulating atomic objects. ARES introduces new
configurations while keeping the service available. For use with ARES, we also
propose TREAS, a new and, to our knowledge, the first two-round erasure-code based
algorithm for emulating multi-writer, multi-reader (MWMR) atomic objects
in asynchronous, message-passing environments, with near-optimal communication
and storage costs. Our algorithms tolerate crash failures of any client and
some fraction of servers, and still guarantee safety and liveness properties.
Moreover, by bringing together the advantages of ARES and TREAS, we propose an
optimized algorithm where new configurations can be installed without the
object values passing through the reconfiguration clients.
Fast Access to Distributed Atomic Memory
We study efficient and robust implementations of an atomic read-write data structure over an asynchronous distributed message-passing system made of reader and writer processes, as well as a number of servers implementing the data structure. We determine the exact conditions under which every read and write involves one round of communication with the servers. These conditions relate the number of readers to the tolerated number of faulty servers and the nature of these failures.
On the Efficiency of Atomic Multi-reader, Multi-writer Distributed Memory
This paper considers quorum-replicated, multi-writer, multi-reader (MWMR) implementations of survivable atomic registers in a distributed message-passing system with processors prone to failures. Previous implementations in such settings invariably required two rounds of communication between readers/writers and replica owners. Hence the question arises whether it is possible to have single-round read and/or write operations in this setting. As a first step, we present an algorithm, called CWFR, that allows the classic two-round write operations, while supporting single-round read operations. Since multiple write operations may be concurrent with a read operation, this algorithm involves an iterative (local) discovery of the latest completed write operation. This algorithm precipitates the question of whether fast (single-round) writes may co-exist with fast reads. We thus devise a second algorithm, called SFW, that exploits a new technique called server side ordering (SSO), which, unlike previous approaches, places partial responsibility for the ordering of write operations on the replica owners (the servers). With SSO, fast write operations are introduced for the very first time in the MWMR setting. While this is possible, we show that under certain conditions the MWMR model imposes inherent limitations on any quorum-based fast write implementation of a safe read/write register and potentiall
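The two-round pattern that the abstract calls classic can be sketched as a single-process simulation in the style of ABD-like quorum algorithms. This is not CWFR or SFW; it is a minimal illustration, assuming majority quorums and tags of the form (timestamp, writer_id), and every name in it is hypothetical.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple

Tag = Tuple[int, int]  # (timestamp, writer_id), compared lexicographically


@dataclass
class Replica:
    tag: Tag = (0, 0)
    value: Any = None

    def query(self) -> Tuple[Tag, Any]:
        return self.tag, self.value

    def store(self, tag: Tag, value: Any) -> None:
        # Keep only the value carrying the highest tag seen so far.
        if tag > self.tag:
            self.tag, self.value = tag, value


def check_quorum(replicas: List[Replica], quorum: List[int]) -> None:
    if len(set(quorum)) < len(replicas) // 2 + 1:
        raise ValueError("quorum must be a majority of the replicas")


def write(replicas: List[Replica], quorum: List[int],
          writer_id: int, value: Any) -> Tag:
    check_quorum(replicas, quorum)
    # Round 1: learn the highest timestamp held by a majority.
    max_ts = max(replicas[i].query()[0][0] for i in quorum)
    tag = (max_ts + 1, writer_id)
    # Round 2: propagate the new tagged value to a majority.
    for i in quorum:
        replicas[i].store(tag, value)
    return tag


def read(replicas: List[Replica], quorum: List[int]) -> Any:
    check_quorum(replicas, quorum)
    # Round 1: collect tagged values and pick the one with the highest tag.
    tag, value = max((replicas[i].query() for i in quorum),
                     key=lambda tv: tv[0])
    # Round 2: write it back so that later reads cannot return older values.
    for i in quorum:
        replicas[i].store(tag, value)
    return value
```

A single-round ("fast") operation would have to drop one of the two rounds in `read` or `write`, which is the question the abstract raises.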