Distributed Multi-writer Multi-reader Atomic Register with Optimistically Fast Read and Write
A distributed multi-writer multi-reader (MWMR) atomic register is an
important primitive that enables a wide range of distributed algorithms. Hence,
improving its performance can have large-scale consequences. Since the seminal
work on ABD emulation in message-passing networks [JACM '95], many
researchers have studied fast implementations of atomic registers under various
conditions. "Fast" means that a read or a write can be completed with 1
round-trip time (RTT), by contacting a simple majority. In this work, we
explore an atomic register with optimal resilience and "optimistically fast"
read and write operations. That is, both operations can be fast if there is no
concurrent write.
This paper has three contributions: (i) We present Gus, the emulation of an
MWMR atomic register with optimal resilience and optimistically fast reads and
writes when there are up to 5 nodes; (ii) We show that when there are more than 5
nodes, it is impossible to emulate an MWMR atomic register with both
properties; and (iii) We implement Gus in the framework of EPaxos and Gryff,
and show that Gus provides lower tail latency than state-of-the-art systems
such as EPaxos, Gryff, Giza, and Tempo under various workloads in the context
of geo-replicated object storage systems.
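As a rough illustration of the quorum arithmetic behind "simple majority" and "optimal resilience" in the abstract above, the sketch below assumes the usual crash-fault setting in which n nodes tolerate f crashes when n >= 2f + 1. The function names are hypothetical and are not taken from Gus; this is not the Gus protocol itself, only the counting argument that makes a one-round-trip majority contact meaningful.

```python
def majority_quorum(n: int) -> int:
    """Smallest number of nodes that forms a simple majority of n nodes."""
    return n // 2 + 1


def max_crash_faults(n: int) -> int:
    """Largest f such that n >= 2f + 1 (assumed meaning of optimal resilience)."""
    return (n - 1) // 2


def quorums_intersect(n: int) -> bool:
    """True if any two majority quorums of n nodes must share at least one node."""
    return 2 * majority_quorum(n) > n


if __name__ == "__main__":
    # Print the quorum size and tolerated crashes for a few cluster sizes.
    for n in (3, 5, 7):
        print(n, majority_quorum(n), max_crash_faults(n), quorums_intersect(n))
```

With n = 5, the largest size for which the abstract reports that both properties hold together, a majority quorum is 3 nodes and up to 2 crashes are tolerated under the stated assumption. Because any two majorities overlap, a read that hears from a majority is guaranteed to reach at least one node that took part in the most recent completed write.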
Fault-tolerant computing with unreliable channels
We study implementations of basic fault-tolerant primitives, such as
consensus and registers, in message-passing systems subject to process crashes
and a broad range of communication failures. Our results characterize the
necessary and sufficient conditions for implementing these primitives as a
function of the connectivity constraints and synchrony assumptions. Our main
contribution is a new algorithm for partially synchronous consensus that is
resilient to process crashes and channel failures and is optimal in its
connectivity requirements. In contrast to prior work, our algorithm assumes the
most general model of message loss where faulty channels are flaky, i.e., can
lose messages without any guarantee of fairness. This failure model is
particularly challenging for consensus algorithms, as it rules out standard
solutions based on leader oracles and failure detectors. To circumvent this
limitation, we construct our solution using a new variant of the recently
proposed view synchronizer abstraction, which we adapt to the crash-prone
setting with flaky channels.
Automated Validation of State-Based Client-Centric Isolation with TLA+
Clear consistency guarantees on data are paramount for the design and implementation of distributed systems. When implementing distributed applications, developers require approaches to verify the data consistency guarantees of an implementation choice. Crooks et al. define a state-based and client-centric model of database isolation. This paper formalizes this state-based model in TLA+, reproduces their examples, and shows how to model check runtime traces and algorithms with this formalization. The formalized model in TLA+ enables semi-automatic model checking for different implementation alternatives for transactional operations and allows checking of conformance to isolation levels. We reproduce examples of the original paper and confirm the isolation guarantees of the combination of the well-known 2-phase locking and 2-phase commit algorithms. Using model checking, this formalization can also help find bugs in incorrect specifications. This improves the feasibility of automated checking of isolation guarantees in synthesized synchronization implementations and provides an environment for experimenting with new designs.