2,988 research outputs found
Incremental Consistency Guarantees for Replicated Objects
Programming with replicated objects is difficult. Developers must face the
fundamental trade-off between consistency and performance head on, while
struggling with the complexity of distributed storage stacks. We introduce
Correctables, a novel abstraction that hides most of this complexity, allowing
developers to focus on the task of balancing consistency and performance. To
aid developers with this task, Correctables provide incremental consistency
guarantees, which capture successive refinements on the result of an ongoing
operation on a replicated object. In short, applications receive both a
preliminary---fast, possibly inconsistent---result, as well as a
final---consistent---result that arrives later.
We show how to leverage incremental consistency guarantees by speculating on
preliminary values, trading throughput and bandwidth for improved latency. We
experiment with two popular storage systems (Cassandra and ZooKeeper) and three
applications: a Twissandra-based microblogging service, an ad serving system,
and a ticket selling system. Our evaluation on the Amazon EC2 platform with
YCSB workloads A, B, and C shows that we can reduce the latency of strongly
consistent operations by up to 40% (from 100ms to 60ms) at little cost (10%
bandwidth increase, 6% throughput drop) in the ad system. Even if the
preliminary result is frequently inconsistent (25% of accesses), incremental
consistency incurs a bandwidth overhead of only 27%.Comment: 16 total pages, 12 figures. OSDI'16 (to appear
Causal Consistency: Beyond Memory
In distributed systems where strong consistency is costly when not
impossible, causal consistency provides a valuable abstraction to represent
program executions as partial orders. In addition to the sequential program
order of each computing entity, causal order also contains the semantic links
between the events that affect the shared objects -- messages emission and
reception in a communication channel , reads and writes on a shared register.
Usual approaches based on semantic links are very difficult to adapt to other
data types such as queues or counters because they require a specific analysis
of causal dependencies for each data type. This paper presents a new approach
to define causal consistency for any abstract data type based on sequential
specifications. It explores, formalizes and studies the differences between
three variations of causal consistency and highlights them in the light of
PRAM, eventual consistency and sequential consistency: weak causal consistency,
that captures the notion of causality preservation when focusing on convergence
; causal convergence that mixes weak causal consistency and convergence; and
causal consistency, that coincides with causal memory when applied to shared
memory.Comment: 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel
Programming, Mar 2016, Barcelone, Spai
Exploring heterogeneity of unreliable machines for p2p backup
P2P architecture is a viable option for enterprise backup. In contrast to
dedicated backup servers, nowadays a standard solution, making backups directly
on organization's workstations should be cheaper (as existing hardware is
used), more efficient (as there is no single bottleneck server) and more
reliable (as the machines are geographically dispersed).
We present the architecture of a p2p backup system that uses pairwise
replication contracts between a data owner and a replicator. In contrast to
standard p2p storage systems using directly a DHT, the contracts allow our
system to optimize replicas' placement depending on a specific optimization
strategy, and so to take advantage of the heterogeneity of the machines and the
network. Such optimization is particularly appealing in the context of backup:
replicas can be geographically dispersed, the load sent over the network can be
minimized, or the optimization goal can be to minimize the backup/restore time.
However, managing the contracts, keeping them consistent and adjusting them in
response to dynamically changing environment is challenging.
We built a scientific prototype and ran the experiments on 150 workstations
in the university's computer laboratories and, separately, on 50 PlanetLab
nodes. We found out that the main factor affecting the quality of the system is
the availability of the machines. Yet, our main conclusion is that it is
possible to build an efficient and reliable backup system on highly unreliable
machines (our computers had just 13% average availability)
Optimistic replication
Data replication is a key technology in distributed data sharing systems, enabling higher availability and performance. This paper surveys optimistic replication algorithms that allow replica contents to diverge in the short term, in order to support concurrent work practices and to tolerate failures in low-quality communication links. The importance of such techniques is increasing as collaboration through wide-area and mobile networks becomes popular. Optimistic replication techniques are different from traditional “pessimistic ” ones. Instead of synchronous replica coordination, an optimistic algorithm propagates changes in the background, discovers conflicts after they happen and reaches agreement on the final contents incrementally. We explore the solution space for optimistic replication algorithms. This paper identifies key challenges facing optimistic replication systems — ordering operations, detecting and resolving conflicts, propagating changes efficiently, and bounding replica divergence — and provides a comprehensive survey of techniques developed for addressing these challenges
Tunable Causal Consistency: Specification and Implementation
To achieve high availability and low latency, distributed data stores often
geographically replicate data at multiple sites called replicas. However, this
introduces the data consistency problem. Due to the fundamental tradeoffs among
consistency, availability, and latency in the presence of network partition, no
a one-size-fits-all consistency model exists. To meet the needs of different
applications, many popular data stores provide tunable consistency, allowing
clients to specify the consistency level per individual operation. In this
paper, we propose tunable causal consistency (TCC). It allows clients to choose
the desired session guarantee for each operation, from the well-known four
session guarantees, i.e., read your writes, monotonic reads, monotonic writes,
and writes follow reads. Specifically, we first propose a formal specification
of TCC in an extended (vis,ar) framework originally proposed by Burckhardt et
al. Then we design a TCC protocol and develop a prototype distributed key-value
store called TCCSTORE. We evaluate TCCSTORE on Aliyun. The latency is less than
38ms for all workloads and the throughput is up to about 2800 operations per
second. We also show that TCC achieves better performance than causal
consistency and requires a negligible overhead when compared with eventual
consistency
Functional programming abstractions for weakly consistent systems
In recent years, there has been a wide-spread adoption of both multicore and cloud computing. Traditionally, concurrent programmers have relied on the underlying system providing strong memory consistency, where there is a semblance of concurrent tasks operating over a shared global address space. However, providing scalable strong consistency guarantees as the scale of the system grows is an increasingly difficult endeavor. In a multicore setting, the increasing complexity and the lack of scalability of hardware mechanisms such as cache coherence deters scalable strong consistency. In geo-distributed compute clouds, the availability concerns in the presence of partial failures prohibit strong consistency. Hence, modern multicore and cloud computing platforms eschew strong consistency in favor of weakly consistent memory, where each task\u27s memory view is incomparable with the other tasks. As a result, programmers on these platforms must tackle the full complexity of concurrent programming for an asynchronous distributed system. ^ This dissertation argues that functional programming language abstractions can simplify scalable concurrent programming for weakly consistent systems. Functional programming espouses mutation-free programming, and rare mutations when present are explicit in their types. By controlling and explicitly reasoning about shared state mutations, functional abstractions simplify concurrent programming. Building upon this intuition, this dissertation presents three major contributions, each focused on addressing a particular challenge associated with weakly consistent loosely coupled systems. First, it describes A NERIS, a concurrent functional programming language and runtime for the Intel Single-chip Cloud Computer, and shows how to provide an efficient cache coherent virtual address space on top of a non cache coherent multicore architecture. Next, it describes RxCML, a distributed extension of MULTIMLTON and shows that, with the help of speculative execution, synchronous communication can be utilized as an efficient abstraction for programming asynchronous distributed systems. Finally, it presents QUELEA, a programming system for eventually consistent distributed stores, and shows that the choice of correct consistency level for replicated data type operations and transactions can be automated with the help of high-level declarative contracts
- …