275 research outputs found
Reconfigurable Lattice Agreement and Applications
Reconfiguration is one of the central mechanisms in distributed systems. Due to failures and connectivity disruptions, the very set of service replicas (or servers) and their roles in the computation may have to be reconfigured over time. To provide the desired level of consistency and availability to applications running on top of these servers, the clients of the service should be able to reach some form of agreement on the system configuration. We observe that this agreement is naturally captured via a lattice partial order on the system states. We propose an asynchronous implementation of reconfigurable lattice agreement that implies elegant reconfigurable versions of a large class of lattice abstract data types, such as max-registers and conflict detectors, as well as popular distributed programming abstractions, such as atomic snapshot and commit-adopt
The Lock-free -LSM Relaxed Priority Queue
Priority queues are data structures which store keys in an ordered fashion to
allow efficient access to the minimal (maximal) key. Priority queues are
essential for many applications, e.g., Dijkstra's single-source shortest path
algorithm, branch-and-bound algorithms, and prioritized schedulers.
Efficient multiprocessor computing requires implementations of basic data
structures that can be used concurrently and scale to large numbers of threads
and cores. Lock-free data structures promise superior scalability by avoiding
blocking synchronization primitives, but the \emph{delete-min} operation is an
inherent scalability bottleneck in concurrent priority queues. Recent work has
focused on alleviating this obstacle either by batching operations, or by
relaxing the requirements to the \emph{delete-min} operation.
We present a new, lock-free priority queue that relaxes the \emph{delete-min}
operation so that it is allowed to delete \emph{any} of the smallest
keys, where is a runtime configurable parameter. Additionally, the
behavior is identical to a non-relaxed priority queue for items added and
removed by the same thread. The priority queue is built from a logarithmic
number of sorted arrays in a way similar to log-structured merge-trees. We
experimentally compare our priority queue to recent state-of-the-art lock-free
priority queues, both with relaxed and non-relaxed semantics, showing high
performance and good scalability of our approach.Comment: Short version as ACM PPoPP'15 poste
Hoare-style Specifications as Correctness Conditions for Non-linearizable Concurrent Objects
Designing scalable concurrent objects, which can be efficiently used on
multicore processors, often requires one to abandon standard specification
techniques, such as linearizability, in favor of more relaxed consistency
requirements. However, the variety of alternative correctness conditions makes
it difficult to choose which one to employ in a particular case, and to compose
them when using objects whose behaviors are specified via different criteria.
The lack of syntactic verification methods for most of these criteria poses
challenges in their systematic adoption and application.
In this paper, we argue for using Hoare-style program logics as an
alternative and uniform approach for specification and compositional formal
verification of safety properties for concurrent objects and their client
programs. Through a series of case studies, we demonstrate how an existing
program logic for concurrency can be employed off-the-shelf to capture
important state and history invariants, allowing one to explicitly quantify
over interference of environment threads and provide intuitive and expressive
Hoare-style specifications for several non-linearizable concurrent objects that
were previously specified only via dedicated correctness criteria. We
illustrate the adequacy of our specifications by verifying a number of
concurrent client scenarios, that make use of the previously specified
concurrent objects, capturing the essence of such correctness conditions as
concurrency-aware linearizability, quiescent, and quantitative quiescent
consistency. All examples described in this paper are verified mechanically in
Coq.Comment: 18 page
The Power of Choice in Priority Scheduling
Consider the following random process: we are given queues, into which
elements of increasing labels are inserted uniformly at random. To remove an
element, we pick two queues at random, and remove the element of lower label
(higher priority) among the two. The cost of a removal is the rank of the label
removed, among labels still present in any of the queues, that is, the distance
from the optimal choice at each step. Variants of this strategy are prevalent
in state-of-the-art concurrent priority queue implementations. Nonetheless, it
is not known whether such implementations provide any rank guarantees, even in
a sequential model.
We answer this question, showing that this strategy provides surprisingly
strong guarantees: Although the single-choice process, where we always insert
and remove from a single randomly chosen queue, has degrading cost, going to
infinity as we increase the number of steps, in the two choice process, the
expected rank of a removed element is while the expected worst-case
cost is . These bounds are tight, and hold irrespective of the
number of steps for which we run the process.
The argument is based on a new technical connection between "heavily loaded"
balls-into-bins processes and priority scheduling.
Our analytic results inspire a new concurrent priority queue implementation,
which improves upon the state of the art in terms of practical performance
The Fence Complexity of Persistent Sets
We study the psync complexity of concurrent sets in the non-volatile shared
memory model. Flush instructions are used in non-volatile memory to force
shared state to be written back to non-volatile memory and must typically be
accompanied by the use of expensive fence instructions to enforce ordering
among such flushes. Collectively we refer to a flush and a fence as a psync.
The safety property of strict linearizability forces crashed operations to take
effect before the crash or not take effect at all; the weaker property of
durable linearizability enforces this requirement only for operations that have
completed prior to the crash event. We consider lock-free implementations of
list-based sets and prove two lower bounds. We prove that for any durable
linearizable lock-free set there must exist an execution where some process
must perform at least one redundant psync as part of an update operation. We
introduce an extension to strict linearizability specialized for persistent
sets that we call strict limited effect (SLE) linearizability. SLE
linearizability explicitly ensures that operations do not take effect after a
crash which better reflects the original intentions of strict linearizability.
We show that it is impossible to implement SLE linearizable lock-free sets in
which read-only (or search) operations do not flush or fence. We undertake an
empirical study of persistent sets that examines various algorithmic design
techniques and the impact of flush instructions in practice. We present
concurrent set algorithms that provide matching upper bounds and rigorously
evaluate them against existing persistent sets to expose the impact of
algorithmic design and safety properties on psync complexity in practice as
well as the cost of recovering the data structure following a system crash
Fast and Space-Efficient Queues via Relaxation
Efficient message-passing implementations of shared data types are a vital component of practical distributed systems, enabling them to work on shared data in predictable ways, but there is a long history of results showing that many of the most useful types of access to shared data are necessarily slow. A variety of approaches attempt to circumvent these bounds, notably weakening consistency guarantees and relaxing the sequential specification of the provided data type. These trade behavioral guarantees for performance. We focus on relaxing the sequential specification of a first-in, first-out queue type, which has been shown to allow faster linearizable implementations than are possible for traditional FIFO queues without relaxation.
The algorithms which showed these improvements in operation time tracked a complete execution history, storing complete object state at all n processes in the system, leading to n copies of every stored data element. In this paper, we consider the question of reducing the space complexity of linearizable implementations of shared data types, which provide intuitive behavior through strong consistency guarantees. We improve the existing algorithm for a relaxed queue, showing that it is possible to store only one copy of each element in a shared queue, while still having a low amortized time cost. This is one of several important steps towards making these data types practical in real world systems
Lock-free Concurrent Data Structures
Concurrent data structures are the data sharing side of parallel programming.
Data structures give the means to the program to store data, but also provide
operations to the program to access and manipulate these data. These operations
are implemented through algorithms that have to be efficient. In the sequential
setting, data structures are crucially important for the performance of the
respective computation. In the parallel programming setting, their importance
becomes more crucial because of the increased use of data and resource sharing
for utilizing parallelism.
The first and main goal of this chapter is to provide a sufficient background
and intuition to help the interested reader to navigate in the complex research
area of lock-free data structures. The second goal is to offer the programmer
familiarity to the subject that will allow her to use truly concurrent methods.Comment: To appear in "Programming Multi-core and Many-core Computing
Systems", eds. S. Pllana and F. Xhafa, Wiley Series on Parallel and
Distributed Computin
- …