1,242 research outputs found
Scalable Range Locks for Scalable Address Spaces and Beyond
Range locks are a synchronization construct designed to provide concurrent
access to multiple threads (or processes) to disjoint parts of a shared
resource. Originally conceived in the file system context, range locks are
gaining increasing interest in the Linux kernel community seeking to alleviate
bottlenecks in the virtual memory management subsystem. The existing
implementation of range locks in the kernel, however, uses an internal spin
lock to protect the underlying tree structure that keeps track of acquired and
requested ranges. This spin lock becomes a point of contention on its own when
the range lock is frequently acquired. Furthermore, where and exactly how
specific (refined) ranges can be locked remains an open question.
In this paper, we make two independent, but related contributions. First, we
propose an alternative approach for building range locks based on linked lists.
The lists are easy to maintain in a lock-less fashion, and in fact, our range
locks do not use any internal locks in the common case. Second, we show how the
range of the lock can be refined in the mprotect operation through a
speculative mechanism. This refinement, in turn, allows concurrent execution of
mprotect operations on non-overlapping memory regions. We implement our new
algorithms and demonstrate their effectiveness in user-space and kernel-space,
achieving up to 9 speedup compared to the stock version of the Linux
kernel. Beyond the virtual memory management subsystem, we discuss other
applications of range locks in parallel software. As a concrete example, we
show how range locks can be used to facilitate the design of scalable
concurrent data structures, such as skip lists.Comment: 17 pages, 9 figures, Eurosys 202
Scalable, Highly Performant, Reader/Writer Lock Using Restartable Sequences
Traditional shared locks for synchronizing mostly read-only data are expensive and scale poorly with the number of threads or processor cores. This disclosure describes a scalable, low-cost, low-contention, shared lock (a mutex implementation) that behaves well under cases of reasonable contention and under temporarily write-dominated scenarios. The described shared lock, also known as counting mutex, is based on restartable sequences and fast fences
Towards fair, scalable, locking
Without care, Hardware Transactional Memory presents several
performance pathologies that can degrade its performance. Among
them, writers of commonly read variables can suffer from starvation. Though different solutions have been proposed for HTM systems, hybrid systems can still suffer from this performance problem, given that software transactions don’t interact with the mechanisms used by hardware to avoid starvation.
In this paper we introduce a new per-directory-line hardware contention management mechanism that allows fairer access between both software and hardware threads without the need to abort any transaction. Our mechanism is based on “reserving” directory lines, implementing a limited fair queue for the requests on that line. We adapt the mechanism to the LogTM conflict detection mechanism and show that the resulting proposal is deadlock free. Finally, we sketch how the idea could be applied more generally to reader-writer locks.Postprint (published version
A Comparison of Relativistic and Reader-Writer Locking Approaches to Shared Data Access
This paper explores the relationship between reader-writer locking and relativistic programming approaches to managing accesses to shared data. It demonstrates that by placing certain restrictions on writers, relativistic programming allows more concurrency than reader-writer locking while still providing the same isolation guarantees. Relativistic programming also allows for a straightforward model for reasoning about the correctness of programs that allow concurrent read-write accesses
Scalable Synchronization with Mindicators
The Mindicator is a shared object that stores one value for each thread in a system, and can return the minimum of all thread’s values in constant time. In this paper, we explore applications of the Mindicator in synchronization algorithms. We introduce three new algorithms, designed for scalable Read-Copy-Update (RCU), fair Readers-Writer locking, and Group Mutual Exclusion. Experimental evaluation shows these algorithms to perform well while avoiding contention
Resizable, Scalable, Concurrent Hash Tables
We present algorithms for shrinking and expanding a hash table while allowing concurrent, wait-free, linearly scalable lookups. These resize algorithms allow the hash table to maintain constant-time performance as the number of entries grows, and reclaim memory as the number of entries decreases, without delaying or disrupting readers.
We implemented our algorithms in the Linux kernel, to test their performance and scalability. Benchmarks show lookup scalability improved 125x over readerwriter locking, and 56% over the current state-of-the-art for Linux, with no performance degradation for lookups during a resize.
To achieve this performance, this hash table implementation uses a new concurrent programming methodology known as relativistic programming. In particular, we make use of an existing synchronization primitive which waits for all current readers to finish, with little to no reader overhead; careful use of this primitive allows ordering of updates without read-side synchronization or memory barriers
- …