18,337 research outputs found

Toward Linearizability Testing for Multi-Word Persistent Synchronization Primitives

Author: Cepeda Diego
Chowdhury Sakib
Golab Wojciech
Li Nan
Lopez Raphael
Wang Xinzhe
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 23rd International Conference on Principles of Distributed Systems (OPODIS 2019)
Publication date: 01/01/2020

Persistent memory makes it possible to recover in-memory data structures following a failure instead of rebuilding them from state saved in slow secondary storage. Implementing such recoverable data structures correctly is challenging as their underlying algorithms must deal with both parallelism and failures, which makes them especially susceptible to programming errors. Traditional proofs of correctness should therefore be combined with other methods, such as model checking or software testing, to minimize the likelihood of uncaught defects. This research focuses specifically on the algorithmic principles of software testing, particularly linearizability analysis, for multi-word persistent synchronization primitives such as conditional swap operations. We describe an efficient decision procedure for linearizability in this context, and discuss its practical applications in detecting previously unknown bugs in implementations of multi-word persistent primitives.
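
To make the notion of linearizability analysis concrete, here is a minimal sketch of a brute-force check for a recorded history of compare-and-swap calls. It is not the efficient decision procedure the abstract refers to: it assumes a single-word register, ignores crashes and recovery, and tries every ordering, so it is only usable on tiny histories. The names `CasOp`, `respects_real_time`, and `is_linearizable` are hypothetical and chosen for illustration.

```python
from itertools import permutations
from typing import NamedTuple


class CasOp(NamedTuple):
    """One recorded compare-and-swap call (hypothetical record layout)."""
    invoke: int    # logical time at which the call started
    response: int  # logical time at which the call returned
    expected: int  # value the caller expected to find
    new: int       # value the caller tried to install
    ok: bool       # result the caller actually observed


def respects_real_time(order):
    """Reject orderings that place an operation after one it strictly precedes."""
    for i, a in enumerate(order):
        for b in order[i + 1:]:
            # b returned before a was even invoked, so b must come first.
            if b.response < a.invoke:
                return False
    return True


def is_linearizable(history, initial):
    """Naive check: search for a legal sequential order (exponential time)."""
    for order in permutations(history):
        if not respects_real_time(order):
            continue
        value = initial
        legal = True
        for op in order:
            succeeded = value == op.expected
            if succeeded != op.ok:
                legal = False  # sequential replay disagrees with the record
                break
            if succeeded:
                value = op.new
        if legal:
            return True
    return False
```

For example, two overlapping calls that both try to change 0 to 1 form a linearizable history if exactly one of them reports success, and a non-linearizable one if both report success.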

Dagstuhl Research Online Publication Server

Lock-free Concurrent Data Structures

Author: Cederman Daniel
Gidenstam Anders
Ha Phuong
Papatriantafilou Marina
Sundell Håkan
Tsigas Philippas
Publication date: 01/01/2013

Concurrent data structures are the data sharing side of parallel programming. Data structures give the means to the program to store data, but also provide operations to the program to access and manipulate these data. These operations are implemented through algorithms that have to be efficient. In the sequential setting, data structures are crucially important for the performance of the respective computation. In the parallel programming setting, their importance becomes more crucial because of the increased use of data and resource sharing for utilizing parallelism. The first and main goal of this chapter is to provide a sufficient background and intuition to help the interested reader to navigate in the complex research area of lock-free data structures. The second goal is to offer the programmer familiarity with the subject that will allow her to use truly concurrent methods. Comment: To appear in "Programming Multi-core and Many-core Computing Systems", eds. S. Pllana and F. Xhafa, Wiley Series on Parallel and Distributed Computing

arXiv.org e-Print Archive

Chalmers Research

Lock-Free and Practical Deques using Single-Word Compare-And-Swap

Author: A. Silberschatz
M. Greenwald
M. Herlihy
M.M. Michael
Publication date: 01/01/2004

We present an efficient and practical lock-free implementation of a concurrent deque that is disjoint-parallel accessible and uses atomic primitives which are available in modern computer systems. Previously known lock-free algorithms of deques are either based on non-available atomic synchronization primitives, only implement a subset of the functionality, or are not designed for disjoint accesses. Our algorithm is based on a doubly linked list, and only requires single-word compare-and-swap atomic primitives, even for dynamic memory sizes. We have performed an empirical study using full implementations of the most efficient algorithms of lock-free deques known. For systems with low concurrency, the algorithm by Michael shows the best performance. However, as our algorithm is designed for disjoint accesses, it performs significantly better on systems with high concurrency and non-uniform memory architecture.

arXiv.org e-Print Archive

Crossref

Chalmers Research

Chalmers Publication Library

Efficient Lock-free Binary Search Trees

Author: Chatterjee Bapi
Nguyen Nhan
Tsigas Philippas
Publication date: 01/01/2014

In this paper we present a novel algorithm for concurrent lock-free internal binary search trees (BST) and implement a Set abstract data type (ADT) based on that. We show that in the presented lock-free BST algorithm the amortized step complexity of each set operation (Add, Remove and Contains) is $O(H(n) + c)$, where $H(n)$ is the height of the BST with $n$ nodes and $c$ is the contention during the execution. Our algorithm adapts to contention measures according to read-write load. If the situation is read-heavy, the operations avoid helping pending concurrent Remove operations during traversal, and adapt to interval contention. However, for write-heavy situations we let an operation help pending Remove, even though it is not obstructed, and so adapt to tighter point contention. It uses single-word compare-and-swap (CAS) operations. We show that our algorithm has improved disjoint-access-parallelism compared to similar existing algorithms. We prove that the presented algorithm is linearizable. To the best of our knowledge this is the first algorithm for any concurrent tree data structure in which the modify operations are performed with an additive term of contention measure. Comment: 15 pages, 3 figures, submitted to POD

arXiv.org e-Print Archive

Crossref

Chalmers Research

Boosting Multi-Core Reachability Performance with Shared Hash Tables

Author: Laarman Alfons
van de Pol Jaco
Weber Michael
Publication date: 01/01/2010

This paper focuses on data structures for multi-core reachability, which is a key component in model checking algorithms and other verification methods. A cornerstone of an efficient solution is the storage of visited states. In related work, static partitioning of the state space was combined with thread-local storage and resulted in reasonable speedups, but left open whether improvements are possible. In this paper, we present a scaling solution for shared state storage which is based on a lockless hash table implementation. The solution is specifically designed for the cache architecture of modern CPUs. Because model checking algorithms impose loose requirements on the hash table operations, their design can be streamlined substantially compared to related work on lockless hash tables. Still, an implementation of the hash table presented here has dozens of sensitive performance parameters (bucket size, cache line size, data layout, probing sequence, etc.). We analyzed their impact and compared the resulting speedups with related tools. Our implementation outperforms two state-of-the-art multi-core model checkers (SPIN and DiVinE) by a substantial margin, while placing fewer constraints on the load balancing and search algorithms. Comment: preliminary report

arXiv.org e-Print Archive

CiteSeerX

University of Twente Research Information

Efficient Multi-Word Compare and Swap

Author: Guerraoui Rachid
Kogan Alex
Marathe Virendra J.
Zablotchi Igor
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 34th International Symposium on Distributed Computing (DISC 2020)
Publication date: 01/01/2020

Atomic lock-free multi-word compare-and-swap (MCAS) is a powerful tool for designing concurrent algorithms. Yet, its widespread usage has been limited because lock-free implementations of MCAS make heavy use of expensive compare-and-swap (CAS) instructions. Existing MCAS implementations indeed use at least 2k+1 CASes per k-CAS. This leads to the natural desire to minimize the number of CASes required to implement MCAS. We first prove in this paper that it is impossible to "pack" the information required to perform a k-word CAS (k-CAS) in fewer than k locations to be CASed. Then we present the first algorithm that requires k+1 CASes per call to k-CAS in the common uncontended case. We implement our algorithm and show that it outperforms a state-of-the-art baseline in a variety of benchmarks in most considered workloads. We also present a durably linearizable (persistent memory friendly) version of our MCAS algorithm using only 2 persistence fences per call, while still only requiring k+1 CASes per k-CAS.
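
As a reading aid, the behavior a k-CAS must appear to perform atomically can be written as a sequential specification. The sketch below is only that specification, not a lock-free implementation and not the algorithm from the paper; `kcas_spec` and `cas_counts` are hypothetical names, and the counts in `cas_counts` simply restate the 2k+1 and k+1 figures quoted in the abstract.

```python
def kcas_spec(memory, entries):
    """Sequential specification of k-CAS (illustrative only).

    memory  -- mutable list standing in for shared words
    entries -- list of (address, expected, new) triples, one per word

    All words are updated only if every word holds its expected value;
    otherwise nothing changes. A real MCAS must make this appear atomic
    to concurrent threads, which this sequential sketch does not attempt.
    """
    if any(memory[addr] != expected for addr, expected, _ in entries):
        return False
    for addr, _, new in entries:
        memory[addr] = new
    return True


def cas_counts(k):
    """CAS instructions per k-CAS, using the figures from the abstract."""
    return {
        "existing_at_least": 2 * k + 1,  # lower bound quoted for prior work
        "uncontended_case": k + 1,       # claimed for the new algorithm
    }
```

With these figures, a 4-word CAS would need at least 9 single-word CASes under the quoted bound for existing implementations, against 5 in the uncontended case of the algorithm described here.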

Infoscience - École polytechnique fédérale de Lausanne

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server