    Efficient Lock-free Binary Search Trees

    In this paper we present a novel algorithm for concurrent lock-free internal binary search trees (BST) and implement a Set abstract data type (ADT) based on it. We show that in the presented lock-free BST algorithm the amortized step complexity of each set operation ({\sc Add}, {\sc Remove} and {\sc Contains}) is $O(H(n) + c)$, where $H(n)$ is the height of a BST with $n$ nodes and $c$ is the contention during the execution. Our algorithm adapts to different contention measures according to the read-write load. If the workload is read-heavy, operations avoid helping pending concurrent {\sc Remove} operations during traversal and adapt to interval contention. For write-heavy workloads, however, we let an operation help a pending {\sc Remove} even though it is not obstructed, and so adapt to the tighter point contention. The algorithm uses single-word compare-and-swap (\texttt{CAS}) operations. We show that our algorithm has improved disjoint-access-parallelism compared to similar existing algorithms. We prove that the presented algorithm is linearizable. To the best of our knowledge this is the first algorithm for any concurrent tree data structure in which the modify operations are performed with an additive term of contention measure. Comment: 15 pages, 3 figures, submitted to POD
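
    The CAS retry pattern that lock-free trees build on can be sketched as follows. This is not the algorithm of the paper: it is an insert-only internal BST with no {\sc Remove} and no helping, and the names AtomicRef, Node and InsertOnlyBST are hypothetical. Python has no hardware CAS, so AtomicRef emulates a single-word CAS cell with a small lock purely for illustration.

```python
import threading


class AtomicRef:
    """Stand-in for a single-word CAS cell (assumption: emulated with a lock)."""

    def __init__(self, value=None):
        self._value = value
        self._lock = threading.Lock()

    def get(self):
        return self._value

    def compare_and_set(self, expected, new):
        # Atomically replace the value only if it is still `expected`.
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False


class Node:
    def __init__(self, key):
        self.key = key
        self.left = AtomicRef()
        self.right = AtomicRef()


class InsertOnlyBST:
    """Internal BST supporting add/contains only (illustrative sketch)."""

    def __init__(self):
        self.root = AtomicRef()

    def add(self, key):
        new = Node(key)
        link = self.root
        while True:
            cur = link.get()
            if cur is None:
                # Try to install the new node on the empty link.
                if link.compare_and_set(None, new):
                    return True
                continue  # CAS lost a race: re-read the same link and go on
            if key == cur.key:
                return False  # key already present
            link = cur.left if key < cur.key else cur.right

    def contains(self, key):
        cur = self.root.get()
        while cur is not None:
            if key == cur.key:
                return True
            cur = (cur.left if key < cur.key else cur.right).get()
        return False
```

    A failed CAS here means that another thread installed a node on the same link, so the operation re-reads that link and continues its descent instead of restarting from the root. Supporting {\sc Remove} requires additional machinery, such as the helping described in the abstract, which this sketch deliberately omits.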

    A Contention-Friendly Methodology for Search Structures

    In this paper, a new methodology for writing concurrent data structures is proposed. This methodology limits the high contention induced by today's multicore environments to come up with efficient alternatives to the most widely used search structures, including skip lists, binary search trees and hash tables. Data structures are generally constrained to guarantee a big-oh step complexity even in the presence of concurrency. By contrast, our methodology guarantees the big-oh complexity only in the absence of contention and limits the contention when concurrency appears. The key concept lies in dividing update operations into an eager abstract access that returns rapidly for efficiency reasons and a lazy structural adaptation that may be postponed to diminish contention. We illustrate our methodology with three contention-friendly data structures: a lock-based skip list and binary search tree, and a lock-free hash table. Our evaluation clearly shows that our contention-friendly data structures are more efficient than their non-contention-friendly counterparts. In particular, our lock-based skip list is up to 1.3x faster than the Java concurrent skip list, our lock-based tree is up to 2.2x faster than the most recent concurrent tree algorithm we are aware of, and our lock-free hash table outperforms the Java concurrent hash table by up to 1.2x. We also present contention-friendly versions of the skip list and binary search tree using transactional memory. Even though our transaction-based data structures are substantially slower than our lock-based ones, they inherit compositionality from transactional memory and outperform their non-contention-friendly counterparts by 1.5x on average. This report presents a methodological approach to concurrent search structures, with applications to skip lists, trees and hash tables.
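
    The split between an eager abstract access and a lazy structural adaptation can be sketched on a sorted linked list. This is a minimal single-threaded illustration under assumed names (LazySet, adapt, physical_size are hypothetical), not one of the data structures evaluated in the report: remove only flips a logical-deletion flag, and the unlinking of marked nodes is postponed to a separate adapt pass.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.deleted = False  # logical-deletion flag set by the eager step
        self.next = None


class LazySet:
    """Sorted linked-list set with deferred physical removal (sketch)."""

    def __init__(self):
        self.head = Node(float("-inf"))  # sentinel, never removed

    def _find(self, key):
        # Return (prev, cur) where cur is the first node with cur.key >= key.
        prev, cur = self.head, self.head.next
        while cur is not None and cur.key < key:
            prev, cur = cur, cur.next
        return prev, cur

    def contains(self, key):
        _, cur = self._find(key)
        return cur is not None and cur.key == key and not cur.deleted

    def add(self, key):
        prev, cur = self._find(key)
        if cur is not None and cur.key == key:
            if cur.deleted:
                cur.deleted = False  # revive a logically deleted node
                return True
            return False
        node = Node(key)
        node.next = cur
        prev.next = node
        return True

    def remove(self, key):
        # Eager abstract access: change the abstraction, not the structure.
        _, cur = self._find(key)
        if cur is None or cur.key != key or cur.deleted:
            return False
        cur.deleted = True
        return True

    def adapt(self):
        # Lazy structural adaptation: unlink every marked node in one pass.
        prev, cur = self.head, self.head.next
        unlinked = 0
        while cur is not None:
            if cur.deleted:
                prev.next = cur.next
                unlinked += 1
            else:
                prev = cur
            cur = cur.next
        return unlinked

    def physical_size(self):
        n, cur = 0, self.head.next
        while cur is not None:
            n, cur = n + 1, cur.next
        return n
```

    After remove(2) the key is no longer visible through contains, yet the node stays linked until adapt runs. In a concurrent setting that pass could be run by a background thread when contention is low, which is the point of postponing it.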

    A Concurrency-Optimal Binary Search Tree

    The paper presents the first \emph{concurrency-optimal} implementation of a binary search tree (BST). The implementation, based on a standard sequential implementation of an internal tree, ensures that every \emph{schedule}, i.e., every interleaving of steps of the sequential code, is accepted unless it violates linearizability. To ensure this property, we use a novel read-write locking scheme that protects tree \emph{edges} in addition to nodes. Our implementation outperforms the state-of-the-art BSTs on most basic workloads, which suggests that optimizing the set of accepted schedules of the sequential code can be an adequate design principle for efficient concurrent data structures.

    DeltaTree: A Practical Locality-aware Concurrent Search Tree

    Like other fundamental programming abstractions in energy-efficient computing, search trees are expected to support both high parallelism and data locality. However, existing highly-concurrent search trees such as red-black trees and AVL trees do not consider data locality, while existing locality-aware search trees, such as those based on the van Emde Boas layout (vEB-based trees), poorly support concurrent (update) operations. This paper presents DeltaTree, a practical locality-aware concurrent search tree that combines locality-optimisation techniques from vEB-based trees with concurrency-optimisation techniques from non-blocking highly-concurrent search trees. DeltaTree is a $k$-ary leaf-oriented tree of DeltaNodes in which each DeltaNode is a size-fixed tree-container with the van Emde Boas layout. The expected memory transfer costs of DeltaTree's Search, Insert, and Delete operations are $O(\log_B N)$, where $N$ and $B$ are the tree size and the unknown memory block size in the ideal cache model, respectively. DeltaTree's Search operation is wait-free, providing prioritised lanes for Search operations, the dominant operation in search trees. Its Insert and Delete operations are non-blocking to other Search, Insert, and Delete operations, but they may occasionally be blocked by maintenance operations that are sometimes triggered to keep DeltaTree in good shape. Our experimental evaluation using the latest implementations of AVL, red-black, and speculation-friendly trees from the Synchrobench benchmark has shown that DeltaTree is up to 5 times faster than all three concurrent search trees for search operations and up to 1.6 times faster for update operations when the update contention is not too high.
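
    The van Emde Boas layout mentioned above can be illustrated with a short sketch that computes the memory order of the nodes of a complete binary tree. Nodes are identified by their breadth-first index (root = 1, children of i are 2i and 2i + 1). The function name veb_order and the choice of giving the top subtree the floor of half the height are assumptions made for this illustration, and the sketch does not reproduce DeltaNode internals.

```python
def veb_order(root, height):
    """Return BFS indices of a complete binary tree in van Emde Boas order.

    `root` is the BFS index of the subtree root and `height` is the number
    of levels in the subtree (height >= 1).
    """
    if height == 1:
        return [root]
    top_h = height // 2          # levels in the top subtree (assumed split rule)
    bottom_h = height - top_h    # levels in each bottom subtree
    # Lay out the top subtree first ...
    order = veb_order(root, top_h)
    # ... then each bottom subtree, left to right. Their roots are the
    # descendants of `root` exactly top_h levels down.
    first = root << top_h
    for r in range(first, first + (1 << top_h)):
        order.extend(veb_order(r, bottom_h))
    return order
```

    For a tree with three levels, veb_order(1, 3) returns [1, 2, 4, 5, 3, 6, 7]: the root, then the left bottom subtree stored contiguously, then the right one. Because every recursive subtree occupies a contiguous range, a root-to-leaf search touches few memory blocks regardless of the block size $B$, which is the property the $O(\log_B N)$ bound relies on.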

    A Template for Implementing Fast Lock-free Trees Using HTM

    Algorithms that use hardware transactional memory (HTM) must provide a software-only fallback path to guarantee progress. The design of the fallback path can have a profound impact on performance. If the fallback path is allowed to run concurrently with hardware transactions, then hardware transactions must be instrumented, adding significant overhead. Otherwise, hardware transactions must wait for any processes on the fallback path, causing concurrency bottlenecks, or move to the fallback path. We introduce an approach that combines the best of both worlds. The key idea is to use three execution paths: an HTM fast path, an HTM middle path, and a software fallback path, such that the middle path can run concurrently with each of the other two. The fast path and fallback path do not run concurrently, so the fast path incurs no instrumentation overhead. Furthermore, fast path transactions can move to the middle path instead of waiting or moving to the software path. We demonstrate our approach by producing an accelerated version of the tree update template of Brown et al., which can be used to implement fast lock-free data structures based on down-trees. We used the accelerated template to implement two lock-free trees: a binary search tree (BST), and an (a,b)-tree (a generalization of a B-tree). Experiments show that, with 72 concurrent processes, our accelerated (a,b)-tree performs between 4.0x and 4.2x as many operations per second as an implementation obtained using the original tree update template.
