14,712 research outputs found

    DeltaTree: A Practical Locality-aware Concurrent Search Tree

    Full text link
    As other fundamental programming abstractions in energy-efficient computing, search trees are expected to support both high parallelism and data locality. However, existing highly-concurrent search trees such as red-black trees and AVL trees do not consider data locality while existing locality-aware search trees such as those based on the van Emde Boas layout (vEB-based trees), poorly support concurrent (update) operations. This paper presents DeltaTree, a practical locality-aware concurrent search tree that combines both locality-optimisation techniques from vEB-based trees and concurrency-optimisation techniques from non-blocking highly-concurrent search trees. DeltaTree is a kk-ary leaf-oriented tree of DeltaNodes in which each DeltaNode is a size-fixed tree-container with the van Emde Boas layout. The expected memory transfer costs of DeltaTree's Search, Insert, and Delete operations are O(logBN)O(\log_B N), where N,BN, B are the tree size and the unknown memory block size in the ideal cache model, respectively. DeltaTree's Search operation is wait-free, providing prioritised lanes for Search operations, the dominant operation in search trees. Its Insert and {\em Delete} operations are non-blocking to other Search, Insert, and Delete operations, but they may be occasionally blocked by maintenance operations that are sometimes triggered to keep DeltaTree in good shape. Our experimental evaluation using the latest implementation of AVL, red-black, and speculation friendly trees from the Synchrobench benchmark has shown that DeltaTree is up to 5 times faster than all of the three concurrent search trees for searching operations and up to 1.6 times faster for update operations when the update contention is not too high

    A Unified approach to concurrent and parallel algorithms on balanced data structures

    Get PDF
    Concurrent and parallel algorithms are different. However, in the case of dictionaries, both kinds of algorithms share many common points. We present a unified approach emphasizing these points. It is based on a careful analysis of the sequential algorithm, extracting from it the more basic facts, encapsulated later on as local rules. We apply the method to the insertion algorithms in AVL trees. All the concurrent and parallel insertion algorithms have two main phases. A percolation phase, moving the keys to be inserted down, and a rebalancing phase. Finally, some other algorithms and balanced structures are discussed.Postprint (published version

    Non-blocking Priority Queue based on Skiplists with Relaxed Semantics

    Full text link
    Priority queues are data structures that store information in an orderly fashion. They are of tremendous importance because they are an integral part of many applications, like Dijkstra’s shortest path algorithm, MST algorithms, priority schedulers, and so on. Since priority queues by nature have high contention on the delete_min operation, the design of an efficient priority queue should involve an intelligent choice of the data structure as well as relaxation bounds on the data structure. Lock-free data structures provide higher scalability as well as progress guarantee than a lock-based data structure. That is another factor to be considered in the priority queue design. We present a relaxed non-blocking priority queue based on skiplists. We address all the design issues mentioned above in our priority queue. Use of skiplists allows multiple threads to concurrently access different parts of the skiplist quickly, whereas relaxing the priority queue delete_min operation distributes contention over the skiplist instead of just at the front. Furthermore, a non-blocking implementation guarantees that the system will make progress even when some process fails. Our priority queue is internally composed of several priority queues, one for each thread and one shared priority queue common to all threads. Each thread selects the best value from its local priority queue and the shared priority queue and returns the value. In case a thread is unable to delete an item, it tries to spy items from other threads\u27 local priority queues. We experimentally and theoretically show the correctness of our data structure. We also compare the performance of our data structure with other variations like priority queues based on coarse-grained skiplists for both relaxed and non-relaxed semantics

    Parallel Working-Set Search Structures

    Full text link
    In this paper we present two versions of a parallel working-set map on p processors that supports searches, insertions and deletions. In both versions, the total work of all operations when the map has size at least p is bounded by the working-set bound, i.e., the cost of an item depends on how recently it was accessed (for some linearization): accessing an item in the map with recency r takes O(1+log r) work. In the simpler version each map operation has O((log p)^2+log n) span (where n is the maximum size of the map). In the pipelined version each map operation on an item with recency r has O((log p)^2+log r) span. (Operations in parallel may have overlapping span; span is additive only for operations in sequence.) Both data structures are designed to be used by a dynamic multithreading parallel program that at each step executes a unit-time instruction or makes a data structure call. To achieve the stated bounds, the pipelined data structure requires a weak-priority scheduler, which supports a limited form of 2-level prioritization. At the end we explain how the results translate to practical implementations using work-stealing schedulers. To the best of our knowledge, this is the first parallel implementation of a self-adjusting search structure where the cost of an operation adapts to the access sequence. A corollary of the working-set bound is that it achieves work static optimality: the total work is bounded by the access costs in an optimal static search tree.Comment: Authors' version of a paper accepted to SPAA 201
    corecore