181 research outputs found

    DeltaTree: A Practical Locality-aware Concurrent Search Tree

    Full text link
    As other fundamental programming abstractions in energy-efficient computing, search trees are expected to support both high parallelism and data locality. However, existing highly-concurrent search trees such as red-black trees and AVL trees do not consider data locality while existing locality-aware search trees such as those based on the van Emde Boas layout (vEB-based trees), poorly support concurrent (update) operations. This paper presents DeltaTree, a practical locality-aware concurrent search tree that combines both locality-optimisation techniques from vEB-based trees and concurrency-optimisation techniques from non-blocking highly-concurrent search trees. DeltaTree is a kk-ary leaf-oriented tree of DeltaNodes in which each DeltaNode is a size-fixed tree-container with the van Emde Boas layout. The expected memory transfer costs of DeltaTree's Search, Insert, and Delete operations are O(logBN)O(\log_B N), where N,BN, B are the tree size and the unknown memory block size in the ideal cache model, respectively. DeltaTree's Search operation is wait-free, providing prioritised lanes for Search operations, the dominant operation in search trees. Its Insert and {\em Delete} operations are non-blocking to other Search, Insert, and Delete operations, but they may be occasionally blocked by maintenance operations that are sometimes triggered to keep DeltaTree in good shape. Our experimental evaluation using the latest implementation of AVL, red-black, and speculation friendly trees from the Synchrobench benchmark has shown that DeltaTree is up to 5 times faster than all of the three concurrent search trees for searching operations and up to 1.6 times faster for update operations when the update contention is not too high

    DeltaTree: A Practical Locality-aware Concurrent Search Tree

    Get PDF
    As other fundamental programming abstractions in energy-e cient computing, search trees are expected to support both high parallelism and data locality. However, existing highly-concurrent search trees such as red-black trees and AVL trees do not consider data locality while existing locality-aware search trees such as those based on the van Emde Boas layout (vEB-based trees), poorly support concurrent (update) operations. This paper presents DeltaTree, a practical locality-aware concurrent search tree that combines both locality-optimisation techniques from vEB-based trees and concurrency-optimisation techniques from non-blocking highly-concurrent search trees. DeltaTree is a k-ary leaf-oriented tree of DeltaNodes in which each DeltaNode is a size- xed tree-container with the van Emde Boas layout. The expected memory transfer costs of DeltaTree's Search, Insert and Delete operations are O(logBN), where N;B are the tree size and the unknown memory block size in the ideal cache model, respectively. DeltaTree's Search operation is wait-free, providing prioritised lanes for Search operations, the dominant operation in search trees. Its Insert and Delete operations are non-blocking to other Search, Insert and Delete operations, but they may be occasionally blocked by maintenance operations that are sometimes triggered to keep DeltaTree in good shape. Our experimental evaluation using the latest implementation of AVL, red-black, and speculation friendly trees from the Synchrobench benchmark has shown that DeltaTree is up to 5 times faster than all of the three concurrent search trees for searching operations and up to 1.6 times faster for update operations when the update contention is not too high

    Models for energy consumption of data structures and algorithms

    Get PDF
    EXCESS deliverable D2.1. More information at http://www.excess-project.eu/This deliverable reports our early energy models for data structures and algorithms based on both micro-benchmarks and concurrent algorithms. It reports the early results of Task 2.1 on investigating and modeling the trade-off between energy and performance in concurrent data structures and algorithms, which forms the basis for the whole work package 2 (WP2). The work has been conducted on the two main EXCESS platforms: (1) Intel platform with recent Intel multi-core CPUs and (2) Movidius embedded platform

    White-box methodologies, programming abstractions and libraries

    Get PDF
    EXCESS deliverable D2.2. More information at http://www.excess-project.eu/This deliverable reports the results of white-box methodologies and early results ofthe first prototype of libraries and programming abstractions as available by projectmonth 18 by Work Package 2 (WP2). It reports i) the latest results of Task 2.2on white-box methodologies, programming abstractions and libraries for developingenergy-efficient data structures and algorithms and ii) the improved results of Task2.1 on investigating and modeling the trade-off between energy and performance ofconcurrent data structures and algorithms. The work has been conducted on two mainEXCESS platforms: Intel platforms with recent Intel multicore CPUs and MovidiusMyriad1 platform

    Parallel Longest Increasing Subsequence and van Emde Boas Trees

    Full text link
    This paper studies parallel algorithms for the longest increasing subsequence (LIS) problem. Let nn be the input size and kk be the LIS length of the input. Sequentially, LIS is a simple problem that can be solved using dynamic programming (DP) in O(nlogn)O(n\log n) work. However, parallelizing LIS is a long-standing challenge. We are unaware of any parallel LIS algorithm that has optimal O(nlogn)O(n\log n) work and non-trivial parallelism (i.e., O~(k)\tilde{O}(k) or o(n)o(n) span). This paper proposes a parallel LIS algorithm that costs O(nlogk)O(n\log k) work, O~(k)\tilde{O}(k) span, and O(n)O(n) space, and is much simpler than the previous parallel LIS algorithms. We also generalize the algorithm to a weighted version of LIS, which maximizes the weighted sum for all objects in an increasing subsequence. To achieve a better work bound for the weighted LIS algorithm, we designed parallel algorithms for the van Emde Boas (vEB) tree, which has the same structure as the sequential vEB tree, and supports work-efficient parallel batch insertion, deletion, and range queries. We also implemented our parallel LIS algorithms. Our implementation is light-weighted, efficient, and scalable. On input size 10910^9, our LIS algorithm outperforms a highly-optimized sequential algorithm (with O(nlogk)O(n\log k) cost) on inputs with k3×105k\le 3\times 10^5. Our algorithm is also much faster than the best existing parallel implementation by Shen et al. (2022) on all input instances.Comment: to be published in Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA '23

    Fast Arrays: Atomic Arrays with Constant Time Initialization

    Get PDF
    Some algorithms require a large array, but only operate on a small fraction of its indices. Examples include adjacency matrices for sparse graphs, hash tables, and van Emde Boas trees. For such algorithms, array initialization can be the most time-consuming operation. Fast arrays were invented to avoid this costly initialization. A fast array is a software implementation of an array, such that the entire array can be initialized in just constant time. While algorithms for sequential fast arrays have been known for a long time, to the best of our knowledge, there are no previous algorithms for concurrent fast arrays. We present the first such algorithms in this paper. Our first algorithm is linearizable and wait-free, uses only linear space, and supports all operations - initialize, read, and write - in constant time. Our second algorithm enhances the first to additionally support all the read-modify-write operations available in hardware (such as compare-and-swap) in constant time
    corecore