14,091 research outputs found

    DeltaTree: A Practical Locality-aware Concurrent Search Tree

    Full text link
    As other fundamental programming abstractions in energy-efficient computing, search trees are expected to support both high parallelism and data locality. However, existing highly-concurrent search trees such as red-black trees and AVL trees do not consider data locality while existing locality-aware search trees such as those based on the van Emde Boas layout (vEB-based trees), poorly support concurrent (update) operations. This paper presents DeltaTree, a practical locality-aware concurrent search tree that combines both locality-optimisation techniques from vEB-based trees and concurrency-optimisation techniques from non-blocking highly-concurrent search trees. DeltaTree is a kk-ary leaf-oriented tree of DeltaNodes in which each DeltaNode is a size-fixed tree-container with the van Emde Boas layout. The expected memory transfer costs of DeltaTree's Search, Insert, and Delete operations are O(logBN)O(\log_B N), where N,BN, B are the tree size and the unknown memory block size in the ideal cache model, respectively. DeltaTree's Search operation is wait-free, providing prioritised lanes for Search operations, the dominant operation in search trees. Its Insert and {\em Delete} operations are non-blocking to other Search, Insert, and Delete operations, but they may be occasionally blocked by maintenance operations that are sometimes triggered to keep DeltaTree in good shape. Our experimental evaluation using the latest implementation of AVL, red-black, and speculation friendly trees from the Synchrobench benchmark has shown that DeltaTree is up to 5 times faster than all of the three concurrent search trees for searching operations and up to 1.6 times faster for update operations when the update contention is not too high

    Structural Analysis: Shape Information via Points-To Computation

    Full text link
    This paper introduces a new hybrid memory analysis, Structural Analysis, which combines an expressive shape analysis style abstract domain with efficient and simple points-to style transfer functions. Using data from empirical studies on the runtime heap structures and the programmatic idioms used in modern object-oriented languages we construct a heap analysis with the following characteristics: (1) it can express a rich set of structural, shape, and sharing properties which are not provided by a classic points-to analysis and that are useful for optimization and error detection applications (2) it uses efficient, weakly-updating, set-based transfer functions which enable the analysis to be more robust and scalable than a shape analysis and (3) it can be used as the basis for a scalable interprocedural analysis that produces precise results in practice. The analysis has been implemented for .Net bytecode and using this implementation we evaluate both the runtime cost and the precision of the results on a number of well known benchmarks and real world programs. Our experimental evaluations show that the domain defined in this paper is capable of precisely expressing the majority of the connectivity, shape, and sharing properties that occur in practice and, despite the use of weak updates, the static analysis is able to precisely approximate the ideal results. The analysis is capable of analyzing large real-world programs (over 30K bytecodes) in less than 65 seconds and using less than 130MB of memory. In summary this work presents a new type of memory analysis that advances the state of the art with respect to expressive power, precision, and scalability and represents a new area of study on the relationships between and combination of concepts from shape and points-to analyses

    HardScope: Thwarting DOP with Hardware-assisted Run-time Scope Enforcement

    Full text link
    Widespread use of memory unsafe programming languages (e.g., C and C++) leaves many systems vulnerable to memory corruption attacks. A variety of defenses have been proposed to mitigate attacks that exploit memory errors to hijack the control flow of the code at run-time, e.g., (fine-grained) randomization or Control Flow Integrity. However, recent work on data-oriented programming (DOP) demonstrated highly expressive (Turing-complete) attacks, even in the presence of these state-of-the-art defenses. Although multiple real-world DOP attacks have been demonstrated, no efficient defenses are yet available. We propose run-time scope enforcement (RSE), a novel approach designed to efficiently mitigate all currently known DOP attacks by enforcing compile-time memory safety constraints (e.g., variable visibility rules) at run-time. We present HardScope, a proof-of-concept implementation of hardware-assisted RSE for the new RISC-V open instruction set architecture. We discuss our systematic empirical evaluation of HardScope which demonstrates that it can mitigate all currently known DOP attacks, and has a real-world performance overhead of 3.2% in embedded benchmarks

    Parallel Working-Set Search Structures

    Full text link
    In this paper we present two versions of a parallel working-set map on p processors that supports searches, insertions and deletions. In both versions, the total work of all operations when the map has size at least p is bounded by the working-set bound, i.e., the cost of an item depends on how recently it was accessed (for some linearization): accessing an item in the map with recency r takes O(1+log r) work. In the simpler version each map operation has O((log p)^2+log n) span (where n is the maximum size of the map). In the pipelined version each map operation on an item with recency r has O((log p)^2+log r) span. (Operations in parallel may have overlapping span; span is additive only for operations in sequence.) Both data structures are designed to be used by a dynamic multithreading parallel program that at each step executes a unit-time instruction or makes a data structure call. To achieve the stated bounds, the pipelined data structure requires a weak-priority scheduler, which supports a limited form of 2-level prioritization. At the end we explain how the results translate to practical implementations using work-stealing schedulers. To the best of our knowledge, this is the first parallel implementation of a self-adjusting search structure where the cost of an operation adapts to the access sequence. A corollary of the working-set bound is that it achieves work static optimality: the total work is bounded by the access costs in an optimal static search tree.Comment: Authors' version of a paper accepted to SPAA 201

    Smooth heaps and a dual view of self-adjusting data structures

    Full text link
    We present a new connection between self-adjusting binary search trees (BSTs) and heaps, two fundamental, extensively studied, and practically relevant families of data structures. Roughly speaking, we map an arbitrary heap algorithm within a natural model, to a corresponding BST algorithm with the same cost on a dual sequence of operations (i.e. the same sequence with the roles of time and key-space switched). This is the first general transformation between the two families of data structures. There is a rich theory of dynamic optimality for BSTs (i.e. the theory of competitiveness between BST algorithms). The lack of an analogous theory for heaps has been noted in the literature. Through our connection, we transfer all instance-specific lower bounds known for BSTs to a general model of heaps, initiating a theory of dynamic optimality for heaps. On the algorithmic side, we obtain a new, simple and efficient heap algorithm, which we call the smooth heap. We show the smooth heap to be the heap-counterpart of Greedy, the BST algorithm with the strongest proven and conjectured properties from the literature, widely believed to be instance-optimal. Assuming the optimality of Greedy, the smooth heap is also optimal within our model of heap algorithms. As corollaries of results known for Greedy, we obtain instance-specific upper bounds for the smooth heap, with applications in adaptive sorting. Intriguingly, the smooth heap, although derived from a non-practical BST algorithm, is simple and easy to implement (e.g. it stores no auxiliary data besides the keys and tree pointers). It can be seen as a variation on the popular pairing heap data structure, extending it with a "power-of-two-choices" type of heuristic.Comment: Presented at STOC 2018, light revision, additional figure
    corecore