
    DeltaTree: A Practical Locality-aware Concurrent Search Tree

    Like other fundamental programming abstractions in energy-efficient computing, search trees are expected to support both high parallelism and data locality. However, existing highly-concurrent search trees such as red-black trees and AVL trees do not consider data locality, while existing locality-aware search trees, such as those based on the van Emde Boas layout (vEB-based trees), poorly support concurrent (update) operations. This paper presents DeltaTree, a practical locality-aware concurrent search tree that combines locality-optimisation techniques from vEB-based trees with concurrency-optimisation techniques from non-blocking highly-concurrent search trees. DeltaTree is a k-ary leaf-oriented tree of DeltaNodes in which each DeltaNode is a fixed-size tree-container with the van Emde Boas layout. The expected memory transfer cost of DeltaTree's Search, Insert, and Delete operations is O(log_B N), where N and B are the tree size and the unknown memory block size in the ideal cache model, respectively. DeltaTree's Search operation is wait-free, providing prioritised lanes for Search operations, the dominant operation in search trees. Its Insert and Delete operations are non-blocking to other Search, Insert, and Delete operations, but they may occasionally be blocked by maintenance operations that are sometimes triggered to keep DeltaTree in good shape. Our experimental evaluation, using the latest implementations of AVL, red-black, and speculation-friendly trees from the Synchrobench benchmark, shows that DeltaTree is up to 5 times faster than all three concurrent search trees for search operations and up to 1.6 times faster for update operations when the update contention is not too high.
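
    As a rough illustration of the layout idea the abstract relies on, the following C sketch computes the recursive van Emde Boas order for a complete binary tree whose nodes are labelled in heap (breadth-first) order; the function name, the labelling scheme, and the fixed height are assumptions made for the example, not code from the DeltaTree paper.

        /* Hypothetical sketch (not code from the paper): emit the recursive van Emde
         * Boas order for a complete binary tree whose nodes are labelled 1..2^h-1 in
         * heap (breadth-first) order. A tree of h levels is cut in the middle: the
         * top piece of ceil(h/2) levels is laid out first, followed by each bottom
         * piece in left-to-right order, recursively. */
        #include <stdio.h>

        /* Append the subtree rooted at heap index `root`, with `h` levels, to `out`. */
        static void veb_layout(unsigned root, unsigned h, unsigned *out, unsigned *pos)
        {
            if (h == 0)
                return;
            if (h == 1) {                        /* single node: place it and return */
                out[(*pos)++] = root;
                return;
            }
            unsigned top = (h + 1) / 2;          /* levels in the top piece           */
            unsigned bot = h - top;              /* levels in each bottom piece       */

            veb_layout(root, top, out, pos);     /* the top piece comes first         */

            /* The bottom pieces are rooted at the 2^top descendants found `top`
             * levels below `root`; in heap numbering these are contiguous.           */
            unsigned count = 1u << top;
            unsigned first = root << top;
            for (unsigned i = 0; i < count; i++)
                veb_layout(first + i, bot, out, pos);
        }

        int main(void)
        {
            enum { H = 4 };                      /* 4 levels -> 15 nodes              */
            unsigned out[(1u << H) - 1], pos = 0;
            veb_layout(1, H, out, &pos);
            for (unsigned i = 0; i < pos; i++)   /* memory order of the heap labels   */
                printf("%u ", out[i]);
            printf("\n");
            return 0;
        }

    Because the recursion is independent of any particular block size, a root-to-leaf search in such a layout touches O(log_B N) memory blocks for every block size B, which is the cache-oblivious property the abstract quotes for DeltaTree's operations.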

    Coarse-Grained, Fine-Grained, and Lock-Free Concurrency Approaches for Self-Balancing B-Tree

    This dissertation examines concurrency approaches for a standard, unmodified B-Tree, which is one of the more complex data structures. These include the coarse-grained locking, fine-grained locking, and lock-free approaches. The basic industry-standard coarse-grained approach is used as a baseline for comparison with the more advanced fine-grained and lock-free approaches. The fine-grained approach is explored, and algorithms are presented for fine-grained B-Tree insertion and deletion. The lock-free approach is addressed, and an algorithm for lock-free B-Tree insertion is provided. The issues associated with lock-free deletion are discussed. Comparison trade-offs are presented and discussed. As a final part of this effort, specific testing processes are discussed and presented.
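
    For readers unfamiliar with the fine-grained approach, the sketch below shows the classic lock-coupling (hand-over-hand) descent of a B-Tree whose nodes each carry a reader/writer lock; the node layout, field names, and ORDER constant are assumptions made for illustration and are not taken from the dissertation.

        /* Illustrative sketch of fine-grained "lock coupling" on a B-Tree (not the
         * dissertation's algorithm): every node carries its own reader/writer lock,
         * and a traversal holds at most two locks at a time -- the parent's lock is
         * released only after the child's lock has been acquired. */
        #include <pthread.h>
        #include <stdbool.h>
        #include <stddef.h>

        #define ORDER 8                        /* max children per node (assumed) */

        typedef struct btree_node {
            pthread_rwlock_t lock;             /* per-node lock                   */
            int  nkeys;
            int  keys[ORDER - 1];
            struct btree_node *child[ORDER];   /* unused/NULL in a leaf           */
            bool leaf;
        } btree_node;

        /* Search with hand-over-hand read locking. */
        bool btree_contains(btree_node *root, int key)
        {
            btree_node *cur = root;
            pthread_rwlock_rdlock(&cur->lock);
            for (;;) {
                int i = 0;
                while (i < cur->nkeys && key > cur->keys[i])
                    i++;
                if (i < cur->nkeys && key == cur->keys[i]) {
                    pthread_rwlock_unlock(&cur->lock);
                    return true;
                }
                if (cur->leaf) {
                    pthread_rwlock_unlock(&cur->lock);
                    return false;
                }
                btree_node *next = cur->child[i];
                pthread_rwlock_rdlock(&next->lock);   /* take the child's lock first */
                pthread_rwlock_unlock(&cur->lock);    /* ... then release the parent */
                cur = next;
            }
        }

    Insertion and deletion complicate this picture, since a descent must keep a node locked (or restart) whenever a child might split or merge, which is where much of the subtlety of fine-grained B-Tree algorithms lies.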

    O2-tree: a shared memory resident index in multicore architectures

    Shared-memory multicore computer architectures are now commonplace in computing. These can be found in modern desktops and workstation computers and also in High Performance Computing (HPC) systems. Recent advances in memory architecture and in 64-bit addressing allow such systems to have memory sizes of the order of hundreds of gigabytes and beyond. This now allows for realistic development of main-memory-resident database systems, which still require a memory-resident index, such as the T-Tree or the B+-Tree, for fast access to the data items. This thesis proposes a new indexing structure, called the O2-Tree, which is essentially an augmented Red-Black Tree in which the leaf nodes are index data blocks that store multiple pairs of key and value, referred to as "key-value" pairs. The value is either the entire record associated with the key or a pointer to the location of the record. The internal nodes contain copies of the keys that split blocks of the leaf nodes, in a manner similar to the B+-Tree. The O2-Tree structure has the advantage that it can be easily reconstructed by reading only the lowest key value of each leaf node page; the resulting size is sufficiently small that the tree can be dumped and restored much faster. Analysis and a comparative experimental study show that the performance of the O2-Tree is superior to other tree-based index structures with respect to various query operations for large datasets. We also present results which indicate that the O2-Tree outperforms popular key-value stores such as BerkeleyDB and TreeDB of Kyoto Cabinet for various workloads. The thesis addresses various concurrent access techniques for the O2-Tree on shared-memory multicore architectures and gives an analysis of the O2-Tree with respect to query operations, storage utilization, failover and recovery.
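
    The following C sketch illustrates, under assumed names and sizes, the kind of layout the abstract describes: an internal red-black tree of separator keys sitting over fixed-size leaf blocks of key-value pairs, where only the lowest key of each block is needed to rebuild the index. It is not the thesis's actual implementation.

        /* Minimal layout sketch with assumed names and sizes (not the thesis code):
         * internal nodes form a red-black tree over separator keys, while the leaf
         * level consists of fixed-size blocks of sorted key-value pairs, chained for
         * range scans. Rebuilding after a restart only needs the smallest key of
         * each leaf block. */
        #include <stdint.h>

        #define LEAF_CAPACITY 64                 /* key-value pairs per block (assumed) */

        typedef enum { RED, BLACK } rb_color;

        typedef struct o2_leaf {
            int      nkeys;
            uint64_t keys[LEAF_CAPACITY];        /* kept sorted within the block        */
            void    *values[LEAF_CAPACITY];      /* record pointer or embedded record   */
            struct o2_leaf *next;                /* leaf blocks chained left to right   */
        } o2_leaf;

        typedef struct o2_node {
            rb_color color;
            uint64_t sep_key;                    /* copy of a key splitting two blocks  */
            struct o2_node *left, *right;
            o2_leaf *leaf;                       /* non-NULL only just above the leaves */
        } o2_node;

        /* All that is needed to reinsert a block when reconstructing the index. */
        static inline uint64_t o2_leaf_low_key(const o2_leaf *l)
        {
            return l->keys[0];
        }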

    Acute: high-level programming language design for distributed computation

    Existing languages provide good support for typeful programming of standalone programs. In a distributed system, however, there may be interaction between multiple instances of many distinct programs, sharing some (but not necessarily all) of their module structure, and with some instances rebuilt with new versions of certain modules as time goes on. In this paper we discuss programming language support for such systems, focussing on their typing and naming issues. We describe an experimental language, Acute, which extends an ML core to support distributed development, deployment, and execution, allowing type-safe interaction between separately-built programs. The main features are: (1) type-safe marshalling of arbitrary values; (2) type names that are generated (freshly and by hashing) to ensure that type equality tests suffice to protect the invariants of abstract types, across the entire distributed system; (3) expression-level names generated to ensure that name equality tests suffice for type-safety of associated values, e.g. values carried on named channels; (4) controlled dynamic rebinding of marshalled values to local resources; and (5) thunkification of threads and mutexes to support computation mobility. These features are a large part of what is needed for typeful distributed programming. They are a relatively lightweight extension of ML, should be efficiently implementable, and are expressive enough to enable a wide variety of distributed infrastructure layers to be written as simple library code above the byte-string network and persistent store APIs. This disentangles the language runtime from communication intricacies. This paper highlights the main design choices in Acute. It is supported by a full language definition (of typing, compilation, and operational semantics), by a prototype implementation, and by example distribution libraries
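
    As a hedged illustration of the hash-generated type names mentioned above (feature 2), the C sketch below tags marshalled bytes with a hash of a type's structural definition and rejects unmarshalling when the hashes differ; the FNV-1a hash, struct layout, and function names are assumptions made for this example and do not reflect Acute's runtime.

        /* Hedged illustration (not Acute's runtime) of hash-generated type names:
         * marshalled bytes are tagged with a hash of the type's structural
         * definition, and unmarshalling succeeds only when the receiver computes
         * the same hash for the type it expects. */
        #include <assert.h>
        #include <stdbool.h>
        #include <stdint.h>
        #include <string.h>

        static uint64_t fnv1a(const char *s)
        {
            uint64_t h = 1469598103934665603ull;
            for (; *s; s++) { h ^= (unsigned char)*s; h *= 1099511628211ull; }
            return h;
        }

        typedef struct {
            uint64_t type_hash;                /* "type name" derived from the definition */
            size_t   len;
            unsigned char payload[64];
        } marshalled;

        /* Sender: tag the bytes with the hash of the type's definition string. */
        static marshalled marshal(const char *type_defn, const void *bytes, size_t len)
        {
            marshalled m = { fnv1a(type_defn), len, {0} };
            assert(len <= sizeof m.payload);
            memcpy(m.payload, bytes, len);
            return m;
        }

        /* Receiver: the hash equality test stands in for a type-equality check
         * across separately built programs. */
        static bool unmarshal(const marshalled *m, const char *type_defn,
                              void *out, size_t len)
        {
            if (m->type_hash != fnv1a(type_defn) || m->len != len)
                return false;                  /* type names differ: reject the value */
            memcpy(out, m->payload, len);
            return true;
        }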

    Software & system verification with KIV


    Self Adjusting Contention Friendly Concurrent Binary Search Tree by Lazy Splaying

    We present a partially blocking implementation of a concurrent binary search tree data structure that is contention-friendly, fast, and scales well. It uses a technique called lazy splaying to move frequently accessed items close to the root without making the root of the tree a sequential bottleneck. Most self-adjusting binary search trees are constrained to guarantee the height of the tree even in the presence of concurrency; this methodology instead roughly guarantees the height of the tree only in the absence of contention and limits contention during concurrent accesses. The main idea is to divide the update operation into two operations: an eager abstract modification with lazy splaying, which completes quickly and makes at most one local rotation of the tree on each access as a function of historical access frequencies; and a lazy structural adaptation with long/semi-splaying, which implements top-down recursive splaying of the tree and may be postponed to diminish contention, rebalancing the tree during periods of lower contention. This way, frequently accessed items undergo full splaying, but only after a few accesses, and always appear near the root of the tree, whereas infrequently accessed items do not get enough pushes up the tree and stay in its bottom part. As in a sequential counting-based splay tree, the amortized time bound of each operation is O(log N), where N is the number of items in the tree.
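
    The single-rotation step described above might look roughly like the following single-threaded C sketch, in which each node keeps an access counter and a found node is rotated above its parent only once it has become the more frequently accessed of the two; the field names and promotion heuristic are assumptions, not the authors' code.

        /* Single-threaded sketch of the lazy-splaying step: every access bumps a
         * per-node counter, and at most one local rotation is performed per access;
         * full long/semi-splaying is deferred to a separate structural phase. */
        #include <stddef.h>

        typedef struct splay_node {
            int key;
            unsigned long hits;                     /* historical access frequency */
            struct splay_node *left, *right;
        } splay_node;

        /* Rotate child `c` above its parent `p`; `*slot` is the link that held `p`. */
        static void rotate_up(splay_node **slot, splay_node *p, splay_node *c)
        {
            if (p->left == c) { p->left = c->right; c->right = p; }
            else              { p->right = c->left; c->left  = p; }
            *slot = c;
        }

        /* Find `key`; on a hit, perform at most one local rotation at the found node. */
        splay_node *lazy_splay_find(splay_node **root, int key)
        {
            splay_node **pslot = NULL;              /* link that holds `parent`    */
            splay_node **cslot = root;              /* link that holds `cur`       */
            splay_node *parent = NULL, *cur = *root;

            while (cur != NULL && cur->key != key) {
                parent = cur;
                pslot  = cslot;
                cslot  = (key < cur->key) ? &cur->left : &cur->right;
                cur    = *cslot;
            }
            if (cur == NULL)
                return NULL;                        /* key not present             */

            cur->hits++;
            if (parent != NULL && cur->hits > parent->hits)
                rotate_up(pslot, parent, cur);      /* one rotation, then stop     */
            return cur;
        }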

    Concurrent Lazy Splay Tree with Relativistic Programming Approach

    A splay tree is a self-adjusting binary search tree in which recently accessed elements are quick to access again. The splay operation causes a sequential bottleneck at the root of the tree in a concurrent environment. Lazy splaying rotates the tree at most once per access, so that only very frequently accessed items undergo full splaying. We present an RCU (read-copy-update) based synchronization mechanism for splay tree operations which allows reads to occur concurrently with updates such as deletion and restructuring by splay rotation. This approach is generalized as relativistic programming: a programming technique for concurrent shared-memory architectures which tolerates different threads seeing events occur in different orders, so that events are not necessarily globally ordered but rather subject to constraints of per-thread ordering. The main idea of the algorithm is that update operations are carried out concurrently with traversals/reads. Each update is carried out so that new reads see the new state, while pre-existing reads are allowed to proceed on the old state; the update is then completed after all pre-existing reads have finished.
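
    The read/update split described above follows the general RCU pattern, sketched below with C11 atomics: readers take a snapshot with a single acquire load, while an updater publishes a modified copy and reclaims the old version only after a grace period (stubbed out here). This is a generic illustration, not the paper's splay-tree code.

        /* Generic RCU-style sketch (not the paper's splay-tree code): readers take
         * a snapshot with one acquire load and never block, while an updater
         * publishes a modified copy and reclaims the old version only after a
         * grace period, which is stubbed out here. */
        #include <stdatomic.h>
        #include <stdlib.h>

        typedef struct node {
            int key;
            int value;
        } node;

        static _Atomic(node *) published;        /* readers dereference this snapshot */

        /* Reader: pre-existing readers keep using whichever version they loaded. */
        int rcu_read_value(void)
        {
            node *snap = atomic_load_explicit(&published, memory_order_acquire);
            return snap ? snap->value : -1;
        }

        /* Stub: a real RCU runtime blocks here until every reader that might still
         * hold the old pointer has left its read-side critical section. */
        static void synchronize_rcu_stub(void) { /* grace-period wait elided */ }

        /* Updater: copy, modify the copy, publish, then free the old version only
         * after pre-existing reads are done -- new reads see the new state while
         * old reads proceed on the old one, as described in the abstract. */
        void rcu_update_value(int key, int value)
        {
            node *fresh = malloc(sizeof *fresh);
            if (fresh == NULL)
                return;
            fresh->key = key;
            fresh->value = value;
            node *old = atomic_exchange_explicit(&published, fresh,
                                                 memory_order_release);
            synchronize_rcu_stub();
            free(old);
        }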

    Easier Parallel Programming with Provably-Efficient Runtime Schedulers

    Over the past decade, processor manufacturers have pivoted from increasing uniprocessor performance to multicore architectures. However, utilizing this computational power has proved challenging for software developers. Many concurrency platforms and languages have emerged to address parallel programming challenges, yet writing correct and performant parallel code retains a reputation of being one of the hardest tasks a programmer can undertake. This dissertation studies how runtime scheduling systems can be used to make parallel programming easier. We address the difficulty of writing parallel data structures, automatically finding shared-memory bugs, and reproducing non-deterministic synchronization bugs. Each of the systems presented depends on a novel runtime system which provides strong theoretical performance guarantees and performs well in practice.