9 research outputs found

    Efficient Rebalancing of Chromatic Search Trees

    Get PDF
    In PODS'91, Nurmi and Soisalon-Soininen presented a new type of binary search tree for databases, which they call a chromatic tree. The aim is to improve runtime performance by allowing a greater degree of concurrency, which, in turn, is obtained by uncoupling updating from rebalancing. This also allows rebalancing to be postponed completely or partially until after peak working hours.The advantages of the proposal of Nurmi and Soisalon-Soininen are quite significant, but there are definite problems with it. First, they give no explicit upper bound on the complexity of their algorithm. Second, some of their rebalancing operations can be applied many more times than necessary. Third, some of their operations, when removing one problem, create another.We define a new set of rebalancing operations which we prove give rise to at most I_ log_2(N+1) _I - 1 $ rebalancing operations per insertion and at most I_ log_2 (N+1)_I - 2 rebalancing operations per deletion, where N is the maximum'size the tree could ever have, given its initial size and the number of insertions performed. Most of these rebalancing operations, in fact, do no restructuring; they simply move weights around. The number of operations which actually change the structure of the tree is at most one per update

    Efficient Rebalancing of Chromatic Search Trees

    Get PDF
    In PODS'91, Nurmi and Soisalon-Soininen presented a new type of binary search tree for databases, which they call a chromatic tree. The aim is to improve runtime performance by allowing a greater degree of concurrency, which, in turn, is obtained by uncoupling updating from rebalancing. This also allows rebalancing to be postponed completely or partially until after peak working hours.The advantages of the proposal of Nurmi and Soisalon-Soininen are quite significant, but there are definite problems with it. First, they give no explicit upper bound on the complexity of their algorithm. Second, some of their rebalancing operations can be applied many more times than necessary. Third, some of their operations, when removing one problem, create another.We define a new set of rebalancing operations which we prove give rise to at most I_ log_2(N+1) _I - 1 $ rebalancing operations per insertion and at most I_ log_2 (N+1)_I - 2 rebalancing operations per deletion, where N is the maximum'size the tree could ever have, given its initial size and the number of insertions performed. Most of these rebalancing operations, in fact, do no restructuring; they simply move weights around. The number of operations which actually change the structure of the tree is at most one per update

    High Level Efficiency in Database Languages

    Get PDF
    The subject of this Ph.D. thesis is the design and implementation of database languages. The thesis consists of five articles:  [1] Joan F. Boyar and Kim S. Larsen. Efficient Rebalancing of Chromatic Search Trees. In O. Nurmi and E. Ukkonen, eds., LNCS 621: Algorithm Theory -- SWAT'92 , pp. 151-164. Springer-Verlag, 1992. [2] Kim S. Larsen. On Aggregation and Computation on Domain Values. PB-414, Computer Science Department, Aarhus University, 1992. [3] Kim S. Larsen. Strategies for Expression Evaluation Using Sort-Merge Algorithms. PB-415, Computer Science Department, Aarhus University, 1992. [4] Kim S. Larsen and Michael I. Schwartzbach. Injectivity of Unary Queries With Computation on Domain Values. Computer Science Department, Aarhus University, 1992. Revised version of PB-311. [5] Kim S. Larsen, Michael I. Schwartzbach and Erik M. Schmidt. A New Formalism for Relational Algebra. IPL , 41(3):163-168, 1992. and this survey paper. In [5], a new query language design is proposed. The expressive power of the language is determined in [2] and all reasonable extensions are considered. In [3, 4], we focus on the optimization issue of avoiding unnecessary sorting of relations. The results in these papers are directly applicable to any algebra-based query language. In addition to the query language part, a database system also has to offer update facilities. The theory of standard tuple based updates is quite well developed in the sequential case. In [1], we discuss a new concurrent implementation of balanced search trees for that purpose.This survey paper describes the results of the papers which form the thesis, and relates these results to each other and to the area in a broader sense than is customary in the introductions of individual papers. The paper is intended to be read in combination with the papers on which it is based

    AVL Trees With Relaxed Balance

    Get PDF
    AVL trees with relaxed balance were introduced with the aim of improving runtime per formance by allowing a greater degree of concurrency. This is obtained by uncoupling updating from rebalancing. An additional benefit is that rebalancing can be controlled separately. In particular, it can be postponed completely or partially until after peak working hours.We define a new collection of rebalancing operations which allows for a significantly greater degree of concurrency than the original proposal. Additionally, in contrast to the original proposal, we prove the complexity of our algorithm.If N is the maximum size of the tree, we prove that each insertion gives rise to at most I_ log_Phi(N + 3/2) + log_Phi(squareroot{5}) - 3 _I rebalancing operations and that each deletion gives rise to at most I_ log_Phi(N + 3/2) + log_Phi(squareroot{5}) - 4 _I rebalancing operations, where Phi is the golden ratio

    A Pedagogically Sound yet Efficient Deletion algorithm for Red-Black Trees: The Parity-Seeking Delete Algorithm

    Full text link
    Red-black (RB) trees are one of the most efficient variants of balanced binary search trees. However, they have always been blamed for being too complicated, hard to explain, and not suitable for pedagogical purposes. Sedgewick (2008) proposed left-leaning red-black (LLRB) trees in which red links are restricted to left children, and proposed recursive concise insert and delete algorithms. However, the top-down deletion algorithm of LLRB is still very complicated and highly inefficient. In this paper, we first consider 2-3 red-black trees in which both children cannot be red. We propose a parity-seeking delete algorithm with the basic idea of making the deficient subtree on a par with its sibling: either by fixing the deficient subtree or by making the sibling deficient, as well, ascending deficiency to the parent node. This is the first pedagogically sound algorithm for the delete operation in red-black trees. Then, we amend our algorithm and propose a parity-seeking delete algorithm for classical RB trees. Our experiments show that, despite having more rotations, 2-3 RB trees are almost as efficient as RB trees and twice faster than LLRB trees. Besides, RB trees with the proposed parity-seeking delete algorithm have the same number of rotations and almost identical running time as the classic delete algorithm. While being extremely efficient, the proposed parity-seeking delete algorithm is easily understandable and suitable for pedagogical purposes

    The Case for Learned Index Structures

    Full text link
    Indexes are models: a B-Tree-Index can be seen as a model to map a key to the position of a record within a sorted array, a Hash-Index as a model to map a key to a position of a record within an unsorted array, and a BitMap-Index as a model to indicate if a data record exists or not. In this exploratory research paper, we start from this premise and posit that all existing index structures can be replaced with other types of models, including deep-learning models, which we term learned indexes. The key idea is that a model can learn the sort order or structure of lookup keys and use this signal to effectively predict the position or existence of records. We theoretically analyze under which conditions learned indexes outperform traditional index structures and describe the main challenges in designing learned index structures. Our initial results show, that by using neural nets we are able to outperform cache-optimized B-Trees by up to 70% in speed while saving an order-of-magnitude in memory over several real-world data sets. More importantly though, we believe that the idea of replacing core components of a data management system through learned models has far reaching implications for future systems designs and that this work just provides a glimpse of what might be possible

    CARMI: A Cache-Aware Learned Index with a Cost-based Construction Algorithm

    Full text link
    Learned indexes, which use machine learning models to replace traditional index structures, have shown promising results in recent studies. However, our understanding of this new type of index structure is still at an early stage with many details that need to be carefully examined and improved. In this paper, we propose a cache-aware learned index (CARMI) design to improve the efficiency of the Recursive Model Index (RMI) framework proposed by Kraska et al. and a cost-based construction algorithm to construct the optimal indexes in a wide variety of application scenarios. We formulate the problem of finding the optimal design of a learned index as an optimization problem and propose a dynamic programming algorithm for solving it and a partial greedy step to speed up. Experiments show that our index construction strategy can construct indexes with significantly better performance compared to baselines under various data distribution and workload requirements. Among them, CARMI can obtain an average of 2.52X speedup compared to B-tree, while using only about 0.56X memory space of B-tree on average.Comment: 16 pages, 15 figure

    O2-tree: a shared memory resident index in multicore architectures

    Get PDF
    Shared memory multicore computer architectures are now commonplace in computing. These can be found in modern desktops and workstation computers and also in High Performance Computing (HPC) systems. Recent advances in memory architecture and in 64-bit addressing, allow such systems to have memory sizes of the order of hundreds of gigabytes and beyond. This now allows for realistic development of main memory resident database systems. This still requires the use of a memory resident index such as T-Tree, and the B+-Tree for fast access to the data items. This thesis proposes a new indexing structure, called the O2-Tree, which is essentially an augmented Red-Black Tree in which the leaf nodes are index data blocks that store multiple pairs of key and value referred to as \key-value" pairs. The value is either the entire record associated with the key or a pointer to the location of the record. The internal nodes contain copies of the keys that split blocks of the leaf nodes in a manner similar to the B+-Tree. O2-Tree structure has the advantage that: it can be easily reconstructed by reading only the lowest value of the key of each leaf node page. The size is su ciently small and thus can be dumped and restored much faster. Analysis and comparative experimental study show that the performance of the O2-Tree is superior to other tree-based index structures with respect to various query operations for large datasets. We also present results which indicate that the O2-Tree outperforms popular key-value stores such as BerkelyDB and TreeDB of Kyoto Cabinet for various workloads. The thesis addresses various concurrent access techniques for the O2-Tree for shared memory multicore architecture and gives analysis of the O2-Tree with respect to query operations, storage utilization, failover and recovery

    Efficient Rebalancing of Chromatic Search Trees

    No full text
    corecore