Bulk updates and cache sensitivity in search trees
This thesis examines two topics related to binary search trees: cache-sensitive memory layouts and AVL-tree bulk-update operations. Bulk updates are also applied to adaptive sorting.
Cache-sensitive data structures are tailored to the hardware caches in modern computers. The thesis presents a method for adding cache-sensitivity to binary search trees without changing the rebalancing strategy. Cache-sensitivity is maintained using worst-case constant-time operations executed when the tree changes. The thesis presents experiments performed on AVL trees and red-black trees, including a comparison with cache-sensitive B-trees.
Next, the thesis examines bulk insertion and bulk deletion in AVL trees. Bulk insertion inserts several keys in one operation. The number of rotations used by AVL-tree bulk insertion is shown to be worst-case logarithmic in the number of inserted keys, if they go to the same location in the tree. Bulk deletion deletes an interval of keys. When amortized over a sequence of bulk deletions, each deletion requires a number of rotations that is logarithmic in the number of deleted keys. The search cost and total rebalancing complexity of inserting or deleting keys from several locations in the tree are also analyzed. Experiments show that the algorithms work efficiently with randomly generated input data.
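For context on what a "rotation" costs here, below is a minimal single-key AVL insertion in Python that counts the rotations it performs. This is the textbook algorithm, not the thesis's bulk-insertion or bulk-deletion procedure, and the `stats` counter is an illustrative addition of mine.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.height = key, None, None, 1

def h(n): return n.height if n else 0
def update(n): n.height = 1 + max(h(n.left), h(n.right))
def balance(n): return h(n.left) - h(n.right)

def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y
    update(y); update(x)
    return x

def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    update(x); update(y)
    return y

def insert(root, key, stats):
    """Standard recursive AVL insertion; counts rotations in stats."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key, stats)
    else:
        root.right = insert(root.right, key, stats)
    update(root)
    b = balance(root)
    if b > 1:                       # left-heavy
        if balance(root.left) < 0:  # left-right case: pre-rotate the child
            root.left = rotate_left(root.left); stats['rotations'] += 1
        root = rotate_right(root); stats['rotations'] += 1
    elif b < -1:                    # right-heavy, mirror cases
        if balance(root.right) > 0:
            root.right = rotate_right(root.right); stats['rotations'] += 1
        root = rotate_left(root); stats['rotations'] += 1
    return root
```

Inserting a sorted run key by key can trigger a rotation on nearly every insertion; the point of the bulk algorithms is to insert such a run as one operation using only logarithmically many rotations when the keys land in one location.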
Adaptive sorting algorithms are efficient when the input is nearly sorted according to some measure of presortedness. The thesis presents an AVL-tree-based variation of the adaptive sorting algorithm known as local insertion sort. Bulk insertion is applied by extracting consecutive ascending or descending keys from the input to be sorted. A variant that does not require a special bulk-insertion algorithm is also given. Experiments show that applying bulk insertion considerably reduces the number of comparisons and the time needed to sort nearly sorted sequences. The algorithms are also compared with various other adaptive and non-adaptive sorting algorithms.
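The run-extraction step mentioned above can be sketched on its own. The minimal Python function below splits the input into maximal ascending or descending runs, reversing descending runs so every run comes out ascending; the function name and run representation are mine, not the thesis's.

```python
def extract_runs(seq):
    """Split seq into maximal ascending or descending runs; descending
    runs are reversed so that every returned run is ascending."""
    runs, i, n = [], 0, len(seq)
    while i < n:
        j = i + 1
        if j < n and seq[j] < seq[i]:        # descending run starts here
            while j + 1 < n and seq[j + 1] < seq[j]:
                j += 1
            runs.append(list(reversed(seq[i:j + 1])))
        else:                                # ascending (or singleton) run
            while j + 1 < n and seq[j + 1] >= seq[j]:
                j += 1
            runs.append(seq[i:j + 1])
        i = j + 1
    return runs
```

Each extracted run is already sorted, so it can be handed to a bulk-insertion operation as a single batch; a nearly sorted input yields few, long runs and therefore few bulk operations.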
The splay-list: A distribution-adaptive concurrent skip-list
The design and implementation of efficient concurrent data structures have seen significant attention. However, most of this work has focused on concurrent data structures providing good worst-case guarantees. In real workloads, objects are often accessed at different rates, since access distributions may be non-uniform. Efficient distribution-adaptive data structures, such as splay trees, are known in the sequential case; however, they are often hard to translate efficiently to the concurrent setting.
In this paper, we investigate distribution-adaptive concurrent data structures and propose a new design called the splay-list. At a high level, the splay-list is similar to a standard skip-list, with the key distinction that the height of each element adapts dynamically to its access rate: popular elements "move up," whereas rarely accessed elements decrease in height. We show that the splay-list provides order-optimal amortized complexity bounds for a subset of operations while being amenable to efficient concurrent implementation. Experimental results show that the splay-list can leverage distribution-adaptivity to improve on the performance of classic concurrent designs, and can outperform the only previously known distribution-adaptive design in certain settings.
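To make the height-adaptation idea concrete, here is a toy sequential sketch in Python: a skip-list in which every element starts at height one and is promoted one level after a fixed number of accesses. The promotion threshold, the deterministic rule, and all names are illustrative assumptions of mine; the actual splay-list uses a different rebalancing rule and supports concurrent operation.

```python
class SLNode:
    def __init__(self, key, height):
        self.key = key
        self.count = 0               # access counter
        self.next = [None] * height  # one forward pointer per level

class AdaptiveSkipList:
    """Sequential sketch: a key is promoted one level after every
    PROMOTE_EVERY successful lookups (a toy stand-in for the
    splay-list's height-adaptation rule)."""
    PROMOTE_EVERY = 4
    MAX_HEIGHT = 16

    def __init__(self):
        self.head = SLNode(None, self.MAX_HEIGHT)  # sentinel, full height

    def _find(self, key):
        """Return, per level, the last node with key < search key."""
        preds = [None] * self.MAX_HEIGHT
        node = self.head
        for lvl in range(self.MAX_HEIGHT - 1, -1, -1):
            while node.next[lvl] and node.next[lvl].key < key:
                node = node.next[lvl]
            preds[lvl] = node
        return preds

    def insert(self, key):
        preds = self._find(key)
        cand = preds[0].next[0]
        if cand and cand.key == key:
            return                        # already present
        node = SLNode(key, 1)             # new keys start at the bottom
        node.next[0] = preds[0].next[0]
        preds[0].next[0] = node

    def contains(self, key):
        preds = self._find(key)
        node = preds[0].next[0]
        if node is None or node.key != key:
            return False
        node.count += 1
        if (node.count % self.PROMOTE_EVERY == 0
                and len(node.next) < self.MAX_HEIGHT):
            lvl = len(node.next)          # new top level for this node
            node.next.append(preds[lvl].next[lvl])
            preds[lvl].next[lvl] = node
        return True
```

Frequently accessed keys accumulate levels and are reached in fewer hops, mirroring how popular splay-list elements "move up."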
New Combinatorial Properties and Algorithms for AVL Trees
In this thesis, we discuss new properties of AVL trees and a new partitioning of binary search trees, the core partitioning scheme, which we apply to three kinds of binary search trees: AVL trees, weight-balanced trees, and plain binary search trees.
We introduce the core partitioning scheme, which maintains a balanced search tree as a dynamic collection of complete balanced binary trees called cores. Using this technique, we achieve the same theoretical efficiency as modern cache-oblivious data structures while using classic data structures such as weight-balanced trees or height-balanced trees (e.g., AVL trees). We preserve the original topology and algorithms of the given balanced search tree, using a simple post-processing step with guaranteed performance to completely rebuild the changed cores (possibly all of them) after each update. Using our core partitioning scheme, we simultaneously achieve good memory allocation, a space-efficient representation, and cache-obliviousness. We also apply this scheme to arbitrary binary search trees, which can be unbalanced, and produce a new data structure called the Cache-Oblivious General Balanced Tree (COG-tree).
Using our scheme, searching for a key requires O(log_B n) block transfers and O(log n) comparisons in both the external-memory and the cache-oblivious model. These complexities are theoretically efficient. Interestingly, the core partition for weight-balanced trees and for the COG-tree can be maintained with amortized O(log_B n) block transfers per update, whereas maintaining the core partition for AVL trees requires more than a poly-logarithmic amortized cost.
Studying the properties of these trees also led us to some other new properties of AVL trees and trees with bounded degree: we present and study gaps in AVL trees, and we prove Tarjan et al.'s conjecture on the number of rotations in a sequence of deletions and insertions.
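As a small illustration of the "core" notion (a complete balanced binary tree stored compactly), the sketch below lays sorted keys out in an implicit array layout and searches it. This is an expository stand-in under my own assumptions, not the thesis's core-maintenance algorithm, and the heap-style numbering is my choice rather than the layout the thesis uses for cache-obliviousness.

```python
def build_core(keys):
    """Lay out sorted keys as a complete BST in an array using implicit
    heap numbering: the children of slot i live at 2i+1 and 2i+2.
    An in-order walk over the slots consumes the sorted keys, which
    makes the array satisfy the BST ordering property."""
    n = len(keys)
    out = [None] * n
    it = iter(keys)
    def fill(i):
        if i < n:
            fill(2 * i + 1)       # fill left subtree first
            out[i] = next(it)     # then this slot, in sorted order
            fill(2 * i + 2)       # then the right subtree
    fill(0)
    return out

def core_search(core, key):
    """Standard BST search over the implicit array layout."""
    i = 0
    while i < len(core):
        if core[i] == key:
            return True
        i = 2 * i + 1 if key < core[i] else 2 * i + 2
    return False
```

Because parent and children are at fixed arithmetic offsets, no child pointers are stored, giving the kind of space-efficient, allocation-friendly representation a collection of cores aims for.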
Techniques for Constructing Efficient Lock-free Data Structures
Building a library of concurrent data structures is an essential way to
simplify the difficult task of developing concurrent software. Lock-free data
structures, in which processes can help one another to complete operations,
offer the following progress guarantee: If processes take infinitely many
steps, then infinitely many operations are performed. Handcrafted lock-free
data structures can be very efficient, but are notoriously difficult to
implement. We introduce numerous tools that support the development of
efficient lock-free data structures, and especially trees. (PhD thesis, University of Toronto, 2017.)
O2-tree: a shared memory resident index in multicore architectures
Shared memory multicore computer architectures are now commonplace in computing.
These can be found in modern desktops and workstation computers and also in High
Performance Computing (HPC) systems. Recent advances in memory architecture and
in 64-bit addressing, allow such systems to have memory sizes of the order of hundreds of
gigabytes and beyond. This now allows for realistic development of main memory resident
database systems. Such systems still require a memory-resident index, such as the T-Tree or the B+-Tree, for fast access to the data items.
This thesis proposes a new indexing structure, called the O2-Tree, which is essentially
an augmented Red-Black Tree in which the leaf nodes are index data blocks that store
multiple pairs of key and value, referred to as "key-value" pairs. The value is either the
entire record associated with the key or a pointer to the location of the record. The
internal nodes contain copies of the keys that split blocks of the leaf nodes in a manner
similar to the B+-Tree. The O2-Tree has the advantage that it can easily be reconstructed by reading only the lowest key of each leaf-node page. The resulting structure is sufficiently small that it can be dumped and restored much faster.
Analysis and comparative experimental study show that the performance of the O2-Tree
is superior to other tree-based index structures with respect to various query operations
for large datasets. We also present results which indicate that the O2-Tree outperforms
popular key-value stores such as BerkeleyDB and the TreeDB of Kyoto Cabinet for various
workloads. The thesis addresses various concurrent access techniques for the O2-Tree for
shared memory multicore architecture and gives analysis of the O2-Tree with respect to
query operations, storage utilization, failover, and recovery.
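The reconstruction property described above can be illustrated with a small sketch: given leaf pages holding sorted key-value pairs, the internal index is rebuilt from each page's lowest key alone, and lookups route through it. The page representation and function names are hypothetical, for illustration only.

```python
from bisect import bisect_right

def rebuild_index(leaf_pages):
    """Recover the internal index by reading only the lowest key of
    each leaf page (each page is a sorted list of (key, value) pairs)."""
    return [page[0][0] for page in leaf_pages]

def lookup(leaf_pages, index, key):
    """Route to the unique page whose key range could hold `key`,
    then scan that page."""
    slot = bisect_right(index, key) - 1   # last page whose min key <= key
    if slot < 0:
        return None                       # key smaller than every min key
    for k, v in leaf_pages[slot]:
        if k == key:
            return v
    return None
```

Because only one minimum key per page needs to be read back, the index can be rebuilt after a restart by a single pass over the page minima rather than over every record.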