8 research outputs found

    Cache-Oblivious Persistence

    Full text link
    Partial persistence is a general transformation that takes a data structure and allows queries to be executed on any past state of the structure. The cache-oblivious model is the leading model of a modern multi-level memory hierarchy.We present the first general transformation for making cache-oblivious model data structures partially persistent

    Succinct Indices for Range Queries with applications to Orthogonal Range Maxima

    Full text link
    We consider the problem of preprocessing NN points in 2D, each endowed with a priority, to answer the following queries: given a axis-parallel rectangle, determine the point with the largest priority in the rectangle. Using the ideas of the \emph{effective entropy} of range maxima queries and \emph{succinct indices} for range maxima queries, we obtain a structure that uses O(N) words and answers the above query in O(logNloglogN)O(\log N \log \log N) time. This is a direct improvement of Chazelle's result from FOCS 1985 for this problem -- Chazelle required O(N/ϵ)O(N/\epsilon) words to answer queries in O((logN)1+ϵ)O((\log N)^{1+\epsilon}) time for any constant ϵ>0\epsilon > 0.Comment: To appear in ICALP 201

    External-memory search trees with fast insertions

    Get PDF
    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006.Includes bibliographical references (p. 65-68).This thesis provides both experimental and theoretical contributions regarding external-memory dynamic search trees with fast insertions. The first contribution is the implementation of the buffered repository B-tree, a data structure that provably outperforms B-trees for updates at the cost of a constant factor decrease in query performance. This thesis also describes the cache-oblivious lookahead array, which outperforms B-trees for updates at a logarithmic cost in query performance, and does so without knowing the cache parameters of the system it is being run on. The buffered repository B-tree is an external-memory search tree that can be tuned for a tradeoff between queries and updates. Specifically, for any E [1/ lg B, 1] this data structure achieves O((1/EBl-E)(1 + logB(N/B))) block transfers for INSERT and DELETE and 0((/(1 + logB(N/B))) block transfers for SEARCH. The update complexity is amortized and is O((1/e)(1 + logB(N/B))) in the worst case. Using the value = 1/2, I was able to achieve a 17 times increase in insertion performance at the cost of only a 3 times decrease in search performance on a database with 12-byte items on a disk with a 4-kilobyte block size.(cont.) This thesis also shows how to build a cache-oblivious data structure, the cache-oblivious lookahead array, which achieves the same bounds as the buffered repository B'-tree in the case where e = 1/ lg B. Specifically, it achieves an update complexity of O((1/B) log(N/B)) and a query complexity of O(log(N/B)) block transfers. This is the first data structure to achieve these bounds cache-obliviously. The research involving the cache-oblivious lookahead array represents joint work with Michael A. Bender, Jeremy Fineman, and Bradley C. Kuszmaul.by Jelani Nelson.M.Eng

    New Combinatorial Properties and Algorithms for AVL Trees

    Get PDF
    In this thesis, new properties of AVL trees and a new partitioning of binary search trees named core partitioning scheme are discussed, this scheme is applied to three binary search trees namely AVL trees, weight-balanced trees, and plain binary search trees. We introduce the core partitioning scheme, which maintains a balanced search tree as a dynamic collection of complete balanced binary trees called cores. Using this technique we achieve the same theoretical efficiency of modern cache-oblivious data structures by using classic data structures such as weight-balanced trees or height balanced trees (e.g. AVL trees). We preserve the original topology and algorithms of the given balanced search tree using a simple post-processing with guaranteed performance to completely rebuild the changed cores (possibly all of them) after each update. Using our core partitioning scheme, we simultaneously achieve good memory allocation, space-efficient representation, and cache-obliviousness. We also apply this scheme to arbitrary binary search trees which can be unbalanced and we produce a new data structure, called Cache-Oblivious General Balanced Tree (COG-tree). Using our scheme, searching a key requires O(log_B n) block transfers and O(log n) comparisons in the external-memory and in the cache-oblivious model. These complexities are theoretically efficient. Interestingly, the core partition for weight-balanced trees and COG-tree can be maintained with amortized O(log_B n) block transfers per update, whereas maintaining the core partition for AVL trees requires more than a poly-logarithmic amortized cost. Studying the properties of these trees also lead us to some other new properties of AVL trees and trees with bounded degree, namely, we present and study gaps in AVL trees and we prove Tarjan et al.'s conjecture on the number of rotations in a sequence of deletions and insertions

    Algorithms incorporating concurrency and caching

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009.Cataloged from PDF version of thesis.Includes bibliographical references (p. 189-203).This thesis describes provably good algorithms for modern large-scale computer systems, including today's multicores. Designing efficient algorithms for these systems involves overcoming many challenges, including concurrency (dealing with parallel accesses to the same data) and caching (achieving good memory performance.) This thesis includes two parallel algorithms that focus on testing for atomicity violations in a parallel fork-join program. These algorithms augment a parallel program with a data structure that answers queries about the program's structure, on the fly. Specifically, one data structure, called SP-ordered-bags, maintains the series-parallel relationships among threads, which is vital for uncovering race conditions (bugs) in the program. Another data structure, called XConflict, aids in detecting conflicts in a transactional-memory system with nested parallel transactions. For a program with work T and span To, maintaining either data structure adds an overhead of PT, to the running time of the parallel program when executed on P processors using an efficient scheduler, yielding a total runtime of O(T1/P + PTo). For each of these data structures, queries can be answered in 0(1) time. This thesis also introduces the compressed sparse rows (CSB) storage format for sparse matrices, which allows both Ax and ATx to be computed efficiently in parallel, where A is an n x n sparse matrix with nnz > n nonzeros and x is a dense n-vector. The parallel multiplication algorithm uses e(nnz) work and ... span, yielding a parallelism of ... , which is amply high for virtually any large matrix.(cont.) Also addressing concurrency, this thesis considers two scheduling problems. The first scheduling problem, motivated by transactional memory, considers randomized backoff when jobs have different lengths. I give an analysis showing that binary exponential backoff achieves makespan V2e(6v 1- i ) with high probability, where V is the total length of all n contending jobs. This bound is significantly larger than when jobs are all the same size. A variant of exponential backoff, however, achieves makespan of ... with high probability. I also present the size-hashed backoff protocol, specifically designed for jobs having different lengths, that achieves makespan ... with high probability. The second scheduling problem considers scheduling n unit-length jobs on m unrelated machines, where each job may fail probabilistically. Specifically, an input consists of a set of n jobs, a directed acyclic graph G describing the precedence constraints among jobs, and a failure probability qij for each job j and machine i. The goal is to find a schedule that minimizes the expected makespan. I give an O(log log(min {m, n}))-approximation for the case of independent jobs (when there are no precedence constraints) and an O(log(n + m) log log(min {m, n}))-approximation algorithm when precedence constraints form disjoint chains. This chain algorithm can be extended into one that supports precedence constraints that are trees, which worsens the approximation by another log(n) factor. To address caching, this thesis includes several new variants of cache-oblivious dynamic dictionaries.(cont.) A cache-oblivious dictionary fills the same niche as a classic B-tree, but it does so without tuning for particular memory parameters. Thus, cache-oblivious dictionaries optimize for all levels of a multilevel hierarchy and are more portable than traditional B-trees. I describe how to add concurrency to several previously existing cache-oblivious dictionaries. I also describe two new data structures that achieve significantly cheaper insertions with a small overhead on searches. The cache-oblivious lookahead array (COLA) supports insertions/deletions and searches in O((1/B) log N) and O(log N) memory transfers, respectively, where B is the block size, M is the memory size, and N is the number of elements in the data structure. The xDict supports these operations in O((1/1B E1-) logB(N/M)) and O((1/)0logB(N/M)) memory transfers, respectively, where 0 < E < 1 is a tunable parameter. Also on caching, this thesis answers the question: what is the worst possible page-replacement strategy? The goal of this whimsical chapter is to devise an online strategy that achieves the highest possible fraction of page faults / cache misses as compared to the worst offline strategy. I show that there is no deterministic strategy that is competitive with the worst offline. I also give a randomized strategy based on the most recently used heuristic and show that it is the worst possible pagereplacement policy. On a more serious note, I also show that direct mapping is, in some sense, a worst possible page-replacement policy. Finally, this thesis includes a new algorithm, following a new approach, for the problem of maintaining a topological ordering of a dag as edges are dynamically inserted.(cont.) The main result included here is an O(n2 log n) algorithm for maintaining a topological ordering in the presence of up to m < n(n - 1)/2 edge insertions. In contrast, the previously best algorithm has a total running time of O(min { m3/ 2, n5/2 }). Although these algorithms are not parallel and do not exhibit particularly good locality, some of the data structural techniques employed in my solution are similar to others in this thesis.by Jeremy T. Fineman.Ph.D
    corecore