55,080 research outputs found

    New Combinatorial Properties and Algorithms for AVL Trees

    Get PDF
    In this thesis, new properties of AVL trees and a new partitioning of binary search trees named core partitioning scheme are discussed, this scheme is applied to three binary search trees namely AVL trees, weight-balanced trees, and plain binary search trees. We introduce the core partitioning scheme, which maintains a balanced search tree as a dynamic collection of complete balanced binary trees called cores. Using this technique we achieve the same theoretical efficiency of modern cache-oblivious data structures by using classic data structures such as weight-balanced trees or height balanced trees (e.g. AVL trees). We preserve the original topology and algorithms of the given balanced search tree using a simple post-processing with guaranteed performance to completely rebuild the changed cores (possibly all of them) after each update. Using our core partitioning scheme, we simultaneously achieve good memory allocation, space-efficient representation, and cache-obliviousness. We also apply this scheme to arbitrary binary search trees which can be unbalanced and we produce a new data structure, called Cache-Oblivious General Balanced Tree (COG-tree). Using our scheme, searching a key requires O(log_B n) block transfers and O(log n) comparisons in the external-memory and in the cache-oblivious model. These complexities are theoretically efficient. Interestingly, the core partition for weight-balanced trees and COG-tree can be maintained with amortized O(log_B n) block transfers per update, whereas maintaining the core partition for AVL trees requires more than a poly-logarithmic amortized cost. Studying the properties of these trees also lead us to some other new properties of AVL trees and trees with bounded degree, namely, we present and study gaps in AVL trees and we prove Tarjan et al.'s conjecture on the number of rotations in a sequence of deletions and insertions

    Pattern matching in Lempel-Ziv compressed strings: fast, simple, and deterministic

    Full text link
    Countless variants of the Lempel-Ziv compression are widely used in many real-life applications. This paper is concerned with a natural modification of the classical pattern matching problem inspired by the popularity of such compression methods: given an uncompressed pattern s[1..m] and a Lempel-Ziv representation of a string t[1..N], does s occur in t? Farach and Thorup gave a randomized O(nlog^2(N/n)+m) time solution for this problem, where n is the size of the compressed representation of t. We improve their result by developing a faster and fully deterministic O(nlog(N/n)+m) time algorithm with the same space complexity. Note that for highly compressible texts, log(N/n) might be of order n, so for such inputs the improvement is very significant. A (tiny) fragment of our method can be used to give an asymptotically optimal solution for the substring hashing problem considered by Farach and Muthukrishnan.Comment: submitte

    Random Access to Grammar Compressed Strings

    Full text link
    Grammar based compression, where one replaces a long string by a small context-free grammar that generates the string, is a simple and powerful paradigm that captures many popular compression schemes. In this paper, we present a novel grammar representation that allows efficient random access to any character or substring without decompressing the string. Let SS be a string of length NN compressed into a context-free grammar S\mathcal{S} of size nn. We present two representations of S\mathcal{S} achieving O(logN)O(\log N) random access time, and either O(nαk(n))O(n\cdot \alpha_k(n)) construction time and space on the pointer machine model, or O(n)O(n) construction time and space on the RAM. Here, αk(n)\alpha_k(n) is the inverse of the kthk^{th} row of Ackermann's function. Our representations also efficiently support decompression of any substring in SS: we can decompress any substring of length mm in the same complexity as a single random access query and additional O(m)O(m) time. Combining these results with fast algorithms for uncompressed approximate string matching leads to several efficient algorithms for approximate string matching on grammar-compressed strings without decompression. For instance, we can find all approximate occurrences of a pattern PP with at most kk errors in time O(n(min{Pk,k4+P}+logN)+occ)O(n(\min\{|P|k, k^4 + |P|\} + \log N) + occ), where occocc is the number of occurrences of PP in SS. Finally, we generalize our results to navigation and other operations on grammar-compressed ordered trees. All of the above bounds significantly improve the currently best known results. To achieve these bounds, we introduce several new techniques and data structures of independent interest, including a predecessor data structure, two "biased" weighted ancestor data structures, and a compact representation of heavy paths in grammars.Comment: Preliminary version in SODA 201

    Managing Unbounded-Length Keys in Comparison-Driven Data Structures with Applications to On-Line Indexing

    Full text link
    This paper presents a general technique for optimally transforming any dynamic data structure that operates on atomic and indivisible keys by constant-time comparisons, into a data structure that handles unbounded-length keys whose comparison cost is not a constant. Examples of these keys are strings, multi-dimensional points, multiple-precision numbers, multi-key data (e.g.~records), XML paths, URL addresses, etc. The technique is more general than what has been done in previous work as no particular exploitation of the underlying structure of is required. The only requirement is that the insertion of a key must identify its predecessor or its successor. Using the proposed technique, online suffix tree can be constructed in worst case time O(logn)O(\log n) per input symbol (as opposed to amortized O(logn)O(\log n) time per symbol, achieved by previously known algorithms). To our knowledge, our algorithm is the first that achieves O(logn)O(\log n) worst case time per input symbol. Searching for a pattern of length mm in the resulting suffix tree takes O(min(mlogΣ,m+logn)+tocc)O(\min(m\log |\Sigma|, m + \log n) + tocc) time, where tocctocc is the number of occurrences of the pattern. The paper also describes more applications and show how to obtain alternative methods for dealing with suffix sorting, dynamic lowest common ancestors and order maintenance

    The pp-Center Problem in Tree Networks Revisited

    Get PDF
    We present two improved algorithms for weighted discrete pp-center problem for tree networks with nn vertices. One of our proposed algorithms runs in O(nlogn+plog2nlog(n/p))O(n \log n + p \log^2 n \log(n/p)) time. For all values of pp, our algorithm thus runs as fast as or faster than the most efficient O(nlog2n)O(n\log^2 n) time algorithm obtained by applying Cole's speed-up technique [cole1987] to the algorithm due to Megiddo and Tamir [megiddo1983], which has remained unchallenged for nearly 30 years. Our other algorithm, which is more practical, runs in O(nlogn+p2log2(n/p))O(n \log n + p^2 \log^2(n/p)) time, and when p=O(n)p=O(\sqrt{n}) it is faster than Megiddo and Tamir's O(nlog2nloglogn)O(n \log^2n \log\log n) time algorithm [megiddo1983]

    Linial arrangements and local binary search trees

    Full text link
    We study the set of NBC sets (no broken circuit sets) of the Linial arrangement and deduce a constructive bijection to the set of local binary search trees. We then generalize this construction to two families of Linial type arrangements for which the bijections are with some kk-ary labelled trees that we introduce for this purpose.Comment: 13 pages, 1 figure. arXiv admin note: text overlap with arXiv:1403.257
    corecore