32,812 research outputs found

    Heaviest Induced Ancestors and Longest Common Substrings

    Full text link
    Suppose we have two trees on the same set of leaves, in which nodes are weighted such that children are heavier than their parents. We say a node from the first tree and a node from the second tree are induced together if they have a common leaf descendant. In this paper we describe data structures that efficiently support the following heaviest-induced-ancestor query: given a node from the first tree and a node from the second tree, find an induced pair of their ancestors with maximum combined weight. Our solutions are based on a geometric interpretation that enables us to find heaviest induced ancestors using range queries. We then show how to use these results to build an LZ-compressed index with which we can quickly find with high probability a longest substring common to the indexed string and a given pattern

    Near-optimal labeling schemes for nearest common ancestors

    Full text link
    We consider NCA labeling schemes: given a rooted tree TT, label the nodes of TT with binary strings such that, given the labels of any two nodes, one can determine, by looking only at the labels, the label of their nearest common ancestor. For trees with nn nodes we present upper and lower bounds establishing that labels of size (2±ϵ)logn(2\pm \epsilon)\log n, ϵ<1\epsilon<1 are both sufficient and necessary. (All logarithms in this paper are in base 2.) Alstrup, Bille, and Rauhe (SIDMA'05) showed that ancestor and NCA labeling schemes have labels of size logn+Ω(loglogn)\log n +\Omega(\log \log n). Our lower bound increases this to logn+Ω(logn)\log n + \Omega(\log n) for NCA labeling schemes. Since Fraigniaud and Korman (STOC'10) established that labels in ancestor labeling schemes have size logn+Θ(loglogn)\log n +\Theta(\log \log n), our new lower bound separates ancestor and NCA labeling schemes. Our upper bound improves the 10logn10 \log n upper bound by Alstrup, Gavoille, Kaplan and Rauhe (TOCS'04), and our theoretical result even outperforms some recent experimental studies by Fischer (ESA'09) where variants of the same NCA labeling scheme are shown to all have labels of size approximately 8logn8 \log n

    Almost-Tight Distributed Minimum Cut Algorithms

    Full text link
    We study the problem of computing the minimum cut in a weighted distributed message-passing networks (the CONGEST model). Let λ\lambda be the minimum cut, nn be the number of nodes in the network, and DD be the network diameter. Our algorithm can compute λ\lambda exactly in O((nlogn+D)λ4log2n)O((\sqrt{n} \log^{*} n+D)\lambda^4 \log^2 n) time. To the best of our knowledge, this is the first paper that explicitly studies computing the exact minimum cut in the distributed setting. Previously, non-trivial sublinear time algorithms for this problem are known only for unweighted graphs when λ3\lambda\leq 3 due to Pritchard and Thurimella's O(D)O(D)-time and O(D+n1/2logn)O(D+n^{1/2}\log^* n)-time algorithms for computing 22-edge-connected and 33-edge-connected components. By using the edge sampling technique of Karger's, we can convert this algorithm into a (1+ϵ)(1+\epsilon)-approximation O((nlogn+D)ϵ5log3n)O((\sqrt{n}\log^{*} n+D)\epsilon^{-5}\log^3 n)-time algorithm for any ϵ>0\epsilon>0. This improves over the previous (2+ϵ)(2+\epsilon)-approximation O((nlogn+D)ϵ5log2nloglogn)O((\sqrt{n}\log^{*} n+D)\epsilon^{-5}\log^2 n\log\log n)-time algorithm and O(ϵ1)O(\epsilon^{-1})-approximation O(D+n12+ϵpolylogn)O(D+n^{\frac{1}{2}+\epsilon} \mathrm{poly}\log n)-time algorithm of Ghaffari and Kuhn. Due to the lower bound of Ω(D+n1/2/logn)\Omega(D+n^{1/2}/\log n) by Das Sarma et al. which holds for any approximation algorithm, this running time is tight up to a polylogn \mathrm{poly}\log n factor. To get the stated running time, we developed an approximation algorithm which combines the ideas of Thorup's algorithm and Matula's contraction algorithm. It saves an ϵ9log7n\epsilon^{-9}\log^{7} n factor as compared to applying Thorup's tree packing theorem directly. Then, we combine Kutten and Peleg's tree partitioning algorithm and Karger's dynamic programming to achieve an efficient distributed algorithm that finds the minimum cut when we are given a spanning tree that crosses the minimum cut exactly once

    Managing Unbounded-Length Keys in Comparison-Driven Data Structures with Applications to On-Line Indexing

    Full text link
    This paper presents a general technique for optimally transforming any dynamic data structure that operates on atomic and indivisible keys by constant-time comparisons, into a data structure that handles unbounded-length keys whose comparison cost is not a constant. Examples of these keys are strings, multi-dimensional points, multiple-precision numbers, multi-key data (e.g.~records), XML paths, URL addresses, etc. The technique is more general than what has been done in previous work as no particular exploitation of the underlying structure of is required. The only requirement is that the insertion of a key must identify its predecessor or its successor. Using the proposed technique, online suffix tree can be constructed in worst case time O(logn)O(\log n) per input symbol (as opposed to amortized O(logn)O(\log n) time per symbol, achieved by previously known algorithms). To our knowledge, our algorithm is the first that achieves O(logn)O(\log n) worst case time per input symbol. Searching for a pattern of length mm in the resulting suffix tree takes O(min(mlogΣ,m+logn)+tocc)O(\min(m\log |\Sigma|, m + \log n) + tocc) time, where tocctocc is the number of occurrences of the pattern. The paper also describes more applications and show how to obtain alternative methods for dealing with suffix sorting, dynamic lowest common ancestors and order maintenance
    corecore