    Efficient LZ78 factorization of grammar compressed text

    We present an efficient algorithm for computing the LZ78 factorization of a text, where the text is represented as a straight line program (SLP), which is a context free grammar in the Chomsky normal form that generates a single string. Given an SLP of size $n$ representing a text $S$ of length $N$, our algorithm computes the LZ78 factorization of $S$ in $O(n\sqrt{N}+m\log N)$ time and $O(n\sqrt{N}+m)$ space, where $m$ is the number of resulting LZ78 factors. We also show how to improve the algorithm so that the $n\sqrt{N}$ term in the time and space complexities becomes either $nL$, where $L$ is the length of the longest LZ78 factor, or $(N - \alpha)$, where $\alpha \geq 0$ is a quantity which depends on the amount of redundancy that the SLP captures with respect to substrings of $S$ of a certain length. Since $m = O(N/\log_\sigma N)$ where $\sigma$ is the alphabet size, the latter is asymptotically at least as fast as a linear time algorithm which runs on the uncompressed string when $\sigma$ is constant, and can be more efficient when the text is compressible, i.e. when $m$ and $n$ are small. Comment: SPIRE 201
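
    For orientation, the following is a minimal sketch of LZ78 factorization on an uncompressed string, using a trie of previously seen factors. It is not the SLP-based algorithm of the abstract; it only illustrates what the $m$ factors are. The function name `lz78_factorize` and the list-of-dicts trie layout are assumptions made for this illustration.

```python
def lz78_factorize(text):
    """Return the LZ78 factors of `text` as a list of strings.

    Baseline on the uncompressed string (illustrative only, not the
    SLP-based method described in the abstract).
    """
    trie = [{}]        # trie[i] maps a character to a child node id; node 0 is the root
    factors = []
    node, start = 0, 0  # current trie node and start of the current factor
    for i, ch in enumerate(text):
        child = trie[node].get(ch)
        if child is not None:
            node = child  # still inside a previously seen factor
            continue
        # New factor: longest previous factor followed by one fresh character.
        trie[node][ch] = len(trie)
        trie.append({})
        factors.append(text[start:i + 1])
        node, start = 0, i + 1
    if start < len(text):
        # The trailing factor may coincide with an earlier factor.
        factors.append(text[start:])
    return factors


print(lz78_factorize("abababab"))  # ['a', 'b', 'ab', 'aba', 'b']
```

    Each factor is the longest previously created factor extended by one character, so the concatenation of the factors always reproduces the input.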

    New Algorithms for Position Heaps

    We present several results about position heaps, a relatively new alternative to suffix trees and suffix arrays. First, we show that, if we limit the maximum length of patterns to be sought, then we can also limit the height of the heap and reduce the worst-case cost of insertions and deletions. Second, we show how to build a position heap in linear time independent of the size of the alphabet. Third, we show how to augment a position heap such that it supports access to the corresponding suffix array, and vice versa. Fourth, we introduce a variant of a position heap that can be simulated efficiently by a compressed suffix array with a linear number of extra bits.
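
    As background, a position heap can be sketched with a naive quadratic-time construction: suffixes are inserted from shortest to longest, and each insertion adds one node below the longest prefix already present in the heap. This is not one of the algorithms of the abstract. The names `Node`, `build_position_heap` and `find_occurrences`, and the convention of an empty root holding no position, are assumptions made for this illustration.

```python
class Node:
    def __init__(self, pos=None):
        self.pos = pos       # 0-based text position stored here; None for the root
        self.children = {}   # character -> child Node


def build_position_heap(text):
    """Naive construction: insert suffixes from shortest to longest."""
    root = Node()
    for i in range(len(text) - 1, -1, -1):
        node, j = root, i
        # Follow the longest prefix of text[i:] already present as a path.
        while j < len(text) and text[j] in node.children:
            node = node.children[text[j]]
            j += 1
        # The heap depth is smaller than the suffix length, so j < len(text).
        node.children[text[j]] = Node(i)
    return root


def find_occurrences(root, text, pattern):
    """Return all start positions of `pattern` in `text`, sorted."""
    hits = []
    node, depth = root, 0
    for ch in pattern:
        nxt = node.children.get(ch)
        if nxt is None:
            break
        node, depth = nxt, depth + 1
        # Nodes strictly above the end of the pattern are only candidates.
        if depth < len(pattern) and text.startswith(pattern, node.pos):
            hits.append(node.pos)
    if depth == len(pattern):
        # Every position in the subtree of the final node is an occurrence.
        stack = [node]
        while stack:
            v = stack.pop()
            if v.pos is not None:
                hits.append(v.pos)
            stack.extend(v.children.values())
    return sorted(hits)


text = "abaab"
heap = build_position_heap(text)
print(find_occurrences(heap, text, "ab"))  # [0, 3]
```

    The search verifies positions on the path against the text and reports positions below the final node without verification, which is why the heap must be stored together with the text.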

    Near-optimal labeling schemes for nearest common ancestors

    We consider NCA labeling schemes: given a rooted tree $T$, label the nodes of $T$ with binary strings such that, given the labels of any two nodes, one can determine, by looking only at the labels, the label of their nearest common ancestor. For trees with $n$ nodes we present upper and lower bounds establishing that labels of size $(2\pm \epsilon)\log n$, $\epsilon<1$, are both sufficient and necessary. (All logarithms in this paper are in base 2.) Alstrup, Bille, and Rauhe (SIDMA'05) showed that ancestor and NCA labeling schemes have labels of size $\log n +\Omega(\log \log n)$. Our lower bound increases this to $\log n + \Omega(\log n)$ for NCA labeling schemes. Since Fraigniaud and Korman (STOC'10) established that labels in ancestor labeling schemes have size $\log n +\Theta(\log \log n)$, our new lower bound separates ancestor and NCA labeling schemes. Our upper bound improves the $10 \log n$ upper bound by Alstrup, Gavoille, Kaplan and Rauhe (TOCS'04), and our theoretical result even outperforms some recent experimental studies by Fischer (ESA'09) where variants of the same NCA labeling scheme are shown to all have labels of size approximately $8 \log n$.
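
    To make the definition concrete, here is a deliberately naive NCA labeling scheme: each node is labeled with the sequence of child indices on its root-to-node path, and the label of the nearest common ancestor is the longest common prefix of the two labels. Its labels can be far longer than the $(2\pm \epsilon)\log n$ bits of the abstract, and it is not the scheme of the paper. Tuples of integers stand in for binary strings, and the names `label_tree` and `nca_label` are assumptions made for this illustration.

```python
def label_tree(children, root):
    """Label every node with the child indices along its root-to-node path.

    `children` maps a node to the list of its children (leaves may be absent).
    """
    labels = {root: ()}
    stack = [root]
    while stack:
        v = stack.pop()
        for idx, c in enumerate(children.get(v, [])):
            labels[c] = labels[v] + (idx,)
            stack.append(c)
    return labels


def nca_label(a, b):
    """Compute the NCA's label from two labels alone (longest common prefix)."""
    k = 0
    while k < min(len(a), len(b)) and a[k] == b[k]:
        k += 1
    return a[:k]


# Hypothetical example tree: 0 is the root, 1 and 2 its children, 3 and 4 below 1.
children = {0: [1, 2], 1: [3, 4]}
labels = label_tree(children, 0)
print(nca_label(labels[3], labels[4]) == labels[1])  # True
```

    The query uses only the two labels and never consults the tree, which is the defining property of a labeling scheme; the difficulty addressed in the abstract is achieving this with short labels.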

    On positive realness of descriptor systems

    In this brief, the positive realness of descriptor systems is studied. For the continuous-time case, two positive real lemmas are given, based on a generalized algebraic Riccati equation and inequality respectively. For the discrete-time case, the positive real lemma is given in terms of a generalized algebraic Riccati inequality.

    Dynamic microsets for RAMs

    We generalize the concept of Gabow and Tarjan's microsets [3] to dynamic microsets. The dynamic microsets allow the use of direct addressing in a RAM for incremental problems that can be partitioned into subproblems which are merged from time to time. A sequence of operations on dynamic microsets can be performed in O(n + m α(m, n)) time, where n is the size of the problem and m is the number of queries.

    Dynamic Planar Embeddings of Dynamic Graphs

    We present an algorithm to support the dynamic embedding in the plane of a dynamic graph. An edge can be inserted across a face between two vertices on the face boundary (we call such a vertex pair linkable), and edges can be deleted. The planar embedding can also be changed locally by flipping components that are connected to the rest of the graph by at most two vertices. Given vertices $u,v$, linkable$(u,v)$ decides whether $u$ and $v$ are linkable in the current embedding, and if so, returns a list of suggestions for the placement of $(u,v)$ in the embedding. For non-linkable vertices $u,v$, we define a new query, one-flip-linkable$(u,v)$, providing a suggestion for a flip that will make them linkable if one exists. We support all updates and queries in $O(\log^2 n)$ time. Our time bounds match those of Italiano et al. for a static (flipless) embedding of a dynamic graph. Our new algorithm is simpler, exploiting that the complement of a spanning tree of a connected plane graph is a spanning tree of the dual graph. The primal and dual trees are interpreted as having the same Euler tour, and a main idea of the new algorithm is an elegant interaction between top trees over the two trees via their common Euler tour. Comment: Announced at STACS'1

    The Suffix Tree of a Tree and Minimizing Sequential Transducers

    This paper gives a linear-time algorithm for the construction of the suffix tree of a tree. The suffix tree of a tree is used to obtain an efficient algorithm for the minimization of sequential transducers.

    Fully-online Construction of Suffix Trees for Multiple Texts

    We consider fully-online construction of indexing data structures for multiple texts. Let T = {T_1, ..., T_K} be a collection of texts. By fully-online, we mean that a new character can be appended to any text in T at any time. This is a natural generalization of semi-online construction of indexing data structures for multiple texts, in which, once a new character is appended to the kth text T_k, the previous texts T_1, ..., T_{k-1} remain static. Our fully-online scenario arises when we maintain dynamic indexes for multi-sensor data. Let N and sigma denote the total length of texts in T and the alphabet size, respectively. We first show that the algorithm by Blumer et al. [Theoretical Computer Science, 40:31-55, 1985] to construct the directed acyclic word graph (DAWG) for T can readily be extended to our fully-online setting, retaining O(N log sigma)-time and O(N)-space complexities. Then, we give a sophisticated fully-online algorithm which constructs the suffix tree for T in O(N log sigma) time and O(N) space. A key idea of this algorithm is synchronized maintenance of the DAWG and the suffix tree.