4,020 research outputs found

    Random Access to Grammar Compressed Strings

    Full text link
    Grammar based compression, where one replaces a long string by a small context-free grammar that generates the string, is a simple and powerful paradigm that captures many popular compression schemes. In this paper, we present a novel grammar representation that allows efficient random access to any character or substring without decompressing the string. Let SS be a string of length NN compressed into a context-free grammar S\mathcal{S} of size nn. We present two representations of S\mathcal{S} achieving O(logN)O(\log N) random access time, and either O(nαk(n))O(n\cdot \alpha_k(n)) construction time and space on the pointer machine model, or O(n)O(n) construction time and space on the RAM. Here, αk(n)\alpha_k(n) is the inverse of the kthk^{th} row of Ackermann's function. Our representations also efficiently support decompression of any substring in SS: we can decompress any substring of length mm in the same complexity as a single random access query and additional O(m)O(m) time. Combining these results with fast algorithms for uncompressed approximate string matching leads to several efficient algorithms for approximate string matching on grammar-compressed strings without decompression. For instance, we can find all approximate occurrences of a pattern PP with at most kk errors in time O(n(min{Pk,k4+P}+logN)+occ)O(n(\min\{|P|k, k^4 + |P|\} + \log N) + occ), where occocc is the number of occurrences of PP in SS. Finally, we generalize our results to navigation and other operations on grammar-compressed ordered trees. All of the above bounds significantly improve the currently best known results. To achieve these bounds, we introduce several new techniques and data structures of independent interest, including a predecessor data structure, two "biased" weighted ancestor data structures, and a compact representation of heavy paths in grammars.Comment: Preliminary version in SODA 201

    Efficient LZ78 factorization of grammar compressed text

    Full text link
    We present an efficient algorithm for computing the LZ78 factorization of a text, where the text is represented as a straight line program (SLP), which is a context free grammar in the Chomsky normal form that generates a single string. Given an SLP of size nn representing a text SS of length NN, our algorithm computes the LZ78 factorization of TT in O(nN+mlogN)O(n\sqrt{N}+m\log N) time and O(nN+m)O(n\sqrt{N}+m) space, where mm is the number of resulting LZ78 factors. We also show how to improve the algorithm so that the nNn\sqrt{N} term in the time and space complexities becomes either nLnL, where LL is the length of the longest LZ78 factor, or (Nα)(N - \alpha) where α0\alpha \geq 0 is a quantity which depends on the amount of redundancy that the SLP captures with respect to substrings of SS of a certain length. Since m=O(N/logσN)m = O(N/\log_\sigma N) where σ\sigma is the alphabet size, the latter is asymptotically at least as fast as a linear time algorithm which runs on the uncompressed string when σ\sigma is constant, and can be more efficient when the text is compressible, i.e. when mm and nn are small.Comment: SPIRE 201

    Satisfiability Games for Branching-Time Logics

    Full text link
    The satisfiability problem for branching-time temporal logics like CTL*, CTL and CTL+ has important applications in program specification and verification. Their computational complexities are known: CTL* and CTL+ are complete for doubly exponential time, CTL is complete for single exponential time. Some decision procedures for these logics are known; they use tree automata, tableaux or axiom systems. In this paper we present a uniform game-theoretic framework for the satisfiability problem of these branching-time temporal logics. We define satisfiability games for the full branching-time temporal logic CTL* using a high-level definition of winning condition that captures the essence of well-foundedness of least fixpoint unfoldings. These winning conditions form formal languages of \omega-words. We analyse which kinds of deterministic {\omega}-automata are needed in which case in order to recognise these languages. We then obtain a reduction to the problem of solving parity or B\"uchi games. The worst-case complexity of the obtained algorithms matches the known lower bounds for these logics. This approach provides a uniform, yet complexity-theoretically optimal treatment of satisfiability for branching-time temporal logics. It separates the use of temporal logic machinery from the use of automata thus preserving a syntactical relationship between the input formula and the object that represents satisfiability, i.e. a winning strategy in a parity or B\"uchi game. The games presented here work on a Fischer-Ladner closure of the input formula only. Last but not least, the games presented here come with an attempt at providing tool support for the satisfiability problem of complex branching-time logics like CTL* and CTL+
    corecore