463 research outputs found

    Practical Evaluation of Lempel-Ziv-78 and Lempel-Ziv-Welch Tries

    Full text link
    We present the first thorough practical study of the Lempel-Ziv-78 and the Lempel-Ziv-Welch computation based on trie data structures. With a careful selection of trie representations we can beat well-tuned popular trie data structures like Judy, m-Bonsai or Cedar

    Computing LZ77 in Run-Compressed Space

    Get PDF
    In this paper, we show that the LZ77 factorization of a text T {\in\Sigma^n} can be computed in O(R log n) bits of working space and O(n log R) time, R being the number of runs in the Burrows-Wheeler transform of T reversed. For extremely repetitive inputs, the working space can be as low as O(log n) bits: exponentially smaller than the text itself. As a direct consequence of our result, we show that a class of repetition-aware self-indexes based on a combination of run-length encoded BWT and LZ77 can be built in asymptotically optimal O(R + z) words of working space, z being the size of the LZ77 parsing

    Computing Lempel-Ziv Factorization Online

    Full text link
    We present an algorithm which computes the Lempel-Ziv factorization of a word WW of length nn on an alphabet Σ\Sigma of size σ\sigma online in the following sense: it reads WW starting from the left, and, after reading each r=O(logσn)r = O(\log_{\sigma} n) characters of WW, updates the Lempel-Ziv factorization. The algorithm requires O(nlogσ)O(n \log \sigma) bits of space and O(n \log^2 n) time. The basis of the algorithm is a sparse suffix tree combined with wavelet trees

    Fast online Lempel-Ziv factorization in compressed space

    Get PDF
    Let T be a text of length n on an alphabet \u3a3 of size \u3c3, and let H0 be the zero-order empirical entropy of T. We show that the LZ77 factorization of T can be computed in nH0+o(nlog\u3c3)+O(\u3c3logn) bits of working space with an online algorithm running in O(nlogn) time. Previous space-efficient online solutions either work in compact space and O(nlogn) time, or in succinct space and O(nlog3n) time
    corecore