2 research outputs found

    Online Grammar Compression for Frequent Pattern Discovery

    Various grammar compression algorithms have been proposed in the last decade. A grammar compression is a restricted CFG deriving the string deterministically. An efficient grammar compression develops a smaller CFG by finding duplicated patterns and removing them. This process is just a frequent pattern discovery by grammatical inference. While we can get any frequent pattern in linear time using a preprocessed string, a huge working space is required for longer patterns, and the whole string must be loaded into memory beforehand. We propose an online algorithm approximating this problem within a compressed space. The main contribution is an improvement of the previously best known approximation ratio $\Omega(\frac{1}{\lg^2 m})$ to $\Omega(\frac{1}{\lg^* N \lg m})$, where $m$ is the length of an optimal pattern in a string of length $N$ and $\lg^*$ is the iterated logarithm base $2$. For a sufficiently large $N$, $\lg^* N$ is practically constant. The experimental results show that our algorithm extracts nearly optimal patterns and achieves a significant improvement in memory consumption compared to the offline algorithm.
    Comment: 14 pages
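To make the "practically constant" remark about the iterated logarithm concrete, here is a minimal Python sketch. The function name `log_star` is an illustrative choice and does not come from the paper; it simply counts how many times the base-2 logarithm must be applied before the value drops to 1 or below.

```python
import math


def log_star(n: float) -> int:
    """Iterated logarithm base 2 (hypothetical helper, not from the paper).

    Returns how many times lg must be applied to n until the result is <= 1.
    """
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count


if __name__ == "__main__":
    # Even for astronomically large inputs the value stays in single digits.
    for n in (2, 16, 65536, 10**80):
        print(n, log_star(n))
```

For any string length that fits in memory, the value stays in the low single digits, which is why the $\lg^* N$ factor in the improved ratio can be treated as a constant in practice.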

    Deterministic Sparse Suffix Sorting in the Restore Model

    Given a text $T$ of length $n$, we propose a deterministic online algorithm computing the sparse suffix array and the sparse longest common prefix array of $T$ in $O(c \sqrt{\lg n} + m \lg m \lg n \lg^* n)$ time with $O(m)$ words of space under the premise that the space of $T$ is rewritable, where $m \le n$ is the number of suffixes to be sorted (provided online and arbitrarily), and $c$ is the number of characters with $m \le c \le n$ that must be compared for distinguishing the designated suffixes.
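As a point of reference for the two output structures named in this abstract, the sketch below computes a sparse suffix array and a sparse LCP array by brute force. It is a naive baseline under stated assumptions, not the paper's algorithm: it copies suffixes while sorting, so it meets neither the $O(m)$-word space bound nor the stated running time, and it takes all positions up front rather than online. The function names, and the choice to store one LCP value per adjacent pair, are illustrative.

```python
from typing import List, Sequence


def sparse_suffix_array(text: str, positions: Sequence[int]) -> List[int]:
    """Sort the chosen starting positions by the suffixes that begin there.

    Naive: each comparison key is a full suffix copy (not space-efficient).
    """
    return sorted(positions, key=lambda i: text[i:])


def lcp(text: str, i: int, j: int) -> int:
    """Length of the longest common prefix of text[i:] and text[j:]."""
    k = 0
    while i + k < len(text) and j + k < len(text) and text[i + k] == text[j + k]:
        k += 1
    return k


def sparse_lcp_array(text: str, ssa: Sequence[int]) -> List[int]:
    """LCP of each pair of neighbouring suffixes in sparse-suffix-array order."""
    return [lcp(text, a, b) for a, b in zip(ssa, ssa[1:])]


if __name__ == "__main__":
    text = "banana"
    ssa = sparse_suffix_array(text, [0, 2, 4])  # suffixes: banana, nana, na
    print(ssa)                                  # [0, 4, 2]
    print(sparse_lcp_array(text, ssa))          # [0, 2]
```

In this example, $m = 3$ suffixes of a text of length $n = 6$ are sorted. The quantity $c$ in the abstract corresponds to the characters that actually need to be inspected to tell the chosen suffixes apart.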