7,004 research outputs found

    Improved Approximate String Matching and Regular Expression Matching on Ziv-Lempel Compressed Texts

    Full text link
    We study the approximate string matching and regular expression matching problem for the case when the text to be searched is compressed with the Ziv-Lempel adaptive dictionary compression schemes. We present a time-space trade-off that leads to algorithms improving the previously known complexities for both problems. In particular, we significantly improve the space bounds, which in practical applications are likely to be a bottleneck

    Fast and Compact Regular Expression Matching

    Get PDF
    We study 4 problems in string matching, namely, regular expression matching, approximate regular expression matching, string edit distance, and subsequence indexing, on a standard word RAM model of computation that allows logarithmic-sized words to be manipulated in constant time. We show how to improve the space and/or remove a dependency on the alphabet size for each problem using either an improved tabulation technique of an existing algorithm or by combining known algorithms in a new way

    Faster subsequence recognition in compressed strings

    Full text link
    Computation on compressed strings is one of the key approaches to processing massive data sets. We consider local subsequence recognition problems on strings compressed by straight-line programs (SLP), which is closely related to Lempel--Ziv compression. For an SLP-compressed text of length mˉ\bar m, and an uncompressed pattern of length nn, C{\'e}gielski et al. gave an algorithm for local subsequence recognition running in time O(mˉn2logn)O(\bar mn^2 \log n). We improve the running time to O(mˉn1.5)O(\bar mn^{1.5}). Our algorithm can also be used to compute the longest common subsequence between a compressed text and an uncompressed pattern in time O(mˉn1.5)O(\bar mn^{1.5}); the same problem with a compressed pattern is known to be NP-hard
    corecore