Skip to main content
Article thumbnail
Location of Repository

Pattern matching in Lempel-Ziv compressed strings: fast, simple, and deterministic

By Pawel Gawrychowski

Abstract

Countless variants of the Lempel-Ziv compression are widely used in many real-life applications. This paper is concerned with a natural modification of the classical pattern matching problem inspired by the popularity of such compression methods: given an uncompressed pattern s[1..m] and a Lempel-Ziv representation of a string t[1..N], does s occur in t? Farach and Thorup gave a randomized O(nlog^2(N/n)+m) time solution for this problem, where n is the size of the compressed representation of t. We improve their result by developing a faster and fully deterministic O(nlog(N/n)+m) time algorithm with the same space complexity. Note that for highly compressible texts, log(N/n) might be of order n, so for such inputs the improvement is very significant. A (tiny) fragment of our method can be used to give an asymptotically optimal solution for the substring hashing problem considered by Farach and Muthukrishnan.Comment: submitte

Topics: Computer Science - Data Structures and Algorithms
Year: 2011
OAI identifier: oai:arXiv.org:1104.4203
Download PDF:
Sorry, we are unable to provide the full text but you may find it at the following location(s):
  • http://arxiv.org/abs/1104.4203 (external link)
  • Suggested articles


    To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.