LZW Based Compressed Pattern Matching

By Tao Tao and Amar Mukherjee

Abstract

Compressed pattern matching is an emerging research area that addresses the following problem: given a file in compressed format and a pattern, report the occurrence(s) of the pattern in the file with minimal (or no) decompression. In this paper, we report our work on compressed pattern matching in LZW compressed files. The work is based on Amir’s well-known “almost-optimal” algorithm [1], which we extend to find not only the first occurrence of the pattern but all of its occurrences. Further improvements include multi-pattern matching and a faster implementation for so-called “simple patterns”, where a simple pattern is defined as “a pattern with no symbol appearing more than once”. Extensive experiments have been conducted to test the search performance and to compare it not only with the “decompress-then-search” approach but also with the best available compressed pattern matching algorithms, particularly the BWT-based algorithms [2, 3]. The results show that our method is competitive with the best algorithms.
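
To make the terms in the abstract concrete, the following is a minimal sketch of the “decompress-then-search” baseline and of the “simple pattern” test. It is not the authors' algorithm and not Amir's algorithm [1]: it uses textbook LZW over single-byte characters, and all function names (lzw_compress, lzw_decompress, decompress_then_search, is_simple_pattern) are hypothetical names chosen for illustration.

```python
def lzw_compress(text: str) -> list[int]:
    """Textbook LZW: emit the code of the longest dictionary match at each step.

    Assumption: every character of `text` has a code point below 256.
    """
    dictionary = {chr(i): i for i in range(256)}
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate
        else:
            codes.append(dictionary[current])
            dictionary[candidate] = len(dictionary)  # next free code
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decompress(codes: list[int]) -> str:
    """Rebuild the text while reconstructing the same dictionary on the fly."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    previous = dictionary[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:
            # The code refers to the entry being defined right now.
            entry = previous + previous[0]
        out.append(entry)
        dictionary[len(dictionary)] = previous + entry[0]
        previous = entry
    return "".join(out)


def decompress_then_search(codes: list[int], pattern: str) -> list[int]:
    """Baseline: fully decompress, then report every (overlapping) occurrence."""
    if not pattern:
        return []
    text = lzw_decompress(codes)
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def is_simple_pattern(pattern: str) -> bool:
    """A simple pattern has no symbol appearing more than once."""
    return len(set(pattern)) == len(pattern)
```

For example, decompress_then_search(lzw_compress("abababab abc"), "ab") reports all five start positions, and is_simple_pattern("abc") holds while is_simple_pattern("abca") does not. A compressed pattern matching algorithm aims to produce the same positions while working on the code sequence directly, without materializing the full text.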

Year: 2009
OAI identifier: oai:CiteSeerX.psu:10.1.1.135.6456
Provided by: CiteSeerX
Download PDF:
Sorry, we are unable to provide the full text but you may find it at the following location(s):
  • http://citeseerx.ist.psu.edu/v... (external link)
  • http://csdl.computer.org/comp/... (external link)