    The effect of flexible parsing for dynamic dictionary-based data compression

    The effect of flexible parsing for dynamic dictionary based data compression

    We report on the performance evaluation of greedy parsing with a single-step lookahead, denoted as flexible parsing. We also introduce a new fingerprint-based data structure which enables an efficient, linear-time implementation.

    The effect of flexible parsing for dynamic dictionary based data compression

    We report on the performance evaluation of greedy parsing with a single-step lookahead (which we call flexible parsing, or FP) as an alternative to the commonly used greedy parsing scheme (with no lookahead). Greedy parsing is the basis of most popular compression programs, including UNIX compress and gzip; however, it usually results in far from optimal parsing/compression with regard to the dictionary construction scheme in use. Flexible parsing, by contrast, is optimal [MS99], i.e., it partitions any given input into the smallest number of phrases possible, for dictionary construction schemes which satisfy the prefix property throughout their execution. We focus on the application of FP in the context of the LZW variant of the Lempel-Ziv'78 dictionary construction method [Wel84, ZL78], which is of considerable practical interest. We implement two compression algorithms which use (1) FP with the LZW dictionary (LZW-FP), and (2) FP with an alternative flexible dictionary (FPA, as introduced in [Hor95]). Our implementations are based on novel on-line data structures enabling us to use linear time and space. We test our implementations on a collection of input sequences which includes textual files, DNA sequences, medical images, and pseudorandom binary files, and compare our results with two of the most popular compression programs, UNIX compress and gzip. Our results demonstrate that flexible parsing is especially useful for non-textual data, on which it improves over the compression rates of compress and gzip by up to 20% and 35%, respectively.
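The difference between greedy parsing and parsing with a one-step lookahead can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the dictionary is a fixed, prefix-closed Python set rather than a dynamically growing LZW dictionary, the function names (`longest_match`, `greedy_parse`, `flexible_parse`) are hypothetical, and matching is done by naive substring lookup instead of the linear-time data structures the abstract mentions.

```python
def longest_match(text, i, dictionary):
    """Length of the longest dictionary phrase starting at text[i].

    Assumes the dictionary is prefix-closed, so the scan can stop at the
    first prefix that is not a phrase.
    """
    best = 0
    for length in range(1, len(text) - i + 1):
        if text[i:i + length] in dictionary:
            best = length
        else:
            break
    return best


def greedy_parse(text, dictionary):
    """Always take the longest phrase available at the current position."""
    phrases, i = [], 0
    while i < len(text):
        n = longest_match(text, i, dictionary)
        if n == 0:
            raise ValueError(f"no phrase matches at position {i}")
        phrases.append(text[i:i + n])
        i += n
    return phrases


def flexible_parse(text, dictionary):
    """One-step lookahead: pick the phrase whose successor reaches farthest."""
    phrases, i = [], 0
    while i < len(text):
        longest = longest_match(text, i, dictionary)
        if longest == 0:
            raise ValueError(f"no phrase matches at position {i}")
        best_len, best_reach = 0, -1
        # Every prefix of the longest match is a phrase (prefix property).
        for length in range(1, longest + 1):
            nxt = i + length
            reach = nxt + longest_match(text, nxt, dictionary)
            if reach >= best_reach:  # ties go to the longer current phrase
                best_len, best_reach = length, reach
        phrases.append(text[i:i + best_len])
        i += best_len
    return phrases


# Hypothetical prefix-closed dictionary chosen for illustration only.
DICTIONARY = {"a", "b", "c", "ab", "bc", "bcc"}
print(greedy_parse("abcc", DICTIONARY))    # ['ab', 'c', 'c']
print(flexible_parse("abcc", DICTIONARY))  # ['a', 'bcc']
```

On this toy input the greedy parser commits to "ab" and then needs two more phrases, while the lookahead parser gives up one character at the first step so that the next phrase can cover the rest of the input.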

    The Effect of Flexible Parsing for Dynamic Dictionary Based Data Compression

    We report on the performance evaluation of greedy parsing with a single step lookahead, denoted as flexible parsing. We also introduce a new fingerprint based data structure which enables efficient, linear time implementation.

    1 Introduction

    The most common compression algorithms are based on maintaining a dynamic dictionary of strings that are called phrases, and replacing substrings of an input text with pointers to identical phrases in the dictionary. Dictionary based compression algorithms of particular interest are the LZ78 method [ZL78], its LZW variant [Wel84], and the LZ77 method [ZL77] which are all asymptotically optimal for a wide range of sources. Given a dictionary construction scheme, there is more than one way to parse the input, i.e., choose which substrings in the input text will be replaced by respective codewords. Almost all dynamic dictionary based algorithms in the literature use

    Part of this work was presented in Workshop on Algorithmic Engineering, Saarbruck..
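As a rough illustration of the dynamic-dictionary scheme described in the introduction, the following sketch shows a textbook LZW coder with plain greedy parsing. It is not the LZW-FP implementation evaluated in the paper; the function names, the 256-entry byte alphabet, and the unbounded dictionary size are assumptions made for the example.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Greedy LZW: emit the code of the longest known phrase, then extend."""
    dictionary = {bytes([c]): c for c in range(256)}  # single bytes first
    next_code = 256
    phrase = b""
    codes = []
    for value in data:
        candidate = phrase + bytes([value])
        if candidate in dictionary:
            phrase = candidate  # keep extending the current match
        else:
            codes.append(dictionary[phrase])
            dictionary[candidate] = next_code  # new phrase = match + 1 byte
            next_code += 1
            phrase = bytes([value])
    if phrase:
        codes.append(dictionary[phrase])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the same dictionary on the fly from the code stream."""
    if not codes:
        return b""
    dictionary = {c: bytes([c]) for c in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # The phrase being defined right now: previous + its first byte.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code {code}")
        output.append(entry)
        dictionary[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(output)


codes = lzw_compress(b"ABABABA")
print(codes)                  # [65, 66, 256, 258]
print(lzw_decompress(codes))  # b'ABABABA'
```

Each new phrase is an existing phrase extended by one byte, so every prefix of a dictionary phrase is itself in the dictionary. The decoder depends on the encoder parsing greedily; a coder that chooses phrases differently would need a decoder that reconstructs the dictionary accordingly.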