
    On empirical cumulant generating functions of code lengths for individual sequences

    We consider the problem of lossless compression of individual sequences using finite-state (FS) machines, from the perspective of the best achievable empirical cumulant generating function (CGF) of the code length, i.e., the normalized logarithm of the empirical average of the exponentiated code length. Since the probabilistic CGF is minimized in terms of the Rényi entropy of the source, one of the motivations of this study is to derive an individual-sequence analogue of the Rényi entropy, in the same way that the FS compressibility is the individual-sequence counterpart of the Shannon entropy. We consider the CGF of the code length both from the perspective of fixed-to-variable (F-V) length coding and from the perspective of variable-to-variable (V-V) length coding, where the latter turns out to yield a better result that coincides with the FS compressibility. We also extend our results to compression with side information, available at both the encoder and decoder. In this case, the V-V version no longer coincides with the FS compressibility, but results in a different complexity measure. Comment: 15 pages; submitted for publication
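The verbal definition of the empirical CGF in the abstract above can be made concrete with a small sketch. It is only an illustration and is not taken from the paper: the function name `empirical_cgf`, the base-2 exponent, and the normalization by `lam * block_len` are assumptions chosen so that the result reads as bits per source symbol.

```python
import math


def empirical_cgf(code_lengths, lam, block_len):
    """Normalized log of the empirical average of 2**(lam * L).

    code_lengths: code length in bits of each parsed block (hypothetical input).
    lam:          positive exponent parameter (assumed name).
    block_len:    number of source symbols per block (assumed normalization).
    """
    if lam <= 0 or block_len <= 0 or not code_lengths:
        raise ValueError("need lam > 0, block_len > 0 and at least one length")

    # Work in the log domain (base-2 log-sum-exp) to avoid overflow
    # when lam * L is large.
    exponents = [lam * length for length in code_lengths]
    peak = max(exponents)
    mean_shifted = sum(2.0 ** (e - peak) for e in exponents) / len(exponents)
    log2_mean = peak + math.log2(mean_shifted)

    # Normalize by lam and by the block length (an assumption of this sketch).
    return log2_mean / (lam * block_len)


# Usage: two blocks of 4 symbols, coded with 4 and 12 bits.
print(empirical_cgf([4, 12], lam=1.0, block_len=4))
```

With equal code lengths the value reduces to the ordinary rate, and as `lam` approaches zero it approaches the average code length per symbol, so larger `lam` penalizes long codewords more heavily.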

    Weighted universal image compression

    We describe a general coding strategy leading to a family of universal image compression systems designed to give good performance in applications where the statistics of the source to be compressed are not available at design time or vary over time or space. The basic approach considered uses a two-stage structure in which the single source code of traditional image compression systems is replaced with a family of codes designed to cover a large class of possible sources. To illustrate this approach, we consider the optimal design and use of two-stage codes containing collections of vector quantizers (weighted universal vector quantization), bit allocations for JPEG-style coding (weighted universal bit allocation), and transform codes (weighted universal transform coding). Further, we demonstrate the benefits to be gained from the inclusion of perceptual distortion measures and optimal parsing. The strategy yields two-stage codes that significantly outperform their single-stage predecessors. On a sequence of medical images, weighted universal vector quantization outperforms entropy coded vector quantization by over 9 dB. On the same data sequence, weighted universal bit allocation outperforms a JPEG-style code by over 2.5 dB. On a collection of mixed text and image data, weighted universal transform coding outperforms a single, data-optimized transform code (which gives performance almost identical to that of JPEG) by over 6 dB.
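The two-stage structure described in the abstract above (first choose one code from a family, then encode the block with it) can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's design algorithm: the codebooks are hypothetical scalar quantizers, selection uses squared error only, and the names `quantize`, `two_stage_encode`, and `two_stage_decode` are invented for the example.

```python
def quantize(block, codebook):
    """Map each sample to its nearest codeword.

    Returns the codeword indices and the total squared error.
    """
    indices, err = [], 0.0
    for x in block:
        i = min(range(len(codebook)), key=lambda k: (x - codebook[k]) ** 2)
        indices.append(i)
        err += (x - codebook[i]) ** 2
    return indices, err


def two_stage_encode(block, codebooks):
    """Stage 1: pick the codebook with the least distortion for this block.
    Stage 2: encode the block with that codebook.

    Returns (codebook_index, codeword_indices, squared_error).
    """
    best = None
    for c, codebook in enumerate(codebooks):
        indices, err = quantize(block, codebook)
        if best is None or err < best[2]:
            best = (c, indices, err)
    return best


def two_stage_decode(codebook_index, indices, codebooks):
    """Look up the chosen codebook, then the codewords within it."""
    return [codebooks[codebook_index][i] for i in indices]


# Usage: a family of two hypothetical codebooks covering different sources.
codebooks = [[0.0, 1.0], [10.0, 20.0]]
c, idx, err = two_stage_encode([9.0, 21.0], codebooks)
print(c, idx, err, two_stage_decode(c, idx, codebooks))
```

A fuller treatment would also account for the bits spent on the codebook index and on the codewords when choosing among codes; the sketch omits rate entirely to keep the structure visible.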