653 research outputs found

    Prefix Codes for Power Laws with Countable Support

    Full text link
    In prefix coding over an infinite alphabet, methods that consider specific distributions generally consider those that decline more quickly than a power law (e.g., Golomb coding). Particular power-law distributions, however, model many random variables encountered in practice. For such random variables, compression performance is judged via estimates of expected bits per input symbol. This correspondence introduces a family of prefix codes with an eye towards near-optimal coding of known distributions. Compression performance is precisely estimated for well-known probability distributions using these codes and using previously known prefix codes. One application of these near-optimal codes is an improved representation of rational numbers.Comment: 5 pages, 2 tables, submitted to Transactions on Information Theor

    Lossless and near-lossless source coding for multiple access networks

    Get PDF
    A multiple access source code (MASC) is a source code designed for the following network configuration: a pair of correlated information sequences {X-i}(i=1)(infinity), and {Y-i}(i=1)(infinity) is drawn independent and identically distributed (i.i.d.) according to joint probability mass function (p.m.f.) p(x, y); the encoder for each source operates without knowledge of the other source; the decoder jointly decodes the encoded bit streams from both sources. The work of Slepian and Wolf describes all rates achievable by MASCs of infinite coding dimension (n --> infinity) and asymptotically negligible error probabilities (P-e((n)) --> 0). In this paper, we consider the properties of optimal instantaneous MASCs with finite coding dimension (n 0) performance. The interest in near-lossless codes is inspired by the discontinuity in the limiting rate region at P-e((n)) = 0 and the resulting performance benefits achievable by using near-lossless MASCs as entropy codes within lossy MASCs. Our central results include generalizations of Huffman and arithmetic codes to the MASC framework for arbitrary p(x, y), n, and P-e((n)) and polynomial-time design algorithms that approximate these optimal solutions

    A high-speed distortionless predictive image-compression scheme

    Get PDF
    A high-speed distortionless predictive image-compression scheme that is based on differential pulse code modulation output modeling combined with efficient source-code design is introduced. Experimental results show that this scheme achieves compression that is very close to the difference entropy of the source

    Optimal prefix codes for pairs of geometrically-distributed random variables

    Full text link
    Optimal prefix codes are studied for pairs of independent, integer-valued symbols emitted by a source with a geometric probability distribution of parameter qq, 0<q<10{<}q{<}1. By encoding pairs of symbols, it is possible to reduce the redundancy penalty of symbol-by-symbol encoding, while preserving the simplicity of the encoding and decoding procedures typical of Golomb codes and their variants. It is shown that optimal codes for these so-called two-dimensional geometric distributions are \emph{singular}, in the sense that a prefix code that is optimal for one value of the parameter qq cannot be optimal for any other value of qq. This is in sharp contrast to the one-dimensional case, where codes are optimal for positive-length intervals of the parameter qq. Thus, in the two-dimensional case, it is infeasible to give a compact characterization of optimal codes for all values of the parameter qq, as was done in the one-dimensional case. Instead, optimal codes are characterized for a discrete sequence of values of qq that provide good coverage of the unit interval. Specifically, optimal prefix codes are described for q=2−1/kq=2^{-1/k} (k≄1k\ge 1), covering the range q≄1/2q\ge 1/2, and q=2−kq=2^{-k} (k>1k>1), covering the range q<1/2q<1/2. The described codes produce the expected reduction in redundancy with respect to the one-dimensional case, while maintaining low complexity coding operations.Comment: To appear in IEEE Transactions on Information Theor

    More Efficient Algorithms and Analyses for Unequal Letter Cost Prefix-Free Coding

    Full text link
    There is a large literature devoted to the problem of finding an optimal (min-cost) prefix-free code with an unequal letter-cost encoding alphabet of size. While there is no known polynomial time algorithm for solving it optimally there are many good heuristics that all provide additive errors to optimal. The additive error in these algorithms usually depends linearly upon the largest encoding letter size. This paper was motivated by the problem of finding optimal codes when the encoding alphabet is infinite. Because the largest letter cost is infinite, the previous analyses could give infinite error bounds. We provide a new algorithm that works with infinite encoding alphabets. When restricted to the finite alphabet case, our algorithm often provides better error bounds than the best previous ones known.Comment: 29 pages;9 figures

    The Rightmost Equal-Cost Position Problem

    Full text link
    LZ77-based compression schemes compress the input text by replacing factors in the text with an encoded reference to a previous occurrence formed by the couple (length, offset). For a given factor, the smallest is the offset, the smallest is the resulting compression ratio. This is optimally achieved by using the rightmost occurrence of a factor in the previous text. Given a cost function, for instance the minimum number of bits used to represent an integer, we define the Rightmost Equal-Cost Position (REP) problem as the problem of finding one of the occurrences of a factor which cost is equal to the cost of the rightmost one. We present the Multi-Layer Suffix Tree data structure that, for a text of length n, at any time i, it provides REP(LPF) in constant time, where LPF is the longest previous factor, i.e. the greedy phrase, a reference to the list of REP({set of prefixes of LPF}) in constant time and REP(p) in time O(|p| log log n) for any given pattern p
    • 

    corecore