
    Universal lossless source coding with the Burrows Wheeler transform

    The Burrows Wheeler transform (1994) is a reversible sequence transformation used in a variety of practical lossless source-coding algorithms. In each, the BWT is followed by a lossless source code that attempts to exploit the natural ordering of the BWT coefficients. BWT-based compression schemes are widely touted as low-complexity algorithms giving lossless coding rates better than those of the Ziv-Lempel codes (commonly known as LZ'77 and LZ'78) and almost as good as those achieved by prediction by partial matching (PPM) algorithms. To date, the coding performance claims have been made primarily on the basis of experimental results. This work gives a theoretical evaluation of BWT-based coding. The main results of this theoretical evaluation include: (1) statistical characterizations of the BWT output on both finite strings and sequences of length n → ∞, (2) a variety of very simple new techniques for BWT-based lossless source coding, and (3) proofs of the universality and bounds on the rates of convergence of both new and existing BWT-based codes for finite-memory and stationary ergodic sources. The end result is a theoretical justification and validation of the experimentally derived conclusions: BWT-based lossless source codes achieve universal lossless coding performance that converges to the optimal coding performance more quickly than the rate of convergence observed in Ziv-Lempel style codes and, for some BWT-based codes, within a constant factor of the optimal rate of convergence for finite-memory sources.
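
To make the "reversible sequence transformation" above concrete, here is a minimal sketch of the transform and its inverse. It uses the naive sorted-rotations construction, not any of the coding techniques analyzed in the paper. The function names `bwt` and `inverse_bwt` and the `$` end-of-string sentinel are assumptions chosen for illustration.

```python
def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel  # the sentinel marks the original end of the string
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column, sentinel="$"):
    """Invert the transform by repeatedly prepending the last column and sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The original string is the rotation that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


print(bwt("banana"))               # annb$aa
print(inverse_bwt(bwt("banana")))  # banana
```

The output groups equal symbols that share a following context, which is the ordering a subsequent lossless code tries to exploit. This version costs roughly O(n^2 log n) time; practical implementations usually build the transform from a suffix array instead.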

    Asymptotic Optimality of Antidictionary Codes

    An antidictionary code is a lossless compression algorithm that uses an antidictionary, which is the set of minimal words that do not occur as substrings of an input string. The code was proposed by Crochemore et al. in 2000, and its asymptotic optimality has so far been proved only for one specific information source, the balanced binary source, which is a binary Markov source in which each state transition occurs with probability 1/2 or 1. In this paper, we prove the optimality of both static and dynamic antidictionary codes with respect to a stationary ergodic Markov source on a finite alphabet such that a state transition occurs with probability $p$ ($0 < p \leq 1$). Comment: 5 pages, to appear in the proceedings of 2010 IEEE International Symposium on Information Theory (ISIT2010)
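
The definition of an antidictionary above can be illustrated with a brute-force sketch. It relies on the standard characterization of a minimal absent word: a word that does not occur in the string although both its longest proper prefix and its longest proper suffix do. The function name `antidictionary` and the default binary alphabet are assumptions made for illustration; this is not the construction used by the codes in the paper, which rely on far more efficient data structures.

```python
def antidictionary(text, alphabet="01"):
    """Return the minimal absent words of `text`, shortest first (brute force)."""
    n = len(text)
    # All substrings of the text, including the empty word.
    factors = {text[i:j] for i in range(n) for j in range(i, n + 1)}
    minimal_absent = set()
    for prefix in factors:           # the longest proper prefix must occur
        for symbol in alphabet:
            word = prefix + symbol
            # The word itself must be absent and its longest proper suffix present.
            if word not in factors and word[1:] in factors:
                minimal_absent.add(word)
    return sorted(minimal_absent, key=lambda w: (len(w), w))


print(antidictionary("01101"))  # ['00', '010', '111', '1011']
```

A word such as `100` is also absent from `01101`, but it is not minimal because its suffix `00` is already absent, so it is left out of the set.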

    Lossless and near-lossless source coding for multiple access networks

    A multiple access source code (MASC) is a source code designed for the following network configuration: a pair of correlated information sequences $\{X_i\}_{i=1}^{\infty}$ and $\{Y_i\}_{i=1}^{\infty}$ is drawn independent and identically distributed (i.i.d.) according to joint probability mass function (p.m.f.) $p(x, y)$; the encoder for each source operates without knowledge of the other source; the decoder jointly decodes the encoded bit streams from both sources. The work of Slepian and Wolf describes all rates achievable by MASCs of infinite coding dimension ($n \to \infty$) and asymptotically negligible error probabilities ($P_e^{(n)} \to 0$). In this paper, we consider the properties of optimal instantaneous MASCs with finite coding dimension ($n < \infty$) and both lossless ($P_e^{(n)} = 0$) and near-lossless ($P_e^{(n)} > 0$) performance. The interest in near-lossless codes is inspired by the discontinuity in the limiting rate region at $P_e^{(n)} = 0$ and the resulting performance benefits achievable by using near-lossless MASCs as entropy codes within lossy MASCs. Our central results include generalizations of Huffman and arithmetic codes to the MASC framework for arbitrary $p(x, y)$, $n$, and $P_e^{(n)}$, and polynomial-time design algorithms that approximate these optimal solutions.
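
For reference, the asymptotic rate region attributed to Slepian and Wolf above is usually stated as follows. The symbols $R_X$ and $R_Y$ for the two encoder rates are notation assumed here; they do not appear in the abstract.

```latex
\begin{align}
  R_X       &\ge H(X \mid Y), \\
  R_Y       &\ge H(Y \mid X), \\
  R_X + R_Y &\ge H(X, Y).
\end{align}
```

Here $H(\cdot \mid \cdot)$ and $H(\cdot, \cdot)$ are the conditional and joint entropies under $p(x, y)$. The region holds for vanishing error probability as $n \to \infty$, which is the regime the finite-dimension codes in this abstract move away from.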

    Linear-Codes-Based Lossless Joint Source-Channel Coding for Multiple-Access Channels

    A general lossless joint source-channel coding (JSCC) scheme based on linear codes and random interleavers for multiple-access channels (MACs) is presented and then analyzed in this paper. By the information-spectrum approach and the code-spectrum approach, it is shown that a linear code with a good joint spectrum can be used to establish limit-approaching lossless JSCC schemes for correlated general sources and general MACs, where the joint spectrum is a generalization of the input-output weight distribution. Some properties of linear codes with good joint spectra are investigated. A formula for the "distance" property of linear codes with good joint spectra is derived, from which it is further proved that the rate of any systematic code with a good joint spectrum cannot be larger than the reciprocal of the corresponding alphabet cardinality, and that no sparse generator matrix can yield a linear code with a good joint spectrum. The problem of designing arbitrary-rate coding schemes is also discussed. A novel idea called "generalized puncturing" is proposed, which makes a single good low-rate linear code sufficient for designing coding schemes with multiple rates. Finally, various coding problems of MACs are reviewed in a unified framework established by the code-spectrum approach, under which criteria and candidates of good linear codes in terms of spectrum requirements for such problems are clearly presented. Comment: 18 pages, 3 figures

    On the Performance of Lossless Joint Source-Channel Coding Based on Linear Codes

    A general lossless joint source-channel coding scheme based on linear codes is proposed and then analyzed in this paper. It is shown that a linear code with good joint spectrum can be used to establish limit-approaching joint source-channel coding schemes for arbitrary sources and channels, where the joint spectrum of the code is a generalization of the input-output weight distribution. Comment: To appear in Proc. 2006 IEEE Information Theory Workshop, October 22-26, 2006, Chengdu, China. (5 pages, 2 figures)

    Linear complexity universal decoding with exponential error probability decay

    In this manuscript we consider linear complexity binary linear block encoders and decoders that operate universally with exponential error probability decay. Such settings may be relevant in wireless scenarios where probability distributions may not be fully characterized due to the dynamic nature of wireless environments. More specifically, we consider the setting of fixed length-to-fixed length near-lossless data compression of a memoryless binary source of unknown probability distribution as well as the dual setting of communicating on a binary symmetric channel (BSC) with unknown crossover probability. We introduce a new 'min-max distance' metric, analogous to minimum distance, that addresses the universal binary setting and has the same properties as that of minimum distance on BSCs with known crossover probability. The code construction and decoding algorithm are universal extensions of the 'expander codes' framework of Barg and Zemor and have identical complexity and exponential error probability performance.

    Zero-error Slepian-Wolf Coding of Confined Correlated Sources with Deviation Symmetry

    In this paper, we use linear codes to study zero-error Slepian-Wolf coding of a set of sources with deviation symmetry, where the sources are a generalization of the Hamming sources over an arbitrary field. We extend our previous codes, Generalized Hamming Codes for Multiple Sources, to Matrix Partition Codes and use the latter to efficiently compress the target sources. We further show that every perfect or linear-optimal code is a Matrix Partition Code. We also present some conditions under which Matrix Partition Codes are perfect and/or linear-optimal. Detailed discussions of Matrix Partition Codes on Hamming sources are given at the end as examples. Comment: submitted to IEEE Trans. Information Theory
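
As a rough illustration of zero-error Slepian-Wolf coding with a linear code, the sketch below uses the textbook asymmetric syndrome scheme built on the (7,4) Hamming code. It is not the Matrix Partition Code construction of the paper. It assumes two binary length-7 sources `x` and `y` that differ in at most one position; `x` is sent in full and only the 3-bit syndrome of `y` is sent. All names (`H`, `syndrome`, `encode_y`, `decode_y`) are hypothetical.

```python
# Parity-check matrix of the (7,4) Hamming code:
# column j (1-based) is the binary expansion of j, least significant bit in row 0.
H = [[(j >> k) & 1 for j in range(1, 8)] for k in range(3)]


def syndrome(bits):
    """Compute H * bits over GF(2) as a 3-tuple."""
    return tuple(sum(h * b for h, b in zip(row, bits)) % 2 for row in H)


def encode_y(y):
    """Encoder for y: transmit only the 3-bit syndrome instead of 7 bits."""
    return syndrome(y)


def decode_y(x, s_y):
    """Recover y from x and the syndrome of y, assuming at most one differing bit."""
    s_x = syndrome(x)
    diff = tuple(a ^ b for a, b in zip(s_x, s_y))       # syndrome of x XOR y
    position = sum(bit << k for k, bit in enumerate(diff))  # 0 means x == y
    y = list(x)
    if position:
        y[position - 1] ^= 1  # flip the single differing bit
    return y


x = [1, 0, 1, 1, 0, 0, 1]
y = [1, 0, 1, 1, 1, 0, 1]  # differs from x in position 5 only
print(decode_y(x, encode_y(y)) == y)  # True
```

Under the stated assumption the decoder never errs, and the pair is described with 7 + 3 = 10 bits rather than 14. Symmetric schemes, which split the rate more evenly between the encoders, need a different construction.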