Universal lossless source coding with the Burrows Wheeler transform
The Burrows Wheeler transform (1994) is a reversible sequence transformation used in a variety of practical lossless source-coding algorithms. In each, the BWT is followed by a lossless source code that attempts to exploit the natural ordering of the BWT coefficients. BWT-based compression schemes are widely touted as low-complexity algorithms giving lossless coding rates better than those of the Ziv-Lempel codes (commonly known as LZ'77 and LZ'78) and almost as good as those achieved by prediction by partial matching (PPM) algorithms. To date, the coding performance claims have been made primarily on the basis of experimental results. This work gives a theoretical evaluation of BWT-based coding. The main results of this theoretical evaluation include: (1) statistical characterizations of the BWT output on both finite strings and sequences of length n → ∞, (2) a variety of very simple new techniques for BWT-based lossless source coding, and (3) proofs of the universality and bounds on the rates of convergence of both new and existing BWT-based codes for finite-memory and stationary ergodic sources. The end result is a theoretical justification and validation of the experimentally derived conclusions: BWT-based lossless source codes achieve universal lossless coding performance that converges to the optimal coding performance more quickly than the rate of convergence observed in Ziv-Lempel style codes and, for some BWT-based codes, within a constant factor of the optimal rate of convergence for finite-memory sources.
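For readers unfamiliar with the transform itself, the following is a minimal sketch of a forward and inverse BWT, not the coding schemes analyzed in the work above. It uses the naive sort-all-rotations construction, and the end-of-string sentinel `eos` and the function names are assumptions made for illustration.

```python
def bwt(s, eos="\0"):
    """Naive BWT: sort all rotations of s + eos and return the last column.

    The sentinel `eos` is an illustrative assumption; it must not occur in s.
    """
    assert eos not in s
    t = s + eos
    rotations = sorted(t[i:] + t[:i] for i in range(len(t)))
    return "".join(r[-1] for r in rotations)


def inverse_bwt(last, eos="\0"):
    """Invert the transform by repeatedly prepending the last column and sorting."""
    n = len(last)
    table = [""] * n
    for _ in range(n):
        table = sorted(last[i] + table[i] for i in range(n))
    # The original string is the rotation that ends with the sentinel.
    row = next(r for r in table if r.endswith(eos))
    return row[:-1]
```

The output tends to group identical symbols into runs, which is the ordering a subsequent lossless code tries to exploit. This sketch is quadratic or worse in time and memory; practical implementations use suffix arrays instead.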
Asymptotic Optimality of Antidictionary Codes
An antidictionary code is a lossless compression algorithm using an
antidictionary which is a set of minimal words that do not occur as substrings
in an input string. The code was proposed by Crochemore et al. in 2000, and its
asymptotic optimality has been proved only with respect to a specific
information source, called a balanced binary source, which is a binary Markov
source in which a state transition occurs with probability 1/2 or 1. In this
paper, we prove the optimality of both static and dynamic antidictionary codes
with respect to a stationary ergodic Markov source on a finite alphabet such that
a state transition occurs with probability .
Comment: 5 pages, to appear in the proceedings of the 2010 IEEE International
Symposium on Information Theory (ISIT2010)
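The idea of an antidictionary can be illustrated with a small sketch for binary strings. It is an assumption-laden toy, not the static or dynamic codes analyzed in the paper: it builds the full set of minimal absent words by brute force, then drops every symbol that the antidictionary makes predictable. All function names are hypothetical, and a real static code would also have to transmit the antidictionary and the length `n`.

```python
def antidictionary(x, alphabet="01"):
    """Minimal absent words of x: w is not a substring of x, but w[:-1] and w[1:] are."""
    n = len(x)
    # All substrings of x, including the empty string (brute force, O(n^2) entries).
    factors = {x[i:j] for i in range(n + 1) for j in range(i, n + 1)}
    return {
        u + a
        for u in factors
        for a in alphabet
        if u + a not in factors and (u + a)[1:] in factors
    }


def predicted_symbol(prefix, ad):
    """Return the forced next bit if some suffix of prefix plus a bit is forbidden."""
    for k in range(len(prefix) + 1):
        suffix = prefix[len(prefix) - k:]
        for c in "01":
            if suffix + c in ad:
                return "1" if c == "0" else "0"
    return None


def encode(x, ad):
    """Keep only the bits that the antidictionary cannot predict."""
    return "".join(a for i, a in enumerate(x) if predicted_symbol(x[:i], ad) is None)


def decode(code, n, ad):
    """Rebuild n bits, reading from code only when no prediction is available."""
    symbols = iter(code)
    out = ""
    while len(out) < n:
        p = predicted_symbol(out, ad)
        out += p if p is not None else next(symbols)
    return out
```

Because every word in the antidictionary is absent from the input, a forbidden extension can never be the true next bit, so the encoder and decoder make identical predictions.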
Lossless and near-lossless source coding for multiple access networks
A multiple access source code (MASC) is a source code designed for the following network configuration: a pair of correlated information sequences $\{X_i\}_{i=1}^{\infty}$ and $\{Y_i\}_{i=1}^{\infty}$ is drawn independent and identically distributed (i.i.d.) according to joint probability mass function (p.m.f.) $p(x, y)$; the encoder for each source operates without knowledge of the other source; the decoder jointly decodes the encoded bit streams from both sources. The work of Slepian and Wolf describes all rates achievable by MASCs of infinite coding dimension ($n \to \infty$) and asymptotically negligible error probabilities ($P_e^{(n)} \to 0$). In this paper, we consider the properties of optimal instantaneous MASCs with finite coding dimension ($n < \infty$) and both lossless ($P_e^{(n)} = 0$) and near-lossless ($P_e^{(n)} > 0$) performance. The interest in near-lossless codes is inspired by the discontinuity in the limiting rate region at $P_e^{(n)} = 0$ and the resulting performance benefits achievable by using near-lossless MASCs as entropy codes within lossy MASCs. Our central results include generalizations of Huffman and arithmetic codes to the MASC framework for arbitrary $p(x, y)$, $n$, and $P_e^{(n)}$ and polynomial-time design algorithms that approximate these optimal solutions.
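As background for the asymptotic benchmark mentioned above, the Slepian-Wolf region for two sources is bounded by R_X ≥ H(X|Y), R_Y ≥ H(Y|X), and R_X + R_Y ≥ H(X,Y). The sketch below computes these three quantities from a joint p.m.f.; the function name and the dictionary representation of p(x, y) are assumptions for illustration, and the code says nothing about the finite-dimension MASC designs of the paper.

```python
from math import log2


def slepian_wolf_bounds(p):
    """Lower bounds of the Slepian-Wolf region, in bits per symbol.

    p maps (x, y) pairs to probabilities that sum to 1 (hypothetical input format).
    """
    px, py = {}, {}
    for (x, y), q in p.items():
        px[x] = px.get(x, 0.0) + q
        py[y] = py.get(y, 0.0) + q

    def entropy(dist):
        return -sum(q * log2(q) for q in dist.values() if q > 0)

    h_xy = entropy(p)
    return {
        "R_X": h_xy - entropy(py),   # H(X|Y) = H(X,Y) - H(Y)
        "R_Y": h_xy - entropy(px),   # H(Y|X) = H(X,Y) - H(X)
        "R_X+R_Y": h_xy,             # H(X,Y)
    }
```

For two independent fair bits the sum-rate bound is 2 bits; for two identical fair bits it drops to 1 bit, which is the gain that joint decoding makes available.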
Linear-Codes-Based Lossless Joint Source-Channel Coding for Multiple-Access Channels
A general lossless joint source-channel coding (JSCC) scheme based on linear
codes and random interleavers for multiple-access channels (MACs) is presented
and then analyzed in this paper. By the information-spectrum approach and the
code-spectrum approach, it is shown that a linear code with a good joint
spectrum can be used to establish limit-approaching lossless JSCC schemes for
correlated general sources and general MACs, where the joint spectrum is a
generalization of the input-output weight distribution. Some properties of
linear codes with good joint spectra are investigated. A formula for the
"distance" property of linear codes with good joint spectra is derived, from
which it is further proved that the rate of any systematic code with good
joint spectra cannot be larger than the reciprocal of the corresponding
alphabet cardinality, and that no sparse generator matrix can yield linear
codes with good joint spectra. The problem of designing arbitrary rate coding
schemes is also discussed. A novel idea called "generalized puncturing" is
proposed, which makes it possible that one good low-rate linear code is enough
for the design of coding schemes with multiple rates. Finally, various coding
problems of MACs are reviewed in a unified framework established by the
code-spectrum approach, under which criteria and candidates of good linear
codes in terms of spectrum requirements for such problems are clearly
presented.
Comment: 18 pages, 3 figures
On the Performance of Lossless Joint Source-Channel Coding Based on Linear Codes
A general lossless joint source-channel coding scheme based on linear codes
is proposed and then analyzed in this paper. It is shown that a linear code
with a good joint spectrum can be used to establish limit-approaching joint
source-channel coding schemes for arbitrary sources and channels, where the
joint spectrum of the code is a generalization of the input-output weight
distribution.
Comment: To appear in Proc. 2006 IEEE Information Theory Workshop, October
22-26, 2006, Chengdu, China. (5 pages, 2 figures)
Linear complexity universal decoding with exponential error probability decay
In this manuscript we consider linear complexity binary linear block encoders and decoders that operate universally with exponential error probability decay. Such scenarios may be relevant in wireless settings where probability distributions may not be fully characterized due to the dynamic nature of wireless environments. More specifically, we consider the setting of fixed length-to-fixed length near-lossless data compression of a memoryless binary source of unknown probability distribution as well as the dual setting of communicating on a binary symmetric channel (BSC) with unknown crossover probability. We introduce a new 'min-max distance' metric, analogous to minimum distance, that addresses the universal binary setting and has the same properties as that of minimum distance on BSCs with known crossover probability. The code construction and decoding algorithm are universal extensions of the 'expander codes' framework of Barg and Zemor and have identical complexity and exponential error probability performance.
Zero-error Slepian-Wolf Coding of Confined Correlated Sources with Deviation Symmetry
In this paper, we use linear codes to study zero-error Slepian-Wolf coding of
a set of sources with deviation symmetry, where the sources are a generalization
of the Hamming sources over an arbitrary field. We extend our previous codes,
Generalized Hamming Codes for Multiple Sources, to Matrix Partition Codes and
use the latter to efficiently compress the target sources. We further show that
every perfect or linear-optimal code is a Matrix Partition Code. We also
present some conditions when Matrix Partition Codes are perfect and/or
linear-optimal. Detailed discussions of Matrix Partition Codes on Hamming sources
are given at the end as examples.
Comment: submitted to IEEE Trans Information Theory
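To make the notion of a Hamming source concrete, the sketch below shows the classic textbook syndrome scheme for two 7-bit words that differ in at most one position. It is an assumption-based illustration of zero-error coding with a linear code, using the (7,4) Hamming code with side information at the decoder, and it is not the Matrix Partition Codes of the paper. The function names are hypothetical.

```python
# Parity-check matrix of the (7,4) Hamming code:
# column j (1-based) is the 3-bit binary expansion of j.
H = [[(j >> b) & 1 for j in range(1, 8)] for b in range(3)]


def syndrome(v):
    """3-bit syndrome H v over GF(2) for a 7-bit tuple v."""
    return tuple(sum(h * x for h, x in zip(row, v)) % 2 for row in H)


def encode_x(x):
    """Compress x from 7 bits to its 3-bit syndrome."""
    return syndrome(x)


def decode_x(s, y):
    """Recover x from its syndrome s and side information y, assuming d(x, y) <= 1."""
    # H x + H y = H (x + y): the syndrome of the difference pattern.
    diff = tuple(a ^ b for a, b in zip(s, syndrome(y)))
    # By construction of H, the syndrome of a single flipped bit spells its 1-based index.
    pos = sum(bit << b for b, bit in enumerate(diff))
    x = list(y)
    if pos:
        x[pos - 1] ^= 1
    return tuple(x)
```

Under these assumptions the pair is described with 7 + 3 = 10 bits instead of 14, with no decoding errors, because every difference pattern of weight at most one has a distinct syndrome.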