568 research outputs found
Lower Bounds on the Redundancy of Huffman Codes with Known and Unknown Probabilities
In this paper we provide a method to obtain tight lower bounds on the minimum
redundancy achievable by a Huffman code when the probability distribution
underlying an alphabet is only partially known. In particular, we address the
case where the occurrence probabilities are unknown for some of the symbols in
an alphabet. Bounds can be obtained for alphabets of a given size, for
alphabets of up to a given size, and for alphabets of arbitrary size. The
method operates on a Computer Algebra System, yielding closed-form numbers for
all results. Finally, we show the potential of the proposed method to shed some
light on the structure of the minimum redundancy achievable by the Huffman
code
Efficient Universal Noiseless Source Codes
Although the existence of universal noiseless variable-rate codes for the class of discrete stationary ergodic sources has previously been established, very few practical universal encoding methods are available. Efficient implementable universal source coding techniques are discussed in this paper. Results are presented on source codes for which a small value of the maximum redundancy is achieved with a relatively short block length. A constructive proof of the existence of universal noiseless codes for discrete stationary sources is first presented. The proof is shown to provide a method for obtaining efficient universal noiseless variable-rate codes for various classes of sources. For memoryless sources, upper and lower bounds are obtained for the minimax redundancy as a function of the block length of the code. Several techniques for constructing universal noiseless source codes for memoryless sources are presented and their redundancies are compared with the bounds. Consideration is given to possible applications to data compression for certain nonstationary sources
Recommended from our members
Parallel data compression
Data compression schemes remove data redundancy in communicated and stored data and increase the effective capacities of communication and storage devices. Parallel algorithms and implementations for textual data compression are surveyed. Related concepts from parallel computation and information theory are briefly discussed. Static and dynamic methods for codeword construction and transmission on various models of parallel computation are described. Included are parallel methods which boost system speed by coding data concurrently, and approaches which employ multiple compression techniques to improve compression ratios. Theoretical and empirical comparisons are reported and areas for future research are suggested
Guessing under source uncertainty
This paper considers the problem of guessing the realization of a finite
alphabet source when some side information is provided. The only knowledge the
guesser has about the source and the correlated side information is that the
joint source is one among a family. A notion of redundancy is first defined and
a new divergence quantity that measures this redundancy is identified. This
divergence quantity shares the Pythagorean property with the Kullback-Leibler
divergence. Good guessing strategies that minimize the supremum redundancy
(over the family) are then identified. The min-sup value measures the richness
of the uncertainty set. The min-sup redundancies for two examples - the
families of discrete memoryless sources and finite-state arbitrarily varying
sources - are then determined.Comment: 27 pages, submitted to IEEE Transactions on Information Theory, March
2006, revised September 2006, contains minor modifications and restructuring
based on reviewers' comment
- …