2,350 research outputs found
Lossy compression of discrete sources via Viterbi algorithm
We present a new lossy compressor for discrete-valued sources. For coding a
sequence , the encoder starts by assigning a certain cost to each possible
reconstruction sequence. It then finds the one that minimizes this cost and
describes it losslessly to the decoder via a universal lossless compressor. The
cost of each sequence is a linear combination of its distance from the sequence
and a linear function of its order empirical distribution.
The structure of the cost function allows the encoder to employ the Viterbi
algorithm to recover the minimizer of the cost. We identify a choice of the
coefficients comprising the linear function of the empirical distribution used
in the cost function which ensures that the algorithm universally achieves the
optimum rate-distortion performance of any stationary ergodic source in the
limit of large , provided that diverges as . Iterative
techniques for approximating the coefficients, which alleviate the
computational burden of finding the optimal coefficients, are proposed and
studied.Comment: 26 pages, 6 figures, Submitted to IEEE Transactions on Information
Theor
Shannon Information and Kolmogorov Complexity
We compare the elementary theories of Shannon information and Kolmogorov
complexity, the extent to which they have a common purpose, and where they are
fundamentally different. We discuss and relate the basic notions of both
theories: Shannon entropy versus Kolmogorov complexity, the relation of both to
universal coding, Shannon mutual information versus Kolmogorov (`algorithmic')
mutual information, probabilistic sufficient statistic versus algorithmic
sufficient statistic (related to lossy compression in the Shannon theory versus
meaningful information in the Kolmogorov theory), and rate distortion theory
versus Kolmogorov's structure function. Part of the material has appeared in
print before, scattered through various publications, but this is the first
comprehensive systematic comparison. The last mentioned relations are new.Comment: Survey, LaTeX 54 pages, 3 figures, Submitted to IEEE Trans
Information Theor
On Macroscopic Complexity and Perceptual Coding
The theoretical limits of 'lossy' data compression algorithms are considered.
The complexity of an object as seen by a macroscopic observer is the size of
the perceptual code which discards all information that can be lost without
altering the perception of the specified observer. The complexity of this
macroscopically observed state is the simplest description of any microstate
comprising that macrostate. Inference and pattern recognition based on
macrostate rather than microstate complexities will take advantage of the
complexity of the macroscopic observer to ignore irrelevant noise
A Universal Scheme for Wyner–Ziv Coding of Discrete Sources
We consider the Wyner–Ziv (WZ) problem of lossy compression where the decompressor observes a noisy version of the source, whose statistics are unknown. A new family of WZ coding algorithms is proposed and their universal optimality is proven. Compression consists of sliding-window processing followed by Lempel–Ziv (LZ) compression, while the decompressor is based on a modification of the discrete universal denoiser (DUDE) algorithm to take advantage of side information. The new algorithms not only universally attain the fundamental limits, but also suggest a paradigm for practical WZ coding. The effectiveness of our approach is illustrated with experiments on binary images, and English text using a low complexity algorithm motivated by our class of universally optimal WZ codes
A vector quantization approach to universal noiseless coding and quantization
A two-stage code is a block code in which each block of data is coded in two stages: the first stage codes the identity of a block code among a collection of codes, and the second stage codes the data using the identified code. The collection of codes may be noiseless codes, fixed-rate quantizers, or variable-rate quantizers. We take a vector quantization approach to two-stage coding, in which the first stage code can be regarded as a vector quantizer that “quantizes” the input data of length n to one of a fixed collection of block codes. We apply the generalized Lloyd algorithm to the first-stage quantizer, using induced measures of rate and distortion, to design locally optimal two-stage codes. On a source of medical images, two-stage variable-rate vector quantizers designed in this way outperform standard (one-stage) fixed-rate vector quantizers by over 9 dB. The tail of the operational distortion-rate function of the first-stage quantizer determines the optimal rate of convergence of the redundancy of a universal sequence of two-stage codes. We show that there exist two-stage universal noiseless codes, fixed-rate quantizers, and variable-rate quantizers whose per-letter rate and distortion redundancies converge to zero as (k/2)n -1 log n, when the universe of sources has finite dimension k. This extends the achievability part of Rissanen's theorem from universal noiseless codes to universal quantizers. Further, we show that the redundancies converge as O(n-1) when the universe of sources is countable, and as O(n-1+ϵ) when the universe of sources is infinite-dimensional, under appropriate conditions
- …