A vector quantization approach to universal noiseless coding and quantization
A two-stage code is a block code in which each block of data is coded in two stages: the first stage codes the identity of a block code among a collection of codes, and the second stage codes the data using the identified code. The collection of codes may be noiseless codes, fixed-rate quantizers, or variable-rate quantizers. We take a vector quantization approach to two-stage coding, in which the first stage code can be regarded as a vector quantizer that "quantizes" the input data of length n to one of a fixed collection of block codes. We apply the generalized Lloyd algorithm to the first-stage quantizer, using induced measures of rate and distortion, to design locally optimal two-stage codes. On a source of medical images, two-stage variable-rate vector quantizers designed in this way outperform standard (one-stage) fixed-rate vector quantizers by over 9 dB. The tail of the operational distortion-rate function of the first-stage quantizer determines the optimal rate of convergence of the redundancy of a universal sequence of two-stage codes. We show that there exist two-stage universal noiseless codes, fixed-rate quantizers, and variable-rate quantizers whose per-letter rate and distortion redundancies converge to zero as (k/2) n^{-1} log n, when the universe of sources has finite dimension k. This extends the achievability part of Rissanen's theorem from universal noiseless codes to universal quantizers. Further, we show that the redundancies converge as O(n^{-1}) when the universe of sources is countable, and as O(n^{-1+ε}) when the universe of sources is infinite-dimensional, under appropriate conditions.
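The first-stage "quantization of blocks to codebooks" and the induced Lloyd iteration can be sketched as follows. This is a minimal toy in NumPy, not the paper's design: the codebook sizes, squared-error distortion, and function names are illustrative assumptions, and the rate term of the induced Lloyd functional is omitted for brevity.

```python
import numpy as np

def nearest(codebook, samples):
    """Index of the nearest codeword for each sample, plus total squared error."""
    d = ((samples[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d.argmin(1)
    return idx, d[np.arange(len(samples)), idx].sum()

def two_stage_lloyd(blocks, num_codebooks=2, codewords=4, iters=10, seed=0):
    """Toy generalized-Lloyd design of a two-stage code: stage one 'quantizes'
    each block to the codebook giving least distortion; the update step then
    refits each codebook on the blocks assigned to it (a k-means step)."""
    rng = np.random.default_rng(seed)
    dim = blocks.shape[-1]
    books = rng.normal(size=(num_codebooks, codewords, dim))
    assign = np.zeros(len(blocks), dtype=int)
    for _ in range(iters):
        # first stage: each block picks its best codebook (min distortion)
        assign = np.array([min(range(num_codebooks),
                               key=lambda c: nearest(books[c], b)[1])
                           for b in blocks])
        # second stage update: centroid step within each codebook
        for c in range(num_codebooks):
            data = blocks[assign == c].reshape(-1, dim)
            if len(data) == 0:
                continue
            idx, _ = nearest(books[c], data)
            for j in range(codewords):
                if (idx == j).any():
                    books[c, j] = data[idx == j].mean(0)
    return books, assign
```

Each iteration can only lower the total distortion of the two-stage code, so the design converges to a locally optimal collection of codebooks, mirroring the Lloyd argument in the abstract.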
Lossy compression of discrete sources via Viterbi algorithm
We present a new lossy compressor for discrete-valued sources. For coding a sequence x^n, the encoder starts by assigning a certain cost to each possible reconstruction sequence. It then finds the one that minimizes this cost and describes it losslessly to the decoder via a universal lossless compressor. The cost of each sequence is a linear combination of its distance from the sequence x^n and a linear function of its k-th order empirical distribution. The structure of the cost function allows the encoder to employ the Viterbi algorithm to recover the minimizer of the cost. We identify a choice of the coefficients comprising the linear function of the empirical distribution used in the cost function which ensures that the algorithm universally achieves the optimum rate-distortion performance of any stationary ergodic source in the limit of large n, provided that k diverges as o(log n). Iterative techniques for approximating the coefficients, which alleviate the computational burden of finding the optimal coefficients, are proposed and studied.
Comment: 26 pages, 6 figures, submitted to IEEE Transactions on Information Theory
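When the empirical-distribution cost depends only on adjacent pairs of reconstruction symbols, the minimization described above is a shortest path through a trellis, which the Viterbi algorithm solves exactly. A minimal sketch follows; the binary alphabet, Hamming distortion, and the coefficient matrix `lam` are illustrative assumptions, not the paper's optimized coefficients.

```python
import numpy as np

def viterbi_lossy(x, alphabet, dist, lam):
    """Minimize sum_i dist(x[i], xh[i]) + lam[xh[i-1], xh[i]] over all
    reconstruction sequences xh via the Viterbi algorithm; lam plays the
    role of the linear cost on the (pairwise) empirical distribution."""
    n, m = len(x), len(alphabet)
    back = np.zeros((n, m), dtype=int)
    # first symbol: distortion only, no pair cost yet
    cost = np.array([dist(x[0], b) for b in alphabet], dtype=float)
    for i in range(1, n):
        new = np.empty(m)
        for j, b in enumerate(alphabet):
            trans = cost + lam[:, j]          # best predecessor state for b
            back[i, j] = int(trans.argmin())
            new[j] = trans[back[i, j]] + dist(x[i], b)
        cost = new
    # trace back the cost-minimizing reconstruction
    j = int(cost.argmin())
    path = [j]
    for i in range(n - 1, 0, -1):
        j = back[i, j]
        path.append(j)
    path.reverse()
    return [alphabet[j] for j in path], float(cost.min())
```

For example, with a heavy penalty on 0-to-1 transitions, an input with an isolated 1 is reconstructed as all zeros: the single-symbol distortion is cheaper than the two penalized transitions.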
Empirical processes, typical sequences and coordinated actions in standard Borel spaces
This paper proposes a new notion of typical sequences on a wide class of
abstract alphabets (so-called standard Borel spaces), which is based on
approximations of memoryless sources by empirical distributions uniformly over
a class of measurable "test functions." In the finite-alphabet case, we can
take all uniformly bounded functions and recover the usual notion of strong
typicality (or typicality under the total variation distance). For a general
alphabet, however, this function class turns out to be too large, and must be
restricted. With this in mind, we define typicality with respect to any
Glivenko-Cantelli function class (i.e., a function class that admits a Uniform
Law of Large Numbers) and demonstrate its power by giving simple derivations of
the fundamental limits on the achievable rates in several source coding
scenarios, in which the relevant operational criteria pertain to reproducing
empirical averages of a general-alphabet stationary memoryless source with
respect to a suitable function class.
Comment: 14 pages, 3 pdf figures; accepted to IEEE Transactions on Information Theory
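For a finite class of test functions, the proposed notion is easy to state concretely: a sample is typical when its empirical average of every test function is uniformly close to the true expectation. A minimal sketch, where the function names, tolerance, and the uniform-source demo are illustrative assumptions (a finite stand-in for a Glivenko-Cantelli class):

```python
import numpy as np

def is_typical(sample, test_funcs, expectations, eps):
    """Typicality w.r.t. a finite class of test functions: every empirical
    average must lie within eps of its expectation. A Glivenko-Cantelli
    class is one where this closeness holds uniformly over the whole class
    as the sample grows."""
    return all(abs(np.mean([f(x) for x in sample]) - mu) <= eps
               for f, mu in zip(test_funcs, expectations))

# demo: Uniform[0,1] samples tested against the first two moments
rng = np.random.default_rng(0)
sample = rng.uniform(size=10_000)
funcs = [lambda x: x, lambda x: x * x]
moments = [1 / 2, 1 / 3]
```

Taking all uniformly bounded functions recovers strong typicality on a finite alphabet, as the abstract notes; on a general alphabet the class must be restricted for the uniform law of large numbers to hold.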
Tema Con Variazioni: Quantum Channel Capacity
Channel capacity describes the size of the nearly ideal channels, which can
be obtained from many uses of a given channel, using an optimal error
correcting code. In this paper we collect and compare minor and major
variations in the mathematically precise statements of this idea which have
been put forward in the literature. We show that all the variations considered
lead to equivalent capacity definitions. In particular, it makes no difference
whether one requires mean or maximal errors to go to zero, and it makes no
difference whether errors are required to vanish for any sequence of block
sizes compatible with the rate, or only for one infinite sequence.
Comment: 32 pages, uses iopart.cls
The Generalized Asymptotic Equipartition Property: Necessary and Sufficient Conditions
Suppose a string X_1^n generated by a memoryless source with distribution P is to be compressed with distortion no greater than D ≥ 0, using a memoryless random codebook with distribution Q. The compression performance is determined by the ``generalized asymptotic equipartition property'' (AEP), which states that the probability of finding a D-close match between X_1^n and any given codeword Y_1^n is approximately 2^{-n R(P,Q,D)}, where the rate function R(P,Q,D) can be expressed as an infimum of relative entropies. The main purpose here is to remove various restrictive assumptions on the validity of this result that have appeared in the recent literature. Necessary and sufficient conditions for the generalized AEP are provided in the general setting of abstract alphabets and unbounded distortion measures. All possible distortion levels D ≥ 0 are considered; the source can be stationary and ergodic; and the codebook distribution can have memory. Moreover, the behavior of the matching probability is precisely characterized, even when the generalized AEP is not valid. Natural characterizations of the rate function R(P,Q,D) are established under equally general conditions.
Comment: 19 pages
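In the notation standard for this line of work, the rate function is commonly written as an infimum of conditional relative entropies; the following is a sketch of the usual finite-alphabet form, with the channel W, the distortion d, and the constraint set assumed from the standard formulation rather than quoted from this paper:

```latex
R(P, Q, D) \;=\; \inf_{W \,:\; \mathbb{E}_{P \times W}\, d(X, Y) \,\le\, D}
\;\sum_{x} P(x)\, D\bigl( W(\cdot \mid x) \,\big\|\, Q \bigr),
```

where the infimum runs over conditional distributions W(y|x) of the codeword symbol given the source symbol satisfying the distortion constraint, so that the match probability behaves as 2^{-n R(P,Q,D)} to first order in the exponent.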