Variable dimension weighted universal vector quantization and noiseless coding
A new algorithm for variable dimension weighted universal coding is introduced. Combining the multi-codebook system of weighted universal vector quantization (WUVQ), the partitioning technique of variable dimension vector quantization, and the optimal design strategy common to both, variable dimension WUVQ allows mixture sources to be effectively carved into their component subsources, each of which can then be encoded with the codebook best matched to that subsource. Application of variable dimension WUVQ to a sequence of medical images provides up to 4.8 dB improvement in signal-to-quantization-noise ratio over WUVQ and up to 11 dB improvement over a standard full-search vector quantizer followed by an entropy code. The optimal partitioning technique can likewise be applied with a collection of noiseless codes, as found in weighted universal noiseless coding (WUNC). The resulting algorithm for variable dimension WUNC is also described.
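The optimal partitioning step described above can be illustrated as a dynamic program over block boundaries: each candidate segment is charged the distortion-plus-rate cost of its best-matched codebook, and the cheapest segmentation of the whole sequence is found by backtracking. The following is a minimal illustrative sketch, not the paper's actual construction; the scalar codebooks, the simple bit-counting rate model, and all function names are assumptions.

```python
import numpy as np

def segment_cost(seg, codebooks, lam=1.0):
    # cost of coding one segment with its best-matched codebook: squared-error
    # distortion plus lam * (bits for the codebook id + per-sample codeword ids)
    # (this rate model is a stand-in, not the paper's)
    best = np.inf
    for cb in codebooks:
        idx = np.argmin((seg[:, None] - cb[None, :]) ** 2, axis=1)
        dist = float(np.sum((seg - cb[idx]) ** 2))
        rate = np.log2(len(codebooks)) + len(seg) * np.log2(len(cb))
        best = min(best, dist + lam * rate)
    return best

def optimal_partition(x, codebooks, max_dim=4, lam=1.0):
    # dynamic program over block boundaries: best[i] = cheapest coding of x[:i]
    n = len(x)
    best = [0.0] + [np.inf] * n
    cut = [0] * (n + 1)
    for i in range(1, n + 1):
        for d in range(1, min(max_dim, i) + 1):
            c = best[i - d] + segment_cost(x[i - d:i], codebooks, lam)
            if c < best[i]:
                best[i], cut[i] = c, i - d
    # backtrack the chosen segment boundaries
    bounds, i = [], n
    while i > 0:
        bounds.append((cut[i], i))
        i = cut[i]
    return best[n], bounds[::-1]
```

On a toy source that switches between two subsources (samples near 0 and samples near 10), the program carves the sequence at the switch point and routes each segment to its matched codebook.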
A vector quantization approach to universal noiseless coding and quantization
A two-stage code is a block code in which each block of data is coded in two stages: the first stage codes the identity of a block code among a collection of codes, and the second stage codes the data using the identified code. The collection of codes may be noiseless codes, fixed-rate quantizers, or variable-rate quantizers. We take a vector quantization approach to two-stage coding, in which the first stage code can be regarded as a vector quantizer that “quantizes” the input data of length n to one of a fixed collection of block codes. We apply the generalized Lloyd algorithm to the first-stage quantizer, using induced measures of rate and distortion, to design locally optimal two-stage codes. On a source of medical images, two-stage variable-rate vector quantizers designed in this way outperform standard (one-stage) fixed-rate vector quantizers by over 9 dB. The tail of the operational distortion-rate function of the first-stage quantizer determines the optimal rate of convergence of the redundancy of a universal sequence of two-stage codes. We show that there exist two-stage universal noiseless codes, fixed-rate quantizers, and variable-rate quantizers whose per-letter rate and distortion redundancies converge to zero as (k/2) n^{-1} log n, when the universe of sources has finite dimension k. This extends the achievability part of Rissanen's theorem from universal noiseless codes to universal quantizers. Further, we show that the redundancies converge as O(n^{-1}) when the universe of sources is countable, and as O(n^{-1+ϵ}) when the universe of sources is infinite-dimensional, under appropriate conditions.
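The generalized-Lloyd design of the first-stage quantizer described above alternates two steps: route each data block to the code that handles it best, then re-fit each code on the blocks routed to it. A minimal sketch under simplifying assumptions (scalar data, squared-error distortion only, fixed-rate codebooks; all names are placeholders, and the induced rate measure is omitted):

```python
import numpy as np

def encode_block(block, codebook):
    # nearest-codeword quantization; returns indices and squared-error distortion
    idx = np.argmin((block[:, None] - codebook[None, :]) ** 2, axis=1)
    return idx, float(np.sum((block - codebook[idx]) ** 2))

def two_stage_design(blocks, codebooks, iters=10):
    # generalized-Lloyd-style alternation on the first-stage "quantizer":
    # (1) route each block to the codebook that codes it with least distortion,
    # (2) re-fit each codeword as the centroid of the samples mapped to it
    assign = [0] * len(blocks)
    for _ in range(iters):
        assign = [min(range(len(codebooks)),
                      key=lambda c: encode_block(b, codebooks[c])[1])
                  for b in blocks]
        for c, cb in enumerate(codebooks):
            routed = [b for b, a in zip(blocks, assign) if a == c]
            if not routed:
                continue
            data = np.concatenate(routed)
            idx, _ = encode_block(data, cb)
            for k in range(len(cb)):
                if np.any(idx == k):
                    cb[k] = data[idx == k].mean()
    return codebooks, assign
```

On a mixture of two well-separated subsources, the alternation settles into one codebook per subsource, which is the sense in which the first stage "quantizes" a whole block to a code identity.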
Estimating Mixture Entropy with Pairwise Distances
Mixture distributions arise in many parametric and non-parametric settings --
for example, in Gaussian mixture models and in non-parametric estimation. It is
often necessary to compute the entropy of a mixture, but, in most cases, this
quantity has no closed-form expression, making some form of approximation
necessary. We propose a family of estimators based on a pairwise distance
function between mixture components, and show that this estimator class has
many attractive properties. For many distributions of interest, the proposed
estimators are efficient to compute, differentiable in the mixture parameters,
and become exact when the mixture components are clustered. We prove this
family includes lower and upper bounds on the mixture entropy. The Chernoff
α-divergence gives a lower bound when chosen as the distance function,
with the Bhattacharyya distance providing the tightest lower bound for
components that are symmetric and members of a location family. The
Kullback-Leibler divergence gives an upper bound when used as the distance
function. We provide closed-form expressions of these bounds for mixtures of
Gaussians, and discuss their applications to the estimation of mutual
information. We then demonstrate that our bounds are significantly tighter than
well-known existing bounds using numeric simulations. This estimator class is
very useful in optimization problems involving maximization/minimization of
entropy and mutual information, such as MaxEnt and rate-distortion problems.
Comment: Corrects several errata in the published version, in particular in
Section V (bounds on mutual information).
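The closed-form Kullback-Leibler upper bound mentioned above can be sketched for a mixture of Gaussians. The estimator form assumed here, H_hat = sum_i w_i H(p_i) - sum_i w_i ln sum_j w_j exp(-D(p_i||p_j)), is a plausible reading of the pairwise-distance family described in the abstract, and the function names are placeholders:

```python
import numpy as np

def gauss_entropy(cov):
    # differential entropy of N(mu, cov): 0.5 * ln det(2*pi*e*cov)
    return 0.5 * np.log(np.linalg.det(2 * np.pi * np.e * cov))

def kl_gauss(m0, S0, m1, S1):
    # closed-form KL(N(m0,S0) || N(m1,S1)) in nats
    d = len(m0)
    S1inv = np.linalg.inv(S1)
    diff = m1 - m0
    return 0.5 * (np.trace(S1inv @ S0) + diff @ S1inv @ diff - d
                  + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

def mixture_entropy_kl_upper(weights, means, covs):
    # pairwise-distance estimator with D = KL (an upper bound on mixture entropy)
    comp_H = sum(w * gauss_entropy(S) for w, S in zip(weights, covs))
    pair = 0.0
    for i, wi in enumerate(weights):
        inner = sum(wj * np.exp(-kl_gauss(means[i], covs[i], means[j], covs[j]))
                    for j, wj in enumerate(weights))
        pair -= wi * np.log(inner)
    return comp_H + pair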
Mixing and non-mixing local minima of the entropy contrast for blind source separation
In this paper, both non-mixing and mixing local minima of the entropy are
analyzed from the viewpoint of blind source separation (BSS); they correspond
respectively to acceptable and spurious solutions of the BSS problem. The
contribution of this work is twofold. First, a Taylor development is used to
show that the \textit{exact} output entropy cost function has a non-mixing
minimum when this output is proportional to \textit{any} of the non-Gaussian
sources, and not only when the output is proportional to the lowest entropic
source. Second, in order to prove that mixing entropy minima exist when the
source densities are strongly multimodal, an entropy approximator is proposed.
The latter has the major advantage that an error bound can be provided. Even if
this approximator (and the associated bound) is used here in the BSS context,
it can be applied for estimating the entropy of any random variable with
multimodal density.Comment: 11 pages, 6 figures, To appear in IEEE Transactions on Information
Theor
A Tutorial on Independent Component Analysis
Independent component analysis (ICA) has become a standard data analysis
technique applied to an array of problems in signal processing and machine
learning. This tutorial provides an introduction to ICA based on linear algebra
formulating an intuition for ICA from first principles. The goal of this
tutorial is to provide a solid foundation on this advanced topic so that one
might learn the motivation behind ICA, learn why and when to apply this
technique and in the process gain an introduction to this exciting field of
active research
- …