305 research outputs found

Suboptimality of the Karhunen-Loève transform for transform coding

Authors: Effros Michelle, Feng Hanying, Zeger Kenneth
Publication date: 22/04/2003

We examine the performance of the Karhunen-Loève transform (KLT) for transform coding applications. The KLT has long been viewed as the best available block transform for a system that orthogonally transforms a vector source, scalar quantizes the components of the transformed vector using optimal bit allocation, and then inverse transforms the vector. This paper treats fixed-rate and variable-rate transform codes of non-Gaussian sources. The fixed-rate approach uses an optimal fixed-rate scalar quantizer to describe the transform coefficients; the variable-rate approach uses a uniform scalar quantizer followed by an optimal entropy code, and each quantized component is encoded separately. Earlier work shows that for the variable-rate case there exist sources on which the KLT is not unique and the optimal quantization and coding stage matched to a "worst" KLT yields performance as much as 1.5 dB worse than the optimal quantization and coding stage matched to a "best" KLT. In this paper, we strengthen that result to show that in both the fixed-rate and the variable-rate coding frameworks there exist sources for which the performance penalty for using a "worst" KLT can be made arbitrarily large. Further, we demonstrate in both frameworks that there exist sources for which even a best KLT gives suboptimal performance. Finally, we show that even for vector sources where the KLT yields independent coefficients, the KLT can be suboptimal for fixed-rate coding.
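The transform-coding pipeline described in this abstract (orthogonal transform, per-component scalar quantization, inverse transform) can be illustrated with a minimal sketch. It is not the authors' code: it assumes a zero-mean two-dimensional source so that the KLT reduces to a single rotation angle, and it uses one shared uniform step size rather than optimal bit allocation. The names `klt_angle`, `rotate`, `quantize`, and `transform_code` are hypothetical.

```python
import math


def klt_angle(samples):
    """Angle of the principal axes of the sample covariance (2-D, zero-mean assumed)."""
    n = len(samples)
    cxx = sum(x * x for x, _ in samples) / n
    cyy = sum(y * y for _, y in samples) / n
    cxy = sum(x * y for x, y in samples) / n
    # For a symmetric 2x2 covariance, this rotation zeroes the cross term.
    return 0.5 * math.atan2(2.0 * cxy, cxx - cyy)


def rotate(v, theta):
    """Project v onto orthonormal axes rotated by theta (an orthogonal transform)."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] + s * v[1], -s * v[0] + c * v[1])


def quantize(value, step):
    """Uniform scalar quantizer: snap to the nearest multiple of step."""
    return step * round(value / step)


def transform_code(v, theta, step):
    """Transform, quantize each coefficient separately, then inverse transform."""
    coeffs = rotate(v, theta)
    quantized = tuple(quantize(c, step) for c in coeffs)
    return rotate(quantized, -theta)  # the inverse of a rotation is the opposite rotation


if __name__ == "__main__":
    data = [(1.0, 1.0), (-1.0, -1.0), (2.0, 2.0), (-2.0, -2.0)]
    theta = klt_angle(data)
    print(transform_code((0.37, -1.42), theta, 0.25))
```

Because the transform is orthogonal, the reconstruction error equals the quantization error in the transform domain, so each coefficient contributes at most half a step. The sketch only shows the mechanics; the abstract's point is that this choice of transform is not always optimal for non-Gaussian sources.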

Caltech Authors

Gaussian Mixture Model-based Quantization of Line Spectral Frequencies for Adaptive Multirate Speech Codec

Authors: Davor Petrinović, Tihomir Tadić
Publication venue: University of Zagreb - University Computing Centre
Publication date: 01/01/2011

In this paper, we investigate the use of a Gaussian Mixture Model (GMM)-based quantizer for quantization of the Line Spectral Frequencies (LSFs) in the Adaptive Multi-Rate (AMR) speech codec. We estimate the parametric GMM of the probability density function (pdf) for the prediction error (residual) of mean-removed LSF parameters that are used in the AMR codec for speech spectral envelope representation. The studied GMM-based quantizer is based on transform coding using the Karhunen-Loève transform (KLT) and transform-domain scalar quantizers (SQ) individually designed for each Gaussian mixture. We have investigated the applicability of such a quantization scheme in the existing AMR codec by replacing only the AMR LSF quantization algorithm segment. The main novelty in this paper lies in applying and adapting entropy-constrained (EC) coding for fixed-rate scalar quantization of transformed residuals, thereby allowing for better adaptation to the local statistics of the source. We study and evaluate the compression efficiency, computational complexity and memory requirements of the proposed algorithm. Experimental results show that the GMM-based EC quantizer provides better rate/distortion performance than the quantization schemes used in the reference AMR codec by saving up to 7.32 bits/frame, with much lower rate-independent computational complexity and memory requirements.
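The idea of a per-mixture transform coder can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the abstract does not say how a mixture component is selected, so the sketch simply tries every component and keeps the one with the lowest squared error; it works in two dimensions so that each component's KLT is a rotation angle; and it omits the entropy-constrained stage entirely. The component dictionaries (`mean`, `theta`, `steps`) and the function names are hypothetical.

```python
import math


def rotate(v, theta):
    """Project v onto orthonormal axes rotated by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] + s * v[1], -s * v[0] + c * v[1])


def code_with_component(v, comp):
    """Transform-code v with one component's mean, rotation, and step sizes."""
    mean, theta, steps = comp["mean"], comp["theta"], comp["steps"]
    centered = (v[0] - mean[0], v[1] - mean[1])
    coeffs = rotate(centered, theta)
    # Each coefficient has its own uniform scalar quantizer.
    quantized = tuple(st * round(c / st) for c, st in zip(coeffs, steps))
    back = rotate(quantized, -theta)
    return (back[0] + mean[0], back[1] + mean[1])


def gmm_quantize(v, components):
    """Return (component index, reconstruction, squared error) of the best component.

    Assumption: selection by minimum squared error over all components.
    """
    best = None
    for index, comp in enumerate(components):
        recon = code_with_component(v, comp)
        dist = (v[0] - recon[0]) ** 2 + (v[1] - recon[1]) ** 2
        if best is None or dist < best[2]:
            best = (index, recon, dist)
    return best


if __name__ == "__main__":
    comps = [
        {"mean": (0.0, 0.0), "theta": 0.0, "steps": (1.0, 1.0)},
        {"mean": (10.0, 10.0), "theta": math.pi / 4, "steps": (0.1, 0.1)},
    ]
    print(gmm_quantize((10.3, 9.8), comps))
```

A decoder would need the chosen component index in addition to the quantizer indices, which is part of the rate cost that a real design has to account for.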