Perfect reconstruction QMF banks for two-dimensional applications
A theory is outlined whereby it is possible to design an M x N channel two-dimensional quadrature mirror filter bank with the perfect reconstruction property. This property ensures freedom from aliasing, amplitude distortion, and phase distortion. The method is based on a simple property of certain transfer matrices, namely losslessness.
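As a minimal illustration of the losslessness idea, the following Python sketch uses a one-dimensional, two-channel bank rather than the M x N two-dimensional case treated in the paper. It assumes the simplest lossless (orthonormal) polyphase matrix, the 2 x 2 Haar matrix, and the function names `analysis` and `synthesis` are hypothetical. Because the matrix is orthonormal, applying its transpose on the synthesis side recovers the input exactly.

```python
import math

S = 1.0 / math.sqrt(2.0)  # scaling that makes the 2x2 Haar matrix orthonormal


def analysis(x):
    """Split an even-length signal into two decimated subbands.

    Each input pair (x[2i], x[2i+1]) is multiplied by the orthonormal
    matrix [[1, 1], [1, -1]] / sqrt(2) (an assumed, simplest lossless choice).
    """
    half = len(x) // 2
    low = [S * (x[2 * i] + x[2 * i + 1]) for i in range(half)]
    high = [S * (x[2 * i] - x[2 * i + 1]) for i in range(half)]
    return low, high


def synthesis(low, high):
    """Apply the transpose of the analysis matrix and interleave the outputs."""
    y = []
    for l, h in zip(low, high):
        y.append(S * (l + h))
        y.append(S * (l - h))
    return y


x = [1.0, 4.0, -2.0, 0.5, 3.0, 3.0]
low, high = analysis(x)
y = synthesis(low, high)
```

In this sketch the reconstruction `y` matches `x` up to floating-point rounding, and the total energy of the two subbands equals the energy of the input, which is the energy-preserving behaviour expected of a lossless system.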
Coding gain in paraunitary analysis/synthesis systems
A formal proof that bit allocation results hold for the entire class of paraunitary subband coders is presented. The problem of finding an optimal paraunitary subband coder, so as to maximize the coding gain of the system, is discussed. The bit allocation problem is analyzed for the case of the paraunitary tree-structured filter banks, such as those used for generating orthonormal wavelets. The even more general case of nonuniform filter banks is also considered. In all cases it is shown that under optimal bit allocation, the variances of the errors introduced by each of the quantizers have to be equal. Expressions for coding gains for these systems are derived
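The equal-error-variance result can be sketched numerically. The Python example below is an illustration under stated assumptions, not the paper's derivation: it assumes a uniform M-channel paraunitary bank, the standard high-resolution quantizer model in which the error variance of channel k is `c * var_k * 2**(-2 * b_k)`, and real-valued bit allocations with no non-negativity or integer constraint. The function names and the constant `c` are hypothetical.

```python
import math


def optimal_bit_allocation(variances, avg_bits):
    """Bits per channel: b_k = b + 0.5 * log2(var_k / geometric_mean)."""
    log_gm = sum(math.log2(v) for v in variances) / len(variances)
    return [avg_bits + 0.5 * (math.log2(v) - log_gm) for v in variances]


def quantizer_error_variances(variances, bits, c=1.0):
    """High-resolution model (assumed): error variance = c * var * 2^(-2b)."""
    return [c * v * 2.0 ** (-2.0 * b) for v, b in zip(variances, bits)]


def coding_gain(variances):
    """Arithmetic mean over geometric mean of the subband variances."""
    m = len(variances)
    arithmetic_mean = sum(variances) / m
    geometric_mean = 2.0 ** (sum(math.log2(v) for v in variances) / m)
    return arithmetic_mean / geometric_mean


subband_variances = [16.0, 4.0, 1.0, 0.25]  # made-up illustrative values
bits = optimal_bit_allocation(subband_variances, avg_bits=3.0)
errors = quantizer_error_variances(subband_variances, bits)
gain = coding_gain(subband_variances)
```

With these made-up variances the allocation is 4.5, 3.5, 2.5, and 1.5 bits, every quantizer ends up with the same error variance, and the coding gain is the ratio 5.3125 / 2 = 2.65625.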
Tree-structured complementary filter banks using all-pass sections
Tree-structured complementary filter banks are developed with transfer functions that are simultaneously all-pass complementary and power complementary. Using a formulation based on unitary transforms and all-pass functions, we obtain analysis and synthesis filter banks which are related through a transposition operation, such that the cascade of analysis and synthesis filter banks achieves an all-pass function. The simplest structure is obtained using a Hadamard transform, which is shown to correspond to a binary tree structure. Tree structures can be generated for a variety of other unitary transforms as well. In addition, given a tree-structured filter bank where the number of bands is a power of two, simple methods are developed to generate complementary filter banks with an arbitrary number of channels, which retain the transpose relationship between analysis and synthesis banks, and allow for any combination of bandwidths. The structural properties of the filter banks are illustrated with design examples, and multirate applications are outlined
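The two-band case of this construction can be checked numerically. The following Python sketch assumes two first-order real all-pass sections with arbitrarily chosen coefficients and combines them with a 2 x 2 Hadamard matrix scaled by 1/2. The names `allpass` and `two_band` and the coefficient values are hypothetical and are not taken from the paper's design examples, and the resulting filters are not claimed to be good lowpass or highpass designs.

```python
import cmath


def allpass(a, w):
    """First-order real all-pass A(z) = (a + z^-1) / (1 + a z^-1) at z = e^{jw}."""
    z_inv = cmath.exp(-1j * w)
    return (a + z_inv) / (1.0 + a * z_inv)


def two_band(w, a0=0.2, a1=-0.5):
    """Combine two all-pass branches with a scaled 2x2 Hadamard matrix.

    H0 = (A0 + A1) / 2 and H1 = (A0 - A1) / 2, so that
    H0 + H1 = A0 (all-pass complementary) and
    |H0|^2 + |H1|^2 = 1 (power complementary).
    """
    A0 = allpass(a0, w)
    A1 = allpass(a1, w)
    return 0.5 * (A0 + A1), 0.5 * (A0 - A1)


frequencies = [0.0, 0.3, 1.0, 2.0, 3.0]  # radians per sample
checks = []
for w in frequencies:
    H0, H1 = two_band(w)
    checks.append((abs(H0 + H1), abs(H0) ** 2 + abs(H1) ** 2))
```

Each entry of `checks` should be close to `(1.0, 1.0)`: the sum of the two filters has unit magnitude at every frequency, and the squared magnitudes add up to one.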
Time and frequency domain algorithms for speech coding
The promise of digital hardware economies (due to recent advances in
VLSI technology) has focussed much attention on more complex and sophisticated
speech coding algorithms which offer improved quality at relatively
low bit rates.
This thesis describes the results (obtained from computer simulations)
of research into various efficient (time and frequency domain) speech
encoders operating at a transmission bit rate of 16 Kbps.
In the time domain, Adaptive Differential Pulse Code Modulation (ADPCM)
systems employing both forward and backward adaptive prediction were
examined. A number of algorithms were proposed and evaluated, including
several variants of the Stochastic Approximation Predictor (SAP). A
Backward Block Adaptive (BBA) predictor was also developed and found to
outperform the conventional stochastic methods, even though its complexity
in terms of signal processing requirements is lower. A simplified
Adaptive Predictive Coder (APC) employing a single-tap pitch predictor,
considered next, provided a slight improvement in performance over ADPCM,
but with rather greater complexity.
The ultimate test of any speech coding system is the perceptual performance
of the received speech. Recent research has indicated that this
may be enhanced by suitable control of the noise spectrum according to
the theory of auditory masking. Various noise shaping ADPCM
configurations were examined, and it was demonstrated that a proposed
pre-/post-filtering arrangement, which exploits the predictor-quantizer
interaction to advantage, leads to the best subjective
performance in both forward and backward prediction systems.
Adaptive quantization is instrumental to the performance of ADPCM systems.
Both the forward adaptive quantizer (AQF) and the backward one-word
memory adaptation (AQJ) were examined. In addition, a novel method
of decreasing quantization noise in ADPCM-AQJ coders, which involves the
application of correction to the decoded speech samples, provided
reduced output noise across the spectrum, with considerable high frequency
noise suppression.
More powerful (and inevitably more complex) frequency domain speech
coders such as the Adaptive Transform Coder (ATC) and the Sub-band Coder
(SBC) offer good quality speech at 16 Kbps. To reduce complexity and
coding delay, whilst retaining the advantage of sub-band coding, a novel
transform based split-band coder (TSBC) was developed and found to compare
closely in performance with the SBC.
To prevent the heavy side information requirement associated with a
large number of bands in split-band coding schemes from impairing coding
accuracy, without forgoing the efficiency provided by adaptive bit
allocation, a method employing AQJs to code the sub-band signals together
with vector quantization of the bit allocation patterns was also
proposed.
Finally, 'pipeline' methods of bit allocation and step size estimation
(using the Fast Fourier Transform (FFT) on the input signal) were examined.
Such methods, although less accurate, are nevertheless useful in
limiting coding delay associated with SBC schemes employing Quadrature
Mirror Filters (QMF).
Scalable Speech Coding for IP Networks
The emergence of Voice over Internet Protocol (VoIP) has posed new challenges to the development of speech codecs. The key issue in transporting real-time voice packets over IP networks is the lack of any guarantee of reasonable speech quality, owing to packet delay or loss.
Most of the widely used narrowband codecs depend on the Code Excited Linear Prediction (CELP) coding technique. The CELP technique utilizes long-term prediction across frame boundaries and therefore causes error propagation in the case of packet loss, so redundant information must be transmitted to mitigate the problem. The internet Low Bit-rate Codec (iLBC) employs frame-independent coding and therefore inherently possesses high robustness to packet loss. However, the original iLBC lacks some of the key features of speech codecs for IP networks: rate flexibility, scalability, and wideband support.
This dissertation presents novel scalable narrowband and wideband speech codecs for IP networks using the frame-independent coding scheme based on the iLBC. Rate flexibility is added to the iLBC by employing the discrete cosine transform (DCT) and scalable algebraic vector quantization (AVQ), and by allocating different numbers of bits to the AVQ. Bit-rate scalability is obtained by adding an enhancement layer to the core layer of the multi-rate iLBC. The enhancement layer encodes the weighted iLBC coding error in the modified DCT (MDCT) domain. The proposed wideband codec employs the bandwidth extension technique to extend the capabilities of existing narrowband codecs to provide wideband coding functionality. The wavelet transform is also used to further enhance the performance of the proposed codec.
The performance evaluation results show that the proposed codec provides high robustness to packet loss and achieves equivalent or higher speech quality than state-of-the-art codecs under the clean channel condition
Energy Based Split Vector Quantizer Employing Signal Representation in Multiple Transform Domains.
This invention relates to the representation of one- and multidimensional signal vectors in nonorthogonal domains and the design of Vector Quantizers that can be chosen among these representations. A Vector Quantization technique in multiple nonorthogonal domains is presented for both waveform and model based signal characterization. An iterative codebook accuracy enhancement algorithm, applicable to both waveform and model based Vector Quantization in multiple nonorthogonal domains, which yields further improvement in signal coding performance, is disclosed. Further, Vector Quantization in nonorthogonal domains is applied to speech and exhibits clear improvements in reconstruction quality at the same bit rate compared to existing single domain Vector Quantization techniques. The technique disclosed herein can be easily extended to several other one- and multidimensional signal classes.
Bit rates in audio source coding
The goal is to introduce and solve the audio coding optimization problem. Psychoacoustic results such as masking and excitation pattern models are combined with results from rate distortion theory to formulate the audio coding optimization problem. The solution of the audio optimization problem is a masked error spectrum, prescribing how quantization noise must be distributed over the audio spectrum to obtain a minimal bit rate and inaudible coding errors. This result can not only be used to estimate performance bounds, but can also be applied directly in audio coding systems. Subband coding applications to magnetic recording and transmission are discussed in some detail. Performance bounds for this type of subband coding system are derived.
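A rough sketch of how a masked error spectrum translates into a bit-rate estimate is given below. It is not the paper's formulation: it assumes independent Gaussian band components, places the noise power in each band exactly at the masked threshold, and uses the textbook Gaussian rate-distortion expression of half the base-2 logarithm of the signal-to-noise ratio per sample. The function name `masked_rate` and the numerical values are hypothetical.

```python
import math


def masked_rate(signal_power, masked_threshold):
    """Estimate bits per sample summed over bands.

    A band contributes 0.5 * log2(signal / threshold) when its signal power
    exceeds the masked threshold, and nothing otherwise, because noise below
    the threshold is assumed inaudible and a fully masked band needs no bits.
    """
    total = 0.0
    for s, m in zip(signal_power, masked_threshold):
        if s > m:
            total += 0.5 * math.log2(s / m)
    return total


band_power = [100.0, 10.0, 1.0]      # made-up band signal powers
band_threshold = [1.0, 10.0, 4.0]    # made-up masked thresholds
rate = masked_rate(band_power, band_threshold)
```

In this made-up example only the first band lies above its masked threshold, so it alone contributes to the estimated rate.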