39 research outputs found
Theory, design and applications of linear transforms for information transmission
The aim of this dissertation is to study the common features of block transforms, subband filter banks, and wavelets, and demonstrate how discrete uncertainty can be applied to evaluate these different decomposition techniques. In particular, we derive an uncertainty bound for discrete-time functions. It is shown that this bound is the same as that for continuous-time functions, if the discrete-time functions have a certain degree of regularity.
This dissertation also deals with spectral modeling in filter banks. It is shown, both theoretically and experimentally, that subspectral modeling is superior to full-spectrum modeling if performed before the rate change. The price paid for this performance improvement is an increase in computation. Several different signal sources were considered in this study. It is shown that the performances of AR and ARMA modeling techniques are comparable in subspectral modeling; the former is preferred because of its simplicity. As an application of AR modeling, a speech coding algorithm, namely CELP embedded in a filter bank structure, was also studied. We found that the subband CELP technique offered no improvement over the full-band one. The theoretical reasoning behind the experimental results is also given.
This dissertation also addresses the problems of what type of transform should be used and to what extent an image should be decomposed. To this end, objective and subjective evaluations of different transform bases were performed.
We propose an algorithm for decomposing a channel into its sub-channels in discrete multitone communications. This algorithm evaluates the unevenness and energy distribution of the channel spectrum in order to obtain a variable, adaptive partitioning of the channel. It is shown that the proposed algorithm leads to near-optimal performance of the discrete multitone transceiver. This flexible splitting of the channel suffers less from the aliasing problem that exists in blind decompositions using fixed transforms. This dissertation extends discrete multitone to the flexible multiband concept, which brings significant performance improvements for digital communications.
Subband image coding using filter banks with non-uniform passband distribution
In this paper, subband filter banks with non-uniform passband distribution in the frequency domain are studied. Several design examples are presented and compared with conventional uniform bandwidth filter banks. Image coding results show that filter banks with non-uniform bandwidth outperform filter banks with uniform bandwidth, especially in low bit rate coding.
Generalized Triangular Decomposition in Transform Coding
A general family of optimal transform coders (TCs) is introduced here based on the generalized triangular decomposition (GTD) developed by Jiang et al. This family includes the Karhunen-Loeve transform (KLT) and the generalized version of the prediction-based lower triangular transform (PLT) introduced by Phoong and Lin as special cases. The coding gain of the entire family, with optimal bit allocation, is equal to that of the KLT and the PLT. Even though the original PLT introduced by Phoong is not applicable to vectors that are not blocked versions of scalar wide-sense stationary processes, the GTD-based family includes members that are natural extensions of the PLT and therefore also enjoy the so-called MINLAB structure of the PLT, which has the unit noise-gain property. Other special cases of the GTD-TC are the geometric mean decomposition (GMD) and the bidiagonal decomposition (BID) transform coders. The GMD-TC in particular has the property that the optimum bit allocation is a uniform allocation; this is because all its transform domain coefficients have the same variance, which implies that the dynamic ranges of the coefficients to be quantized are identical.
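The relation between coefficient variances, coding gain and bit allocation described in this abstract can be illustrated with the standard high-rate formulas. The sketch below is not taken from the paper: it assumes a 2x2 covariance matrix with unit variances and correlation `rho` (so the KLT variances are `1 + rho` and `1 - rho`), a unit noise-gain coder, and it takes the common GMD coefficient variance to be the geometric mean of the KLT variances, which is an assumption chosen to be consistent with the equal coding gain stated above. All function names are hypothetical.

```python
import math

def geometric_mean(values):
    """Geometric mean of a list of positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def coding_gain(input_variances, coeff_variances):
    """High-rate coding gain, assuming unit noise gain:
    mean input variance divided by the geometric mean of the
    transform-coefficient variances."""
    mean_input = sum(input_variances) / len(input_variances)
    return mean_input / geometric_mean(coeff_variances)

def optimal_bits(coeff_variances, avg_bits):
    """Classical high-rate bit allocation: b_i = b + 0.5*log2(var_i / GM)."""
    gm = geometric_mean(coeff_variances)
    return [avg_bits + 0.5 * math.log2(v / gm) for v in coeff_variances]

rho = 0.9                               # assumed correlation, illustration only
input_variances = [1.0, 1.0]            # unit-variance input samples
klt_variances = [1 + rho, 1 - rho]      # eigenvalues of [[1, rho], [rho, 1]]
# Assumption: every GMD coefficient has the geometric-mean variance.
gmd_variances = [geometric_mean(klt_variances)] * 2

klt_gain = coding_gain(input_variances, klt_variances)
gmd_gain = coding_gain(input_variances, gmd_variances)
klt_bits = optimal_bits(klt_variances, 4.0)
gmd_bits = optimal_bits(gmd_variances, 4.0)

print("KLT gain:", klt_gain, "bits:", klt_bits)
print("GMD gain:", gmd_gain, "bits:", gmd_bits)
```

Under these assumptions both coders reach the same gain, while only the equal-variance case yields the same number of bits for every coefficient.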
Orthonormal and biorthonormal filter banks as convolvers, and convolutional coding gain
Convolution theorems for filter bank transformers are introduced. Both uniform and nonuniform decimation ratios are considered, and orthonormal as well as biorthonormal cases are addressed. All the theorems are such that the original convolution reduces to a sum of shorter, decoupled convolutions in the subbands. That is, there is no need to have cross convolution between subbands. For the orthonormal case, expressions for optimal bit allocation and the optimized coding gain are derived. The contribution to coding gain comes partly from the nonuniformity of the signal spectrum and partly from nonuniformity of the filter spectrum. With one of the convolved sequences taken to be the unit pulse function, the coding gain expressions reduce to those for traditional subband and transform coding. The filter-bank convolver has about the same computational complexity as a traditional convolver, if the analysis bank has small complexity compared to the convolution itself.
Dual-DCT-Lifting-Based Lapped Transform with Improved Reversible Symmetric Extension
We present a lifting-based lapped transform (L-LT) and a reversible symmetric extension (RSE) in the boundary processing for more effective lossy-to-lossless image coding of data with various qualities from only one piece of lossless compressed data. The proposed dual-DCT-lifting-based LT (D2L-LT) processes two identical LTs in parallel and consists of 1-D and 2-D DCT-liftings, which allow the direct use of a DCT matrix in each lifting coefficient. Since the DCT-lifting can utilize any existing DCT software or hardware, it has great potential for elegant implementations that depend on the architecture and DCT algorithm used. In addition, we present an improved RSE (IRSE) that works by recalculating the boundary processing and solves the boundary problem of the DCT-lifting-based L-LT (DL-LT). We show that D2L-LT with IRSE mostly outperforms conventional L-LTs in lossy-to-lossless image coding.
A new synthesis procedure for linear-phase paraunitary digital filter banks
In this paper, a new design algorithm is presented for a family of linear phase paraunitary filter banks with generalized filter length and symmetry polarity. A number of new constraints on the distributions of filter length and symmetry polarity among the channels are derived. In the algorithm, the lengths of the filters are gradually reduced through a cascade of lattice structures. The derivations for filter banks with even and odd numbers of channels are formulated in a unified form.
Audio Coding Based on Integer Transforms
In recent years audio coding has become a very popular field for
research and applications. Especially perceptual audio coding schemes,
such as MPEG-1 Layer-3 (MP3) and MPEG-2 Advanced Audio Coding (AAC), are
widely used for efficient storage and transmission of music
signals. Nevertheless, for professional applications, such as archiving
and transmission in studio environments, lossless audio coding schemes
are considered more appropriate.
Traditionally, the technical approaches used in perceptual and lossless
audio coding have been separate worlds. In perceptual audio coding, the
use of filter banks, such as the lapped orthogonal transform "Modified
Discrete Cosine Transform" (MDCT), has been the approach of choice,
used by many state-of-the-art coding schemes. On the other hand,
lossless audio coding schemes mostly employ predictive coding of
waveforms to remove redundancy. Only a few attempts have been made so far
to use transform coding for the purpose of lossless audio coding.
This work presents a new approach of applying the lifting scheme to
lapped transforms used in perceptual audio coding. This allows for an
invertible integer-to-integer approximation of the original transform,
e.g. the IntMDCT as an integer approximation of the MDCT. The same
technique can also be applied to low-delay filter banks. A generalized,
multi-dimensional lifting approach and a noise-shaping technique are
introduced, which allow the accuracy of the approximation to the
original transform to be optimized further.
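The core idea behind invertible integer transforms of this kind can be sketched with a single plane rotation. A rotation by an angle factors into three lifting (shear) steps, and rounding the update inside each step keeps the values integer while remaining exactly reversible. The code below is a minimal illustration of that general principle only; it is not the IntMDCT of this work, and the function names and the chosen angle are hypothetical.

```python
import math

def _round(v):
    """Round to nearest integer (half up); any fixed rule works,
    as long as forward and inverse use the same one."""
    return math.floor(v + 0.5)

def int_rotate(x, y, angle):
    """Integer approximation of (x, y) -> (c*x - s*y, s*x + c*y)
    using three rounded lifting steps. Requires sin(angle) != 0."""
    s = math.sin(angle)
    p = (math.cos(angle) - 1.0) / s
    x += _round(p * y)   # first shear
    y += _round(s * x)   # second shear
    x += _round(p * y)   # third shear
    return x, y

def int_rotate_inverse(x, y, angle):
    """Exact inverse: undo the lifting steps in reverse order."""
    s = math.sin(angle)
    p = (math.cos(angle) - 1.0) / s
    x -= _round(p * y)
    y -= _round(s * x)
    x -= _round(p * y)
    return x, y

angle = math.pi / 5      # assumed angle, for illustration only
for x0, y0 in [(100, -37), (0, 1), (-32768, 32767)]:
    xr, yr = int_rotate(x0, y0, angle)
    exact = (math.cos(angle) * x0 - math.sin(angle) * y0,
             math.sin(angle) * x0 + math.cos(angle) * y0)
    print((x0, y0), "->", (xr, yr), "exact:", exact,
          "restored:", int_rotate_inverse(xr, yr, angle))
```

Because each step changes only one variable by a quantity computed from the other, the inverse can recompute the same rounded quantity and subtract it, so the rounding error never prevents perfect reconstruction.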
Based on these new integer transforms, this work presents new audio
coding schemes and applications. The audio coding applications cover
lossless audio coding, scalable lossless enhancement of a perceptual
audio coder and fine-grain scalable perceptual and lossless audio
coding. Finally, an approach to data hiding with high data rates in
uncompressed audio signals based on integer transforms is described.
State of the art in 2D content representation and compression
Deliverable D1.3 of the ANR PERSEE project. This report was produced as part of the ANR PERSEE project (no. ANR-09-BLAN-0170). Specifically, it corresponds to deliverable D3.1 of the project.
A generalized, parametric PR-QMF/wavelet transform design approach for multiresolution signal decomposition
This dissertation aims to emphasize the interrelations and the linkages of the theories of discrete-time filter banks and wavelet transforms. It is shown that the Binomial-QMF banks are identical to the interscale coefficients or filters of the compactly supported orthonormal wavelet transform bases proposed by Daubechies.
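As a small numerical illustration of the filter-bank/wavelet link mentioned above, the sketch below checks that the well-known 4-tap Daubechies lowpass filter satisfies the orthonormal 2-band perfect-reconstruction (PR-QMF) conditions and contains the binomial factor (1 + z^-1)^2, that is, a double zero at z = -1. The coefficients are quoted from standard wavelet references, not from the dissertation, and the helper name `shifted_inner` is hypothetical.

```python
import math

s3 = math.sqrt(3.0)
norm = 4.0 * math.sqrt(2.0)
# 4-tap Daubechies lowpass filter (standard closed form).
h = [(1 + s3) / norm, (3 + s3) / norm, (3 - s3) / norm, (1 - s3) / norm]
# Highpass filter by the usual alternating flip: g[n] = (-1)^n h[N-1-n].
g = [(-1) ** n * h[len(h) - 1 - n] for n in range(len(h))]

def shifted_inner(a, b, shift):
    """Inner product of a[n] with b[n + shift] over the overlapping support."""
    return sum(a[n] * b[n + shift]
               for n in range(len(a)) if 0 <= n + shift < len(b))

# Orthonormality with respect to even shifts (PR-QMF conditions).
print("h.h shift 0:", shifted_inner(h, h, 0))   # close to 1
print("h.h shift 2:", shifted_inner(h, h, 2))   # close to 0
print("h.g shift 0:", shifted_inner(h, g, 0))   # close to 0
print("h.g shift 2:", shifted_inner(h, g, 2))   # close to 0

# Double zero of H(z) at z = -1, i.e. the binomial factor (1 + z^-1)^2.
zero_0 = sum((-1) ** n * h[n] for n in range(len(h)))
zero_1 = sum((-1) ** n * n * h[n] for n in range(len(h)))
print("H(-1):", zero_0, "H'(-1) moment:", zero_1)
```

The same checks can be applied to any candidate 2-band filter pair by replacing `h`; filters that fail them are not orthonormal PR-QMF solutions.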
A generalized, parametric, smooth 2-band PR-QMF design approach based on Bernstein polynomial approximation is developed. It is found that the most regular compactly supported orthonormal wavelet filters and the coiflet filters are only special cases of the proposed filter bank design technique.
A new objective performance measure called the Non-aliasing Energy Ratio (NER) is developed. Its merits are demonstrated through comparative performance studies of the well-known orthonormal signal decomposition techniques.
This dissertation also addresses the optimal 2-band PR-QMF design problem. The variables of practical significance in image processing and coding are included in the optimization problem. The upper performance bounds of 2-band PR-QMF and their corresponding filter coefficients are derived.
It is objectively shown that there are filter bank solutions superior to the standard block transform, the DCT. It is expected that the theoretical contributions of this dissertation will find applications particularly in visual signal processing and coding.