
10 research outputs found

A practical approach for the design of nonuniform lapped transforms

Author: Malvar Henrique S.
Yoon Byung-Jun
Publication date: 01/08/2006

We propose a simple method for the design of lapped transforms with nonuniform frequency resolution and good time localization. The method is a generalization of an approach previously proposed by Princen, where the nonuniform filter bank is obtained by joining uniform cosine-modulated filter banks (CMFBs) using a transition filter. We use several transition filters to obtain a near-perfect-reconstruction (PR) nonuniform lapped transform with significantly reduced overall distortion. The main advantage of the proposed method is in reducing the length of the transition filters, which leads to a reduction in processing delay that can be useful for applications such as real-time audio coding.

Caltech Authors

Coding overcomplete representations of audio using the MCLT

Author: Malvar Henrique S.
Yoon Byung‐Jun
Publication venue: IEEE Computer Society
Publication date: 01/01/2008

We propose a system for audio coding using the modulated complex lapped transform (MCLT). In general, it is difficult to encode signals using overcomplete representations without incurring a penalty in rate-distortion performance. We show that the penalty can be significantly reduced for MCLT-based representations, without the need for iterative methods of sparsity reduction. We achieve that via a magnitude-phase polar quantization and the use of magnitude and phase prediction. Compared to systems based on quantization of orthogonal representations such as the modulated lapped transform (MLT), the new system allows for reduced warbling artifacts and more precise computation of frequency-domain auditory masking functions.

CiteSeerX

Crossref

Caltech Authors

Best Wavelet-Packet Bases for Audio Coding Using Perceptual and Rate-Distortion Criteria

Author: Erne M
Faller Christof
Moschytz G
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/10/2005

Infoscience - École polytechnique fédérale de Lausanne

A turbo-coded burst-by-burst adaptive wide-band speech transceiver

Author: Lajos Hanzo
M. Munster
T. Keller
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

A tutorial on onset detection in music signals

Author: C. Duxbury
J.P. Bello
L. Daudet
M. Davies
M.B. Sandler
S. Abdallah
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

Exploring Discrete Cosine Transform for Multi-resolution Analysis

Author: Abedi Safdar Ali Syed
Publication venue: ScholarWorks @ Georgia State University
Publication date: 10/08/2005

Multi-resolution analysis has been a very popular technique in recent years. Wavelets have been used extensively to perform multi-resolution image expansion and analysis. The DCT, however, has been used to compress images but not for multi-resolution image analysis. This thesis is an attempt to explore the possibilities of using the DCT for multi-resolution image analysis. A naive implementation of block DCT for multi-resolution expansion has many difficulties that lead to signal distortion. One of the main causes of distortion is the blocking artifacts that appear when reconstructing images transformed by DCT. The new algorithm is based on a line DCT, which eliminates the need for block processing. The line DCT operates on a one-dimensional array formed by cascading the image rows and columns in one transform operation. Several images have been used to test the algorithm at various resolution levels. The reconstruction mean square error is used as an indication of the success of the method. The proposed algorithm has also been tested against the traditional block DCT.

ScholarWorks @ Georgia State University

Audio bandwidth extension based on temporal smoothing cepstral coefficients

Author: AH Gray
AH Nour-Eldin
B Geiser
BCJ Moore
C Yağlı
Chang-Chun Bao
E Larsen
F Norden
G Doblinger
GB Song
H Pulakka
HP Knagenhjelm
I Cohen
International Telecommunication Union
J Makhoul
KY Park
M Nilsson
M Slaney
M Tammi
P Jax
P Vary
S Ragot
S Shlien
T Esch
U Chong
Xin Liu
Y Qian
Y Shao
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Étude de transformées temps-fréquence pour le codage audio faible retard en haute qualité

Author: Virette David
Publication venue: HAL CCSD
Publication date: 10/12/2012

In recent years there has been a phenomenal increase in the number of products and applications that make use of audio coding formats. Among the most successful audio coding schemes are MPEG-1 Layer III (mp3), MPEG-2 Advanced Audio Coding (AAC), and its evolution, MPEG-4 High Efficiency-Advanced Audio Coding (HE-AAC). More recently, perceptual audio coding has been adapted to achieve coding at low delay so as to be suitable for conversational applications. Traditionally, a filter bank such as the Modified Discrete Cosine Transform (MDCT) is a central component of perceptual audio coding, and its adaptation to low-delay audio coding has become an important research topic. Low-delay transforms have been developed in order to retain the performance of standard audio coding while dramatically reducing the associated algorithmic delay. This work presents some elements that better accommodate the delay reduction constraint. Among the contributions is a low-delay block-switching tool that allows a direct transition between a long transform and a short transform without the insertion of a transition window. The same principle has been extended to define new perfect reconstruction conditions for the MDCT, with relaxed constraints compared to the original definition. As a consequence, a seamless reconstruction method has been derived to increase the flexibility of transform coding schemes, with the possibility of selecting a transform for a frame independently of its neighbouring frames. Finally, based on this new approach, a new low-delay window design procedure has been derived to obtain an analytic definition for a new family of transforms, permitting high quality with a substantial reduction in coding delay. The performance of the proposed transforms has been thoroughly evaluated, and an evaluation framework involving an objective measurement of the optimal transform sequence is proposed. It confirms the relevance of the proposed transforms for audio coding. In addition, the new approaches have been successfully applied to recent standardisation work items, such as the low-delay audio coding developed at MPEG (LD-AAC and ELD-AAC), and they have been evaluated in numerous subjective tests, showing a significant improvement in quality for transient signals. The new low-delay window design has been adopted in G.718, a scalable speech and audio codec standardized in ITU-T, and has demonstrated its benefit in terms of delay reduction while maintaining the audio quality of a traditional MDCT.

Low-delay audio coding through the definition of new windows for the MDCT and the introduction of a new window-switching scheme.

HAL-CentraleSupelec

Thèses en Ligne

INRIA a CCSD electronic archive server

HAL-Rennes 1