82 research outputs found

Type-IV DCT, DST, and MDCT algorithms with reduced numbers of arithmetic operations

Author: Arai
Arguello
Britanak
Chan
Chen
Cheng
Chiang
Crochiere
Duhamel
Fan
Frigo
Gentleman
Gopinath
Hou
Jing
Johnson
Kamar
Kok
Krot
Lee
Liu
Lundy
Malvar
Martens
Murthy
Narasimha
Nikolajevic
Painter
Pennebaker
Plonka
Princen
Püschel
Qian
Schatzman
Steven G. Johnson
Suehiro
Takala
Tasche
Vetterli
Wang
Xuancheng Shao
Publication venue: 'Elsevier BV'
Publication date: 01/01/2009
Field of study

We present algorithms for the type-IV discrete cosine transform (DCT-IV) and discrete sine transform (DST-IV), as well as for the modified discrete cosine transform (MDCT) and its inverse, that achieve a lower count of real multiplications and additions than previously published algorithms, without sacrificing numerical accuracy. Asymptotically, the operation count is reduced from ~2N log N to ~(17/9)N log N for a power-of-two transform size N, and the exact count is strictly lowered for all N > 4. These results are derived by considering the DCT to be a special case of a DFT of length 8N, with certain symmetries, and then pruning redundant operations from a recent improved fast Fourier transform algorithm (based on a recursive rescaling of the conjugate-pair split-radix algorithm). The improved algorithms for DST-IV and MDCT follow immediately from the improved count for the DCT-IV.
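The abstract's statement that the DCT-IV is a special case of a length-8N DFT can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the unnormalized DCT-IV definition X_k = sum_n x_n cos(pi/N (n + 1/2)(k + 1/2)) and a simple zero-padded embedding (input samples placed at odd indices of a length-8N sequence). The symmetric embedding and the pruned split-radix FFT used by the authors are not given in the abstract and are not reproduced here; the function names are hypothetical.

```python
import cmath
import math


def dct4_direct(x):
    """Unnormalized DCT-IV computed straight from its definition, O(N^2)."""
    N = len(x)
    return [
        sum(x[n] * math.cos(math.pi / N * (n + 0.5) * (k + 0.5)) for n in range(N))
        for k in range(N)
    ]


def dft(y):
    """Naive O(M^2) DFT; a real implementation would use an FFT here."""
    M = len(y)
    return [
        sum(y[j] * cmath.exp(-2j * cmath.pi * j * m / M) for j in range(M))
        for m in range(M)
    ]


def dct4_via_dft8n(x):
    """DCT-IV read off from a length-8N DFT.

    cos(pi/N (n+1/2)(k+1/2)) equals cos(2*pi (2n+1)(2k+1) / (8N)), so
    placing x[n] at index 2n+1 of a length-8N sequence and taking the real
    part of DFT output 2k+1 reproduces the DCT-IV coefficient X_k.
    """
    N = len(x)
    y = [0.0] * (8 * N)
    for n, value in enumerate(x):
        y[2 * n + 1] = value
    Y = dft(y)
    return [Y[2 * k + 1].real for k in range(N)]


if __name__ == "__main__":
    sample = [1.0, -2.0, 0.5, 3.0]
    print(dct4_direct(sample))
    print(dct4_via_dft8n(sample))
```

The two printed lists should agree to floating-point precision. The naive DFT is used only to keep the example dependency-free; the savings discussed in the abstract come from exploiting the zeros and symmetries of the length-8N input inside a fast algorithm.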

arXiv.org e-Print Archive

CiteSeerX

Crossref

A fast algorithm for the computation of 2-D forward and inverse MDCT

Author: Luo Limin
Senhadji Lotfi
Shu Huazhong
Wu Jiasong
Publication venue: 'Elsevier BV'
Publication date: 01/06/2008
Field of study

A fast algorithm for computing the two-dimensional (2-D) forward and inverse modified discrete cosine transform (MDCT and IMDCT) is proposed. The algorithm converts the 2-D MDCT and IMDCT with block size M × N into four 2-D discrete cosine transforms (DCTs) with block size (M/4) × (N/4). It is based on an algorithm recently presented by Cho et al. [An optimized algorithm for computing the modified discrete cosine transform and its inverse transform, in: Proceedings of the IEEE TENCON, vol. A, 21–24 November 2004, pp. 626–628] for the efficient calculation of the one-dimensional MDCT and IMDCT. A comparison of computational complexity with the traditional row–column method shows that the proposed algorithm significantly reduces the number of arithmetic operations.
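For context, the traditional row–column method that the abstract uses as its baseline can be sketched as follows. This is not the proposed fast algorithm. It assumes the common unnormalized 1-D MDCT definition X_k = sum_{n=0}^{2N-1} x_n cos(pi/N (n + 1/2 + N/2)(k + 1/2)), which maps 2N samples to N coefficients; the paper's own normalization and indexing conventions may differ, and the function names are hypothetical.

```python
import math


def mdct(x):
    """Unnormalized 1-D MDCT: 2N input samples -> N coefficients."""
    length = len(x)
    N = length // 2
    return [
        sum(
            x[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
            for n in range(length)
        )
        for k in range(N)
    ]


def mdct2_row_column(block):
    """2-D MDCT by the row-column method.

    Transform every row, then every column of the intermediate result.
    An R x C block (R and C even) becomes an (R/2) x (C/2) block.
    """
    row_pass = [mdct(row) for row in block]            # R rows, C/2 columns
    col_pass = [mdct(list(col)) for col in zip(*row_pass)]  # C/2 lists of R/2
    return [list(row) for row in zip(*col_pass)]       # transpose back


def mdct2_direct(block):
    """2-D MDCT from the separable double sum, for cross-checking only."""
    R, C = len(block), len(block[0])
    P, Q = R // 2, C // 2
    return [
        [
            sum(
                block[m][n]
                * math.cos(math.pi / P * (m + 0.5 + P / 2) * (k + 0.5))
                * math.cos(math.pi / Q * (n + 0.5 + Q / 2) * (l + 0.5))
                for m in range(R)
                for n in range(C)
            )
            for l in range(Q)
        ]
        for k in range(P)
    ]


if __name__ == "__main__":
    block = [[float((3 * i + 5 * j) % 7) for j in range(4)] for i in range(4)]
    print(mdct2_row_column(block))
    print(mdct2_direct(block))
```

Both functions should print the same 2 × 2 result up to rounding error. The row–column version needs one 1-D transform per row and per intermediate column, which is the cost the proposed algorithm is compared against.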

HAL-Rennes 1

Audio Coding Based on Integer Transforms

Author: Geiger Ralf
Publication venue
Publication date: 25/02/2008
Field of study

In recent years audio coding has become a very popular field for research and applications. Perceptual audio coding schemes in particular, such as MPEG-1 Layer-3 (MP3) and MPEG-2 Advanced Audio Coding (AAC), are widely used for efficient storage and transmission of music signals. Nevertheless, for professional applications, such as archiving and transmission in studio environments, lossless audio coding schemes are considered more appropriate. Traditionally, the technical approaches used in perceptual and lossless audio coding have been separate worlds. In perceptual audio coding, filter banks such as the lapped orthogonal transform "Modified Discrete Cosine Transform" (MDCT) have been the approach of choice in many state-of-the-art coding schemes. Lossless audio coding schemes, on the other hand, mostly employ predictive coding of waveforms to remove redundancy. Only a few attempts have been made so far to use transform coding for lossless audio coding. This work presents a new approach that applies the lifting scheme to the lapped transforms used in perceptual audio coding. This allows for an invertible integer-to-integer approximation of the original transform, e.g. the IntMDCT as an integer approximation of the MDCT. The same technique can also be applied to low-delay filter banks. A generalized, multi-dimensional lifting approach and a noise-shaping technique are introduced, which further improve the accuracy of the approximation to the original transform. Based on these new integer transforms, this work presents new audio coding schemes and applications. The audio coding applications cover lossless audio coding, scalable lossless enhancement of a perceptual audio coder, and fine-grain scalable perceptual and lossless audio coding. Finally, an approach to data hiding with high data rates in uncompressed audio signals, based on integer transforms, is described.
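The core idea behind an invertible integer-to-integer approximation via lifting can be illustrated on a single plane rotation. The sketch below uses the well-known factorization of a rotation into three shear (lifting) steps, each followed by rounding. It is a generic illustration and not the IntMDCT construction from the thesis; the function names are hypothetical, and the angle is assumed not to be a multiple of pi so that sin(angle) is nonzero.

```python
import math


def _round(value):
    """Round to the nearest integer (halves go up); any fixed rule works."""
    return math.floor(value + 0.5)


def lift_rotate(x, y, angle):
    """Integer approximation of rotating (x, y) by `angle`.

    Uses R(angle) = S(t) * L(s) * S(t) with s = sin(angle) and
    t = (cos(angle) - 1) / sin(angle), rounding after each lifting step.
    """
    s = math.sin(angle)
    t = (math.cos(angle) - 1.0) / s
    x += _round(t * y)
    y += _round(s * x)
    x += _round(t * y)
    return x, y


def lift_rotate_inverse(x, y, angle):
    """Exact inverse: undo the three lifting steps in reverse order."""
    s = math.sin(angle)
    t = (math.cos(angle) - 1.0) / s
    x -= _round(t * y)
    y -= _round(s * x)
    x -= _round(t * y)
    return x, y


if __name__ == "__main__":
    angle = 0.3
    u, v = lift_rotate(100, -37, angle)
    print((u, v), lift_rotate_inverse(u, v, angle))
```

Each lifting step changes only one of the two values, by a rounded function of the other, so subtracting the same rounded quantity restores the input exactly even though the forward result only approximates the true rotation. That exact invertibility is what makes lossless coding on top of such transforms possible.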

Fraunhofer-ePrints

Digitale Bibliothek Thüringen

Improved and Extended Mixed-Radix Decimation in Frequency Fast MDCT Algorithm

Author: Britaňák Vladimír
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 26/01/2012
Field of study

Recently, a mixed-radix decimation-in-frequency (DIF) fast MDCT algorithm, restricted to the composite lengths N = 3^m · 2, m > 0, was proposed in [ShuBao]. An improved mixed-radix DIF fast MDCT algorithm, better in both regularity and computational complexity, is described. Based on simple algebraic identities observed in the original algorithm of [ShuBao], new formulas are derived that result in a very regular computational structure. Consequently, the number of arithmetic operations is reduced significantly. Moreover, the improved algorithm is extended to all composite lengths N = 3^m · 2^p, m, p > 0. The improved algorithm defines new sparse matrix factorizations of the MDCT matrix for these composite lengths and provides new implementations of the forward/backward MDCT in the MPEG-1/2 Layer III (MP3) audio coding standard.

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Implementation of a transform based audio encoder

Author: Funken Ralf F.M.
Publication venue
Publication date: 01/01/1997
Field of study

Repository TU/e

Pure OAI Repository

Survey of error concealment schemes for real-time audio transmission systems

Author: Robles Moya Aránzazu
Publication venue
Publication date: 18/09/2012
Field of study

This thesis presents an overview of the main strategies employed for error detection and error concealment in different real-time transmission systems for digital audio. The “Adaptive Differential Pulse-Code Modulation (ADPCM)”, the “Audio Processing Technology Apt-x100”, the “Extended Adaptive Multi-Rate Wideband (AMR-WB+)”, the “Advanced Audio Coding (AAC)”, the “MPEG-1 Audio Layer II (MP2)”, the “MPEG-1 Audio Layer III (MP3)” and finally the “Adaptive Transform Coder 3 (AC3)” are considered. As an example of error management, a simulation of the AMR-WB+ codec is included. The simulation allows an evaluation of the mechanisms included in the codec definition and also of the different bit error sensitivities of the encoded audio payload.

Universidad Carlos III de Madrid e-Archivo

On the data hiding theory and multimedia content security applications

Author: Gang Litao
Publication venue: Digital Commons @ NJIT
Publication date: 31/01/2002
Field of study

This dissertation is a comprehensive study of digital steganography for multimedia content protection. With the increasing development of Internet technology, protection and enforcement of multimedia property rights has become a great concern to multimedia authors and distributors. Watermarking technologies provide a possible solution to this problem. The dissertation first briefly introduces current watermarking schemes, including their applications in video, image and audio. Most available embedding schemes are based on direct Spread Sequence (SS) modulation. A small-valued pseudo-random signature sequence is embedded into the host signal and the information is extracted via correlation. The correlation detection problem is discussed at the beginning. It is concluded that the correlator is not optimum in oblivious detection. The Maximum Likelihood detector is derived and some feasible suboptimal detectors are also analyzed. Through the calculation of the extraction Bit Error Rate (BER), it is revealed that the SS scheme is not very efficient due to its poor host noise suppression. The watermark domain selection problem is addressed subsequently. Some implications for hiding capacity and reliability are also studied. The last topic in the SS modulation scheme is sequence selection. The relationship between sequence bandwidth and synchronization requirement is detailed in the work. It is demonstrated that the white sequence commonly used in watermarking may not really boost watermark security. To address the host noise suppression problem, the hidden communication is modeled as a general hypothesis testing problem and a set partitioning scheme is proposed. Simulation studies and mathematical analysis confirm that it outperforms the SS schemes in host noise suppression. The proposed scheme demonstrates improvement over the existing embedding schemes. Data hiding in audio signals is explored next. Audio data hiding is believed to be a more challenging task due to human sensitivity to audio artifacts and the advanced features of current compression techniques. The human psychoacoustic model and human music understanding are also covered in the work. Then, as a typical perceptual audio compression scheme, the popular MP3 compression is examined at some length. Several schemes (amplitude modulation, phase modulation and noise substitution) are presented together with some experimental results. As a case study, a music bitstream encryption scheme is proposed. In all these applications, the human psychoacoustic model plays a very important role. A more advanced audio analysis model is introduced to reveal implications for music understanding. In the last part, conclusions and future research are presented.

Digital Commons @ New Jersey Institute of Technology (NJIT)

Computational Models Designed in MATLAB to Improve Parameters and Cost of Modern Chips

Author: Peter Malík
Publication venue: 'IntechOpen'
Publication date: 13/10/2011
Field of study

IntechOpen