25 research outputs found

    Type-IV DCT, DST, and MDCT algorithms with reduced numbers of arithmetic operations

    We present algorithms for the type-IV discrete cosine transform (DCT-IV) and discrete sine transform (DST-IV), as well as for the modified discrete cosine transform (MDCT) and its inverse, that achieve a lower count of real multiplications and additions than previously published algorithms, without sacrificing numerical accuracy. Asymptotically, the operation count is reduced from ~2N log N to ~(17/9)N log N for a power-of-two transform size N, and the exact count is strictly lowered for all N > 4. These results are derived by treating the DCT as a special case of a DFT of length 8N with certain symmetries, and then pruning redundant operations from a recent improved fast Fourier transform algorithm (based on a recursive rescaling of the conjugate-pair split-radix algorithm). The improved algorithms for DST-IV and MDCT follow immediately from the improved count for the DCT-IV. Comment: 11 pages.
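The claim that a DST-IV algorithm follows from a DCT-IV algorithm can be checked numerically with a small sketch. The code below is not the reduced-operation algorithm from the paper; it uses direct O(N^2) sums, and the function names are illustrative. It relies on the standard identity that the DST-IV of x equals the reversed DCT-IV of x with alternating signs, and it assumes a base-2 logarithm when evaluating the leading terms of the two operation counts.

```python
import math

def dct_iv(x):
    """Direct O(N^2) DCT-IV: X[k] = sum_n x[n] * cos(pi/N * (n + 1/2) * (k + 1/2))."""
    n = len(x)
    return [
        sum(x[j] * math.cos(math.pi / n * (j + 0.5) * (k + 0.5)) for j in range(n))
        for k in range(n)
    ]

def dst_iv(x):
    """Direct O(N^2) DST-IV: same as dct_iv but with sin instead of cos."""
    n = len(x)
    return [
        sum(x[j] * math.sin(math.pi / n * (j + 0.5) * (k + 0.5)) for j in range(n))
        for k in range(n)
    ]

def dst_iv_via_dct_iv(x):
    """DST-IV obtained from a DCT-IV: flip the sign of odd-indexed inputs, reverse the output."""
    signed = [-v if j % 2 else v for j, v in enumerate(x)]
    return dct_iv(signed)[::-1]

def leading_term_counts(n):
    """Leading terms only (log base 2 is an assumption); lower-order terms are ignored."""
    return 2 * n * math.log2(n), 17 / 9 * n * math.log2(n)

x = [1.0, -2.0, 3.5, 0.25, -1.5, 4.0, 0.0, 2.0]
max_err = max(abs(a - b) for a, b in zip(dst_iv(x), dst_iv_via_dct_iv(x)))

# The DCT-IV is its own inverse up to a factor of 2/N.
roundtrip = [2 / len(x) * v for v in dct_iv(dct_iv(x))]
roundtrip_err = max(abs(a - b) for a, b in zip(x, roundtrip))

old_count, new_count = leading_term_counts(1024)
```

Because the sign flips and the reversal cost no multiplications or additions, any saving in the DCT-IV carries over to the DST-IV unchanged.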

    A fast algorithm for the computation of 2-D forward and inverse MDCT

    A fast algorithm for computing the two-dimensional (2-D) forward and inverse modified discrete cosine transform (MDCT and IMDCT) is proposed. The algorithm converts the 2-D MDCT and IMDCT with block size M × N into four 2-D discrete cosine transforms (DCTs) with block size (M/4) × (N/4). It is based on an algorithm recently presented by Cho et al. [An optimized algorithm for computing the modified discrete cosine transform and its inverse transform, in: Proceedings of the IEEE TENCON, vol. A, 21–24 November 2004, pp. 626–628] for the efficient calculation of the one-dimensional MDCT and IMDCT. A comparison of computational complexity with the traditional row–column method shows that the proposed algorithm significantly reduces the number of arithmetic operations.
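For orientation, the traditional row–column method that the abstract uses as its baseline can be sketched as follows. This is not the proposed fast algorithm: it applies a direct 1-D MDCT to every row and then to every column, and compares the result with the 2-D double sum. The function names and the test block are illustrative, and the 1-D MDCT is assumed to use the common definition with phase offset n + 1/2 + N/2.

```python
import math

def mdct(x):
    """Direct 1-D MDCT: 2N inputs -> N outputs,
    X[k] = sum_n x[n] * cos(pi/N * (n + 1/2 + N/2) * (k + 1/2))."""
    n = len(x) // 2
    return [
        sum(x[j] * math.cos(math.pi / n * (j + 0.5 + n / 2) * (k + 0.5))
            for j in range(2 * n))
        for k in range(n)
    ]

def mdct_2d_row_column(block):
    """Row-column 2-D MDCT: an M x N block becomes an (M/2) x (N/2) block."""
    rows = [mdct(row) for row in block]             # M x (N/2)
    cols = [mdct(list(col)) for col in zip(*rows)]  # (N/2) x (M/2)
    return [list(r) for r in zip(*cols)]            # (M/2) x (N/2)

def mdct_2d_direct(block):
    """Reference 2-D MDCT evaluated as a double sum over the whole block."""
    m, n = len(block), len(block[0])
    p, q = m // 2, n // 2
    out = []
    for k in range(p):
        row = []
        for l in range(q):
            acc = 0.0
            for r in range(m):
                cr = math.cos(math.pi / p * (r + 0.5 + p / 2) * (k + 0.5))
                for c in range(n):
                    cc = math.cos(math.pi / q * (c + 0.5 + q / 2) * (l + 0.5))
                    acc += block[r][c] * cr * cc
            row.append(acc)
        out.append(row)
    return out

# Small deterministic 4 x 8 test block (values are arbitrary).
block = [[(3 * r + 5 * c) % 7 - 3.0 for c in range(8)] for r in range(4)]
fast = mdct_2d_row_column(block)
ref = mdct_2d_direct(block)
max_err = max(abs(a - b) for fr, rr in zip(fast, ref) for a, b in zip(fr, rr))
```

The two results agree because the 2-D kernel is a product of two 1-D kernels, which is also what makes the row–column method the natural baseline for operation-count comparisons.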

    Efficient implementation of a class of MDCT/IMDCT filterbanks for speech and audio coding applications

    ECG Signal Compression Using Discrete Wavelet Transform

    Watermarking Technique for Multimedia Documents in the Frequency Domain

    Digital watermarking is used to secure and maintain the authenticity and integrity of multimedia documents. It can be applied to images, audio, and video. To remain independent of the type of signal being watermarked, this chapter proposes two watermarking techniques, one for audio and one for images, which together watermark a video containing both components. In the image watermarking technique, the MDCT is combined with the Watson model and a motion detection algorithm; in the audio watermarking technique, it is combined with a psychoacoustic model. In both techniques, the bits of the mark are duplicated to increase the insertion capacity and then inserted into the least significant bit (LSB). A Hamming error-correcting code is applied to the mark for more reliable detection. To evaluate robustness and imperceptibility, the proposed techniques are compared with other existing techniques.
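The combination of LSB insertion and a Hamming code described in this abstract can be illustrated with a minimal sketch. Several things here are assumptions rather than details from the chapter: the code is taken to be Hamming(7,4), the host values are plain integers rather than quantized MDCT coefficients, the duplication of mark bits is omitted, and all function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit codeword laid out as p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(codeword):
    """Correct at most one flipped bit, then return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # 1-based index of the error, 0 if none
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]

def embed_lsb(samples, bits):
    """Overwrite the least significant bit of the first len(bits) samples."""
    marked = [(s & ~1) | b for s, b in zip(samples, bits)]
    return marked + list(samples[len(bits):])

def extract_lsb(samples, count):
    """Read back the least significant bit of the first `count` samples."""
    return [s & 1 for s in samples[:count]]

mark = [1, 0, 1, 1]
codeword = hamming74_encode(mark)
host = [100, 57, -12, 33, 8, 91, 64, 20]  # stand-in for integer host values
marked = embed_lsb(host, codeword)

marked[4] ^= 1  # simulate a single bit error introduced by an attack or channel
recovered = hamming74_decode(extract_lsb(marked, 7))
```

Each host value changes by at most one unit, which is why LSB insertion tends to be imperceptible, while the parity bits let the detector recover the mark despite a single flipped bit per codeword.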

    Scalable Speech Coding for IP Networks

    The emergence of Voice over Internet Protocol (VoIP) has posed new challenges to the development of speech codecs. The key issue in transporting real-time voice packets over IP networks is the lack of any guarantee of reasonable speech quality, owing to packet delay or loss. Most widely used narrowband codecs depend on the Code Excited Linear Prediction (CELP) coding technique. CELP uses long-term prediction across frame boundaries, which causes error propagation when packets are lost and requires redundant information to be transmitted to mitigate the problem. The internet Low Bit-rate Codec (iLBC) employs frame-independent coding and is therefore inherently robust to packet loss. However, the original iLBC lacks some key features of speech codecs for IP networks: rate flexibility, scalability, and wideband support. This dissertation presents novel scalable narrowband and wideband speech codecs for IP networks using a frame-independent coding scheme based on the iLBC. Rate flexibility is added to the iLBC by employing the discrete cosine transform (DCT) and scalable algebraic vector quantization (AVQ), and by allocating different numbers of bits to the AVQ. Bit-rate scalability is obtained by adding an enhancement layer to the core layer of the multi-rate iLBC. The enhancement layer encodes the weighted iLBC coding error in the modified DCT (MDCT) domain. The proposed wideband codec employs a bandwidth extension technique to extend existing narrowband codecs with wideband coding functionality. The wavelet transform is also used to further enhance the performance of the proposed codec. Performance evaluation results show that the proposed codec is highly robust to packet loss and achieves speech quality equal to or higher than that of state-of-the-art codecs under clean channel conditions.

    Contextual biometric watermarking of fingerprint images

    This research presents contextual digital watermarking techniques that use face and demographic text data as multiple watermarks to protect the evidentiary integrity of fingerprint images. The proposed techniques embed the watermarks into selected regions of the fingerprint image in the MDCT and DWT domains. A general image watermarking algorithm is developed to investigate the use of the MDCT for eliminating blocking artifacts. Using the MDCT improves the performance of the watermarking technique compared with the DCT. Experimental results show that the modifications to the fingerprint image are visually imperceptible and preserve the minutiae detail. The integrity of the fingerprint image is verified through the high matching score obtained from the AFIS system. There is also a high degree of correlation between the embedded and extracted watermarks, with the degree of similarity computed using pixel-based metrics and human visual system metrics. The approach is useful for personal identification and for establishing a digital chain of custody. The results also show that the proposed watermarking technique is resilient to common image modifications that occur during electronic fingerprint transmission.

    Headroom and precision requirements of fixed point audio processing in different data domains for real world content

    In modern audio processing technology, the audio signal is converted to various data domains for processing using, e.g., FFT, MDCT, or complex QMF transforms. Each domain has different properties, resulting in different headroom and precision requirements to manage distortion and noise in such systems. The aim of the work is to investigate these requirements for immersive and multichannel real-world content and to develop guidelines for the headroom and data precision needed to process such content in fixed-point systems. The processing blocks studied are those currently used in Dolby systems, which can include smart implementations of data domain transforms, reduction of audio channels (such as rendering to a speaker configuration and down-mixing), and signal loudness processing.
    Carbonell Tena, D. (2018). Estudio del headroom y las necesidades de precisión para el procesado de señales reales de audio en representación de punto fijo usando diferentes dominios. Universitat Politècnica de València. http://hdl.handle.net/10251/98727 (TFG)