19 research outputs found

    Type-IV DCT, DST, and MDCT algorithms with reduced numbers of arithmetic operations

    We present algorithms for the type-IV discrete cosine transform (DCT-IV) and discrete sine transform (DST-IV), as well as for the modified discrete cosine transform (MDCT) and its inverse, that achieve a lower count of real multiplications and additions than previously published algorithms, without sacrificing numerical accuracy. Asymptotically, the operation count is reduced from ~2N log N to ~(17/9)N log N for a power-of-two transform size N, and the exact count is strictly lowered for all N > 4. These results are derived by considering the DCT to be a special case of a DFT of length 8N, with certain symmetries, and then pruning redundant operations from a recent improved fast Fourier transform algorithm (based on a recursive rescaling of the conjugate-pair split radix algorithm). The improved algorithms for DST-IV and MDCT follow immediately from the improved count for the DCT-IV. Comment: 11 pages
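The last sentence of the abstract above relies on the close relationship between the type-IV transforms. As a minimal sketch (not the paper's reduced-operation algorithm), the code below checks one standard identity: a DST-IV can be obtained from a DCT-IV by reversing the input and negating the odd-indexed outputs. The function names and the unnormalized transform definitions are assumptions made for illustration.

```python
import math

def dct4(x):
    """Naive O(N^2) DCT-IV: X[k] = sum_n x[n] * cos(pi/N * (n+1/2) * (k+1/2))."""
    n_pts = len(x)
    return [
        sum(x[n] * math.cos(math.pi / n_pts * (n + 0.5) * (k + 0.5))
            for n in range(n_pts))
        for k in range(n_pts)
    ]

def dst4_direct(x):
    """Naive O(N^2) DST-IV: same as dct4 with sin in place of cos."""
    n_pts = len(x)
    return [
        sum(x[n] * math.sin(math.pi / n_pts * (n + 0.5) * (k + 0.5))
            for n in range(n_pts))
        for k in range(n_pts)
    ]

def dst4_via_dct4(x):
    """DST-IV from DCT-IV: reverse the input, then flip the sign of odd outputs."""
    y = dct4(x[::-1])
    return [(-1) ** k * y[k] for k in range(len(y))]
```

Because the extra work is only a reversal and sign changes, any saving in the DCT-IV carries over to the DST-IV, which is consistent with the claim in the abstract.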

    Type-II/III DCT/DST algorithms with reduced number of arithmetic operations

    We present algorithms for the discrete cosine transform (DCT) and discrete sine transform (DST), of types II and III, that achieve a lower count of real multiplications and additions than previously published algorithms, without sacrificing numerical accuracy. Asymptotically, the operation count is reduced from ~2N log_2 N to ~(17/9)N log_2 N for a power-of-two transform size N. Furthermore, we show that a further N multiplications may be saved by a certain rescaling of the inputs or outputs, generalizing a well-known technique for N=8 by Arai et al. These results are derived by considering the DCT to be a special case of a DFT of length 4N, with certain symmetries, and then pruning redundant operations from a recent improved fast Fourier transform algorithm (based on a recursive rescaling of the conjugate-pair split radix algorithm). The improved algorithms for DCT-III, DST-II, and DST-III follow immediately from the improved count for the DCT-II. Comment: 9 pages
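The idea of treating a DCT-II as a DFT of length 4N with symmetries can be illustrated with a small numerical check. This is only a sketch of the embedding, assuming the unnormalized DCT-II definition given in the docstring; it uses a naive O(N^2) DFT and performs none of the pruning that the abstract describes. All function names are hypothetical.

```python
import cmath
import math

def dft(v):
    """Naive DFT: V[k] = sum_j v[j] * exp(-2*pi*i*j*k/M)."""
    m = len(v)
    return [
        sum(v[j] * cmath.exp(-2j * cmath.pi * j * k / m) for j in range(m))
        for k in range(m)
    ]

def dct2_direct(x):
    """Naive DCT-II: X[k] = sum_n x[n] * cos(pi * (n + 1/2) * k / N)."""
    n_pts = len(x)
    return [
        sum(x[n] * math.cos(math.pi * (n + 0.5) * k / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]

def dct2_via_dft(x):
    """DCT-II read off from a length-4N DFT of a real, even-symmetric sequence."""
    n_pts = len(x)
    ext = [0.0] * (4 * n_pts)
    for n, value in enumerate(x):
        ext[2 * n + 1] = value                  # inputs sit at odd indices
        ext[4 * n_pts - (2 * n + 1)] = value    # mirror image gives even symmetry
    spectrum = dft(ext)
    # Each mirrored pair contributes 2*cos(...), so halve the real part.
    return [spectrum[k].real / 2.0 for k in range(n_pts)]
```

Three quarters of the extended input is zero or redundant, which is the kind of structure a pruned FFT can exploit.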

    Modified DCT-based Audio Watermarking Optimization using Genetics Algorithm

    The ease of exchanging digital information has contributed to a rise in copyright infringement cases. Audio watermarking is one way to protect the owner of a work. This research aims to optimize the insertion parameters of Modified Discrete Cosine Transform (MDCT) based audio watermarking using a genetic algorithm, in order to make the watermarked audio more robust. The MDCT is applied to the host audio after it is read, and the watermark is then embedded in the MDCT domain using the Quantization Index Modulation (QIM) technique. Embedding in the MDCT domain can produce highly imperceptible watermarked audio because of its overlapping frame structure. The system is optimized with a genetic algorithm to improve both imperceptibility and robustness. In this research, the average SNR reaches 20 dB and the ODG reaches -0.062. Subjective quality testing yields an average MOS of 4.22 over the five songs tested. In addition, the system is able to withstand several attacks. Using the MDCT in audio watermarking yields excellent imperceptibility and better watermark robustness.
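The QIM step mentioned in the abstract can be sketched on a single coefficient. The version below is a common textbook form of binary QIM (two interleaved quantization lattices offset by half a step), not necessarily the exact scheme or parameters used in the paper. The function names and the `step` parameter are assumptions; in the paper's setting the step size would be one of the quantities tuned by the genetic algorithm.

```python
import math

def qim_embed(coeff, bit, step):
    """Quantize coeff onto the lattice for `bit` (offset 0 or step/2)."""
    offset = bit * step / 2.0
    return step * math.floor((coeff - offset) / step + 0.5) + offset

def qim_extract(coeff, step):
    """Return the bit whose lattice lies closest to the received coefficient."""
    dist0 = abs(coeff - qim_embed(coeff, 0, step))
    dist1 = abs(coeff - qim_embed(coeff, 1, step))
    return 0 if dist0 <= dist1 else 1
```

A larger step tolerates more distortion before a bit flips (robustness) but moves each coefficient further from its original value (imperceptibility), which is the trade-off the optimization targets.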

    High efficiency block coding techniques for image data.

    by Lo Kwok-tung.
    Thesis (Ph.D.)--Chinese University of Hong Kong, 1992.
    Includes bibliographical references.

    ABSTRACT
    ACKNOWLEDGEMENTS
    LIST OF PRINCIPLE SYMBOLS AND ABBREVIATIONS
    LIST OF FIGURES
    LIST OF TABLES
    TABLE OF CONTENTS

    Chapter 1 --- Introduction
        1.1 --- Background - The Need for Image Compression
        1.2 --- Image Compression - An Overview
            1.2.1 --- Predictive Coding - DPCM
            1.2.2 --- Sub-band Coding
            1.2.3 --- Transform Coding
            1.2.4 --- Vector Quantization
            1.2.5 --- Block Truncation Coding
        1.3 --- Block Based Image Coding Techniques
        1.4 --- Goal of the Work
        1.5 --- Organization of the Thesis

    Chapter 2 --- Block-Based Image Coding Techniques
        2.1 --- Statistical Model of Image
            2.1.1 --- One-Dimensional Model
            2.1.2 --- Two-Dimensional Model
        2.2 --- Image Fidelity Criteria
            2.2.1 --- Objective Fidelity
            2.2.2 --- Subjective Fidelity
        2.3 --- Transform Coding Theory
            2.3.1 --- Transformation
            2.3.2 --- Quantization
            2.3.3 --- Coding
            2.3.4 --- JPEG International Standard
        2.4 --- Vector Quantization Theory
            2.4.1 --- Codebook Design and the LBG Clustering Algorithm
        2.5 --- Block Truncation Coding Theory
            2.5.1 --- Optimal MSE Block Truncation Coding

    Chapter 3 --- Development of New Orthogonal Transforms
        3.1 --- Introduction
        3.2 --- Weighted Cosine Transform
            3.2.1 --- Development of the WCT
            3.2.2 --- Determination of a and β
        3.3 --- Simplified Cosine Transform
            3.3.1 --- Development of the SCT
        3.4 --- Fast Computational Algorithms
            3.4.1 --- Weighted Cosine Transform
            3.4.2 --- Simplified Cosine Transform
            3.4.3 --- Computational Requirement
        3.5 --- Performance Evaluation
            3.5.1 --- Evaluation using Statistical Model
            3.5.2 --- Evaluation using Real Images
        3.6 --- Concluding Remarks
        3.7 --- Note on Publications

    Chapter 4 --- Pruning in Transform Coding of Images
        4.1 --- Introduction
        4.2 --- Direct Fast Algorithms for DCT, WCT and SCT
            4.2.1 --- Discrete Cosine Transform
            4.2.2 --- Weighted Cosine Transform
            4.2.3 --- Simplified Cosine Transform
        4.3 --- Pruning in Direct Fast Algorithms
            4.3.1 --- Discrete Cosine Transform
            4.3.2 --- Weighted Cosine Transform
            4.3.3 --- Simplified Cosine Transform
        4.4 --- Operations Saved by Using Pruning
            4.4.1 --- Discrete Cosine Transform
            4.4.2 --- Weighted Cosine Transform
            4.4.3 --- Simplified Cosine Transform
            4.4.4 --- Generalization Pruning Algorithm for DCT
        4.5 --- Concluding Remarks
        4.6 --- Note on Publications

    Chapter 5 --- Efficient Encoding of DC Coefficient in Transform Coding Systems
        5.1 --- Introduction
        5.2 --- Minimum Edge Difference (MED) Predictor
        5.3 --- Performance Evaluation
        5.4 --- Simulation Results
        5.5 --- Concluding Remarks
        5.6 --- Note on Publications

    Chapter 6 --- Efficient Encoding Algorithms for Vector Quantization of Images
        6.1 --- Introduction
        6.2 --- Sub-Codebook Searching Algorithm (SCS)
            6.2.1 --- Formation of the Sub-codebook
            6.2.2 --- Premature Exit Conditions in the Searching Process
            6.2.3 --- Sub-Codebook Searching Algorithm
        6.3 --- Predictive Sub-Codebook Searching Algorithm (PSCS)
        6.4 --- Simulation Results
        6.5 --- Concluding Remarks
        6.6 --- Note on Publications

    Chapter 7 --- Predictive Classified Address Vector Quantization of Images
        7.1 --- Introduction
        7.2 --- Optimal Three-Level Block Truncation Coding
        7.3 --- Predictive Classified Address Vector Quantization
            7.3.1 --- Classification of Images using Three-level BTC
            7.3.2 --- Predictive Mean Removal Technique
            7.3.3 --- Simplified Address VQ Technique
            7.3.4 --- Encoding Process of PCAVQ
        7.4 --- Simulation Results
        7.5 --- Concluding Remarks
        7.6 --- Note on Publications

    Chapter 8 --- Recapitulation and Topics for Future Investigation
        8.1 --- Recapitulation
        8.2 --- Topics for Future Investigation

    REFERENCES

    APPENDICES
        A. --- Statistics of Monochrome Test Images
        B. --- Statistics of Color Test Images
        C. --- Fortran Program Listing for the Pruned Fast DCT Algorithm
        D. --- Training Set Images for Building the Codebook of Standard VQ Scheme
        E. --- List of Publications

    Automatic Separation of Verse and Chorus in MP3 Music Using Inter-Frame Correlation Based on Modified Discrete Cosine Transform (MDCT) Features

    Telecommunications technology today is used not only to send information from one point to another but also extends into other fields, such as music. Signal processing makes it possible to identify information signals within a song, and songs are taken as the main object of study because of the rapid growth of music entertainment. This research concerns locating the verse and the chorus (reff) of a song, using excerpts of verses and choruses as input; these are stored in a database of 25 manually prepared verse and chorus excerpts from various genres. This final project uses the Modified Discrete Cosine Transform (MDCT) to locate the chorus and verse in a song automatically: their positions are determined from the correlation between frames after features have been extracted from each frame with the MDCT. For the 25 song files in the database, the method achieves an average accuracy of 75% in the position (in seconds) of the verse and chorus, compared with the actual positions obtained by manually separating the verse and chorus in each song. The best computation time in this final project is 95 seconds, with a 1000 ms frame, for segmenting one MP3 song file. Keywords: Modified Discrete Cosine Transform (MDCT), verse, reff
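The frame-matching step described above (MDCT feature extraction followed by inter-frame correlation) might look roughly like the sketch below. It assumes the standard unwindowed MDCT definition and the Pearson correlation coefficient as the similarity measure; the abstract does not specify the window, the feature post-processing, or the exact correlation used, so these choices and all function names are illustrative only.

```python
import math

def mdct(frame):
    """Naive MDCT of a frame of 2N samples, returning N coefficients."""
    two_n = len(frame)
    n_half = two_n // 2
    return [
        sum(frame[n] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2.0) * (k + 0.5))
            for n in range(two_n))
        for k in range(n_half)
    ]

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

def frame_similarity(frame_a, frame_b):
    """Correlate the MDCT feature vectors of two audio frames."""
    return pearson(mdct(frame_a), mdct(frame_b))
```

A reference excerpt would then be slid along the song, and the offset with the highest similarity taken as the candidate verse or chorus position.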

    Design of digital IP block for discrete cosine transform

    This diploma thesis deals with the design of an IP block for the discrete cosine transform. The theoretical part summarizes algorithms for computing the discrete cosine transform and discusses their suitability for hardware. The algorithm chosen for hardware implementation is modeled in the C language. It is then described at the RTL level, verified, and synthesized in TSMC 65 nm technology. The hardware implementation is finally evaluated with respect to throughput, area, speed, and power consumption.

    A network transparent, retained mode multimedia processing framework for the Linux operating system environment

    This thesis presents a multimedia framework for Linux that, unlike earlier work, is based on the ideas of "retained-mode processing" and "lazy evaluation": instead of executing transformations immediately, it builds an abstract representation of all media elements. "Renderer" drivers act as translators that turn this representation into concrete operations at run time, and the data model permits numerous optimizations that reduce the number of steps or minimize communication. This allows a greatly simplified programming model together with improved efficiency. "Renderer" drivers can use the local processor to execute transformations, or they can delegate the operations. The thesis presents an extension of the X Window System with mechanisms for media processing, as well as a "renderer" driver that uses this extension to delegate the processing.
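The retained-mode idea can be sketched in a few lines: operations are recorded as a graph, an optimization pass shortens the graph, and only a renderer turns it into actual work. Everything in this sketch is hypothetical (the `Node` class, the `gain` operation, the fusion rule, and the local renderer) and is not taken from the framework described in the thesis.

```python
class Node:
    """One element of the retained description: an operation and its input."""
    def __init__(self, op, arg, source=None):
        self.op = op
        self.arg = arg
        self.source = source

def source(samples):
    return Node("source", list(samples))

def gain(node, factor):
    # Nothing is computed here; the request is only recorded.
    return Node("gain", factor, node)

def optimize(node):
    """Fuse chains of consecutive gain nodes into a single gain node."""
    if node.op == "source":
        return node
    inner = optimize(node.source)
    if node.op == "gain" and inner.op == "gain":
        return Node("gain", node.arg * inner.arg, inner.source)
    return Node(node.op, node.arg, inner)

def count_steps(node):
    """Number of processing steps a renderer would have to execute."""
    return 0 if node.op == "source" else 1 + count_steps(node.source)

def render_local(node):
    """A 'renderer' that evaluates the graph on the local processor."""
    if node.op == "source":
        return list(node.arg)
    data = render_local(node.source)
    if node.op == "gain":
        return [value * node.arg for value in data]
    raise ValueError("unknown operation: " + node.op)
```

A delegating renderer would walk the same optimized graph but send each step to a remote server instead of computing it, so fewer steps also means less communication.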

    Frequency-domain bandwidth extension for low-delay audio coding applications

    MPEG-4 Spectral Band Replication (SBR) is a sophisticated high-frequency reconstruction (HFR) tool for speech and natural audio which, when used in conjunction with an audio codec, delivers a broadband high-quality signal at a bit rate of 48 kbps or even below. The major drawback of this technique is that it significantly increases the delay of the underlying core codec. The idea of synthetic signal reconstruction is also of particular interest in real-time communications, where an HFR method can be employed to further loosen the channel capacity requirements. In this thesis a delay-optimized derivative of SBR is elaborated, which can be used together with a low-delay speech and audio coder such as the Fraunhofer ULD. The presented approach is based on a short-time subband representation of an acoustic signal of natural or artificial origin, and as such it utilizes a filter bank for the extraction and manipulation of sound characteristics. The system delay for a combination of the ULD coder with the proposed low-delay bandwidth extension (LD-BWE) tool adds up to 12 ms at a sampling rate of 48 kHz. At the present stage, LD-BWE generates a highband replica whose excellent quality was confirmed in a subjective listening test, at a simulated mean data rate of 12.8 kbps. Ilmenau, Techn. Univ., Master's thesis, 201

    A Removal of Eye Movement and Blink Artifacts from EEG Data Using Morphological Component Analysis

    EEG signals contain a large amount of ocular artifacts with different time-frequency properties, mixed together with the EEG activity of interest. Artifact removal has largely been handled by existing decomposition methods such as PCA and ICA, which rely on the orthogonality of signal vectors or the statistical independence of signal components. We focused on signal morphology and proposed a systematic decomposition method that identifies the type of each signal component on the basis of sparsity in the time-frequency domain. The method is based on Morphological Component Analysis (MCA), which provides accurate reconstruction by using multiple bases in accordance with the concept of a “dictionary.” MCA was applied to decompose real EEG signals, and the best combination of dictionaries for this purpose was clarified. In our proposed semirealistic biological signal analysis with iEEGs recorded intracranially from the brain, the signals were successfully decomposed into their original types by a linear expansion of waveforms using redundant transforms: UDWT, DCT, LDCT, DST, and DIRAC. Our results demonstrated that the most suitable combination for EEG data analysis was UDWT, DST, and DIRAC, which represent the baseline envelope, multifrequency waveforms, and spiking activities, respectively, as representative types of EEG morphologies.
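The MCA principle, separating a signal into parts that are each sparse in a different dictionary, can be sketched with just two dictionaries. The code below is a simplified illustration and not the authors' method: it assumes an orthonormal DCT-II and the identity (Dirac) basis, hard thresholding, and a hand-picked decreasing threshold schedule, and it omits the UDWT, LDCT, and DST dictionaries named in the abstract. All function names are hypothetical.

```python
import math

def dct_matrix(n_pts):
    """Rows are the orthonormal DCT-II basis vectors of length n_pts."""
    rows = []
    for k in range(n_pts):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n_pts)
        rows.append([scale * math.cos(math.pi * (n + 0.5) * k / n_pts)
                     for n in range(n_pts)])
    return rows

def analyze(basis, signal):
    """Coefficients of the signal in an orthonormal basis."""
    return [sum(b * s for b, s in zip(row, signal)) for row in basis]

def synthesize(basis, coeffs):
    """Rebuild a signal from its coefficients."""
    n_pts = len(basis[0])
    return [sum(coeffs[k] * basis[k][i] for k in range(len(basis)))
            for i in range(n_pts)]

def hard_threshold(coeffs, lam):
    """Keep only coefficients whose magnitude exceeds lam."""
    return [c if abs(c) > lam else 0.0 for c in coeffs]

def mca_dct_dirac(signal, thresholds):
    """Alternately update an oscillatory (DCT) part and a spiky (Dirac) part."""
    n_pts = len(signal)
    basis = dct_matrix(n_pts)
    smooth = [0.0] * n_pts
    spikes = [0.0] * n_pts
    for lam in thresholds:
        # Fit the DCT part to whatever the spike part does not explain.
        residual = [y - s for y, s in zip(signal, spikes)]
        smooth = synthesize(basis, hard_threshold(analyze(basis, residual), lam))
        # Fit the spike part to whatever the DCT part does not explain.
        residual = [y - s for y, s in zip(signal, smooth)]
        spikes = hard_threshold(residual, lam)
    return smooth, spikes
```

For a test signal made of one DCT basis vector plus a single spike, the two returned components should recover the oscillation and the spike separately.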