
    High-Performance Architecture for Computing the DCT (Arquitectura de Alto Rendimiento para el Cálculo de la DCT)

    This work reviews the main methods for computing the Discrete Cosine Transform and their implementations. Based on this review, it proposes a high-performance architecture that applies computer arithmetic techniques to the design of operators, yielding a compact structure that computes the transform from its direct formulation. The proposed architecture has been implemented and simulated on reconfigurable boards for digital signal processing in order to evaluate its performance in terms of area, delay, and power consumption. Its performance has also been computed with a homogeneous, technology-independent model so that it can be compared with other known techniques.
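
    As a point of reference for "direct formulation", the sketch below evaluates the DCT straight from its defining sum. The abstract does not say which DCT variant or scaling the architecture uses, so the orthonormal DCT-II and the function name `dct_direct` are assumptions made only for illustration.

```python
import math


def dct_direct(x):
    """Orthonormal DCT-II evaluated from the defining sum (O(N^2) operations).

    Assumption: DCT-II with orthonormal scaling; the abstract does not name
    the variant it implements.
    """
    n = len(x)
    out = []
    for k in range(n):
        # Inner product of the input with the k-th cosine basis vector.
        s = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


if __name__ == "__main__":
    # A constant input puts all of its energy in the k = 0 (DC) coefficient.
    print(dct_direct([1.0, 1.0, 1.0, 1.0]))
```

    Fast algorithms reduce the operation count by factoring this sum; the direct form trades more multiply-accumulate operations for a regular structure.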

    Accurate Rotations Based on Coefficient Scaling


    VLSI Implementation of a Cost-Efficient Loeffler-DCT Algorithm with Recursive CORDIC for DCT-Based Encoder

    This paper presents a low-cost, high-quality, hardware-oriented two-dimensional discrete cosine transform (2-D DCT) signal analyzer for image and video encoders. To reduce the memory requirement and improve image quality, a novel Loeffler DCT based on the coordinate rotation digital computer (CORDIC) technique is proposed. In addition, the proposed algorithm is realized with a recursive CORDIC architecture instead of an unfolded CORDIC architecture with approximated scale factors. The design uses a fully pipelined architecture to increase operating frequency and throughput, and the scale factors are implemented with four hardware-sharing machines to reduce complexity. As a result, the computational complexity decreases significantly, with only a 0.01 dB loss relative to the optimal image quality of the Loeffler DCT. Experimental results show that the proposed 2-D DCT spectral analyzer not only achieves a higher average peak signal-to-noise ratio (PSNR) than previous CORDIC-DCT algorithms but also yields a cost-efficient architecture for very-large-scale integration (VLSI) implementation. The proposed design was realized in a UMC 0.18-μm CMOS process with a synthesized gate count of 8.04 k and a core area of 75,100 μm². Its operating frequency was 100 MHz and its power consumption was 4.17 mW. Moreover, this work reduced gate count by at least 64.1% and power consumption by at least 22.5% compared with previous designs.
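
    For readers unfamiliar with CORDIC, the following is a minimal floating-point sketch of a rotation-mode CORDIC loop, in which one shift-and-add stage is reused on every iteration. It is not the paper's design: the function name, the iteration count, and the exact gain compensation are assumptions, and real hardware would use fixed-point shifts rather than multiplications by powers of two.

```python
import math


def cordic_rotate(x, y, angle, iterations=16):
    """Rotate (x, y) by `angle` radians using rotation-mode CORDIC.

    Hypothetical illustration: floats stand in for fixed-point registers,
    and the gain is compensated exactly instead of being approximated.
    Converges for |angle| up to roughly 1.74 rad.
    """
    # Accumulated gain of the micro-rotations, removed at the end.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    z = angle  # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        # Each micro-rotation needs only shifts and adds in hardware.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * math.atan(2.0 ** -i)
    return x / gain, y / gain


if __name__ == "__main__":
    # 3*pi/16 is one of the rotation angles in the Loeffler DCT flow graph.
    print(cordic_rotate(1.0, 0.0, 3 * math.pi / 16))
```

    Reusing one stage in a loop, as here, corresponds to the recursive style; an unfolded design would instantiate one stage per iteration.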

    Investigation of a Novel Common Subexpression Elimination Method for Low Power and Area Efficient DCT Architecture

    There is wide interest in low-power, area-efficient hardware designs for the discrete cosine transform (DCT). This work proposes a novel Common Subexpression Elimination (CSE) based pipelined architecture for the DCT, aimed at reducing the cost metrics of power and area while maintaining high speed and accuracy in DCT applications. The proposed design combines Canonical Signed Digit (CSD) representation with CSE to implement multiplier-less fixed-constant multiplication by the DCT coefficients. Symmetry in the DCT coefficient matrix is also exploited together with CSE to further decrease the number of arithmetic operations. The architecture needs only a single-port memory to feed the inputs instead of a multiport memory, which reduces hardware cost and area. Experimental results and performance comparisons show that the proposed scheme uses minimal logic, requiring only 340 slices and 22 adders. Moreover, the design meets the real-time constraints of different video and image coders as well as their peak signal-to-noise ratio (PSNR) requirements. Compared with recent well-known methods, the proposed technique improves power consumption, silicon area usage, and maximum operating frequency by 41%, 15%, and 15%, respectively, while maintaining accuracy.
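
    The CSD half of this approach can be illustrated with a short sketch: a constant is recoded into digits from {-1, 0, +1} with no two adjacent nonzero digits, and multiplication then needs only shifts, additions, and subtractions. The helper names and the 8-bit scaling of the example coefficient are assumptions, and the CSE step, which shares partial sums across coefficients, is not shown.

```python
import math


def to_csd(value):
    """Return the CSD digits (-1, 0, +1) of a non-negative integer, LSB first."""
    digits = []
    while value != 0:
        if value & 1:
            # value % 4 == 1 -> digit +1; value % 4 == 3 -> digit -1.
            d = 2 - (value & 3)
            value -= d
        else:
            d = 0
        digits.append(d)
        value >>= 1
    return digits


def csd_multiply(x, digits):
    """Multiply integer x by the constant encoded in `digits` using shifts only."""
    acc = 0
    for shift, d in enumerate(digits):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return acc


if __name__ == "__main__":
    # Hypothetical example: cos(pi/4) quantised to 8 fractional bits.
    coeff = round(math.cos(math.pi / 4) * 256)
    digits = to_csd(coeff)
    print(coeff, digits, csd_multiply(100, digits))
```

    Each nonzero digit costs one adder or subtractor, so constants with fewer nonzero CSD digits are cheaper to implement.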

    Efficient architectures for multidimensional discrete transforms in image and video processing applications

    PhD Thesis. This thesis introduces new image compression algorithms, their related architectures, and data transform architectures. The proposed architectures address current hardware design concerns such as power consumption, hardware usage, memory requirement, computation time, and output accuracy, all of which are crucial in multidimensional image and video processing applications. The research is divided into three topics: low-complexity non-transform-based image compression algorithms and their architectures; architectures for the multidimensional Discrete Cosine Transform (DCT); and architectures for the multidimensional Discrete Wavelet Transform (DWT). The proposed architectures are parameterised in terms of wordlength, pipelining, and input data size. Taking this parameterisation into account, efficient non-transform-based, low-complexity image compression algorithms with better rate-distortion performance are proposed. These algorithms are based on the Adaptive Quantisation Coding (AQC) algorithm and achieve a controllable output bit rate and accuracy by considering the intensity variation of each image block. High-speed, low-hardware-usage, low-power architectures for them are also introduced and implemented on Xilinx devices. Furthermore, efficient hardware architectures for the multidimensional DCT based on the 1-D DCT Radix-2 and 3-D DCT Vector Radix (3-D DCT VR) fast algorithms are proposed. These architectures attain fast and accurate 3-D DCT computation and provide high processing speed and reduced power consumption. The research also introduces two low-hardware-usage 3-D DCT VR architectures, which compute the butterfly and post-addition stages without using block memory for data transposition; this reduces hardware usage and improves performance. Moreover, parallel and multiplierless lifting-based architectures for 1-D, 2-D, and 3-D Cohen-Daubechies-Feauveau 9/7 (CDF 9/7) DWT computation are introduced. They provide an efficient multiplierless, low-memory CDF 9/7 DWT computation scheme using the separable approach. The proposed architectures have been implemented and tested on Xilinx FPGA devices. The evaluation results show that the proposed AQC-based architectures reach a speed of up to 315 MHz, and that the proposed 3-D DCT VR architectures reach a speed of up to 330 MHz with a low utilisation rate of 722 to 1235. In the proposed 3-D DWT architecture, the computation time of the 3-D DWT for a data size of 144×176×8 pixels is less than 0.33 ms, and a power consumption of 102 mW at a 50 MHz clock frequency is achieved for a 256×256-pixel frame size. The accuracy tests for all architectures show that an infinite PSNR can be attained.
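
    An infinite PSNR means the output matches the reference exactly, since the mean squared error in the denominator is then zero. The sketch below shows the usual definition; the function name and the 8-bit peak value of 255 are assumptions, as the abstract does not state the pixel depth.

```python
import math


def psnr(reference, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists.

    Assumption: 8-bit samples (peak = 255). Returns math.inf when the two
    inputs are identical, because the mean squared error is zero.
    """
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    print(psnr([10, 20, 30], [10, 20, 30]))  # identical data
    print(psnr([10, 20, 30], [11, 19, 30]))  # small errors give a finite value
```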