947 research outputs found

    VLSI Design of a Fast Pipelined 8x8 Discrete Cosine Transform

    Get PDF
    This paper presents a Very Large Scale Integrated (VLSI) design and implementation of a fixed-point 8x8 multiplierless Discrete Cosine Transform (DCT) using the ISO/IEC 23002-2 algorithm. The standard DCT algorithm, which is mainly used in image and video compression technology, consists of only adders, subtractors, and shifters, therefore making it efficient for hardware implementation. The VLSI implementation of the algorithm given in this paper further enhances the performance of the transform unit. Furthermore, circuit pipelining has been applied to the base design of the DCT, which significantly improves the performance by reducing the longest path in the non-pipeline design. The DCT has been implemented using semi-custom VLSI design methodology using the TSMC 0.13um process technology. Results show that our DCT designs can run up to around 1.7 Giga pixels/s, which is well above the timing required for real-time ultra-high definition 8K video

    Hardware acceleration architectures for MPEG-Based mobile video platforms: a brief overview

    Get PDF
    This paper presents a brief overview of past and current hardware acceleration (HwA) approaches that have been proposed for the most computationally intensive compression tools of the MPEG-4 standard. These approaches are classified based on their historical evolution and architectural approach. An analysis of both evolutionary and functional classifications is carried out in order to speculate on the possible trends of the HwA architectures to be employed in mobile video platforms

    HEVC 2D-DCT architectures comparison for FPGA and ASIC implementations

    Get PDF
    This paper compares ASIC and FPGA implementations of two commonly used architectures for 2-dimensional discrete cosine transform (DCT), the parallel and folded architectures. The DCT has been designed for sizes 4x4, 8x8, and 16x16, and implemented on Silterra 180nm ASIC and Xilinx Kintex Ultrascale FPGA. The objective is to determine suitable low energy architectures to be used as their characteristics greatly differ in terms of cells usage, placement and routing methods on these platforms. The parallel and folded DCT architectures for all three sizes have been designed using Verilog HDL, including the basic serializer-deserializer input and output. Results show that for large size transform of 16x16, ASIC parallel architecture results in roughly 30% less energy compared to folded architecture. As for FPGAs, folded architecture results in roughly 34% less energy compared to parallel architecture. In terms of overall energy consumption between 180nm ASIC and Xilinx Ultrascale, ASIC implementation results in about 58% less energy compared to the FPGA
    • 

    corecore