16 research outputs found

    Low energy HEVC and VVC video compression hardware

    Video compression standards compress a digital video by reducing and removing its redundancy using computationally complex algorithms. As the spatial and temporal resolutions of videos increase, the compression efficiency of video compression algorithms is also increasing. However, increased compression efficiency comes with increased computational complexity. Therefore, it is necessary to reduce the computational complexity of video compression algorithms without reducing their visual quality in order to reduce the area and energy consumption of their hardware implementations. In this thesis, we propose a novel technique for reducing the amount of computation performed by the HEVC intra prediction algorithm. We designed low energy, reconfigurable HEVC intra prediction hardware using the proposed technique. We also designed a low energy FPGA implementation of the HEVC intra prediction algorithm using the proposed technique and DSP blocks. We propose a reconfigurable VVC intra prediction hardware architecture. We also propose an efficient VVC intra prediction hardware architecture using DSP blocks. We designed low energy VVC fractional interpolation hardware. We propose a novel approximate absolute difference technique. We designed low energy approximate absolute difference hardware using the proposed technique. We propose a novel approximate constant multiplication technique. We designed approximate constant multiplication hardware using the proposed technique. We quantified the computation reductions achieved by the proposed techniques and the video quality loss caused by the proposed approximation techniques. The proposed approximate absolute difference technique and approximate constant multiplication technique cause very small PSNR loss. The other proposed techniques cause no PSNR loss. We implemented the proposed hardware architectures in Verilog HDL. We mapped the Verilog RTL code to Xilinx Virtex 6 or Xilinx Virtex 7 FPGAs and estimated their power consumption using the Xilinx XPower Analyzer tool. The proposed techniques significantly reduced the power and energy consumption of these FPGA implementations.

    High performance high quality image demosaicing hardware designs

    Since capturing three color channels (red, green, and blue) per pixel increases the cost of digital cameras, most digital cameras capture only one color channel per pixel using a single image sensor. The images pass through a color filter array before being captured by the image sensor. Demosaicing is the process of reconstructing the missing color channels of the pixels in the color filtered image using their available neighboring pixels. There are many image demosaicing algorithms with varying reconstructed image quality and computational complexity. In this thesis, high performance hardware architectures are designed for two high quality image demosaicing algorithms with high computational complexity. The proposed hardware architectures are implemented on an FPGA. A high performance Alternating Projections (AP) image demosaicing hardware is proposed. This is the first AP image demosaicing hardware in the literature. A high performance Enhanced Effective Color Interpolation (EECI) image demosaicing hardware is proposed. This is the first EECI image demosaicing hardware in the literature. The proposed hardware architectures are implemented using Verilog HDL. The Verilog RTL code is mapped to a Xilinx Virtex 6 FPGA. The proposed FPGA implementations are verified with post place and route simulations. They can process 31 and 94 full HD (1920x1080) images per second, respectively.
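
As a rough illustration of what demosaicing does, the sketch below samples an image through a color filter array and then rebuilds the green channel by averaging neighboring green samples. It is plain bilinear interpolation, not the AP or EECI algorithm from the thesis. The RGGB layout and the function names `bayer_mosaic` and `interpolate_green` are assumptions made for this example, and the image is assumed to be at least 2x2.

```python
def bayer_mosaic(rgb):
    """Keep one color per pixel, assuming an RGGB color filter array.

    rgb is a list of rows, each row a list of (r, g, b) tuples.
    """
    mosaic = []
    for y, row in enumerate(rgb):
        out = []
        for x, (r, g, b) in enumerate(row):
            if y % 2 == 0 and x % 2 == 0:
                out.append(r)          # red site
            elif y % 2 == 1 and x % 2 == 1:
                out.append(b)          # blue site
            else:
                out.append(g)          # green site
        mosaic.append(out)
    return mosaic


def interpolate_green(mosaic):
    """Rebuild the full green channel by bilinear interpolation.

    In an RGGB layout the green sites are the pixels where x + y is odd,
    so the four direct neighbors of any red or blue site are all green.
    """
    h, w = len(mosaic), len(mosaic[0])
    green = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if (x + y) % 2 == 1:
                green[y][x] = mosaic[y][x]   # green was captured here
            else:
                candidates = ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                neighbors = [mosaic[ny][nx] for ny, nx in candidates
                             if 0 <= ny < h and 0 <= nx < w]
                green[y][x] = sum(neighbors) // len(neighbors)
    return green
```

Higher quality algorithms such as those targeted in the thesis replace the simple average with more elaborate estimates, which is where the extra computational complexity comes from.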

    An efficient FPGA implementation of versatile video coding intra prediction

    Versatile Video Coding (VVC) is a new international video compression standard offering much better compression efficiency than previous video compression standards at the expense of much higher computational complexity. In this paper, an efficient FPGA implementation of VVC intra prediction for angular prediction modes of 4x4, 8x8, 16x16 and 32x32 prediction unit sizes is proposed. In the proposed FPGA implementation, the four constant multiplications used in one intra angular prediction equation are implemented using two DSP blocks and two adders in the FPGA. The proposed FPGA implementation of VVC intra prediction, in the worst case, can process 34 full HD (1920x1080) frames per second.

    Novel approximate absolute difference hardware

    Approximate hardware designs have higher performance, smaller area or lower power consumption than exact hardware designs at the expense of lower accuracy. The absolute difference (AD) operation is heavily used in many applications such as motion estimation (ME) for video compression, ME for frame rate conversion, and stereo matching for depth estimation. Since most of the applications using the AD operation are error tolerant by nature, approximate hardware designs can be used in these applications. In this paper, novel approximate AD hardware designs are proposed. The proposed approximate AD hardware implementations have higher performance, smaller area and lower power consumption than exact AD hardware implementations at the expense of lower accuracy. They also have less error, smaller area and lower power consumption than the approximate AD hardware implementations which use approximate adders proposed in the literature.
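
To make the accuracy trade-off concrete, the sketch below compares an exact AD with a simple truncation-based approximation that ignores the low-order bits of both operands. The truncation scheme is a generic stand-in chosen for illustration and is not the design proposed in the paper; the names `exact_ad`, `truncated_ad` and `sad` are hypothetical.

```python
def exact_ad(a, b):
    """Exact absolute difference of two unsigned pixel values."""
    return a - b if a >= b else b - a


def truncated_ad(a, b, k=2):
    """Approximate AD that drops the k least significant bits of each operand.

    Illustrative only (not the paper's technique). The subtractor becomes
    k bits narrower, and the error is always smaller than 2**k.
    """
    return exact_ad(a >> k, b >> k) << k


def sad(block_a, block_b, ad=exact_ad):
    """Sum of absolute differences, the usual block matching cost in ME."""
    return sum(ad(a, b) for a, b in zip(block_a, block_b))
```

Passing `ad=truncated_ad` to `sad` shows how the per-pixel error accumulates over a block, which is the quantity a motion estimation search actually compares.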

    An efficient FPGA implementation of HEVC intra prediction

    The intra prediction algorithm used in the High Efficiency Video Coding (HEVC) standard has very high computational complexity. In this paper, an efficient FPGA implementation of HEVC intra prediction is proposed for 4×4, 8×8, 16×16 and 32×32 angular prediction modes. In the proposed FPGA implementation, one intra angular prediction equation is implemented using one DSP block in the FPGA. The proposed FPGA implementation, in the worst case, can process 55 Full HD (1920×1080) video frames per second. It has up to 34.66% less energy consumption than the original FPGA implementation of HEVC intra prediction. Therefore, it can be used in portable consumer electronics products that require a real-time HEVC encoder.
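
For reference, the HEVC angular prediction equation blends two neighboring reference samples with a 5-bit fractional weight. The sketch below shows the standard form and an algebraically equivalent form that needs only one multiplication. The rewrite is one plausible way such an equation could fit a single multiply-add unit; the abstract does not state that this is the mapping the paper uses, and the function names are hypothetical.

```python
def angular_sample(ref_a, ref_b, frac):
    """HEVC angular prediction of one sample.

    ref_a, ref_b: the two adjacent reference samples.
    frac: fractional position between them, 0..31.
    """
    return ((32 - frac) * ref_a + frac * ref_b + 16) >> 5


def angular_sample_one_mult(ref_a, ref_b, frac):
    """Same result with a single multiplication.

    (32 - f) * a + f * b  ==  f * (b - a) + 32 * a, so the equation
    becomes a subtract, one multiply, and an add of (a << 5) + 16.
    """
    return (frac * (ref_b - ref_a) + (ref_a << 5) + 16) >> 5
```

With `frac = 0` both forms return `ref_a` unchanged, and for every valid input they produce identical results because the value being shifted is the same.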

    Efficient multiple constant multiplication using DSP blocks in FPGA

    A multiple constant multiplication (MCM) operation multiplies an input variable by multiple constants. MCM operations are widely used in many applications such as video processing and compression. In this paper, a method is proposed for efficient implementation of MCM operations using DSP blocks in Xilinx FPGAs. The proposed method reduces the number of DSP blocks used for implementing a given MCM operation by manipulating the multiple constants used in this MCM operation. A high level synthesis tool implementing the proposed method is also presented. The tool takes the input variable bit length and the multiple constants as inputs, and generates Verilog RTL code which efficiently implements this MCM operation using DSP blocks. The proposed method and tool are applied to one of the most complex video compression algorithms, HEVC 2D DCT. They reduced the number of DSP blocks used in the FPGA implementation of the HEVC 2D DCT algorithm by 35.8%.
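
One well-known way to share a multiplier between constants is to pack two of them into a single wide constant so that one multiplication yields both products in separate bit fields. The sketch below illustrates that idea for unsigned inputs. Whether the paper's constant manipulation works this way is not stated in the abstract, so treat this as an assumption; `pack_constants` and `mcm_two_products` are hypothetical names.

```python
def pack_constants(c1, c2, shift):
    """Place c2 above c1 in one wide constant, separated by `shift` bits."""
    return c1 + (c2 << shift)


def mcm_two_products(x, c1, c2, x_bits):
    """Compute x*c1 and x*c2 with a single multiplication.

    x must be an unsigned value below 2**x_bits. The shift is chosen so
    that x*c1 always fits below bit `shift` and cannot overlap x*c2.
    """
    shift = x_bits + c1.bit_length()
    wide = x * pack_constants(c1, c2, shift)   # the only multiplication
    p1 = wide & ((1 << shift) - 1)             # low field holds x*c1
    p2 = wide >> shift                         # high field holds x*c2
    return p1, p2
```

For example, `mcm_two_products(200, 83, 36, 8)` uses the HEVC DCT constants 83 and 36 and returns both products from one multiply. A real DSP block has fixed operand widths, so the packed constant has to fit within them, and signed inputs need extra correction logic that this sketch omits.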

    A novel approximate constant multiplier and HEVC discrete cosine transform case study

    Approximate computing is used in error tolerant applications to design faster, smaller and lower power hardware than exact optimized hardware designs by trading quality for speed, area and power consumption. Constant multiplication is used in error tolerant applications with high computational complexity such as video processing, video compression and machine learning. Therefore, in this paper, a novel approximate constant multiplication technique is proposed. Approximate constant multiplier hardware implementing the proposed approximation technique reduces constant multiplication to multiplication with a smaller constant. The proposed approximate constant multiplier causes negligible quality loss when it is used to implement the constant multiplications in High Efficiency Video Coding (HEVC) discrete cosine transform (DCT). It reduces the area, reduces the power consumption and increases the performance of HEVC DCT hardware.
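
The abstract does not say how the smaller constant is derived. Purely as an illustration of the general idea, the sketch below rounds the constant to a multiple of a power of two, multiplies by the shortened constant, and shifts the result back. This rounding scheme and the name `approx_const_mult` are assumptions for the example, not the paper's technique.

```python
def approx_const_mult(x, c, s):
    """Approximate x * c using a constant that is s bits shorter.

    c is rounded to the nearest multiple of 2**s, so the multiplier only
    sees (c rounded) >> s. The absolute error is at most x * 2**(s - 1).
    Illustrative scheme only.
    """
    if s == 0:
        return x * c                           # nothing to drop, exact
    small = (c + (1 << (s - 1))) >> s          # shortened constant
    return (x * small) << s                    # shift restores the scale
```

With `x = 100`, `c = 89` and `s = 2`, the shortened constant is 22 and the result is 8800 against an exact product of 8900, a relative error of about 1.1%.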

    Low error approximate absolute difference hardware

    In this paper, we propose low error approximate absolute difference (LAD_X) hardware. LAD_X hardware has lower maximum and average error, and higher accuracy, than the approximate absolute difference (AD) hardware in the literature. It has similar performance to, and smaller area than, the approximate AD hardware in the literature. The H.264 motion estimation (ME) hardware using LAD_X hardware performs higher quality ME than the H.264 ME hardware using the approximate AD hardware in the literature. It also has similar performance to, and smaller area than, the H.264 ME hardware using the approximate AD hardware in the literature.