9 research outputs found

    VLSI architecture design approaches for real-time video processing

    Get PDF
    This paper discusses the programmable and dedicated approaches for real-time video processing applications. Various VLSI architecture including the design examples of both approaches are reviewed. Finally, discussions of several practical designs in real-time video processing applications are then considered in VLSI architectures to provide significant guidelines to VLSI designers for any further real-time video processing design works

    A fast VLSI architecture for full-search variable block size motion estimation in MPEG-4 AVC/H.264

    No full text

    A fast VLSI architecture for full-search variable block size motion estimation in MPEG--4 AVC/H.264

    Get PDF
    We describe a fast VLSI architecture for full-search motion estimation for the blocks with 7 different sizes in MPEG-4 AVC/H.264. The proposed variable block size motion estimation (VBSME) architecture consists of a 16x16 PE array, an adder tree and comparators to find all 41 motion vectors and their minimum SADs for the blocks of 16x16, 16x8, 8x16, 8x8, 8x4, 4x8 and 4x4. It employs a 2-D datapath and its control of the search area data is simple and regular. The proposed VBSME can achieve 100% PE utilization by employing a preload register and a search data buffer inside each PE and allow real-time processing of 4CIF(704x576) video with 15 fps at 100 Mhz for a search range of [-32~+31]

    Low power motion estimation hardware designs

    Get PDF
    Motion Estimation (ME) is the most computationally intensive and most power consuming part of video compression and video enhancement systems. ME is used in video compression standards such as H.264/MPEG-4 and it is used in video enhancement algorithms such as frame rate conversion and de-interlacing. Half pixel (HP) ME increases the video coding efficiency at the expense of increased computational complexity. Therefore, in this thesis, we designed and implemented efficient integer pixel (IP) ME hardware implementing full search ME algorithm, and we proposed techniques for reducing the dynamic power consumptions of IP and HP ME hardware. The proposed ME hardware architectures are implemented in Verilog HDL and mapped to Xilinx FPGAs. The FPGA implementations are verified with post place & route simulations. We proposed comparison prediction (CP) technique for reducing the power consumption of IP block matching (BM) ME hardware. CP technique reduces the power consumption of absolute difference operations performed by IP BM ME hardware. The proposed technique can easily be used in all IP BM ME hardware. It reduced the power consumption of a fixed block size IP BM ME hardware implementing full search algorithm by 9.3% with 0.04% PSNR loss on a Xilinx XC2VP30-7 FPGA. We also proposed two techniques for reducing the power consumption of H.264 HP ME hardware. The first technique is vector dependent sum of absolute difference (SAD) reuse which reduces the amount of computations for variable block size H.264 HP ME with no PSNR loss. The second technique is a novel modification of the HP search algorithm which adaptively tries to use the IP motion vector trajectories to reduce HP search to 1-D. This technique causes an average PSNR loss of 0.36 dB. The two techniques reduced the power consumption of a variable block size H.264 HP ME hardware by 6% and 31% on a Xilinx Virtex 6 FPGA respectively

    Dynamic power consumption estimation and reduction for full search motion estimation hardware

    Get PDF
    Motion Estimation (ME) is the most computationally intensive and most power consuming part of video compression and video enhancement systems. ME is used in video compression standards such as MPEG4, H.264 and it is used in video enhancement algorithms such as frame rate conversion and de-interlacing. Since portable devices operate with battery, it is important to reduce power consumption so that the battery life can be increased. In addition, consuming excessive power degrades the performance of integrated circuits, increases packaging and cooling costs, reduces the reliability and may cause device failures. Therefore, estimating and reducing power consumption of motion estimation hardware is very important. In this thesis, we propose a novel dynamic power estimation technique for full search ME hardware. We estimated the power consumption of two full search ME hardware implementations on a Xilinx Virtex II FPGA using several existing high and low level dynamic power estimation techniques and our technique. Gate-level timing simulation based power estimation of full search ME hardware for an average frame using Xilinx XPower tool takes 6 - 18 hours in a state-of-the-art PC, whereas estimating the power consumption of the same ME hardware for the same frame takes a few seconds using our technique. The average and maximum difference between the power consumptions estimated by our technique and the power consumptions estimated by XPower tool for four different video sequences are %3 and %13 respectively. We also propose a novel dynamic power reduction technique for ME hardware. We quantified the impact of glitch reduction, clock gating and the proposed technique on the power consumption of two full search ME hardware implementations on a Xilinx Virtex II FPGA using Xilinx XPower tool. Glitch reduction and clock gating together achieved an average of 21% dynamic power reduction. The proposed technique achieved an average of 23% dynamic power reduction with an average of 0.4dB PSNR loss. The proposed technique achieves better power reduction than pixel truncation technique with a similar PSNR loss. We also showed that our dynamic power estimation technique can be used for developing novel dynamic power reduction techniques. To do this, we used our technique to estimate the dynamic power consumption of the ME hardware when two different dynamic power reduction techniques are used. The results show that if a power reduction technique only changes the input data order of the ME hardware, the proposed dynamic power estimation technique can be used to quickly estimate the effectiveness of that technique. However, if the architecture of the ME hardware is modified, the accuracy of the power consumption estimations decrease. Therefore the proposed power estimation technique should be improved for this case

    Reconfigurable Architecture For H.264/avc Variable Block Size Motion Estimation Based On Motion Activity And Adaptive Search Range

    Get PDF
    Motion Estimation (ME) technique plays a key role in the video coding systems to achieve high compression ratios by removing temporal redundancies among video frames. Especially in the newest H.264/AVC video coding standard, ME engine demands large amount of computational capabilities due to its support for wide range of different block sizes for a given macroblock in order to increase accuracy in finding best matching block in the previous frames. We propose scalable architecture for H.264/AVC Variable Block Size (VBS) Motion Estimation with adaptive computing capability to support various search ranges, input video resolutions, and frame rates. Hardware architecture of the proposed ME consists of scalable Sum of Absolute Difference (SAD) arrays which can perform Full Search Block Matching Algorithm (FSBMA) for smaller 4x4 blocks. It is also shown that by predicting motion activity and adaptively adjusting the Search Range (SR) on the reconfigurable hardware platform, the computational cost of ME required for inter-frame encoding in H.264/AVC video coding standard can be reduced significantly. Dynamic Partial Reconfiguration is a unique feature of Field Programmable Gate Arrays (FPGAs) that makes best use of hardware resources and power by allowing adaptive algorithm to be implemented during run-time. We exploit this feature of FPGA to implement the proposed reconfigurable architecture of ME and maximize the architectural benefits through prediction of motion activities in the video sequences ,adaptation of SR during run-time, and fractional ME refinement. The implemented ME architecture can support real time applications at a maximum frequency of 90MHz with multiple reconfigurable regions. iv When compared to reconfiguration of complete design, partial reconfiguration process results in smaller bitstream size which allows FPGA to implement different configurations at higher speed. The proposed architecture has modular structure, regular data flow, and efficient memory organization with lower memory accesses. By increasing the number of active partial reconfigurable modules from one to four, there is a 4 fold increase in data re-use. Also, by introducing adaptive SR reduction algorithm at frame level, the computational load of ME is reduced significantly with only small degradation in PSNR (≤0.1dB)
    corecore