6 research outputs found

    Efficient hardware implementations of low bit depth motion estimation algorithms

    In this paper, we present efficient hardware implementations of multiplication free one-bit transform (MF1BT) based and constrained one-bit transform (C-1BT) based motion estimation (ME) algorithms, in order to provide low bit-depth representation based full search block ME hardware for real-time video encoding. We use a source pixel based linear array (SPBLA) hardware architecture for low bit-depth ME for the first time in the literature. The proposed SPBLA based implementation results in a genuine data flow scheme which significantly reduces the number of data reads from the current block memory, which in turn reduces the power consumption by at least 50% compared to the conventional 1BT based ME hardware architecture presented in the literature. Because of the binary nature of low bit-depth ME algorithms, their hardware architectures are more efficient than existing 8 bits/pixel representation based ME architectures.

    A high performance hardware architecture for one bit transform based motion estimation

    Motion Estimation (ME) is the most computationally intensive part of video compression and video enhancement systems. One bit transform (1BT) based ME algorithms have low computational complexity. Therefore, in this paper, we propose a high performance systolic hardware architecture for 1BT based ME. The proposed hardware performs full search ME for 4 Macroblocks in parallel and it is the fastest 1BT based ME hardware reported in the literature. In addition, it uses less on-chip memory than the previous 1BT based ME hardware by using a novel data reuse scheme and memory organization. The proposed hardware is implemented in Verilog HDL. It consumes 34% of the slices in a Xilinx XC2VP30-7 FPGA. It works at 115 MHz in the same FPGA and is capable of processing 50 1920x1080 full High Definition frames per second. Therefore, it can be used in consumer electronics products that require real-time video processing or compression.
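
    The clock rate and frame rate quoted in this abstract imply a cycle budget per macroblock, which the following sketch works out. It assumes 16x16 macroblocks and a frame height padded up to a multiple of 16; neither is stated in the abstract, and the function names are illustrative only.

```python
import math


def macroblocks_per_frame(width, height, mb_size=16):
    """Number of mb_size x mb_size blocks needed to cover the frame.

    Assumption: partial blocks at the frame edge are padded to full size.
    """
    return math.ceil(width / mb_size) * math.ceil(height / mb_size)


def cycle_budget_per_macroblock(clock_hz, fps, width, height, mb_size=16):
    """Average clock cycles available per macroblock at the given frame rate."""
    blocks_per_second = fps * macroblocks_per_frame(width, height, mb_size)
    return clock_hz / blocks_per_second


if __name__ == "__main__":
    # Figures from the abstract: 115 MHz clock, 50 frames/s at 1920x1080.
    budget = cycle_budget_per_macroblock(115_000_000, 50, 1920, 1080)
    print(f"about {budget:.1f} cycles per macroblock")
    # With 4 macroblocks processed in parallel, each group of 4 has
    # roughly four times this budget.
    print(f"about {4 * budget:.1f} cycles per group of 4 macroblocks")
```

    Under these assumptions the budget is a few hundred cycles per macroblock, which shows why processing several macroblocks in parallel matters for a full search.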

    Extended Constraint Mask Based One-Bit Transform for Low-Complexity Fast Motion Estimation

    In this paper, an improved motion estimation (ME) approach based on the weighted constrained one-bit transform is proposed for block-based ME employed in video encoders. Binary ME approaches utilize a low bit-depth representation of the original image frames with a Boolean exclusive-OR based, hardware efficient matching criterion to decrease the computational burden of the ME stage. The weighted constrained one-bit transform (WC-1BT) based approach improves the performance of conventional C-1BT based ME by employing a 2-bit depth constraint mask instead of a 1-bit depth mask. In this work, the range of the constraint mask is further extended to increase the ME performance of the WC-1BT approach. Experiments reveal that the proposed method provides better ME accuracy compared to existing similar ME methods in the literature.
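
    The exclusive-OR based matching criterion described here can be illustrated with a minimal sketch. It assumes the binary planes and the 1-bit constraint masks are already computed, and it assumes a pixel is counted when either mask marks it as reliable (masks combined with OR, then ANDed with the XOR result); the abstract does not spell out this rule. The names `nnmp` and `full_search` are hypothetical.

```python
def nnmp(cur_bits, ref_bits, cur_mask=None, ref_mask=None):
    """Count non-matching points between two equally sized binary blocks.

    A mismatch is a position where the two bits differ (XOR == 1).
    If both masks are given, a mismatch is counted only where at least
    one mask bit is 1 (assumed gating rule, see the lead-in).
    """
    count = 0
    for r, (cur_row, ref_row) in enumerate(zip(cur_bits, ref_bits)):
        for c, (a, b) in enumerate(zip(cur_row, ref_row)):
            mismatch = a ^ b
            if cur_mask is not None and ref_mask is not None:
                mismatch &= cur_mask[r][c] | ref_mask[r][c]
            count += mismatch
    return count


def full_search(cur_block, ref_plane, top, left, search_range):
    """Test every displacement within +/- search_range around (top, left).

    Returns (cost, dy, dx) for the lowest cost; ties keep the first hit.
    Candidates that fall outside the reference plane are skipped.
    """
    n = len(cur_block)
    rows, cols = len(ref_plane), len(ref_plane[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + n > rows or x + n > cols:
                continue
            cand = [row[x:x + n] for row in ref_plane[y:y + n]]
            cost = nnmp(cur_block, cand)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best
```

    Because the cost is a population count of XOR results, the comparison needs no subtraction or absolute value, which is what makes this criterion cheap in hardware.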

    A high performance hardware for early terminated C-1BT based motion estimation

    Motion Estimation (ME) is the most computationally intensive part of video compression systems. In this paper, high performance hardware for early terminated constrained one-bit transform (C-1BT) based low bit-depth ME is proposed. The proposed early terminated C-1BT based ME hardware can process more than 30 quad full HD (3840×2160) video frames per second. The early termination algorithm reduced the energy consumption of the proposed ME hardware by 26%.

    High performance hardware architectures for one bit transform based motion estimation

    Motion Estimation (ME) is the most computationally intensive and most power consuming part of video compression and video enhancement systems. ME is used in video compression standards such as MPEG4 and H.264, and in video enhancement algorithms such as frame rate conversion and de-interlacing. One bit transform (1BT) based ME algorithms have low computational complexity. Therefore, in this thesis, we propose high performance hardware architectures for 1BT based fixed block size (FBS) single reference frame (SRF) ME, variable block size (VBS) SRF ME, and multiple reference frame (MRF) ME. The Constraint One Bit Transform (C-1BT) ME algorithm improves the ME performance of 1BT ME, and the early terminated C-1BT ME algorithm reduces the computational complexity of C-1BT ME. Therefore, in this thesis, we also propose a high performance early terminated C-1BT ME hardware architecture. The proposed FBS SRF ME hardware architectures perform full search ME for 4 Macroblocks in parallel and they are faster than the 1BT based ME hardware reported in the literature. In addition, they use less on-chip memory than the previous 1BT based ME hardware by using a novel data reuse scheme and memory organization. The proposed VBS SRF ME and MRF ME hardware architectures are the first 1BT based VBS ME and MRF ME hardware architectures in the literature. The proposed MRF ME hardware is designed to be reconfigurable, so that the number and selection of reference frames can be statically configured based on the application requirements. The proposed early terminated C-1BT ME hardware architecture is the first early terminated C-1BT ME hardware architecture in the literature. All of the proposed ME hardware architectures are implemented in Verilog HDL and mapped to Xilinx FPGAs. All FPGA implementations are verified with post place & route simulations.

    Local Binary Pattern Approach for Fast Block Based Motion Estimation

    With the rapid growth of video services on smartphones, such as video conferencing, video telephony and WebTV, implementing video compression on mobile terminals has become extremely important. However, the low computational capability of mobile devices is a bottleneck that calls for low complexity video coding techniques. This work presents two sets of algorithms for reducing the complexity of motion estimation. Binary motion estimation techniques using one-bit and two-bit transforms reduce the computational complexity of the matching error criterion, but sometimes generate inaccurate motion vectors. The first set includes two neighborhood matching based algorithms which attempt to reduce computations to only a fraction of other methods. Simulation results demonstrate that the full search local binary pattern (FS-LBP) algorithm reconstructs visually more accurate frames than the full search algorithm (FSA). Its reduced complexity version (RC-LBP) decreases computations significantly, to only a fraction of those of the other methods, while maintaining acceptable performance. The second set introduces an edge detection approach for partial distortion elimination based on binary patterns. Spiral partial distortion elimination (SpiralPDE), proposed in the literature, matches the pixel-to-pixel distortion in a predefined manner. Since pixels do not contribute equally to the distortion function, it is important to identify and extract the pixels that contribute most. The proposed algorithms are called lossless fast full search partial distortion elimination ME based on local binary patterns (PLBP) and lossy edge-detection pixel decimation technique based on local binary patterns (ELBP). PLBP reduces the matching complexity by matching the pixels that contribute most first, identified as the most diverse pixels in a local neighborhood. ELBP captures the most representative pixels in a block in order of contribution to the distortion function by evaluating whether the individual pixels belong to an edge or the background. Experimental results demonstrate a substantial reduction in the computational complexity of ELBP with only a marginal loss in prediction quality.
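
    The general idea of partial distortion elimination mentioned in this abstract can be sketched as follows. This is a generic row-by-row variant using the sum of absolute differences, not the PLBP or SpiralPDE pixel ordering; the function name and the scan order are illustrative assumptions.

```python
def sad_with_pde(cur_block, cand_block, best_so_far):
    """Sum of absolute differences with early rejection.

    Accumulates the distortion one row at a time and returns None as soon
    as the partial sum reaches best_so_far, because the candidate can no
    longer beat the current best match. Otherwise returns the full SAD.
    The best match found is the same as without the check (lossless).
    """
    total = 0
    for cur_row, cand_row in zip(cur_block, cand_block):
        total += sum(abs(a - b) for a, b in zip(cur_row, cand_row))
        if total >= best_so_far:
            return None  # rejected early, remaining rows are skipped
    return total


if __name__ == "__main__":
    cur = [[10, 12], [11, 13]]
    candidates = [[[0, 0], [0, 0]], [[10, 12], [11, 14]], [[9, 9], [9, 9]]]
    best = float("inf")
    for cand in candidates:
        cost = sad_with_pde(cur, cand, best)
        if cost is not None:
            best = cost
    print("lowest SAD:", best)
```

    Visiting the pixels that contribute most to the distortion first makes the partial sum cross the threshold sooner, which is the motivation the abstract gives for ordering pixels by their contribution.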