    Study of interpolation filters for motion estimation with application in H.264/AVC encoders

Image super-resolution plays an important role in a plethora of applications, including video compression and motion estimation. Detecting fractional displacements among frames facilitates the removal of temporal redundancy and improves the video quality by 2-4 dB PSNR [1] [2]. However, the increased complexity of the Fractional Motion Estimation (FME) process adds a significant computational load to the encoder and imposes constraints on real-time designs. Timing analysis shows that FME accounts for almost half of the entire motion estimation period, which in turn accounts for 60-90% of the total encoding time, depending on the design configuration.

FME is based on an interpolation procedure that increases the resolution of any frame region by generating sub-pixels between the original pixels. Modern compression standards specify the exact filter to use in the Motion Compensation module, allowing the encoder and the decoder to create and use identical reference frames. In particular, H.264/AVC specifies a 6-tap filter for computing the luma values of half-pixels and a low-cost 2-tap filter for computing quarter-pixels. Even though it is common practice for encoder designers to integrate the standard 6-tap filter also in the Estimation module (before Compensation), the interpolation technique used for detecting the displacements (not for computing their residual) is an open choice subject to certain performance trade-offs.

Aiming at speeding up the Estimation, a process of considerably higher computational demand than the Compensation, this work builds on the potential to implement a lower-complexity interpolation technique instead of the H.264 6-tap filter. We integrate into the Estimation module several distinct interpolation techniques not included in the H.264 standard, while keeping the standard H.264/AVC Compensation, to measure their impact on the outcome of the prediction engine. Related bibliography includes both ideas to avoid or replace the standard computations, as well as architectures targeting the efficient implementation of the H.264 6-tap filtering procedure and the support of its increased memory requirements.

To this end, we note that H.264 specifies a kernel with coefficients ⟨1,−5,20,20,−5,1⟩ to be multiplied with six consecutive pixels of the frame (either in column or row format). The resulting six products are accumulated and normalized to generate a single half-pixel (between the 3rd and 4th taps). The operation must be repeated for each "horizontal" and "vertical" half-pixel by sliding the kernel over the frame, in both row and column order. Moreover, there exist as many "diagonal" half-pixels, generated by applying the kernel to previously computed horizontal or vertical half-pixels. That is to say, depending on its position, we must process 6 or 36 frame pixels to compute a single half-pixel.

To avoid the costly H.264 filter in the Estimation module, we study similar interpolation techniques using fewer than 6 taps, possibly exploiting gradients in the image. Section II presents three commonly used interpolation techniques and introduces three novel techniques to point out the differences of the proposed ones. Section III reports the performance results of these techniques, and Section IV concludes the paper.
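To make the cost of the standard filter concrete, the following minimal C sketch applies the ⟨1,−5,20,20,−5,1⟩ kernel to six consecutive luma samples to produce a horizontal or vertical half-pixel, with rounding and clipping to 8 bits. The frame layout, function names, and the omitted border handling are illustrative assumptions, not part of the standard text; diagonal half-pixels (which filter previously interpolated samples) are not shown.

```c
/* Minimal sketch of H.264/AVC 6-tap luma half-pixel interpolation.
 * Assumes an 8-bit frame stored row-major with the given stride;
 * border handling and diagonal half-pixels are omitted for brevity. */
#include <stdint.h>

static inline uint8_t clip255(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Horizontal half-pixel between columns x and x+1 of row y.
 * Requires integer pixels at columns x-2 .. x+3 to exist. */
static uint8_t half_pixel_h(const uint8_t *frame, int stride, int x, int y)
{
    const uint8_t *p = frame + y * stride + x;
    int acc = 1 * p[-2] - 5 * p[-1] + 20 * p[0]
            + 20 * p[1] - 5 * p[2]  + 1 * p[3];
    /* Normalize: divide by 32 with rounding, then clip to [0, 255]. */
    return clip255((acc + 16) >> 5);
}

/* Vertical half-pixel between rows y and y+1 at column x. */
static uint8_t half_pixel_v(const uint8_t *frame, int stride, int x, int y)
{
    const uint8_t *p = frame + y * stride + x;
    int acc = 1 * p[-2 * stride] - 5 * p[-1 * stride] + 20 * p[0]
            + 20 * p[ 1 * stride] - 5 * p[ 2 * stride] + 1 * p[3 * stride];
    return clip255((acc + 16) >> 5);
}
```

Each such call touches six integer pixels; a diagonal half-pixel repeats the same kernel over six interpolated values, which is why the text above counts up to 36 frame pixels per half-pixel.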