2,133 research outputs found

    Dynamically variable step search motion estimation algorithm and a dynamically reconfigurable hardware for its implementation

    Get PDF
    Motion Estimation (ME) is the most computationally intensive part of video compression and video enhancement systems. For the recently available High Definition (HD) video formats, the computational complexity of De full search (FS) ME algorithm is prohibitively high, whereas the PSNR obtained by fast search ME algorithms is low. Therefore, ill this paper, we present Dynamically Variable Step Search (DVSS) ME algorithm for Processing high definition video formats and a dynamically reconfigurable hardware efficiently implementing DVSS algorithm. The architecture for efficiently implementing DVSS algorithm. The simulation results showed that DVSS algorithm performs very close to FS algorithm by searching much fewer search locations than FS algorithm and it outperforms successful past search ME algorithms by searching more search locations than these algorithms. The proposed hardware is implemented in VHDL and is capable, of processing high definition video formats in real time. Therefore, it can be used in consumer electronics products for video compression, frame rate up-conversion and de-interlacing(1)

    Efficient hardware implementations of low bit depth motion estimation algorithms

    Get PDF
    In this paper, we present efficient hardware implementation of multiplication free one-bit transform (MF1BT) based and constraint one-bit transform (C-1BT) based motion estimation (ME) algorithms, in order to provide low bit-depth representation based full search block ME hardware for real-time video encoding. We used a source pixel based linear array (SPBLA) hardware architecture for low bit depth ME for the first time in the literature. The proposed SPBLA based implementation results in a genuine data flow scheme which significantly reduces the number of data reads from the current block memory, which in turn reduces the power consumption by at least 50% compared to conventional 1BT based ME hardware architecture presented in the literature. Because of the binary nature of low bit-depth ME algorithms, their hardware architectures are more efficient than existing 8 bits/pixel representation based ME architectures

    An FPGA Implementation of HW/SW Codesign Architecture for H.263 Video Coding

    Get PDF
    Chapitre 12 http://www.intechopen.com/download/pdf/pdfs_id/1574

    Implementation of a motion estimation algorithm for Intel FPGAs using OpenCL

    Get PDF
    Producción CientíficaMotion Estimation is one of the main tasks behind any video encoder. It is a compu- tationally costly task; therefore, it is usually delegated to specific or reconfigurable hardware, such as FPGAs. Over the years, multiple FPGA implementations have been developed, mainly using hardware description languages such as Verilog or VHDL. Since programming using hardware description languages is a complex task, it is desirable to use higher-level languages to develop FPGA applications.The aim of this work is to evaluate OpenCL, in terms of expressiveness, as a tool for devel- oping this kind of FPGA applications. To do so, we present and evaluate a parallel implementation of the Block Matching Motion Estimation process using OpenCL for Intel FPGAs, usable and tested on an Intel Stratix 10 FPGA. The implementa- tion efficiently processes Full HD frames completely inside the FPGA. In this work, we show the resource utilization when synthesizing the code on an Intel Stratix 10 FPGA, as well as a performance comparison with multiple CPU implementations with varying levels of optimization and vectorization capabilities. We also compare the proposed OpenCL implementation, in terms of resource utilization and perfor- mance, with estimations obtained from an equivalent VHDL implementation.Junta de Castilla y León - Consejería de Educación de la Proyecto PROPHET-2 (VA226P20)Ministerio de Economía, Industria y Competitividad: (PID2019- 104834 GB-I00) and European Regional Development Fund (ERDF) program: Project PCAS (TIN2017-88614-R)Ministerio de Ciencia e Innovación (PID2019-104184RB-I00 / AEI / 10.13039/501100011033)Xunta de Galicia y fondos FEDER de la UE (Centro de Investigación de Galicia acreditación 2019-2022, ref. ED431G 2019/01; Consolidation Program of Competitive Reference Groups, ref. ED431C 2021/30Ministerio de Ciencia e Innovación, Agencia Estatal de Investigación y “European Union NextGenerationEU/PRTR” : (MCIN/ AEI/10.13039/501100011033) - grant TED2021-130367B-I00Publicación en abierto financiada por el Consorcio de Bibliotecas Universitarias de Castilla y León (BUCLE), con cargo al Programa Operativo 2014ES16RFOP009 FEDER 2014-2020 DE CASTILLA Y LEÓN, Actuación:20007-CL - Apoyo Consorcio BUCL

    A low complexity hardware architecture for motion estimation

    Get PDF
    This paper tackles the problem of accelerating motion estimation for video processing. A novel architecture using binary data is proposed, which attempts to reduce power consumption. The solution exploits redundant operations in the sum of absolute differences (SAD) calculation, by a mechanism known as early termination. Further data redundancies are exploited by using a run length coding addressing scheme, where access to pixels which do not contribute to the final SAD value is minimised. By using these two techniques operations and memory accesses are reduced by 93.29% and 69.17% respectively relative to a systolic array implementation

    Optimization of the motion estimation for parallel embedded systems in the context of new video standards

    Get PDF
    15 pagesInternational audienceThe effciency of video compression methods mainly depends on the motion compensation stage, and the design of effcient motion estimation techniques is still an important issue. An highly accurate motion estimation can significantly reduce the bit-rate, but involves a high computational complexity. This is particularly true for new generations of video compression standards, MPEG AVC and HEVC, which involves techniques such as different reference frames, sub-pixel estimation, variable block sizes. In this context, the design of fast motion estimation solutions is necessary, and can concerned two linked aspects: a high quality algorithm and its effcient implementation. This paper summarizes our main contributions in this domain. In particular, we first present the HME (Hierarchical Motion Estimation) technique. It is based on a multi-level refinement process where the motion estimation vectors are first estimated on a sub-sampled image. The multi-levels decomposition provides robust predictions and is particularly suited for variable block sizes motion estimations. The HME method has been integrated in a AVC encoder, and we propose a parallel implementation of this technique, with the motion estimation at pixel level performed by a DSP processor, and the sub-pixel refinement realized in an FPGA. The second technique that we present is called HDS for Hierarchical Diamond Search. It combines the multi-level refinement of HME, with a fast search at pixel-accuracy inspired by the EPZS method. This paper also presents its parallel implementation onto a multi-DSP platform and the its use in the HEVC context

    A toolset for the analysis and optimization of motion estimation algorithms and processors

    Get PDF

    Accelerated hardware video object segmentation: From foreground detection to connected components labelling

    Get PDF
    This is the preprint version of the Article - Copyright @ 2010 ElsevierThis paper demonstrates the use of a single-chip FPGA for the segmentation of moving objects in a video sequence. The system maintains highly accurate background models, and integrates the detection of foreground pixels with the labelling of objects using a connected components algorithm. The background models are based on 24-bit RGB values and 8-bit gray scale intensity values. A multimodal background differencing algorithm is presented, using a single FPGA chip and four blocks of RAM. The real-time connected component labelling algorithm, also designed for FPGA implementation, run-length encodes the output of the background subtraction, and performs connected component analysis on this representation. The run-length encoding, together with other parts of the algorithm, is performed in parallel; sequential operations are minimized as the number of run-lengths are typically less than the number of pixels. The two algorithms are pipelined together for maximum efficiency
    corecore