6 research outputs found

    Parallel H.264/AVC Fast Rate-Distortion Optimized Motion Estimation using Graphics Processing Unit and Dedicated Hardware

    Get PDF
    Heterogeneous systems on a single chip composed of CPU, Graphical Processing Unit (GPU), and Field Programmable Gate Array (FPGA) are expected to emerge in near future. In this context, the System on Chip (SoC) can be dynamically adapted to employ different architectures for execution of data-intensive applications. Motion estimation is one such task that can be accelerated using FPGA and GPU for high performance H.264/AVC encoder implementation. In most of works on parallel implementation of motion estimation, the bit rate cost of motion vectors is generally ignored. On the contrary, this paper presents a fast rate-distortion optimized parallel motion estimation algorithm implemented on GPU using OpenCL and FPGA/ASIC using VHDL. The predicted motion vectors are estimated from temporally preceding motion vectors and used for evaluating the bit rate cost of the motion vectors simultaneously. The experimental results show that the proposed scheme achieves significant speedup on GPU and FPGA, and has comparable ratedistortion performance with respect to sequential fast motion estimation algorithm

    Algoritmo de estimação de movimento e sua arquitetura de hardware para HEVC

    Get PDF
    Doutoramento em Engenharia EletrotécnicaVideo coding has been used in applications like video surveillance, video conferencing, video streaming, video broadcasting and video storage. In a typical video coding standard, many algorithms are combined to compress a video. However, one of those algorithms, the motion estimation is the most complex task. Hence, it is necessary to implement this task in real time by using appropriate VLSI architectures. This thesis proposes a new fast motion estimation algorithm and its implementation in real time. The results show that the proposed algorithm and its motion estimation hardware architecture out performs the state of the art. The proposed architecture operates at a maximum operating frequency of 241.6 MHz and is able to process 1080p@60Hz with all possible variables block sizes specified in HEVC standard as well as with motion vector search range of up to ±64 pixels.A codificação de vídeo tem sido usada em aplicações tais como, vídeovigilância, vídeo-conferência, video streaming e armazenamento de vídeo. Numa norma de codificação de vídeo, diversos algoritmos são combinados para comprimir o vídeo. Contudo, um desses algoritmos, a estimação de movimento é a tarefa mais complexa. Por isso, é necessário implementar esta tarefa em tempo real usando arquiteturas de hardware apropriadas. Esta tese propõe um algoritmo de estimação de movimento rápido bem como a sua implementação em tempo real. Os resultados mostram que o algoritmo e a arquitetura de hardware propostos têm melhor desempenho que os existentes. A arquitetura proposta opera a uma frequência máxima de 241.6 MHz e é capaz de processar imagens de resolução 1080p@60Hz, com todos os tamanhos de blocos especificados na norma HEVC, bem como um domínio de pesquisa de vetores de movimento até ±64 pixels

    Traitement des signaux et images en temps réel ("implantation de H.264 sur MPSoC")

    Get PDF
    Cette thèse est élaborée en cotutelle entre l université Badji Mokhtar (Laboratoire LERICA) et l université de bourgogne (Laboratoire LE2I, UMR CNRS 5158). Elle constitue une contribution à l étude et l implantation de l encodeur H.264/AVC. Durent l évolution des normes de compression vidéo, une réalité sure est vérifiée de plus en plus : avoir une bonne performance du processus de compression nécessite l élaboration d équipements beaucoup plus performants en termes de puissance de calcul, de flexibilité et de portabilité et ceci afin de répondre aux exigences des différents traitements et satisfaire au critère Temps Réel . Pour assurer un temps réel pour ce genre d applications, une solution reste possible est l utilisation des systèmes sur puce (SoC) ou bien des systèmes multiprocesseurs sur puce (MPSoC) implantés sur des plateformes reconfigurables à base de circuit FPGA. L objective de cette thèse consiste à l étude et l implantation des algorithmes de traitement des signaux et images et en particulier la norme H.264/AVC, et cela dans le but d assurer un temps réel pour le cycle codage-décodage. Nous utilisons deux plateformes FPGA de Xilinx (ML501 et XUPV5). Dans la littérature, il existe déjà plusieurs implémentations du décodeur. Pour l encodeur, malgré les efforts énormes réalisés, il reste toujours du travail pour l optimisation des algorithmes et l extraction des parallélismes possibles surtout avec une variété de profils et de niveaux de la norme H.264/AVC.Dans un premier temps de cette thèse, nous proposons une implantation matérielle d un contrôleur mémoire spécialement pour l encodeur H.264/AVC. Ce contrôleur est réalisé en ajoutant, au contrôleur mémoire DDR2 des deux plateformes de Xilinx, une couche intelligente capable de calculer les adresses et récupérer les données nécessaires pour les différents modules de traitement de l encodeur. Ensuite, nous proposons des implantations matérielles (niveau RTL) des modules de traitement de l encodeur H.264. Sur ces implantations, nous allons exploiter les deux principes de parallélisme et de pipelining autorisé par l encodeur en vue de la grande dépendance inter-blocs. Nous avons ainsi proposé plusieurs améliorations et nouvelles techniques dans les modules de la chaine Intra et le filtre anti-blocs. A la fin de cette thèse, nous utilisons les modules réalisés en matériels pour la l implantation Matérielle/logicielle de l encodeur H.264/AVC. Des résultats de synthèse et de simulation, en utilisant les deux plateformes de Xilinx, sont montrés et comparés avec les autres implémentations existantesThis thesis has been carried out in joint supervision between the Badji Mokhtar University (LERICA Laboratory) and the University of Burgundy (LE2I laboratory, UMR CNRS 5158). It is a contribution to the study and implementation of the H.264/AVC encoder. The evolution in video coding standards have historically demanded stringent performances of the compression process, which imposes the development of platforms that perform much better in terms of computing power, flexibility and portability. Such demands are necessary to fulfill requirements of the different treatments and to meet "Real Time" processing constraints. In order to ensure real-time performances, a possible solution is to made use of systems on chip (SoC) or multiprocessor systems on chip (MPSoC) built on platforms based reconfigurable FPGAs. The objective of this thesis is the study and implementation of algorithms for signal and image processing (in particular the H.264/AVC standard); especial attention was given to provide real-time coding-decoding cycles. We use two FPGA platforms (ML501 and XUPV5 from Xilinx) to implement our architectures. In the literature, there are already several implementations of the decoder. For the encoder part, despite the enormous efforts made, work remains to optimize algorithms and extract the inherent parallelism of the architecture. This is especially true with a variety of profiles and levels of H.264/AVC. Initially, we proposed a hardware implementation of a memory controller specifically targeted to the H.264/AVC encoder. This controller is obtained by adding, to the DDR2 memory controller, an intelligent layer capable of calculating the addresses and to retrieve the necessary data for several of the processing modules of the encoder. Afterwards, we proposed hardware implementations (RTL) for the processing modules of the H.264 encoder. In these implementations, we made use of principles of parallelism and pipelining, taking into account the constraints imposed by the inter-block dependency in the encoder. We proposed several enhancements and new technologies in the channel Intra modules and the deblocking filter. At the end of this thesis, we use the modules implemented in hardware for implementing the H.264/AVC encoder in a hardware/software design. Synthesis and simulation results, using both platforms for Xilinx, are shown and compared with other existing implementationsDIJON-BU Doc.électronique (212319901) / SudocSudocFranceF

    Estudio de Arquitecturas VLSI de la etapa de predicción de la compensación de movimiento, para compresión de imágenes y video con Algoritmos full-search. Aplicación al estándar H.264/AVC

    Full text link
    En esta tesis doctoral se presenta el diseño y realización de arquitecturas VLSI de estimación de movimiento, en sus versiones de pixeles enteros y fraccionarios, para la etapa de predicción de la compensación de movimiento del estándar de codificación de video H.264/AVC. Las arquitecturas propuestas son estructuras de procesamiento pipeline-paralelas con alta eficiencia en su data_path y una administración optima de la memoria. Utilizando el algoritmo full-search block matching, los diseños cumplen los requerimientos de tamaño de bloque variable y resolución de ¼ de píxel del estándar con máxima calidad. Los estimadores de movimiento combinan las características de las arquitecturas consideradas en el estado del arte junto con la aplicación de nuevos esquemas y algoritmos hardware, en el proceso de codificación del componente luma de la señal de video. Diseñadas como coprocesadores de aceleración hardware para procesadores de 32 bits, las arquitecturas que se presentan han sido simuladas y sintetizadas para FPGA Virtex-4 de Xilinx, utilizando el lenguaje de descripción de hardware VHDL.Mora Campos, A. (2008). Estudio de Arquitecturas VLSI de la etapa de predicción de la compensación de movimiento, para compresión de imágenes y video con Algoritmos full-search. Aplicación al estándar H.264/AVC [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/3446Palanci

    FPSoC-Based Architecture for a Fast Motion Estimation Algorithm in H.264/AVC

    No full text
    There is an increasing need for high quality video on low power, portable devices. Possible target applications range from entertainment and personal communications to security and health care. While H.264/AVC answers the need for high quality video at lower bit rates, it is significantly more complex than previous coding standards and thus results in greater power consumption in practical implementations. In particular, motion estimation (ME), in H.264/AVC consumes the largest power in an H.264/AVC encoder. It is therefore critical to speed-up integer ME in H.264/AVC via fast motion estimation (FME) algorithms and hardware acceleration. In this paper, we present our hardware oriented modifications to a hybrid FME algorithm, our architecture based on the modified algorithm, and our implementation and prototype on a PowerPC-based Field Programmable System on Chip (FPSoC). Our results show that the modified hybrid FME algorithm on average, outperforms previous state-of-the-art FME algorithms, while its losses when compared with FSME, in terms of PSNR performance and computation time, are insignificant. We show that although our implementation platform is FPGA-based, our implementation results compare favourably with previous architectures implemented on ASICs. Finally we also show an improvement over some existing architectures implemented on FPGAs
    corecore