520 research outputs found

    Optimization of the motion estimation for parallel embedded systems in the context of new video standards

    Get PDF
    15 pagesInternational audienceThe effciency of video compression methods mainly depends on the motion compensation stage, and the design of effcient motion estimation techniques is still an important issue. An highly accurate motion estimation can significantly reduce the bit-rate, but involves a high computational complexity. This is particularly true for new generations of video compression standards, MPEG AVC and HEVC, which involves techniques such as different reference frames, sub-pixel estimation, variable block sizes. In this context, the design of fast motion estimation solutions is necessary, and can concerned two linked aspects: a high quality algorithm and its effcient implementation. This paper summarizes our main contributions in this domain. In particular, we first present the HME (Hierarchical Motion Estimation) technique. It is based on a multi-level refinement process where the motion estimation vectors are first estimated on a sub-sampled image. The multi-levels decomposition provides robust predictions and is particularly suited for variable block sizes motion estimations. The HME method has been integrated in a AVC encoder, and we propose a parallel implementation of this technique, with the motion estimation at pixel level performed by a DSP processor, and the sub-pixel refinement realized in an FPGA. The second technique that we present is called HDS for Hierarchical Diamond Search. It combines the multi-level refinement of HME, with a fast search at pixel-accuracy inspired by the EPZS method. This paper also presents its parallel implementation onto a multi-DSP platform and the its use in the HEVC context

    Implementing video compression algorithms on reconfigurable devices

    Get PDF
    The increasing density offered by Field Programmable Gate Arrays(FPGA), coupled with their short design cycle, has made them a popular choice for implementing a wide range of algorithms and complete systems. In this thesis the implementation of video compression algorithms on FPGAs is studied. Two areas are specifically focused on; the integration of a video encoder into a complete system and the power consumption of FPGA based video encoders. Two FPGA based video compression systems are described, one which targets surveillance applications and one which targets video conferencing applications. The FPGA video surveillance system makes use of a novel memory format to improve the efficiency with which input video sequences can be loaded over the system bus. The power consumption of a FPGA video encoder is analyzed. The results indicating that the motion estimation encoder stage requires the most power consumption. An algorithm, which reuses the intra prediction results generated during the encoding process, is then proposed to reduce the power consumed on an FPGA video encoder’s external memory bus. Finally, the power reduction algorithm is implemented within an FPGA video encoder. Results are given showing that, in addition to reducing power on the external memory bus, the algorithm also reduces power in the motion estimation stage of a FPGA based video encoder

    Video Compression from the Hardware Perspective

    Get PDF

    Architecture and algorithms for the implementation of digital wireless receivers in FPGA and ASIC: ISDB-T and DVB-S2 cases

    Full text link
    [EN] The first generation of Terrestrial Digital Television(DTV) has been in service for over a decade. In 2013, several countries have already completed the transition from Analog to Digital TV Broadcasting, most of which in Europe. In South America, after several studies and trials, Brazil adopted the Japanese standard with some innovations. Japan and Brazil started Digital Terrestrial Television Broadcasting (DTTB) services in December 2003 and December 2007 respectively, using Integrated Services Digital Broadcasting - Terrestrial (ISDB-T), also known as ARIB STD-B31. In June 2005 the Committee for the Information Technology Area (CATI) of Brazilian Ministry of Science and Technology and Innovation MCTI approved the incorporation of the IC-Brazil Program, in the National Program for Microelectronics (PNM) . The main goals of IC-Brazil are the formal qualification of IC designers, support to the creation of semiconductors companies focused on projects of ICs within Brazil, and the attraction of semiconductors companies focused on the design and development of ICs in Brazil. The work presented in this thesis originated from the unique momentum created by the combination of the birth of Digital Television in Brazil and the creation of the IC-Brazil Program by the Brazilian government. Without this combination it would not have been possible to make these kind of projects in Brazil. These projects have been a long and costly journey, albeit scientifically and technologically worthy, towards a Brazilian DTV state-of-the-art low complexity Integrated Circuit, with good economy scale perspectives, due to the fact that at the beginning of this project ISDB-T standard was not adopted by several countries like DVB-T. During the development of the ISDB-T receiver proposed in this thesis, it was realized that due to the continental dimensions of Brazil, the DTTB would not be enough to cover the entire country with open DTV signal, specially for the case of remote localizations far from the high urban density regions. Then, Eldorado Research Institute and Idea! Electronic Systems, foresaw that, in a near future, there would be an open distribution system for high definition DTV over satellite, in Brazil. Based on that, it was decided by Eldorado Research Institute, that would be necessary to create a new ASIC for broadcast satellite reception. At that time DVB-S2 standard was the strongest candidate for that, and this assumption still stands nowadays. Therefore, it was decided to apply to a new round of resources funding from the MCTI - that was granted - in order to start the new project. This thesis discusses in details the Architecture and Algorithms proposed for the implementation of a low complexity Intermediate Frequency(IF) ISDB-T Receiver on Application Specific Integrated Circuit (ASIC) CMOS. The Architecture proposed here is highly based on the COordinate Rotation Digital Computer (CORDIC) Algorithm, that is a simple and efficient algorithm suitable for VLSI implementations. The receiver copes with the impairments inherent to wireless channels transmission and the receiver crystals. The thesis also discusses the Methodology adopted and presents the implementation results. The receiver performance is presented and compared to those obtained by means of simulations. Furthermore, the thesis also presents the Architecture and Algorithms for a DVB-S2 receiver targeting its ASIC implementation. However, unlike the ISDB-T receiver, only preliminary ASIC implementation results are introduced. This was mainly done in order to have an early estimation of die area to prove that the project in ASIC is economically viable, as well as to verify possible bugs in early stage. As in the case of ISDB-T receiver, this receiver is highly based on CORDIC algorithm and it was prototyped in FPGA. The Methodology used for the second receiver is derived from that used for the ISDB-T receiver, with minor additions given the project characteristics.[ES] La primera generación de Televisión Digital Terrestre(DTV) ha estado en servicio por más de una década. En 2013, varios países completaron la transición de transmisión analógica a televisión digital, la mayoría de ellas en Europa. En América del Sur, después de varios estudios y ensayos, Brasil adoptó el estándar japonés con algunas innovaciones. Japón y Brasil comenzaron a prestar el servicio de Difusión de Televisión Digital Terrestre (DTTB) en diciembre de 2003 y diciembre de 2007 respectivamente, utilizando Radiodifusión Digital de Servicios Integrados Terrestres (ISDB-T), también conocida como ARIB STD-B31. En junio de 2005, el Comité del Área de Tecnología de la Información (CATI) del Ministerio de Ciencia, Tecnología e Innovación de Brasil - MCTI aprobó la incorporación del Programa CI-Brasil, en el Programa Nacional de Microelectrónica (PNM). Los principales objetivos de la CI-Brasil son la formación de diseñadores de CIs, apoyar la creación de empresas de semiconductores enfocadas en proyectos de circuitos integrados dentro de Brasil, y la atracción de empresas de semiconductores interesadas en el diseño y desarrollo de circuitos integrados. El trabajo presentado en esta tesis se originó en el impulso único creado por la combinación del nacimiento de la televisión digital en Brasil y la creación del Programa de CI-Brasil por el gobierno brasileño. Sin esta combinación no hubiera sido posible realizar este tipo de proyectos en Brasil. Estos proyectos han sido un trayecto largo y costoso, aunque meritorio desde el punto de vista científico y tecnológico, hacia un Circuito Integrado brasileño de punta y de baja complejidad para DTV, con buenas perspectivas de economía de escala debido al hecho que al inicio de este proyecto, el estándar ISDB-T no fue adoptado por varios países como DVB-T. Durante el desarrollo del receptor ISDB-T propuesto en esta tesis, se observó que debido a las dimensiones continentales de Brasil, la DTTB no sería suficiente para cubrir todo el país con la señal de televisión digital abierta, especialmente para el caso de localizaciones remotas, apartadas de las regiones de alta densidad urbana. En ese momento, el Instituto de Investigación Eldorado e Idea! Sistemas Electrónicos, previeron que en un futuro cercano habría un sistema de distribución abierto para DTV de alta definición por satélite en Brasil. Con base en eso, el Instituto de Investigación Eldorado decidió que sería necesario crear un nuevo ASIC para la recepción de radiodifusión por satélite, basada el estándar DVB-S2. En esta tesis se analiza en detalle la Arquitectura y algoritmos propuestos para la implementación de un receptor ISDB-T de baja complejidad y frecuencia intermedia (IF) en un Circuito Integrado de Aplicación Específica (ASIC) CMOS. La arquitectura aquí propuesta se basa fuertemente en el algoritmo Computadora Digital para Rotación de Coordenadas (CORDIC), el cual es un algoritmo simple, eficiente y adecuado para implementaciones VLSI. El receptor hace frente a las deficiencias inherentes a las transmisiones por canales inalámbricos y los cristales del receptor. La tesis también analiza la metodología adoptada y presenta los resultados de la implementación. Por otro lado, la tesis también presenta la arquitectura y los algoritmos para un receptor DVB-S2 dirigido a la implementación en ASIC. Sin embargo, a diferencia del receptor ISDB-T, se introducen sólo los resultados preliminares de implementación en ASIC. Esto se hizo principalmente con el fin de tener una estimación temprana del área del die para demostrar que el proyecto en ASIC es económicamente viable, así como para verificar posibles errores en etapa temprana. Como en el caso de receptor ISDB-T, este receptor se basa fuertemente en el algoritmo CORDIC y fue un prototipado en FPGA. La metodología utilizada para el segundo receptor se deriva de la utilizada para el re[CA] La primera generació de Televisió Digital Terrestre (TDT) ha estat en servici durant més d'una dècada. En 2013, diversos països ja van completar la transició de la radiodifusió de televisió analògica a la digital, i la majoria van ser a Europa. A Amèrica del Sud, després de diversos estudis i assajos, Brasil va adoptar l'estàndard japonés amb algunes innovacions. Japó i Brasil van començar els servicis de Radiodifusió de Televisió Terrestre Digital (DTTB) al desembre de 2003 i al desembre de 2007, respectivament, utilitzant la Radiodifusió Digital amb Servicis Integrats de (ISDB-T), coneguda com a ARIB STD-B31. Al juny de 2005, el Comité de l'Àrea de Tecnologia de la Informació (CATI) del Ministeri de Ciència i Tecnologia i Innovació del Brasil (MCTI) va aprovar la incorporació del programa CI Brasil al Programa Nacional de Microelectrònica (PNM). Els principals objectius de CI Brasil són la qualificació formal dels dissenyadors de circuits integrats, el suport a la creació d'empreses de semiconductors centrades en projectes de circuits integrats dins del Brasil i l'atracció d'empreses de semiconductors centrades en el disseny i desenvolupament de circuits integrats. El treball presentat en esta tesi es va originar en l'impuls únic creat per la combinació del naixement de la televisió digital al Brasil i la creació del programa Brasil CI pel govern brasiler. Sense esta combinació no hauria estat possible realitzar este tipus de projectes a Brasil. Estos projectes han suposat un viatge llarg i costós, tot i que digne científicament i tecnològica, cap a un circuit integrat punter de baixa complexitat per a la TDT brasilera, amb bones perspectives d'economia d'escala perquè a l'inici d'este projecte l'estàndard ISDB-T no va ser adoptat per diversos països, com el DVB-T. Durant el desenvolupament del receptor de ISDB-T proposat en esta tesi, va resultar que, a causa de les dimensions continentals de Brasil, la DTTB no seria suficient per cobrir tot el país amb el senyal de TDT oberta, especialment pel que fa a les localitzacions remotes allunyades de les regions d'alta densitat urbana.. En este moment, l'Institut de Recerca Eldorado i Idea! Sistemes Electrònics van preveure que, en un futur pròxim, no hi hauria a Brasil un sistema de distribució oberta de TDT d'alta definició a través de satèl¿lit. D'acord amb això, l'Institut de Recerca Eldorado va decidir que seria necessari crear un nou ASIC per a la recepció de radiodifusió per satèl¿lit. basat en l'estàndard DVB-S2. En esta tesi s'analitza en detall l'arquitectura i els algorismes proposats per l'execució d'un receptor ISDB-T de Freqüència Intermèdia (FI) de baixa complexitat sobre CMOS de Circuit Integrat d'Aplicacions Específiques (ASIC). L'arquitectura ací proposada es basa molt en l'algorisme de l'Ordinador Digital de Rotació de Coordenades (CORDIC), que és un algorisme simple i eficient adequat per implementacions VLSI. El receptor fa front a les deficiències inherents a la transmissió de canals sense fil i els cristalls del receptor. Esta tesi també analitza la metodologia adoptada i presenta els resultats de l'execució. Es presenta el rendiment del receptor i es compara amb els obtinguts per mitjà de simulacions. D'altra banda, esta tesi també presenta l'arquitectura i els algorismes d'un receptor de DVB-S2 de cara a la seua implementació en ASIC. No obstant això, a diferència del receptor ISDB-T, només s'introdueixen resultats preliminars d'implementació en ASIC. Això es va fer principalment amb la finalitat de tenir una estimació primerenca de la zona de dau per demostrar que el projecte en ASIC és econòmicament viable, així com per verificar possibles errors en l'etapa primerenca. Com en el cas del receptor ISDB-T, este receptor es basa molt en l'algorisme CORDIC i va ser un prototip de FPGA. La metodologia utilitzada per al segon receptor es deriva de la utilitzada per al receptor IRodrigues De Lima, E. (2016). Architecture and algorithms for the implementation of digital wireless receivers in FPGA and ASIC: ISDB-T and DVB-S2 cases [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/61967TESI

    Low Power Architectures for MPEG-4 AVC/H.264 Video Compression

    Get PDF

    VLSI architecture design approaches for real-time video processing

    Get PDF
    This paper discusses the programmable and dedicated approaches for real-time video processing applications. Various VLSI architecture including the design examples of both approaches are reviewed. Finally, discussions of several practical designs in real-time video processing applications are then considered in VLSI architectures to provide significant guidelines to VLSI designers for any further real-time video processing design works

    Implementation of JPEG compression and motion estimation on FPGA hardware

    Full text link
    A hardware implementation of JPEG allows for real-time compression in data intensivve applications, such as high speed scanning, medical imaging and satellite image transmission. Implementation options include dedicated DSP or media processors, FPGA boards, and ASICs. Factors that affect the choice of platform selection involve cost, speed, memory, size, power consumption, and case of reconfiguration. The proposed hardware solution is based on a Very high speed integrated circuit Hardware Description Language (VHDL) implememtation of the codec with prefered realization using an FPGA board due to speed, cost and flexibility factors; The VHDL language is commonly used to model hardware impletations from a top down perspective. The VHDL code may be simulated to correct mistakes and subsequently synthesized into hardware using a synthesis tool, such as the xilinx ise suite. The same VHDL code may be synthesized into a number of sifferent hardware architetcures based on constraints given. For example speed was the major constraint when synthesizing the pipeline of jpeg encoding and decoding, while chip area and power consumption were primary constraints when synthesizing the on-die memory because of large area. Thus, there is a trade off between area and speed in logic synthesis

    SoftCast

    Get PDF
    The focus of this demonstration is the performance of streaming video over the mobile wireless channel. We compare two schemes: the standard approach to video which transmits H.264/AVC-encoded stream over 802.11-like PHY, and SoftCast -- a clean-slate design for wireless video where the source transmits one video stream that each receiver decodes to a video quality commensurate with its specific instantaneous channel quality

    Reconfigurable Architecture For H.264/avc Variable Block Size Motion Estimation Based On Motion Activity And Adaptive Search Range

    Get PDF
    Motion Estimation (ME) technique plays a key role in the video coding systems to achieve high compression ratios by removing temporal redundancies among video frames. Especially in the newest H.264/AVC video coding standard, ME engine demands large amount of computational capabilities due to its support for wide range of different block sizes for a given macroblock in order to increase accuracy in finding best matching block in the previous frames. We propose scalable architecture for H.264/AVC Variable Block Size (VBS) Motion Estimation with adaptive computing capability to support various search ranges, input video resolutions, and frame rates. Hardware architecture of the proposed ME consists of scalable Sum of Absolute Difference (SAD) arrays which can perform Full Search Block Matching Algorithm (FSBMA) for smaller 4x4 blocks. It is also shown that by predicting motion activity and adaptively adjusting the Search Range (SR) on the reconfigurable hardware platform, the computational cost of ME required for inter-frame encoding in H.264/AVC video coding standard can be reduced significantly. Dynamic Partial Reconfiguration is a unique feature of Field Programmable Gate Arrays (FPGAs) that makes best use of hardware resources and power by allowing adaptive algorithm to be implemented during run-time. We exploit this feature of FPGA to implement the proposed reconfigurable architecture of ME and maximize the architectural benefits through prediction of motion activities in the video sequences ,adaptation of SR during run-time, and fractional ME refinement. The implemented ME architecture can support real time applications at a maximum frequency of 90MHz with multiple reconfigurable regions. iv When compared to reconfiguration of complete design, partial reconfiguration process results in smaller bitstream size which allows FPGA to implement different configurations at higher speed. The proposed architecture has modular structure, regular data flow, and efficient memory organization with lower memory accesses. By increasing the number of active partial reconfigurable modules from one to four, there is a 4 fold increase in data re-use. Also, by introducing adaptive SR reduction algorithm at frame level, the computational load of ME is reduced significantly with only small degradation in PSNR (≤0.1dB)

    Algorithms and Hardware Co-Design of HEVC Intra Encoders

    Get PDF
    Digital video is becoming extremely important nowadays and its importance has greatly increased in the last two decades. Due to the rapid development of information and communication technologies, the demand for Ultra-High Definition (UHD) video applications is becoming stronger. However, the most prevalent video compression standard H.264/AVC released in 2003 is inefficient when it comes to UHD videos. The increasing desire for superior compression efficiency to H.264/AVC leads to the standardization of High Efficiency Video Coding (HEVC). Compared with the H.264/AVC standard, HEVC offers a double compression ratio at the same level of video quality or substantial improvement of video quality at the same video bitrate. Yet, HE-VC/H.265 possesses superior compression efficiency, its complexity is several times more than H.264/AVC, impeding its high throughput implementation. Currently, most of the researchers have focused merely on algorithm level adaptations of HEVC/H.265 standard to reduce computational intensity without considering the hardware feasibility. What’s more, the exploration of efficient hardware architecture design is not exhaustive. Only a few research works have been conducted to explore efficient hardware architectures of HEVC/H.265 standard. In this dissertation, we investigate efficient algorithm adaptations and hardware architecture design of HEVC intra encoders. We also explore the deep learning approach in mode prediction. From the algorithm point of view, we propose three efficient hardware-oriented algorithm adaptations, including mode reduction, fast coding unit (CU) cost estimation, and group-based CABAC (context-adaptive binary arithmetic coding) rate estimation. Mode reduction aims to reduce mode candidates of each prediction unit (PU) in the rate-distortion optimization (RDO) process, which is both computation-intensive and time-consuming. Fast CU cost estimation is applied to reduce the complexity in rate-distortion (RD) calculation of each CU. Group-based CABAC rate estimation is proposed to parallelize syntax elements processing to greatly improve rate estimation throughput. From the hardware design perspective, a fully parallel hardware architecture of HEVC intra encoder is developed to sustain UHD video compression at 4K@30fps. The fully parallel architecture introduces four prediction engines (PE) and each PE performs the full cycle of mode prediction, transform, quantization, inverse quantization, inverse transform, reconstruction, rate-distortion estimation independently. PU blocks with different PU sizes will be processed by the different prediction engines (PE) simultaneously. Also, an efficient hardware implementation of a group-based CABAC rate estimator is incorporated into the proposed HEVC intra encoder for accurate and high-throughput rate estimation. To take advantage of the deep learning approach, we also propose a fully connected layer based neural network (FCLNN) mode preselection scheme to reduce the number of RDO modes of luma prediction blocks. All angular prediction modes are classified into 7 prediction groups. Each group contains 3-5 prediction modes that exhibit a similar prediction angle. A rough angle detection algorithm is designed to determine the prediction direction of the current block, then a small scale FCLNN is exploited to refine the mode prediction
    corecore