24 research outputs found

    Methodology and optimizing of multiple frame format buffering within FPGA H.264/AVC decoder with FRExt.

    Get PDF
    Digital representation of video data is an inherently resource demanding problem that continues to necessitate the development and refinement of coding methods. The H.264/AVC standard, along with its recent Fidelity Range Extensions amendment (FRExt), is quickly being adopted as the standard codec for broadcast and distribution of high definition video. The FRExt amendment, while not necessarily affecting the overall decoder architecture, presents an added complexity of providing efficient memory management for buffering intermediate frames of various pixel color samplings and depths. This thesis evaluated the role of designing the frame buffer of a hardware video decoder, with integrated support for the H.264/AVC codec plus FRExt. With focus on organizing external memory data access, the frame buffer was designed to provide intermediate data storage for the decoder, while using an efficient store and load scheme that takes into consideration each frame pixel format of the video data. VHDL was used to model the frame buffer. Exploitation of reconfigurability and post-synthesis FPGA simulations were used to evaluate behavior, scalability and power consumption, while providing an analysis of approaches to adding FRExt to the memory management. Real-time buffer performance was achieved for two common frame formats at 1080 HD resolution; and an innovative pipeline design provides dynamic switching of formats between video sequences. As an additional consequence of verifying the model, a preexisting Baseline H.264/AVC decoder testbench was augmented to support testing of multiple frame formats

    Network-on-Chip Based H.264 Video Decoder on a Field Programmable Gate Array

    Get PDF
    This thesis develops the first fully network-on-chip (NoC) based h.264 video decoder implemented in real hardware on a field programmable gate array (FPGA). This thesis starts with an overview of the h.264 video coding standard and an introduction to the NoC communication paradigm. Following this, a series of processing elements (PEs) are developed which implement the component algorithms making up the h.264 video decoder. These PEs, described primarily in VHDL with some Verilog and C, are then mapped to an NoC which is generated using the CONNECT NoC generation tool. To demonstrate the scalability of the proposed NoC based design, a second NoC based video decoder is implemented on a smaller FPGA using the same PEs on a more compact NoC topology. The performance of both decoders, as well as their component PEs, is evaluated on real hardware. An analysis of the performance results is conducted and recommendations for future work are made based on the results of this analysis. Aside from the development of the proposed decoder, a major contribution of this thesis is the release of all source materials for this design as open source hardware and software. The release of these materials will allow other researchers to more easily replicate this work, as well as create derivative works in the areas of NoC based designs for FPGA, video coding and decoding, and related areas

    Hardware Implementation of a High Speed Deblocking Filter for the H.264 Video Codec

    Get PDF
    H.264/MPEG-4 part 10 or Advanced Video Coding (AVC) is a standard for video compression. MPEG-4 is currently one of the most widely used formats for recording, compression and distribution of high definition video. One feature of the AVC codec is the inclusion of an in-loop deblocking filter. The goal of the deblocking filter is to remove blocking artifacts that exist at macroblock boundaries. However, due to the complexity of the deblocking algorithm, the filter can easily account for one-third of the computational complexity of a decoder. In this thesis, a modification to the deblocking algorithm given in the AVC standard is presented. This modification allows the algorithm to finish the filtering of a macroblock to finish twenty clock cycles faster than previous single filter designs. This thesis also presents a hardware architecture of the H.264 deblocking filter to be used in the H.264 decoder. The developed architecture allows the filtering of videos streams using 4:2:2 chroma subsampling and 10-bit pixel precision in real-time. The filter was described in VHDL and synthesized for a Spartan-6 FPGA device. Timing analysis showed that is was capable of filtering a macroblock using 4:2:0 chroma subsampling in 124 clock cycles and 4:2:2 chroma subsampling streams in 162 clock cycles. The filter can also provide real-time deblocking of HDTV video (1920x1080) of up to 988 frames per second

    Video Compression from the Hardware Perspective

    Get PDF

    Codificação de vídeo MPEG-4 em FPGA

    Get PDF
    Mestrado em Engenharia Electrónica e TelecomunicaçõesO trabalho apresentado nesta dissertação foi realizado no âmbito do grupo ISG (Implementation Study Group) do MPEG (Motion Picture Expert Group). Tem como objectivo principal analisar o desempenho da implementação de diversões módulos da norma MPEG-4 e da sua versão avançada MPEG-4 AVC (Advanced Video Coding) em FPGA (Field Programmable Gate Array). A dissertação é essencialmente constituída por duas partes. Na primeira parte, estudam-se os conceitos básicos da codificação de vídeo, nomeadamente os conceitos integrantes da norma MPEG-4. De seguida, estuda-se a norma MPEG-4 AVC. Na segunda parte, esta dissertação apresenta o desenvolvimento de seis módulos implementados em VHDL e compilados para o circuito FPGA da XILINX, Virtex-II. Dos seis módulos, quatro constituem um descodificador MPEG-4 (excepto a compensação de movimento) enquanto os outros dois fazem parte do codificador MPEG-4 AVC.This dissertation shows the work developed in the framework of the ISG (Implementation Study Group) group. The main objective of this work is to analyse the performance of several MPEG-4 modules and MPEG-4 AVC (Advanced Video Coding) modules in FPGA (field Programmable Gate Array). This dissertation is organized into two parts. In the first part, the basic video coding concepts are studied, mainly the MPEG-4 concepts, followed by the MPEG-4 AVC standard. In the second part, it is discussed six modules implemented in VHDL and compiled for the XILINX Virtex-II FPGA. Four of those six developed modules are used to implement an MPEG-4 decoder (except the motion compensation stage) and the other two are modules of the MPEG-4 AVC coder

    Architecture design of a scalable adaptive deblocking filter for H.264/AVC

    Get PDF
    Due to significant bit-rate savings and improved perceptual quality, H.264/AVC, the latest video compression standard from the Joint Video Team, is receiving widespread adoption. Greater coding efficiency relative to previous standards is a result of additional techniques and features. One important change is the inclusion of an in-loop deblocking filter for removal of blocking artifacts. Since the filter can easily account for one-third of the computational complexity of a decoder, its addition was a source of debate during the development of the H.264/AVC standard. Ample research on architecture design of the deblocking filter has been carried out, generally targeted toward high performance profiles. To the best of our knowledge no other research investigated designs that can be scaled from low-power extended profiles up to high performance profiles. This work investigated the design of a scalable architecture for the deblocking filter. Four different designs were implemented. The relative performance of the designs were then compared against each other and existing research through simulation. All designs were targeted towards a Xilinx Virtex 5 field programmable gate array (FPGA)

    Parallel algorithms and architectures for low power video decoding

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010.Cataloged from PDF version of thesis.Includes bibliographical references (p. 197-204).Parallelism coupled with voltage scaling is an effective approach to achieve high processing performance with low power consumption. This thesis presents parallel architectures and algorithms designed to deliver the power and performance required for current and next generation video coding. Coding efficiency, area cost and scalability are also addressed. First, a low power video decoder is presented for the current state-of-the-art video coding standard H.264/AVC. Parallel architectures are used along with voltage scaling to deliver high definition (HD) decoding at low power levels. Additional architectural optimizations such as reducing memory accesses and multiple frequency/voltage domains are also described. An H.264/AVC Baseline decoder test chip was fabricated in 65-nm CMOS. It can operate at 0.7 V for HD (720p, 30 fps) video decoding and with a measured power of 1.8 mW. The highly scalable decoder can tradeoff power and performance across >100x range. Second, this thesis demonstrates how serial algorithms, such as Context-based Adaptive Binary Arithmetic Coding (CABAC), can be redesigned for parallel architectures to enable high throughput with low coding efficiency cost. A parallel algorithm called the Massively Parallel CABAC (MP-CABAC) is presented that uses syntax element partitions and interleaved entropy slices to achieve better throughput-coding efficiency and throughput-area tradeoffs than H.264/AVC. The parallel algorithm also improves scalability by providing a third dimension to tradeoff coding efficiency for power and performance. Finally, joint algorithm-architecture optimizations are used to increase performance and reduce area with almost no coding penalty. The MP-CABAC is mapped to a highly parallel architecture with 80 parallel engines, which together delivers >10x higher throughput than existing H.264/AVC CABAC implementations. A MP-CABAC test chip was fabricated in 65-nm CMOS to demonstrate the power-performance-coding efficiency tradeoff.by Vivienne. Sze.Ph.D

    H.264 Decoder: A Case Study in Multiple Design Points

    Full text link
    H.264, a state-of-the-art video compression standard, is used across a range of products from cell phones to HDTV. These products have vastly different performance, power and cost requirements, necessitating different hardware-software solutions for H.264 decoding. We show that a de-sign methodology and associated tools which support syn-thesis from high-level descriptions and which allow mod-ular refinement throughout the design cycle, can share the majority of design effort across multiple design points. Us-ing Bluespec SystemVerilog, we have created a variety of designs for the H.264 decoder tuned to support decod-ing at resolutions ranging from QCIF video (176 × 14

    Dynamically Reconfigurable Architectures and Systems for Time-varying Image Constraints (DRASTIC) for Image and Video Compression

    Get PDF
    In the current information booming era, image and video consumption is ubiquitous. The associated image and video coding operations require significant computing resources for both small-scale computing systems as well as over larger network systems. For different scenarios, power, bitrate and image quality can impose significant time-varying constraints. For example, mobile devices (e.g., phones, tablets, laptops, UAVs) come with significant constraints on energy and power. Similarly, computer networks provide time-varying bandwidth that can depend on signal strength (e.g., wireless networks) or network traffic conditions. Alternatively, the users can impose different constraints on image quality based on their interests. Traditional image and video coding systems have focused on rate-distortion optimization. More recently, distortion measures (e.g., PSNR) are being replaced by more sophisticated image quality metrics. However, these systems are based on fixed hardware configurations that provide limited options over power consumption. The use of dynamic partial reconfiguration with Field Programmable Gate Arrays (FPGAs) provides an opportunity to effectively control dynamic power consumption by jointly considering software-hardware configurations. This dissertation extends traditional rate-distortion optimization to rate-quality-power/energy optimization and demonstrates a wide variety of applications in both image and video compression. In each application, a family of Pareto-optimal configurations are developed that allow fine control in the rate-quality-power/energy optimization space. The term Dynamically Reconfiguration Architecture Systems for Time-varying Image Constraints (DRASTIC) is used to describe the derived systems. DRASTIC covers both software-only as well as software-hardware configurations to achieve fine optimization over a set of general modes that include: (i) maximum image quality, (ii) minimum dynamic power/energy, (iii) minimum bitrate, and (iv) typical mode over a set of opposing constraints to guarantee satisfactory performance. In joint software-hardware configurations, DRASTIC provides an effective approach for dynamic power optimization. For software configurations, DRASTIC provides an effective method for energy consumption optimization by controlling processing times. The dissertation provides several applications. First, stochastic methods are given for computing quantization tables that are optimal in the rate-quality space and demonstrated on standard JPEG compression. Second, a DRASTIC implementation of the DCT is used to demonstrate the effectiveness of the approach on motion JPEG. Third, a reconfigurable deblocking filter system is investigated for use in the current H.264/AVC systems. Fourth, the dissertation develops DRASTIC for all 35 intra-prediction modes as well as intra-encoding for the emerging High Efficiency Video Coding standard (HEVC)

    High-Level Synthesis Based VLSI Architectures for Video Coding

    Get PDF
    High Efficiency Video Coding (HEVC) is state-of-the-art video coding standard. Emerging applications like free-viewpoint video, 360degree video, augmented reality, 3D movies etc. require standardized extensions of HEVC. The standardized extensions of HEVC include HEVC Scalable Video Coding (SHVC), HEVC Multiview Video Coding (MV-HEVC), MV-HEVC+ Depth (3D-HEVC) and HEVC Screen Content Coding. 3D-HEVC is used for applications like view synthesis generation, free-viewpoint video. Coding and transmission of depth maps in 3D-HEVC is used for the virtual view synthesis by the algorithms like Depth Image Based Rendering (DIBR). As first step, we performed the profiling of the 3D-HEVC standard. Computational intensive parts of the standard are identified for the efficient hardware implementation. One of the computational intensive part of the 3D-HEVC, HEVC and H.264/AVC is the Interpolation Filtering used for Fractional Motion Estimation (FME). The hardware implementation of the interpolation filtering is carried out using High-Level Synthesis (HLS) tools. Xilinx Vivado Design Suite is used for the HLS implementation of the interpolation filters of HEVC and H.264/AVC. The complexity of the digital systems is greatly increased. High-Level Synthesis is the methodology which offers great benefits such as late architectural or functional changes without time consuming in rewriting of RTL-code, algorithms can be tested and evaluated early in the design cycle and development of accurate models against which the final hardware can be verified
    corecore