1,513 research outputs found

    Motion estimation and CABAC VLSI co-processors for real-time high-quality H.264/AVC video coding

    Get PDF
    Real-time and high-quality video coding is gaining a wide interest in the research and industrial community for different applications. H.264/AVC, a recent standard for high performance video coding, can be successfully exploited in several scenarios including digital video broadcasting, high-definition TV and DVD-based systems, which require to sustain up to tens of Mbits/s. To that purpose this paper proposes optimized architectures for H.264/AVC most critical tasks, Motion estimation and context adaptive binary arithmetic coding. Post synthesis results on sub-micron CMOS standard-cells technologies show that the proposed architectures can actually process in real-time 720 × 480 video sequences at 30 frames/s and grant more than 50 Mbits/s. The achieved circuit complexity and power consumption budgets are suitable for their integration in complex VLSI multimedia systems based either on AHB bus centric on-chip communication system or on novel Network-on-Chip (NoC) infrastructures for MPSoC (Multi-Processor System on Chip

    Hardware acceleration architectures for MPEG-Based mobile video platforms: a brief overview

    Get PDF
    This paper presents a brief overview of past and current hardware acceleration (HwA) approaches that have been proposed for the most computationally intensive compression tools of the MPEG-4 standard. These approaches are classified based on their historical evolution and architectural approach. An analysis of both evolutionary and functional classifications is carried out in order to speculate on the possible trends of the HwA architectures to be employed in mobile video platforms

    VLSI architectures design for encoders of High Efficiency Video Coding (HEVC) standard

    Get PDF
    The growing popularity of high resolution video and the continuously increasing demands for high quality video on mobile devices are producing stronger needs for more efficient video encoder. Concerning these desires, HEVC, a newest video coding standard, has been developed by a joint team formed by ISO/IEO MPEG and ITU/T VCEG. Its design goal is to achieve a 50% compression gain over its predecessor H.264 with an equal or even higher perceptual video quality. Motion Estimation (ME) being as one of the most critical module in video coding contributes almost 50%-70% of computational complexity in the video encoder. This high consumption of the computational resources puts a limit on the performance of encoders, especially for full HD or ultra HD videos, in terms of coding speed, bit-rate and video quality. Thus the major part of this work concentrates on the computational complexity reduction and improvement of timing performance of motion estimation algorithms for HEVC standard. First, a new strategy to calculate the SAD (Sum of Absolute Difference) for motion estimation is designed based on the statistics on property of pixel data of video sequences. This statistics demonstrates the size relationship between the sum of two sets of pixels has a determined connection with the distribution of the size relationship between individual pixels from the two sets. Taking the advantage of this observation, only a small proportion of pixels is necessary to be involved in the SAD calculation. Simulations show that the amount of computations required in the full search algorithm is reduced by about 58% on average and up to 70% in the best case. Secondly, from the scope of parallelization an enhanced TZ search for HEVC is proposed using novel schemes of multiple MVPs (motion vector predictor) and shared MVP. Specifically, resorting to multiple MVPs the initial search process is performed in parallel at multiple search centers, and the ME processing engine for PUs within one CU are parallelized based on the MVP sharing scheme on CU (coding unit) level. Moreover, the SAD module for ME engine is also parallelly implemented for PU size of 32×32. Experiments indicate it achieves an appreciable improvement on the throughput and coding efficiency of the HEVC video encoder. In addition, the other part of this thesis is contributed to the VLSI architecture design for finding the first W maximum/minimum values targeting towards high speed and low hardware cost. The architecture based on the novel bit-wise AND scheme has only half of the area of the best reference solution and its critical path delay is comparable with other implementations. While the FPCG (full parallel comparison grid) architecture, which utilizes the optimized comparator-based structure, achieves 3.6 times faster on average on the speed and even 5.2 times faster at best comparing with the reference architectures. Finally the architecture using the partial sorting strategy reaches a good balance on the timing performance and area, which has a slightly lower or comparable speed with FPCG architecture and a acceptable hardware cost

    Energy-efficient acceleration of MPEG-4 compression tools

    Get PDF
    We propose novel hardware accelerator architectures for the most computationally demanding algorithms of the MPEG-4 video compression standard-motion estimation, binary motion estimation (for shape coding), and the forward/inverse discrete cosine transforms (incorporating shape adaptive modes). These accelerators have been designed using general low-energy design philosophies at the algorithmic/architectural abstraction levels. The themes of these philosophies are avoiding waste and trading area/performance for power and energy gains. Each core has been synthesised targeting TSMC 0.09 μm TCBN90LP technology, and the experimental results presented in this paper show that the proposed cores improve upon the prior art

    Low power techniques for video compression

    Get PDF
    This paper gives an overview of low-power techniques proposed in the literature for mobile multimedia and Internet applications. Exploitable aspects are discussed in the behavior of different video compression tools. These power-efficient solutions are then classified by synthesis domain and level of abstraction. As this paper is meant to be a starting point for further research in the area, a lowpower hardware & software co-design methodology is outlined in the end as a possible scenario for video-codec-on-a-chip implementations on future mobile multimedia platforms

    Optimization of the motion estimation for parallel embedded systems in the context of new video standards

    Get PDF
    15 pagesInternational audienceThe effciency of video compression methods mainly depends on the motion compensation stage, and the design of effcient motion estimation techniques is still an important issue. An highly accurate motion estimation can significantly reduce the bit-rate, but involves a high computational complexity. This is particularly true for new generations of video compression standards, MPEG AVC and HEVC, which involves techniques such as different reference frames, sub-pixel estimation, variable block sizes. In this context, the design of fast motion estimation solutions is necessary, and can concerned two linked aspects: a high quality algorithm and its effcient implementation. This paper summarizes our main contributions in this domain. In particular, we first present the HME (Hierarchical Motion Estimation) technique. It is based on a multi-level refinement process where the motion estimation vectors are first estimated on a sub-sampled image. The multi-levels decomposition provides robust predictions and is particularly suited for variable block sizes motion estimations. The HME method has been integrated in a AVC encoder, and we propose a parallel implementation of this technique, with the motion estimation at pixel level performed by a DSP processor, and the sub-pixel refinement realized in an FPGA. The second technique that we present is called HDS for Hierarchical Diamond Search. It combines the multi-level refinement of HME, with a fast search at pixel-accuracy inspired by the EPZS method. This paper also presents its parallel implementation onto a multi-DSP platform and the its use in the HEVC context

    Efficient hardware architectures for MPEG-4 core profile

    Get PDF
    Efficient hardware acceleration architectures are proposed for the most demandingMPEG-4 core profile algorithms, namely; texture motion estimation (TME), binary motion estimation (BME)and the shape adaptive discrete cosine transform (SA-DCT). The proposed ME designs may also be used for H.264, since both architectures can handle variable block sizes. Both ME architectures employ early termination techniques that reduce latency and save needless memory accesses and power consumption. They also use a pixel subsampling technique to facilitate parallelism, while balancing the computational load. The BME datapath also saves operations by using Run Length Coded (RLC) pixel addressing. The SA-DCT module has a re-configuring multiplier-less serial datapath using adders and multiplexers only to improve area and power. The SA-DCT packing steps are done using a minimal switching addressing scheme with guarded evaluation. All three modules have been synthesised targeting the WildCard-II FPGA benchmarking platform adopted by the MPEG-4 Part9 reference hardware group
    corecore