186 research outputs found

    Design of a Processor Optimized for Syntax Parsing in Video Decoders

    No full text
    8International audienceHeterogeneous platforms aim to offer both performance and flexibility by providing designers processors and programmable logical units on a single platform. Processors implemented on these platforms are usually soft-cores (e.g. Altera NIOS) or ASIC (e.g. ARM Cortex-A8). However, these processors still face limitations in terms of performance compared to full hardware designs in particular for real-time video decoding applications. We present in this paper an innovative approach to improve performance using both a processor optimized for the syntax parsing (an Application-Specific Instruction-set Processor) and a FPGA. The case study has been synthesized on a Xilinx FPGA at a frequency of 100MHz and we estimate the performance that could be obtained with an ASIC

    Application-Specific Cache and Prefetching for HEVC CABAC Decoding

    Get PDF
    Context-based Adaptive Binary Arithmetic Coding (CABAC) is the entropy coding module in the HEVC/H.265 video coding standard. As in its predecessor, H.264/AVC, CABAC is a well-known throughput bottleneck due to its strong data dependencies. Besides other optimizations, the replacement of the context model memory by a smaller cache has been proposed for hardware decoders, resulting in an improved clock frequency. However, the effect of potential cache misses has not been properly evaluated. This work fills the gap by performing an extensive evaluation of different cache configurations. Furthermore, it demonstrates that application-specific context model prefetching can effectively reduce the miss rate and increase the overall performance. The best results are achieved with two cache lines consisting of four or eight context models. The 2 × 8 cache allows a performance improvement of 13.2 percent to 16.7 percent compared to a non-cached decoder due to a 17 percent higher clock frequency and highly effective prefetching. The proposed HEVC/H.265 CABAC decoder allows the decoding of high-quality Full HD videos in real-time using few hardware resources on a low-power FPGA.EC/H2020/645500/EU/Improving European VoD Creative Industry with High Efficiency Video Delivery/Film26

    CABAC accelerator architectures for video compression in future multimedida : a survey

    Get PDF
    The demands for high quality, real-time performance and multi-format video support in consumer multimedia products are ever increasing. In particular, the future multimedia systems require efficient video coding algorithms and corresponding adaptive high-performance computational platforms. The H.264/AVC video coding algorithms provide high enough compression efficiency to be utilized in these systems, and multimedia processors are able to provide the required adaptability, but the algorithms complexity demands for more efficient computing platforms. Heterogeneous (re-)configurable systems composed of multimedia processors and hardware accelerators constitute the main part of such platforms. In this paper, we survey the hardware accelerator architectures for Context-based Adaptive Binary Arithmetic Coding (CABAC) of Main and High profiles of H.264/AVC. The purpose of the survey is to deliver a critical insight in the proposed solutions, and this way facilitate further research on accelerator architectures, architecture development methods and supporting EDA tools. The architectures are analyzed, classified and compared based on the core hardware acceleration concepts, algorithmic characteristics, video resolution support and performance parameters, and some promising design directions are discussed. The comparative analysis shows that the parallel pipeline accelerator architecture seems to be the most promising

    A High-Performance Hardware Accelerator for HEVC Motion Compensation

    Get PDF
    The presented master’s thesis has focused on the design and implementation of a motion compensation hardware accelerator for use in HEVC hybrid decoders, i.e. decoders that contain hardware as well as software parts. As the motion compensation is the most time consuming step in the decoding process it is crucial to implement it in a fast and efficient way. This paper elaborates the theoretical background and motivation and highlights the main design choices. In the following evaluation a comparison between the hybrid decoder and a pure software decoder is performed. The results show that the design is capable of increasing the decoding frame rate in the range of 60% for 1080p video streams when running at 100 MHz

    Joint Algorithm-Architecture Optimization of CABAC

    Get PDF
    This paper uses joint algorithm and architecture design to enable high coding efficiency in conjunction with high processing speed and low area cost. Specifically, it presents several optimizations that can be performed on Context Adaptive Binary Arithmetic Coding (CABAC), a form of entropy coding used in H.264/AVC, to achieve the throughput necessary for real-time low power high definition video coding. The combination of syntax element partitions and interleaved entropy slices, referred to as Massively Parallel CABAC, increases the number of binary symbols that can be processed in a cycle. Subinterval reordering is used to reduce the cycle time required to process each binary symbol. Under common conditions using the JM12.0 software, the Massively Parallel CABAC, increases the bins per cycle by 2.7 to 32.8× at a cost of 0.25 to 6.84% coding loss compared with sequential single slice H.264/AVC CABAC. It also provides a 2× reduction in area cost, and reduces memory bandwidth. Subinterval reordering reduces the critical path delay by 14 to 22%, while modifications to context selection reduces the memory requirement by 67%. This work demonstrates that accounting for implementation cost during video coding algorithms design can enable higher processing speed and reduce hardware cost, while still delivering high coding efficiency in the next generation video coding standard.Texas Instruments Incorporated (Graduate Women's Fellowship for Leadership in Microelectronics)Natural Sciences and Engineering Research Council of Canad

    Survey of advanced CABAC accelarator architectures for future multimedia.

    Get PDF
    The future high quality multimedia systems require efficient video coding algorithms and corresponding adaptive high-performance computational platforms. In this paper, we survey the hardware accelerator architectures for Context-based Adaptive Binary Arithmetic Coding (CABAC) of H.264/AVC. The purpose of the survey is to deliver a critical insight in the proposed solutions, and this way facilitate further research on accelerator architectures, architecture development methods and supporting EDA tools. The architectures are analyzed, classified and compared based on the core hardware acceleration concepts, algorithmic characteristics, video resolution support and performance parameters, and some promising design directions are discussed

    An MPEG-4 performance study for non-SIMD, general purpose architectures

    Get PDF
    MPEG-4 is an important international standard with wide applicability. This paper focuses on MPEG-4's main profile, video, whose approach allows more efficiency in coding and more flexibility in managing heterogeneous media objects than previous MPEG standards. This study presents evidence to support the assertion that for non-SIMD architectures and computational models, most memory-system optimizations will have little effect on MPEG-4 performance. This paper makes two contributions. First, it serves as an independent confirmation that for current, general-purpose architectures, MPEG-4 video is computation bound (just like most other media processing applications). Second, our findings should prove useful to other researchers and practitioners considering how to (or how not to) optimize MPEG-4 performance.Peer ReviewedPostprint (published version

    Decoder Hardware Architecture for HEVC

    Get PDF
    This chapter provides an overview of the design challenges faced in the implementation of hardware HEVC decoders. These challenges can be attributed to the larger and diverse coding block sizes and transform sizes, the larger interpolation filter for motion compensation, the increased number of steps in intra prediction and the introduction of a new in-loop filter. Several solutions to address these implementation challenges are discussed. As a reference, results for an HEVC decoder test chip are also presented.Texas Instruments Incorporate

    A 249-Mpixel/s HEVC Video-Decoder Chip for 4K Ultra-HD Applications

    Get PDF
    High Efficiency Video Coding, the latest video standard, uses larger and variable-sized coding units and longer interpolation filters than [H.264 over AVC] to better exploit redundancy in video signals. These algorithmic techniques enable a 50% decrease in bitrate at the cost of computational complexity, external memory bandwidth, and, for ASIC implementations, on-chip SRAM of the video codec. This paper describes architectural optimizations for an HEVC video decoder chip. The chip uses a two-stage subpipelining scheme to reduce on-chip SRAM by 56 kbytes-a 32% reduction. A high-throughput read-only cache combined with DRAM-latency-aware memory mapping reduces DRAM bandwidth by 67%. The chip is built for HEVC Working Draft 4 Low Complexity configuration and occupies 1.77 mm[superscript 2] in 40-nm CMOS. It performs 4K Ultra HD 30-fps video decoding at 200 MHz while consuming 1.19 [nJ over pixel] of normalized system power.Texas Instruments Incorporate
    • 

    corecore