Design of a Processor Optimized for Syntax Parsing in Video Decoders

Author: Nezan Jean François
Rhatay Aimad
Siret Nicolas
Publication venue: 'TECSI'
Publication date: 02/11/2011

Heterogeneous platforms aim to offer both performance and flexibility by providing designers with processors and programmable logic units on a single platform. Processors implemented on these platforms are usually soft cores (e.g. Altera NIOS) or ASICs (e.g. ARM Cortex-A8). However, these processors still face performance limitations compared to full hardware designs, in particular for real-time video decoding applications. In this paper we present an innovative approach that improves performance by combining a processor optimized for syntax parsing (an Application-Specific Instruction-set Processor) with an FPGA. The case study has been synthesized on a Xilinx FPGA at a frequency of 100 MHz, and we estimate the performance that could be obtained with an ASIC.

HAL-CentraleSupelec

HAL-Rennes 1

Application-Specific Cache and Prefetching for HEVC CABAC Decoding

Author: Chi Chi Ching
Habermann Philipp
Juurlink Ben
Álvarez-Mesa Mauricio
Publication date: 01/01/2017

Context-based Adaptive Binary Arithmetic Coding (CABAC) is the entropy coding module in the HEVC/H.265 video coding standard. As in its predecessor, H.264/AVC, CABAC is a well-known throughput bottleneck due to its strong data dependencies. Besides other optimizations, the replacement of the context model memory by a smaller cache has been proposed for hardware decoders, resulting in an improved clock frequency. However, the effect of potential cache misses has not been properly evaluated. This work fills the gap by performing an extensive evaluation of different cache configurations. Furthermore, it demonstrates that application-specific context model prefetching can effectively reduce the miss rate and increase the overall performance. The best results are achieved with two cache lines consisting of four or eight context models. The 2 × 8 cache allows a performance improvement of 13.2 percent to 16.7 percent compared to a non-cached decoder due to a 17 percent higher clock frequency and highly effective prefetching. The proposed HEVC/H.265 CABAC decoder allows the decoding of high-quality Full HD videos in real time using few hardware resources on a low-power FPGA.

Funding: EC/H2020/645500/EU/Improving European VoD Creative Industry with High Efficiency Video Delivery/Film26
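The relationship between the 17 percent clock gain and the 13.2 to 16.7 percent net improvement quoted in this abstract can be checked with a little arithmetic. The sketch below assumes, which the abstract does not state, that decoder throughput scales as clock frequency divided by cycle count, so the gap between the two figures is attributed to extra cycles such as cache-miss stalls. The function name is hypothetical.

```python
def implied_cycle_overhead(clock_gain: float, net_speedup: float) -> float:
    """Return the fractional increase in cycle count implied by the two gains.

    Assumption (not from the abstract): throughput ~ frequency / cycles, so
    (1 + net_speedup) = (1 + clock_gain) / (1 + cycle_overhead).
    """
    return (1.0 + clock_gain) / (1.0 + net_speedup) - 1.0


# Figures quoted in the abstract for the 2 x 8 cache configuration.
CLOCK_GAIN = 0.17

for net in (0.132, 0.167):
    overhead = implied_cycle_overhead(CLOCK_GAIN, net)
    print(f"net speedup {net:.1%}: roughly {overhead:.1%} more cycles")
```

Under that assumption the cached decoder would spend only a few percent more cycles than the non-cached one. The 17 percent figure is rounded in the abstract, so the result is indicative only.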

DepositOnce

CABAC accelerator architectures for video compression in future multimedia: a survey

Author: Jan Y.
Jozwiak L.
Publication venue: Springer
Publication date: 01/01/2009

The demand for high quality, real-time performance and multi-format video support in consumer multimedia products is ever increasing. In particular, future multimedia systems require efficient video coding algorithms and corresponding adaptive high-performance computational platforms. The H.264/AVC video coding algorithms provide sufficient compression efficiency to be used in these systems, and multimedia processors are able to provide the required adaptability, but the complexity of the algorithms demands more efficient computing platforms. Heterogeneous (re-)configurable systems composed of multimedia processors and hardware accelerators constitute the main part of such platforms. In this paper, we survey the hardware accelerator architectures for Context-based Adaptive Binary Arithmetic Coding (CABAC) of the Main and High profiles of H.264/AVC. The purpose of the survey is to deliver critical insight into the proposed solutions, and in this way to facilitate further research on accelerator architectures, architecture development methods and supporting EDA tools. The architectures are analyzed, classified and compared based on their core hardware acceleration concepts, algorithmic characteristics, video resolution support and performance parameters, and some promising design directions are discussed. The comparative analysis shows that the parallel pipeline accelerator architecture seems to be the most promising.

Repository TU/e

Crossref

Pure OAI Repository

A High-Performance Hardware Accelerator for HEVC Motion Compensation

Author: Göbel Matthias
Publication date: 01/01/2014

This master’s thesis focuses on the design and implementation of a motion compensation hardware accelerator for use in HEVC hybrid decoders, i.e. decoders that contain hardware as well as software parts. As motion compensation is the most time-consuming step in the decoding process, it is crucial to implement it in a fast and efficient way. This paper elaborates on the theoretical background and motivation and highlights the main design choices. In the subsequent evaluation, the hybrid decoder is compared with a pure software decoder. The results show that the design is capable of increasing the decoding frame rate in the range of 60% for 1080p video streams when running at 100 MHz.

DepositOnce

Joint Algorithm-Architecture Optimization of CABAC

Author: Chandrakasan Anantha P.
Sze Vivienne
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/03/2012

This paper uses joint algorithm and architecture design to enable high coding efficiency in conjunction with high processing speed and low area cost. Specifically, it presents several optimizations that can be performed on Context Adaptive Binary Arithmetic Coding (CABAC), a form of entropy coding used in H.264/AVC, to achieve the throughput necessary for real-time, low-power, high-definition video coding. The combination of syntax element partitions and interleaved entropy slices, referred to as Massively Parallel CABAC, increases the number of binary symbols that can be processed in a cycle. Subinterval reordering is used to reduce the cycle time required to process each binary symbol. Under common conditions using the JM12.0 software, the Massively Parallel CABAC increases the bins per cycle by 2.7 to 32.8× at a cost of 0.25 to 6.84% coding loss compared with sequential single-slice H.264/AVC CABAC. It also provides a 2× reduction in area cost and reduces memory bandwidth. Subinterval reordering reduces the critical path delay by 14 to 22%, while modifications to context selection reduce the memory requirement by 67%. This work demonstrates that accounting for implementation cost during the design of video coding algorithms can enable higher processing speed and reduce hardware cost, while still delivering high coding efficiency in the next generation video coding standard.

Funding: Texas Instruments Incorporated (Graduate Women's Fellowship for Leadership in Microelectronics); Natural Sciences and Engineering Research Council of Canada
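To put the critical path figures in this abstract into perspective, the sketch below converts a fractional reduction in critical path delay into the largest clock-frequency gain it could permit. It assumes, which the abstract does not state, that the clock period is set entirely by that critical path. The function name is hypothetical.

```python
def max_clock_gain(delay_reduction: float) -> float:
    """Return the fractional clock-frequency gain permitted by a shorter critical path.

    Assumption (not from the abstract): period ~ critical path delay, so
    f_new / f_old = 1 / (1 - delay_reduction).
    """
    if not 0.0 <= delay_reduction < 1.0:
        raise ValueError("delay_reduction must be in [0, 1)")
    return 1.0 / (1.0 - delay_reduction) - 1.0


# Critical path reductions quoted in the abstract for subinterval reordering.
for reduction in (0.14, 0.22):
    gain = max_clock_gain(reduction)
    print(f"{reduction:.0%} shorter critical path: up to {gain:.1%} faster clock")
```

Under this assumption, the quoted 14 to 22% delay reductions would correspond to roughly 16 to 28% higher achievable clock frequency, which multiplies with any gain in bins per cycle.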

DSpace@MIT

The Berkeley software MPEG-1 video decoder

Author: Bahl P.
Brian C. Smith
Eckert S.
Gong K.
Ketan Mayer-Patel
Lawrence A. Rowe
Lee R. B.
Mayer-Patel K.
McMillan L.
Patel K.
Rowe L.
Schank P.
Soderquist P.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Survey of advanced CABAC accelerator architectures for future multimedia

Author: Jan Y.
Jozwiak L.
Publication venue: Springer
Publication date: 01/01/2009

Future high-quality multimedia systems require efficient video coding algorithms and corresponding adaptive high-performance computational platforms. In this paper, we survey the hardware accelerator architectures for Context-based Adaptive Binary Arithmetic Coding (CABAC) of H.264/AVC. The purpose of the survey is to deliver critical insight into the proposed solutions, and in this way to facilitate further research on accelerator architectures, architecture development methods and supporting EDA tools. The architectures are analyzed, classified and compared based on their core hardware acceleration concepts, algorithmic characteristics, video resolution support and performance parameters, and some promising design directions are discussed.

Repository TU/e

Crossref

Pure OAI Repository

An MPEG-4 performance study for non-SIMD, general purpose architectures

Author: Fang Zhen
McKee Sally A.
Valero Cortés Mateo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2003

MPEG-4 is an important international standard with wide applicability. This paper focuses on MPEG-4's main profile, video, whose approach allows more efficiency in coding and more flexibility in managing heterogeneous media objects than previous MPEG standards. This study presents evidence to support the assertion that for non-SIMD architectures and computational models, most memory-system optimizations will have little effect on MPEG-4 performance. This paper makes two contributions. First, it serves as an independent confirmation that for current, general-purpose architectures, MPEG-4 video is computation bound (just like most other media processing applications). Second, our findings should prove useful to other researchers and practitioners considering how to (or how not to) optimize MPEG-4 performance.

UPCommons. Portal del coneixement obert de la UPC

Decoder Hardware Architecture for HEVC

Author: C-T Huang
D Marpe
D Zhou
DF Finchelstein
J Vanne
K Kawakami
K Xu
P Tummeltshammer
P Zhang
T Xanthopoulos
V Sze
Y Yi
YC Yang
Z Guo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

This chapter provides an overview of the design challenges faced in the implementation of hardware HEVC decoders. These challenges can be attributed to the larger and diverse coding block sizes and transform sizes, the larger interpolation filter for motion compensation, the increased number of steps in intra prediction and the introduction of a new in-loop filter. Several solutions to address these implementation challenges are discussed. As a reference, results for an HEVC decoder test chip are also presented.

Funding: Texas Instruments Incorporated

DSpace@MIT

Crossref

A 249-Mpixel/s HEVC Video-Decoder Chip for 4K Ultra-HD Applications

Author: Chandrakasan Anantha P.
Huang Chao-Tsung
Juvekar Chiraag Shashikant
Sze Vivienne
Tikekar Mehul
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/07/2013

High Efficiency Video Coding, the latest video standard, uses larger and variable-sized coding units and longer interpolation filters than H.264/AVC to better exploit redundancy in video signals. These algorithmic techniques enable a 50% decrease in bitrate at the cost of computational complexity, external memory bandwidth, and, for ASIC implementations, on-chip SRAM of the video codec. This paper describes architectural optimizations for an HEVC video decoder chip. The chip uses a two-stage subpipelining scheme to reduce on-chip SRAM by 56 kbytes, a 32% reduction. A high-throughput read-only cache combined with DRAM-latency-aware memory mapping reduces DRAM bandwidth by 67%. The chip is built for HEVC Working Draft 4 Low Complexity configuration and occupies 1.77 mm² in 40-nm CMOS. It performs 4K Ultra HD 30-fps video decoding at 200 MHz while consuming 1.19 nJ/pixel of normalized system power.

Funding: Texas Instruments Incorporated
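The headline numbers in this entry can be cross-checked against each other with simple arithmetic. The sketch below assumes a 4K Ultra HD frame of 3840 × 2160 pixels (the abstract does not give the resolution explicitly) and treats the quoted energy per pixel as applying uniformly at the full pixel rate. All names are hypothetical.

```python
# Assumed frame size for "4K Ultra HD"; not stated in the abstract.
WIDTH, HEIGHT = 3840, 2160
FPS = 30                     # frame rate quoted in the abstract
CLOCK_HZ = 200e6             # clock frequency quoted in the abstract
ENERGY_PER_PIXEL_J = 1.19e-9 # normalized system energy per pixel quoted in the abstract
SRAM_SAVED_KBYTES = 56       # SRAM saving quoted in the abstract
SRAM_SAVED_FRACTION = 0.32   # relative saving quoted in the abstract


def pixel_rate(width: int, height: int, fps: int) -> int:
    """Pixels that must be decoded per second."""
    return width * height * fps


rate = pixel_rate(WIDTH, HEIGHT, FPS)
pixels_per_cycle = rate / CLOCK_HZ
power_watts = rate * ENERGY_PER_PIXEL_J
sram_before_kbytes = SRAM_SAVED_KBYTES / SRAM_SAVED_FRACTION

print(f"pixel rate: {rate / 1e6:.1f} Mpixel/s")
print(f"pixels per clock cycle: {pixels_per_cycle:.2f}")
print(f"normalized power at full rate: {power_watts * 1e3:.0f} mW")
print(f"implied SRAM before optimization: {sram_before_kbytes:.0f} kbytes")
```

Under these assumptions the pixel rate comes to about 248.8 Mpixel/s, which agrees with the 249 Mpixel/s in the title, and the decoder must sustain a little over one pixel per clock cycle at 200 MHz.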

DSpace@MIT