108 research outputs found
High-Level Synthesis Based VLSI Architectures for Video Coding
High Efficiency Video Coding (HEVC) is state-of-the-art video coding standard. Emerging applications like free-viewpoint video, 360degree video, augmented reality, 3D movies etc. require standardized extensions of HEVC. The standardized extensions of HEVC include HEVC Scalable Video Coding (SHVC), HEVC Multiview Video Coding (MV-HEVC), MV-HEVC+ Depth (3D-HEVC) and HEVC Screen Content Coding. 3D-HEVC is used for applications like view synthesis generation, free-viewpoint video. Coding and transmission of depth maps in 3D-HEVC is used for the virtual view synthesis by the algorithms like Depth Image Based Rendering (DIBR). As first step, we performed the profiling of the 3D-HEVC standard. Computational intensive parts of the standard are identified for the efficient hardware implementation. One of the computational intensive part of the 3D-HEVC, HEVC and H.264/AVC is the Interpolation Filtering used for Fractional Motion Estimation (FME). The hardware implementation of the interpolation filtering is carried out using High-Level Synthesis (HLS) tools. Xilinx Vivado Design Suite is used for the HLS implementation of the interpolation filters of HEVC and H.264/AVC. The complexity of the digital systems is greatly increased. High-Level Synthesis is the methodology which offers great benefits such as late architectural or functional changes without time consuming in rewriting of RTL-code, algorithms can be tested and evaluated early in the design cycle and development of accurate models against which the final hardware can be verified
Decoder Hardware Architecture for HEVC
This chapter provides an overview of the design challenges faced in the implementation of hardware HEVC decoders. These challenges can be attributed to the larger and diverse coding block sizes and transform sizes, the larger interpolation filter for motion compensation, the increased number of steps in intra prediction and the introduction of a new in-loop filter. Several solutions to address these implementation challenges are discussed. As a reference, results for an HEVC decoder test chip are also presented.Texas Instruments Incorporate
A 249-Mpixel/s HEVC Video-Decoder Chip for 4K Ultra-HD Applications
High Efficiency Video Coding, the latest video standard, uses larger and variable-sized coding units and longer interpolation filters than [H.264 over AVC] to better exploit redundancy in video signals. These algorithmic techniques enable a 50% decrease in bitrate at the cost of computational complexity, external memory bandwidth, and, for ASIC implementations, on-chip SRAM of the video codec. This paper describes architectural optimizations for an HEVC video decoder chip. The chip uses a two-stage subpipelining scheme to reduce on-chip SRAM by 56 kbytes-a 32% reduction. A high-throughput read-only cache combined with DRAM-latency-aware memory mapping reduces DRAM bandwidth by 67%. The chip is built for HEVC Working Draft 4 Low Complexity configuration and occupies 1.77 mm[superscript 2] in 40-nm CMOS. It performs 4K Ultra HD 30-fps video decoding at 200 MHz while consuming 1.19 [nJ over pixel] of normalized system power.Texas Instruments Incorporate
VLSI architecture design approaches for real-time video processing
This paper discusses the programmable and dedicated approaches for real-time video processing applications. Various VLSI architecture including the design examples of both approaches are reviewed. Finally, discussions of several practical designs in real-time video processing applications are then considered in VLSI architectures to provide significant guidelines to VLSI designers for any further real-time video processing design works
An efficient multi-core SIMD implementation for H.264/AVC encoder
The optimization process of a H.264/AVC encoder on three different architectures is presented. The architectures are multi- and singlecore and SIMD instruction sets have different vector registers size. The need of code optimization is fundamental when addressing HD resolutions with real-time constraints. The encoder is subdivided in functional modules in order to better understand where the optimization is a key factor and to evaluate in details the performance improvement. Common issues in both partitioning a video encoder into parallel architectures and SIMD optimization are described, and author solutions are presented for all the architectures. Besides showing efficient video encoder implementations, one of the main purposes of this paper is to discuss how the characteristics of different architectures and different set of SIMD instructions can impact on the target application performance. Results about the achieved speedup are provided in order to compare the different implementations and evaluate the more suitable solutions for present and next generation video-coding algorithms
Low Complexity Interpolation Filters for Motion Estimation and Application to the H.264 Encoders
Techniques for image super-resolution play an important role in a plethora of applications, which include video compression and motion estimation. The detection of the fractional displacements among frames facilitates the removal of temporal redundancy and improves the video quality by 2-4 dB PSNR. However, the increased complexity of the Fractional Motion Estimation (FME) process adds a significant computational load to the encoder and sets constraints to real-time designs. Researchers have performed timing analysis for the motion estimation process and they reported that FME accounts for almost half of the entire motion estimation period, which in turn accounts for 60-90% of the total encoding time depending on the design configuration
Algoritmo de estimação de movimento e sua arquitetura de hardware para HEVC
Doutoramento em Engenharia EletrotécnicaVideo coding has been used in applications like video surveillance, video
conferencing, video streaming, video broadcasting and video storage. In a
typical video coding standard, many algorithms are combined to compress a
video. However, one of those algorithms, the motion estimation is the most
complex task. Hence, it is necessary to implement this task in real time by
using appropriate VLSI architectures. This thesis proposes a new fast motion
estimation algorithm and its implementation in real time. The results show that
the proposed algorithm and its motion estimation hardware architecture out
performs the state of the art. The proposed architecture operates at a
maximum operating frequency of 241.6 MHz and is able to process
1080p@60Hz with all possible variables block sizes specified in HEVC
standard as well as with motion vector search range of up to ±64 pixels.A codificação de vídeo tem sido usada em aplicações tais como, vídeovigilância,
vídeo-conferência, video streaming e armazenamento de vídeo.
Numa norma de codificação de vídeo, diversos algoritmos são combinados
para comprimir o vídeo. Contudo, um desses algoritmos, a estimação de
movimento é a tarefa mais complexa. Por isso, é necessário implementar esta
tarefa em tempo real usando arquiteturas de hardware apropriadas. Esta tese
propõe um algoritmo de estimação de movimento rápido bem como a sua
implementação em tempo real. Os resultados mostram que o algoritmo e a
arquitetura de hardware propostos têm melhor desempenho que os existentes.
A arquitetura proposta opera a uma frequência máxima de 241.6 MHz e é
capaz de processar imagens de resolução 1080p@60Hz, com todos os
tamanhos de blocos especificados na norma HEVC, bem como um domínio de
pesquisa de vetores de movimento até ±64 pixels
- …