69 research outputs found

    An efficient interpolation filter VLSI architecture for HEVC standard

    Get PDF

    A Motion Estimation based Algorithm for Encoding Time Reduction in HEVC

    Get PDF
    High Efficiency Video Coding (HEVC) is a video compression standard that offers 50% more efficiency at the expense of high encoding time contrasted with the H.264 Advanced Video Coding (AVC) standard. The encoding time must be reduced to satisfy the needs of real-time applications. This paper has proposed the Multi- Level Resolution Vertical Subsampling (MLRVS) algorithm to reduce the encoding time. The vertical subsampling minimizes the number of Sum of Absolute Difference (SAD) computations during the motion estimation process. The complexity reduction algorithm is also used for fast coding the coefficients of the quantised block using a flag decision. Two distinct search patterns are suggested: New Cross Diamond Diamond (NCDD) and New Cross Diamond Hexagonal (NCDH) search patterns, which reduce the time needed to locate the motion vectors. In this paper, the MLRVS algorithm with NCDD and MLRVS algorithm with NCDH search patterns are simulated separately and analyzed. The results show that the encoding time of the encoder is decreased by 55% with MLRVS algorithm using NCDD search pattern and 56% with MLRVS using NCDH search pattern compared to HM16.5 with Test Zone (TZ) search algorithm. These results are achieved with a slight increase in bit rate and negligible deterioration in output video quality

    HEVC의 소수 단위 움직임 추정을 위한 보간 필터 중복 연산 감소 방법

    Get PDF
    학위논문 (석사)-- 서울대학교 대학원 : 전기·컴퓨터공학부, 2016. 8. 이혁재.High-Efficiency Video Coding (HEVC) [1] is the latest video coding standard established by Joint Collaborative Team on Video Coding (JCT-VC) aiming to achieve twice encoding efficiency with comparatively high video quality compared to its predecessor, the H.264 standard. Motion Estimation (ME) which consists of integer motion estimation (IME) and fractional motion estimation (FME) is the bottleneck of HEVC computation. In the execution of the HM reference software, ME alone accounts for about 50 % of the execution time in which IME contributes to about 20 % and FME does around 30% [2].The FMEs enormous computational complexity can be explained by two following reasons: • A large number of FME refinements processed: In HEVC, a frame is divided into CTU, whose size is usually 64x64 pixels. One 64x64 CTU consists of 85 CUs including one 64x64 CU at depth 0, four 32x32 CUs at depth 1, 16 16x16 CUs at depth 2, and 64 8x8 CUs at depth 3. Each CU can be partitioned into PUs according to a set of 8 allowable partition types. An HEVC encoder processes FME refinement for all possible PUs with usually 4 reference frames before deciding the best configuration for a CTU. As a result, typically in HEVCs reference software, HM, for one CTU, it has to process 2,372 FME refinements, which consumes a lot of computational resources. • A complicated and redundant interpolation process: Conventionally, FME refinement, which consists of interpolation and sum of absolute transformed difference (SATD), is processed for every PU in 4 reference frames. As a result, for a 64x64 CTU, in order to process fractional pixel refinement, FME needs to interpolate 6,232,900 fractional pixels. In addition, In HEVC, fractional pixels which consist half fractional pixels and quarter fractional pixels, are interpolated by 8-tap filters and 7-tap filters instead of 6-tap filters and bilinear filters as previous standards. As a result, interpolation process in FME imposes an extreme computational burden on HEVC encoders. This work proposes two algorithms which tackle each one of the two above reasons. The first algorithm, Advanced Decision of PU Partitions and CU Depths for FME, estimates the cost of IMEs and selects the PU partition types at the CU level and the CU depths at the coding tree unit (CTU) level for FME. Experimental results show that the algorithm effectively reduces the complexity by 67.47% with a BD-BR degrade of 1.08%. The second algorithm, A Reduction of the Interpolation Redundancy for FME, reduces up to 86.46% interpolation computation without any encoding performance decrease. The combination of the two algorithms forms a coherent solution to reduce the complexity of FME. Considering interpolation is a half of the complexity of an FME refinement, then the complexity of FME could be reduced more than 85% with a BD-BR increase of 1.66%Chapter 1. Introduction 1 1. Introduction to Video Coding 1 1.1. Definition of Video Coding 1 1.2. The Need of Video Coding 1 1.3. Basics of Video Coding 2 1.4. Video Coding Standard 2 2. Introduction to HEVC 6 2.1. HEVC Background and Development 6 2.2. Block Partitioning Structure in HEVC 9 Chapter 2. Fractional Motion Estimation in HEVC and Related Works on Complexity Reduction 21 1. Motion Estimation 21 2. Fractional Motion Estimation 22 2.1. Interpolation 22 2.2. Sum of Absolute Transformed Difference Calculation 27 2.3. Fractional Motion Estimation Procedure 28 Chapter 3. Complexity Reduction for FME 31 1. Problem Statement and Previous Studies 31 1.1. Problem Statement 31 1.2. Previous Studies 32 2. Proposed Algorithms 34 2.1. Advanced Decision of PU Partitions and CU Depths for Fractional Motion Estimation in HEVC 34 2.2. Range-based interpolation algorithm 40 Chapter 4. Experiment Results 43 1. Advanced Decision of PU Partitions and CU Depths for Fractional Motion Estimation in HEVC Algorithms 43 1.1. Advanced Decision of PU Partitions 43 1.2. Advanced Decision of CU Partitions 47 1.3. Combination of Advanced PU Partition and CU Depth Decision 47 1.4. Comparison with Other Similar Works 48 2. Range-based Algorithm 49 2.1. Software Implementation 49 2.2. Hardware Implementation of the Algorithm 50 Chapter 5. Conclusion 61 Bibliography 64 Abstract in Korean 66Maste

    High-Level Synthesis Based VLSI Architectures for Video Coding

    Get PDF
    High Efficiency Video Coding (HEVC) is state-of-the-art video coding standard. Emerging applications like free-viewpoint video, 360degree video, augmented reality, 3D movies etc. require standardized extensions of HEVC. The standardized extensions of HEVC include HEVC Scalable Video Coding (SHVC), HEVC Multiview Video Coding (MV-HEVC), MV-HEVC+ Depth (3D-HEVC) and HEVC Screen Content Coding. 3D-HEVC is used for applications like view synthesis generation, free-viewpoint video. Coding and transmission of depth maps in 3D-HEVC is used for the virtual view synthesis by the algorithms like Depth Image Based Rendering (DIBR). As first step, we performed the profiling of the 3D-HEVC standard. Computational intensive parts of the standard are identified for the efficient hardware implementation. One of the computational intensive part of the 3D-HEVC, HEVC and H.264/AVC is the Interpolation Filtering used for Fractional Motion Estimation (FME). The hardware implementation of the interpolation filtering is carried out using High-Level Synthesis (HLS) tools. Xilinx Vivado Design Suite is used for the HLS implementation of the interpolation filters of HEVC and H.264/AVC. The complexity of the digital systems is greatly increased. High-Level Synthesis is the methodology which offers great benefits such as late architectural or functional changes without time consuming in rewriting of RTL-code, algorithms can be tested and evaluated early in the design cycle and development of accurate models against which the final hardware can be verified

    Low energy video processing and compression hardware designs

    Get PDF
    Digital video processing and compression algorithms are used in many commercial products such as mobile devices, unmanned aerial vehicles, and autonomous cars. Increasing resolution of videos used in these commercial products increased computational complexities of digital video processing and compression algorithms. Therefore, it is necessary to reduce computational complexities of digital video processing and compression algorithms, and energy consumptions of digital video processing and compression hardware without reducing visual quality. In this thesis, we propose a novel adaptive 2D digital image processing algorithm for 2D median filter, Gaussian blur and image sharpening. We designed low energy 2D median filter, Gaussian blur and image sharpening hardware using the proposed algorithm. We propose approximate HEVC intra prediction and HEVC fractional interpolation algorithms. We designed low energy approximate HEVC intra prediction and HEVC fractional interpolation hardware. We also propose several HEVC fractional interpolation hardware architectures. We propose novel computational complexity and energy reduction techniques for HEVC DCT and inverse DCT/DST. We designed high performance and low energy hardware for HEVC DCT and inverse DCT/DST including the proposed techniques. VII We quantified computation reductions achieved and video quality loss caused by the proposed algorithms and techniques. We implemented the proposed hardware architectures in Verilog HDL. We mapped the Verilog RTL codes to Xilinx Virtex 6 and Xilinx ZYNQ FPGAs, and estimated their power consumptions using Xilinx XPower Analyzer tool. The proposed algorithms and techniques significantly reduced the power and energy consumptions of these FPGA implementations in some cases with no PSNR loss and in some cases with very small PSNR loss

    Algoritmo de estimação de movimento e sua arquitetura de hardware para HEVC

    Get PDF
    Doutoramento em Engenharia EletrotécnicaVideo coding has been used in applications like video surveillance, video conferencing, video streaming, video broadcasting and video storage. In a typical video coding standard, many algorithms are combined to compress a video. However, one of those algorithms, the motion estimation is the most complex task. Hence, it is necessary to implement this task in real time by using appropriate VLSI architectures. This thesis proposes a new fast motion estimation algorithm and its implementation in real time. The results show that the proposed algorithm and its motion estimation hardware architecture out performs the state of the art. The proposed architecture operates at a maximum operating frequency of 241.6 MHz and is able to process 1080p@60Hz with all possible variables block sizes specified in HEVC standard as well as with motion vector search range of up to ±64 pixels.A codificação de vídeo tem sido usada em aplicações tais como, vídeovigilância, vídeo-conferência, video streaming e armazenamento de vídeo. Numa norma de codificação de vídeo, diversos algoritmos são combinados para comprimir o vídeo. Contudo, um desses algoritmos, a estimação de movimento é a tarefa mais complexa. Por isso, é necessário implementar esta tarefa em tempo real usando arquiteturas de hardware apropriadas. Esta tese propõe um algoritmo de estimação de movimento rápido bem como a sua implementação em tempo real. Os resultados mostram que o algoritmo e a arquitetura de hardware propostos têm melhor desempenho que os existentes. A arquitetura proposta opera a uma frequência máxima de 241.6 MHz e é capaz de processar imagens de resolução 1080p@60Hz, com todos os tamanhos de blocos especificados na norma HEVC, bem como um domínio de pesquisa de vetores de movimento até ±64 pixels

    High Performance Multiview Video Coding

    Get PDF
    Following the standardization of the latest video coding standard High Efficiency Video Coding in 2013, in 2014, multiview extension of HEVC (MV-HEVC) was published and brought significantly better compression performance of around 50% for multiview and 3D videos compared to multiple independent single-view HEVC coding. However, the extremely high computational complexity of MV-HEVC demands significant optimization of the encoder. To tackle this problem, this work investigates the possibilities of using modern parallel computing platforms and tools such as single-instruction-multiple-data (SIMD) instructions, multi-core CPU, massively parallel GPU, and computer cluster to significantly enhance the MVC encoder performance. The aforementioned computing tools have very different computing characteristics and misuse of the tools may result in poor performance improvement and sometimes even reduction. To achieve the best possible encoding performance from modern computing tools, different levels of parallelism inside a typical MVC encoder are identified and analyzed. Novel optimization techniques at various levels of abstraction are proposed, non-aggregation massively parallel motion estimation (ME) and disparity estimation (DE) in prediction unit (PU), fractional and bi-directional ME/DE acceleration through SIMD, quantization parameter (QP)-based early termination for coding tree unit (CTU), optimized resource-scheduled wave-front parallel processing for CTU, and workload balanced, cluster-based multiple-view parallel are proposed. The result shows proposed parallel optimization techniques, with insignificant loss to coding efficiency, significantly improves the execution time performance. This , in turn, proves modern parallel computing platforms, with appropriate platform-specific algorithm design, are valuable tools for improving the performance of computationally intensive applications
    corecore