
    CTU Depth Decision Algorithms for HEVC: A Survey

    High-Efficiency Video Coding (HEVC) surpasses its predecessors in encoding efficiency by introducing new coding tools, at the cost of increased encoding time complexity. The Coding Tree Unit (CTU) is the main building block used in HEVC. In the HEVC standard, frames are divided into CTUs with a predetermined size of up to 64x64 pixels. Each CTU is then divided recursively into a number of equally sized square areas, known as Coding Units (CUs). Although this diversity of frame partitioning increases encoding efficiency, it also increases time complexity, because many more candidate partitionings must be searched to find the optimal one. To address this complexity, numerous algorithms have been proposed that eliminate unnecessary searches while partitioning CTUs by exploiting the correlation in the video. In this paper, existing CTU depth decision algorithms for HEVC are surveyed. These algorithms are categorized into two groups, namely statistics and machine learning approaches. Statistics approaches are further subdivided into neighboring and inherent approaches. Neighboring approaches exploit the similarity between adjacent CTUs to limit the depth range of the current CTU, while inherent approaches use only the information available within the current CTU. Machine learning approaches try to extract and exploit similarities implicitly. Traditional methods such as support vector machines or random forests use manually selected features, while recently proposed deep learning methods extract features during training. Finally, this paper discusses extending these methods to more recent video coding formats such as Versatile Video Coding (VVC) and AOMedia Video 1 (AV1).
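The growth in search effort described in this abstract can be made concrete with a small sketch. It assumes a 64x64 CTU and a minimum CU size of 8x8 (depths 0 to 3), which is a common HEVC configuration but is not stated in the abstract. The function `restricted_depth_range` is a hypothetical illustration of a neighboring approach, not one of the surveyed algorithms.

```python
def count_partitionings(size: int, min_cu: int = 8) -> int:
    """Count the distinct quadtree partitionings of a size x size block.

    Assumes square blocks whose side is a power of two and a minimum
    CU size of min_cu (an assumption, not taken from the abstract).
    """
    if size <= min_cu:
        return 1  # a minimum-size CU cannot be split further
    # Either keep the block whole (1 way), or split it into four
    # quadrants that are each partitioned independently.
    return 1 + count_partitionings(size // 2, min_cu) ** 4


def restricted_depth_range(neighbor_depths, max_depth=3):
    """Hypothetical neighboring heuristic: search only depths within
    one level of the depths already chosen for adjacent CTUs."""
    lo = max(min(neighbor_depths) - 1, 0)
    hi = min(max(neighbor_depths) + 1, max_depth)
    return lo, hi


if __name__ == "__main__":
    for size in (8, 16, 32, 64):
        print(size, count_partitionings(size))
    print(restricted_depth_range([0, 0]))  # depths 2 and 3 are skipped
```

Under these assumptions a single 64x64 CTU admits 83522 partitionings, which is why limiting the depth range before the search starts can save substantial encoding time.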

    Dynamically Reconfigurable Architectures and Systems for Time-varying Image Constraints (DRASTIC) for Image and Video Compression

    In the current era of booming information, image and video consumption is ubiquitous. The associated image and video coding operations require significant computing resources, both on small-scale computing systems and across larger network systems. Depending on the scenario, power, bitrate and image quality can impose significant time-varying constraints. For example, mobile devices (e.g., phones, tablets, laptops, UAVs) come with significant constraints on energy and power. Similarly, computer networks provide time-varying bandwidth that can depend on signal strength (e.g., wireless networks) or network traffic conditions. Alternatively, users can impose different constraints on image quality based on their interests. Traditional image and video coding systems have focused on rate-distortion optimization. More recently, distortion measures (e.g., PSNR) are being replaced by more sophisticated image quality metrics. However, these systems are based on fixed hardware configurations that provide limited options for controlling power consumption. The use of dynamic partial reconfiguration with Field Programmable Gate Arrays (FPGAs) provides an opportunity to effectively control dynamic power consumption by jointly considering software-hardware configurations. This dissertation extends traditional rate-distortion optimization to rate-quality-power/energy optimization and demonstrates a wide variety of applications in both image and video compression. In each application, a family of Pareto-optimal configurations is developed that allows fine control in the rate-quality-power/energy optimization space. The term Dynamically Reconfigurable Architectures and Systems for Time-varying Image Constraints (DRASTIC) is used to describe the derived systems.
DRASTIC covers both software-only and joint software-hardware configurations to achieve fine optimization over a set of general modes that include: (i) maximum image quality, (ii) minimum dynamic power/energy, (iii) minimum bitrate, and (iv) a typical mode over a set of opposing constraints to guarantee satisfactory performance. In joint software-hardware configurations, DRASTIC provides an effective approach for dynamic power optimization. For software configurations, DRASTIC provides an effective method for energy consumption optimization by controlling processing times. The dissertation provides several applications. First, stochastic methods are given for computing quantization tables that are optimal in the rate-quality space, and they are demonstrated on standard JPEG compression. Second, a DRASTIC implementation of the DCT is used to demonstrate the effectiveness of the approach on motion JPEG. Third, a reconfigurable deblocking filter system is investigated for use in current H.264/AVC systems. Fourth, the dissertation develops DRASTIC for all 35 intra-prediction modes as well as intra-encoding for the emerging High Efficiency Video Coding (HEVC) standard.
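
The idea of choosing among Pareto-optimal configurations in the rate-quality-power space can be sketched as follows. This is a minimal illustration, not the dissertation's method: the `Config` type, the configuration names, and all numeric values are hypothetical, and only modes (i) to (iii) are modelled.

```python
from typing import NamedTuple


class Config(NamedTuple):
    name: str
    bitrate: float  # lower is better (hypothetical units)
    quality: float  # higher is better (hypothetical score)
    power: float    # lower is better (hypothetical units)


def dominates(a: Config, b: Config) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one."""
    no_worse = (a.bitrate <= b.bitrate and a.quality >= b.quality
                and a.power <= b.power)
    strictly_better = (a.bitrate < b.bitrate or a.quality > b.quality
                       or a.power < b.power)
    return no_worse and strictly_better


def pareto_front(configs):
    """Keep only the configurations that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


def select(front, mode: str) -> Config:
    """Pick one configuration from the front for a given operating mode."""
    if mode == "max_quality":
        return max(front, key=lambda c: c.quality)
    if mode == "min_power":
        return min(front, key=lambda c: c.power)
    if mode == "min_bitrate":
        return min(front, key=lambda c: c.bitrate)
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical measurements, invented purely for illustration.
CONFIGS = [
    Config("A", bitrate=1.0, quality=0.90, power=2.0),
    Config("B", bitrate=0.8, quality=0.85, power=1.5),
    Config("C", bitrate=1.2, quality=0.88, power=2.5),  # dominated by A
    Config("D", bitrate=0.5, quality=0.70, power=1.0),
]

if __name__ == "__main__":
    front = pareto_front(CONFIGS)
    print([c.name for c in front])
    print(select(front, "max_quality").name)
```

A time-varying constraint would then correspond to re-running `select` with a different mode as conditions change, while the front itself is computed once.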

    Visual Saliency Estimation Via HEVC Bitstream Analysis

    Since the dramatic development of information technology from the 1950s onward, digital images and video have become ubiquitous. In the last decade, image and video processing have become increasingly popular in biomedical, industrial, artistic and other fields. Progress has been made in the display, storage and transmission of visual information such as images and video. The attendant problem is that video processing tasks in the time domain have become particularly arduous. Based on a study of existing compressed-domain video saliency detection models, a new saliency estimation model for video based on High Efficiency Video Coding (HEVC) is presented. First, the relevant features are extracted from the HEVC-encoded bitstream. A naive Bayesian model is then trained and tested on these features using the original YUV videos and ground truth. The intra-frame saliency map is obtained after training and testing on the intra features, and inter-frame saliency is obtained from the intra saliency together with motion vectors. The ROC of our proposed intra mode is 0.9561. Other classification methods, such as support vector machines (SVM), k-nearest neighbors (KNN) and decision trees, are presented to compare the experimental outcomes. The effect of varying the compression ratio on the saliency has also been analyzed.

    Improved intra-prediction for video coding

    This thesis focuses on improving the HEVC (High Efficiency Video Coding) standard. HEVC is the newest video coding standard developed by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG), as a successor to the popular state-of-the-art H.264/MPEG-4 AVC (Advanced Video Coding) standard. HEVC makes use of prediction to exploit redundancies in the signal and thereby achieve high compression efficiency. In particular, the Intra-Picture prediction block predicts a block in the current frame using reference information from neighbouring blocks in the same frame. It supports three different kinds of mode: the angular mode with 33 different directions, the planar mode and the DC mode. HEVC is reportedly able to achieve on average more than 50% higher efficiency than H.264/MPEG-4 AVC, but this comes at the cost of very high computational complexity. The contributions of this thesis consist mainly of improvements to the Intra-Picture prediction block, with the goal of drastically reducing computational complexity while achieving compression efficiency comparable to conventional HEVC. On average, 16.5% of encoding operations can be saved using the proposed approach, at the cost of relatively small losses in compression efficiency.
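
As a rough illustration of the intra-picture prediction described in this abstract, the DC mode can be sketched as filling a block with the rounded mean of the reference samples above and to the left of it. This is a simplified sketch, not the thesis's proposed method: the function name `dc_predict` is hypothetical, and any boundary smoothing that the standard applies after DC prediction is omitted.

```python
NUM_ANGULAR_MODES = 33
NUM_INTRA_MODES = NUM_ANGULAR_MODES + 2  # plus planar and DC


def dc_predict(top, left):
    """Predict an N x N block as the rounded integer mean of the N
    reference samples above it and the N reference samples to its left.

    N must be a power of two so the division can be done with a shift.
    """
    n = len(top)
    if n == 0 or len(left) != n or n & (n - 1):
        raise ValueError("top and left must have equal power-of-two length")
    shift = n.bit_length()  # equals log2(n) + 1: we average 2n samples
    dc = (sum(top) + sum(left) + n) >> shift  # adding n rounds to nearest
    return [[dc] * n for _ in range(n)]


if __name__ == "__main__":
    block = dc_predict(top=[10, 10, 10, 10], left=[20, 20, 20, 20])
    for row in block:
        print(row)
```

An encoder that evaluates every mode would repeat a prediction of this kind for each of the 35 candidates per block, which gives a sense of where the computational cost of intra coding comes from.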