Complexity Analysis Of Next-Generation VVC Encoding and Decoding
While the next generation video compression standard, Versatile Video Coding
(VVC), provides a superior compression efficiency, its computational complexity
dramatically increases. This paper thoroughly analyzes this complexity for both
encoder and decoder of VVC Test Model 6, by quantifying the complexity
break-down for each coding tool and measuring the complexity and memory
requirements for VVC encoding/decoding. These extensive analyses are performed
for six video sequences of 720p, 1080p, and 2160p, under Low-Delay (LD),
Random-Access (RA), and All-Intra (AI) conditions (a total of 320
encoding/decoding runs). Results indicate that the VVC encoder and decoder are 5x
and 1.5x more complex compared to HEVC in LD, and 31x and 1.8x in AI,
respectively. Detailed analysis of coding tools reveals that in LD on average,
motion estimation tools with 53%, transformation and quantization with 22%, and
entropy coding with 7% dominate the encoding complexity. In decoding, loop
filters with 30%, motion compensation with 20%, and entropy decoding with 16%,
are the most complex modules. Moreover, the memory bandwidth required for VVC
encoding/decoding is measured through memory profiling and found to be 30x and 3x
that of HEVC, respectively. The reported results and insights are a guide for future
research and implementations of energy-efficient VVC encoders/decoders. Comment: IEEE ICIP 202
Maximum-Entropy-Model-Enabled Complexity Reduction Algorithm in Modern Video Coding Standards
Symmetry considerations play a key role in modern science, and any differentiable symmetry of the action of a physical system has a corresponding conservation law. Symmetry may be regarded as a reduction of entropy. This work focuses on reducing the computational complexity of modern video coding standards by using the maximum entropy principle. The high computational complexity of the coding unit (CU) size decision in modern video coding standards is a critical challenge for real-time applications. This problem is addressed with a novel approach that treats the CU termination, skip, and normal decisions as a three-class decision problem. A maximum entropy model (MEM) is formulated for the CU size decision problem, optimizing the conditional entropy; the improved iterative scaling (IIS) algorithm is used to solve this optimization problem. The classification features consist of the spatio-temporal information of the CU, including the rate–distortion (RD) cost, coded block flag (CBF), and depth. For the case analysis, the proposed method is based on the High Efficiency Video Coding (H.265/HEVC) standard. The experimental results demonstrate that the proposed method can reduce the computational complexity of the H.265/HEVC encoder significantly. Compared with the H.265/HEVC reference model, the proposed method reduces the average encoding time by 53.27% and 56.36% under the low delay and random access configurations, respectively, while the Bjontegaard Delta Bit Rates (BD-BRs) are 0.72% and 0.93% on average
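The three-class formulation in the abstract above can be illustrated with a minimal sketch. A conditional maximum entropy model has the same form as multinomial logistic (softmax) regression, so the sketch fits one with plain gradient ascent on the conditional log-likelihood; this is a simplification swapped in for the IIS algorithm named in the abstract. The feature layout (bias, normalized RD cost, CBF), the toy samples, and all function names are assumptions made for illustration and do not come from the paper.

```python
import math

# The three CU decisions named in the abstract.
CLASSES = ("terminate", "skip", "normal")

# Hypothetical toy features: [bias, normalized RD cost, coded block flag].
TOY_SAMPLES = [
    [1.0, 0.1, 0.0], [1.0, 0.2, 0.0],   # labelled "terminate"
    [1.0, 0.8, 0.0], [1.0, 0.9, 0.0],   # labelled "skip"
    [1.0, 0.5, 1.0], [1.0, 0.6, 1.0],   # labelled "normal"
]
TOY_LABELS = [0, 0, 1, 1, 2, 2]


def softmax(scores):
    """Turn per-class scores into probabilities (numerically stable)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, features):
    """P(class | features) under the maximum entropy (softmax) model."""
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return softmax(scores)


def train(samples, labels, n_features, epochs=500, lr=0.5):
    """Fit the weights by stochastic gradient ascent (not IIS)."""
    weights = [[0.0] * n_features for _ in CLASSES]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            probs = predict_proba(weights, x)
            for k in range(len(CLASSES)):
                # Gradient of the log-likelihood: observed minus expected.
                err = (1.0 if k == y else 0.0) - probs[k]
                for j in range(n_features):
                    weights[k][j] += lr * err * x[j]
    return weights


def decide(weights, features):
    """Pick the most probable CU decision for one feature vector."""
    probs = predict_proba(weights, features)
    return CLASSES[probs.index(max(probs))]
```

In an encoder, a call such as decide(weights, features) would replace part of the exhaustive RD search for a CU; how the paper thresholds the class probabilities is not stated in the abstract.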
High Performance Multiview Video Coding
Following the standardization of the latest video coding standard, High Efficiency Video Coding (HEVC), in 2013, the multiview extension of HEVC (MV-HEVC) was published in 2014 and brought significantly better compression performance, of around 50%, for multiview and 3D videos compared to multiple independent single-view HEVC coding. However, the extremely high computational complexity of MV-HEVC demands significant optimization of the encoder. To tackle this problem, this work investigates the possibilities of using modern parallel computing platforms and tools such as single-instruction-multiple-data (SIMD) instructions, multi-core CPUs, massively parallel GPUs, and computer clusters to significantly enhance the MVC encoder performance. These computing tools have very different computing characteristics, and misuse of the tools may result in poor performance improvement and sometimes even a reduction. To achieve the best possible encoding performance from modern computing tools, different levels of parallelism inside a typical MVC encoder are identified and analyzed. Novel optimization techniques at various levels of abstraction are proposed: non-aggregation massively parallel motion estimation (ME) and disparity estimation (DE) in the prediction unit (PU), fractional and bi-directional ME/DE acceleration through SIMD, quantization parameter (QP)-based early termination for the coding tree unit (CTU), optimized resource-scheduled wave-front parallel processing for CTUs, and workload-balanced, cluster-based multiple-view parallelism. The results show that the proposed parallel optimization techniques significantly improve execution time with insignificant loss in coding efficiency. This, in turn, shows that modern parallel computing platforms, with appropriate platform-specific algorithm design, are valuable tools for improving the performance of computationally intensive applications
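As background for the wave-front parallel processing mentioned in the abstract above, the following is a minimal sketch of the standard wave-front ordering of CTUs, not the optimized resource-scheduled variant the work proposes. It assumes the usual dependency pattern in which a CTU waits for its left neighbour and the above-right neighbour, so each CTU row lags the row above by two CTUs. The function name is hypothetical.

```python
def wavefront_schedule(rows, cols):
    """Group CTU coordinates (row, col) into waves.

    CTUs in the same wave do not depend on each other and could, in
    principle, be processed in parallel.
    """
    waves = {}
    for r in range(rows):
        for c in range(cols):
            # CTU (r, c) needs (r, c-1) and (r-1, c+1) to be finished,
            # which gives it the earliest time step c + 2*r.
            step = c + 2 * r
            waves.setdefault(step, []).append((r, c))
    return [waves[step] for step in sorted(waves)]


if __name__ == "__main__":
    # Print the waves for a small frame of 3 CTU rows by 4 CTU columns.
    for step, ctus in enumerate(wavefront_schedule(3, 4)):
        print(step, ctus)
```

The number of CTUs per wave is bounded by the number of CTU rows, which is one reason a scheduler that balances threads against the available waves can matter in practice.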
Optimal coding unit decision for early termination in high efficiency video coding using enhanced whale optimization algorithm
Video compression is an emerging research topic in the field of block-based video encoders. Due to the growth of video coding technologies, high efficiency video coding (HEVC) delivers superior coding performance. HEVC improves the rate-distortion (RD) performance at the cost of increased encoding complexity. In video compression, the outsized coding units (CUs) have higher encoding complexity. Therefore, the computational encoding cost and complexity remain vital concerns, which need to be treated as an optimization task. In this manuscript, an enhanced whale optimization algorithm (EWOA) is implemented to reduce the computational time and complexity of HEVC. In the EWOA, a cosine function is incorporated with the controlling parameter A, and two correlation factors are included in the WOA for controlling the position of whales and regulating the movement of the search mechanism during the optimization and search processes. The bit streams in the Luma-coding tree block are selected using the EWOA, which defines the CU neighbors and is used in HEVC. The results indicate that the EWOA achieves the best bit rate (BR), time saving, and peak signal to noise ratio (PSNR). The EWOA showed 0.006-0.012 dB higher PSNR than the existing models on real-time videos
Algorithms and methods for video transcoding.
Video transcoding is the process of dynamic video adaptation. Dynamic video adaptation can be defined as the process of converting video from one format to another, changing the bit rate, frame rate or resolution of the encoded video, which is mainly necessitated by the end user requirements. H.264 has been the predominantly used video compression standard for the last 15 years. HEVC (High Efficiency Video Coding) is the latest video compression standard finalised in 2013, which is an improvement over H.264 video compression standard. HEVC performs significantly better than H.264 in terms of the Rate-Distortion performance. As H.264 has been widely used in the last decade, a large amount of video content exists in H.264 format. There is a need to convert H.264 video content to HEVC format to achieve better Rate-Distortion performance and to support legacy video formats on newer devices. However, the computational complexity of HEVC encoder is 2-10 times higher than that of H.264 encoder. This makes it necessary to develop low complexity video transcoding algorithms to transcode from H.264 to HEVC format. This research work proposes low complexity algorithms for H.264 to HEVC video transcoding. The proposed algorithms reduce the computational complexity of H.264 to HEVC video transcoding significantly, with negligible loss in Rate-Distortion performance. This work proposes three different video transcoding algorithms. The MV-based mode merge algorithm uses the block mode and MV variances to estimate the split/non-split decision as part of the HEVC block prediction process. The conditional probability-based mode mapping algorithm models HEVC blocks of sizes 16×16 and lower as a function of H.264 block modes, H.264 and HEVC Quantisation Parameters (QP). The motion-compensated MB residual-based mode mapping algorithm makes the split/non-split decision based on content-adaptive classification models. 
With a combination of the proposed set of algorithms, the computational complexity of the HEVC encoder is reduced by around 60%, with negligible loss in Rate-Distortion performance, outperforming existing state-of-the-art algorithms by 20-25% in terms of computational complexity. The proposed algorithms can be used in computation-constrained video transcoding applications, to support video format conversion in smart devices, migration of large-scale H.264 video content from host servers to HEVC, cloud computing-based transcoding applications, and also to support high quality videos over bandwidth-constrained networks
Quality of Experience (QoE)-Aware Fast Coding Unit Size Selection for HEVC Intra-prediction
The exorbitant increase in the computational complexity of modern video coding standards, such as High Efficiency Video Coding (HEVC), is a compelling challenge for resource-constrained consumer electronic devices. For instance, the brute force evaluation of all possible combinations of available coding modes and the quadtree-based coding structure in HEVC to determine the optimum set of coding parameters for a given content demands a substantial amount of computational and energy resources. Thus, the resource requirements for real-time operation of HEVC have become a contributing factor towards the Quality of Experience (QoE) of the end users of emerging multimedia and future internet applications. In this context, this paper proposes a content-adaptive Coding Unit (CU) size selection algorithm for HEVC intra-prediction. The proposed algorithm builds content-specific weighted Support Vector Machine (SVM) models in real time during the encoding process, to provide an early estimate of CU size for a given content, avoiding the brute force evaluation of all possible coding mode combinations in HEVC. The experimental results demonstrate an average encoding time reduction of 52.38%, with an average Bjøntegaard Delta Bit Rate (BDBR) increase of 1.19% compared to the HM16.1 reference encoder. Furthermore, the perceptual visual quality assessments conducted through the Video Quality Metric (VQM) show minimal visual quality impact on the reconstructed videos of the proposed algorithm compared to state-of-the-art approaches
Efficient HEVC-based video adaptation using transcoding
In a video transmission system, it is important to take into account the great diversity of the network/end-user constraints. On the one hand, video content is typically streamed over a network that is characterized by different bandwidth capacities. In many cases, the bandwidth is insufficient to transfer the video at its original quality. On the other hand, a single video is often played by multiple devices like PCs, laptops, and cell phones. Obviously, a single video would not satisfy their different constraints.
This diversity in network and device capacity leads to the need for video adaptation techniques, e.g., a reduction of the bit rate or spatial resolution. Video transcoding, which modifies a property of the video without changing the coding format, is well known as an efficient adaptation solution. However, this approach comes with a high computational complexity, resulting in huge energy consumption in the network and possibly network latency.
This presentation provides several optimization strategies for the transcoding process of HEVC (the latest High Efficiency Video Coding standard) video streams. First, the computational complexity of a bit rate transcoder (transrater) is reduced. Several techniques are proposed to speed up the encoder of a transrater, notably a machine-learning-based approach and a novel coding-mode evaluation strategy. Moreover, the motion estimation process of the encoder has been optimized with the use of decision theory and the proposed fast search patterns. Second, the issues and challenges of a spatial transcoder have been addressed by using machine-learning algorithms. Thanks to their great performance, the proposed techniques are expected to significantly help HEVC gain popularity in a wide range of modern multimedia applications
Fast and Efficient Lenslet Image Compression
Light field imaging is characterized by capturing brightness, color, and
directional information of light rays in a scene. This leads to image
representations with a huge amount of data that require efficient coding schemes.
In this paper, lenslet images are rendered into sub-aperture images. These
images are organized as a pseudo-sequence input for the HEVC video codec. To
better exploit redundancy among the neighboring sub-aperture images and
consequently decrease the distances between a sub-aperture image and its
references used for prediction, sub-aperture images are divided into four
smaller groups that are scanned in a serpentine order. The most central
sub-aperture image, which has the highest similarity to all the other images,
is used as the initial reference image for each of the four regions.
Furthermore, a structure is defined that selects spatially adjacent
sub-aperture images as prediction references with the highest similarity to the
current image. In this way, encoding efficiency increases, and furthermore it
leads to a higher similarity among the co-located Coding Tree Units (CTUs).
The similarities among the co-located CTUs are exploited to predict Coding Unit
depths. Moreover, independent encoding of each group division enables parallel
processing, which, along with the proposed coding unit depth prediction, decreases
the encoding execution time by almost 80% on average. Simulation results show
that the proposed method achieves a higher compression gain than other
state-of-the-art lenslet compression methods, with lower
computational complexity
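The grouping and scan order described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: the sub-aperture images form an n x n grid, the four groups are the four quadrants of that grid, and each quadrant is scanned row by row with alternating direction. The actual partition and scan directions used in the paper are not given in the abstract, and the function names are hypothetical.

```python
def serpentine(rows, cols):
    """Scan the given rows and columns in serpentine (boustrophedon) order."""
    order = []
    for i, r in enumerate(rows):
        # Reverse the column direction on every second row.
        line = cols if i % 2 == 0 else cols[::-1]
        order.extend((r, c) for c in line)
    return order


def four_group_scan(n):
    """Split an n x n grid of sub-aperture images into four scanned groups.

    Returns the central image (used as the initial reference for every
    group) and a list of four coordinate lists, one per quadrant.
    """
    half = n // 2
    centre = (half, half)
    top, bottom = list(range(0, half)), list(range(half, n))
    left, right = list(range(0, half)), list(range(half, n))
    groups = []
    for rows in (top, bottom):
        for cols in (left, right):
            # The central image is coded first, so drop it from its group.
            groups.append([p for p in serpentine(rows, cols) if p != centre])
    return centre, groups
```

Because the four groups share only the central reference image, each group could be handed to a separate encoder instance, which is the property the abstract relies on for parallel processing.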