6 research outputs found
Reducing Complexity on Coding Unit Partitioning in Video Coding: A Review
In this article, we survey low-complexity video coding based on coding unit (CU) partitioning, with the aim of helping researchers understand the foundations of video coding and fast CU partition algorithms. Firstly, we introduce video coding technologies by explaining the current standards and reference models: High Efficiency Video Coding (HEVC), the Joint Exploration Test Model (JEM), and Versatile Video Coding (VVC), which introduce quadtree (QT), quadtree plus binary tree (QTBT), and quadtree plus multi-type tree (QTMT) block partitioning, respectively, each at a high computational cost. Secondly, we give a comprehensive explanation of the time-consuming CU partitioning process, especially for researchers who are not familiar with it. The newer the video coding standard, the more flexible its partition structures and the higher its computational complexity. Then, we provide a deep and comprehensive survey of recent and state-of-the-art research. Finally, we include a discussion of the advantages and disadvantages of heuristic-based and learning-based approaches, so that readers can quickly assess the performance of existing algorithms and their limitations. To our knowledge, this is the first comprehensive survey of fast CU partitioning covering HEVC, JEM, and VVC.
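To make the cost of exhaustive CU partitioning concrete, the following minimal sketch counts the distinct quadtree partitionings of a single block. It assumes a 64x64 coding tree unit and an 8x8 minimum CU size (a common HEVC configuration, not stated in the abstract), and it covers the quadtree case only; QTBT and QTMT add binary and ternary splits, so their search spaces are larger. The function name is hypothetical.

```python
def count_quadtree_partitions(block_size: int, min_size: int) -> int:
    """Count the distinct quadtree partitionings of a square block.

    A block is either left unsplit (1 option) or split into four
    sub-blocks, each of which is partitioned independently.
    """
    if block_size <= min_size:
        return 1  # the block cannot be split any further
    sub = count_quadtree_partitions(block_size // 2, min_size)
    return 1 + sub ** 4


if __name__ == "__main__":
    # Assumed sizes: 64x64 block, 8x8 minimum CU.
    print(count_quadtree_partitions(64, 8))  # 83522
```

Each of these partitionings would need a rate-distortion evaluation in a brute-force search, which is why fast partition algorithms try to rule out most of them early.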
Efficient VVC Intra Prediction Based on Deep Feature Fusion and Probability Estimation
The ever-growing multimedia traffic has underscored the importance of
effective multimedia codecs. Among them, the latest lossy video coding
standard, Versatile Video Coding (VVC), has been attracting the attention of the
video coding community. However, the gain of VVC is achieved at the cost of
significant encoding complexity, which creates the need for a fast encoder
with comparable Rate-Distortion (RD) performance. In this paper, we propose to
optimize the VVC complexity at intra-frame prediction, with a two-stage
framework of deep feature fusion and probability estimation. At the first
stage, we employ a deep convolutional network to extract the spatial-temporal
neighboring coding features. Then we fuse all reference features obtained by
different convolutional kernels to determine an optimal intra coding depth. At
the second stage, we employ a probability-based model and the spatial-temporal
coherence to select the candidate partition modes within the optimal coding
depth. Finally, these selected depths and partitions are executed whilst
unnecessary computations are excluded. Experimental results on a standard
database demonstrate the superiority of the proposed method, especially for High
Definition (HD) and Ultra-HD (UHD) video sequences. Comment: 10 pages, 10 figures
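The second stage described above, selecting candidate partition modes from estimated probabilities, can be illustrated with a short sketch. This is not the paper's probability model: it assumes the per-mode probabilities are already available and simply keeps the most likely modes until a target cumulative probability is reached. The function name, the `coverage` parameter, and the example probabilities are all hypothetical; the mode labels follow the VVC split types (no split, quadtree, horizontal and vertical binary, horizontal and vertical ternary).

```python
from typing import Dict, List


def select_candidate_modes(mode_probs: Dict[str, float],
                           coverage: float = 0.9) -> List[str]:
    """Keep the most probable partition modes until their cumulative
    probability reaches `coverage`; the rest are skipped by the encoder."""
    ranked = sorted(mode_probs.items(), key=lambda kv: kv[1], reverse=True)
    selected: List[str] = []
    total = 0.0
    for mode, prob in ranked:
        selected.append(mode)
        total += prob
        if total >= coverage:
            break
    return selected


if __name__ == "__main__":
    # Made-up probabilities for one coding unit (illustration only).
    probs = {"NO_SPLIT": 0.55, "QT": 0.25, "BT_H": 0.12,
             "BT_V": 0.05, "TT_H": 0.02, "TT_V": 0.01}
    print(select_candidate_modes(probs, coverage=0.9))
```

A lower `coverage` keeps fewer candidates and saves more encoding time, at the risk of discarding the mode that full rate-distortion optimization would have chosen.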
CNN-based Prediction of Partition Path for VVC Fast Inter Partitioning Using Motion Fields
The Versatile Video Coding (VVC) standard has been recently finalized by the
Joint Video Experts Team (JVET). Compared to the High Efficiency Video
Coding (HEVC) standard, VVC offers about 50% compression efficiency gain, in
terms of Bjontegaard Delta-Rate (BD-rate), at the cost of a 10-fold increase in
encoding complexity. In this paper, we propose a method based on Convolutional
Neural Network (CNN) to speed up the inter partitioning process in VVC.
Firstly, a novel representation for the quadtree with nested multi-type tree
(QTMT) partition is introduced, derived from the partition path. Secondly, we
develop a U-Net-based CNN taking a multi-scale motion vector field as input at
the Coding Tree Unit (CTU) level. The purpose of CNN inference is to predict
the optimal partition path during the Rate-Distortion Optimization (RDO)
process. To achieve this, we divide the CTU into a grid and predict the Quaternary
Tree (QT) depth and Multi-type Tree (MT) split decisions for each cell of the
grid. Thirdly, an efficient partition pruning algorithm is introduced to employ
the CNN predictions at each partitioning level to skip RDO evaluations of
unnecessary partition paths. Finally, an adaptive threshold selection scheme is
designed, making the trade-off between complexity and efficiency scalable.
Experiments show that the proposed method can achieve acceleration ranging from
16.5% to 60.2% under the RandomAccess Group Of Picture 32 (RAGOP32)
configuration with a reasonable efficiency drop ranging from 0.44% to 4.59% in
terms of BD-rate, which surpasses other state-of-the-art solutions.
Additionally, our method stands out as one of the lightest approaches in the
field, which ensures its applicability to other encoders.
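The idea of using per-cell predictions and a threshold to skip RDO evaluations can be sketched as follows. This is an illustrative simplification under stated assumptions, not the pruning algorithm from the paper: it assumes the network outputs a predicted QT depth for each grid cell inside a CU and a probability for each MT split option, and every name and value in it is hypothetical.

```python
from typing import Dict, List, Sequence


def splits_to_evaluate(cell_qt_depths: Sequence[int],
                       current_qt_depth: int,
                       mt_split_probs: Dict[str, float],
                       threshold: float) -> List[str]:
    """Return the split options that still need an RDO evaluation."""
    # Assumption: if every cell covered by the CU is predicted to lie
    # deeper in the quadtree, only the QT split is worth testing.
    if min(cell_qt_depths) > current_qt_depth:
        return ["QT"]
    # Otherwise keep the MT options whose probability reaches the threshold.
    kept = [s for s, p in mt_split_probs.items() if p >= threshold]
    if not kept:
        # Never prune everything: fall back to the most likely option.
        kept = [max(mt_split_probs, key=mt_split_probs.get)]
    return kept


if __name__ == "__main__":
    # Made-up predictions for one CU at QT depth 1 (illustration only).
    probs = {"NO_SPLIT": 0.60, "BT_H": 0.30, "BT_V": 0.05,
             "TT_H": 0.03, "TT_V": 0.02}
    print(splits_to_evaluate([1, 1, 1, 1], 1, probs, threshold=0.2))
    print(splits_to_evaluate([2, 2, 3, 2], 1, probs, threshold=0.2))
```

Raising `threshold` prunes more options, trading coding efficiency for speed; this is the kind of knob that a threshold selection scheme would tune to make the trade-off scalable.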
Learned-based Intra Coding Tools for Video Compression.
PhD Theses. The increase in demand for video rendering on 4K and beyond displays, as well
as immersive video formats, requires the use of efficient compression techniques. In
this thesis, novel methods for enhancing the efficiency of current and next-generation
video codecs are investigated. Several aspects that influence the way conventional video
coding methods work are considered. The methods proposed in this thesis utilise Neural
Networks (NNs) trained for regression tasks in order to predict data. In particular,
Convolutional Neural Networks (CNNs) are used to predict Rate-Distortion (RD) data
for intra-coded frames. Moreover, novel intra-prediction methods are proposed with
the aim of providing new ways to exploit redundancies overlooked by traditional
intra-prediction tools. Additionally, it is shown how such methods can be simplified in
order to derive less resource-demanding tools.