AV1 Video Coding Using Texture Analysis With Convolutional Neural Networks
Modern video codecs including the newly developed AOM/AV1 utilize hybrid
coding techniques to remove spatial and temporal redundancy. However, efficient
exploitation of statistical dependencies measured by a mean squared error (MSE)
does not always produce the best psychovisual result. One interesting approach
is to only encode visually relevant information and use a different coding
method for "perceptually insignificant" regions in the frame, which can lead to
substantial data rate reductions while maintaining visual quality. In this
paper, we introduce a texture analyzer before encoding the input sequences to
identify detail-irrelevant texture regions in the frame using convolutional
neural networks. We designed and developed a new coding tool for AV1, referred
to as texture mode: when texture mode is selected at the encoder, no
inter-frame prediction is performed for the identified texture regions.
Instead, displacement of the entire region is modeled by just one set of motion
parameters. Therefore, only the model parameters are transmitted to the decoder
for reconstructing the texture regions. Non-texture regions in the frame are
coded conventionally. We show that, for many standard test sets, the proposed
method achieves significant data rate reductions.
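The idea of modeling the displacement of a whole texture region with one set of motion parameters can be illustrated with a minimal sketch. It assumes a 6-parameter affine model and nearest-neighbor sampling, neither of which is stated in the abstract; the names `AffineParams`, `warp_point`, and `synthesize_region` are hypothetical and are not part of AV1 or of the authors' implementation.

```python
from typing import List, Optional, Tuple

# Hypothetical 6-parameter affine model (a, b, tx, c, d, ty); the abstract
# only says "one set of motion parameters", so this model is an assumption.
AffineParams = Tuple[float, float, float, float, float, float]


def warp_point(x: int, y: int, p: AffineParams) -> Tuple[float, float]:
    """Map a pixel position in the current frame to the reference frame."""
    a, b, tx, c, d, ty = p
    return a * x + b * y + tx, c * x + d * y + ty


def synthesize_region(
    reference: List[List[int]],
    mask: List[List[bool]],
    params: AffineParams,
) -> List[List[Optional[int]]]:
    """Reconstruct masked (texture) pixels from the reference frame.

    Every masked pixel uses the same motion parameters. Pixels outside the
    mask are left as None, standing in for conventionally coded regions.
    """
    height, width = len(reference), len(reference[0])
    out: List[List[Optional[int]]] = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            sx, sy = warp_point(x, y, params)
            # Nearest-neighbor sampling, clamped to the frame borders.
            ix = min(max(int(round(sx)), 0), width - 1)
            iy = min(max(int(round(sy)), 0), height - 1)
            out[y][x] = reference[iy][ix]
    return out


# Usage: a 4x4 reference frame and a texture mask covering the left half.
reference = [[y * 4 + x for x in range(4)] for y in range(4)]
mask = [[x < 2 for x in range(4)] for _ in range(4)]
shift_right_by_one: AffineParams = (1.0, 0.0, 1.0, 0.0, 1.0, 0.0)
region = synthesize_region(reference, mask, shift_right_by_one)
```

In this sketch the decoder side would need only the six numbers in `shift_right_by_one` plus the region mask to rebuild the texture pixels, which is the source of the data rate reduction the abstract describes.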
Convolutional Neural Networks Based Texture Modeling For AV1
Modern video codecs including the newly developed AOMedia Video 1 (AV1)
utilize hybrid coding techniques to remove spatial and temporal redundancy.
However, efficient exploitation of statistical dependencies measured by a mean
squared error (MSE) does not always produce the best psychovisual result. One
interesting approach is to only encode visually relevant information and use a
different coding method for "perceptually insignificant" regions in the frame,
which can lead to substantial data rate reductions while maintaining visual
quality. In this paper, we introduce a texture analyzer before encoding the
input sequences to identify "perceptually insignificant" regions in the frame
using convolutional neural networks. We designed and developed a new scheme
that integrates the texture analyzer into the codec and can largely reduce
temporal flickering artifacts for codecs with a hierarchical coding structure.
The proposed method is implemented in the AV1 codec by introducing a new coding
tool called texture mode, a special inter mode handled at the encoder: if
texture mode is selected, no inter prediction is performed for the identified
texture regions. Instead, displacement of the entire region is
modeled by just one set of motion parameters. Therefore, only the model
parameters are transmitted to the decoder for reconstructing the texture
regions. Non-texture regions in the frame are coded conventionally. We show
that for many standard test sets, the proposed method achieved significant data
rate reductions with satisfactory visual quality.
Comment: 22 pages, 7 figures. arXiv admin note: substantial text overlap with
arXiv:1804.0929
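The remark that MSE does not always track psychovisual quality can be made concrete with a small numeric sketch. It uses a synthetic one-pixel checkerboard as a stand-in for a fine texture, which is an assumption for illustration only; the helper `mse` and the test images are hypothetical and do not come from the papers.

```python
from typing import List

Image = List[List[int]]


def mse(a: Image, b: Image) -> float:
    """Mean squared error between two equally sized grayscale images."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            count += 1
    return total / count


SIZE = 8

# Synthetic fine texture: a one-pixel checkerboard of black and white.
texture: Image = [
    [255 if (x + y) % 2 else 0 for x in range(SIZE)] for y in range(SIZE)
]

# The same texture displaced by one pixel (cyclic shift). It shows the same
# pattern, but every pixel now differs from the original.
shifted: Image = [
    [texture[y][(x + 1) % SIZE] for x in range(SIZE)] for y in range(SIZE)
]

# A flat gray image that discards the texture entirely.
flat: Image = [[128] * SIZE for _ in range(SIZE)]

mse_shifted = mse(texture, shifted)
mse_flat = mse(texture, flat)
print(mse_shifted, mse_flat)
```

Under this metric the flat gray image scores better than the displaced texture, even though the displaced version preserves the texture's appearance. That gap is the motivation for treating such regions with a different coding method rather than minimizing pixel-wise error.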