Perceptually-Driven Video Coding with the Daala Video Codec
The Daala project is a royalty-free video codec that attempts to compete with
the best patent-encumbered codecs. Part of our strategy is to replace core
tools of traditional video codecs with alternative approaches, many of them
designed to take perceptual aspects into account, rather than optimizing for
simple metrics like PSNR. This paper documents some of our experiences with
these tools, which ones worked and which did not. We evaluate which tools are
easy to integrate into a more traditional codec design, and show results in the
context of the codec being developed by the Alliance for Open Media.
Comment: 19 pages, Proceedings of SPIE Workshop on Applications of Digital
Image Processing (ADIP), 201
CARNet: Compression Artifact Reduction for Point Cloud Attribute
A learning-based adaptive loop filter is developed for the Geometry-based
Point Cloud Compression (G-PCC) standard to reduce attribute compression
artifacts. The proposed method first generates multiple Most-Probable Sample
Offsets (MPSOs) as potential compression distortion approximations, and then
linearly weights them for artifact mitigation. As such, we drive the filtered
reconstruction as close to the uncompressed point cloud attribute (PCA) as
possible. To this end, we
devise a Compression Artifact Reduction Network (CARNet) which consists of two
consecutive processing phases: MPSOs derivation and MPSOs combination. The
MPSOs derivation uses a two-stream network to model local neighborhood
variations from direct spatial embedding and frequency-dependent embedding,
where sparse convolutions are utilized to best aggregate information from
sparsely and irregularly distributed points. The MPSOs combination is guided by
the least square error metric to derive weighting coefficients on the fly to
further capture content dynamics of input PCAs. The CARNet is implemented as an
in-loop filtering tool of G-PCC, where those linear weighting coefficients
are encapsulated into the bitstream with negligible bit rate overhead.
Experimental results demonstrate significant improvement over the latest G-PCC
both subjectively and objectively.
Comment: 13 pages, 8 figures
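The combination step described in this abstract (weighting several candidate offsets by a least-squares criterion) can be illustrated with a minimal sketch. This is not the CARNet implementation: the function names, the plain-list data layout, and the use of the normal equations are all assumptions made for illustration, and the candidate offsets are assumed to be linearly independent.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def least_squares_weights(offsets, residual):
    """Hypothetical helper: weights w minimizing ||residual - sum_k w[k] * offsets[k]||^2.

    offsets  : K candidate offset vectors, each of length N
    residual : original minus reconstructed attribute, length N
    """
    k = len(offsets)
    # Normal equations: (O O^T) w = O r
    gram = [[sum(p * q for p, q in zip(offsets[i], offsets[j]))
             for j in range(k)] for i in range(k)]
    rhs = [sum(p * r for p, r in zip(offsets[i], residual)) for i in range(k)]
    return solve_linear(gram, rhs)


def apply_offsets(recon, offsets, weights):
    """Add the weighted sum of offsets to the reconstruction."""
    return [x + sum(w * o[n] for w, o in zip(weights, offsets))
            for n, x in enumerate(recon)]
```

In this sketch the encoder would compute the weights, since only it has access to the original attribute, and the decoder would call `apply_offsets` with the weights it reads from the bitstream.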
Mesh-based video coding for low bit-rate communications
In this paper, a new method for low bit-rate content-adaptive mesh-based video coding is proposed. Its intra-frame coding employs feature map extraction for node distribution at specific threshold levels to achieve higher-density placement of initial nodes in regions that contain high-frequency features and, conversely, sparse placement of initial nodes in smooth regions. Insignificant nodes are largely removed using a subsequent node elimination scheme. The Hilbert scan is then applied before quantization and entropy coding to reduce the amount of transmitted information. For moving images, both the node position and the color parameters of only a subset of nodes may change from frame to frame. It is sufficient to transmit only these changed parameters. The proposed method is well-suited for video coding at very low bit rates, as processing results demonstrate that it provides good subjective and objective image quality at a lower number of required bits.
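As a rough illustration of the Hilbert scan mentioned in this abstract, the sketch below orders the cells of a square grid along a Hilbert curve, so that consecutive samples in the output are always spatial neighbors. It uses the standard index-to-coordinate construction. The grid layout (`grid[y][x]`, side length a power of two) and the function names are assumptions for illustration; the paper's own node ordering may differ.

```python
def hilbert_d2xy(order, d):
    """Map position d along a Hilbert curve to (x, y) on a 2**order square grid."""
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the sub-square before transposing it.
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_scan(grid, order):
    """Return the values of grid (indexed grid[y][x]) in Hilbert-curve order."""
    side = 1 << order
    out = []
    for d in range(side * side):
        x, y = hilbert_d2xy(order, d)
        out.append(grid[y][x])
    return out
```

For a 2x2 grid `[[1, 2], [3, 4]]`, `hilbert_scan(grid, 1)` visits the cells in the order `[1, 3, 4, 2]`.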
Designs and Implementations in Neural Network-based Video Coding
The past decade has witnessed the huge success of deep learning in well-known
artificial intelligence applications such as face recognition, autonomous
driving, and large language models such as ChatGPT. Recently, the application of
deep learning has been extended to a much wider range, with neural
network-based video coding being one of them. Neural network-based video coding
can be performed at two different levels: embedding neural network-based
(NN-based) coding tools into a classical video compression framework or
building the entire compression framework upon neural networks. This paper
elaborates on some of the recent exploration efforts of JVET (Joint Video Experts
Team of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC29) under the name of neural
network-based video coding (NNVC), which fall into the former category.
Specifically, this paper discusses two major NN-based video coding
technologies, i.e. neural network-based intra prediction and neural
network-based in-loop filtering, which have been investigated for several
meeting cycles in JVET and finally adopted into the reference software of NNVC.
Extensive experiments on top of the NNVC have been conducted to evaluate the
effectiveness of the proposed techniques. Compared with VTM-11.0_nnvc, the
proposed NN-based coding tools in NNVC-4.0 could achieve {11.94%, 21.86%,
22.59%}, {9.18%, 19.76%, 20.92%}, and {10.63%, 21.56%, 23.02%} BD-rate
reductions on average for {Y, Cb, Cr} under random-access, low-delay, and
all-intra configurations, respectively.