
Perceptually-Driven Video Coding with the Daala Video Codec

Author: Bankoski
Daede
Dai
de Oliveira
Duda
Egge
Fukuma
Fuldseth
Grange
Han
Ponomarenko
Reader
Sezer
Stuiver
Terriberry
Tran
Valin
Wang
Watanabe
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 08/10/2016

The Daala project is a royalty-free video codec that attempts to compete with the best patent-encumbered codecs. Part of our strategy is to replace core tools of traditional video codecs with alternative approaches, many of them designed to take perceptual aspects into account rather than optimizing for simple metrics like PSNR. This paper documents some of our experiences with these tools: which ones worked and which did not. We evaluate which tools are easy to integrate into a more traditional codec design, and show results in the context of the codec being developed by the Alliance for Open Media. Comment: 19 pages, Proceedings of SPIE Workshop on Applications of Digital Image Processing (ADIP), 201

arXiv.org e-Print Archive

Crossref

CARNet: Compression Artifact Reduction for Point Cloud Attribute

Author: Ding Dandan
Ma Zhan
Wang Jianqiang
Zhang Junzhe
Publication date: 17/09/2022

A learning-based adaptive loop filter is developed for the Geometry-based Point Cloud Compression (G-PCC) standard to reduce attribute compression artifacts. The proposed method first generates multiple Most-Probable Sample Offsets (MPSOs) as potential approximations of the compression distortion, and then linearly weights them for artifact mitigation. In this way, we drive the filtered reconstruction as close to the uncompressed point cloud attribute (PCA) as possible. To this end, we devise a Compression Artifact Reduction Network (CARNet), which consists of two consecutive processing phases: MPSOs derivation and MPSOs combination. The MPSOs derivation uses a two-stream network to model local neighborhood variations from direct spatial embedding and frequency-dependent embedding, where sparse convolutions are used to aggregate information from sparsely and irregularly distributed points. The MPSOs combination is guided by the least-squares error metric to derive weighting coefficients on the fly, further capturing the content dynamics of input PCAs. CARNet is implemented as an in-loop filtering tool of G-PCC, where the linear weighting coefficients are encapsulated into the bitstream with negligible bit rate overhead. Experimental results demonstrate significant improvement over the latest G-PCC both subjectively and objectively. Comment: 13 pages, 8 figures
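As a rough illustration of the MPSOs combination step described in this abstract, the following Python sketch derives least-squares weights for a set of candidate offsets and applies them to a reconstruction. It is not the CARNet implementation: the function names, the toy attribute values, and the choice of solving the normal equations by Gauss-Jordan elimination are all assumptions made for this example, and in the paper the candidate offsets come from a neural network rather than being given by hand.

```python
def solve_linear(a, b):
    """Solve a * w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("candidate offsets are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [x - factor * y for x, y in zip(m[row], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares_weights(offsets, residual):
    """Weights w minimizing sum_i (residual[i] - sum_k w[k] * offsets[k][i])**2."""
    # Normal equations: (O O^T) w = O r, with one row of O per candidate offset.
    gram = [[sum(p * q for p, q in zip(oa, ob)) for ob in offsets] for oa in offsets]
    rhs = [sum(p * r for p, r in zip(o, residual)) for o in offsets]
    return solve_linear(gram, rhs)


def apply_offsets(reconstruction, offsets, weights):
    """Add the weighted sum of candidate offsets to each reconstructed sample."""
    return [
        rec + sum(w * o[i] for w, o in zip(weights, offsets))
        for i, rec in enumerate(reconstruction)
    ]


# Toy data (hypothetical): four attribute samples and two candidate offsets.
original = [10.0, 12.0, 9.0, 15.0]
reconstruction = [9.0, 13.0, 9.5, 13.0]
offsets = [[1.0, -1.0, 0.0, 1.0], [0.0, 0.0, -1.0, 2.0]]

# The encoder knows the original, so it can compute the residual and the weights.
residual = [x - r for x, r in zip(original, reconstruction)]
weights = least_squares_weights(offsets, residual)
filtered = apply_offsets(reconstruction, offsets, weights)
```

In this sketch only the encoder side needs the original samples; a decoder would receive the weights and call `apply_offsets` with the same candidate offsets, which is consistent with the abstract's note that the coefficients are carried in the bitstream.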

arXiv.org e-Print Archive

Mesh-based video coding for low bit-rate communications

Author: Ahmed KM
Fernando WAC
Kocharoen P
Rajatheva RMAP
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

In this paper, a new method for low bit-rate content-adaptive mesh-based video coding is proposed. Intra-frame coding in this method employs feature map extraction for node distribution at specific threshold levels, placing initial nodes densely in regions that contain high-frequency features and sparsely in smooth regions. Insignificant nodes are largely removed using a subsequent node elimination scheme. The Hilbert scan is then applied before quantization and entropy coding to reduce the amount of transmitted information. For moving images, both the node position and the color parameters of only a subset of nodes may change from frame to frame, so it is sufficient to transmit only these changed parameters. The proposed method is well suited for video coding at very low bit rates, as processing results demonstrate that it provides good subjective and objective image quality with fewer required bits.
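To make the Hilbert scan mentioned in this abstract concrete, here is a minimal Python sketch that reads a square grid in Hilbert-curve order, so that consecutive output values come from neighboring cells. The abstract does not say exactly which data the paper scans or how, so the grid layout (`grid[y][x]`, side length a power of two) and the function names are assumptions for illustration only. The index-to-coordinate mapping is the standard iterative Hilbert curve construction.

```python
def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2**order square grid."""
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the current quadrant so sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_scan(grid):
    """Return the values of a square grid (side 2**order) in Hilbert order."""
    n = len(grid)
    order = n.bit_length() - 1
    if n == 0 or 1 << order != n or any(len(row) != n for row in grid):
        raise ValueError("grid must be square with a power-of-two side")
    coords = [hilbert_d2xy(order, d) for d in range(n * n)]
    return [grid[y][x] for x, y in coords]


# Hypothetical 2x2 example: rows are indexed by y, columns by x.
small_scan = hilbert_scan([[1, 2], [3, 4]])

# Path over a 4x4 grid, used to check that successive cells are neighbors.
path_4x4 = [hilbert_d2xy(2, d) for d in range(16)]
```

Because successive cells along the curve are spatially adjacent, values that are similar in neighboring regions tend to end up next to each other in the one-dimensional sequence, which is the usual motivation for applying such a scan ahead of entropy coding.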

Surrey Research Insight

Brunel University Research Archive

Designs and Implementations in Neural Network-based Video Coding

Author: Andersson Kenneth
Coban Muhammed
Dumas Thierry
Galpin Franck
Li Junru
Li Yue
Lin Chaoyi
Liu Du
Ström Jacob
Wang Hongtao
Zhang Kai
Zhang Li
Publication date: 13/09/2023

The past decade has witnessed the huge success of deep learning in well-known artificial intelligence applications such as face recognition, autonomous driving, and large language models like ChatGPT. Recently, the application of deep learning has been extended to a much wider range of tasks, neural network-based video coding being one of them. Neural network-based video coding can be performed at two different levels: embedding neural network-based (NN-based) coding tools into a classical video compression framework, or building the entire compression framework upon neural networks. This paper elaborates on some of the recent exploration efforts of JVET (Joint Video Experts Team of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC29) under the name of neural network-based video coding (NNVC), which fall into the former category. Specifically, this paper discusses two major NN-based video coding technologies, i.e. neural network-based intra prediction and neural network-based in-loop filtering, which have been investigated for several meeting cycles in JVET and finally adopted into the reference software of NNVC. Extensive experiments on top of the NNVC have been conducted to evaluate the effectiveness of the proposed techniques. Compared with VTM-11.0_nnvc, the proposed NN-based coding tools in NNVC-4.0 could achieve {11.94%, 21.86%, 22.59%}, {9.18%, 19.76%, 20.92%}, and {10.63%, 21.56%, 23.02%} BD-rate reductions on average for {Y, Cb, Cr} under random-access, low-delay, and all-intra configurations, respectively.

arXiv.org e-Print Archive


Efficient Debanding Filtering for Inverse Tone Mapped High Dynamic Range Videos

Author: Cosman Pamela C
Song Qing
Su Guan-Ming
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

eScholarship - University of California