4 research outputs found

A CNN-based Post-Processor for Perceptually-Optimized Immersive Media Compression

Author: Bull David
Katsenou Angeliki
Zhang Fan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 25/02/2022

In recent years, resolution adaptation based on deep neural networks has enabled significant performance gains for conventional (2D) video codecs. This paper investigates the effectiveness of spatial resolution resampling in the context of immersive content. The proposed approach reduces the spatial resolution of input multi-view videos before encoding, and reconstructs their original resolution after decoding. During the up-sampling process, an advanced CNN model is used to reduce potential re-sampling, compression, and synthesis artifacts. This work has been fully tested with the TMIV coding standard using a Versatile Video Coding (VVC) codec. The results demonstrate that the proposed method achieves a significant rate-quality performance improvement for the majority of the test sequences, with an average BD-VMAF improvement of 3.07 over all sequences.

arXiv.org e-Print Archive

Explore Bristol Research

Spherical-harmonics-based sound field decomposition and multichannel NMF for sound source separation

Author: Antonacci Fabio
Carabias-Orti Julio
Pezzoli Mirco
Sarti Augusto
Vera-Candeas Pedro
Publication date: 01/01/2024

Archivio istituzionale della ricerca - Politecnico di Milano

3D Scene Geometry Estimation from 360$^\circ$ Imagery: A Survey

Author: da Silveira Thiago Lopes Trugillo
Jung Claudio Rosito
Llerena Jeffri Erwin Murrugarra
Pinto Paulo Gamarra Lessa
Publication date: 17/01/2024

This paper provides a comprehensive survey on pioneer and state-of-the-art 3D scene geometry estimation methodologies based on single, two, or multiple images captured under the omnidirectional optics. We first revisit the basic concepts of the spherical camera model, and review the most common acquisition technologies and representation formats suitable for omnidirectional (also called 360$^\circ$, spherical or panoramic) images and videos. We then survey monocular layout and depth inference approaches, highlighting the recent advances in learning-based solutions suited for spherical data. The classical stereo matching is then revised on the spherical domain, where methodologies for detecting and describing sparse and dense features become crucial. The stereo matching concepts are then extrapolated for multiple view camera setups, categorizing them among light fields, multi-view stereo, and structure from motion (or visual simultaneous localization and mapping). We also compile and discuss commonly adopted datasets and figures of merit indicated for each purpose and list recent results for completeness. We conclude this paper by pointing out current and future trends. Comment: Published in ACM Computing Survey

arXiv.org e-Print Archive

Standardization Status of Immersive Video Coding

Author: Boyce Jill M.
Peng Wen-Hsiao
Stockhammer Thomas
Wien Mathias
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019