Analysis of Neural Video Compression Networks for 360-Degree Video Coding
With increasing efforts to bring high-quality virtual reality
technologies to the market, efficient 360-degree video compression gains in
importance. As such, the state-of-the-art H.266/VVC video coding standard
integrates dedicated tools for 360-degree video, and considerable efforts have
been put into designing 360-degree projection formats with improved compression
efficiency. For the fast-evolving field of neural video compression networks
(NVCs), the effects of different 360-degree projection formats on the overall
compression performance have not yet been investigated. It is thus unclear
whether resampling from the conventional equirectangular projection (ERP) to
other projection formats yields similar gains for NVCs as for hybrid video
codecs, and which formats perform best. In this paper, we analyze several
generations of NVCs and an extensive set of 360-degree projection formats with
respect to their compression performance for 360-degree video. Based on our
analysis, we find that projection format resampling also yields significant
improvements in compression performance for NVCs. The adjusted cubemap
projection (ACP) and equatorial cylindrical projection (ECP) are shown to perform
best and achieve rate savings of more than 55% compared to ERP based on WS-PSNR
for the most recent NVC. Remarkably, the observed rate savings are higher than
for H.266/VVC, emphasizing the importance of projection format resampling for
NVCs.

Comment: 5 pages, 4 figures, 1 table, accepted for Picture Coding Symposium
2024 (PCS 2024)
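The WS-PSNR metric used for the rate savings above weights each ERP pixel by its solid angle on the sphere, so that the oversampled polar rows do not dominate the distortion. A minimal sketch using the standard ERP weighting (the implementation details are an illustration, not taken from the paper):

```python
import numpy as np

def ws_psnr(ref, dist, max_val=255.0):
    """Weighted-to-spherically-uniform PSNR for equirectangular (ERP) frames.

    Each row j is weighted by cos((j + 0.5 - H/2) * pi / H), the standard ERP
    weight, so distortion near the poles counts less than at the equator.
    """
    ref = np.asarray(ref, dtype=np.float64)
    dist = np.asarray(dist, dtype=np.float64)
    h = ref.shape[0]
    row_weights = np.cos((np.arange(h) + 0.5 - h / 2.0) * np.pi / h)
    # Broadcast the per-row weight over columns (and channels, if present).
    shape = (h,) + (1,) * (ref.ndim - 1)
    w = np.broadcast_to(row_weights.reshape(shape), ref.shape)
    wmse = np.sum(w * (ref - dist) ** 2) / np.sum(w)
    return 10.0 * np.log10(max_val ** 2 / wmse)
```

Rate savings "based on WS-PSNR" then means the BD-rate is computed with this spherically weighted metric in place of plain PSNR.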
Conditional Residual Coding: A Remedy for Bottleneck Problems in Conditional Inter Frame Coding
Conditional coding is a new video coding paradigm enabled by
neural-network-based compression. It can be shown that conditional coding is in
theory better than traditional residual coding, which is widely used in
video compression standards like HEVC or VVC. However, on closer inspection, it
becomes clear that conditional coders can suffer from information bottlenecks
in the prediction path, i.e., that due to the data processing inequality not
all information from the prediction signal can be passed to the reconstructed
signal, thereby impairing the coder performance. In this paper we propose the
conditional residual coding concept, which we derive from information
theoretical properties of the conditional coder. This coder significantly
reduces the influence of bottlenecks, while maintaining the theoretical
performance of the conditional coder. We provide a theoretical analysis of the
coding paradigm and demonstrate the performance of the conditional residual
coder in a practical example. We show that conditional residual coders
alleviate the disadvantages of conditional coders while being able to maintain
their advantages over residual coders. In the spectrum of residual and
conditional coding, we can therefore consider them as "the best from both
worlds".

Comment: 12 pages, 8 figures
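The information-theoretic argument sketched in the abstract can be stated compactly. With X the current frame and X̃ the prediction, a residual coder is bounded by the entropy of the residual, while a conditional coder is bounded by the conditional entropy; conditional residual coding attains the conditional bound because subtracting X̃ is invertible when X̃ is known (notation chosen here for illustration; the paper's exact derivation may differ):

```latex
\underbrace{H(X - \tilde{X})}_{\text{residual coding}}
\;\ge\;
H(X - \tilde{X} \mid \tilde{X})
\;=\;
\underbrace{H(X \mid \tilde{X})}_{\text{conditional coding}}
```

Coding the residual conditioned on the prediction thus keeps the theoretical gain of conditional coding, while the signal actually passed through the network is the low-energy residual, which is less sensitive to bottlenecks.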
Boosting Neural Image Compression for Machines Using Latent Space Masking
Today, many image coding scenarios do not have a human as the final intended
user, but rather a machine that performs computer vision tasks on the decoded
image. Hence, the primary goal is not to preserve visual quality but to
maintain the task accuracy of the machine for a given bitrate. Since deep
neural networks set the benchmark results for such tasks, they are mostly
employed to solve the analysis tasks at the decoder side.
Moreover, neural networks have also found their way into the field of image
compression recently. These two developments allow for an end-to-end training
of the neural compression network for an analysis network as information sink.
Therefore, we first conduct such a training with a task-specific loss to
enhance the coding performance of neural compression networks. Compared to
standard VVC, this method saves 41.4% of the bitrate for Mask R-CNN as the
analysis network on the uncompressed Cityscapes dataset. As our main
contribution, we propose LSMnet, a network that runs in parallel to the encoder
network and masks out elements of the latent space that are presumably not
required for the analysis network. This approach saves an additional 27.3% of
the bitrate compared to the basic neural compression network optimized
with the task loss. In addition, we are the first to utilize a feature-based
distortion in the training loss within the context of machine-to-machine
communication, which allows for a training without annotated data. We provide
extensive analyses on the Cityscapes dataset including cross-evaluation with
different analysis networks and present exemplary visual results. Inference
code and pre-trained models are published at
https://github.com/FAU-LMS/NCN_for_M2M.

Comment: 12 pages, 9 figures, 3 tables; This work has been accepted for the
IEEE T-CSVT special issue "Learned Visual Data Compression for both Human and
Machine". Copyright may be transferred without notice, after which this
version may no longer be accessible.
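The latent-masking idea can be illustrated with a toy sketch: a relevance score per latent element, which LSMnet would predict with a small network running in parallel to the encoder, is thresholded into a binary mask, and masked-out elements are zeroed so they become nearly free to entropy-code. The function name, threshold, and tensor sizes below are hypothetical; this is not the published LSMnet architecture:

```python
import numpy as np

def mask_latent(y, relevance, threshold=0.5):
    """Zero out latent elements whose predicted task relevance is below
    `threshold`. Zeroed elements cost almost no bitrate after entropy coding,
    which is where the additional rate savings come from.

    `relevance` stands in for the output of a mask-prediction network; here
    it is simply passed in as an array.
    """
    mask = (relevance > threshold).astype(y.dtype)
    return y * mask, float(mask.mean())  # masked latent, fraction kept

# Toy usage: mask a random latent tensor with random relevance scores.
rng = np.random.default_rng(0)
y = rng.normal(size=(192, 16, 16))       # latent tensor (channels, H, W)
relevance = rng.random(y.shape)
y_masked, kept = mask_latent(y, relevance)
```

In an end-to-end setup the mask predictor would be trained jointly with the task loss, so that only elements the analysis network actually needs survive.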
On Benefits and Challenges of Conditional Interframe Video Coding in Light of Information Theory
The rise of variational autoencoders for image and video compression has
opened the door to many elaborate coding techniques. One example is the
possibility of conditional interframe coding: instead of transmitting the
residual between the original frame and the predicted frame (often obtained by
motion compensation), the current frame is transmitted under the condition of
knowing the prediction signal. In practice, conditional coding can be
straightforwardly implemented using a conditional autoencoder, which has also
shown good results in recent works. In this paper, we provide an information
theoretical analysis of conditional coding for inter frames and show in which
cases gains compared to traditional residual coding can be expected. We also
show the effect of information bottlenecks which can occur in practical video
coders in the prediction signal path due to the network structure, as a
consequence of the data-processing theorem or due to quantization. We
demonstrate that conditional coding has theoretical benefits over residual
coding but that there are cases in which the benefits are quickly canceled by
small information bottlenecks of the prediction signal.

Comment: 5 pages, 4 figures, accepted to be presented at PCS 2022. arXiv admin
note: text overlap with arXiv:2112.08011. Update Note: Fixed notation in Eq.
10, no changes otherwise.
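The bottleneck effect described above follows directly from the data-processing inequality: if the decoder conditions on a processed version Z = f(X̃) of the prediction rather than on X̃ itself, e.g. because of the network structure or quantization, the achievable rate bound can only get worse (notation chosen here for illustration):

```latex
Z = f(\tilde{X})
\quad\Longrightarrow\quad
H(X \mid Z) \;\ge\; H(X \mid \tilde{X})
```

Any information about X that f discards must therefore be paid for in rate, which is how even small bottlenecks can cancel the theoretical gain of conditional coding.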
The Violent Interstellar Medium of Nearby Dwarf Galaxies
High resolution HI observations of nearby dwarf galaxies (most of which are
situated in the M 81 group at a distance of about 3.2 Mpc) reveal that their
neutral interstellar medium (ISM) is dominated by hole-like features most of
which are expanding. A comparison of the physical properties of these holes
with the ones found in more massive spiral galaxies (such as M 31 and M 33)
shows that they tend to reach much larger sizes in dwarf galaxies. This can be
understood in terms of the galaxy's gravitational potential. The origin of
these features is still a matter of debate. In general, young star-forming
regions (OB-associations) are held responsible for their formation. This
picture, however, is not without its critics, and other mechanisms such as the
infall of high velocity clouds, turbulent motions or even gamma ray bursters
have been recently proposed. Here I will present one example of a supergiant
shell in IC 2574 which corroborates the picture that OB associations are indeed
creating these structures. This particular supergiant shell is currently the
most promising case to study the combined effects of stellar winds and
supernova explosions, which shape the neutral interstellar medium of
(dwarf) galaxies.

Comment: 8 pages, 4 figures, accepted for publication in PASA, in press.
Online version: http://www.atnf.csiro.au/pasa/16_1/walter/paper
Learning Frequency-Specific Quantization Scaling in VVC for Standard-Compliant Task-driven Image Coding
Today, visual data is often analyzed by a neural network without any human
being involved, which demands specialized codecs. For standard-compliant
codec adaptations towards certain information sinks, HEVC or VVC provide the
possibility of frequency-specific quantization with scaling lists. This is a
well-known method for the human visual system, where scaling lists are derived
from psycho-visual models. In this work, we employ scaling lists when
performing VVC intra coding for neural networks as information sink. To this
end, we propose a novel data-driven method to obtain optimal scaling lists for
arbitrary neural networks. Experiments with Mask R-CNN as information sink
reveal that coding the Cityscapes dataset with the proposed scaling lists
results in peak bitrate savings of 8.9% over VVC with constant quantization. By
that, our approach also outperforms scaling lists optimized for the human
visual system. The generated scaling lists can be found under
https://github.com/FAU-LMS/VCM_scaling_lists.

Comment: Originally submitted at IEEE ICIP 202
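Mechanically, a scaling list modulates the quantization step per transform-coefficient frequency. The toy sketch below uses the HEVC/VVC convention that a scaling-list entry of 16 is neutral; it illustrates the mechanism only, not the normative integer quantization arithmetic of VVC, and the example values are made up:

```python
import numpy as np

def quantize(coeffs, qstep, scaling_list):
    """Frequency-specific quantization: each coefficient's effective step is
    the base step scaled by its scaling-list entry (16 = neutral)."""
    return np.round(coeffs / (qstep * scaling_list / 16.0))

def dequantize(levels, qstep, scaling_list):
    """Inverse mapping back to the coefficient domain."""
    return levels * qstep * scaling_list / 16.0

# Coarser quantization for the high-frequency (bottom) rows of a 2x2 block.
coeffs = np.array([[64.0, 32.0], [16.0, 8.0]])
scaling_list = np.array([[16, 16], [32, 32]])
levels = quantize(coeffs, qstep=4.0, scaling_list=scaling_list)
```

A data-driven method like the one proposed would pick the list entries to minimize the analysis network's task loss per bitrate, instead of deriving them from a psycho-visual model.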
Mixed-Initiative Planning for Manned-Unmanned Teaming Missions
The proposed presentation describes an adaptive mixed-initiative agent which assists during mission
(re-) planning to enable efficient multi-vehicle mission management. Our application comprises
future military manned-unmanned teaming missions. The mixed-initiative agent is capable
of planning and scheduling tasks of manned and unmanned aircraft. Instead of replacing the
human’s role as mission manager, however, the agent acts as an additional team member and
supports the human with task proposals and flaw corrections. The agent thus provides support
on a level that was formerly exclusively owned by human operators. The type and extent of
support is adapted to the particular situation automatically. By reducing the pilot’s work
share in the planning process, pilot mental workload can be reduced significantly. However,
the probability of lapses in plan awareness increases.
Processing Energy Modeling for Neural Network Based Image Compression
Nowadays, the compression performance of neural-network-based image
compression algorithms outperforms state-of-the-art compression approaches such
as JPEG or HEIC-based image compression. Unfortunately, most neural-network-based
compression methods are executed on GPUs and consume a high amount of
energy during execution. Therefore, this paper performs an in-depth analysis of
the energy consumption of state-of-the-art neural-network-based compression
methods on a GPU and shows that the energy consumption of compression networks
can be estimated using the image size with mean estimation errors of less than
7%. Finally, using a correlation analysis, we find that the number of
operations per pixel is the main driving force for energy consumption and
deduce that the network layers up to the second downsampling step are consuming
most energy.

Comment: 5 pages, 3 figures, accepted for IEEE International Conference on
Image Processing (ICIP) 202
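The reported finding that energy can be estimated from image size with mean errors below 7% amounts to a simple per-pixel energy model. A least-squares sketch under that assumption (the measurement values below are made up for illustration; real data would come from GPU power measurements):

```python
import numpy as np

def fit_energy_model(num_pixels, energy_joules):
    """Fit E ~ a * num_pixels + b by least squares: `a` is the per-pixel
    energy (driven, per the paper's correlation analysis, by operations per
    pixel) and `b` a fixed overhead."""
    A = np.stack([np.asarray(num_pixels, float),
                  np.ones(len(num_pixels))], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, np.asarray(energy_joules, float),
                                 rcond=None)
    return a, b

# Synthetic measurements that follow the assumed linear law exactly.
pixels = np.array([100_000, 200_000, 400_000, 800_000])
energy = 3e-5 * pixels + 2.0
a, b = fit_energy_model(pixels, energy)
```

Once fitted on a few measured images, such a model predicts the energy cost of compressing a new image from its resolution alone.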