Analysis of Neural Video Compression Networks for 360-Degree Video Coding
With the increasing efforts of bringing high-quality virtual reality
technologies into the market, efficient 360-degree video compression gains in
importance. As such, the state-of-the-art H.266/VVC video coding standard
integrates dedicated tools for 360-degree video, and considerable efforts have
been put into designing 360-degree projection formats with improved compression
efficiency. For the fast-evolving field of neural video compression networks
(NVCs), the effects of different 360-degree projection formats on the overall
compression performance have not yet been investigated. It is thus unclear
whether a resampling from the conventional equirectangular projection (ERP) to
other projection formats yields similar gains for NVCs as for hybrid video
codecs, and which formats perform best. In this paper, we analyze several
generations of NVCs and an extensive set of 360-degree projection formats with
respect to their compression performance for 360-degree video. Based on our
analysis, we find that projection format resampling yields significant
improvements in compression performance also for NVCs. The adjusted cubemap
projection (ACP) and equatorial cylindrical projection (ECP) perform
best and achieve rate savings of more than 55% compared to ERP based on WS-PSNR
for the most recent NVC. Remarkably, the observed rate savings are higher than
for H.266/VVC, emphasizing the importance of projection format resampling for
NVCs.
Comment: 5 pages, 4 figures, 1 table, accepted for Picture Coding Symposium 2024 (PCS 2024).
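Since the reported rate savings are measured in WS-PSNR, a minimal sketch of that metric may be helpful. The cosine latitude weighting for ERP frames is standard; this grayscale-only implementation is my own illustration, not the authors' code:

```python
import numpy as np

def ws_psnr(ref, dist, peak=255.0):
    """WS-PSNR for grayscale ERP frames: squared errors are weighted by
    cos(latitude) to compensate the oversampling of the equirectangular
    projection toward the poles."""
    h, w = ref.shape
    rows = np.arange(h)
    # Latitude weight of each pixel row (row centers, equator at h/2).
    wgt = np.cos((rows + 0.5 - h / 2.0) * np.pi / h)
    wmap = np.repeat(wgt[:, None], w, axis=1)
    err = (ref.astype(np.float64) - dist.astype(np.float64)) ** 2
    wmse = (wmap * err).sum() / wmap.sum()
    return 10.0 * np.log10(peak ** 2 / wmse)
```

For a constant error the weighting cancels out and WS-PSNR equals ordinary PSNR; the weights only matter when the error distribution varies with latitude.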
Conditional Residual Coding: A Remedy for Bottleneck Problems in Conditional Inter Frame Coding
Conditional coding is a new video coding paradigm enabled by
neural-network-based compression. It can be shown that conditional coding is in
theory better than the traditional residual coding, which is widely used in
video compression standards like HEVC or VVC. However, on closer inspection, it
becomes clear that conditional coders can suffer from information bottlenecks
in the prediction path, i.e., that due to the data processing inequality not
all information from the prediction signal can be passed to the reconstructed
signal, thereby impairing the coder performance. In this paper, we propose the
conditional residual coding concept, which we derive from information
theoretical properties of the conditional coder. This coder significantly
reduces the influence of bottlenecks, while maintaining the theoretical
performance of the conditional coder. We provide a theoretical analysis of the
coding paradigm and demonstrate the performance of the conditional residual
coder in a practical example. We show that conditional residual coders
alleviate the disadvantages of conditional coders while being able to maintain
their advantages over residual coders. In the spectrum of residual and
conditional coding, we can therefore consider them as ``the best from both
worlds''.
Comment: 12 pages, 8 figures.
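The core identity behind conditional residual coding can be sketched as follows (paraphrasing the information-theoretic argument of the abstract; the notation is mine): since the residual $r = x - \tilde{x}$ is an invertible function of the frame $x$ given the prediction $\tilde{x}$,

```latex
H(x - \tilde{x} \mid \tilde{x}) \;=\; H(x \mid \tilde{x}) \;\leq\; H(x - \tilde{x}),
```

so coding the residual *conditioned* on the prediction matches the rate bound of the pure conditional coder, while a degraded prediction path can at worst push the rate toward that of plain residual coding instead of corrupting the reconstruction.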
Boosting Neural Image Compression for Machines Using Latent Space Masking
Today, many image coding scenarios do not have a human as final intended
user, but rather a machine fulfilling computer vision tasks on the decoded
image. Thereby, the primary goal is not to preserve visual quality but to
maintain the task accuracy of the machine for a given bitrate. Due to the tremendous
progress of deep neural networks setting benchmarking results, mostly neural
networks are employed to solve the analysis tasks at the decoder side.
Moreover, neural networks have also found their way into the field of image
compression recently. These two developments allow for an end-to-end training
of the neural compression network for an analysis network as information sink.
Therefore, we first roll out such a training with a task-specific loss to
enhance the coding performance of neural compression networks. Compared to the
standard VVC, 41.4% of the bitrate is saved by this method for Mask R-CNN as
analysis network on the uncompressed Cityscapes dataset. As a main
contribution, we propose LSMnet, a network that runs in parallel to the encoder
network and masks out elements of the latent space that are presumably not
required for the analysis network. With this approach, an additional 27.3% of the
bitrate is saved compared to the basic neural compression network optimized
with the task loss. In addition, we are the first to utilize a feature-based
distortion in the training loss within the context of machine-to-machine
communication, which allows for a training without annotated data. We provide
extensive analyses on the Cityscapes dataset including cross-evaluation with
different analysis networks and present exemplary visual results. Inference
code and pre-trained models are published at
https://github.com/FAU-LMS/NCN_for_M2M.
Comment: 12 pages, 9 figures, 3 tables; This work has been accepted for the IEEE T-CSVT special issue "Learned Visual Data Compression for both Human and Machine". Copyright may be transferred without notice, after which this version may no longer be accessible.
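As a rough illustration of the latent-masking idea (hypothetical names, and the importance map is simply given here; the actual LSMnet predicts it with a network running in parallel to the encoder):

```python
import numpy as np

def mask_latent(y, importance, keep_ratio=0.5):
    """Zero out latent elements whose (externally provided) importance
    falls below the (1 - keep_ratio) quantile, so the entropy coder
    spends almost no rate on elements the analysis network does not need."""
    thresh = np.quantile(importance, 1.0 - keep_ratio)
    mask = importance >= thresh
    return y * mask, mask
```

Zeroed latents compress to near-zero rate under most entropy models, which is where the additional bitrate saving comes from.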
On Benefits and Challenges of Conditional Interframe Video Coding in Light of Information Theory
The rise of variational autoencoders for image and video compression has
opened the door to many elaborate coding techniques. One example here is the
possibility of conditional interframe coding. Here, instead of transmitting the
residual between the original frame and the predicted frame (often obtained by
motion compensation), the current frame is transmitted under the condition of
knowing the prediction signal. In practice, conditional coding can be
straightforwardly implemented using a conditional autoencoder, which has also
shown good results in recent works. In this paper, we provide an information
theoretical analysis of conditional coding for inter frames and show in which
cases gains compared to traditional residual coding can be expected. We also
show the effect of information bottlenecks which can occur in practical video
coders in the prediction signal path due to the network structure, as a
consequence of the data-processing theorem or due to quantization. We
demonstrate that conditional coding has theoretical benefits over residual
coding but that there are cases in which the benefits are quickly canceled by
small information bottlenecks of the prediction signal.
Comment: 5 pages, 4 figures, accepted to be presented at PCS 2022. arXiv admin note: text overlap with arXiv:2112.08011. Update note: Fixed notation in Eq. 10, no changes otherwise.
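The bottleneck effect can be made concrete with the data-processing inequality (a sketch in my own notation, not the paper's equations): if the coder can only access a processed version $z = f(\tilde{x})$ of the prediction signal, then

```latex
I(x; z) \;\leq\; I(x; \tilde{x})
\quad\Longrightarrow\quad
H(x \mid z) \;\geq\; H(x \mid \tilde{x}),
```

so any information destroyed in the prediction path raises the rate floor of the conditional coder, which is why even small bottlenecks can quickly cancel its theoretical advantage over residual coding.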
Learning Frequency-Specific Quantization Scaling in VVC for Standard-Compliant Task-driven Image Coding
Today, visual data is often analyzed by a neural network without any human
being involved, which calls for specialized codecs. For standard-compliant
codec adaptations towards certain information sinks, HEVC or VVC provide the
possibility of frequency-specific quantization with scaling lists. This is a
well-known method for the human visual system, where scaling lists are derived
from psycho-visual models. In this work, we employ scaling lists when
performing VVC intra coding for neural networks as information sink. To this
end, we propose a novel data-driven method to obtain optimal scaling lists for
arbitrary neural networks. Experiments with Mask R-CNN as information sink
reveal that coding the Cityscapes dataset with the proposed scaling lists
results in peak bitrate savings of 8.9 % over VVC with constant quantization. By
that, our approach also outperforms scaling lists optimized for the human
visual system. The generated scaling lists can be found under
https://github.com/FAU-LMS/VCM_scaling_lists.
Comment: Originally submitted to IEEE ICIP 202
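The effect of a scaling list can be pictured with a simplified model of HEVC/VVC frequency-specific quantization (16 is the neutral scaling-list value; the function name and the uniform rounding are my own simplification of the standards' quantizers):

```python
import numpy as np

def quantize_with_scaling_list(coeffs, qstep, scaling_list):
    """Frequency-specific quantization: each transform coefficient gets
    the effective step qstep * scaling_list[i, j] / 16. High-frequency
    entries with large list values are quantized more coarsely.
    Returns the reconstructed (dequantized) coefficients."""
    eff = qstep * scaling_list / 16.0
    return np.round(coeffs / eff) * eff
```

A data-driven scaling list, as proposed here, coarsens exactly those frequencies the analysis network is insensitive to, instead of those the human visual system tolerates.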
Processing Energy Modeling for Neural Network Based Image Compression
Nowadays, the compression performance of neural-network-based image
compression algorithms outperforms state-of-the-art approaches such as JPEG or
HEIC. Unfortunately, most neural-network-based compression methods are executed
on GPUs and consume a high amount of energy during execution. Therefore, this
paper performs an in-depth analysis of the energy consumption of
state-of-the-art neural-network-based compression methods on a GPU and shows
that the energy consumption of compression networks
can be estimated using the image size with mean estimation errors of less than
7%. Finally, using a correlation analysis, we find that the number of
operations per pixel is the main driving force for energy consumption and
deduce that the network layers up to the second downsampling step are consuming
most energy.
Comment: 5 pages, 3 figures, accepted for IEEE International Conference on Image Processing (ICIP) 202
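The size-based estimation described above can be pictured as a simple linear model (the function name and coefficients are hypothetical; the paper fits such models per network from GPU measurements):

```python
def estimate_energy_j(width, height, joules_per_pixel, overhead_j=0.0):
    """Energy grows with the number of processed pixels; overhead_j
    captures size-independent costs such as model setup and kernel
    launches. Both coefficients are fitted per compression network."""
    return overhead_j + joules_per_pixel * width * height
```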
Behind the NAT – A Measurement-Based Evaluation of Cellular Service Quality
Abstract—Mobile applications such as VoIP, (live) gaming, or video streaming have diverse QoS requirements ranging from low delay to high throughput. The optimization of the network quality experienced by end-users requires detailed knowledge of the expected network performance. The achieved service quality is also affected by a number of factors, including the network operator and the available technologies. However, most studies focusing on measuring the cellular network do not consider the performance implications of network configuration and management. To this end, this paper reports on an extensive data set of cellular network measurements, focused on analyzing root causes of mobile network performance variability. Measurements conducted over four weeks in a 4G cellular network in Germany show that management and configuration decisions have a substantial impact on performance. Specifically, it is observed that the association of mobile devices to a Point of Presence (PoP) within the operator’s network can influence the end-to-end RTT to a large extent. Based on the collected data, a model predicting the PoP assignment and the resulting RTT is developed, leveraging Markov chain and machine learning approaches. RTT increases of 58% to 73% compared to the optimum performance are observed in more than 57% of the measurements.
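The PoP-prediction idea can be sketched with a first-order Markov chain fitted from observed assignments (function and variable names are my own; the paper additionally employs machine-learning approaches, and the expected RTT would follow from per-PoP RTT statistics):

```python
import numpy as np

def fit_markov(pop_sequence, n_pops):
    """Estimate a first-order Markov transition matrix from an observed
    sequence of PoP assignments; row i gives P(next PoP | current PoP i).
    Rows without observations stay all-zero instead of dividing by zero."""
    counts = np.zeros((n_pops, n_pops))
    for a, b in zip(pop_sequence, pop_sequence[1:]):
        counts[a, b] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)
```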
Soda Groundwaters of the South-East of Western Siberia: Definition and Distribution
A definition of the term "soda water" is given, along with the conditions of localization of soda groundwaters in the south-east of Western Siberia and some of their chemical features.
Energy-efficiency and Performance in Communication Networks: Analyzing Energy-Performance Trade-offs in Communication Networks and their Implications on Future Network Structure and Management
The demand on communication networks has increased over the past years and is predicted to continue for the foreseeable future [Cis16]. Cellular network access with a compound annual growth rate (CAGR) of 53 % is the main area of growth [Cis16]. This affects the network quality, bringing current network technologies to their limits [Qua13]. Future network standards like 5G promise to satisfy this demand, providing a 1000-fold increase in data rates and latencies as low as 1 ms [Qua13].
With information and communications technology (ICT) causing 10 % of the global energy consumption [Mil13], the increasing demand is also reflected in a growing energy consumption of communication networks [BBD+11]. The major contributors to network power consumption are home gateways (HGWs) in the fixed access network and mobile base stations in the cellular network [VHD+11]. This trend is predicted to continue [BBD+11].
To assess and optimize the power consumption of communication networks, power models of the involved devices are required. Using these, the efficiency of proposed optimization approaches can be assessed before deployment. A number of power models of conventional network equipment for different device classes can be derived from the literature. Still, models of new device classes such as single-board computers (SBCs) and OpenFlow switches are not available. For each of these classes, representative power models of several device types are presented. Further, the power consumption caused by new communication protocols such as MultiPath TCP (MPTCP) has not been fully analyzed yet. This work is, to the best of the author’s knowledge, the first to publish SBC and OpenFlow power models, and it contributes to the understanding of MPTCP power consumption during constant bit rate (CBR) streaming.
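Device power models of this kind typically take a simple linear load-dependent form (a generic sketch under that assumption; the actual models and coefficients are fitted per device type from measurements):

```python
def device_power_w(util, p_idle_w, p_max_w):
    """Linear load-dependent power model: an idle floor plus a
    utilization-proportional share up to full load, with util in [0, 1].
    p_idle_w and p_max_w are device-specific fitted coefficients."""
    return p_idle_w + (p_max_w - p_idle_w) * util
```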
For the analysis of the power consumption, knowledge of the network performance is also required, as it defines the relative costs and the maximum number of supported users. This is well known and comparatively simple in fixed networks, but more challenging in a wireless context. A number of approaches are described in the literature and implemented as commercial software (e.g. [SSM13; OpS]), but the data required for analysis and optimization is not available. Hence, extensive measurements of the cellular network are conducted in this work. The location-based availability and performance of cellular and WiFi networks are assessed in a crowd-sensing study. Based on measurements on regional trains, predicting the cellular service quality from only the available network technology and latency is shown to be feasible. Anomalies observed within the crowd-sensing data are analyzed using dedicated, stationary measurements. The main observation is that network management decisions have significant effects on end-to-end performance. By allocating users to random points of presence (PoPs)/exit gateways of the mobile network operator (MNO), the latency compared to the best observed allocation is increased by more than 58 % in over 80 % of the time.
Combining the energy models and network performance measurements presented in this work, an energy evaluation environment is created to analyze the cost of mobile data communication. It combines the empirically determined performance of cellular and WiFi networks with the energy models of smartphones and traffic traces recorded by the participants of a crowd-sensing study. Thereby, the power consumption of the generated data patterns is established, and the effectiveness of network optimization approaches from the literature is assessed. These prove to be less potent than originally claimed by their authors, which is expected considering the improvements in cellular networks and smartphones. Nonetheless, energy savings are observed. Considering the requirement of 5G networks to reduce latency to 1 ms and improve capacity by a factor of 1000 while simultaneously reducing energy consumption, changes in fixed access networks are also required. A promising approach, assuming further virtualization of networks using software-defined networking (SDN) and network functions virtualization (NFV), is the placement of services closer to the end-users. Extrapolating the trend of increasing hardware capabilities of HGWs at almost constant cost, these may be used to provide additional services to local users. This may be achieved by, e.g., running virtualized content distribution network (CDN) nodes on HGWs, thus utilizing these often idle resources. This further equalizes the traffic within the core network by providing content locally and refreshing it during less traffic-intensive periods. Simultaneously, the end-user perceived service quality is expected to increase. Thus, installed capacities can be used longer, resulting in better service quality at fixed energy cost.