36 research outputs found

    Analysis of Neural Video Compression Networks for 360-Degree Video Coding

    Full text link
    With the increasing efforts to bring high-quality virtual reality technologies to the market, efficient 360-degree video compression is gaining in importance. As such, the state-of-the-art H.266/VVC video coding standard integrates dedicated tools for 360-degree video, and considerable effort has been put into designing 360-degree projection formats with improved compression efficiency. For the fast-evolving field of neural video compression networks (NVCs), the effects of different 360-degree projection formats on the overall compression performance have not yet been investigated. It is thus unclear whether resampling from the conventional equirectangular projection (ERP) to other projection formats yields similar gains for NVCs as for hybrid video codecs, and which formats perform best. In this paper, we analyze several generations of NVCs and an extensive set of 360-degree projection formats with respect to their compression performance for 360-degree video. Based on our analysis, we find that projection format resampling yields significant improvements in compression performance for NVCs as well. The adjusted cubemap projection (ACP) and equatorial cylindrical projection (ECP) are shown to perform best and achieve rate savings of more than 55% compared to ERP based on WS-PSNR for the most recent NVC. Remarkably, the observed rate savings are higher than for H.266/VVC, emphasizing the importance of projection format resampling for NVCs. Comment: 5 pages, 4 figures, 1 table, accepted for Picture Coding Symposium 2024 (PCS 2024)
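    The rate savings quoted above use WS-PSNR as the quality metric, which compensates for the latitude-dependent oversampling of ERP. As a reference, here is a minimal sketch of WS-PSNR for ERP frames following the standard cosine-weighting definition; function and variable names are my own, not code from the paper.

```python
# Minimal sketch of WS-PSNR for the equirectangular projection (ERP):
# per-pixel squared errors are weighted by the spherical area each
# pixel row covers, which shrinks with cos(latitude).
import numpy as np

def ws_psnr_erp(ref: np.ndarray, rec: np.ndarray, max_val: float = 255.0) -> float:
    """Weighted-spherical PSNR between two ERP frames of shape (H, W)."""
    h, w = ref.shape
    rows = np.arange(h)
    # Cosine latitude weight per pixel row, broadcast over all columns.
    weights = np.cos((rows + 0.5 - h / 2) * np.pi / h)
    weights = np.broadcast_to(weights[:, None], (h, w))
    wmse = np.sum(weights * (ref.astype(np.float64) - rec) ** 2) / np.sum(weights)
    return 10.0 * np.log10(max_val ** 2 / wmse)
```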

    Conditional Residual Coding: A Remedy for Bottleneck Problems in Conditional Inter Frame Coding

    Full text link
    Conditional coding is a new video coding paradigm enabled by neural-network-based compression. It can be shown that conditional coding is, in theory, better than the traditional residual coding widely used in video compression standards like HEVC or VVC. However, on closer inspection, it becomes clear that conditional coders can suffer from information bottlenecks in the prediction path, i.e., due to the data processing inequality, not all information from the prediction signal can be passed to the reconstructed signal, thereby impairing coder performance. In this paper, we propose the conditional residual coding concept, which we derive from information-theoretic properties of the conditional coder. This coder significantly reduces the influence of bottlenecks while maintaining the theoretical performance of the conditional coder. We provide a theoretical analysis of the coding paradigm and demonstrate the performance of the conditional residual coder in a practical example. We show that conditional residual coders alleviate the disadvantages of conditional coders while maintaining their advantages over residual coders. In the spectrum of residual and conditional coding, we can therefore consider them "the best of both worlds". Comment: 12 pages, 8 figures
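    To make the information-theoretic argument concrete, the following sketch (in my own notation: x is the current frame, \hat{y} the prediction available at encoder and decoder, H discrete entropy) contrasts the rate bounds of residual and conditional coding:

```latex
\begin{align*}
  R_{\mathrm{res}}  &\geq H(x - \hat{y}), \\
  R_{\mathrm{cond}} &\geq H(x \mid \hat{y})
                     = H(x - \hat{y} \mid \hat{y})
                     \leq H(x - \hat{y}).
\end{align*}
```

    Conditional residual coding transmits the residual r = x - \hat{y} conditioned on \hat{y}: its rate bound H(x - \hat{y} | \hat{y}) = H(x | \hat{y}) matches the conditional coder, while the prediction also enters additively, so a bottleneck on the conditioning path can no longer suppress the prediction signal itself.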

    Boosting Neural Image Compression for Machines Using Latent Space Masking

    Full text link
    Today, many image coding scenarios do not have a human as the final intended user, but rather a machine fulfilling computer vision tasks on the decoded image. The primary goal is then not to preserve visual quality but to maintain the task accuracy of the machine at a given bitrate. Due to the tremendous progress of deep neural networks, which set the benchmark results, mostly neural networks are employed to solve the analysis tasks at the decoder side. Moreover, neural networks have recently also found their way into the field of image compression. These two developments allow for end-to-end training of the neural compression network with an analysis network as information sink. Therefore, we first carry out such a training with a task-specific loss to enhance the coding performance of neural compression networks. Compared to standard VVC, this method saves 41.4 % of bitrate for Mask R-CNN as analysis network on the uncompressed Cityscapes dataset. As our main contribution, we propose LSMnet, a network that runs in parallel to the encoder network and masks out elements of the latent space that are presumably not required by the analysis network. This approach saves an additional 27.3 % of bitrate compared to the basic neural compression network optimized with the task loss. In addition, we are the first to utilize a feature-based distortion in the training loss in the context of machine-to-machine communication, which allows for training without annotated data. We provide extensive analyses on the Cityscapes dataset, including cross-evaluation with different analysis networks, and present exemplary visual results. Inference code and pre-trained models are published at https://github.com/FAU-LMS/NCN_for_M2M. Comment: 12 pages, 9 figures, 3 tables; This work has been accepted for the IEEE T-CSVT special issue "Learned Visual Data Compression for both Human and Machine". Copyright may be transferred without notice, after which this version may no longer be accessible.
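    To illustrate the masking idea, here is a toy sketch, not the published LSMnet architecture; all names, shapes, and the thresholding rule are assumptions. Latent elements judged unimportant for the analysis network are zeroed before quantization and entropy coding, so they cost (almost) no rate.

```python
# Toy sketch of latent-space masking: a parallel branch scores each
# latent element; low-scoring elements are zeroed so the entropy coder
# spends (almost) no rate on them.
import numpy as np

def mask_latent(y: np.ndarray, importance: np.ndarray, thr: float = 0.5):
    """y: latent tensor (C, H, W); importance: scores in [0, 1], same shape."""
    mask = (importance >= thr).astype(y.dtype)   # keep only "important" elements
    y_masked = y * mask                          # zeroed elements are cheap to code
    kept = mask.mean()                           # fraction of latent kept
    return y_masked, kept

# Toy usage with a random latent and random importance scores.
rng = np.random.default_rng(0)
y = rng.normal(size=(192, 16, 16)).astype(np.float32)
importance = rng.uniform(size=y.shape).astype(np.float32)
y_masked, kept = mask_latent(y, importance)
print(f"kept {kept:.1%} of latent elements")
```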

    On Benefits and Challenges of Conditional Interframe Video Coding in Light of Information Theory

    Full text link
    The rise of variational autoencoders for image and video compression has opened the door to many elaborate coding techniques. One example is conditional interframe coding: instead of transmitting the residual between the original frame and the predicted frame (often obtained by motion compensation), the current frame is transmitted under the condition of knowing the prediction signal. In practice, conditional coding can be straightforwardly implemented using a conditional autoencoder, which has also shown good results in recent works. In this paper, we provide an information-theoretic analysis of conditional coding for inter frames and show in which cases gains over traditional residual coding can be expected. We also show the effect of information bottlenecks, which can occur in the prediction signal path of practical video coders due to the network structure, as a consequence of the data processing theorem, or due to quantization. We demonstrate that conditional coding has theoretical benefits over residual coding, but that there are cases in which these benefits are quickly canceled by small information bottlenecks in the prediction signal. Comment: 5 pages, 4 figures, accepted to be presented at PCS 2022. arXiv admin note: text overlap with arXiv:2112.08011. Update note: Fixed notation in Eq. 10, no changes otherwise.
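    The bottleneck argument can be stated in one line (notation mine): if the decoder effectively sees only a processed version z = f(\hat{y}) of the prediction, the data processing inequality gives

```latex
\begin{equation*}
  I(x; z) \leq I(x; \hat{y})
  \quad\Longrightarrow\quad
  H(x \mid z) \geq H(x \mid \hat{y}),
\end{equation*}
```

    so a sufficiently lossy f can push H(x | z) above H(x - \hat{y}), at which point the conditional coder needs more rate than a plain residual coder.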

    Learning Frequency-Specific Quantization Scaling in VVC for Standard-Compliant Task-driven Image Coding

    Full text link
    Today, visual data is often analyzed by a neural network without any human being involved, which calls for specialized codecs. For standard-compliant codec adaptations towards certain information sinks, HEVC and VVC provide the possibility of frequency-specific quantization via scaling lists. This is a well-known method for the human visual system, where scaling lists are derived from psycho-visual models. In this work, we employ scaling lists when performing VVC intra coding for neural networks as information sink. To this end, we propose a novel data-driven method to obtain optimal scaling lists for arbitrary neural networks. Experiments with Mask R-CNN as information sink reveal that coding the Cityscapes dataset with the proposed scaling lists results in peak bitrate savings of 8.9 % over VVC with constant quantization. Thereby, our approach also outperforms scaling lists optimized for the human visual system. The generated scaling lists can be found at https://github.com/FAU-LMS/VCM_scaling_lists. Comment: Originally submitted at IEEE ICIP 202
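    For intuition, a minimal sketch of how a scaling list modulates the quantization step per frequency position, in the HEVC/VVC sense where a list value of 16 is neutral. The 8x8 setup, base step, and example list values are illustrative assumptions, not the learned lists from the paper.

```python
# Frequency-specific quantization with a scaling list: each transform
# coefficient is quantized with a step size scaled by its list entry,
# where the value 16 leaves the base step unchanged.
import numpy as np

def quantize_with_scaling_list(coeffs: np.ndarray, scaling_list: np.ndarray,
                               base_step: float) -> np.ndarray:
    """coeffs, scaling_list: (8, 8) arrays; returns quantized levels."""
    step = base_step * scaling_list / 16.0       # per-frequency step size
    return np.round(coeffs / step)

# Example: coarser quantization toward high frequencies that a task
# network presumably ignores (values > 16), neutral at low frequencies.
i, j = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
scaling_list = 16 + 4 * (i + j)
coeffs = np.random.default_rng(0).normal(scale=50.0, size=(8, 8))
levels = quantize_with_scaling_list(coeffs, scaling_list, base_step=2.0)
```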

    Processing Energy Modeling for Neural Network Based Image Compression

    Full text link
    Nowadays, neural-network-based image compression algorithms outperform state-of-the-art compression approaches such as JPEG- or HEIC-based image compression. Unfortunately, most neural-network-based compression methods are executed on GPUs and consume a large amount of energy during execution. Therefore, this paper performs an in-depth analysis of the energy consumption of state-of-the-art neural-network-based compression methods on a GPU and shows that the energy consumption of compression networks can be estimated from the image size with mean estimation errors of less than 7 %. Finally, using a correlation analysis, we find that the number of operations per pixel is the main driving force of energy consumption and deduce that the network layers up to the second downsampling step consume the most energy. Comment: 5 pages, 3 figures, accepted for IEEE International Conference on Image Processing (ICIP) 202
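    A sketch of the kind of size-based estimator the paper describes: fit energy as a linear function of the pixel count and report the mean relative estimation error. The sample measurements below are made up for illustration only, not values from the paper.

```python
# Fit a linear energy model E ~ a * n_pixels + b by least squares and
# evaluate the mean relative estimation error on the same samples.
import numpy as np

pixels = np.array([0.26e6, 0.92e6, 2.07e6, 8.29e6])   # image sizes in pixels
energy = np.array([1.1, 3.6, 8.4, 33.0])              # fabricated energies (J)

A = np.stack([pixels, np.ones_like(pixels)], axis=1)  # design matrix [n, 1]
(a, b), *_ = np.linalg.lstsq(A, energy, rcond=None)

estimate = a * pixels + b
mean_rel_err = np.mean(np.abs(estimate - energy) / energy)
print(f"mean relative estimation error: {mean_rel_err:.1%}")
```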

    Behind the NAT – A Measurement Based Evaluation of Cellular Service Quality

    Get PDF
    Mobile applications such as VoIP, (live) gaming, or video streaming have diverse QoS requirements, ranging from low delay to high throughput. Optimizing the network quality experienced by end-users requires detailed knowledge of the expected network performance. The achieved service quality is also affected by a number of factors, including the network operator and the available technologies. However, most studies focusing on measuring cellular networks do not consider the performance implications of network configuration and management. To this end, this paper reports on an extensive data set of cellular network measurements, focused on analyzing the root causes of mobile network performance variability. Measurements conducted over four weeks in a 4G cellular network in Germany show that management and configuration decisions have a substantial impact on performance. Specifically, it is observed that the association of mobile devices to a Point of Presence (PoP) within the operator's network can influence the end-to-end RTT to a large extent. Based on the collected data, a model predicting the PoP assignment and its resulting RTT is developed, leveraging Markov chain and machine learning approaches. RTT increases of 58 % to 73 % compared to the optimum performance are observed in more than 57 % of the measurements.
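    A toy sketch of the Markov-chain component of such a PoP prediction model; the observed sequence, the number of PoPs, and the fitting helper are invented for illustration and are not the paper's actual model.

```python
# Estimate a first-order Markov transition matrix from an observed
# sequence of PoP assignments and predict the most likely next PoP.
import numpy as np

def fit_transition_matrix(seq: list[int], n_states: int) -> np.ndarray:
    counts = np.zeros((n_states, n_states))
    for cur, nxt in zip(seq, seq[1:]):
        counts[cur, nxt] += 1
    rows = counts.sum(axis=1, keepdims=True)
    # Row-normalize to probabilities; uniform fallback for unseen states.
    return np.where(rows > 0, counts / np.maximum(rows, 1), 1.0 / n_states)

pop_sequence = [0, 0, 1, 1, 1, 2, 0, 0, 2, 2, 1, 0]   # invented PoP indices
P = fit_transition_matrix(pop_sequence, n_states=3)
next_pop = int(np.argmax(P[pop_sequence[-1]]))        # most likely next PoP
```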

    Soda Groundwater in the South-East of Western Siberia: Definition and Distribution

    Get PDF
    The paper defines the term "soda waters" and describes the conditions of localization of soda groundwaters in the south-east of Western Siberia as well as some of their chemical features.

    Energy-efficiency and Performance in Communication Networks: Analyzing Energy-Performance Trade-offs in Communication Networks and their Implications on Future Network Structure and Management

    Get PDF
    The demand on communication networks has increased over the past years and is predicted to continue growing for the foreseeable future [Cis16]. Cellular network access, with a compound annual growth rate (CAGR) of 53 %, is the main area of growth [Cis16]. This affects the network quality, bringing current network technologies to their limits [Qua13]. Future network standards like 5G promise to satisfy this demand, providing a 1000-fold increase in data rates and latencies as low as 1 ms [Qua13]. With information and communications technology (ICT) causing 10 % of the global energy consumption [Mil13], the increasing demand is also reflected in a growing energy consumption of communication networks [BBD+11]. The major contributors to network power consumption are home gateways (HGWs) in the fixed access network and mobile base stations in the cellular network [VHD+11]. This trend is predicted to continue [BBD+11].

    To assess and optimize the power consumption of communication networks, power models of the involved devices are required. Using these, the efficiency of proposed optimization approaches can be assessed before deployment. A number of power models of conventional network equipment for different device classes can be derived from the literature. Still, models of new device classes such as single-board computers (SBCs) and OpenFlow switches are not available. For each class, representative power models of several device types are presented. Furthermore, the power consumption caused by new communication protocols such as MultiPath TCP (MPTCP) has not been fully analyzed yet. This work is, to the best of the author's knowledge, the first to publish SBC and OpenFlow power models, and it contributes to the understanding of MPTCP power consumption during constant bit rate (CBR) streaming.

    Analyzing power consumption also requires knowledge of network performance, as it defines the relative costs and the maximum number of supported users. This is well understood and comparatively simple in fixed networks, but more challenging in a wireless context. A number of approaches are described in the literature and implemented as commercial software (e.g. [SSM13; OpS]), but the data required for analysis and optimization is not available. Hence, extensive measurements of the cellular network are conducted in this work. The location-based availability and performance of cellular and WiFi networks are assessed in a crowd-sensing study. Based on measurements on regional trains, it is shown that the cellular service quality can feasibly be predicted from the available network technology and latency alone. Anomalies observed within the crowd-sensing data are analyzed using dedicated, stationary measurements. The main observation is that network management decisions have significant effects on end-to-end performance. When users are allocated to random points of presence (PoPs)/exit gateways of the mobile network operator (MNO), the latency is increased by more than 58 % compared to the best observed allocation for over 80 % of the time.

    Combining the energy models and network performance measurements presented in this work, an energy evaluation environment is created to analyze the cost of mobile data communication. It combines the empirically determined performance of cellular and WiFi networks with energy models of smartphones and traffic traces recorded by the participants of a crowd-sensing study. Thereby, the power consumption of the generated data patterns is established, and the effectiveness of network optimization approaches presented in the literature is assessed. These prove to be less potent than originally claimed by the authors, which is expected considering the improvements in cellular networks and smartphones. Nonetheless, energy savings are observed.

    Considering the requirement of 5G networks to reduce latency to 1 ms and improve capacity by a factor of 1000 while simultaneously reducing energy consumption, changes in fixed access networks are also required. A promising approach, assuming further virtualization of networks using software-defined networking (SDN) and network functions virtualization (NFV), is the placement of services closer to the end-users. Extrapolating the trend of increasing hardware capabilities of HGWs at almost constant cost, these may be used to provide additional services to local users, e.g. by running virtualized content distribution network (CDN) nodes on HGWs, thus utilizing these often idle resources. This also equalizes the traffic within the core network by serving content locally and refreshing it during less traffic-intensive periods. Simultaneously, the end-user perceived service quality is expected to increase. Thus, installed capacities can be used longer, resulting in better service quality at fixed energy cost.
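    As background for such assessments, a minimal sketch of a common load-dependent device power model: a linear interpolation between idle and full-load power, as often used in the literature. The parameters below are invented, not measured values from this work.

```python
# Linear load-dependent power model: idle power plus a load-proportional
# share of the dynamic range between idle and full load.
def device_power(load: float, p_idle: float, p_max: float) -> float:
    """Power draw in watts for a utilization 'load' in [0, 1]."""
    assert 0.0 <= load <= 1.0
    return p_idle + load * (p_max - p_idle)

# Hypothetical home gateway: 5 W idle, 7 W fully loaded.
print(device_power(0.3, p_idle=5.0, p_max=7.0))   # -> 5.6 W
```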