3,524 research outputs found

    Toward Generalized Psychovisual Preprocessing For Video Encoding

    Deep perceptual preprocessing has recently emerged as a new way to enable further bitrate savings across several generations of video encoders without breaking standards or requiring any changes in client devices. In this article, we lay the foundation for a generalized psychovisual preprocessing framework for video encoding and describe one of its promising instantiations that is practically deployable for video-on-demand, live, gaming, and user-generated content (UGC). Results using state-of-the-art advanced video coding (AVC), high efficiency video coding (HEVC), and versatile video coding (VVC) encoders show that average bitrate [Bjontegaard delta-rate (BD-rate)] gains of 11%-17% are obtained over three state-of-the-art reference-based quality metrics [Netflix video multi-method assessment fusion (VMAF), structural similarity index (SSIM), and Apple advanced video quality tool (AVQT)], as well as the recently proposed nonreference International Telecommunication Union-Telecommunication (ITU-T) P.1204 metric. The proposed framework on CPU is shown to be twice as fast as x264 medium-preset encoding. On GPU hardware, our approach achieves 714 frames/sec for 1080p video (below 2 ms/frame), thereby enabling its use in very-low-latency live video or game streaming applications.
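The BD-rate gains quoted above follow the standard Bjontegaard calculation: fit each rate-quality curve, integrate over the overlapping quality range, and report the average rate difference. A minimal sketch, assuming the usual cubic-polynomial fit of log-rate against quality (the function name and fit choice are illustrative, not the authors' code):

```python
import numpy as np

def bd_rate(rate_anchor, q_anchor, rate_test, q_test):
    """Bjontegaard delta-rate: average % bitrate difference at equal quality.

    Fits a cubic polynomial of log-rate as a function of quality for each
    curve, integrates both over the overlapping quality interval, and
    returns the percentage rate difference (negative = bitrate savings).
    """
    lr_a, lr_t = np.log(rate_anchor), np.log(rate_test)
    p_a = np.polyfit(q_anchor, lr_a, 3)
    p_t = np.polyfit(q_test, lr_t, 3)
    lo = max(min(q_anchor), min(q_test))
    hi = min(max(q_anchor), max(q_test))
    int_a = np.polyval(np.polyint(p_a), hi) - np.polyval(np.polyint(p_a), lo)
    int_t = np.polyval(np.polyint(p_t), hi) - np.polyval(np.polyint(p_t), lo)
    avg_diff = (int_t - int_a) / (hi - lo)
    return (np.exp(avg_diff) - 1.0) * 100.0
```

A test curve whose bitrate is uniformly 10% below the anchor at every quality point yields a BD-rate of about -10%.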

    VIDEO PREPROCESSING BASED ON HUMAN PERCEPTION FOR TELESURGERY

    Video transmission plays a critical role in robotic telesurgery because of the high bandwidth and high quality requirement. The goal of this dissertation is to find a preprocessing method based on human visual perception for telesurgical video, so that when preprocessed image sequences are passed to the video encoder, the bandwidth can be reallocated from non-essential surrounding regions to the region of interest, ensuring excellent image quality of critical regions (e.g. the surgical region). It can also be considered as a quality control scheme that will gracefully degrade the video quality in the presence of network congestion. The proposed preprocessing method can be separated into two major parts. First, we propose a time-varying attention map whose value is highest at the gazing point and falls off progressively towards the periphery. Second, we propose adaptive spatial filtering, the parameters of which are adjusted according to the attention map. By adding visual adaptation to the spatial filtering, telesurgical video data can be compressed efficiently because of the high degree of visual redundancy removal by our algorithm. Our experimental results have shown that with the proposed preprocessing method, over half of the bandwidth can be reduced with no significant visual degradation perceptible to the observer. We have also developed an optimal parameter selecting algorithm, so that when the network bandwidth is limited, the overall visual distortion after preprocessing is minimized.
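The two-part scheme described above (gaze-centred attention map, then attention-driven spatial filtering) can be sketched as a foveated blur: blend a sharp and a blurred copy of the frame according to a Gaussian attention map. All parameter names and the Gaussian fall-off are assumptions for illustration, not the dissertation's actual filter design:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def foveated_filter(frame, gaze_xy, sigma_max=4.0, radius=100.0):
    """Blend a sharp and a blurred copy of a grayscale frame using a
    Gaussian attention map centred at the gaze point: full sharpness at
    the gaze, progressively stronger smoothing towards the periphery."""
    h, w = frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    d2 = (xs - gaze_xy[0]) ** 2 + (ys - gaze_xy[1]) ** 2
    attention = np.exp(-d2 / (2 * radius ** 2))   # 1 at gaze, -> 0 at periphery
    blurred = gaussian_filter(frame.astype(float), sigma_max)
    return attention * frame + (1 - attention) * blurred
```

Peripheral smoothing removes high-frequency detail the encoder would otherwise spend bits on, which is how bandwidth is reallocated toward the region of interest.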

    Deep perceptual preprocessing for video coding

    We introduce the concept of rate-aware deep perceptual preprocessing (DPP) for video encoding. DPP makes a single pass over each input frame in order to enhance its visual quality when the video is to be compressed with any codec at any bitrate. The resulting bitstreams can be decoded and displayed at the client side without any post-processing component. DPP comprises a convolutional neural network that is trained via a composite set of loss functions that incorporates: (i) a perceptual loss based on a trained no-reference image quality assessment model, (ii) a reference-based fidelity loss expressing L1 and structural similarity aspects, (iii) a motion-based rate loss via block-based transform, quantization and entropy estimates that converts the essential components of standard hybrid video encoder designs into a trainable framework. Extensive testing using multiple quality metrics and AVC, AV1 and VVC encoders shows that DPP+encoder reduces, on average, the bitrate of the corresponding encoder by 11%. This marks the first time a server-side neural processing component achieves such savings over the state-of-the-art in video coding.
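The composite objective (i)-(iii) can be sketched as a weighted sum of a no-reference perceptual term, a reference-based fidelity term, and an estimated rate penalty. This NumPy stand-in only shows how the terms combine; the weights are illustrative, the SSIM component of (ii) is omitted for brevity, and `nr_quality` stands in for the trained no-reference quality model:

```python
import numpy as np

def composite_loss(pre, ref, rate_est, nr_quality, w=(1.0, 1.0, 0.05)):
    """Composite DPP-style objective (sketch): maximize predicted
    no-reference quality, stay faithful to the reference (L1), and
    penalize the estimated coding rate. Weights `w` are illustrative."""
    perceptual = -nr_quality(pre)            # higher predicted quality -> lower loss
    fidelity = np.mean(np.abs(pre - ref))    # L1 fidelity term
    return w[0] * perceptual + w[1] * fidelity + w[2] * rate_est
```

In the actual framework each term is differentiable so the convolutional network can be trained end-to-end; the rate term in particular comes from a trainable virtualization of transform, quantization and entropy coding.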

    Perceptually-aware bilateral filtering for quality improvement in low bit rate video coding

    Proceedings of: Picture Coding Symposium (PCS 2012). Krakow, Poland, May 7-9, 2012. Perceptual coding has become of great interest in modern video coding due to the need for higher compression rates. Many previous works have been carried out to incorporate perceptual information into hybrid video encoders, either modifying the quantization parameter according to a certain perceptual resource allocation map or preprocessing video sequences to remove information that is not perceptually relevant. The first strategy is limited by the presence of blocking artifacts and the second lacks adaptation to video content. In this paper, a novel and simple approach is proposed, which performs a smart filtering prior to the encoding process, preserving both the structural and motion information. The experiments prove that the proposed method, implemented on an H.264 encoder, significantly improves its perceptual quality at low bit rates.
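The core building block here, a bilateral filter, smooths texture while preserving edges by weighting neighbours with both a spatial and a range (intensity) Gaussian. A naive, unoptimized sketch (parameter values are illustrative; the paper's perceptually-aware variant additionally adapts these weights to content):

```python
import numpy as np

def bilateral(img, sigma_s=2.0, sigma_r=0.1, radius=3):
    """Naive bilateral filter on a grayscale image in [0, 1]: each output
    pixel is a spatial-Gaussian x range-Gaussian weighted mean of its
    neighbourhood, smoothing texture while preserving strong edges."""
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    pad = np.pad(img.astype(float), radius, mode='edge')
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs ** 2 + ys ** 2) / (2 * sigma_s ** 2))
    for i in range(h):
        for j in range(w):
            patch = pad[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            rng = np.exp(-(patch - img[i, j]) ** 2 / (2 * sigma_r ** 2))
            wgt = spatial * rng
            out[i, j] = (wgt * patch).sum() / wgt.sum()
    return out
```

Removing perceptually irrelevant texture before encoding reduces the residual energy the encoder must spend bits on, without the blocking artifacts of quantization-map approaches.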

    An adaptive perception-based image preprocessing method

    The aim of this paper is to introduce an adaptive preprocessing procedure based on human perception in order to increase the performance of some standard image processing techniques. Specifically, image frequency content has been weighted by the corresponding value of the contrast sensitivity function, in agreement with the sensitivity of the human eye to the different image frequencies and contrasts. The 2D rational-dilation wavelet transform has been employed for representing image frequencies. In fact, it provides an adaptive and flexible multiresolution framework, enabling an easy and straightforward adaptation to the image frequency content. Preliminary experimental results show that the proposed preprocessing allows us to increase the performance of some standard image enhancement algorithms in terms of visual quality and often also in terms of PSNR.
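CSF weighting of frequency content can be illustrated with the classic Mannos-Sakrison contrast sensitivity model. In this sketch a plain 2-D DFT stands in for the paper's rational-dilation wavelet transform, and the mapping from digital frequency to cycles/degree (`cpd_at_nyquist`) is an assumed viewing geometry:

```python
import numpy as np

def csf_mannos(f):
    """Mannos-Sakrison contrast sensitivity function, f in cycles/degree."""
    return 2.6 * (0.0192 + 0.114 * f) * np.exp(-(0.114 * f) ** 1.1)

def csf_weight(img, cpd_at_nyquist=30.0):
    """Weight the image's frequency content by the CSF. A 2-D DFT stands
    in for the rational-dilation wavelet transform used in the paper;
    cpd_at_nyquist encodes an assumed viewing distance/resolution."""
    F = np.fft.fft2(img)
    fy = np.fft.fftfreq(img.shape[0])   # cycles/pixel in [-0.5, 0.5)
    fx = np.fft.fftfreq(img.shape[1])
    FY, FX = np.meshgrid(fy, fx, indexing='ij')
    f_cpd = np.hypot(FY, FX) / 0.5 * cpd_at_nyquist
    return np.real(np.fft.ifft2(F * csf_mannos(f_cpd)))
```

Frequencies the eye is most sensitive to (a few cycles/degree) are emphasized, while very low and very high frequencies are attenuated, matching the band-pass shape of the CSF.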

    Investigation of the effectiveness of video quality metrics in video pre-processing.

    This paper presents an investigation of the effectiveness of current video quality measurement metrics in measuring variations in perceptual video quality of pre-processed video. The results show that full-reference video quality metrics are not effective in detecting variations in perceptual video quality. No-reference metrics, however, show better performance than full-reference metrics; in particular, the Naturalness Image Quality Evaluator (NIQE) is notably better at detecting perceptual quality variations.

    Quality-Oriented Perceptual HEVC Based on the Spatiotemporal Saliency Detection Model

    Perceptual video coding (PVC) can provide a lower bitrate with the same visual quality compared with traditional H.265/high efficiency video coding (HEVC). In this work, a novel H.265/HEVC-compliant PVC framework is proposed based on the video saliency model. Firstly, an effective and efficient spatiotemporal saliency model is used to generate a video saliency map. Secondly, a perceptual coding scheme is developed based on the saliency map: a saliency-based quantization control algorithm is proposed to reduce the bitrate. Finally, the simulation results demonstrate that the proposed perceptual coding scheme shows its superiority in objective and subjective tests, achieving up to a 9.46% bitrate reduction with negligible subjective and objective quality loss. A key advantage of the proposed method is that it maintains high quality, making it well suited to high-definition video applications.
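Saliency-based quantization control typically maps per-block saliency to a QP offset: salient blocks get a lower QP (finer quantization), background blocks a higher one. A minimal sketch, assuming a linear mapping and illustrative offset range (not the paper's exact algorithm):

```python
import numpy as np

def saliency_qp_map(saliency, base_qp=32, max_offset=6):
    """Map per-CTU saliency in [0, 1] to an HEVC QP: saliency 1 lowers
    QP by max_offset (finer quantization), saliency 0 raises it by the
    same amount, and the result is clipped to the legal QP range."""
    offset = np.round((0.5 - saliency) * 2 * max_offset).astype(int)
    return np.clip(base_qp + offset, 0, 51)   # HEVC QP range is 0..51
```

Since roughly 6 QP steps halve or double the quantization step size, shifting bits from background to salient regions can cut overall bitrate with little perceived loss.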

    Escaping the complexity-bitrate-quality barriers of video encoders via deep perceptual optimization

    We extend the concept of learnable video precoding (rate-aware neural-network processing prior to encoding) to deep perceptual optimization (DPO). Our framework comprises a pixel-to-pixel convolutional neural network that is trained based on the virtualization of core encoding blocks (block transform, quantization, block-based prediction) and multiple loss functions representing rate, distortion and visual quality of the virtual encoder. We evaluate our proposal with AVC/H.264 and AV1 under per-clip rate-quality optimization. The results show that DPO offers, on average, 14.2% bitrate reduction over AVC/H.264 and 12.5% bitrate reduction over AV1. Our framework is shown to improve both distortion- and perception-oriented metrics in a consistent manner, exhibiting only 3% outliers, which correspond to content with peculiar characteristics. Thus, DPO is shown to offer complexity-bitrate-quality tradeoffs that go beyond what conventional video encoders can offer.

    Encoding in the Dark Grand Challenge: An Overview

    A big part of the video content we consume from video providers consists of genres featuring low-light aesthetics. Low-light sequences have special characteristics, such as spatio-temporally varying acquisition noise and light flickering, that make the encoding process challenging. To deal with the spatio-temporally incoherent noise, higher bitrates are used to achieve high objective quality. Additionally, quality assessment metrics and methods have not been designed, trained or tested for this type of content. This has inspired us to trigger research in this area and propose a Grand Challenge on encoding low-light video sequences. In this paper, we present an overview of the proposed challenge and evaluate state-of-the-art methods that will serve as benchmarks when assessing the participants' deliverables. From this exploration, our results show that VVC already achieves high performance compared to simply denoising the video source prior to encoding. Moreover, the quality of the video streams can be further improved by employing a post-processing image enhancement method.

    Fast compressed domain watermarking of MPEG multiplexed streams

    In this paper, a new technique for watermarking of MPEG compressed video streams is proposed. The watermarking scheme operates directly in the domain of MPEG multiplexed streams. Perceptual models are used during the embedding process in order to preserve the quality of the video. The watermark is embedded in the compressed domain and is detected without the use of the original video sequence. Experimental evaluation demonstrates that the proposed scheme is able to withstand a variety of attacks. The resulting watermarking system is very fast and reliable, and is suitable for copyright protection and real-time content authentication applications.
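Blind, compressed-domain embedding of the kind described above can be illustrated by hiding one bit in the parity of a quantized DCT coefficient, so that detection needs no original. This is a generic textbook sketch, not the paper's scheme; the coefficient index and parity rule are illustrative:

```python
import numpy as np

def embed_bit(block, bit, idx=5):
    """Embed one watermark bit in a vector of quantized DCT coefficients
    by forcing the parity of a mid-frequency coefficient (idx is an
    illustrative zig-zag position, not the paper's choice)."""
    b = block.copy()
    if int(b[idx]) % 2 != bit:
        b[idx] += 1   # minimal change to flip parity
    return b

def detect_bit(block, idx=5):
    """Blind detection: read the parity back, no original video needed."""
    return int(block[idx]) % 2
```

Because the modification is applied to already-quantized coefficients, no full decode/re-encode cycle is required, which is what makes such schemes fast enough for real-time authentication.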