3,224 research outputs found

    The Visual Centrifuge: Model-Free Layered Video Representations

    True video understanding requires making sense of non-Lambertian scenes where the color of light arriving at the camera sensor encodes information about not just the last object it collided with, but about multiple mediums: colored windows, dirty mirrors, smoke or rain. Layered video representations have the potential to model realistic scenes accurately, but have so far required stringent assumptions on motion, lighting and shape. Here we propose a learning-based approach for multi-layered video representation: we introduce novel uncertainty-capturing 3D convolutional architectures and train them to separate blended videos. We show that these models then generalize to single videos, where they exhibit interesting abilities: color constancy, factoring out shadows and separating reflections. We present quantitative and qualitative results on real-world videos.
    Comment: Appears in: 2019 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2019). This arXiv contains the CVPR Camera Ready version of the paper (although we have included larger figures) as well as an appendix detailing the model architectur

    Complexity management of H.264/AVC video compression.

    The H.264/AVC video coding standard offers significantly improved compression efficiency and flexibility compared to previous standards. However, the high computational complexity of H.264/AVC is a problem for codecs running on low-power handheld devices and general-purpose computers. This thesis presents new techniques to reduce, control and manage the computational complexity of an H.264/AVC codec. A new complexity reduction algorithm for H.264/AVC is developed. This algorithm predicts "skipped" macroblocks prior to motion estimation by estimating a Lagrange rate-distortion cost function. Complexity savings are achieved by not processing the macroblocks that are predicted as "skipped". The Lagrange multiplier is adaptively modelled as a function of the quantisation parameter and video sequence statistics. Simulation results show that this algorithm achieves significant complexity savings with a negligible loss in rate-distortion performance. The complexity reduction algorithm is further developed to achieve complexity-scalable control of the encoding process. The Lagrangian cost estimation is extended to incorporate computational complexity. A target level of complexity is maintained by using a feedback algorithm to update the Lagrange multiplier associated with complexity. Results indicate that scalable complexity control of the encoding process can be achieved whilst maintaining near-optimal complexity-rate-distortion performance. A complexity management framework is proposed for maximising the perceptual quality of coded video in a real-time, processing-power-constrained environment. A real-time frame-level control algorithm and a per-frame complexity control algorithm are combined in order to manage the encoding process such that a high frame rate is maintained without significantly losing frame quality. Subjective evaluations show that the managed complexity approach results in higher perceptual quality compared to a reference encoder that drops frames in computationally constrained situations. These novel algorithms are likely to be useful in implementing real-time H.264/AVC standard encoders in computationally constrained environments such as low-power mobile devices and general-purpose computers.
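
The skip prediction and complexity feedback described in this abstract can be illustrated with a small sketch. It is not the thesis's algorithm: the cost form J = D + lambda * R is the standard Lagrangian rate-distortion cost, but the zero-rate assumption for skipped macroblocks, the proportional update rule, and all names (`predict_skip`, `ComplexityController`, `gain`) are assumptions made here for illustration.

```python
from dataclasses import dataclass


def lagrangian_cost(distortion: float, rate_bits: float, lam: float) -> float:
    """Standard Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def predict_skip(skip_distortion: float, est_coded_distortion: float,
                 est_coded_bits: float, lam: float) -> bool:
    """Predict 'skipped' when the estimated skip cost does not exceed the
    estimated cost of coding the macroblock.

    Assumption: a skipped macroblock is treated as costing zero bits.
    """
    j_skip = lagrangian_cost(skip_distortion, 0.0, lam)
    j_coded = lagrangian_cost(est_coded_distortion, est_coded_bits, lam)
    return j_skip <= j_coded


@dataclass
class ComplexityController:
    """Hypothetical proportional feedback on a complexity multiplier."""
    target: float        # target complexity, e.g. fraction of macroblocks processed
    gain: float = 0.1    # feedback gain (assumed value)
    lam_c: float = 0.0   # Lagrange multiplier associated with complexity

    def update(self, measured: float) -> float:
        # Raise lam_c when measured complexity exceeds the target, lower it
        # otherwise; clamp at zero so the multiplier never becomes negative.
        error = (measured - self.target) / self.target
        self.lam_c = max(0.0, self.lam_c + self.gain * error)
        return self.lam_c
```

In this sketch a larger `lam` makes coded bits more expensive and therefore pushes more macroblocks towards the skip decision, while the controller nudges `lam_c` after each frame according to how far the measured complexity was from the target.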

    On the design of multimedia architectures : proceedings of a one-day workshop, Eindhoven, December 18, 2003


    Deep perceptual preprocessing for video coding

    We introduce the concept of rate-aware deep perceptual preprocessing (DPP) for video encoding. DPP makes a single pass over each input frame in order to enhance its visual quality when the video is to be compressed with any codec at any bitrate. The resulting bitstreams can be decoded and displayed at the client side without any post-processing component. DPP comprises a convolutional neural network that is trained via a composite set of loss functions that incorporates: (i) a perceptual loss based on a trained no-reference image quality assessment model, (ii) a reference-based fidelity loss expressing L1 and structural similarity aspects, and (iii) a motion-based rate loss via block-based transform, quantization and entropy estimates that converts the essential components of standard hybrid video encoder designs into a trainable framework. Extensive testing using multiple quality metrics and AVC, AV1 and VVC encoders shows that DPP+encoder reduces, on average, the bitrate of the corresponding encoder by 11%. This marks the first time a server-side neural processing component achieves such savings over the state-of-the-art in video coding.
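
One way to picture how the three loss terms listed in this abstract might be combined is a weighted sum. The sketch below is an assumption, not the paper's formulation: the weights, the `alpha` blend between the L1 and structural-similarity parts, and the function names are hypothetical, and the perceptual, SSIM and rate values are taken as already-computed scalars.

```python
def l1_loss(reference, output):
    """Mean absolute error between two equal-length sequences of pixel values."""
    return sum(abs(r - o) for r, o in zip(reference, output)) / len(reference)


def composite_loss(perceptual, fidelity_l1, ssim, rate,
                   w_perc=1.0, w_fid=1.0, w_rate=1.0, alpha=0.5):
    """Weighted sum of perceptual, fidelity and rate terms (all weights assumed).

    The fidelity term blends the L1 error with (1 - SSIM), so that a higher
    structural similarity lowers the loss.
    """
    fidelity = alpha * fidelity_l1 + (1.0 - alpha) * (1.0 - ssim)
    return w_perc * perceptual + w_fid * fidelity + w_rate * rate
```

Raising `w_rate` in such a scheme would favour lower estimated bitrate at the expense of the perceptual and fidelity terms.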

    Two-Pass Rate Control for Improved Quality of Experience in UHDTV Delivery


    Investigating low-bitrate, low-complexity H.264 region of interest techniques in error-prone environments

    The H.264/AVC video coding standard leverages advanced compression methods to provide a significant increase in performance over previous CODECs in terms of picture quality, bitrate, and flexibility. The specification itself provides several profiles and levels that allow customization through the use of various advanced features. In addition to these features, several new video coding techniques have been developed since the standard's inception. One such technique known as Region of Interest (RoI) coding has been in existence since before H.264's formalization, and several means of implementing RoI coding in H.264 have been proposed. Region of Interest coding operates under the assumption that one or more regions of a sequence have higher priority than the rest of the video. One goal of RoI coding is to provide a decrease in bitrate without significant loss of perceptual quality, and this is particularly applicable to low complexity environments, if the proper implementation is used. Furthermore, RoI coding may allow for enhanced error resilience in the selected regions if desired, making RoI suitable for both low-bitrate and error-prone scenarios. The goal of this thesis project was to examine H.264 Region of Interest coding as it applies to such scenarios. A modified version of the H.264 JM Reference Software was created in which all non-Baseline profile features were removed. Six low-complexity RoI coding techniques, three targeting rate control and three targeting error resilience, were selected for implementation. Error and distortion modeling tools were created to enhance the quality of experimental data. Results were gathered by varying a range of coding parameters including frame size, target bitrate, and macroblock error rates. Methods were then examined based on their rate-distortion curves, ability to achieve target bitrates accurately, and per-region distortions where applicable.