
    Loss-resilient Coding of Texture and Depth for Free-viewpoint Video Conferencing

    Free-viewpoint video conferencing allows a participant to observe the remote 3D scene from any freely chosen viewpoint. An intermediate virtual viewpoint image is commonly synthesized via depth-image-based rendering (DIBR) using two pairs of transmitted texture and depth maps from two neighboring captured viewpoints. To maintain high quality in the synthesized images, it is imperative to contain the adverse effects of network packet losses that may arise during texture and depth video transmission. Towards this end, we develop an integrated approach that exploits the representation redundancy inherent in the multiple streamed videos: a voxel in the 3D scene that is visible to two captured views is sampled and coded twice, once in each view. In particular, at the receiver we first develop an error concealment strategy that adaptively blends corresponding pixels in the two captured views during DIBR, so that pixels from the more reliably transmitted view are weighted more heavily. We then couple it with a sender-side optimization of reference picture selection (RPS) during real-time video coding, so that blocks containing samples of voxels visible in both views are coded more error-resiliently in one view only, given that adaptive blending will conceal errors in the other view. Further, we analyze the sensitivity of synthesized view distortion to texture versus depth errors, so that the relative importance of texture and depth code blocks can be computed for system-wide RPS optimization. Experimental results show that the proposed scheme can outperform the use of a traditional feedback channel by up to 0.82 dB on average at an 8% packet loss rate, and by as much as 3 dB for particular frames.
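
    As a rough illustration of the receiver-side blending step, the following Python sketch weights corresponding warped pixels by per-pixel reliability maps, so that the more reliably transmitted view dominates each synthesized pixel. The function name and inputs are hypothetical, and real DIBR warping and occlusion handling are omitted.

```python
import numpy as np

def blend_views(tex_a, tex_b, rel_a, rel_b):
    """Adaptively blend two DIBR-warped texture views.

    tex_a, tex_b: HxWx3 float arrays, textures already warped to the
                  virtual viewpoint.
    rel_a, rel_b: HxW per-pixel reliability maps in [0, 1], e.g. lower
                  where packet losses corrupted that source view.
    """
    # Weight each pixel by the reliability of its source view, so the
    # more reliably transmitted view contributes more to the result.
    w_a = rel_a[..., None]
    w_b = rel_b[..., None]
    total = w_a + w_b
    # Guard against division by zero where both views are unusable.
    total = np.where(total > 0, total, 1.0)
    return (w_a * tex_a + w_b * tex_b) / total
```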

    Study and simulation of low rate video coding schemes

    The semiannual report is included. Topics covered include communication, information science, data compression, remote sensing, color-mapped images, a robust coding scheme for packet video, recursively indexed differential pulse code modulation, an image compression technique for use on token ring networks, and joint source/channel coder design.

    Adaptive sensing and optimal power allocation for wireless video sensors with sigma-delta imager

    We consider optimal power allocation for wireless video sensors (WVSs), including the image sensor subsystem in the system analysis. By assigning a power-rate-distortion (P-R-D) characteristic to the image sensor, we build a comprehensive P-R-D optimization framework for WVSs. For a WVS node operating under a power budget, we propose power allocation among the image sensor, compression, and transmission modules in order to minimize the distortion of the video reconstructed at the receiver. To demonstrate the proposed optimization method, we establish a P-R-D model for an image sensor based upon a pixel-level sigma-delta (ΣΔ) image sensor design that allows investigation of the tradeoff between the bit depth of the captured images and the spatio-temporal characteristics of the video sequence under the power constraint. The optimization results obtained in this setting confirm that including the image sensor in the system optimization procedure can improve the overall video quality under a power constraint and prolong the lifetime of the WVSs. In particular, when the available power budget for a WVS node falls below a threshold, adaptive sensing becomes necessary to ensure that the node communicates useful information about the video content while meeting its power budget.
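
    The allocation problem can be sketched as a small constrained optimization: minimize total distortion subject to the node's power budget. The convex surrogate below is illustrative only, not the paper's measured P-R-D model; the coefficients and three-way module split are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def distortion(p, a=(2.0, 1.0, 1.5)):
    """Toy convex surrogate: each module's distortion contribution
    decays with the power it receives (illustrative, not the paper's
    measured P-R-D model)."""
    p_sense, p_comp, p_tx = p
    return a[0] / p_sense + a[1] / p_comp + a[2] / p_tx

budget = 1.0  # total node power (normalized)

res = minimize(
    distortion,
    x0=np.full(3, budget / 3),      # start from an even split
    bounds=[(1e-3, budget)] * 3,    # every module needs some power
    constraints={"type": "ineq",    # p_sense + p_comp + p_tx <= budget
                 "fun": lambda p: budget - p.sum()},
)
print("allocation (sense, compress, transmit):", res.x)
```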

    Scalable image quality assessment with 2D mel-cepstrum and machine learning approach

    Measurement of image quality is of fundamental importance to numerous image and video processing applications. Objective image quality assessment (IQA) is a two-stage process comprising: (a) extraction of important information and discarding of the redundant, and (b) pooling of the detected features using appropriate weights. These two stages are not easy to tackle due to the complex nature of the human visual system (HVS). In this paper, we first investigate image features based on the two-dimensional (2D) mel-cepstrum for the purpose of IQA. It is shown that these features are effective since they can represent the structural information, which is crucial for IQA. Moreover, they are also beneficial in a reduced-reference scenario, where only partial reference image information is used for quality assessment. We address the second issue by exploiting machine learning. In our opinion, the well-established methodology of machine learning/pattern recognition has not been adequately used for IQA so far; we believe that it will be an effective tool for feature pooling, since the required weights/parameters can be determined in a more convincing way via training with ground truth obtained from subjective scores. This helps to overcome the limitations of existing pooling methods, which tend to be oversimplified and lack theoretical justification. Therefore, we propose a new metric by formulating IQA as a pattern recognition problem. Extensive experiments conducted using six publicly available image databases (3211 images in total, with diverse distortions) and one video database (with 78 video sequences) demonstrate the effectiveness and efficiency of the proposed metric in comparison with seven relevant existing metrics.
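
    As a rough sketch of the first stage, the snippet below computes low-order 2D cepstral coefficients of an image. The mel-scale frequency warping used in the paper is omitted for brevity, so this approximates the feature rather than reproducing the paper's exact definition.

```python
import numpy as np

def cepstrum_2d(img, keep=8):
    """Low-order 2D cepstral coefficients: the inverse 2D FFT of the
    log magnitude spectrum. (Mel-scale warping omitted for brevity.)"""
    spec = np.fft.fft2(img.astype(np.float64))
    log_mag = np.log(np.abs(spec) + 1e-8)   # epsilon avoids log(0)
    ceps = np.real(np.fft.ifft2(log_mag))
    # Low-order coefficients capture coarse structural information,
    # which the paper identifies as crucial for quality assessment.
    return ceps[:keep, :keep].ravel()

features = cepstrum_2d(np.random.rand(64, 64))  # placeholder image
```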

    Domain-Specific Fusion Of Objective Video Quality Metrics

    Video processing algorithms like video upscaling, denoising, and compression are now increasingly optimized for perceptual quality metrics instead of signal distortion. This means that they may score well on metrics like video multi-method assessment fusion (VMAF), but this may be a result of overfitting to the metric. This imposes the need for costly subjective quality assessments that cannot scale to large datasets and large parameter explorations. We propose a methodology that fuses multiple quality metrics based on small-scale subjective testing in order to unlock their use at scale for specific application domains of interest. This is achieved by employing pseudo-random sampling of the resolution, quality range, and test video content available, initially guided by quality metrics in order to cover the quality range useful to each application. The selected samples then undergo a subjective test, such as ITU-T P.910 absolute categorical rating, with the results of the test postprocessed and used to derive the best combination of multiple objective metrics using support vector regression. We showcase the benefits of this approach in two applications: video encoding with and without perceptual preprocessing, and deep video denoising & upscaling of compressed content. For both applications, the derived fusion of metrics allows for a more robust alignment to mean opinion scores than a perceptually uninformed combination of the original metrics themselves. The dataset and code are available at https://github.com/isize-tech/VideoQualityFusion.
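
    The fusion step itself can be sketched as a standard support vector regression from a vector of per-clip metric scores to mean opinion scores. The data below are random placeholders; the paper's sampling, postprocessing, and training protocol are more elaborate.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Placeholder training data: per-clip scores from several objective
# metrics (e.g. VMAF, PSNR, SSIM) and MOS from a small subjective test.
metric_scores = np.random.rand(60, 3)   # 60 clips x 3 metrics
mos = 1.0 + 4.0 * np.random.rand(60)    # MOS on a [1, 5] scale

# Learn the fusion by regressing MOS on the metric vector.
fusion = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0))
fusion.fit(metric_scores, mos)

# The fused quality score for new content is the regressor's output.
fused = fusion.predict(np.array([[0.8, 0.6, 0.9]]))
```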

    RBF-Based QP Estimation Model for VBR Control in H.264/SVC

    In this paper we propose a novel variable bit rate (VBR) controller for real-time H.264/scalable video coding (SVC) applications. The proposed VBR controller relies on the fact that consecutive pictures within the same scene often exhibit similar degrees of complexity, and consequently should be encoded using similar quantization parameter (QP) values for the sake of quality consistency. In order to prevent unnecessary QP fluctuations, the proposed VBR controller allows for just an incremental variation of QP with respect to that of the previous picture, focusing on the design of an effective method for estimating this QP variation. The implementation in H.264/SVC requires locating a rate controller at each dependency layer (spatial or coarse-grain scalability). In particular, the QP increment estimation at each layer is computed by means of a radial basis function (RBF) network that is specially designed for this purpose. Furthermore, the RBF network design process was conceived to provide an effective solution for a wide range of practical real-time VBR applications for scalable video content delivery. In order to assess the proposed VBR controller, two real-time application scenarios were simulated: mobile live streaming and IPTV broadcast. It was compared to constant-QP encoding and a recently proposed constant bit rate (CBR) controller for H.264/SVC. The experimental results show that the proposed method achieves remarkably consistent quality, outperforming the reference CBR controller in both scenarios for all the spatio-temporal resolutions considered. Funded by project CCG10-UC3M/TIC-5570 of the Comunidad Autónoma de Madrid and Universidad Carlos III de Madrid.
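
    The incremental-QP idea can be sketched with a minimal Gaussian RBF network that maps picture-complexity features to a bounded QP increment. Everything below (feature choice, network size, parameters) is a random stand-in; the paper trains a purpose-built network per dependency layer.

```python
import numpy as np

class RBFNet:
    """Minimal Gaussian RBF network with a linear output layer.
    Centers, width, and weights would be trained offline; random
    values stand in here."""
    def __init__(self, centers, width, weights):
        self.centers = centers  # (n_rbf, n_features)
        self.width = width      # shared Gaussian width
        self.weights = weights  # (n_rbf,) output-layer weights

    def __call__(self, x):
        d2 = ((self.centers - x) ** 2).sum(axis=1)
        phi = np.exp(-d2 / (2.0 * self.width ** 2))
        return float(self.weights @ phi)

rng = np.random.default_rng(0)
net = RBFNet(rng.normal(size=(5, 3)), 1.0, rng.normal(size=5))

qp_prev = 30
feats = np.array([0.2, -0.1, 0.05])  # e.g. complexity/buffer cues
delta_qp = int(round(np.clip(net(feats), -2, 2)))  # bounded increment
qp_curr = qp_prev + delta_qp  # QP varies only incrementally per picture
```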