673 research outputs found

    Loss-resilient Coding of Texture and Depth for Free-viewpoint Video Conferencing

    Full text link
    Free-viewpoint video conferencing allows a participant to observe the remote 3D scene from any freely chosen viewpoint. An intermediate virtual viewpoint image is commonly synthesized using two pairs of transmitted texture and depth maps from two neighboring captured viewpoints via depth-image-based rendering (DIBR). To maintain high quality of synthesized images, it is imperative to contain the adverse effects of network packet losses that may arise during texture and depth video transmission. Towards this end, we develop an integrated approach that exploits the representation redundancy inherent in the multiple streamed videos a voxel in the 3D scene visible to two captured views is sampled and coded twice in the two views. In particular, at the receiver we first develop an error concealment strategy that adaptively blends corresponding pixels in the two captured views during DIBR, so that pixels from the more reliable transmitted view are weighted more heavily. We then couple it with a sender-side optimization of reference picture selection (RPS) during real-time video coding, so that blocks containing samples of voxels that are visible in both views are more error-resiliently coded in one view only, given adaptive blending will erase errors in the other view. Further, synthesized view distortion sensitivities to texture versus depth errors are analyzed, so that relative importance of texture and depth code blocks can be computed for system-wide RPS optimization. Experimental results show that the proposed scheme can outperform the use of a traditional feedback channel by up to 0.82 dB on average at 8% packet loss rate, and by as much as 3 dB for particular frames

    Modeling and Evaluating Feedback-Based Error Control for Video Transfer

    Get PDF
    Packet loss can be detrimental to real-time interactive video over lossy networks because one lost video packet can propagate errors to many subsequent video frames due to the encoding dependency between frames. Feedback-based error control techniques use feedback information from the decoder to adjust coding parameters at the encoder or retransmit lost packets to reduce the error propagation due to data loss. Feedback-based error control techniques have been shown to be more effective than trying to conceal the error at the encoder or decoder alone since they allow the encoder and decoder to cooperate in the error control process. However, there has been no systematic exploration of the impact of video content and network conditions on the performance of feedback-based error control techniques. In particular, the impact of packet loss, round-trip delay, network capacity constraint, video motion and reference distance on the quality of videos using feedback-based error control techniques have not been systematically studied. This thesis presents analytical models for the major feedback-based error control techniques: Retransmission, Reference Picture Selection (both NACK and ACK modes) and Intra Update. These feedback-based error control techniques have been included in H.263/H.264 and MPEG4, the state of the art video in compression standards. Given a round-trip time, packet loss rate, network capacity constraint, our models can predict the quality for a streaming video with retransmission, Intra Update and RPS over a lossy network. In order to exploit our analytical models, a series of studies has been conducted to explore the effect of reference distance, capacity constraint and Intra coding on video quality. The accuracy of our analytical models in predicting the video quality under different network conditions is validated through simulations. These models are used to examine the behavior of feedback-based error control schemes under a variety of network conditions and video content through a series of analytic experiments. Analysis shows that the performance of feedback-based error control techniques is affected by a variety of factors including round-trip time, loss rate, video content and the Group of Pictures (GOP) length. In particular: 1) RPS NACK achieves the best performance when loss rate is low while RPS ACK outperforms other repair techniques when loss rate is high. However RPS ACK performs the worst when loss rate is low. Retransmission performs the worst when the loss rate is high; 2) for a given round-trip time, the loss rate where RPS NACK performs worse than RPS ACK is higher for low motion videos than it is for high motion videos; 3) Videos with RPS NACK always perform the same or better than videos without repair. However, when small GOP sizes are used, videos without repair perform better than videos with RPS ACK; 4) RPS NACK outperform Intra Update for low-motion videos. However, the performance gap between RPS NACK and Intra Update drops when the round-trip time or the intensity of video motion increases. 5) Although the above trends hold for both VQM and PSNR, when VQM is the video quality metric the performance results are much more sensitive to network loss. 6) Retransmission is effective only when the round-trip time is low. When the round-trip time is high, Partial Retransmission achieves almost the same performance as Full Retransmission. These insights derived from our models can help determine appropriate choices for feedback-based error control techniques under various network conditions and video content

    Error concealment techniques for H.264/MVC encoded sequences

    Get PDF
    This work is partially funded by the Strategic Educational Pathways Scholarship Scheme (STEPS-Malta). This scholarship is partly financed by the European Union–European Social Fund (ESF 1.25).The H.264/MVC standard offers good compression ratios for multi-view sequences by exploiting spatial, temporal and interview image dependencies. This works well in error-free channels, however in the event of transmission errors, it leads to the propagation of the distorted macro-blocks, degrading the quality of experience of the user. This paper reviews the state-of-the-art error concealment solutions and proposes a low complexity concealment method that can be used with multi-view video coding. The error resilience techniques used to aid error concealment are also identified. Results obtained demonstrate that good multi-view video reconstruction can be obtained with this approach.peer-reviewe

    Enhanced spatial error concealment with directional entropy based interpolation switching

    Get PDF

    GRACE: Loss-Resilient Real-Time Video through Neural Codecs

    Full text link
    In real-time video communication, retransmitting lost packets over high-latency networks is not viable due to strict latency requirements. To counter packet losses without retransmission, two primary strategies are employed -- encoder-based forward error correction (FEC) and decoder-based error concealment. The former encodes data with redundancy before transmission, yet determining the optimal redundancy level in advance proves challenging. The latter reconstructs video from partially received frames, but dividing a frame into independently coded partitions inherently compromises compression efficiency, and the lost information cannot be effectively recovered by the decoder without adapting the encoder. We present a loss-resilient real-time video system called GRACE, which preserves the user's quality of experience (QoE) across a wide range of packet losses through a new neural video codec. Central to GRACE's enhanced loss resilience is its joint training of the neural encoder and decoder under a spectrum of simulated packet losses. In lossless scenarios, GRACE achieves video quality on par with conventional codecs (e.g., H.265). As the loss rate escalates, GRACE exhibits a more graceful, less pronounced decline in quality, consistently outperforming other loss-resilient schemes. Through extensive evaluation on various videos and real network traces, we demonstrate that GRACE reduces undecodable frames by 95% and stall duration by 90% compared with FEC, while markedly boosting video quality over error concealment methods. In a user study with 240 crowdsourced participants and 960 subjective ratings, GRACE registers a 38% higher mean opinion score (MOS) than other baselines

    Edge-guided image gap interpolation using multi-scale transformation

    Get PDF
    This paper presents improvements in image gap restoration through the incorporation of edge-based directional interpolation within multi-scale pyramid transforms. Two types of image edges are reconstructed: 1) the local edges or textures, inferred from the gradients of the neighboring pixels and 2) the global edges between image objects or segments, inferred using a Canny detector. Through a process of pyramid transformation and downsampling, the image is progressively transformed into a series of reduced size layers until at the pyramid apex the gap size is one sample. At each layer, an edge skeleton image is extracted for edge-guided interpolation. The process is then reversed; from the apex, at each layer, the missing samples are estimated (an iterative method is used in the last stage of upsampling), up-sampled, and combined with the available samples of the next layer. Discrete cosine transform and a family of discrete wavelet transforms are utilized as alternatives for pyramid construction. Evaluations over a range of images, in regular and random loss pattern, at loss rates of up to 40%, demonstrate that the proposed method improves peak-signal-to-noise-ratio by 1–5 dB compared with a range of best-published works
    • …
    corecore