80 research outputs found

    Perceptual Quality Assessment Based on Visual Attention Analysis

    Most existing quality metrics do not take human visual attention into account. Attention to particular objects or regions is an important attribute of the human visual system and strongly influences perceived image and video quality. This paper presents an approach for extracting visual attention regions based on a combination of a bottom-up saliency model and semantic image analysis. The use of PSNR (Peak Signal-to-Noise Ratio) and SSIM (Structural SIMilarity) within the extracted attention regions is analyzed for image/video quality assessment, and a novel quality metric is proposed that fully exploits the available visual attention information. Experimental results against subjective measurements demonstrate that the proposed metric outperforms current methods.
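The region-based use of PSNR described above can be sketched as follows: a plain full-frame PSNR next to a variant whose squared error is weighted by a saliency map, so that distortion inside attention regions dominates the score. This is a minimal illustration of the idea, not the paper's actual metric; the function names and the normalized-weighting scheme are assumptions.

```python
import numpy as np

def psnr(ref, dist, peak=255.0):
    """Peak Signal-to-Noise Ratio between a reference and a distorted image."""
    mse = np.mean((ref.astype(np.float64) - dist.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

def attention_weighted_psnr(ref, dist, saliency, peak=255.0):
    """PSNR with the squared error weighted by a normalized saliency map,
    so errors in high-attention regions dominate (illustrative assumption)."""
    w = saliency / saliency.sum()  # weights sum to 1
    mse = np.sum(w * (ref.astype(np.float64) - dist.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)
```

With a saliency map concentrated on the distorted area, the weighted score drops below the plain PSNR, which is exactly the sensitivity to attention regions the abstract argues for.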

    Leveraging progressive model and overfitting for efficient learned image compression

    Deep learning has been overwhelmingly dominant in computer vision and image/video processing for the last decade. For image and video compression, however, it still lags behind traditional techniques based on the discrete cosine transform (DCT) and linear filters. Built on top of an autoencoder architecture, learned image compression (LIC) systems have drawn enormous attention in recent years. Nevertheless, proposed LIC systems remain inferior to state-of-the-art traditional techniques, such as the Versatile Video Coding (VVC/H.266) standard, in either compression performance or decoding complexity. Although claimed to outperform VVC/H.266 on a limited bit rate range, some proposed LIC systems take over 40 seconds to decode a 2K image on a GPU system. In this paper, we introduce a powerful and flexible LIC framework with a multi-scale progressive (MSP) probability model and a latent representation overfitting (LOF) technique. With different predefined profiles, the proposed framework can reach various trade-off points between compression efficiency and computational complexity. Experiments show that the proposed framework achieves 2.5%, 1.0%, and 1.3% Bjontegaard delta bit rate (BD-rate) reduction over the VVC/H.266 standard on three benchmark datasets over a wide bit rate range. More importantly, the decoding complexity is reduced from O(n) to O(1) compared to many other LIC systems, resulting in over 20 times speedup when decoding 2K images.
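The BD-rate figures quoted above are, in their common form, computed by fitting a cubic polynomial to log-bitrate as a function of PSNR for each codec and averaging the rate gap over the overlapping quality range. The sketch below implements that generic procedure under those assumptions; it is not the authors' evaluation code.

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Bjontegaard delta bit rate (percent): average bitrate difference of the
    test codec relative to the anchor at equal quality. Fits a cubic of
    log-rate as a function of PSNR and integrates over the overlapping PSNR
    range (a common sketch of the metric, not the exact reference tooling)."""
    lr_a = np.log(np.asarray(rate_anchor, dtype=float))
    lr_t = np.log(np.asarray(rate_test, dtype=float))
    p_a = np.polyfit(psnr_anchor, lr_a, 3)
    p_t = np.polyfit(psnr_test, lr_t, 3)
    # Overlapping quality interval of the two rate-distortion curves.
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    int_a = np.polyval(np.polyint(p_a), hi) - np.polyval(np.polyint(p_a), lo)
    int_t = np.polyval(np.polyint(p_t), hi) - np.polyval(np.polyint(p_t), lo)
    avg_log_diff = (int_t - int_a) / (hi - lo)
    return (np.exp(avg_log_diff) - 1.0) * 100.0
```

A sanity check on the sketch: a test codec spending exactly 90% of the anchor's bitrate at every quality point yields a BD-rate of about -10%.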

    Tune-in Time Reduction in Video Streaming Over DVB-H


    Learned Enhancement Filters for Image Coding for Machines

    Machine-to-machine (M2M) communication applications and use cases, such as object detection and instance segmentation, are becoming mainstream. As a consequence, the majority of multimedia content is likely to be consumed by machines in the coming years, which opens up new challenges in compressing this type of data efficiently. Two main directions are being explored in the literature: one based on existing traditional codecs, such as the Versatile Video Coding (VVC) standard, that are optimized for human-targeted use cases, and another based on end-to-end trained neural networks. However, traditional codecs have significant benefits over end-to-end learned codecs in terms of interoperability, real-time decoding, and availability of hardware implementations. Therefore, in this paper, we propose learned post-processing filters targeted at enhancing the performance of machine vision tasks on images reconstructed by the VVC codec. The proposed enhancement filters provide significant improvements on the target tasks compared to VVC-coded images. The conducted experiments show that the proposed post-processing filters provide about 45% and 49% Bjontegaard Delta rate gains over VVC in instance segmentation and object detection tasks, respectively.
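The placement of such a filter in the pipeline can be illustrated with a toy residual filter applied after decoding: the reconstructed image plus a small filtered correction. The real enhancement filters are deep networks trained on a machine-task loss; the single 3x3 kernel below is a hypothetical stand-in that only shows where the post-processing step sits relative to the VVC decoder.

```python
import numpy as np

def enhance(decoded, kernel, bias=0.0):
    """Apply a single 3x3 residual filter to a decoded grayscale image:
    out = decoded + conv(decoded, kernel) + bias, clipped to [0, 255].
    A learned filter would replace this with a trained CNN (assumption)."""
    h, w = decoded.shape
    padded = np.pad(decoded.astype(np.float64), 1, mode="edge")
    residual = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):          # accumulate the 3x3 correlation
        for dx in range(3):
            residual += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return np.clip(decoded + residual + bias, 0.0, 255.0)
```

With an all-zero kernel the filter is an identity, which is a convenient initialization for a residual post-filter: training then only has to learn the correction on top of the decoder output.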

    Error-Resilient Communication Using the H.264/AVC Video Coding Standard

    The Advanced Video Coding standard (H.264/AVC) has become a widely deployed coding technique used in numerous products and services. H.264/AVC utilizes predictive coding to achieve a high compression ratio. However, predictive coding also makes H.264/AVC bitstreams vulnerable to transmission errors, as prediction causes temporal and spatial propagation of the degradations that transmission errors introduce. Due to the delay constraints of real-time video communication applications, transmission errors cannot usually be tackled by reliable communication protocols. Yet, most networks are susceptible to transmission errors. Consequently, error resilience techniques are needed to combat transmission errors in real-time H.264/AVC-based video communication. The thesis presents methods to improve the error robustness of H.264/AVC in real-time video communication applications. The presented methods can be grouped into three topics: isolated regions, sub-sequences and interleaved transmission, and encoder-assisted error concealment. In addition to improved error resilience, it is shown that the sub-sequence technique improves compression efficiency compared to non-hierarchical temporal scalability and non-scalable bitstreams. A part of the research work presented in this thesis was targeted at the H.264/AVC standard. Specifically, isolated regions, sub-sequences, and the presented encoder-assisted error concealment methods were adopted into H.264/AVC, and the interleaved transmission feature was included in the specification for real-time carriage of H.264/AVC bitstreams over the Internet Protocol.


    Error resilient video coding using unequally protected key pictures

    This paper proposes the use of unequally protected key pictures to prevent temporal error propagation in error-prone video communications. A key picture may be either an intra-coded picture or an inter-coded picture that uses a long-term motion compensation reference selected through reference picture selection; an inter-coded key picture uses previous key pictures as its motion compensation reference. Key pictures are protected more strongly than other pictures using forward error correction in either source or transport coding. Simulation results show significantly improved error resilience for the proposed technique compared to conventional methods.