589 research outputs found

    PEA265: Perceptual Assessment of Video Compression Artifacts

    Full text link
    The most widely used video encoders share a common hybrid coding framework that includes block-based motion estimation/compensation and block-based transform coding. Despite their high coding efficiency, the encoded videos often exhibit visually annoying artifacts, denoted as Perceivable Encoding Artifacts (PEAs), which significantly degrade the visual Qualityof- Experience (QoE) of end users. To monitor and improve visual QoE, it is crucial to develop subjective and objective measures that can identify and quantify various types of PEAs. In this work, we make the first attempt to build a large-scale subjectlabelled database composed of H.265/HEVC compressed videos containing various PEAs. The database, namely the PEA265 database, includes 4 types of spatial PEAs (i.e. blurring, blocking, ringing and color bleeding) and 2 types of temporal PEAs (i.e. flickering and floating). Each containing at least 60,000 image or video patches with positive and negative labels. To objectively identify these PEAs, we train Convolutional Neural Networks (CNNs) using the PEA265 database. It appears that state-of-theart ResNeXt is capable of identifying each type of PEAs with high accuracy. Furthermore, we define PEA pattern and PEA intensity measures to quantify PEA levels of compressed video sequence. We believe that the PEA265 database and our findings will benefit the future development of video quality assessment methods and perceptually motivated video encoders.Comment: 10 pages,15 figures,4 table

    Deep CNN Model for Non-Screen Content and Screen Content Image Quality Assessment

    Get PDF
    In the current world, user experience in various platforms matters a lot for different organizations. But providing a better experience can be challenging if the multimedia content on online platforms is having different kinds of distortions which impact the overall experience of the user. There can be various reasons behind distortions such as compression or minimal lighting condition while taking photos. In this work, a deep CNN-based Non-Screen Content and Screen Content NR-IQA framework is proposed which solves this issue in a more effective way. The framework is known as DNSSCIQ. Two different architectures are proposed based upon the input image type whether the input is a screen content or non-screen content image. This work attempts to solve this by evaluating the quality of such image

    Basic Binary Convolution Unit for Binarized Image Restoration Network

    Full text link
    Lighter and faster image restoration (IR) models are crucial for the deployment on resource-limited devices. Binary neural network (BNN), one of the most promising model compression methods, can dramatically reduce the computations and parameters of full-precision convolutional neural networks (CNN). However, there are different properties between BNN and full-precision CNN, and we can hardly use the experience of designing CNN to develop BNN. In this study, we reconsider components in binary convolution, such as residual connection, BatchNorm, activation function, and structure, for IR tasks. We conduct systematic analyses to explain each component's role in binary convolution and discuss the pitfalls. Specifically, we find that residual connection can reduce the information loss caused by binarization; BatchNorm can solve the value range gap between residual connection and binary convolution; The position of the activation function dramatically affects the performance of BNN. Based on our findings and analyses, we design a simple yet efficient basic binary convolution unit (BBCU). Furthermore, we divide IR networks into four parts and specially design variants of BBCU for each part to explore the benefit of binarizing these parts. We conduct experiments on different IR tasks, and our BBCU significantly outperforms other BNNs and lightweight models, which shows that BBCU can serve as a basic unit for binarized IR networks. All codes and models will be released.Comment: ICLR2023, code is available at https://github.com/Zj-BinXia/BBC

    Super-resolution assessment and detection

    Get PDF
    Super Resolution (SR) techniques are powerful digital manipulation tools that have significantly impacted various industries due to their ability to enhance the resolution of lower quality images and videos. Yet, the real-world adaptation of SR models poses numerous challenges, which blind SR models aim to overcome by emulating complex real-world degradations. In this thesis, we investigate these SR techniques, with a particular focus on comparing the performance of blind models to their non-blind counterparts under various conditions. Despite recent progress, the proliferation of SR techniques raises concerns about their potential misuse. These methods can easily manipulate real digital content and create misrepresentations, which highlights the need for robust SR detection mechanisms. In our study, we analyze the limitations of current SR detection techniques and propose a new detection system that exhibits higher performance in discerning real and upscaled videos. Moreover, we conduct several experiments to gain insights into the strengths and weaknesses of the detection models, providing a better understanding of their behavior and limitations. Particularly, we target 4K videos, which are rapidly becoming the standard resolution in various fields such as streaming services, gaming, and content creation. As part of our research, we have created and utilized a unique dataset in 4K resolution, specifically designed to facilitate the investigation of SR techniques and their detection

    A Dive into SAM Prior in Image Restoration

    Full text link
    The goal of image restoration (IR), a fundamental issue in computer vision, is to restore a high-quality (HQ) image from its degraded low-quality (LQ) observation. Multiple HQ solutions may correspond to an LQ input in this poorly posed problem, creating an ambiguous solution space. This motivates the investigation and incorporation of prior knowledge in order to effectively constrain the solution space and enhance the quality of the restored images. In spite of the pervasive use of hand-crafted and learned priors in IR, limited attention has been paid to the incorporation of knowledge from large-scale foundation models. In this paper, we for the first time leverage the prior knowledge of the state-of-the-art segment anything model (SAM) to boost the performance of existing IR networks in an parameter-efficient tuning manner. In particular, the choice of SAM is based on its robustness to image degradations, such that HQ semantic masks can be extracted from it. In order to leverage semantic priors and enhance restoration quality, we propose a lightweight SAM prior tuning (SPT) unit. This plug-and-play component allows us to effectively integrate semantic priors into existing IR networks, resulting in significant improvements in restoration quality. As the only trainable module in our method, the SPT unit has the potential to improve both efficiency and scalability. We demonstrate the effectiveness of the proposed method in enhancing a variety of methods across multiple tasks, such as image super-resolution and color image denoising.Comment: Technical Repor
    • …
    corecore