Definition of masks related to psychovisual features for video quality assessment
Video quality assessment needs to correspond to human perception. Pixel-based metrics such as PSNR or MSE fail in many circumstances because they do not take into account the spatio-temporal properties of human visual perception. In this paper we propose a new pixel-weighted method to improve video quality metrics for artifact evaluation. The method applies a psychovisual model based on motion, level of detail, pixel location and the appearance of human faces, which brings the metric closer to the human eye's response. Subjective tests were developed to adjust the psychovisual model and demonstrate the noticeable improvement of an algorithm when the pixels are weighted according to the factors analysed instead of being treated equally. The analysis demonstrates the need for models adapted to the specific visualization of contents, and the proposed model represents an advance in quality assessment for sequences in which a given artifact is analysed.
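The pixel-weighting idea above can be sketched as a weighted MSE/PSNR, where each pixel's squared error is scaled by a perceptual weight map. This is a minimal illustration, not the paper's method: the actual psychovisual model (motion, detail, location, face cues) is replaced here by an arbitrary weight array supplied by the caller.

```python
# Sketch of a pixel-weighted quality metric. The weight map is a placeholder
# for the paper's psychovisual model (motion, detail, location, face regions).
import numpy as np

def weighted_mse(ref, dist, weights):
    """MSE where each pixel's squared error is scaled by a perceptual weight."""
    w = weights / weights.sum()          # normalise weights to sum to 1
    return float(np.sum(w * (ref - dist) ** 2))

def weighted_psnr(ref, dist, weights, peak=255.0):
    """PSNR computed from the weighted MSE."""
    mse = weighted_mse(ref, dist, weights)
    return float(10 * np.log10(peak ** 2 / mse)) if mse > 0 else float("inf")
```

With uniform weights this reduces to ordinary MSE/PSNR; raising the weights over perceptually salient regions (e.g. a detected face) makes errors there count for more, which is the core of the proposed approach.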
A Detail Based Method for Linear Full Reference Image Quality Prediction
In this paper, a novel Full Reference method is proposed for image quality
assessment, using the combination of two separate metrics to measure the
perceptually distinct impact of detail losses and of spurious details. To this
purpose, the gradient of the impaired image is locally decomposed as a
predicted version of the original gradient, plus a gradient residual. It is
assumed that the detail attenuation identifies the detail loss, whereas the
gradient residuals describe the spurious details. It turns out that the
perceptual impact of detail losses is roughly linear with the loss of the
positional Fisher information, while the perceptual impact of the spurious
details is roughly proportional to a logarithmic measure of the signal to
residual ratio. The affine combination of these two metrics forms a new index
strongly correlated with the empirical Differential Mean Opinion Score (DMOS)
for a significant class of image impairments, as verified for three independent
popular databases. The method allowed alignment and merging of DMOS data coming
from these different databases to a common DMOS scale by affine
transformations. Unexpectedly, the DMOS scale setting is possible by the
analysis of a single image affected by additive noise.
Comment: 15 pages, 9 figures. Copyright notice: the paper was accepted for
publication in the IEEE Transactions on Image Processing on 19/09/2017 and the
copyright has been transferred to the IEEE.
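The decomposition described above can be sketched in simplified form: project the gradient of the impaired image onto the original gradient (here a single global least-squares fit rather than the paper's local decomposition), treat the attenuation as detail loss and the residual energy as spurious detail. The function names and the global fit are illustrative assumptions; the affine combination's coefficients, fit to DMOS in the paper, are omitted.

```python
# Simplified sketch of the gradient decomposition: impaired gradient =
# a * original gradient + residual, fit globally by least squares.
import numpy as np

def gradients(img):
    gy, gx = np.gradient(img.astype(float))   # row- and column-wise derivatives
    return gx, gy

def detail_index(ref, dist, eps=1e-6):
    gx_r, gy_r = gradients(ref)
    gx_d, gy_d = gradients(dist)
    # Least-squares attenuation factor of the original gradient field
    num = np.sum(gx_r * gx_d + gy_r * gy_d)
    den = np.sum(gx_r ** 2 + gy_r ** 2) + eps
    a = num / den
    # Residual gradient: what the attenuated original does not explain
    res_x, res_y = gx_d - a * gx_r, gy_d - a * gy_r
    detail_loss = 1.0 - a                      # surrogate for lost detail
    srr = den / (np.sum(res_x ** 2 + res_y ** 2) + eps)
    spurious = -np.log10(srr)                  # grows with residual energy
    return detail_loss, spurious
```

A contrast-attenuated image yields a large detail-loss term and a near-zero residual, whereas additive noise leaves the attenuation near one but inflates the residual term, matching the two perceptually distinct impairments the abstract separates.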
Multi-Task Learning Approach for Natural Images' Quality Assessment
Blind image quality assessment (BIQA) is a method to predict the quality of a natural image without a reference image. Current BIQA models typically learn their predictions separately for different image distortions, ignoring the relationship between the learning tasks. As a result, a BIQA model may have great prediction performance for natural images affected by one particular type of distortion but be less effective when tested on others. In this paper, we propose to address this limitation by training our BIQA model simultaneously under different distortion conditions using a multi-task learning (MTL) technique. Given a set of training images, our Multi-Task Learning based Image Quality assessment (MTL-IQ) model first extracts spatial-domain BIQA features. The features are then used as input to a trace-norm regularisation based MTL framework to learn prediction models for different distortion classes simultaneously. For a test image of a known distortion, MTL-IQ selects a specific trained model to predict the image's quality score. For a test image of an unknown distortion, MTL-IQ first estimates the amount of each distortion present in the image using a support vector classifier. The probability estimates are then used to weigh the image prediction scores from the different trained models, and the weighted scores are pooled to obtain the final image quality score. Experimental results on standard image quality assessment (IQA) databases show that MTL-IQ is highly correlated with human perceptual measures of image quality. It also achieves higher prediction performance in both overall and individual distortion cases than current BIQA models.
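The pooling step for unknown distortions can be sketched as follows: each distortion-specific model produces a score, and the classifier's probability estimates weight those scores. The helper name is hypothetical; the per-model scores and probabilities would come from the trained MTL models and the support vector classifier described above.

```python
# Sketch of MTL-IQ's score pooling for an image of unknown distortion.
import numpy as np

def pooled_score(per_model_scores, distortion_probs):
    """Weight each distortion-specific model's score by the estimated
    probability of that distortion, then sum the weighted scores."""
    s = np.asarray(per_model_scores, dtype=float)
    p = np.asarray(distortion_probs, dtype=float)
    p = p / p.sum()                    # ensure a proper probability vector
    return float(np.dot(p, s))
```

When the classifier is certain (one probability near 1), pooling reduces to selecting that distortion's model, consistent with the known-distortion case in the abstract.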
DeepWSD: Projecting Degradations in Perceptual Space to Wasserstein Distance in Deep Feature Space
Existing deep learning-based full-reference IQA (FR-IQA) models usually
predict the image quality in a deterministic way by explicitly comparing the
features, gauging how severely distorted an image is by how far the
corresponding feature lies from the space of the reference images. Herein, we
look at this problem from a different viewpoint and propose to model the
quality degradation in perceptual space from a statistical distribution
perspective. As such, the quality is measured based upon the Wasserstein
distance in the deep feature domain. More specifically, the 1D Wasserstein
distance at each stage of the pre-trained VGG network is measured, and from
these distances the final quality score is computed. The deep Wasserstein distance
(DeepWSD) performed on features from neural networks enjoys better
interpretability of the quality contamination caused by various types of
distortions and presents an advanced quality prediction capability. Extensive
experiments and theoretical analysis show the superiority of the proposed
DeepWSD in terms of both quality prediction and optimization.
Comment: accepted by ACM Multimedia 2022.
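The core quantity above has a simple closed form in 1D: for two equal-sized empirical samples, the Wasserstein-1 distance is the mean absolute difference of their sorted values. The sketch below illustrates that computation and the per-stage averaging; it stands in for, and is much simpler than, the actual DeepWSD pipeline, which extracts the features from a pre-trained VGG network.

```python
# Sketch of a 1D Wasserstein-1 distance between deep-feature distributions.
import numpy as np

def w1_1d(a, b):
    """W1 between two equal-sized empirical samples: after sorting, it is
    the mean absolute difference of the matched order statistics."""
    a, b = np.sort(np.ravel(a)), np.sort(np.ravel(b))
    return float(np.mean(np.abs(a - b)))

def deepwsd_score(ref_feats, dist_feats):
    """Average the per-stage 1D Wasserstein distances over (hypothetical)
    feature maps taken at successive stages of a pre-trained network."""
    return float(np.mean([w1_1d(r, d) for r, d in zip(ref_feats, dist_feats)]))
```

Because the distance compares distributions rather than aligned feature values, two feature maps whose activations are permutations of each other score zero, which is the statistical (rather than deterministic, point-wise) view of degradation the abstract argues for.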