Efficient No-Reference Quality Assessment and Classification Model for Contrast Distorted Images
In this paper, an efficient Minkowski Distance based Metric (MDM) for no-reference (NR) quality assessment of contrast-distorted images is proposed. It is shown that higher orders of the Minkowski distance, together with entropy, provide accurate quality prediction for contrast-distorted images. The proposed metric performs prediction by extracting only three features from the distorted image, followed by a regression analysis. Furthermore, the proposed features are able to classify the type of contrast distortion with high accuracy. Experimental results on four datasets (CSIQ, TID2013, CCID2014, and SIQAD) show that the proposed metric, despite its very low complexity, provides better quality predictions than state-of-the-art NR metrics. The MATLAB source code of the proposed metric is publicly available at http://www.synchromedia.ca/system/files/MDM.zip.
Comment: 6 pages, 4 figures, 4 tables.
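A minimal Python sketch of the kind of pipeline the abstract describes follows. The specific Minkowski orders, the histogram entropy, and the SVR regressor are illustrative assumptions, not the authors' exact MDM definitions (those are in the MATLAB code linked above).

```python
import numpy as np
from sklearn.svm import SVR

def mdm_features(gray, orders=(2, 4)):
    """Three hand-crafted features from a grayscale image in [0, 1]:
    two higher-order Minkowski-distance-style moments (assumed orders)
    plus the Shannon entropy of the intensity histogram."""
    x = gray.astype(np.float64).ravel()
    mu = x.mean()
    feats = [np.mean(np.abs(x - mu) ** p) ** (1.0 / p) for p in orders]
    hist, _ = np.histogram(x, bins=256, range=(0.0, 1.0))
    p_hist = hist / (hist.sum() + 1e-12)
    nz = p_hist[p_hist > 0]
    feats.append(float(-np.sum(nz * np.log2(nz))))
    return np.asarray(feats)

def train_quality_model(images, mos_scores):
    """Map the three features to subjective scores with a regressor
    (SVR used here as a stand-in for the paper's regression analysis)."""
    X = np.stack([mdm_features(im) for im in images])
    return SVR(kernel="rbf").fit(X, mos_scores)
```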
Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment
We present a deep neural network-based approach to image quality assessment (IQA). The network is trained end-to-end and comprises ten convolutional layers and five pooling layers for feature extraction, and two fully connected layers for regression, which makes it significantly deeper than related IQA models. Unique features of the proposed architecture are that 1) with slight adaptations it can be used in a no-reference (NR) as well as in a full-reference (FR) IQA setting, and 2) it allows for joint learning of local quality and local weights, i.e., the relative importance of local quality to the global quality estimate, in a unified framework. Our approach is purely data-driven and does not rely on hand-crafted features or other prior domain knowledge about the human visual system or image statistics. We evaluate the proposed approach on the LIVE, CSIQ, and TID2013 databases as well as the LIVE In the Wild Image Quality Challenge database and show performance superior to state-of-the-art NR and FR IQA methods. Finally, cross-database evaluation shows a strong ability to generalize between databases, indicating high robustness of the learned features.
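The joint local-quality/local-weight pooling described above amounts to a weighted average of patch scores. A small sketch of that final pooling step, with the network's predicted per-patch qualities and weights passed in as arrays (the network itself is not reproduced here):

```python
import numpy as np

def weighted_global_quality(patch_quality, patch_weight, eps=1e-8):
    """Global score = weighted average of per-patch quality estimates,
    with the weights expressing each patch's relative importance."""
    w = np.maximum(np.asarray(patch_weight, dtype=np.float64), 0.0)
    q = np.asarray(patch_quality, dtype=np.float64)
    return float((w * q).sum() / (w.sum() + eps))
```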
NIMA: Neural Image Assessment
Automatically learned quality assessment for images has recently become a hot topic due to its usefulness in a wide variety of applications, such as evaluating image capture pipelines, storage techniques, and sharing media. Despite the subjective nature of this problem, most existing methods only predict the mean opinion score provided by datasets such as AVA [1] and TID2013 [2]. Our approach differs in that we predict the distribution of human opinion scores using a convolutional neural network. Our architecture also has the advantage of being significantly simpler than other methods with comparable performance. The proposed approach relies on the success (and retraining) of proven, state-of-the-art deep object recognition networks. The resulting network can be used not only to score images reliably, with high correlation to human perception, but also to assist with the adaptation and optimization of photo editing/enhancement algorithms in a photographic pipeline. All of this is done without the need for a "golden" reference image, allowing for single-image, semantic- and perceptually-aware, no-reference quality assessment.
Comment: IEEE Transactions on Image Processing 201
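Since the model predicts a distribution over opinion scores rather than a single value, the mean and spread are read off that distribution, and training typically compares predicted and ground-truth distributions. A sketch of that readout and an earth-mover's-distance style loss (the 1-10 score bins and the exact loss form are assumptions here, not taken verbatim from the paper):

```python
import numpy as np

def mean_and_std_from_distribution(p, scores=np.arange(1, 11)):
    """Summary statistics of a predicted opinion-score distribution."""
    p = np.asarray(p, dtype=np.float64)
    p = p / p.sum()
    mean = float((scores * p).sum())
    std = float(np.sqrt(((scores - mean) ** 2 * p).sum()))
    return mean, std

def emd_loss(p_pred, p_true, r=2):
    """Earth mover's distance between two normalized score distributions,
    computed from the difference of their cumulative distributions."""
    cdf_diff = np.cumsum(p_pred) - np.cumsum(p_true)
    return float(np.mean(np.abs(cdf_diff) ** r) ** (1.0 / r))
```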
Blind Predicting Similar Quality Map for Image Quality Assessment
A key problem in blind image quality assessment (BIQA) is how to effectively model the properties of the human visual system in a data-driven manner. In this paper, we propose a simple and efficient BIQA model based on a novel framework that consists of a fully convolutional neural network (FCNN) and a pooling network. In principle, the FCNN is capable of predicting a pixel-by-pixel similar quality map from only a distorted image, by learning from the intermediate similarity maps derived from conventional full-reference image quality assessment methods. The predicted pixel-by-pixel quality maps are consistent with the distortion correlations between the reference and distorted images. Finally, a deep pooling network regresses the quality map into a score. Experiments demonstrate that our predictions outperform many state-of-the-art BIQA methods.
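The training targets mentioned above are intermediate similarity maps computed by full-reference metrics. As a rough illustration of what such a target looks like, here is a pixel-wise luminance similarity of the kind used in SSIM/FSIM-style metrics; the specific FR metrics and constants the authors use are not reproduced here:

```python
import numpy as np

def similarity_map(ref, dist, c=1e-3):
    """Pixel-wise similarity between a reference and a distorted image,
    valued near 1 where they agree and lower where they differ."""
    ref = ref.astype(np.float64)
    dist = dist.astype(np.float64)
    return (2.0 * ref * dist + c) / (ref ** 2 + dist ** 2 + c)
```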
Learn to Evaluate Image Perceptual Quality Blindly from Statistics of Self-similarity
Among the various image quality assessment (IQA) tasks, blind IQA (BIQA) is particularly challenging due to the absence of knowledge about the reference image and the distortion type. Features based on natural scene statistics (NSS) have been successfully used in BIQA, and the quality relevance of the features plays an essential role in prediction performance. Motivated by the fact that the early processing stages of the human visual system aim to remove signal redundancies for efficient visual coding, we propose a simple but very effective BIQA method based on computing the statistics of self-similarity (SOS) in an image. Specifically, we calculate the inter-scale and intra-scale similarities of the distorted image, extract SOS features from these similarities, and learn a regression model to map the SOS features to the subjective quality score. Extensive experiments demonstrate the very competitive quality prediction performance and generalization ability of the proposed SOS-based BIQA method.
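A minimal sketch of the inter-scale part of such a pipeline: build a coarse scale, compare it against the finer scale, and keep summary statistics of the similarity map as features. The block-average pyramid, the similarity measure, and the chosen statistics are placeholders, not the authors' exact SOS definitions:

```python
import numpy as np

def downsample(img):
    """2x downsampling by block averaging (stand-in for a Gaussian pyramid)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    x = img[:h, :w]
    return 0.25 * (x[0::2, 0::2] + x[1::2, 0::2] + x[0::2, 1::2] + x[1::2, 1::2])

def sos_features(img, levels=3):
    """Statistics of inter-scale similarity across a few pyramid levels."""
    feats = []
    cur = img.astype(np.float64)
    for _ in range(levels):
        nxt = downsample(cur)
        up = np.kron(nxt, np.ones((2, 2)))      # nearest-neighbour upsampling
        h, w = up.shape
        ref = cur[:h, :w]                        # align shapes with the upsampled map
        sim = (2 * ref * up + 1e-3) / (ref ** 2 + up ** 2 + 1e-3)
        feats.extend([sim.mean(), sim.std()])
        cur = nxt
    return np.asarray(feats)
```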
dipIQ: Blind Image Quality Assessment by Learning-to-Rank Discriminable Image Pairs
Objective assessment of image quality is fundamentally important in many image processing tasks. In this work, we focus on learning blind image quality assessment (BIQA) models, which predict the quality of a digital image with no access to its original pristine-quality counterpart as a reference. One of the biggest challenges in learning BIQA models is the conflict between the gigantic image space (whose dimensionality equals the number of image pixels) and the extremely limited reliable ground-truth data available for training. Such data are typically collected via subjective testing, which is cumbersome, slow, and expensive. Here we first show that a vast amount of reliable training data in the form of quality-discriminable image pairs (DIPs) can be obtained automatically at low cost by exploiting large-scale databases with diverse image content. We then learn an opinion-unaware BIQA (OU-BIQA, meaning that no subjective opinions are used for training) model using RankNet, a pairwise learning-to-rank (L2R) algorithm, from millions of DIPs, each associated with a perceptual uncertainty level, leading to a DIP-inferred quality (dipIQ) index. Extensive experiments on four benchmark IQA databases demonstrate that dipIQ outperforms state-of-the-art OU-BIQA models. The robustness of dipIQ is also significantly improved, as confirmed by the group MAximum Differentiation (gMAD) competition methodology. Furthermore, we extend the proposed framework by learning models with ListNet (a listwise L2R algorithm) on quality-discriminable image lists (DILs). The resulting DIL-inferred quality (dilIQ) index achieves an additional performance gain.
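The core training signal in RankNet-style pairwise learning-to-rank is a cross-entropy over the probability that image A is better than image B, obtained by passing the difference of the model's scores through a sigmoid. A sketch, with the target probability standing in for the pair's perceptual uncertainty level (how that uncertainty is encoded exactly is the paper's design, not reproduced here):

```python
import numpy as np

def ranknet_pair_loss(score_a, score_b, p_target):
    """Pairwise ranking loss on one quality-discriminable image pair (DIP):
    p_target is the (possibly soft) probability that A has higher quality."""
    p_pred = 1.0 / (1.0 + np.exp(-(score_a - score_b)))
    p_pred = np.clip(p_pred, 1e-7, 1 - 1e-7)
    return float(-(p_target * np.log(p_pred) + (1 - p_target) * np.log(1 - p_pred)))
```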
JND-SalCAR: A Novel JND-based Saliency-Channel Attention Residual Network for Image Quality Prediction
In image quality enhancement processing, it is most important to predict how humans perceive processed images, since human observers are the ultimate receivers of the images. Thus, objective image quality assessment (IQA) methods based on human visual sensitivity from psychophysical experiments have been studied extensively. Thanks to the power of deep convolutional neural networks (CNNs), many CNN-based IQA models have been proposed. However, previous CNN-based IQA models have not fully utilized the characteristics of the human visual system (HVS), simply entrusting everything to the CNN, which is often trained as a regressor to predict the subjective quality scores provided by IQA datasets. In this paper, we propose a novel JND-based saliency-channel attention residual network for image quality assessment, called JND-SalCAR, in which human psychophysical characteristics such as visual saliency and just noticeable difference (JND) are effectively incorporated. We propose a new SalCAR block so that perceptually important features can be extracted using saliency-based spatial attention and channel attention. In addition, the visual saliency map is further used as a guideline for predicting the patch weight map, in order to allow stable end-to-end training of the JND-SalCAR. To the best of our knowledge, ours is the first HVS-inspired trainable IQA network that considers both the visual saliency and JND characteristics of the HVS. We evaluate the proposed JND-SalCAR on large IQA datasets, where it outperforms all recent state-of-the-art IQA methods.
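A loose PyTorch sketch of a saliency-guided spatial plus channel attention block in the spirit of the SalCAR idea; the layer sizes, squeeze-and-excitation-style channel branch, and residual wiring are assumptions for illustration, not the authors' architecture:

```python
import torch
import torch.nn as nn

class SaliencyChannelAttention(nn.Module):
    """Feature refinement using a saliency map as spatial attention,
    followed by a learned channel attention, with a residual connection."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.channel_fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, feat, saliency):
        # feat: (N, C, H, W) features; saliency: (N, 1, H, W) map in [0, 1].
        spatial = feat * saliency                  # saliency-based spatial attention
        channel = self.channel_fc(spatial)         # per-channel importance
        return feat + spatial * channel            # residual combination
```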
Capturing Localized Image Artifacts through a CNN-based Hyper-image Representation
Training deep CNNs to capture localized image artifacts on a relatively small dataset is a challenging task. With enough images at hand, one can hope that a deep CNN characterizes localized artifacts over the entire dataset and their effect on the output. On smaller datasets, however, deep CNNs may overfit and shallow ones struggle to capture local artifacts. Thus, some small-data image applications first train their framework on a collection of patches (instead of entire images) to better learn the representation of localized artifacts, and then obtain the output by averaging the patch-level results. Such an approach ignores the spatial correlation among patches and how different patch locations affect the output. It also fails in cases where only a few patches mainly contribute to the image label. To address these scenarios, we develop the notion of hyper-image representations. Our CNN has two stages. The first stage is trained on patches. The second stage utilizes the last-layer representation developed in the first stage to form a hyper-image, which is used to train the second stage. We show that this approach develops a better mapping between the image and its output. We analyze additional properties of our approach and show its effectiveness on one synthetic and two real-world vision tasks, no-reference image quality estimation and image tampering detection, through its performance improvement over existing strong baselines.
Comment: Our work on no-reference image quality estimation (NR-IQA) using deep neural networks.
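The hyper-image itself is simply the patch-level features laid back out on the patch grid, so the second-stage CNN sees the spatial arrangement of patches rather than an unordered average. A minimal sketch of that rearrangement (the feature dimensionality and grid layout are illustrative):

```python
import numpy as np

def build_hyper_image(patch_features, grid_rows, grid_cols):
    """Rearrange first-stage, last-layer patch feature vectors into a
    multi-channel 'image' of shape (rows, cols, feat_dim) for the
    second-stage CNN, preserving the patches' spatial layout."""
    f = np.asarray(patch_features, dtype=np.float64)   # (rows*cols, feat_dim)
    feat_dim = f.shape[1]
    return f.reshape(grid_rows, grid_cols, feat_dim)
```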
How is Gaze Influenced by Image Transformations? Dataset and Model
Dataset size is the bottleneck for developing deep saliency models, because collecting eye-movement data is very time-consuming and expensive. Most current studies on human attention and saliency modeling have used high-quality, stereotypical stimuli. In the real world, however, captured images undergo various types of transformations. Can we use these transformations to augment existing saliency datasets? Here, we first create a novel saliency dataset containing fixations of 10 observers over 1900 images degraded by 19 types of transformations. Second, by analyzing eye movements, we find that observers look at different locations in transformed versus original images. Third, we use the new data over transformed images, called data augmentation transformation (DAT), to train deep saliency models. We find that label-preserving DATs with negligible impact on human gaze boost saliency prediction, whereas some other DATs that severely impact human gaze degrade performance. These label-preserving, valid augmentation transformations provide a way to enlarge existing saliency datasets. Finally, we introduce a novel saliency model based on a generative adversarial network (dubbed GazeGAN). A modified U-Net is proposed as the generator of GazeGAN, which combines classic skip connections with a novel center-surround connection (CSC) in order to leverage multi-level features. We also propose a histogram loss based on the Alternative Chi-Square Distance (ACS HistLoss) to refine the saliency map in terms of luminance distribution. Extensive experiments and comparisons over 3 datasets indicate that GazeGAN achieves the best performance in terms of popular saliency evaluation metrics and is more robust to various perturbations. Our code and data are available at:
https://github.com/CZHQuality/Sal-CFS-GAN
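As a rough illustration of a histogram loss on saliency maps, the sketch below compares the luminance histograms of a predicted and a ground-truth map with a chi-square-style distance. The binning and the exact distance used here are assumptions; the precise ACS HistLoss formulation is given in the paper and repository above:

```python
import numpy as np

def chi_square_hist_loss(pred_map, gt_map, bins=256):
    """Chi-square-style distance between the normalized luminance
    histograms of predicted and ground-truth saliency maps in [0, 1]."""
    hp, _ = np.histogram(pred_map, bins=bins, range=(0.0, 1.0))
    hg, _ = np.histogram(gt_map, bins=bins, range=(0.0, 1.0))
    hp = hp / (hp.sum() + 1e-12)
    hg = hg / (hg.sum() + 1e-12)
    return float(2.0 * np.sum((hp - hg) ** 2 / (hp + hg + 1e-12)))
```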
UGC-VQA: Benchmarking Blind Video Quality Assessment for User Generated Content
Recent years have witnessed an explosion of user-generated content (UGC) videos shared and streamed over the Internet, thanks to the evolution of affordable and reliable consumer capture devices and the tremendous popularity of social media platforms. Accordingly, there is a great need for accurate video quality assessment (VQA) models for UGC/consumer videos to monitor, control, and optimize this vast content. Blind quality prediction of in-the-wild videos is quite challenging, since the quality degradations of UGC content are unpredictable, complicated, and often commingled. Here we contribute to advancing the UGC-VQA problem by conducting a comprehensive evaluation of leading no-reference/blind VQA (BVQA) features and models on a fixed evaluation architecture, yielding new empirical insights on both subjective video quality studies and VQA model design. By employing a feature selection strategy on top of leading VQA model features, we are able to extract 60 of the 763 statistical features used by the leading models to create a new fusion-based BVQA model, which we dub the \textbf{VID}eo quality \textbf{EVAL}uator (VIDEVAL), and which effectively balances the trade-off between VQA performance and efficiency. Our experimental results show that VIDEVAL achieves state-of-the-art performance at considerably lower computational cost than other leading models. Our study protocol also defines a reliable benchmark for the UGC-VQA problem, which we believe will facilitate further research on deep learning-based VQA modeling, as well as perceptually optimized, efficient UGC video processing, transcoding, and streaming. To promote reproducible research and public evaluation, an implementation of VIDEVAL has been made available online: \url{https://github.com/tu184044109/VIDEVAL_release}.
Comment: 13 pages, 11 figures, 11 tables.
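The abstract's recipe, selecting a small subset of the 763 candidate statistical features and regressing them onto subjective scores, can be prototyped as a standard feature-selection-plus-regression pipeline. The selector and regressor below are stand-ins for illustration, not necessarily the authors' exact choices (see the VIDEVAL repository above for the reference implementation):

```python
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVR

def build_fusion_bvqa_model(k=60):
    """Select the k most predictive features, then regress onto MOS."""
    return make_pipeline(SelectKBest(f_regression, k=k), SVR(kernel="rbf"))

# Usage sketch: model = build_fusion_bvqa_model(); model.fit(X_features, y_mos)
```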