Deep Optimization model for Screen Content Image Quality Assessment using Neural Networks
In this paper, we propose a novel quadratic optimized model based on the deep
convolutional neural network (QODCNN) for full-reference and no-reference
screen content image (SCI) quality assessment. Unlike traditional CNN methods
taking all image patches as training data and using average quality pooling,
our model is optimized in three steps to obtain a more effective model.
In the first step, an end-to-end deep CNN is trained to preliminarily predict
the image visual quality, and batch normalization (BN) layers and l2
regularization are employed to improve the speed and performance of network
fitting. In the second step, the pretrained model is fine-tuned to achieve better
performance under analysis of the raw training data. An adaptive weighting
method is proposed in the third step to fuse local quality inspired by the
perceptual property of the human visual system (HVS) that the HVS is sensitive
to image patches containing texture and edge information. The novelty of our
algorithm can be summarized as follows: 1) considering the correlation
between local quality and the subjective differential mean opinion score (DMOS),
the Euclidean distance is utilized to measure the effectiveness of image patches,
and the pretrained model is fine-tuned with more effective training data; 2) an
adaptive pooling approach is employed to fuse the patch quality of textual and
pictorial regions; its features are extracted only from distorted images, are
strongly robust to noise, and are effective for both FR and NR IQA; 3) considering the
characteristics of SCIs, a deep and valid network architecture is designed for
both NR and FR visual quality evaluation of SCIs. Experimental results verify
that our model outperforms both current no-reference and full-reference image
quality assessment methods on the benchmark screen content image quality
assessment database (SIQAD).
Comment: 12 pages, 9 figures
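The patch selection and adaptive pooling ideas in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `keep_ratio` parameter, and the use of mean absolute pixel differences as a texture/edge weight are all assumptions made for illustration.

```python
from typing import List, Sequence

Patch = Sequence[Sequence[float]]


def edge_energy(patch: Patch) -> float:
    """Mean absolute forward difference, used here as a crude texture/edge proxy."""
    rows, cols = len(patch), len(patch[0])
    total, count = 0.0, 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # horizontal neighbour
                total += abs(patch[r][c + 1] - patch[r][c])
                count += 1
            if r + 1 < rows:  # vertical neighbour
                total += abs(patch[r + 1][c] - patch[r][c])
                count += 1
    return total / count if count else 0.0


def select_effective_patches(patch_scores: Sequence[float], dmos: float,
                             keep_ratio: float = 0.5) -> List[int]:
    """Indices of the patches whose predicted score lies closest to the image DMOS.

    For scalar scores the Euclidean distance reduces to an absolute difference.
    keep_ratio is a hypothetical knob, not a value from the paper.
    """
    n_keep = max(1, round(keep_ratio * len(patch_scores)))
    ranked = sorted(range(len(patch_scores)),
                    key=lambda i: abs(patch_scores[i] - dmos))
    return sorted(ranked[:n_keep])


def weighted_pool(patch_scores: Sequence[float], patches: Sequence[Patch],
                  eps: float = 1e-8) -> float:
    """Weighted average of patch scores; textured patches receive larger weights."""
    weights = [edge_energy(p) + eps for p in patches]
    return sum(w * s for w, s in zip(weights, patch_scores)) / sum(weights)
```

In this sketch a flat patch contributes almost nothing to the pooled score, which mirrors the stated HVS motivation; any other saliency or texture measure could be substituted for `edge_energy`.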
Blind Predicting Similar Quality Map for Image Quality Assessment
A key problem in blind image quality assessment (BIQA) is how to effectively
model the properties of human visual system in a data-driven manner. In this
paper, we propose a simple and efficient BIQA model based on a novel framework
which consists of a fully convolutional neural network (FCNN) and a pooling
network to solve this problem. In principle, FCNN is capable of predicting a
pixel-by-pixel similar quality map only from a distorted image by using the
intermediate similarity maps derived from conventional full-reference image
quality assessment methods. The predicted pixel-by-pixel quality maps have good
consistency with the distortion correlations between the reference and
distorted images. Finally, a deep pooling network regresses the quality map
into a score. Experiments have demonstrated that our predictions outperform
many state-of-the-art BIQA methods.
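As a rough illustration of the kind of training target described here, the sketch below computes a pixel-by-pixel similarity map between a reference and a distorted image and pools it into a score. The similarity form (2xy + c) / (x^2 + y^2 + c) is the generic one used by several full-reference metrics; applying it directly to raw intensities, the constant `c`, and the mean pooling (the abstract uses a learned pooling network instead) are simplifying assumptions.

```python
from typing import List, Sequence

Image = Sequence[Sequence[float]]


def similarity_map(reference: Image, distorted: Image, c: float = 1e-4) -> List[List[float]]:
    """Per-pixel similarity in (0, 1]; equals 1.0 where the two images agree."""
    if len(reference) != len(distorted) or any(
            len(r) != len(d) for r, d in zip(reference, distorted)):
        raise ValueError("images must have the same shape")
    return [[(2 * r * d + c) / (r * r + d * d + c) for r, d in zip(ref_row, dist_row)]
            for ref_row, dist_row in zip(reference, distorted)]


def mean_pool(quality_map: Image) -> float:
    """Average pooling, standing in for the learned pooling network."""
    values = [v for row in quality_map for v in row]
    return sum(values) / len(values)
```

A blind model of the kind described would be trained to predict such a map from the distorted image alone, so the reference is needed only when building training targets.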
Convolutional Neural Networks for Video Quality Assessment
Video Quality Assessment (VQA) is a very challenging task due to its highly
subjective nature. Moreover, many factors influence VQA. Compression of video
content, while necessary for minimising transmission and storage requirements,
introduces distortions which can have detrimental effects on the perceived
quality. Especially when dealing with modern video coding standards, it is
extremely difficult to model the effects of compression due to the
unpredictability of encoding on different content types. Moreover, transmission
also introduces delays and other distortion types which affect the perceived
quality. Therefore, it would be highly beneficial to accurately predict the
perceived quality of video to be distributed over modern content distribution
platforms, so that specific actions could be undertaken to maximise the Quality
of Experience (QoE) of the users. Traditional VQA techniques based on feature
extraction and modelling may not be sufficiently accurate. In this paper, a
novel Deep Learning (DL) framework is introduced for effectively predicting VQA
of video content delivery mechanisms based on end-to-end feature learning. The
proposed framework is based on Convolutional Neural Networks, taking into
account compression distortion as well as transmission delays. Training and
evaluation of the proposed framework are performed on a user annotated VQA
dataset specifically created to undertake this work. The experiments show that
the proposed methods can lead to high accuracy of the quality estimation,
showcasing the potential of using DL in complex VQA scenarios.
Comment: Number of Pages: 12, Number of Figures: 17, Submitted to: Signal
Processing: Image Communication (Elsevier)
Comparison-based Image Quality Assessment for Parameter Selection
Image quality assessment (IQA) is traditionally classified into
full-reference (FR) IQA and no-reference (NR) IQA according to whether the
original image is required. Although NR-IQA is widely used in practical
applications, room for improvement still remains because of the lack of the
reference image. Inspired by the fact that in many applications, such as
parameter selection, a series of distorted images are available, the authors
propose a novel comparison-based image quality assessment (C-IQA) method. The
new comparison-based framework parallels FR-IQA by requiring two input images,
and resembles NR-IQA by not using the original image. As a result, the new
comparison-based approach has more application scenarios than FR-IQA does, and
takes greater advantage of the accessible information than the traditional
single-input NR-IQA does. Further, C-IQA is compared with other
state-of-the-art NR-IQA methods on two widely used IQA databases. Experimental
results show that C-IQA outperforms the other NR-IQA methods for parameter
selection, and the parameter trimming framework combined with C-IQA saves up to
80% of the computation of iterative image reconstruction.
Comment: 12 pages, 15 figures
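The use of a two-input comparison for parameter selection can be sketched as a simple running tournament. This is an assumption-laden illustration, not the C-IQA method itself: `reconstruct` and `compare` are hypothetical callables supplied by the user, with `compare(a, b) > 0` meaning that image `a` is judged better than image `b`.

```python
from typing import Callable, Sequence, TypeVar

P = TypeVar("P")      # parameter type
Img = TypeVar("Img")  # image type


def select_parameter(candidates: Sequence[P],
                     reconstruct: Callable[[P], Img],
                     compare: Callable[[Img, Img], float]) -> P:
    """Return the candidate whose reconstruction wins every pairwise comparison so far.

    No reference image is used: each new reconstruction is compared only
    against the current best reconstruction.
    """
    if not candidates:
        raise ValueError("at least one candidate parameter is required")
    best_param = candidates[0]
    best_image = reconstruct(best_param)
    for param in candidates[1:]:
        image = reconstruct(param)
        if compare(image, best_image) > 0:  # the new image is judged better
            best_param, best_image = param, image
    return best_param
```

A toy usage, with a scalar standing in for each reconstructed image: `select_parameter([0.1, 0.2, 0.3, 0.4], lambda p: -(p - 0.3) ** 2, lambda a, b: a - b)` picks the candidate nearest 0.3.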
Cross-Resolution Person Re-identification with Deep Antithetical Learning
Images with different resolutions are ubiquitous in public person
re-identification (ReID) datasets and real-world scenes; it is thus crucial for
a person ReID model to handle the image resolution variations for improving its
generalization ability. However, most existing person ReID methods pay little
attention to this resolution discrepancy problem. One paradigm to deal with
this problem is to use some complicated methods for mapping all images into an
artificial image space, which, however, disrupts the natural image
distribution and requires heavy image preprocessing. In this paper, we analyze
the deficiencies of several widely-used objective functions handling image
resolution discrepancies and propose a new framework called deep antithetical
learning that directly learns from the natural image space rather than creating
an arbitrary one. We first quantify and categorize original training images
according to their resolutions. Then we create an antithetical training set and
make sure that original training images have counterparts with antithetical
resolutions in this new set. Finally, a novel Contrastive Center Loss (CCL) is
proposed to learn from images with different resolutions without
interference from their resolution discrepancies. Extensive experimental analyses
and evaluations indicate that the proposed framework, even using a vanilla deep
ReID network, exhibits remarkable performance improvements. Without bells and
whistles, our approach outperforms previous state-of-the-art methods by a large
margin.
No-Reference Color Image Quality Assessment: From Entropy to Perceptual Quality
This paper presents a high-performance general-purpose no-reference (NR)
image quality assessment (IQA) method based on image entropy. The image
features are extracted from two domains. In the spatial domain, the mutual
information between the color channels and the two-dimensional entropy are
calculated. In the frequency domain, the two-dimensional entropy and the mutual
information of the filtered sub-band images are computed as the feature set of
the input color image. Then, with all the extracted features, the support
vector classifier (SVC) for distortion classification and support vector
regression (SVR) are utilized for the quality prediction, to obtain the final
quality assessment score. The proposed method, which we call entropy-based
no-reference image quality assessment (ENIQA), can assess the quality of
different categories of distorted images, and has a low complexity. The
proposed ENIQA method was assessed on the LIVE and TID2013 databases and showed
a superior performance. The experimental results confirmed that the proposed
ENIQA method has a high consistency of objective and subjective assessment on
color images, which indicates the good overall performance and generalization
ability of ENIQA. The source code is available on GitHub:
https://github.com/jacob6/ENIQA.
Comment: 12 pages, 8 figures
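The entropy-based features named in this abstract can be sketched from their textbook definitions. This is not taken from the ENIQA code: the function names are illustrative, and the two-dimensional entropy uses one common definition (pairs of a pixel value and its rounded 3x3 neighbourhood mean), which is an assumption about the exact variant the paper uses.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy in bits of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """I(A; B) = H(A) + H(B) - H(A, B), e.g. for two flattened colour channels."""
    if len(a) != len(b):
        raise ValueError("channels must have the same number of pixels")
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


def two_d_entropy(image: Sequence[Sequence[int]]) -> float:
    """Entropy of (pixel value, rounded 3x3 neighbourhood mean) pairs.

    The window is clipped at the image border and includes the centre pixel.
    """
    rows, cols = len(image), len(image[0])
    pairs = []
    for r in range(rows):
        for c in range(cols):
            window = [image[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            pairs.append((image[r][c], round(sum(window) / len(window))))
    return entropy(pairs)
```

Features of this kind would then be fed to the classifier and regressor mentioned in the abstract; that stage is omitted here.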
A-Lamp: Adaptive Layout-Aware Multi-Patch Deep Convolutional Neural Network for Photo Aesthetic Assessment
Deep convolutional neural networks (CNN) have recently been shown to generate
promising results for aesthetics assessment. However, the performance of these
deep CNN methods is often compromised by the constraint that the neural network
only takes the fixed-size input. To accommodate this requirement, input images
need to be transformed via cropping, warping, or padding, which often alter
image composition, reduce image resolution, or cause image distortion. Thus the
aesthetics of the original images is impaired because of the potential loss of
fine-grained details and holistic image layout. However, such fine-grained details
and holistic image layout are critical for evaluating an image's aesthetics. In
this paper, we present an Adaptive Layout-Aware Multi-Patch Convolutional
Neural Network (A-Lamp CNN) architecture for photo aesthetic assessment. This
novel scheme is able to accept arbitrary sized images, and learn from both
fine-grained details and holistic image layout simultaneously. To enable
training on these hybrid inputs, we extend the method by developing a dedicated
double-subnet neural network structure, i.e. a Multi-Patch subnet and a
Layout-Aware subnet. We further construct an aggregation layer to effectively
combine the hybrid features from these two subnets. Extensive experiments on
the large-scale aesthetics assessment benchmark (AVA) demonstrate significant
performance improvement over the state-of-the-art in photo aesthetic
assessment.
Learning to Predict Streaming Video QoE: Distortions, Rebuffering and Memory
Mobile streaming video data accounts for a large and increasing percentage of
wireless network traffic. The available bandwidths of modern wireless networks
are often unstable, leading to difficulties in delivering smooth, high-quality
video. Streaming service providers such as Netflix and YouTube attempt to adapt
their systems to adjust in response to these bandwidth limitations by changing
the video bitrate or, failing that, allowing playback interruptions
(rebuffering). Being able to predict end users' quality of experience (QoE)
resulting from these adjustments could lead to perceptually-driven network
resource allocation strategies that would deliver streaming content of higher
quality to clients, while being cost effective for providers. Existing
objective QoE models only consider the effects on user QoE of video quality
changes or playback interruptions. For streaming applications, adaptive network
strategies may involve a combination of dynamic bitrate allocation along with
playback interruptions when the available bandwidth reaches a very low value.
Towards effectively predicting user QoE, we propose Video Assessment of
TemporaL Artifacts and Stalls (Video ATLAS): a machine learning framework where
we combine a number of QoE-related features, including objective quality
features, rebuffering-aware features and memory-driven features to make QoE
predictions. We evaluated our learning-based QoE prediction model on the
recently designed LIVE-Netflix Video QoE Database which consists of practical
playout patterns, where the videos are afflicted by both quality changes and
rebuffering events, and found that it provides improved performance over
state-of-the-art video quality metrics while generalizing well on different
datasets. The proposed algorithm is made publicly available at
http://live.ece.utexas.edu/research/Quality/VideoATLAS release_v2.rar.
Comment: under review in Transactions on Image Processing
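To make the idea of combining quality, rebuffering, and memory features concrete, the sketch below derives a small feature vector from a playback trace. The feature names and definitions are illustrative assumptions and are not taken from the Video ATLAS release; in a learning framework of this kind the resulting dictionary would be passed to a regressor trained on subjective QoE scores.

```python
from typing import Dict, Sequence


def qoe_features(quality: Sequence[float], stalled: Sequence[bool]) -> Dict[str, float]:
    """Summarise a playback trace sampled at uniform time steps.

    quality[i] is an objective quality score at step i (ignored while stalled);
    stalled[i] is True when playback is rebuffering at step i.
    """
    if len(quality) != len(stalled) or not stalled:
        raise ValueError("traces must be non-empty and of equal length")
    n = len(stalled)
    played = [q for q, s in zip(quality, stalled) if not s]
    # A stall event starts wherever a stalled step follows a playing step.
    stall_count = sum(1 for i, s in enumerate(stalled)
                      if s and (i == 0 or not stalled[i - 1]))
    last_stall = max((i for i, s in enumerate(stalled) if s), default=None)
    # Memory-style feature: normalised time elapsed since the last stalled step.
    recency = 1.0 if last_stall is None else (n - 1 - last_stall) / n
    return {
        "mean_quality": sum(played) / len(played) if played else 0.0,
        "stall_fraction": sum(stalled) / n,
        "stall_count": float(stall_count),
        "time_since_last_stall": recency,
    }
```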
dipIQ: Blind Image Quality Assessment by Learning-to-Rank Discriminable Image Pairs
Objective assessment of image quality is fundamentally important in many
image processing tasks. In this work, we focus on learning blind image quality
assessment (BIQA) models which predict the quality of a digital image with no
access to its original pristine-quality counterpart as reference. One of the
biggest challenges in learning BIQA models is the conflict between the gigantic
image space (which is in the dimension of the number of image pixels) and the
extremely limited reliable ground truth data for training. Such data are
typically collected via subjective testing, which is cumbersome, slow, and
expensive. Here we first show that a vast amount of reliable training data in
the form of quality-discriminable image pairs (DIP) can be obtained
automatically at low cost by exploiting large-scale databases with diverse
image content. We then learn an opinion-unaware BIQA (OU-BIQA, meaning that no
subjective opinions are used for training) model using RankNet, a pairwise
learning-to-rank (L2R) algorithm, from millions of DIPs, each associated with a
perceptual uncertainty level, leading to a DIP inferred quality (dipIQ) index.
Extensive experiments on four benchmark IQA databases demonstrate that dipIQ
outperforms state-of-the-art OU-BIQA models. The robustness of dipIQ is also
significantly improved as confirmed by the group MAximum Differentiation (gMAD)
competition method. Furthermore, we extend the proposed framework by learning
models with ListNet (a listwise L2R algorithm) on quality-discriminable image
lists (DIL). The resulting DIL Inferred Quality (dilIQ) index achieves an
additional performance gain.
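The pairwise learning-to-rank objective used by RankNet can be written down compactly. The sketch below shows the standard RankNet cross-entropy loss for one image pair; the `weight` argument is an assumed way of folding in a per-pair uncertainty level and is not confirmed by the abstract.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def ranknet_loss(score_a: float, score_b: float,
                 target: float = 1.0, weight: float = 1.0) -> float:
    """Cross-entropy between target and P(a better than b) = sigmoid(score_a - score_b).

    target is 1.0 when image a is known to have higher quality than image b.
    weight is a hypothetical per-pair confidence factor.
    """
    diff = score_a - score_b
    # -t*log(P) - (1 - t)*log(1 - P) simplifies to (1 - t)*diff + softplus(-diff).
    return weight * ((1.0 - target) * diff + softplus(-diff))
```

The loss is small when the model scores the better image higher and grows roughly linearly as the ordering is violated, so only the relative order of each pair needs to be known, not an absolute opinion score.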
Deep Tiny Network for Recognition-Oriented Face Image Quality Assessment
Face recognition has made significant progress in recent years due to deep
convolutional neural networks (CNN). In many face recognition (FR) scenarios,
face images are acquired from a sequence with huge intra-variations. These
intra-variations, which are mainly affected by the low-quality face images,
cause instability of recognition performance. Previous works have focused on
ad-hoc methods to select frames from a video or use face image quality
assessment (FIQA) methods, which consider only a particular distortion or a
combination of several distortions.
In this work, we present an efficient no-reference image quality assessment
for FR that directly links image quality assessment (IQA) and FR. More
specifically, we propose a new measurement to evaluate image quality without
any reference. Based on the proposed quality measurement, we propose a deep
Tiny Face Quality network (tinyFQnet) to learn a quality prediction function
from data.
We evaluate the proposed method for different powerful FR models on two
classical video-based (or template-based) benchmarks: IJB-B and YTF. Extensive
experiments show that, although the tinyFQnet is much smaller than the others,
the proposed method outperforms state-of-the-art quality assessment methods in
terms of effectiveness and efficiency.