Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment
We present a deep neural network-based approach to image quality assessment
(IQA). The network is trained end-to-end and comprises ten convolutional layers
and five pooling layers for feature extraction, and two fully connected layers
for regression, which makes it significantly deeper than related IQA models.
Unique features of the proposed architecture are that: 1) with slight
adaptations it can be used in a no-reference (NR) as well as in a
full-reference (FR) IQA setting and 2) it allows for joint learning of local
quality and local weights, i.e., relative importance of local quality to the
global quality estimate, in a unified framework. Our approach is purely
data-driven and does not rely on hand-crafted features or other types of prior
domain knowledge about the human visual system or image statistics. We evaluate
the proposed approach on the LIVE, CSIQ, and TID2013 databases as well as the
LIVE In the Wild Image Quality Challenge Database and show superior performance
to state-of-the-art NR and FR IQA methods. Finally, cross-database evaluation
shows a high ability to generalize between different databases, indicating a
high robustness of the learned features.
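The joint use of local quality and local weights described in this abstract can be illustrated with a weighted average over image patches. The following is a minimal sketch, not the authors' implementation: the function name `weighted_quality`, the `eps` offset, and the clamping of weights to non-negative values are assumptions made for illustration, and in the paper both inputs would come from the trained network rather than being given by hand.

```python
def weighted_quality(local_scores, local_weights, eps=1e-6):
    """Pool per-patch quality scores into one global score.

    local_scores  -- hypothetical per-patch quality estimates
    local_weights -- hypothetical per-patch importance values
    eps           -- assumed small offset that keeps the sum of weights positive
    """
    if len(local_scores) != len(local_weights):
        raise ValueError("scores and weights must have the same length")
    if not local_scores:
        raise ValueError("at least one patch is required")

    # Clamp weights to be non-negative and add eps so the normalisation
    # below never divides by zero.
    weights = [max(w, 0.0) + eps for w in local_weights]
    total = sum(weights)

    # Weighted average: patches with larger weights contribute more
    # to the global quality estimate.
    return sum(q * w for q, w in zip(local_scores, weights)) / total


if __name__ == "__main__":
    # Two patches with equal importance: the result is the plain mean.
    print(weighted_quality([1.0, 3.0], [1.0, 1.0]))
    # The second patch dominates when the first has (almost) no weight.
    print(weighted_quality([1.0, 3.0], [0.0, 1.0]))
```

With equal weights the pooling reduces to a simple mean, so the weights only matter when some patches are judged more relevant to the overall impression than others.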
WESPE: Weakly Supervised Photo Enhancer for Digital Cameras
Low-end and compact mobile cameras demonstrate limited photo quality mainly
due to space, hardware and budget constraints. In this work, we propose a deep
learning solution that translates photos taken by cameras with limited
capabilities into DSLR-quality photos automatically. We tackle this problem by
introducing a weakly supervised photo enhancer (WESPE) - a novel image-to-image
Generative Adversarial Network-based architecture. The proposed model is
trained under weak supervision: unlike previous works, there is no need for
strong supervision in the form of a large annotated dataset of aligned
original/enhanced photo pairs. The sole requirement is two distinct datasets:
one from the source camera, and one composed of arbitrary high-quality images
that can be generally crawled from the Internet - the visual content they
exhibit may be unrelated. Hence, our solution is repeatable for any camera:
collecting the data and training can be achieved in a couple of hours. In this
work, we emphasize extensive evaluation of the obtained results. Besides
standard objective metrics and a subjective user study, we train a virtual rater
in the form of a separate CNN that mimics human raters on Flickr data and use
this network to get reference scores for both original and enhanced photos. Our
experiments on the DPED, KITTI and Cityscapes datasets as well as pictures from
several generations of smartphones demonstrate that WESPE produces qualitative
results comparable to or better than those of state-of-the-art strongly
supervised methods.
Learning to Predict Image-based Rendering Artifacts with Respect to a Hidden Reference Image
Image metrics predict the perceived per-pixel difference between a reference
image and its degraded (e.g., re-rendered) version. In several important
applications, the reference image is not available and image metrics cannot be
applied. We devise a neural network architecture and training procedure that
allow predicting the MSE, SSIM or VGG16 image difference from the distorted
image alone while the reference is not observed. This is enabled by two
insights: The first is to inject sufficiently many un-distorted natural image
patches, which can be found in arbitrary amounts and are known to have no
perceivable difference to themselves. This avoids false positives. The second
is to balance the learning, where it is carefully made sure that all image
errors are equally likely, avoiding false negatives. Surprisingly, we observe
that the resulting no-reference metric can, subjectively, even perform better
than the reference-based one, as it had to become robust against
mis-alignments. We evaluate the effectiveness of our approach in an image-based
rendering context, both quantitatively and qualitatively. Finally, we
demonstrate two applications which reduce light field capture time and provide
guidance for interactive depth adjustment.
Comment: 13 pages, 11 figures
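The two training insights in this abstract (injecting undistorted patches and balancing the error distribution) can be sketched as a data-preparation step. This is a rough illustration under stated assumptions, not the authors' procedure: patches are flat lists of floats, MSE is the only target, and the names `build_balanced_set`, `n_bins`, `max_err`, and `per_bin` are hypothetical.

```python
import random


def mse(a, b):
    """Mean squared error between two equally sized patches."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def build_balanced_set(pairs, clean_patches, n_bins=4, max_err=1.0,
                       per_bin=2, seed=0):
    """Return (patch, target_error) samples for a no-reference predictor.

    pairs         -- (distorted, reference) patch pairs; the reference is
                     used only to compute the training target
    clean_patches -- undistorted patches injected with a target of zero
    """
    rng = random.Random(seed)

    # Sort distorted patches into bins by error magnitude.
    bins = [[] for _ in range(n_bins)]
    for distorted, reference in pairs:
        err = mse(distorted, reference)
        idx = min(int(err / max_err * n_bins), n_bins - 1)
        bins[idx].append((distorted, err))

    # Draw the same number of samples from every non-empty bin so that
    # no error level dominates training.
    samples = []
    for bucket in bins:
        if bucket:
            samples.extend(rng.choices(bucket, k=per_bin))

    # An undistorted patch has no difference to itself, so its target is 0.
    samples.extend((patch, 0.0) for patch in clean_patches)
    return samples


if __name__ == "__main__":
    pairs = [([0.1, 0.1], [0.0, 0.0]), ([0.9, 0.9], [0.0, 0.0])]
    for patch, target in build_balanced_set(pairs, [[0.5, 0.5]]):
        print(patch, round(target, 3))
```

Sampling with replacement from each bin is one simple way to equalize error levels; other balancing schemes would serve the same purpose.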