89,086 research outputs found
Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment
We present a deep neural network-based approach to image quality assessment
(IQA). The network is trained end-to-end and comprises ten convolutional layers
and five pooling layers for feature extraction, and two fully connected layers
for regression, which makes it significantly deeper than related IQA models.
Unique features of the proposed architecture are that: 1) with slight
adaptations it can be used in a no-reference (NR) as well as in a
full-reference (FR) IQA setting and 2) it allows for joint learning of local
quality and local weights, i.e., relative importance of local quality to the
global quality estimate, in an unified framework. Our approach is purely
data-driven and does not rely on hand-crafted features or other types of prior
domain knowledge about the human visual system or image statistics. We evaluate
the proposed approach on the LIVE, CISQ, and TID2013 databases as well as the
LIVE In the wild image quality challenge database and show superior performance
to state-of-the-art NR and FR IQA methods. Finally, cross-database evaluation
shows a high ability to generalize between different databases, indicating a
high robustness of the learned features
Adaptive Density Estimation for Generative Models
Unsupervised learning of generative models has seen tremendous progress over
recent years, in particular due to generative adversarial networks (GANs),
variational autoencoders, and flow-based models. GANs have dramatically
improved sample quality, but suffer from two drawbacks: (i) they mode-drop,
i.e., do not cover the full support of the train data, and (ii) they do not
allow for likelihood evaluations on held-out data. In contrast,
likelihood-based training encourages models to cover the full support of the
train data, but yields poorer samples. These mutual shortcomings can in
principle be addressed by training generative latent variable models in a
hybrid adversarial-likelihood manner. However, we show that commonly made
parametric assumptions create a conflict between them, making successful hybrid
models non trivial. As a solution, we propose to use deep invertible
transformations in the latent variable decoder. This approach allows for
likelihood computations in image space, is more efficient than fully invertible
models, and can take full advantage of adversarial training. We show that our
model significantly improves over existing hybrid models: offering GAN-like
samples, IS and FID scores that are competitive with fully adversarial models,
and improved likelihood scores
Active Image-based Modeling with a Toy Drone
Image-based modeling techniques can now generate photo-realistic 3D models
from images. But it is up to users to provide high quality images with good
coverage and view overlap, which makes the data capturing process tedious and
time consuming. We seek to automate data capturing for image-based modeling.
The core of our system is an iterative linear method to solve the multi-view
stereo (MVS) problem quickly and plan the Next-Best-View (NBV) effectively. Our
fast MVS algorithm enables online model reconstruction and quality assessment
to determine the NBVs on the fly. We test our system with a toy unmanned aerial
vehicle (UAV) in simulated, indoor and outdoor experiments. Results show that
our system improves the efficiency of data acquisition and ensures the
completeness of the final model.Comment: To be published on International Conference on Robotics and
Automation 2018, Brisbane, Australia. Project Page:
https://huangrui815.github.io/active-image-based-modeling/ The author's
personal page: http://www.sfu.ca/~rha55
Exploiting Unlabeled Data in CNNs by Self-supervised Learning to Rank
For many applications the collection of labeled data is expensive laborious.
Exploitation of unlabeled data during training is thus a long pursued objective
of machine learning. Self-supervised learning addresses this by positing an
auxiliary task (different, but related to the supervised task) for which data
is abundantly available. In this paper, we show how ranking can be used as a
proxy task for some regression problems. As another contribution, we propose an
efficient backpropagation technique for Siamese networks which prevents the
redundant computation introduced by the multi-branch network architecture. We
apply our framework to two regression problems: Image Quality Assessment (IQA)
and Crowd Counting. For both we show how to automatically generate ranked image
sets from unlabeled data. Our results show that networks trained to regress to
the ground truth targets for labeled data and to simultaneously learn to rank
unlabeled data obtain significantly better, state-of-the-art results for both
IQA and crowd counting. In addition, we show that measuring network uncertainty
on the self-supervised proxy task is a good measure of informativeness of
unlabeled data. This can be used to drive an algorithm for active learning and
we show that this reduces labeling effort by up to 50%.Comment: Accepted at TPAMI. (Keywords: Learning from rankings, image quality
assessment, crowd counting, active learning). arXiv admin note: text overlap
with arXiv:1803.0309
- …