Blind image quality evaluation using perception based features
This paper proposes a novel no-reference Perception-based Image Quality Evaluator (PIQUE) for real-world imagery. A majority of the existing methods for blind image quality assessment rely on opinion-based supervised learning for quality score prediction. Unlike these methods, we propose an opinion-unaware methodology that attempts to quantify distortion without the need for any training data. Our method relies on extracting local features for predicting quality. Additionally, to mimic human behavior, we estimate quality only from perceptually significant spatial regions. Further, the choice of our features enables us to generate a fine-grained block-level distortion map. Our algorithm is competitive with the state of the art based on evaluation over several popular datasets including LIVE IQA, TID, and CSIQ. Finally, our algorithm has low computational complexity despite working at the block level.
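As a rough illustration of estimating quality only from perceptually significant regions, the sketch below splits a grayscale image into non-overlapping blocks and keeps those whose pixel variance exceeds a threshold. Using raw pixel variance as the activity measure, along with the names `block_variances` and `active_blocks`, the block size, and the threshold, are assumptions made for this example; the abstract does not specify the features or the selection rule.

```python
def block_variances(image, block):
    """Map each block's (top, left) origin to the variance of its pixels.

    `image` is a list of equal-length rows of numbers (grayscale).
    Partial blocks at the right and bottom edges are skipped.
    """
    height, width = len(image), len(image[0])
    variances = {}
    for top in range(0, height - block + 1, block):
        for left in range(0, width - block + 1, block):
            pixels = [
                image[r][c]
                for r in range(top, top + block)
                for c in range(left, left + block)
            ]
            mean = sum(pixels) / len(pixels)
            variances[(top, left)] = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return variances


def active_blocks(image, block, threshold):
    """Return origins of blocks whose variance exceeds `threshold` (assumed rule)."""
    return [origin for origin, var in block_variances(image, block).items() if var > threshold]


# Toy 4x4 image: flat on the left, textured on the right.
toy = [
    [0, 0, 0, 9],
    [0, 0, 9, 0],
    [0, 0, 0, 9],
    [0, 0, 9, 0],
]
selected = active_blocks(toy, block=2, threshold=1.0)
```

Only the selected blocks would then be scored, and the per-block scores could be laid out on the block grid to form a distortion map.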
An Universal Image Attractiveness Ranking Framework
We propose a new framework to rank image attractiveness using a novel
pairwise deep network trained with a large set of side-by-side multi-labeled
image pairs from a web image index. The judges only provide relative ranking
between two images without the need to directly assign an absolute score, or
rate any predefined image attribute, thus making the rating more intuitive and
accurate. We investigate a deep attractiveness rank net (DARN), a combination
of deep convolutional neural network and rank net, to directly learn an
attractiveness score mean and variance for each image and the underlying
criteria the judges use to label each pair. The extension of this model
(DARN-V2) is able to adapt to individual judge's personal preference. We also
show that the attractiveness of search results is significantly improved by using
this attractiveness information in a real commercial search engine. We evaluate
our model against other state-of-the-art models on our side-by-side web test
data and another public aesthetic data set. With far fewer judgments (1M vs
50M), our model outperforms them on side-by-side labeled data, and is comparable on
data labeled by absolute score. Comment: Accepted by 2019 Winter Conference on Application of Computer Vision
(WACV)
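One way to picture learning a score mean and variance from pairwise labels is the sketch below. It assumes each image's attractiveness score is an independent Gaussian, so the probability that image A is preferred over image B follows from the Gaussian CDF of the score difference, and the loss is the negative log-likelihood of the judge's choice. This Gaussian formulation and the function names are assumptions for illustration only; the abstract does not give the actual DARN loss.

```python
import math


def preference_probability(mean_a, var_a, mean_b, var_b):
    """P(A preferred over B) when both scores are independent Gaussians (assumed model)."""
    spread = math.sqrt(var_a + var_b)  # std. dev. of the score difference
    z = (mean_a - mean_b) / spread
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF


def pairwise_loss(mean_a, var_a, mean_b, var_b, a_preferred):
    """Negative log-likelihood of one side-by-side judgment."""
    p = preference_probability(mean_a, var_a, mean_b, var_b)
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # clamp to avoid log(0)
    return -math.log(p) if a_preferred else -math.log(1.0 - p)


# A judgment that agrees with the predicted ordering costs less than one that contradicts it.
agree = pairwise_loss(2.0, 1.0, 0.0, 1.0, a_preferred=True)
disagree = pairwise_loss(2.0, 1.0, 0.0, 1.0, a_preferred=False)
```

In a trained model the means and variances would be network outputs and this loss would be minimized by gradient descent; here they are plain numbers to keep the example self-contained.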
Blind Quality Assessment for Image Superresolution Using Deep Two-Stream Convolutional Networks
Numerous image superresolution (SR) algorithms have been proposed for
reconstructing high-resolution (HR) images from input images with lower spatial
resolutions. However, effectively evaluating the perceptual quality of SR
images remains a challenging research problem. In this paper, we propose a
no-reference/blind deep neural network-based SR image quality assessor
(DeepSRQ). To learn more discriminative feature representations of various
distorted SR images, the proposed DeepSRQ is a two-stream convolutional network
including two subcomponents for distorted structure and texture SR images.
Different from traditional image distortions, the artifacts of SR images cause
both image structure and texture quality degradation. Therefore, we choose the
two-stream scheme that captures different properties of SR inputs instead of
directly learning features from one image stream. Considering the human visual
system (HVS) characteristics, the structure stream focuses on extracting
features in structural degradations, while the texture stream focuses on the
change in textural distributions. In addition, to augment the training data and
ensure the category balance, we propose a stride-based adaptive cropping
approach for further improvement. Experimental results on three publicly
available SR image quality databases demonstrate the effectiveness and
generalization ability of our proposed DeepSRQ method compared with
state-of-the-art image quality assessment algorithms.
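The stride-based adaptive cropping idea can be sketched as follows, under the assumption that category balance is achieved by choosing a smaller stride (denser, overlapping crops) for under-represented categories until each yields at least a target number of patches. The helper names `crop_origins` and `pick_stride`, the patch size, and the selection rule are hypothetical; the abstract does not describe how the stride is adapted.

```python
def crop_origins(height, width, patch, stride):
    """Top-left corners of all patch-sized crops taken at the given stride."""
    rows = range(0, height - patch + 1, stride)
    cols = range(0, width - patch + 1, stride)
    return [(r, c) for r in rows for c in cols]


def pick_stride(height, width, patch, target_patches):
    """Largest stride (at most `patch`) that still yields `target_patches` crops.

    Falls back to stride 1 if even the densest sampling is not enough.
    """
    for stride in range(patch, 0, -1):
        if len(crop_origins(height, width, patch, stride)) >= target_patches:
            return stride
    return 1


# A well-represented category can use non-overlapping crops,
# while a rare category gets a smaller stride and thus more crops per image.
common_stride = pick_stride(64, 64, patch=32, target_patches=4)
rare_stride = pick_stride(64, 64, patch=32, target_patches=9)
```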
ICface: Interpretable and Controllable Face Reenactment Using GANs
This paper presents a generic face animator that is able to control the pose
and expressions of a given face image. The animation is driven by human
interpretable control signals consisting of head pose angles and the Action
Unit (AU) values. The control information can be obtained from multiple sources
including external driving videos and manual controls. Due to the interpretable
nature of the driving signal, one can easily mix the information between
multiple sources (e.g. pose from one image and expression from another) and
apply selective post-production editing. The proposed face animator is
implemented as a two-stage neural network model that is learned in a
self-supervised manner using a large video collection. The proposed
Interpretable and Controllable face reenactment network (ICface) is compared to
the state-of-the-art neural network-based face animation techniques in multiple
tasks. The results indicate that ICface produces better visual quality while
being more versatile than most of the comparison methods. The introduced model
could provide a lightweight and easy to use tool for a multitude of advanced
image and video editing tasks. Comment: Accepted in WACV-202
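Because the driving signal is interpretable, mixing sources reduces to combining fields. The sketch below takes head pose angles from one control signal and Action Unit values from another. The `ControlSignal` structure, its field names, and the AU keys are hypothetical and chosen for illustration; the abstract only states that the signal consists of head pose angles and AU values.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ControlSignal:
    """Hypothetical container for an interpretable driving signal."""
    yaw: float
    pitch: float
    roll: float
    action_units: Dict[str, float]  # AU name -> activation value


def mix_controls(pose_source: ControlSignal, expression_source: ControlSignal) -> ControlSignal:
    """Take head pose from one source and expression (AU values) from another."""
    return ControlSignal(
        yaw=pose_source.yaw,
        pitch=pose_source.pitch,
        roll=pose_source.roll,
        action_units=dict(expression_source.action_units),  # copy to avoid aliasing
    )


turned_head = ControlSignal(yaw=30.0, pitch=-5.0, roll=0.0, action_units={"AU12": 0.0})
smiling = ControlSignal(yaw=0.0, pitch=0.0, roll=0.0, action_units={"AU12": 0.8})
mixed = mix_controls(pose_source=turned_head, expression_source=smiling)
```

The mixed signal could also be edited by hand before being passed to the animator, which corresponds to the manual controls mentioned above.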