Generalized Expectation Maximization Framework for Blind Image Super Resolution
Learning-based methods for blind single image super resolution (SISR) conduct
the restoration by a learned mapping between high-resolution (HR) images and
their low-resolution (LR) counterparts degraded with arbitrary blur kernels.
However, these methods mostly require an independent step to estimate the blur
kernel, leading to error accumulation between steps. We propose an end-to-end
learning framework for the blind SISR problem, which enables image restoration
within a unified Bayesian framework with either full- or semi-supervision. The
proposed method, namely SREMN, integrates learning techniques into the
generalized expectation-maximization (GEM) algorithm and infers HR images from
the maximum likelihood estimation (MLE). Extensive experiments show the
superiority of the proposed method over existing work and its novelty in
semi-supervised learning.
NLCUnet: Single-Image Super-Resolution Network with Hairline Details
Pursuing the precise details of super-resolution images is challenging for
single-image super-resolution tasks. This paper presents a single-image
super-resolution network with hairline details (termed NLCUnet), including
three core designs. Specifically, a non-local attention mechanism is first
introduced to restore local pieces by learning from the whole image region.
Then, we find that the blur kernel trained by the existing work is unnecessary.
Based on this finding, we create a new network architecture by integrating
depth-wise convolution with channel attention without the blur kernel
estimation, resulting in a performance improvement instead. Finally, to make
the cropped region contain as much semantic information as possible, we propose
a random 64×64 crop inside the central 512×512 crop instead of a
direct random crop inside the whole image of 2K size. Numerous experiments
conducted on the benchmark DF2K dataset demonstrate that our NLCUnet performs
better than the state-of-the-art in terms of the PSNR and SSIM metrics and
yields visually favorable hairline details.
Comment: 6 pages, 5 figures
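The cropping strategy described in this abstract (a random 64×64 patch drawn from inside the central 512×512 region rather than from the whole 2K image) can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: the function name `central_random_crop`, its parameters, and the choice of a uniform random offset are assumptions made for the example.

```python
import numpy as np


def central_random_crop(image, central=512, patch=64, rng=None):
    """Return a random patch x patch crop taken inside the central
    central x central region of `image` (H x W or H x W x C).

    Hypothetical helper for illustration; names and defaults are assumed.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    if h < central or w < central:
        raise ValueError("image is smaller than the central region")
    if patch > central:
        raise ValueError("patch must fit inside the central region")

    # Top-left corner of the central region.
    top0 = (h - central) // 2
    left0 = (w - central) // 2

    # Uniform random offset of the patch inside the central region.
    dy = int(rng.integers(0, central - patch + 1))
    dx = int(rng.integers(0, central - patch + 1))

    top, left = top0 + dy, left0 + dx
    return image[top:top + patch, left:left + patch]
```

Restricting the offsets to the central region means every patch stays at least `(H - 512) // 2` rows and `(W - 512) // 2` columns away from the image border, which is one plausible reading of how the crop is kept on the semantically richer part of the image.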
MMSR: Multiple-Model Learned Image Super-Resolution Benefiting From Class-Specific Image Priors
Assuming a known degradation model, the performance of a learned image
super-resolution (SR) model depends on how well the variety of image
characteristics within the training set matches those in the test set. As a
result, the performance of an SR model varies noticeably from image to image
over a test set depending on whether characteristics of specific images are
similar to those in the training set or not. Hence, in general, a single SR
model cannot generalize well enough for all types of image content. In this
work, we show that training multiple SR models for different classes of images
(e.g., for text, texture, etc.) to exploit class-specific image priors and
employing a post-processing network that learns how to best fuse the outputs
produced by these multiple SR models surpasses the performance of
state-of-the-art generic SR models. Experimental results clearly demonstrate
that the proposed multiple-model SR (MMSR) approach significantly outperforms a
single pre-trained state-of-the-art SR model both quantitatively and visually.
It even exceeds the performance of the best single class-specific SR model
trained on similar text or texture images.
Comment: 5 pages, 4 figures, accepted for publication in IEEE ICIP 2022 Conference
Cross-Quality LFW: A Database for Analyzing Cross-Resolution Image Face Recognition in Unconstrained Environments
Real-world face recognition applications often deal with suboptimal image
quality or resolution due to different capturing conditions such as various
subject-to-camera distances, poor camera settings, or motion blur. This
characteristic has an unignorable effect on performance. Recent
cross-resolution face recognition approaches used simple, arbitrary, and
unrealistic down- and up-scaling techniques to measure robustness against
real-world edge-cases in image quality. Thus, we propose a new standardized
benchmark dataset and evaluation protocol derived from the famous Labeled Faces
in the Wild (LFW). In contrast to previous derivatives, which focus on pose,
age, similarity, and adversarial attacks, our Cross-Quality Labeled Faces in
the Wild (XQLFW) maximizes the quality difference. It contains only more
realistic synthetically degraded images when necessary. Our proposed dataset is
then used to further investigate the influence of image quality on several
state-of-the-art approaches. With XQLFW, we show that these models perform
differently in cross-quality cases, and hence, the generalizing capability is
not accurately predicted by their performance on LFW. Additionally, we report
baseline accuracy with recent deep learning models explicitly trained for
cross-resolution applications and evaluate the susceptibility to image quality.
To encourage further research in cross-resolution face recognition and incite
the assessment of image quality robustness, we publish the database and code
for evaluation.
Comment: 9 pages, 4 figures, 2 tables
A Modular Deep Learning Framework for Scene Understanding in Augmented Reality Applications
Taking natural images and videos as input, augmented reality (AR) applications aim to enhance the real world with superimposed digital content, enabling interaction between the user and the environment. One important step in this process is automatic scene analysis and understanding, which should be performed both in real time and with a good level of object recognition accuracy. In this work, an end-to-end framework that combines a Super Resolution network with a detection and recognition deep network is proposed to increase performance and lower processing time. This novel approach has been evaluated on two different datasets: the popular COCO dataset, whose real images are used for benchmarking many different computer vision tasks, and a generated dataset with synthetic images recreating a variety of environmental, lighting and acquisition conditions. The evaluation focuses on small objects, which are more challenging to detect and recognise correctly. The results show that, with the proposed end-to-end approach, the Average Precision is higher for smaller and low-resolution objects in most of the selected conditions.
Trainable Loss Weights in Super-Resolution
In recent years, research on super-resolution has primarily focused on the
development of unsupervised models, blind networks, and the use of optimization
methods in non-blind models. However, little research has discussed the loss
function in the super-resolution process, and most of those studies have only
used perceptual similarity in a conventional way, even though developing an
appropriate loss can improve the quality of other methods as well. In this
article, a new weighting method for pixel-wise loss is proposed.
With the help of this method, it is possible to use trainable weights based on
the general structure of the image and its perceptual features while
maintaining the advantages of pixel-wise loss. Also, a criterion for comparing
weights of loss is introduced so that the weights can be estimated directly by
a convolutional neural network using this criterion. The
expectation-maximization method is used to estimate the super-resolution
network and the weighting network simultaneously. A new activation function,
called "FixedSum", is also introduced, which keeps the sum of all components of
the vector constant while keeping the output components between zero and one.
As shown in the experimental results section, the loss weighted by the
proposed method leads to better results than the unweighted loss in terms of
both signal-to-noise ratio and perceptual similarity.
Comment: 7 pages, 3 figures, 1 table
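The idea of a pixel-wise loss with normalized weights can be illustrated with a small NumPy sketch. The abstract does not define "FixedSum", so the example below uses a plain softmax over all pixels as a stand-in: it shares the two stated properties (components between zero and one, fixed sum) but should not be read as the paper's activation. The function names and the use of an L1 error are likewise assumptions.

```python
import numpy as np


def softmax_weights(scores):
    """Map raw per-pixel scores to weights in (0, 1) that sum to 1.

    Stand-in for the paper's "FixedSum" activation, which is NOT
    specified in the abstract; softmax is an assumed substitute.
    """
    flat = scores.reshape(-1).astype(np.float64)
    flat = flat - flat.max()          # subtract max for numerical stability
    e = np.exp(flat)
    return (e / e.sum()).reshape(scores.shape)


def weighted_l1_loss(sr, hr, scores):
    """Weighted pixel-wise L1 loss between a super-resolved image `sr`
    and its ground truth `hr`, with weights derived from `scores`
    (same shape as the images)."""
    w = softmax_weights(scores)
    return float(np.sum(w * np.abs(sr - hr)))
```

With constant scores the weights are uniform and the loss reduces to the ordinary mean absolute error, so the unweighted loss is a special case of this sketch. In the setting the abstract describes, the scores would come from a trainable weighting network rather than being supplied by hand.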