Context-Patch Face Hallucination Based on Thresholding Locality-constrained Representation and Reproducing Learning
Face hallucination is a technique that reconstructs high-resolution (HR) faces from low-resolution (LR) faces using prior knowledge learned from HR/LR face pairs. Most state-of-the-art methods leverage the position-patch prior of the human face to estimate the optimal representation coefficients for each image patch. However, they focus only on position information and usually ignore the context information of the image patch. In addition, when confronted with misalignment or the Small Sample Size (SSS) problem, their hallucination performance degrades severely. To this end, this study incorporates the contextual information of image patches and proposes a powerful and efficient context-patch based face hallucination approach, namely Thresholding Locality-constrained Representation and Reproducing learning (TLcR-RL). Under the context-patch based framework, we advance a thresholding-based representation method to enhance reconstruction accuracy and reduce computational complexity. To further improve performance, we propose a promotion strategy called reproducing learning: by adding the estimated HR face to the training set, we simulate the case in which the HR version of the input LR face is present in the training set, thus iteratively enhancing the final hallucination result. Experiments demonstrate that the proposed TLcR-RL method achieves a substantial improvement in hallucination results, both subjectively and objectively. Additionally, the proposed framework is more robust to face misalignment and the SSS problem, and its hallucinated HR faces remain of high quality when the LR test face comes from the real world. The MATLAB source code is available at https://github.com/junjun-jiang/TLcR-RL
Comment: 13 pages, 15 figures, Accepted by IEEE TCY
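The thresholding-plus-locality-constrained step described above can be sketched compactly. The snippet below is a minimal illustration, not the authors' MATLAB code: it keeps only the `K` training patches nearest to the input patch (the thresholding), then solves an LLE-style locality-regularized least-squares problem with a sum-to-one constraint in closed form. The parameter `tau` and the small ridge term are illustrative assumptions.

```python
import numpy as np

def tlcr_coefficients(x, D, K=5, tau=1e-3):
    # Thresholding: keep only the K training patches nearest to x,
    # discarding the rest before solving for representation weights.
    d = np.linalg.norm(D - x[:, None], axis=0)
    idx = np.argsort(d)[:K]
    Dk, dk = D[:, idx], d[idx]
    # Locality-constrained least squares with a sum-to-one constraint:
    # closed-form solution of the shifted (LLE-style) reconstruction problem,
    # with weights penalized in proportion to each patch's distance from x.
    Z = Dk - x[:, None]
    G = Z.T @ Z + tau * np.diag(dk ** 2)
    w = np.linalg.solve(G + 1e-8 * np.eye(K), np.ones(K))
    w /= w.sum()
    return idx, w
```

In a position/context-patch scheme of this kind, the same indices and weights would then be applied to the corresponding HR training patches to synthesize the HR patch.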
Kernel based low-rank sparse model for single image super-resolution
Self-similarity learning has been recognized in recent years as a promising method for single image super-resolution (SR), i.e., producing a high-resolution (HR) image from a single input. The performance of learning-based SR reconstruction, however, highly depends on the learned representation coefficients. Due to the degradation of the input image, conventional sparse coding is prone to produce unfaithful representation coefficients. To this end, we propose a novel kernel-based low-rank sparse model with self-similarity learning for single image SR, which incorporates a nonlocal-similarity prior to enforce that similar patches have similar representation weights. We perform a gradual magnification scheme, using self-examples extracted from the degraded input image and its up-scaled versions. To exploit nonlocal similarity, we concatenate the vectorized input patch and its nonlocal neighbors at different locations into a data matrix consisting of similar components. We then map this nonlocal data matrix into a high-dimensional feature space via a kernel method to capture its nonlinear structure. Under the assumption that the sparse coefficients for the nonlocal data in the kernel space should be low-rank, we impose a low-rank constraint on the sparse coding to share similarities among representation coefficients and remove outliers, so that stable weights for SR reconstruction can be obtained. Experimental results demonstrate the advantage of our proposed method in both visual quality and reconstruction error.
Comment: 27 pages, Keywords: low-rank, sparse representation, kernel method, self-similarity learning, super-resolution
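Low-rank constraints of the kind described here are commonly enforced with singular value thresholding, the proximal operator of the nuclear norm. The sketch below (plain NumPy, not the paper's kernel-space solver) shows the operator that would shrink the matrix of sparse coefficients of a patch and its nonlocal neighbors toward low rank at each iteration; `tau` is a hypothetical shrinkage parameter.

```python
import numpy as np

def svt(W, tau):
    # Singular value thresholding: shrink each singular value by tau,
    # zeroing the small ones, which pushes W toward a low-rank matrix.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s = np.maximum(s - tau, 0.0)
    return U @ np.diag(s) @ Vt
```

Because similar patches should share representation weights, thresholding away the small singular values suppresses the noisy, patch-specific components while keeping the shared structure.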
Synthesis-based Robust Low Resolution Face Recognition
Recognition of low-resolution face images is a challenging problem in many practical face recognition systems. Methods proposed in the face recognition literature for this problem assume that the probe is low resolution but a high-resolution gallery is available for recognition. These attempts have aimed at modifying the probe image so that the resultant image provides better discrimination. We formulate the problem differently by leveraging the information available in the high-resolution gallery image and propose a dictionary learning approach for classifying the low-resolution probe image. An important feature of our algorithm is that it can handle resolution change along with illumination variations. Furthermore, we kernelize the algorithm to handle non-linearity in the data and present a joint dictionary learning technique for robust recognition at low resolutions. The effectiveness of the proposed method is demonstrated on standard datasets and a challenging outdoor face dataset. It is shown that our method is efficient and can perform significantly better than many competitive low-resolution face recognition algorithms.
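The classification side of dictionary-based recognition can be sketched with residual comparison: reconstruct the probe from each class's gallery atoms and assign the label with the smallest reconstruction error. This is a deliberately simplified stand-in (plain least squares per class, rather than the learned, kernelized joint dictionaries of the paper):

```python
import numpy as np

def classify_by_residual(y, dicts):
    # dicts maps each class label to a matrix whose columns are that
    # class's gallery features; assign the label whose atoms best
    # reconstruct the probe vector y.
    best, best_r = None, np.inf
    for label, D in dicts.items():
        w, *_ = np.linalg.lstsq(D, y, rcond=None)
        r = np.linalg.norm(y - D @ w)
        if r < best_r:
            best, best_r = label, r
    return best
```

A learned dictionary replaces the raw gallery columns with atoms trained to make these per-class residuals more discriminative across resolution and illumination changes.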
A survey of sparse representation: algorithms and applications
Sparse representation has attracted much attention from researchers in the fields of signal processing, image processing, computer vision, and pattern recognition. Sparse representation also has a good reputation in both theoretical research and practical applications. Many different algorithms have been proposed for sparse representation. The main purpose of this article is to provide a comprehensive study and an updated review of sparse representation and to supply guidance for researchers. The taxonomy of sparse representation methods can be studied from various viewpoints. For example, in terms of the different norm minimizations used in the sparsity constraint, the methods can be roughly categorized into five groups: sparse representation with l0-norm minimization, sparse representation with lp-norm (0<p<1) minimization, sparse representation with l1-norm minimization, sparse representation with l2,1-norm minimization, and sparse representation with l2-norm minimization. In this paper, a comprehensive overview of sparse representation is provided. The available sparse representation algorithms can also be empirically categorized into four groups: greedy strategy approximation, constrained optimization, proximity algorithm-based optimization, and homotopy algorithm-based sparse representation. The rationales of the different algorithms in each category are analyzed, and a wide range of sparse representation applications are summarized, which could sufficiently reveal the potential of sparse representation theory. Specifically, an experimental comparative study of these sparse representation algorithms is presented. The Matlab code used in this paper is available at: http://www.yongxu.org/lunwen.html.
Comment: Published on IEEE Access, Vol. 3, pp. 490-530, 201
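Of the four algorithmic families listed, the greedy strategy is the easiest to illustrate. Below is a compact Orthogonal Matching Pursuit, a representative greedy method (not code from the survey): at each step it selects the atom most correlated with the current residual, then re-fits the coefficients on the chosen support by least squares.

```python
import numpy as np

def omp(D, y, k):
    # Orthogonal Matching Pursuit: greedily grow the support one atom
    # at a time, re-solving least squares on the support each step.
    residual, support = y.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        w, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ w
    x = np.zeros(D.shape[1])
    x[support] = w
    return x
```

The re-fit on the full support is what distinguishes OMP from plain Matching Pursuit, and it guarantees the residual is orthogonal to every selected atom.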
Robust Emotion Recognition from Low Quality and Low Bit Rate Video: A Deep Learning Approach
Emotion recognition from facial expressions is tremendously useful,
especially when coupled with smart devices and wireless multimedia
applications. However, the inadequate network bandwidth often limits the
spatial resolution of the transmitted video, which will heavily degrade the
recognition reliability. We develop a novel framework to achieve robust emotion
recognition from low bit rate video. While video frames are downsampled at the
encoder side, the decoder is embedded with a deep network model for joint
super-resolution (SR) and recognition. Notably, we propose a novel max-mix
training strategy, leading to a single "One-for-All" model that is remarkably
robust to a vast range of downsampling factors. That makes our framework well
adapted for the varied bandwidths in real transmission scenarios, without
hampering scalability or efficiency. The proposed framework is evaluated on the
AVEC 2016 benchmark and demonstrates significantly improved stand-alone recognition performance, as well as rate-distortion (R-D) performance, compared with either directly recognizing from LR frames or separating SR and recognition.
Comment: Accepted by the Seventh International Conference on Affective Computing and Intelligent Interaction (ACII 2017)
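The max-mix idea, exposing a single model to a wide range of downsampling factors, can be mimicked on the data side. This toy NumPy sketch (the factor set and the pooling/upsampling choices are illustrative assumptions, not the paper's codec) degrades each training image with a randomly drawn factor, so one "One-for-All" model sees the whole range during training:

```python
import numpy as np

def degrade(img, factor):
    # Average-pool downsample by `factor` (cropping to a multiple of it),
    # then nearest-neighbor upsample back to the pooled grid's full size.
    h, w = img.shape
    small = img[:h - h % factor, :w - w % factor]
    small = small.reshape(small.shape[0] // factor, factor,
                          small.shape[1] // factor, factor).mean(axis=(1, 3))
    return np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)

def mixed_factor_batch(images, factors, rng):
    # One mixed-factor batch: each sample gets its own random factor.
    return [degrade(im, int(rng.choice(factors))) for im in images]
```

Training a recognizer (or joint SR-plus-recognition network) on such mixed batches is what makes a single model robust across the bandwidths seen at test time.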
Structure-Preserving Image Super-resolution via Contextualized Multi-task Learning
Single image super-resolution (SR), which refers to reconstructing a higher-resolution (HR) image from an observed low-resolution (LR) image, has received substantial attention due to its tremendous application potential. Despite the breakthroughs of recently proposed SR methods using convolutional neural networks (CNNs), their generated results usually fail to preserve structural (high-frequency) details. In this paper, regarding global boundary context and residual context as complementary information for enhancing structural details in image restoration, we develop a contextualized multi-task learning framework to address the SR problem. Specifically, our method first extracts convolutional features from the input LR image and applies a deconvolutional module to interpolate the LR feature maps in a content-adaptive way. The resulting feature maps are then fed into two branched sub-networks. During training, one sub-network outputs salient image boundaries and the HR image, and the other outputs the local residual map, i.e., the residual difference between the generated HR image and the ground-truth image. On several standard benchmarks (i.e., Set5, Set14 and BSD200), our extensive evaluations demonstrate the effectiveness of our SR method in achieving both higher restoration quality and computational efficiency compared with several state-of-the-art SR approaches. The source code and some SR results can be found at: http://hcp.sysu.edu.cn/structure-preserving-image-super-resolution/
Comment: To appear in Transactions on Multimedia 201
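The two-branch supervision can be viewed as a joint objective that scores both the pixel output and a structure map. The snippet below is a simplified stand-in, not the paper's network loss: it uses gradient magnitude as a hypothetical boundary target and a fixed weight `alpha` to combine the terms.

```python
import numpy as np

def multitask_loss(pred_hr, pred_edges, gt_hr, alpha=0.1):
    # Joint objective: pixel reconstruction error plus a structure term
    # that supervises a predicted boundary map against the ground truth's
    # gradient magnitude (a crude proxy for salient boundaries).
    recon = np.mean((pred_hr - gt_hr) ** 2)
    gy, gx = np.gradient(gt_hr)
    gt_edges = np.hypot(gx, gy)
    edge = np.mean((pred_edges - gt_edges) ** 2)
    return recon + alpha * edge
```

The point of such a combined loss is that pixel-wise error alone under-penalizes blurred high-frequency detail, while the structure term explicitly rewards sharp boundaries.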
UG2 Track 2: A Collective Benchmark Effort for Evaluating and Advancing Image Understanding in Poor Visibility Environments
The UG2 challenge in IEEE CVPR 2019 aims to evoke a comprehensive discussion and exploration of how low-level vision techniques can benefit high-level automatic visual recognition in various scenarios. In its second track, we focus on object or face detection in poor-visibility environments caused by bad weather (haze, rain) and low-light conditions. While existing enhancement methods are empirically expected to help the high-level end task, this is observed not always to be the case in practice. To provide a more thorough examination and fair comparison, we introduce three benchmark sets collected in real-world hazy, rainy, and low-light conditions, respectively, with objects/faces annotated. To the best of our knowledge, this is the first and currently largest effort of its kind. Baseline results obtained by cascading existing enhancement and detection models are reported, indicating the highly challenging nature of our new data as well as the large room for further technical innovation. We expect broad participation from the research community to address these challenges together.
Comment: A summary paper on datasets, fact sheets, baseline results, challenge results, and winning methods in the UG2 Challenge (Track 2). More materials are provided at http://www.ug2challenge.org/index.htm
Attention-Aware Face Hallucination via Deep Reinforcement Learning
Face hallucination is a domain-specific super-resolution problem whose goal is to generate high-resolution (HR) faces from low-resolution (LR) input images. In contrast to existing methods, which often learn a single patch-to-patch mapping from LR to HR images and disregard the contextual interdependency between patches, we propose a novel Attention-aware Face Hallucination (Attention-FH) framework that resorts to deep reinforcement learning to sequentially discover attended patches and then perform facial part enhancement by fully exploiting the global interdependency of the image. Specifically, at each time step, a recurrent policy network dynamically specifies a new attended region by incorporating what happened in the past. The state (i.e., the face hallucination result for the whole image) can thus be exploited and updated by the local enhancement network on the selected region. The Attention-FH approach jointly learns the recurrent policy network and the local enhancement network by maximizing a long-term reward that reflects the hallucination performance over the whole image. Therefore, our proposed Attention-FH can adaptively personalize an optimal search path for each face image according to its own characteristics. Extensive experiments show that our approach significantly surpasses the state-of-the-art on in-the-wild faces with large pose and illumination variations.
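A hand-crafted caricature of this sequential attention loop: pick a region by a scoring rule (standing in for the learned policy network), apply a local enhancement, and repeat, each step conditioned on the current state of the image. Everything here, the variance score, the sharpening rule, the patch size, is an illustrative assumption; the actual method learns both the policy and the enhancement networks end-to-end with a long-term reward.

```python
import numpy as np

def sequential_enhance(img, patch=8, steps=4):
    # Toy stand-in for Attention-FH: repeatedly attend to the patch with
    # the lowest local variance (a hand-crafted "policy" score) and apply
    # a simple contrast boost there; the state evolves between steps.
    out = img.copy()
    h, w = out.shape
    for _ in range(steps):
        scores = {}
        for i in range(0, h - patch + 1, patch):
            for j in range(0, w - patch + 1, patch):
                scores[(i, j)] = out[i:i + patch, j:j + patch].var()
        i, j = min(scores, key=scores.get)   # "attend" to the flattest patch
        region = out[i:i + patch, j:j + patch]
        out[i:i + patch, j:j + patch] = region + 0.5 * (region - region.mean())
    return out
```

The essential structure mirrors the abstract: a scoring step chooses where to act next given everything done so far, and a local operator updates only that region.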
Joint Maximum Purity Forest with Application to Image Super-Resolution
In this paper, we propose a novel random-forest scheme, namely Joint Maximum Purity Forest (JMPF), for classification, clustering, and regression tasks. In the JMPF scheme, the original feature space is transformed into a compactly pre-clustered feature space via a trained rotation matrix. The rotation matrix is obtained through an iterative quantization process, in which the input data belonging to different classes are clustered to the respective vertices of the new feature space with maximum purity. In the new feature space, the orthogonal hyperplanes employed at the split nodes of decision trees in random forests can tackle clustering problems effectively. We evaluated our proposed method on public benchmark datasets for regression and classification tasks, and experiments showed that JMPF remarkably outperforms other state-of-the-art random-forest-based approaches. Furthermore, we applied JMPF to image super-resolution, because the transformed, compact features are more discriminative for the clustering-regression scheme. Experimental results on several public benchmark datasets also show that the JMPF-based image super-resolution scheme is consistently superior to recent state-of-the-art image super-resolution algorithms.
Comment: 18 pages, 7 figures
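The rotation-by-iterative-quantization step closely resembles ITQ-style alternation: binarize the rotated data to the nearest hypercube vertex, then update the rotation by solving an orthogonal Procrustes problem. A minimal NumPy sketch under that reading (class purity and the forest itself are omitted):

```python
import numpy as np

def itq_rotation(V, n_iter=20, seed=0):
    # Learn an orthogonal rotation R that maps the zero-centered rows of V
    # close to binary vertices sign(V @ R), by alternating:
    #   1) assign each rotated point to its nearest hypercube vertex;
    #   2) update R via the orthogonal Procrustes solution.
    rng = np.random.default_rng(seed)
    d = V.shape[1]
    R, _ = np.linalg.qr(rng.normal(size=(d, d)))   # random orthogonal init
    for _ in range(n_iter):
        B = np.sign(V @ R)                         # step 1: vertex assignment
        U, _, Wt = np.linalg.svd(V.T @ B)          # step 2: Procrustes update
        R = U @ Wt
    return R
```

After such a rotation, data points cluster near vertices of the new feature space, which is why the axis-aligned (orthogonal-hyperplane) splits used at random-forest nodes separate them cleanly.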