728 research outputs found
Context-Patch Face Hallucination Based on Thresholding Locality-Constrained Representation and Reproducing Learning
Face hallucination is a technique that reconstruct high-resolution (HR) faces from low-resolution (LR) faces, by using the prior knowledge learned from HR/LR face pairs. Most state-of-the-arts leverage position-patch prior knowledge of human face to estimate the optimal representation coefficients for each image patch. However, they focus only the position information and usually ignore the context information of image patch. In addition, when they are confronted with misalignment or the Small Sample Size (SSS) problem, the hallucination performance is very poor. To this end, this study incorporates the contextual information of image patch and proposes a powerful and efficient context-patch based face hallucination approach, namely Thresholding Locality-constrained Representation and Reproducing learning (TLcR-RL). Under the context-patch based framework, we advance a thresholding based representation method to enhance the reconstruction accuracy and reduce the computational complexity. To further improve the performance of the proposed algorithm, we propose a promotion strategy called reproducing learning. By adding the estimated HR face to the training set, which can simulates the case that the HR version of the input LR face is present in the training set, thus iteratively enhancing the final hallucination result. Experiments demonstrate that the proposed TLcR-RL method achieves a substantial increase in the hallucinated results, both subjectively and objectively. Additionally, the proposed framework is more robust to face misalignment and the SSS problem, and its hallucinated HR face is still very good when the LR test face is from the real-world. The MATLAB source code is available at https://github.com/junjun-jiang/TLcR-RL
High-Quality Facial Photo-Sketch Synthesis Using Multi-Adversarial Networks
Synthesizing face sketches from real photos and its inverse have many
applications. However, photo/sketch synthesis remains a challenging problem due
to the fact that photo and sketch have different characteristics. In this work,
we consider this task as an image-to-image translation problem and explore the
recently popular generative models (GANs) to generate high-quality realistic
photos from sketches and sketches from photos. Recent GAN-based methods have
shown promising results on image-to-image translation problems and
photo-to-sketch synthesis in particular, however, they are known to have
limited abilities in generating high-resolution realistic images. To this end,
we propose a novel synthesis framework called Photo-Sketch Synthesis using
Multi-Adversarial Networks, (PS2-MAN) that iteratively generates low resolution
to high resolution images in an adversarial way. The hidden layers of the
generator are supervised to first generate lower resolution images followed by
implicit refinement in the network to generate higher resolution images.
Furthermore, since photo-sketch synthesis is a coupled/paired translation
problem, we leverage the pair information using CycleGAN framework. Both Image
Quality Assessment (IQA) and Photo-Sketch Matching experiments are conducted to
demonstrate the superior performance of our framework in comparison to existing
state-of-the-art solutions. Code available at:
https://github.com/lidan1/PhotoSketchMAN.Comment: Accepted by 2018 13th IEEE International Conference on Automatic Face
& Gesture Recognition (FG 2018)(Oral
A Comprehensive Review of Deep Learning-based Single Image Super-resolution
Image super-resolution (SR) is one of the vital image processing methods that
improve the resolution of an image in the field of computer vision. In the last
two decades, significant progress has been made in the field of
super-resolution, especially by utilizing deep learning methods. This survey is
an effort to provide a detailed survey of recent progress in single-image
super-resolution in the perspective of deep learning while also informing about
the initial classical methods used for image super-resolution. The survey
classifies the image SR methods into four categories, i.e., classical methods,
supervised learning-based methods, unsupervised learning-based methods, and
domain-specific SR methods. We also introduce the problem of SR to provide
intuition about image quality metrics, available reference datasets, and SR
challenges. Deep learning-based approaches of SR are evaluated using a
reference dataset. Some of the reviewed state-of-the-art image SR methods
include the enhanced deep SR network (EDSR), cycle-in-cycle GAN (CinCGAN),
multiscale residual network (MSRN), meta residual dense network (Meta-RDN),
recurrent back-projection network (RBPN), second-order attention network (SAN),
SR feedback network (SRFBN) and the wavelet-based residual attention network
(WRAN). Finally, this survey is concluded with future directions and trends in
SR and open problems in SR to be addressed by the researchers.Comment: 56 Pages, 11 Figures, 5 Table
Towards Mitigating Hallucination in Large Language Models via Self-Reflection
Large language models (LLMs) have shown promise for generative and
knowledge-intensive tasks including question-answering (QA) tasks. However, the
practical deployment still faces challenges, notably the issue of
"hallucination", where models generate plausible-sounding but unfaithful or
nonsensical information. This issue becomes particularly critical in the
medical domain due to the uncommon professional concepts and potential social
risks involved. This paper analyses the phenomenon of hallucination in medical
generative QA systems using widely adopted LLMs and datasets. Our investigation
centers on the identification and comprehension of common problematic answers,
with a specific emphasis on hallucination. To tackle this challenge, we present
an interactive self-reflection methodology that incorporates knowledge
acquisition and answer generation. Through this feedback process, our approach
steadily enhances the factuality, consistency, and entailment of the generated
answers. Consequently, we harness the interactivity and multitasking ability of
LLMs and produce progressively more precise and accurate answers. Experimental
results on both automatic and human evaluation demonstrate the superiority of
our approach in hallucination reduction compared to baselines.Comment: Accepted by the findings of EMNLP 202
- …