Fast Preprocessing for Robust Face Sketch Synthesis
Exemplar-based face sketch synthesis methods often struggle when input photos
are captured under lighting conditions that differ from those of the training
photos. The critical step that causes the failure is the search for similar
patch candidates for an input photo patch. Conventional illumination invariant
patch distances are adopted rather than directly relying on pixel intensity
difference, but they will fail when local contrast within a patch changes. In
this paper, we propose a fast preprocessing method named Bidirectional
Luminance Remapping (BLR), which interactively adjusts the lighting of training
and input photos. Our method can be directly integrated into state-of-the-art
exemplar-based methods to improve their robustness with negligible computational
cost.
Comment: IJCAI 2017. Project page:
http://www.cs.cityu.edu.hk/~yibisong/ijcai17_sketch/index.htm
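The abstract above does not spell out how BLR remaps luminance. As a rough illustration of luminance remapping in general, and not the BLR algorithm itself, the sketch below shifts and scales the luminance values of one photo so that their mean and standard deviation match those of another. The function name `remap_luminance`, the flat-list representation, and the [0, 1] value range are assumptions made for this example; it also shows only one direction of adjustment, whereas BLR adjusts both training and input photos.

```python
from statistics import mean, pstdev


def remap_luminance(source, reference):
    """Match the mean and standard deviation of `source` to `reference`.

    Both arguments are flat lists of luminance values in [0, 1].
    This is plain mean/std matching, used here as a stand-in; it is
    NOT the BLR procedure from the paper.
    """
    mu_s, sigma_s = mean(source), pstdev(source)
    mu_r, sigma_r = mean(reference), pstdev(reference)

    def clip(v):
        # Keep remapped values inside the valid luminance range.
        return min(1.0, max(0.0, v))

    if sigma_s == 0:
        # A perfectly flat source has no contrast to scale: only set the mean.
        return [clip(mu_r) for _ in source]

    scale = sigma_r / sigma_s
    return [clip((v - mu_s) * scale + mu_r) for v in source]


# Hypothetical luminance samples from a dark input photo and a brighter training photo.
dark_input = [0.1, 0.2, 0.3]
training = [0.4, 0.6, 0.8]
remapped = remap_luminance(dark_input, training)
```

After remapping, patch distances based on pixel intensity compare photos with similar global brightness and contrast, which is the kind of mismatch the abstract identifies as the source of failed patch searches.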
Learning to Hallucinate Face Images via Component Generation and Enhancement
We propose a two-stage method for face hallucination. First, we generate
facial components of the input image using CNNs. These components represent the
basic facial structures. Second, we synthesize fine-grained facial structures
from high resolution training images. The details of these structures are
transferred into facial components for enhancement. Therefore, we generate
facial components to approximate ground truth global appearance in the first
stage and enhance them through recovering details in the second stage. The
experiments demonstrate that our method performs favorably against
state-of-the-art methods.
Comment: IJCAI 2017. Project page:
http://www.cs.cityu.edu.hk/~yibisong/ijcai17_sr/index.htm
Stylizing Face Images via Multiple Exemplars
We address the problem of transferring the style of a headshot photo to face
images. Existing methods using a single exemplar lead to inaccurate results
when the exemplar does not contain sufficient stylized facial components for a
given photo. In this work, we propose an algorithm to stylize face images using
multiple exemplars containing different subjects in the same style. Patch
correspondences between an input photo and multiple exemplars are established
using a Markov Random Field (MRF), which enables accurate local energy transfer
via Laplacian stacks. As image patches from multiple exemplars are used, the
boundaries of facial components on the target image are inevitably
inconsistent. The artifacts are removed by a post-processing step using an
edge-preserving filter. Experimental results show that the proposed algorithm
consistently produces visually pleasing results.
Comment: In CVIU 2017. Project page:
http://www.cs.cityu.edu.hk/~yibisong/cviu17/index.htm
Delving StyleGAN Inversion for Image Editing: A Foundation Latent Space Viewpoint
GAN inversion and editing via StyleGAN maps an input image into the embedding
spaces ($\mathcal{W}$, $\mathcal{W}^+$, and $\mathcal{F}$) to simultaneously
maintain image fidelity and meaningful manipulation. From latent space
$\mathcal{W}$ to extended latent space $\mathcal{W}^+$ to feature space
$\mathcal{F}$ in StyleGAN, the editability of GAN inversion decreases while its
reconstruction quality increases. Recent GAN inversion methods typically
explore $\mathcal{W}^+$ and $\mathcal{F}$ rather than $\mathcal{W}$ to improve
reconstruction fidelity while maintaining editability. As $\mathcal{W}^+$ and
$\mathcal{F}$ are derived from $\mathcal{W}$, which is essentially the foundation
latent space of StyleGAN, these GAN inversion methods focusing on
$\mathcal{W}^+$ and $\mathcal{F}$ spaces could be improved by stepping back to
$\mathcal{W}$. In this work, we propose to first obtain the precise latent code
in foundation latent space $\mathcal{W}$. We introduce contrastive learning to
align $\mathcal{W}$ and the image space for precise latent code discovery.
Then, we leverage a cross-attention encoder to transform the
obtained latent code in $\mathcal{W}$ into $\mathcal{W}^+$ and $\mathcal{F}$,
accordingly. Our experiments show that our exploration of the foundation latent
space $\mathcal{W}$ improves the representation ability of latent codes in
$\mathcal{W}^+$ and features in $\mathcal{F}$, which yields state-of-the-art
reconstruction fidelity and editability results on the standard benchmarks.
Project page: https://github.com/KumapowerLIU/CLCAE
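The abstract above says contrastive learning is used to align latent codes with the image space, but it does not name the loss. A common choice for this kind of alignment is the InfoNCE loss, which is assumed here purely for illustration. In the sketch, `latents[i]` and `images[i]` are hypothetical embedding vectors that form a positive pair, every other combination is treated as a negative, and the `temperature` value is an arbitrary assumption.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    # Assumes a nonzero vector.
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def info_nce(latents, images, temperature=0.1):
    """Mean InfoNCE loss; latents[i] and images[i] are the positive pairs.

    InfoNCE is an assumed stand-in here, not necessarily the loss
    used by the paper.
    """
    latents = [normalize(v) for v in latents]
    images = [normalize(v) for v in images]
    total = 0.0
    for i, z in enumerate(latents):
        # Cosine similarity of this latent code to every image embedding.
        logits = [dot(z, x) / temperature for x in images]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of picking the matching image.
        total += log_denominator - logits[i]
    return total / len(latents)


# Toy 2-D embeddings: matched pairs give a lower loss than swapped pairs.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

Minimizing a loss of this shape pulls each latent code toward the embedding of its own image and pushes it away from the others, which is one way to read "aligning" the two spaces.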
DiffusionDet: Diffusion Model for Object Detection
We propose DiffusionDet, a new framework that formulates object detection as
a denoising diffusion process from noisy boxes to object boxes. During the
training stage, object boxes diffuse from ground-truth boxes to a random
distribution, and the model learns to reverse this noising process. In
inference, the model refines a set of randomly generated boxes to the output
results in a progressive way. DiffusionDet is flexible: it supports a
dynamic number of boxes and iterative
evaluation. Extensive experiments on the standard benchmarks show that
DiffusionDet achieves favorable performance compared to previous
well-established detectors. For example, DiffusionDet achieves 5.3 AP and 4.8
AP gains when evaluated with more boxes and iteration steps, under a zero-shot
transfer setting from COCO to CrowdHuman. Our code is available at
https://github.com/ShoufaChen/DiffusionDet.
Comment: ICCV2023 (Oral), Camera-read
- …
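The DiffusionDet abstract describes training as diffusing ground-truth boxes toward a random distribution. A minimal sketch of that forward (noising) direction is shown below, using the standard closed-form diffusion step x_t = sqrt(a_t) * x_0 + sqrt(1 - a_t) * eps. The linear beta schedule, the 1000-step horizon, the 4-number box format, and the function names are all assumptions for illustration; the abstract does not specify them, and the denoising model itself is not shown.

```python
import math
import random


def alpha_bar(t, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_s) for s = 0..t under a linear beta schedule."""
    product = 1.0
    for s in range(t + 1):
        beta = beta_start + (beta_end - beta_start) * s / (num_steps - 1)
        product *= 1.0 - beta
    return product


def noise_boxes(boxes, t, rng):
    """Apply the closed-form forward step to every coordinate of every box.

    `boxes` is a list of [cx, cy, w, h] lists (an assumed format);
    `rng` is a random.Random instance so results are reproducible.
    """
    a = alpha_bar(t)
    signal, noise = math.sqrt(a), math.sqrt(1.0 - a)
    return [[signal * c + noise * rng.gauss(0.0, 1.0) for c in box] for box in boxes]


rng = random.Random(0)
ground_truth = [[0.5, 0.5, 0.2, 0.3]]        # hypothetical normalized box
slightly_noisy = noise_boxes(ground_truth, 0, rng)    # still close to the ground truth
almost_random = noise_boxes(ground_truth, 999, rng)   # dominated by Gaussian noise
```

At small `t` the boxes stay near the ground truth, while at the final step almost no signal remains, which matches the abstract's description of inference starting from randomly generated boxes.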