Hierarchy Composition GAN for High-fidelity Image Synthesis
Despite the rapid progress of generative adversarial networks (GANs) in image
synthesis in recent years, existing image synthesis approaches work in
either the geometry domain or the appearance domain alone, which often introduces
various synthesis artifacts. This paper presents an innovative Hierarchical
Composition GAN (HIC-GAN) that incorporates image synthesis in the geometry and
appearance domains into an end-to-end trainable network and achieves superior
synthesis realism in both domains simultaneously. We design an innovative
hierarchical composition mechanism that is capable of learning realistic
composition geometry and handling occlusions when multiple foreground objects
are involved in image composition. In addition, we introduce a novel attention
mask mechanism that guides the appearance adaptation of foreground objects and
also provides a better training reference for learning in the geometry
domain. Extensive experiments on scene text image synthesis, portrait editing
and indoor rendering tasks show that the proposed HIC-GAN achieves superior
synthesis performance qualitatively and quantitatively.
Comment: 11 pages, 8 figures
3DPortraitGAN: Learning One-Quarter Headshot 3D GANs from a Single-View Portrait Dataset with Diverse Body Poses
3D-aware face generators are typically trained on 2D real-life face image
datasets that primarily consist of near-frontal face data, and as such, they
are unable to construct one-quarter headshot 3D portraits with complete head,
neck, and shoulder geometry. Two reasons account for this issue: First,
existing facial recognition methods struggle with extracting facial data
captured from large camera angles or back views. Second, it is challenging to
learn a distribution of 3D portraits covering the one-quarter headshot region
from single-view data due to significant geometric deformation caused by
diverse body poses. To this end, we first create the dataset
360°-Portrait-HQ (360°PHQ for short) which consists of high-quality
single-view real portraits annotated with a variety of camera parameters (the
yaw angles span the entire 360° range) and body poses. We then propose
3DPortraitGAN, the first 3D-aware one-quarter headshot portrait generator that
learns a canonical 3D avatar distribution from the 360°PHQ dataset with
body pose self-learning. Our model can generate view-consistent portrait images
from all camera angles with a canonical one-quarter headshot 3D representation.
Our experiments show that the proposed framework can accurately predict
portrait body poses and generate view-consistent, realistic portrait images
with complete geometry from all camera angles.
SEAN: Image Synthesis with Semantic Region-Adaptive Normalization
We propose semantic region-adaptive normalization (SEAN), a simple but
effective building block for Generative Adversarial Networks conditioned on
segmentation masks that describe the semantic regions in the desired output
image. Using SEAN normalization, we can build a network architecture that can
control the style of each semantic region individually, e.g., we can specify
one style reference image per region. SEAN is better suited to encode,
transfer, and synthesize style than the best previous method in terms of
reconstruction quality, variability, and visual quality. We evaluate SEAN on
multiple datasets and report better quantitative metrics (e.g. FID, PSNR) than
the current state of the art. SEAN also pushes the frontier of interactive
image editing. We can interactively edit images by changing segmentation masks
or the style for any given region. We can also interpolate styles from two
reference images per region.
Comment: Accepted as a CVPR 2020 oral paper. The interactive demo is available
at https://youtu.be/0Vbj9xFgoU
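As a rough illustration of the per-region style control described in the SEAN abstract above, the following Python sketch normalizes a feature map and then applies a separate scale and shift to each semantic region of a segmentation mask. It is a simplified illustration under stated assumptions, not the authors' implementation: the function names `region_adaptive_norm` and `interpolate_styles`, the fixed per-region `gammas` and `betas` tables, and the use of NumPy are all hypothetical choices made here, whereas the actual method learns its modulation parameters inside a network.

```python
import numpy as np


def region_adaptive_norm(features, mask, gammas, betas, eps=1e-5):
    """Normalize features, then modulate each semantic region separately.

    features: float array of shape (C, H, W)
    mask:     int array of shape (H, W) holding region labels in [0, R)
    gammas:   float array of shape (R, C), hypothetical per-region scales
    betas:    float array of shape (R, C), hypothetical per-region shifts
    """
    # Instance-norm style normalization: per channel, over spatial positions.
    mean = features.mean(axis=(1, 2), keepdims=True)
    std = features.std(axis=(1, 2), keepdims=True)
    normalized = (features - mean) / (std + eps)

    # Look up the scale and shift of each pixel from its region label.
    # gammas[mask] has shape (H, W, C); move channels first to get (C, H, W).
    gamma_map = np.transpose(gammas[mask], (2, 0, 1))
    beta_map = np.transpose(betas[mask], (2, 0, 1))
    return gamma_map * normalized + beta_map


def interpolate_styles(style_a, style_b, t):
    """Linearly blend two style parameter arrays; t=0 gives a, t=1 gives b."""
    return (1.0 - t) * style_a + t * style_b


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(3, 4, 4))
    mask = np.zeros((4, 4), dtype=int)
    mask[:, 2:] = 1  # right half of the image is region 1

    # Two hypothetical styles for region 1; region 0 keeps a neutral style.
    gammas_a = np.array([[1.0, 1.0, 1.0], [0.5, 0.5, 0.5]])
    gammas_b = np.array([[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]])
    betas = np.zeros((2, 3))

    gammas_mid = interpolate_styles(gammas_a, gammas_b, 0.5)
    out = region_adaptive_norm(features, mask, gammas_mid, betas)
    print(out.shape)
```

Changing only the row of `gammas` or `betas` that belongs to one region alters that region's output while leaving the others untouched, which is the property the abstract refers to as controlling the style of each semantic region individually.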
FaceShop: Deep Sketch-based Face Image Editing
We present a novel system for sketch-based face image editing, enabling users
to edit images intuitively by sketching a few strokes on a region of interest.
Our interface features tools to express a desired image manipulation by
providing both geometry and color constraints as user-drawn strokes. As an
alternative to the direct user input, our proposed system naturally supports a
copy-paste mode, which allows users to edit a given image region by using parts
of another exemplar image without any need for hand-drawn sketching. The
proposed interface runs in real time and facilitates an interactive and
iterative workflow to quickly express the intended edits. Our system is based
on a novel sketch domain and a convolutional neural network trained end-to-end
to automatically learn to render image regions corresponding to the input
strokes. To achieve high-quality and semantically consistent results, we train
our neural network on two simultaneous tasks, namely image completion and image
translation. To the best of our knowledge, we are the first to combine these
two tasks in a unified framework for interactive image editing. Our results
show that the proposed sketch domain, network architecture, and training
procedure generalize well to real user input and enable high-quality synthesis
results without additional post-processing.
Comment: 13 pages, 20 figures