Semantic Photo Manipulation with a Generative Image Prior
Despite the recent success of GANs in synthesizing images conditioned on
inputs such as a user sketch, text, or semantic labels, manipulating the
high-level attributes of an existing natural photograph with GANs is
challenging for two reasons. First, it is hard for GANs to precisely reproduce
an input image. Second, after manipulation, the newly synthesized pixels often
do not fit the original image. In this paper, we address these issues by
adapting the image prior learned by GANs to image statistics of an individual
image. Our method can accurately reconstruct the input image and synthesize new
content, consistent with the appearance of the input image. We demonstrate our
interactive system on several semantic image editing tasks, including
synthesizing new objects consistent with background, removing unwanted objects,
and changing the appearance of an object. Quantitative and qualitative
comparisons against several existing methods demonstrate the effectiveness of
our method.
Comment: SIGGRAPH 201
High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
We present a new method for synthesizing high-resolution photo-realistic
images from semantic label maps using conditional generative adversarial
networks (conditional GANs). Conditional GANs have enabled a variety of
applications, but the results are often limited to low resolution and are still
far from realistic. In this work, we generate visually appealing 2048x1024 results
with a novel adversarial loss, as well as new multi-scale generator and
discriminator architectures. Furthermore, we extend our framework to
interactive visual manipulation with two additional features. First, we
incorporate object instance segmentation information, which enables object
manipulations such as removing/adding objects and changing the object category.
Second, we propose a method to generate diverse results given the same input,
allowing users to edit the object appearance interactively. Human opinion
studies demonstrate that our method significantly outperforms existing methods,
advancing both the quality and the resolution of deep image synthesis and
editing.
Comment: v2: CVPR camera ready, adding more results for edge-to-photo example
Manipulating Attributes of Natural Scenes via Hallucination
In this study, we explore building a two-stage framework for enabling users
to directly manipulate high-level attributes of a natural scene. The key to our
approach is a deep generative network which can hallucinate images of a scene
as if they were taken in a different season (e.g. during winter), weather
condition (e.g. on a cloudy day), or time of day (e.g. at sunset). Once the
scene is hallucinated with the given attributes, the corresponding look is then
transferred to the input image while keeping the semantic details intact,
giving a photo-realistic manipulation result. As the proposed framework
hallucinates what the scene will look like, it does not require a reference
style image, unlike most appearance or style transfer
approaches. Moreover, it allows a given scene to be manipulated simultaneously
according to a diverse set of transient attributes within a single model,
eliminating the need to train a separate network for each translation task.
Our comprehensive set of qualitative and quantitative results demonstrates the
effectiveness of our approach against competing methods.
Comment: Accepted for publication in ACM Transactions on Graphic
SEAN: Image Synthesis with Semantic Region-Adaptive Normalization
We propose semantic region-adaptive normalization (SEAN), a simple but
effective building block for Generative Adversarial Networks conditioned on
segmentation masks that describe the semantic regions in the desired output
image. Using SEAN normalization, we can build a network architecture that can
control the style of each semantic region individually, e.g., we can specify
one style reference image per region. SEAN is better suited to encode,
transfer, and synthesize style than the best previous method in terms of
reconstruction quality, variability, and visual quality. We evaluate SEAN on
multiple datasets and report better quantitative metrics (e.g. FID, PSNR) than
the current state of the art. SEAN also pushes the frontier of interactive
image editing. We can interactively edit images by changing segmentation masks
or the style for any given region. We can also interpolate styles from two
reference images per region.
Comment: Accepted as a CVPR 2020 oral paper. The interactive demo is available
at https://youtu.be/0Vbj9xFgoU
Channel-Recurrent Autoencoding for Image Modeling
Despite recent successes in synthesizing faces and bedrooms, existing
generative models struggle to capture more complex image types, potentially due
to the oversimplification of their latent space constructions. To tackle this
issue, building on Variational Autoencoders (VAEs), we integrate recurrent
connections across channels into both the inference and generation steps, allowing
high-level features to be captured in a global-to-local, coarse-to-fine
manner. Combined with an adversarial loss, our channel-recurrent VAE-GAN
(crVAE-GAN) outperforms VAE-GAN in generating a diverse spectrum of
high-resolution images while maintaining the same level of computational efficiency.
Our model produces interpretable and expressive latent representations to
benefit downstream tasks such as image completion. Moreover, we propose two
novel regularizations, namely the KL objective weighting scheme over time steps
and mutual information maximization between transformed latent variables and
the outputs, to enhance the training.
Comment: Code: https://github.com/WendyShang/crVAE. Supplementary Materials:
http://www-personal.umich.edu/~shangw/wacv18_supplementary_material.pd