928 research outputs found
GP-GAN: Gender Preserving GAN for Synthesizing Faces from Landmarks
Facial landmarks constitute the most compressed representation of faces and
are known to preserve information such as pose, gender and facial structure
present in the faces. Several works exist that attempt to perform high-level
face-related analysis tasks based on landmarks. In contrast, in this work, an
attempt is made to tackle the inverse problem of synthesizing faces from their
respective landmarks. The primary aim of this work is to demonstrate that
information preserved by landmarks (gender in particular) can be further
accentuated by leveraging generative models to synthesize corresponding faces.
Though the problem is particularly challenging due to its ill-posed nature, we
believe that successful synthesis will enable several applications such as
boosting performance of high-level face related tasks using landmark points and
performing dataset augmentation. To this end, a novel face-synthesis method
known as Gender Preserving Generative Adversarial Network (GP-GAN) that is
guided by adversarial loss, perceptual loss and a gender preserving loss is
presented. Further, we propose a novel generator sub-network UDeNet for GP-GAN
that leverages advantages of U-Net and DenseNet architectures. Extensive
experiments and comparison with recent methods are performed to verify the
effectiveness of the proposed method.Comment: 6 pages, 5 figures, this paper is accepted as 2018 24th International
Conference on Pattern Recognition (ICPR2018
Learning Segmentation Masks with the Independence Prior
An instance with a bad mask might make a composite image that uses it look
fake. This encourages us to learn segmentation by generating realistic
composite images. To achieve this, we propose a novel framework that exploits a
new proposed prior called the independence prior based on Generative
Adversarial Networks (GANs). The generator produces an image with multiple
category-specific instance providers, a layout module and a composition module.
Firstly, each provider independently outputs a category-specific instance image
with a soft mask. Then the provided instances' poses are corrected by the
layout module. Lastly, the composition module combines these instances into a
final image. Training with adversarial loss and penalty for mask area, each
provider learns a mask that is as small as possible but enough to cover a
complete category-specific instance. Weakly supervised semantic segmentation
methods widely use grouping cues modeling the association between image parts,
which are either artificially designed or learned with costly segmentation
labels or only modeled on local pairs. Unlike them, our method automatically
models the dependence between any parts and learns instance segmentation. We
apply our framework in two cases: (1) Foreground segmentation on
category-specific images with box-level annotation. (2) Unsupervised learning
of instance appearances and masks with only one image of homogeneous object
cluster (HOC). We get appealing results in both tasks, which shows the
independence prior is useful for instance segmentation and it is possible to
unsupervisedly learn instance masks with only one image.Comment: 7+5 pages, 13 figures, Accepted to AAAI 201
Dual Attention GANs for Semantic Image Synthesis
In this paper, we focus on the semantic image synthesis task that aims at
transferring semantic label maps to photo-realistic images. Existing methods
lack effective semantic constraints to preserve the semantic information and
ignore the structural correlations in both spatial and channel dimensions,
leading to unsatisfactory blurry and artifact-prone results. To address these
limitations, we propose a novel Dual Attention GAN (DAGAN) to synthesize
photo-realistic and semantically-consistent images with fine details from the
input layouts without imposing extra training overhead or modifying the network
architectures of existing methods. We also propose two novel modules, i.e.,
position-wise Spatial Attention Module (SAM) and scale-wise Channel Attention
Module (CAM), to capture semantic structure attention in spatial and channel
dimensions, respectively. Specifically, SAM selectively correlates the pixels
at each position by a spatial attention map, leading to pixels with the same
semantic label being related to each other regardless of their spatial
distances. Meanwhile, CAM selectively emphasizes the scale-wise features at
each channel by a channel attention map, which integrates associated features
among all channel maps regardless of their scales. We finally sum the outputs
of SAM and CAM to further improve feature representation. Extensive experiments
on four challenging datasets show that DAGAN achieves remarkably better results
than state-of-the-art methods, while using fewer model parameters. The source
code and trained models are available at https://github.com/Ha0Tang/DAGAN.Comment: Accepted to ACM MM 2020, camera ready (9 pages) + supplementary (10
pages
Semi-supervised Regression with Generative Adversarial Networks Using Minimal Labeled Data
This work studies the generalization of semi-supervised generative adversarial networks (GANs) to regression tasks. A novel feature layer contrasting optimization function, in conjunction with a feature matching optimization, allows the adversarial network to learn from unannotated data and thereby reduce the number of labels required to train a predictive network. An analysis of simulated training conditions is performed to explore the capabilities and limitations of the method. In concert with the semi-supervised regression GANs, an improved label topology and upsampling technique for multi-target regression tasks are shown to reduce data requirements. Improvements are demonstrated on a wide variety of vision tasks, including dense crowd counting, age estimation, and automotive steering angle prediction. With training data limitations arguably being the most restrictive component of deep learning, methods which reduce data requirements hold immense value. The methods proposed here are general-purpose and can be incorporated into existing network architectures with little or no modifications to the existing structure
- …