Learning Segmentation Masks with the Independence Prior
An instance with a poorly estimated mask tends to make any composite image that uses it look fake. This observation motivates learning segmentation by generating realistic composite images. To this end, we propose a novel framework, built on Generative Adversarial Networks (GANs), that exploits a newly proposed prior we call the independence prior. The generator produces an image using multiple category-specific instance providers, a layout module, and a composition module. First, each provider independently outputs a category-specific instance image together with a soft mask. The layout module then corrects the poses of the provided instances. Finally, the composition module combines the instances into a single image. Trained with an adversarial loss and a penalty on mask area, each provider learns a mask that is as small as possible while still covering a complete category-specific instance. Weakly supervised semantic segmentation methods rely heavily on grouping cues that model the association between image parts, but these cues are either hand-designed, learned from costly segmentation labels, or modeled only on local pairs. In contrast, our method automatically models the dependence between arbitrary parts and learns instance segmentation. We apply the framework in two settings: (1) foreground segmentation on category-specific images with box-level annotation, and (2) unsupervised learning of instance appearances and masks from only one image of a homogeneous object cluster (HOC). We obtain appealing results on both tasks, showing that the independence prior is useful for instance segmentation and that instance masks can be learned without supervision from a single image.
Comment: 7+5 pages, 13 figures, Accepted to AAAI 201
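To make the mask-area penalty concrete, here is a minimal PyTorch sketch, not the authors' implementation: a hypothetical `Provider` maps latent noise to an instance image plus a soft mask, the instance is alpha-composited onto a background, and the generator objective adds a penalty on mean mask area to a non-saturating adversarial term. The network shapes and the `area_weight` coefficient are assumptions.

```python
import torch
import torch.nn as nn

class Provider(nn.Module):
    """Hypothetical provider: latent noise -> (instance RGB, soft mask)."""
    def __init__(self, z_dim=64, size=32):
        super().__init__()
        self.size = size
        self.net = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(),
            nn.Linear(256, 4 * size * size),  # 3 RGB channels + 1 mask channel
        )

    def forward(self, z):
        out = self.net(z).view(-1, 4, self.size, self.size)
        rgb = torch.tanh(out[:, :3])       # instance appearance in [-1, 1]
        mask = torch.sigmoid(out[:, 3:4])  # soft mask in [0, 1]
        return rgb, mask

def composite(background, rgb, mask):
    # Alpha-blend the provided instance onto the background image.
    return mask * rgb + (1 - mask) * background

def generator_loss(d_score_fake, mask, area_weight=0.1):
    # Non-saturating adversarial term plus a penalty on mask area:
    # the mask shrinks until it is just large enough to keep the
    # composite realistic, i.e. it covers one complete instance.
    adv = nn.functional.softplus(-d_score_fake).mean()
    return adv + area_weight * mask.mean()
```

The paper's layout and composition modules would sit between the providers and the discriminator; they are omitted here for brevity.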
Understanding Segment Anything Model: SAM is Biased Towards Texture Rather than Shape
In contrast to human vision, which relies mainly on shape to recognize objects, deep image recognition models are widely known to be biased toward texture. Recently, the Meta research team released the first foundation model for image segmentation, termed the Segment Anything Model (SAM), which has attracted significant attention. In this work, we study SAM from the perspective of texture vs. shape. Unlike label-oriented recognition tasks, SAM is trained to predict a mask covering an object's shape based on a prompt, so it seems self-evident that SAM should be biased towards shape. Here, however, we reveal an interesting finding: SAM is strongly biased towards texture-like dense features rather than shape. This finding is supported by a novel setup in which we disentangle texture and shape cues and design texture-shape cue conflicts for mask prediction.
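One way such a cue conflict could be constructed, shown here as an assumed sketch rather than the paper's exact protocol, is to fill the silhouette of one object (the shape cue) with texture tiled from another image (the texture cue) and then prompt SAM on the composite via Meta's `segment_anything` package. Checkpoint paths and prompt coordinates below are placeholders.

```python
import numpy as np
# from segment_anything import sam_model_registry, SamPredictor  # Meta's SAM

def cue_conflict(shape_mask, texture_img):
    """Fill a binary silhouette (shape cue) with texture tiled from
    another image (texture cue), on a white background."""
    h, w = shape_mask.shape
    th, tw = texture_img.shape[:2]
    tiled = np.tile(texture_img, (h // th + 1, w // tw + 1, 1))[:h, :w]
    canvas = np.full((h, w, 3), 255, dtype=np.uint8)
    keep = shape_mask.astype(bool)
    canvas[keep] = tiled[keep]
    return canvas

# Probing SAM on the conflict image (paths and prompt are placeholders):
# sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
# predictor = SamPredictor(sam)
# predictor.set_image(cue_conflict(cat_silhouette, elephant_skin_patch))
# masks, scores, _ = predictor.predict(point_coords=np.array([[cx, cy]]),
#                                      point_labels=np.array([1]))
# If the predicted mask tracks texture boundaries instead of the cat
# silhouette, that is evidence of texture bias.
```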
3D GANs and Latent Space: A comprehensive survey
Generative Adversarial Networks (GANs) have emerged as a significant player
in generative modeling by mapping lower-dimensional random noise to
higher-dimensional spaces. These networks have been used to generate
high-resolution images and 3D objects. The efficient modeling of 3D objects and
human faces is crucial in the development process of 3D graphical environments
such as games or simulations. 3D GANs are a new type of generative model used
for 3D reconstruction, point cloud reconstruction, and 3D semantic scene
completion. The choice of noise distribution is critical, as it defines the latent space. Understanding a GAN's latent space is essential for fine-tuning the generated samples, as demonstrated by the morphing of semantically meaningful parts of images. In this work, we explore 3D GANs and their latent spaces, examine several GAN variants and training methods to gain insights into improving 3D GAN training, and suggest potential directions for further research.
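As a small illustration of latent-space exploration, the sketch below performs spherical interpolation (slerp) between two Gaussian latent codes, a common way to produce the semantic morphs mentioned above; the `generator` is a placeholder for any pretrained 3D GAN.

```python
import torch

def slerp(z0, z1, t):
    """Spherical interpolation between latent codes, so intermediate
    samples stay near the Gaussian shell of the latent space."""
    z0n, z1n = z0 / z0.norm(), z1 / z1.norm()
    omega = torch.acos((z0n * z1n).sum().clamp(-1 + 1e-7, 1 - 1e-7))
    return (torch.sin((1 - t) * omega) * z0 +
            torch.sin(t * omega) * z1) / torch.sin(omega)

# z0, z1 = torch.randn(128), torch.randn(128)
# frames = [generator(slerp(z0, z1, t).unsqueeze(0))  # generator: placeholder
#           for t in torch.linspace(0, 1, 8)]
```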