Convolutional Feature Masking for Joint Object and Stuff Segmentation
The topic of semantic segmentation has witnessed considerable progress due to
the powerful features learned by convolutional neural networks (CNNs). The
current leading approaches for semantic segmentation exploit shape information
by extracting CNN features from masked image regions. This strategy introduces
artificial boundaries on the images and may impact the quality of the extracted
features. Moreover, operating on the raw image domain requires running the
network thousands of times on a single image, which is time-consuming. In this
paper, we propose to exploit shape information via masking convolutional
features. The proposal segments (e.g., super-pixels) are treated as masks on
the convolutional feature maps. The CNN features of segments are directly
masked out from these maps and used to train classifiers for recognition. We
further propose a joint method to handle objects and "stuff" (e.g., grass, sky,
water) in the same framework. State-of-the-art results are demonstrated on
the PASCAL VOC and new PASCAL-CONTEXT benchmarks, with a compelling
computational speed.
Comment: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 201
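The masking idea in this abstract can be illustrated with a minimal sketch: the feature map is computed once per image, and each proposal segment is applied as a binary mask on it. The function name, array shapes, and the use of average pooling are assumptions made for illustration; the abstract does not specify the pooling or classifier used.

```python
import numpy as np

def masked_segment_feature(feature_map, segment_mask):
    """Pool CNN features over one segment by masking the feature map.

    feature_map:  (C, H, W) array of convolutional features (hypothetical shape).
    segment_mask: (H, W) binary array, assumed already resized to the
                  feature-map resolution.
    Returns a (C,) descriptor: the mean feature over the masked positions.
    """
    mask = segment_mask.astype(feature_map.dtype)
    masked = feature_map * mask[None, :, :]  # zero out features outside the segment
    area = mask.sum()
    if area == 0:
        # Empty segment: return an all-zero descriptor instead of dividing by zero.
        return np.zeros(feature_map.shape[0], dtype=feature_map.dtype)
    return masked.sum(axis=(1, 2)) / area

# One feature map per image, reused for every segment (random stand-in data).
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 6, 6))
mask = np.zeros((6, 6), dtype=bool)
mask[1:4, 2:5] = True
descriptor = masked_segment_feature(features, mask)
```

Because the mask is applied to feature maps rather than to the image, the cost of the convolutional layers is paid once, regardless of how many segments are proposed.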
BiSeg: Simultaneous Instance Segmentation and Semantic Segmentation with Fully Convolutional Networks
We present a simple and effective framework for simultaneous semantic
segmentation and instance segmentation with Fully Convolutional Networks
(FCNs). The method, called BiSeg, predicts instance segmentation as a posterior
in Bayesian inference, where semantic segmentation is used as a prior. We
extend the idea of position-sensitive score maps used in recent methods to a
fusion of multiple score maps at different scales and partition modes, and
adopt it as a robust likelihood for instance segmentation inference. As both
Bayesian inference and map fusion are performed per pixel, BiSeg is a fully
convolutional end-to-end solution that inherits all the advantages of FCNs. We
demonstrate state-of-the-art instance segmentation accuracy on PASCAL VOC.
Comment: BMVC201
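As a rough illustration of per-pixel Bayesian inference with a semantic prior, the sketch below applies a two-hypothesis Bayes rule (foreground vs. background) independently at each pixel. The function name, the binary formulation, and the toy probability maps are assumptions for illustration; the abstract does not give BiSeg's exact formulation or how its fused score maps are turned into likelihoods.

```python
import numpy as np

def pixelwise_posterior(prior_fg, likelihood_fg, likelihood_bg, eps=1e-12):
    """Per-pixel posterior P(fg | evidence) from a prior and two likelihoods.

    prior_fg:      (H, W) prior foreground probability (e.g. from semantic segmentation).
    likelihood_fg: (H, W) likelihood of the evidence given foreground.
    likelihood_bg: (H, W) likelihood of the evidence given background.
    """
    numerator = prior_fg * likelihood_fg
    denominator = numerator + (1.0 - prior_fg) * likelihood_bg
    # Guard against division by zero where both hypotheses have zero mass.
    return numerator / np.maximum(denominator, eps)

# Toy 2x2 maps (made-up values).
prior = np.array([[0.9, 0.5], [0.1, 0.5]])
lik_fg = np.array([[0.8, 0.5], [0.8, 0.2]])
lik_bg = 1.0 - lik_fg
posterior = pixelwise_posterior(prior, lik_fg, lik_bg)
```

Every operation is elementwise, so the computation can be expressed as layers of a fully convolutional network, which is the property the abstract highlights.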
S4Net: Single Stage Salient-Instance Segmentation
We consider an interesting problem in this paper: salient instance
segmentation. In addition to producing bounding boxes, our network also outputs
high-quality instance-level segments. Taking into account the
category-independent property of each target, we design a single stage salient
instance segmentation framework, with a novel segmentation branch. Our new
branch regards not only local context inside each detection window but also its
surrounding context, enabling us to distinguish the instances in the same scope
even with obstruction. Our network is end-to-end trainable and runs at a fast
speed (40 fps when processing an image with resolution 320x320). We evaluate
our approach on a publicly available benchmark and show that it outperforms
alternative solutions. We also provide a thorough analysis of the design
choices to help readers better understand the functions of each part of our
network. The source code can be found at
https://github.com/RuochenFan/S4Net
Segmentation-Aware Convolutional Networks Using Local Attention Masks
We introduce an approach to integrate segmentation information within a
convolutional neural network (CNN). This counteracts the tendency of CNNs to
smooth information across regions and increases their spatial precision. To
obtain segmentation information, we set up a CNN to provide an embedding space
where region co-membership can be estimated based on Euclidean distance. We use
these embeddings to compute a local attention mask relative to every neuron
position. We incorporate such masks in CNNs and replace the convolution
operation with a "segmentation-aware" variant that allows a neuron to
selectively attend to inputs coming from its own region. We call the resulting
network a segmentation-aware CNN because it adapts its filters at each image
point according to local segmentation cues. We demonstrate the merit of our
method on two widely different dense prediction tasks that involve
classification (semantic segmentation) and regression (optical flow). Our
results show that in semantic segmentation we can match the performance of
DenseCRFs while being faster and simpler, and in optical flow we obtain clearly
sharper responses than networks that do not use local attention masks. In both
cases, segmentation-aware convolution yields systematic improvements over
strong baselines. Source code for this work is available online at
http://cs.cmu.edu/~aharley/segaware
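The segmentation-aware convolution described in this abstract can be sketched as a filter whose contributions are reweighted by a local attention mask computed from embedding distances. The exponential mask `exp(-lam * distance)`, the normalization by the mask sum, and all names below are assumptions for illustration, not the authors' released implementation.

```python
import numpy as np

def segaware_conv2d(x, emb, weights, lam=1.0):
    """Single-channel convolution reweighted by a local attention mask.

    x:       (H, W) input signal.
    emb:     (H, W, D) per-pixel embeddings; a small Euclidean distance is
             taken to mean "same region".
    weights: (k, k) filter with odd k.
    lam:     assumed sharpness of the mask, m = exp(-lam * distance).
    """
    H, W = x.shape
    r = weights.shape[0] // 2
    out = np.zeros((H, W), dtype=float)
    for i in range(H):
        for j in range(W):
            num, den = 0.0, 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < H and 0 <= jj < W:  # skip out-of-bounds neighbors
                        dist = np.linalg.norm(emb[i, j] - emb[ii, jj])
                        m = np.exp(-lam * dist)  # mask relative to the center pixel
                        num += m * weights[di + r, dj + r] * x[ii, jj]
                        den += m
            out[i, j] = num / den  # den >= 1 because the center pixel has m = 1
    return out

# Two regions with distant embeddings: a box filter no longer blurs the edge.
x = np.array([[1.0, 1.0, 5.0, 5.0]] * 4)
emb = np.zeros((4, 4, 1))
emb[:, 2:, 0] = 100.0
out = segaware_conv2d(x, emb, np.ones((3, 3)))
```

With an all-ones filter, an ordinary normalized convolution would average across the boundary between the two regions; here the mask suppresses inputs from the other region, so the step edge in `x` is preserved in `out`.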