Recurrent Pixel Embedding for Instance Grouping
We introduce a differentiable, end-to-end trainable framework for solving
pixel-level grouping problems such as instance segmentation consisting of two
novel components. First, we regress pixels into a hyper-spherical embedding
space so that pixels from the same group have high cosine similarity while
those from different groups have similarity below a specified margin. We
analyze the choice of embedding dimension and margin, relating them to
theoretical results on the problem of distributing points uniformly on the
sphere. Second, to group instances, we utilize a variant of mean-shift
clustering, implemented as a recurrent neural network parameterized by kernel
bandwidth. This recurrent grouping module is differentiable, enjoys convergent
dynamics and probabilistic interpretability. Backpropagating the group-weighted
loss through this module allows learning to focus on only correcting embedding
errors that won't be resolved during subsequent clustering. Our framework,
while conceptually simple and theoretically abundant, is also practically
effective and computationally efficient. We demonstrate substantial
improvements over state-of-the-art instance segmentation for object proposal
generation, and we demonstrate the benefits of the grouping loss for
classification tasks such as boundary detection and semantic segmentation.
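The two components described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the exact form of `pair_loss`, the exponential kernel on cosine similarity, and the default `margin`, `bandwidth` and `steps` values are all assumptions chosen for illustration. The sketch only shows the general idea of embedding points on a unit sphere, scoring pairs by cosine similarity against a margin, and unrolling a fixed number of mean-shift updates the way a recurrent module would.

```python
import math

def normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(u, v):
    """Cosine similarity of two unit vectors (a plain dot product)."""
    return sum(a * b for a, b in zip(u, v))

def pair_loss(u, v, same_group, margin=0.5):
    """Hypothetical pairwise loss (an assumption, not the paper's formula):
    same-group pairs are pulled toward similarity 1, different-group pairs
    are penalized only when their similarity exceeds the margin."""
    s = cosine(u, v)
    return 1.0 - s if same_group else max(0.0, s - margin)

def mean_shift_step(points, bandwidth=10.0):
    """One mean-shift update on the sphere.
    Assumed kernel: exp(bandwidth * cosine similarity)."""
    shifted = []
    for p in points:
        weights = [math.exp(bandwidth * cosine(p, q)) for q in points]
        mean = [sum(w * q[d] for w, q in zip(weights, points))
                for d in range(len(p))]
        # Re-project the weighted mean back onto the sphere.
        shifted.append(normalize(mean))
    return shifted

def group(points, steps=10, bandwidth=10.0):
    """Unroll a fixed number of mean-shift updates, like a recurrent module."""
    points = [normalize(p) for p in points]
    for _ in range(steps):
        points = mean_shift_step(points, bandwidth)
    return points
```

Calling `group` on a few 2-D points that form two loose clusters moves the points of each cluster toward a common direction while keeping the two clusters apart. In a trainable setting each update would be written with differentiable tensor operations so that a loss could be backpropagated through the unrolled steps.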
Distance Guided Channel Weighting for Semantic Segmentation
Recent works have achieved great success in improving the performance of
multiple computer vision tasks by capturing features with a high channel number
utilizing deep neural networks. However, many channels of extracted features
are not discriminative and contain a lot of redundant information. In this
paper, we address this issue by introducing the Distance Guided Channel
Weighting (DGCW) Module. The DGCW module is constructed in a pixel-wise context
extraction manner, which enhances the discriminativeness of features by
weighting different channels of each pixel's feature vector when modeling its
relationship with other pixels. It can make full use of the highly discriminative
information while ignoring the less discriminative information contained in
feature maps, and it also captures long-range dependencies. Furthermore, by
incorporating the DGCW module with a baseline segmentation network, we propose
the Distance Guided Channel Weighting Network (DGCWNet). We conduct extensive
experiments to demonstrate the effectiveness of DGCWNet. In particular, it
achieves 81.6% mIoU on Cityscapes using only finely annotated data for training,
and also achieves satisfactory performance on two other semantic segmentation
datasets, i.e. Pascal Context and ADE20K. Code will be available soon at
https://github.com/LanyunZhu/DGCWNet
Hybrid multi-layer Deep CNN/Aggregator feature for image classification
Deep Convolutional Neural Networks (DCNN) have established a remarkable
performance benchmark in the field of image classification, displacing
classical approaches based on hand-tailored aggregations of local descriptors.
Yet DCNNs impose high computational burdens both at training and at testing
time, and training them requires collecting and annotating large amounts of
training data. Supervised adaptation methods have been proposed in the
literature that partially re-learn a transferred DCNN structure from a new
target dataset. Yet these require expensive bounding-box annotations and are
still computationally expensive to learn. In this paper, we address these
shortcomings of DCNN adaptation schemes by proposing a hybrid approach that
combines conventional, unsupervised aggregators such as Bag-of-Words (BoW),
with the DCNN pipeline by treating the output of intermediate layers as densely
extracted local descriptors.
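As a rough illustration of treating an intermediate layer's output as densely extracted local descriptors, the sketch below flattens a feature map into per-location vectors and aggregates them into a Bag-of-Words histogram. It is a minimal sketch under stated assumptions, not the authors' pipeline: the helper names, the nested-list layout (height x width x channels), the hard nearest-codeword assignment, and the L1 normalization are illustrative choices, and the codebook is assumed to have been learned beforehand (for example with k-means).

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def feature_map_to_descriptors(feature_map):
    """Flatten an H x W x C nested list into H*W descriptors of length C.
    Each spatial location of the layer output is one local descriptor."""
    return [pixel for row in feature_map for pixel in row]

def bow_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest codeword and return an
    L1-normalized histogram of codeword counts."""
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)),
                      key=lambda k: squared_distance(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]

# Toy 2 x 2 feature map with 2 channels and a hypothetical 2-word codebook.
toy_map = [[[0.0, 0.0], [0.1, 0.0]],
           [[1.0, 1.0], [0.9, 1.0]]]
toy_codebook = [[0.0, 0.0], [1.0, 1.0]]
toy_hist = bow_histogram(feature_map_to_descriptors(toy_map), toy_codebook)
```

The resulting histogram has one entry per codeword regardless of the feature map's spatial size, which is what lets a fixed-length image feature be fed to a standard classifier.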
We test a variant of our approach that uses only intermediate DCNN layers on
the standard PASCAL VOC 2007 dataset and show performance significantly higher
than the standard BoW model and comparable to Fisher vector aggregation but
with a feature that is 150 times smaller. A second variant of our approach that
includes the fully connected DCNN layers significantly outperforms Fisher
vector schemes and performs comparably to DCNN approaches adapted to Pascal VOC
2007, yet at only a small fraction of the training and testing cost.
Comment: Accepted at the ICASSP 2015 conference, 5 pages including references, 4
figures and 2 tables