3 research outputs found
Efficient semantic image segmentation with superpixel pooling
In this work, we evaluate the use of superpixel pooling layers in deep
network architectures for semantic segmentation. Superpixel pooling is a
flexible and efficient replacement for other pooling strategies that
incorporates spatial prior information. We propose a simple and efficient
GPU-implementation of the layer and explore several designs for the integration
of the layer into existing network architectures. We provide experimental
results on the IBSR and Cityscapes dataset, demonstrating that superpixel
pooling can be leveraged to consistently increase network accuracy with minimal
computational overhead. Source code is available at
https://github.com/bermanmaxim/superpixPoo
Generating superpixels using deep image representations
Superpixel algorithms are a common pre-processing step for computer vision
algorithms such as segmentation, object tracking and localization. Many
superpixel methods only rely on colors features for segmentation, limiting
performance in low-contrast regions and applicability to infrared or medical
images where object boundaries have wide appearance variability. We study the
inclusion of deep image features in the SLIC superpixel algorithm to exploit
higher-level image representations. In addition, we devise a trainable
superpixel algorithm, yielding an intermediate domain-specific image
representation that can be applied to different tasks. A clustering-based
superpixel algorithm is transformed into a pixel-wise classification task and
superpixel training data is derived from semantic segmentation datasets. Our
results demonstrate that this approach is able to improve superpixel quality
consistently
Refining Semantic Segmentation with Superpixel by Transparent Initialization and Sparse Encoder
Although deep learning greatly improves the performance of semantic
segmentation, its success mainly lies in object central areas without accurate
edges. As superpixels are a popular and effective auxiliary to preserve object
edges, in this paper, we jointly learn semantic segmentation with trainable
superpixels. We achieve it with fully-connected layers with Transparent
Initialization (TI) and efficient logit consistency using a sparse encoder. The
proposed TI preserves the effects of learned parameters of pretrained networks.
This avoids a significant increase of the loss of pretrained networks, which
otherwise may be caused by inappropriate parameter initialization of the
additional layers. Meanwhile, consistent pixel labels in each superpixel are
guaranteed by logit consistency. The sparse encoder with sparse matrix
operations substantially reduces both the memory requirement and the
computational complexity. We demonstrated the superiority of TI over other
parameter initialization methods and tested its numerical stability. The
effectiveness of our proposal was validated on PASCAL VOC 2012, ADE20K, and
PASCAL Context showing enhanced semantic segmentation edges. With quantitative
evaluations on segmentation edges using performance ratio and F-measure, our
method outperforms the state-of-the-art