TernausNetV2: Fully Convolutional Network for Instance Segmentation
The most common approaches to instance segmentation are complex and use
two-stage networks with object proposals, conditional random-fields, template
matching or recurrent neural networks. In this work we present TernausNetV2, a
simple fully convolutional network that extracts objects from high-resolution
satellite imagery at the instance level. The network has the popular
encoder-decoder architecture with skip connections, plus a few essential
modifications that allow it to be used for semantic as well as instance
segmentation tasks. This approach is universal: any network that has been
successfully applied to semantic segmentation can be extended to perform
instance segmentation. In addition, we generalize a network encoder that was
pre-trained on RGB images to accept additional input channels. This makes it
possible to apply transfer learning from the visual to a wider spectral range. For
DeepGlobe-CVPR 2018 building detection sub-challenge, based on public
leaderboard score, our approach shows superior performance in comparison to
other methods. The source code and corresponding pre-trained weights are publicly
available at https://github.com/ternaus/TernausNetV
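One way to picture the channel generalization described above is to widen the first convolution of a pre-trained encoder. The sketch below is a minimal illustration in plain Python, not the authors' implementation: the function names, the nested-list weight layout [out][in][kh][kw] and the zero initialisation of the added channels are all assumptions made for the example.

```python
def extend_input_channels(weights, in_channels):
    """Widen first-layer conv weights shaped [out][in][kh][kw].

    Pre-trained input channels (e.g. RGB) are copied unchanged; the added
    channels start at zero. Zero initialisation is an assumption of this
    sketch, not something the abstract specifies.
    """
    extended = []
    for filt in weights:
        if in_channels < len(filt):
            raise ValueError("in_channels must not be smaller than the original")
        kh, kw = len(filt[0]), len(filt[0][0])
        # Deep-copy the pre-trained channels so the original stays untouched.
        new_filt = [[row[:] for row in channel] for channel in filt]
        for _ in range(in_channels - len(filt)):
            new_filt.append([[0.0] * kw for _ in range(kh)])
        extended.append(new_filt)
    return extended


def response_1x1(filt, pixel):
    """Response of one 1x1 filter at one pixel (a list of channel values)."""
    return sum(channel[0][0] * value for channel, value in zip(filt, pixel))


# Toy "pre-trained" layer: 2 filters, 3 input channels, 1x1 kernels.
rgb_weights = [
    [[[0.5]], [[-1.0]], [[2.0]]],
    [[[1.0]], [[0.0]], [[0.25]]],
]
wide_weights = extend_input_channels(rgb_weights, 5)

rgb_pixel = [0.2, 0.4, 0.6]
multispectral_pixel = rgb_pixel + [0.9, 0.1]  # two hypothetical extra bands
```

With zero-initialised extra channels, the widened layer initially gives the same response on a multispectral pixel as the original layer gives on its RGB part, so fine-tuning starts from the pre-trained behaviour.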
Recurrent Segmentation for Variable Computational Budgets
State-of-the-art systems for semantic image segmentation use feed-forward
pipelines with fixed computational costs. Building an image segmentation system
that works across a range of computational budgets is challenging and
time-intensive as new architectures must be designed and trained for every
computational setting. To address this problem we develop a recurrent neural
network that successively improves prediction quality with each iteration.
Importantly, the RNN may be deployed across a range of computational budgets by
merely running the model for a variable number of iterations. We find that this
architecture is uniquely suited for efficiently segmenting videos. By
exploiting the segmentation of past frames, the RNN can perform video
segmentation at similar quality but reduced computational cost compared to
state-of-the-art image segmentation methods. When applied to static images in
the PASCAL VOC 2012 and Cityscapes segmentation datasets, the RNN traces out a
speed-accuracy curve that saturates near the performance of state-of-the-art
segmentation methods.
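The control flow behind the variable-budget idea can be sketched in a few lines of Python. This only illustrates the iteration loop and the warm start from a previous video frame. The names refine and toy_step are hypothetical, and toy_step is a hand-written stand-in for the learned recurrent cell, not the model from the paper.

```python
def refine(features, step, n_iters, state=None):
    """Run `step` for `n_iters` iterations; more iterations cost more compute.

    `state` may carry the result for a previous video frame (warm start).
    """
    if state is None:
        state = [0.0] * len(features)  # cold start for a static image
    for _ in range(n_iters):
        state = step(features, state)
    return state


def toy_step(features, state):
    # Hypothetical stand-in for the learned recurrent cell: move each
    # per-pixel score halfway towards the evidence in `features`.
    return [s + 0.5 * (f - s) for f, s in zip(features, state)]


frame = [1.0, -1.0]                      # toy per-pixel evidence
cheap = refine(frame, toy_step, 1)       # small budget
costly = refine(frame, toy_step, 3)      # larger budget, closer to `frame`
# Next frame: reuse the previous result and spend only one iteration.
warm = refine(frame, toy_step, 1, state=costly)
```

In this toy setting a single warm-started iteration ends closer to the target than three cold-start iterations, which mirrors the claim that past frames let the model reach similar quality at lower cost.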
A Survey on Deep Learning-based Architectures for Semantic Segmentation on 2D images
Semantic segmentation is the pixel-wise labelling of an image. Since the
problem is defined at the pixel level, determining image-level class labels
alone is not sufficient; the labels must also be localised at the original image
pixel resolution. Boosted by the extraordinary ability of convolutional neural
networks (CNNs) to create semantic, high-level and hierarchical image
features, a large number of deep learning-based 2D semantic segmentation
approaches have been proposed within the last decade. In this survey, we mainly
focus on the recent scientific developments in semantic segmentation,
specifically on deep learning-based methods using 2D images. We started with an
analysis of the public image sets and leaderboards for 2D semantic
segmentation, with an overview of the techniques employed in performance
evaluation. In examining the evolution of the field, we chronologically
categorised the approaches into three main periods, namely the pre- and early deep
learning era, the fully convolutional era, and the post-FCN era. We technically
analysed the solutions put forward in terms of solving the fundamental problems
of the field, such as fine-grained localisation and scale invariance. Before
drawing our conclusions, we present a table of methods from all mentioned eras,
with a brief summary of each approach that explains their contribution to the
field. We conclude the survey by discussing the current challenges of the field
and to what extent they have been solved. Comment: Updated with new studie
Recurrent Pixel Embedding for Instance Grouping
We introduce a differentiable, end-to-end trainable framework for solving
pixel-level grouping problems such as instance segmentation. It consists of two
novel components. First, we regress pixels into a hyper-spherical embedding
space so that pixels from the same group have high cosine similarity while
those from different groups have similarity below a specified margin. We
analyze the choice of embedding dimension and margin, relating them to
theoretical results on the problem of distributing points uniformly on the
sphere. Second, to group instances, we utilize a variant of mean-shift
clustering, implemented as a recurrent neural network parameterized by kernel
bandwidth. This recurrent grouping module is differentiable, enjoys convergent
dynamics and probabilistic interpretability. Backpropagating the group-weighted
loss through this module allows learning to focus on only correcting embedding
errors that won't be resolved during subsequent clustering. Our framework,
while conceptually simple and theoretically abundant, is also practically
effective and computationally efficient. We demonstrate substantial
improvements over state-of-the-art instance segmentation for object proposal
generation, and show the benefits of the grouping loss for
classification tasks such as boundary detection and semantic segmentation.
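The two components can be illustrated with a small pure-Python sketch on the unit circle. It is a simplified reading of the abstract, not the authors' code: the hinge form of pair_loss, the exponential (von Mises-Fisher style) kernel in mean_shift_step, and all names and parameter values are assumptions made for the example.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def pair_loss(a, b, same_group, margin):
    """Cosine-similarity loss for one pair of unit-length embeddings.

    Same-group pairs are pulled towards similarity 1; different-group
    pairs are penalised only while their similarity exceeds `margin`.
    """
    cos = dot(a, b)
    return 1.0 - cos if same_group else max(0.0, cos - margin)


def mean_shift_step(points, bandwidth):
    """One mean-shift update on the sphere.

    Each point moves to the kernel-weighted mean of all points and is then
    projected back onto the unit sphere. The kernel exp(bandwidth * cos)
    is an assumption of this sketch.
    """
    shifted = []
    for p in points:
        weights = [math.exp(bandwidth * dot(p, q)) for q in points]
        mean = [sum(w * q[k] for w, q in zip(weights, points))
                for k in range(len(p))]
        shifted.append(normalize(mean))
    return shifted


# Two toy groups of 2-D unit embeddings, given by their angles in radians.
angles = [0.0, 0.2, 3.0, 3.2]
points = [[math.cos(t), math.sin(t)] for t in angles]

grouped = points
for _ in range(5):  # a fixed number of recurrent iterations
    grouped = mean_shift_step(grouped, bandwidth=5.0)
```

After a few iterations the points within each group nearly coincide while the two groups stay far apart, so a simple threshold on cosine similarity would be enough to read off the instances in this toy case.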