TernausNetV2: Fully Convolutional Network for Instance Segmentation
The most common approaches to instance segmentation are complex and use
two-stage networks with object proposals, conditional random fields, template
matching, or recurrent neural networks. In this work we present TernausNetV2, a
simple fully convolutional network that extracts objects from high-resolution
satellite imagery at the instance level. The network has a popular
encoder-decoder architecture with skip connections, but includes a few
essential modifications that allow it to be used for semantic as well as
instance segmentation tasks. The approach is universal: any network that has
been successfully applied to semantic segmentation can be extended to perform
instance segmentation. In addition, we generalize a network encoder that was
pre-trained on RGB images to accept additional input channels, making it
possible to transfer learning from the visual spectrum to a wider spectral
range. For the DeepGlobe-CVPR 2018 building detection sub-challenge, our
approach shows superior performance to other methods based on the public
leaderboard score. The source code and corresponding pre-trained weights are
publicly available at https://github.com/ternaus/TernausNetV
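The encoder generalization the abstract describes (reusing an RGB-pretrained encoder for additional spectral bands) could be sketched roughly as below. The mean-initialization scheme, function name, and weight layout `(out_channels, in_channels, kH, kW)` are illustrative assumptions, not the paper's exact method:

```python
import numpy as np

def widen_first_conv(rgb_weights, extra_channels):
    """Extend an RGB-pretrained first-layer conv kernel of shape
    (out, 3, kH, kW) so it accepts extra input channels.
    Hypothetical initialization: new channels start as the mean of
    the RGB filters, roughly preserving pretrained activations."""
    out_c, in_c, kh, kw = rgb_weights.shape
    assert in_c == 3, "expects an RGB-pretrained kernel"
    mean_filt = rgb_weights.mean(axis=1, keepdims=True)   # (out, 1, kH, kW)
    extra = np.repeat(mean_filt, extra_channels, axis=1)  # (out, e, kH, kW)
    return np.concatenate([rgb_weights, extra], axis=1)   # (out, 3+e, kH, kW)

# e.g. add one near-infrared band to a 64-filter, 7x7 first conv
w_rgb = np.random.randn(64, 3, 7, 7)
w_rgbn = widen_first_conv(w_rgb, extra_channels=1)
```

Keeping the original RGB slice intact means the pretrained features survive unchanged, and only the new channels need to adapt during fine-tuning.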
MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features
In this work, we tackle the problem of instance segmentation, the task of
simultaneously solving object detection and semantic segmentation. Towards this
goal, we present a model, called MaskLab, which produces three outputs: box
detection, semantic segmentation, and direction prediction. Building on top of
the Faster-RCNN object detector, the predicted boxes provide accurate
localization of object instances. Within each region of interest, MaskLab
performs foreground/background segmentation by combining semantic and direction
prediction. Semantic segmentation assists the model in distinguishing between
objects of different semantic classes including background, while the direction
prediction, estimating each pixel's direction towards its corresponding center,
allows separating instances of the same semantic class. Moreover, we explore
the effect of incorporating recent successful methods from both segmentation
and detection (i.e., atrous convolution and hypercolumn). Our proposed model is
evaluated on the COCO instance segmentation benchmark and shows comparable
performance with other state-of-the-art models.
Comment: 10 pages including references
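The direction-prediction target the abstract describes (each pixel's direction toward its instance center, used to separate instances of the same class) could be sketched as below. The bin layout, center-of-mass choice, and function name are illustrative assumptions, not MaskLab's exact scheme:

```python
import numpy as np

def direction_to_center_labels(instance_mask, num_bins=8):
    """For each foreground pixel, quantize the angle from the pixel
    toward its instance's center of mass into one of num_bins
    direction classes; background pixels get label -1."""
    labels = np.full(instance_mask.shape, -1, dtype=int)
    for inst_id in np.unique(instance_mask):
        if inst_id == 0:                       # 0 = background
            continue
        ys, xs = np.nonzero(instance_mask == inst_id)
        cy, cx = ys.mean(), xs.mean()          # instance center of mass
        angles = np.arctan2(cy - ys, cx - xs)  # direction toward center
        bins = ((angles + np.pi) / (2 * np.pi) * num_bins).astype(int)
        labels[ys, xs] = bins % num_bins
    return labels

# two touching instances of the same class get different direction
# patterns, which is what lets the model split them apart
mask = np.array([[1, 1, 2, 2],
                 [1, 1, 2, 2]])
labels = direction_to_center_labels(mask)
```

Because the direction field points inward within each instance, its discontinuity at instance boundaries carries exactly the signal that plain semantic segmentation lacks.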
Gland Instance Segmentation by Deep Multichannel Side Supervision
In this paper, we propose a new image instance segmentation method that
segments individual glands (instances) in colon histology images. This is a
task called instance segmentation that has recently become increasingly
important. The problem is challenging since not only do the glands need to be
segmented from the complex background, they are also required to be
individually identified. Here we leverage the idea of image-to-image prediction
in recent deep learning by building a framework that automatically exploits and
fuses complex multichannel information, regional and boundary patterns, with
side supervision (deep supervision on side responses) in gland histology
images. Our proposed system, deep multichannel side supervision (DMCS),
alleviates heavy hand-crafted feature design by using convolutional neural
networks guided by side supervision. Compared to methods reported in the 2015
MICCAI Gland Segmentation Challenge, we observe state-of-the-art results on a
number of evaluation metrics.
Comment: conditionally accepted at MICCAI 201
SpaceNet MVOI: a Multi-View Overhead Imagery Dataset
Detection and segmentation of objects in overhead imagery is a challenging
task. The variable density, random orientation, small size, and
instance-to-instance heterogeneity of objects in overhead imagery call for
approaches distinct from existing models designed for natural scene datasets.
Though new overhead imagery datasets are being developed, they almost
universally comprise a single view taken from directly overhead ("at nadir"),
failing to address a critical variable: look angle. By contrast, views vary in
real-world overhead imagery, particularly in dynamic scenarios such as natural
disasters where first looks are often over 40 degrees off-nadir. This
represents an important challenge to computer vision methods, as changing view
angle adds distortions, alters resolution, and changes lighting. At present,
the impact of these perturbations for algorithmic detection and segmentation of
objects is untested. To address this problem, we present an open source
Multi-View Overhead Imagery dataset, termed SpaceNet MVOI, with 27 unique looks
from a broad range of viewing angles (-32.5 degrees to 54.0 degrees). Each of
these images covers the same 665 square km geographic extent and is annotated
with 126,747 building footprint labels, enabling direct assessment of the
impact of viewpoint perturbation on model performance. We benchmark multiple
leading segmentation and object detection models on: (1) building detection,
(2) generalization to unseen viewing angles and resolutions, and (3)
sensitivity of building footprint extraction to changes in resolution. We find
that state-of-the-art segmentation and object detection models struggle to
identify buildings in off-nadir imagery and generalize poorly to unseen views,
presenting an important benchmark to explore the broadly relevant challenge of
detecting small, heterogeneous target objects in visually dynamic contexts.
Comment: Accepted into IEEE International Conference on Computer Vision (ICCV) 201
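One reason look angle "alters resolution", as the abstract notes, can be illustrated with a simplified flat-earth geometric sketch (my assumption for illustration, not a model from the dataset paper): slant range to the ground grows as 1/cos(theta) at off-nadir angle theta, so ground sample distance degrades at least proportionally:

```python
import math

def effective_gsd(nadir_gsd_m, off_nadir_deg):
    """Rough flat-earth approximation: at look angle theta off nadir
    the slant range grows as 1/cos(theta), so the ground sample
    distance (meters per pixel) coarsens at least proportionally."""
    return nadir_gsd_m / math.cos(math.radians(off_nadir_deg))

effective_gsd(0.31, 0.0)   # at nadir: unchanged
effective_gsd(0.31, 40.0)  # beyond 40 degrees off-nadir: noticeably coarser
```

This compounding of resolution loss with viewing distortion is part of why models trained on near-nadir views transfer poorly to far-off-nadir ones.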
