A Comparison and Strategy of Semantic Segmentation on Remote Sensing Images
In recent years, with the development of aerospace technology, we use more
and more images captured by satellites to obtain information. But a large
number of useless raw images, limited data storage resources and poor
transmission capability on satellites hinder our use of valuable images.
Therefore, it is necessary to deploy an on-orbit semantic segmentation model to
filter out useless images before data transmission. In this paper, we present a
detailed comparison of recent deep learning models. Considering the
computing environment of satellites, we compare methods in terms of accuracy,
parameter count and resource consumption on the same public dataset. We also
analyze the relation between them. Based on experimental results, we further
propose a viable on-orbit semantic segmentation strategy. It will be deployed
on the TianZhi-2 satellite which supports deep learning methods and will be
launched soon.
Comment: 8 pages, 3 figures, ICNC-FSKD 201
Dense semantic labeling of sub-decimeter resolution images with convolutional neural networks
Semantic labeling (or pixel-level land-cover classification) in ultra-high
resolution imagery (< 10cm) requires statistical models able to learn high
level concepts from spatial data, with large appearance variations.
Convolutional Neural Networks (CNNs) achieve this goal by learning
discriminatively a hierarchy of representations of increasing abstraction.
In this paper we present a CNN-based system relying on a
downsample-then-upsample architecture. Specifically, it first learns a rough
spatial map of high-level representations by means of convolutions and then
learns to upsample them back to the original resolution by deconvolutions. By
doing so, the CNN learns to densely label every pixel at the original
resolution of the image. This results in many advantages, including i)
state-of-the-art numerical accuracy, ii) improved geometric accuracy of
predictions and iii) high efficiency at inference time.
We test the proposed system on the Vaihingen and Potsdam sub-decimeter
resolution datasets, involving semantic labeling of aerial images of 9cm and
5cm resolution, respectively. These datasets are composed of many large and
fully annotated tiles allowing an unbiased evaluation of models making use of
spatial information. We do so by comparing two standard CNN architectures to
the proposed one: standard patch classification, prediction of local label
patches by employing only convolutions and full patch labeling by employing
deconvolutions. All the systems compare favorably with or outperform a
state-of-the-art baseline relying on superpixels and powerful appearance
descriptors. The proposed full patch labeling CNN outperforms these models by a
large margin, also showing a very appealing inference time.
Comment: Accepted in IEEE Transactions on Geoscience and Remote Sensing, 201
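The downsample-then-upsample idea described in this abstract can be illustrated with simple size arithmetic. The sketch below is not the authors' network: the kernel size, stride, padding and number of stages are hypothetical values chosen only to show how strided convolutions shrink a spatial map and how deconvolutions (transposed convolutions) can bring it back to the original resolution.

```python
def conv_out(size, kernel, stride, padding):
    """Output size of a strided convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size, kernel, stride, padding):
    """Output size of a transposed convolution (deconvolution) along one axis."""
    return (size - 1) * stride - 2 * padding + kernel


def trace_sizes(size, n_stages, kernel=4, stride=2, padding=1):
    """Track the spatial size through n_stages of downsampling, then upsampling.

    kernel, stride and padding are illustrative assumptions, not values
    taken from the paper.
    """
    sizes = [size]
    for _ in range(n_stages):  # downsampling path (convolutions)
        sizes.append(conv_out(sizes[-1], kernel, stride, padding))
    for _ in range(n_stages):  # upsampling path (deconvolutions)
        sizes.append(deconv_out(sizes[-1], kernel, stride, padding))
    return sizes


if __name__ == "__main__":
    print(trace_sizes(256, 2))
```

With these assumed settings each convolution halves the map and each deconvolution doubles it, so the final size matches the input and every pixel can receive a label.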
Land cover mapping at very high resolution with rotation equivariant CNNs: towards small yet accurate models
In remote sensing images, the absolute orientation of objects is arbitrary.
Depending on an object's orientation and on a sensor's flight path, objects of
the same semantic class can be observed in different orientations in the same
image. Equivariance to rotation, in this context understood as responding with
a rotated semantic label map when subject to a rotation of the input image, is
therefore a very desirable feature, in particular for high capacity models,
such as Convolutional Neural Networks (CNNs). If rotation equivariance is
encoded in the network, the model is confronted with a simpler task and does
not need to learn specific (and redundant) weights to address rotated versions
of the same object class. In this work we propose a CNN architecture called
Rotation Equivariant Vector Field Network (RotEqNet) to encode rotation
equivariance in the network itself. By using rotating convolutions as building
blocks and passing only the values corresponding to the maximally
activating orientation throughout the network in the form of orientation
encoding vector fields, RotEqNet treats rotated versions of the same object
with the same filter bank and therefore achieves state-of-the-art performances
even when using very small architectures trained from scratch. We test RotEqNet
in two challenging sub-decimeter resolution semantic labeling problems, and
show that we can perform better than a standard CNN while requiring one order
of magnitude fewer parameters.
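A toy illustration of keeping only the maximally activating orientation, as this abstract describes, might look as follows. It is a simplified sketch and not RotEqNet itself: it assumes only four filter orientations (multiples of 90 degrees, so no interpolation is needed), evaluates a single 3x3 patch rather than a full image, and the helper names `rot90`, `response` and `max_orientation` are hypothetical.

```python
def rot90(m):
    """Rotate a square matrix 90 degrees counter-clockwise."""
    n = len(m)
    return [[m[j][n - 1 - i] for j in range(n)] for i in range(n)]


def response(patch, filt):
    """Correlation of a patch with a filter of the same size."""
    return sum(p * f for pr, fr in zip(patch, filt) for p, f in zip(pr, fr))


def max_orientation(patch, filt):
    """Apply one filter at 0, 90, 180 and 270 degrees and keep only the
    strongest response, returned as (magnitude, angle in degrees)."""
    best = None
    f = filt
    for k in range(4):
        r = response(patch, f)
        if best is None or r > best[0]:
            best = (r, 90 * k)
        f = rot90(f)
    return best


if __name__ == "__main__":
    edge_filter = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]  # vertical edge detector
    patch = [[0, 0, 1], [0, 0, 1], [0, 0, 1]]           # bright right column
    print(max_orientation(patch, edge_filter))          # original patch
    print(max_orientation(rot90(patch), edge_filter))   # patch rotated by 90 degrees
```

Rotating the input patch leaves the magnitude unchanged and shifts the reported angle by the same rotation, which is the equivariance property in miniature: a single filter serves all four orientations instead of four separately learned ones.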