3 research outputs found
Land cover mapping at very high resolution with rotation equivariant CNNs: towards small yet accurate models
In remote sensing images, the absolute orientation of objects is arbitrary.
Depending on an object's orientation and on a sensor's flight path, objects of
the same semantic class can be observed in different orientations in the same
image. Equivariance to rotation, in this context understood as responding with
a rotated semantic label map when subject to a rotation of the input image, is
therefore a very desirable feature, in particular for high-capacity models,
such as Convolutional Neural Networks (CNNs). If rotation equivariance is
encoded in the network, the model is confronted with a simpler task and does
not need to learn specific (and redundant) weights to address rotated versions
of the same object class. In this work we propose a CNN architecture called
Rotation Equivariant Vector Field Network (RotEqNet) to encode rotation
equivariance in the network itself. By using rotating convolutions as building
blocks and passing only the values corresponding to the maximally
activating orientation throughout the network in the form of orientation
encoding vector fields, RotEqNet treats rotated versions of the same object
with the same filter bank and therefore achieves state-of-the-art performance
even when using very small architectures trained from scratch. We test RotEqNet
on two challenging sub-decimeter resolution semantic labeling problems, and
show that we can perform better than a standard CNN while requiring one order
of magnitude fewer parameters.
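The rotating-convolution idea described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`correlate_valid`, `rotating_conv`, `to_vector_field`) are hypothetical, a single filter is used, and the filter is rotated only by multiples of 90 degrees because those rotations are exact on a pixel grid (the abstract does not state the angular sampling used in RotEqNet). The sketch applies one filter at several orientations, keeps only the maximally activating response per pixel, and encodes the result as a vector field.

```python
import numpy as np


def correlate_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation using NumPy only."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    # Accumulate one shifted, weighted copy of the image per kernel entry.
    for a in range(kh):
        for b in range(kw):
            out += kernel[a, b] * image[a:a + out_h, b:b + out_w]
    return out


def rotating_conv(image, kernel):
    """Apply the same filter at 4 orientations (assumption: 90-degree steps).

    Returns the per-pixel maximum response and the index k of the
    orientation (k * 90 degrees, counterclockwise) that produced it.
    """
    responses = np.stack(
        [correlate_valid(image, np.rot90(kernel, k)) for k in range(4)]
    )
    return responses.max(axis=0), responses.argmax(axis=0)


def to_vector_field(magnitude, best):
    """Encode (magnitude, orientation index) as a 2-D vector per pixel."""
    angle = best * (np.pi / 2)
    return magnitude * np.cos(angle), magnitude * np.sin(angle)


# Illustrative data only: a random image and a random 3x3 filter.
rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))
kernel = rng.normal(size=(3, 3))

mag, best = rotating_conv(image, kernel)
mag_rot, best_rot = rotating_conv(np.rot90(image), kernel)

# Equivariance check: rotating the input rotates the magnitude map and
# shifts every winning orientation index by one step (mod 4).
print(np.allclose(mag_rot, np.rot90(mag)))
print(np.array_equal(best_rot, (np.rot90(best) + 1) % 4))
```

Because every orientation shares the same weights, the number of learned parameters does not grow with the number of orientations, which is consistent with the abstract's point about not learning redundant weights for rotated versions of the same object.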
Injecting spatial priors in Earth observation with machine vision
Remote Sensing (RS) imagery with submeter resolution is becoming ubiquitous. Whether acquired from satellites, aerial campaigns or Unmanned Aerial Vehicles, imagery at this spatial resolution makes it possible to recognize individual objects and their parts from above. Over the last few years, this has driven strong interest in the RS community in Computer Vision (CV) methods developed for the automated understanding of natural images. A central element in the success of CV is the use of prior information about the image generation process and the objects these images contain: neighboring pixels are likely to belong to the same object; objects of the same nature tend to look similar regardless of their location in the image; certain objects tend to occur in particular geometric configurations; etc. When using RS imagery, additional prior knowledge exists on how the images were formed, since we roughly know the geographical location of the objects (the geospatial prior) and the direction from which they were observed (the overhead-view prior). This thesis explores ways of encoding these priors in CV models to improve their performance on RS imagery, with a focus on land-cover and land-use mapping.