41 research outputs found
Recurrently Exploring Class-wise Attention in A Hybrid Convolutional and Bidirectional LSTM Network for Multi-label Aerial Image Classification
Aerial image classification is of great significance in the remote sensing
community, and much research has been conducted over the past few years.
Most of these studies focus on categorizing an image into one
semantic label, while in the real world, an aerial image is often associated
with multiple labels, e.g., multiple object-level labels in our case. Besides,
a comprehensive picture of present objects in a given high resolution aerial
image can provide more in-depth understanding of the studied region. For these
reasons, aerial image multi-label classification has been attracting increasing
attention. However, one limitation shared by existing methods in the
community is that the co-occurrence relationship among classes, the so-called
class dependency, is underexplored, which leads to ill-considered decisions. In
this paper, we propose a novel end-to-end network, namely class-wise
attention-based convolutional and bidirectional LSTM network (CA-Conv-BiLSTM),
for this task. The proposed network consists of three indispensable components:
1) a feature extraction module, 2) a class attention learning layer, and 3) a
bidirectional LSTM-based sub-network. Particularly, the feature extraction
module is designed for extracting fine-grained semantic feature maps, while the
class attention learning layer aims at capturing discriminative class-specific
features. As the most important part, the bidirectional LSTM-based sub-network
models the underlying class dependency in both directions and produces
structured multiple object labels. Experimental results on the UCM multi-label
dataset and the DFC15 multi-label dataset validate the effectiveness of our
model quantitatively and qualitatively.
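The class attention idea described above can be illustrated with a minimal sketch. It assumes that each class owns one unnormalized score per spatial position and that a softmax over positions turns those scores into pooling weights; the abstract does not state the exact formulation, so the function names, the softmax normalization, and the toy inputs are all hypothetical. In the described network, the resulting class-specific vectors would then be passed to the bidirectional LSTM, which is not shown here.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def class_wise_attention(features, class_scores):
    """Pool one feature vector per class (hypothetical helper).

    features:     list of P spatial positions, each a list of D floats.
    class_scores: list of C classes, each a list of P unnormalized scores.
    Returns a list of C pooled feature vectors, each of length D.
    """
    dim = len(features[0])
    pooled = []
    for scores in class_scores:
        weights = softmax(scores)  # attention over positions for this class
        pooled.append([
            sum(w * position[d] for w, position in zip(weights, features))
            for d in range(dim)
        ])
    return pooled

# Toy input (assumed): 3 positions with 2-D features, 2 classes.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
class_scores = [
    [10.0, -10.0, -10.0],  # class 0 attends almost entirely to position 0
    [0.0, 0.0, 0.0],       # class 1 attends uniformly to all positions
]
class_features = class_wise_attention(features, class_scores)
```

With uniform scores, the pooled vector is simply the mean of the position features; sharply peaked scores make it approach the feature at the attended position.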
A Deep Multi-Attention Driven Approach for Multi-Label Remote Sensing Image Classification
Deep learning (DL) based methods have become popular in the framework of remote sensing (RS) image scene classification. Most existing DL-based methods assume that training images are annotated with single labels; however, RS images typically contain multiple classes and can thus be associated with multiple labels simultaneously. Despite the success of existing methods in describing the information content of very high resolution aerial images with RGB bands, any direct adaptation to high-dimensional, high-spatial-resolution RS images falls short of accurately modeling the spectral and spatial information content. To address this problem, this paper presents a novel approach in the framework of the multi-label classification of high-dimensional RS images. The proposed approach is based on three main steps. The first step describes the complex spatial and spectral content of image local areas by a novel K-Branch CNN that includes spatial-resolution-specific CNN branches. The second step initially characterizes the importance scores of different local areas of each image and then defines a global descriptor for each image based on these scores. This is achieved by a novel multi-attention strategy that utilizes bidirectional long short-term memory networks. The final step achieves the classification of RS image scenes with multiple labels. Experiments carried out on BigEarthNet (a large-scale Sentinel-2 benchmark archive) show the effectiveness of the proposed approach in terms of multi-label classification accuracy compared to state-of-the-art approaches. The code of the proposed approach is publicly available at https://gitlab.tubit.tuberlin.de/rsim/MAML-RSIC. EC/H2020/759764/EU/Accurate and Scalable Processing of Big Data in Earth Observation/BigEart
Relation Network for Multi-label Aerial Image Classification
Multi-label classification plays an important role in perceiving the intricate
contents of an aerial image and has triggered several related studies in recent
years. However, most of them devote little effort to exploiting label relations,
while such dependencies are crucial for making accurate predictions. Although
an LSTM layer can be introduced to model such label dependencies in a chain
propagation manner, the efficiency might be questioned when certain labels are
improperly inferred. To address this, we propose a novel aerial image
multi-label classification network, attention-aware label relational reasoning
network. Particularly, our network consists of three elemental modules: 1) a
label-wise feature parcel learning module, 2) an attentional region extraction
module, and 3) a label relational inference module. To be more specific, the
label-wise feature parcel learning module is designed for extracting high-level
label-specific features. The attentional region extraction module aims at
localizing discriminative regions in these features and yielding attentional
label-specific features. The label relational inference module finally predicts
label existences using label relations reasoned from outputs of the previous
module. The proposed network is characterized by its capacity to extract
discriminative label-wise features in a proposal-free way and to reason about
label relations naturally and interpretably. In our experiments, we evaluate
the proposed model on the UCM multi-label dataset and a newly produced dataset,
the AID multi-label dataset. Quantitative and qualitative results on these two
datasets demonstrate the effectiveness of our model. To facilitate progress in
multi-label aerial image classification, the AID multi-label dataset will
be made publicly available.
Deep Learning for Aerial Scene Understanding in High Resolution Remote Sensing Imagery from the Lab to the Wild
This thesis presents the application of deep learning to aerial scene understanding, e.g., aerial scene recognition, multi-label object classification, and semantic segmentation. Beyond training deep networks under laboratory conditions, this work also provides learning strategies for practical scenarios, e.g., when data are collected without constraints or annotations are scarce.
Exploring semantic relationships for hierarchical land use classification based on convolutional neural networks
Land use (LU) is an important information source commonly stored in geospatial databases. Most current work on automatic LU classification for updating topographic databases considers only one category level (e.g. residential or agricultural) consisting of a small number of classes. However, LU databases frequently contain very detailed information, using a hierarchical object catalogue where the number of categories differs depending on the hierarchy level. This paper presents a method for the classification of LU on the basis of aerial images that differentiates a fine-grained class structure, exploiting the hierarchical relationship between categories at different levels of the class catalogue. Starting from a convolutional neural network (CNN) for classifying the categories of all levels, we propose a strategy to simultaneously learn the semantic dependencies between different category levels explicitly. The input to the CNN consists of aerial images and derived data as well as land cover information derived from semantic segmentation. Its output is the class scores at three different semantic levels, based on which predictions that are consistent with the class hierarchy are made. We evaluate our method using two test sites and show how the classification accuracy depends on the semantic category level. While at the coarsest level, an overall accuracy in the order of 90% can be achieved, at the finest level, this accuracy is reduced to around 65%. Our experiments also show which classes are particularly hard to differentiate. © 2020 Copernicus GmbH. All rights reserved
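One simple way to turn per-level class scores into a prediction that is consistent with the class hierarchy is to score every valid root-to-leaf path jointly and pick the best one. The sketch below assumes probability-like scores in (0, 1] and a small made-up three-level catalogue; the class names, the `PATHS` table, and the joint log-score rule are illustrative assumptions, not the method from the paper.

```python
import math

# Hypothetical catalogue: every valid (coarse, medium, fine) path.
PATHS = [
    ("settlement", "residential", "detached_housing"),
    ("settlement", "residential", "apartment_block"),
    ("settlement", "industry", "factory"),
    ("vegetation", "agriculture", "cropland"),
    ("vegetation", "agriculture", "orchard"),
]

def consistent_prediction(scores_per_level, paths=PATHS):
    """Return the valid path with the highest joint log-score.

    scores_per_level: three dicts (coarse, medium, fine) mapping a class
    name to a probability-like score in (0, 1].
    """
    def joint_log_score(path):
        return sum(
            math.log(level_scores[name])
            for level_scores, name in zip(scores_per_level, path)
        )
    return max(paths, key=joint_log_score)

# Toy scores (assumed). Taking the argmax at each level independently
# would give settlement / agriculture / cropland, which is not a valid path.
scores = [
    {"settlement": 0.55, "vegetation": 0.45},
    {"residential": 0.20, "industry": 0.30, "agriculture": 0.50},
    {"detached_housing": 0.10, "apartment_block": 0.10, "factory": 0.15,
     "cropland": 0.60, "orchard": 0.05},
]
prediction = consistent_prediction(scores)
```

Here the joint rule overrides the slightly higher coarse score for "settlement" because the medium and fine levels agree strongly on the "vegetation" branch.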
Learning multi-label aerial image classification under label noise: a regularization approach using word embeddings
Training deep neural networks requires well-annotated datasets. However, real-world datasets are often noisy, especially in a multi-label scenario, i.e., where each data point can be attributed to more than one class. To address this, we propose a regularization method to learn multi-label classification networks from noisy data. This regularization is based on the assumption that semantically close classes are more likely to appear together in a given image. In this way, we encode label correlations with prior knowledge and use these correlations to regularize noisy network predictions. To evaluate its effectiveness, we perform experiments on a multi-label aerial image dataset contaminated with controlled levels of label noise. Results indicate that networks trained using the proposed method outperform those learned directly from noisy labels and that the benefits increase proportionally to the amount of noise present.
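A controlled level of label noise can be simulated by flipping each entry of a binary label matrix independently with a fixed probability. The sketch below is one plausible reading of "controlled levels of label noise"; the abstract does not specify the noise model, so the symmetric flip rule, the function name `inject_label_noise`, and the `seed` parameter are assumptions.

```python
import random

def inject_label_noise(labels, noise_rate, seed=0):
    """Flip each binary label independently with probability noise_rate.

    labels:     list of rows, each a list of 0/1 class indicators.
    noise_rate: probability in [0, 1] of flipping any single entry.
    seed:       seed for a private random generator, for reproducibility.
    Returns a new label matrix; the input is left unchanged.
    """
    rng = random.Random(seed)
    return [
        [1 - bit if rng.random() < noise_rate else bit for bit in row]
        for row in labels
    ]

# Toy multi-label annotations (assumed): 3 images, 4 classes.
clean = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
]
unchanged = inject_label_noise(clean, noise_rate=0.0)
all_flipped = inject_label_noise(clean, noise_rate=1.0)
noisy = inject_label_noise(clean, noise_rate=0.2, seed=42)
```

A symmetric flip adds both wrong and missing labels; an asymmetric variant would use separate rates for 0-to-1 and 1-to-0 flips.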
Semantic Interleaving Global Channel Attention for Multilabel Remote Sensing Image Classification
Multi-Label Remote Sensing Image Classification (MLRSIC) has received
increasing research interest. Taking the co-occurrence relationship of multiple
labels as additional information helps to improve the performance of this task.
Current methods focus on using it to constrain the final feature output of a
Convolutional Neural Network (CNN). On the one hand, these methods do not make
full use of label correlation to form feature representation. On the other
hand, they increase the label noise sensitivity of the system, resulting in
poor robustness. In this paper, a novel method called Semantic Interleaving
Global Channel Attention (SIGNA) is proposed for MLRSIC. First, the label
co-occurrence graph is obtained according to the statistical information of the
data set. The label co-occurrence graph is used as the input of the Graph
Neural Network (GNN) to generate optimal feature representations. Then, the
semantic features and visual features are interleaved, to guide the feature
expression of the image from the original feature space to the semantic feature
space with embedded label relations. SIGNA triggers global attention over
feature map channels in a new semantic feature space to extract more important
visual features. Multihead SIGNA-based feature adaptive weighting networks are
proposed to act on any layer of a CNN in a plug-and-play manner. For remote
sensing images, better classification performance can be achieved by inserting
SIGNA into the shallow layers of the CNN. We conduct extensive experimental comparisons on
three data sets: UCM data set, AID data set, and DFC15 data set. Experimental
results demonstrate that the proposed SIGNA achieves superior classification
performance compared to state-of-the-art (SOTA) methods. The code is open to
the community for reproducibility research and is available at
https://github.com/kyle-one/SIGNA. Comment: 14 pages, 13 figures
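A label co-occurrence graph of the kind mentioned above can be estimated from the training annotations alone. The sketch below uses a common construction in which entry (i, j) estimates the conditional probability that label j is present given that label i is present; whether SIGNA uses exactly this normalization is not stated in the abstract, so the function name and the normalization are assumptions.

```python
def cooccurrence_matrix(labels):
    """Estimate P(label j present | label i present) from binary labels.

    labels: list of N rows, each a list of C 0/1 class indicators.
    Returns a C x C list of lists; rows of labels that never occur are zero.
    """
    num_classes = len(labels[0])
    counts = [[0] * num_classes for _ in range(num_classes)]
    for row in labels:
        present = [index for index, bit in enumerate(row) if bit]
        for i in present:
            for j in present:
                counts[i][j] += 1  # the diagonal counts occurrences of label i
    matrix = []
    for i in range(num_classes):
        occurrences = counts[i][i]
        matrix.append([
            counts[i][j] / occurrences if occurrences else 0.0
            for j in range(num_classes)
        ])
    return matrix

# Toy annotations (assumed), with classes ordered as: building, car, water.
labels = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
graph = cooccurrence_matrix(labels)
```

The matrix is generally asymmetric: in the toy data every image with a car also contains a building, but only two of the three images with a building contain a car.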
Multi-Label Noise Robust Collaborative Learning Model for Remote Sensing Image Classification
The development of accurate methods for multi-label classification (MLC) of
remote sensing (RS) images is one of the most important research topics in RS.
Methods based on Deep Convolutional Neural Networks (CNNs) have shown strong
performance gains in RS MLC problems. However, CNN-based methods usually
require a large number of reliable training images annotated by multiple
land-cover class labels. Collecting such data is time-consuming and costly. To
address this problem, the publicly available thematic products, which can
include noisy labels, can be used to annotate RS images with zero-labeling
cost. However, multi-label noise (which can be associated with wrong and
missing label annotations) can distort the learning process of the MLC
algorithm. The detection and correction of label noise are challenging tasks,
especially in a multi-label scenario, where each image can be associated with
more than one label. To address this problem, we propose a novel noise robust
collaborative multi-label learning (RCML) method to alleviate the adverse
effects of multi-label noise during the training phase of the CNN model. RCML
identifies, ranks and excludes noisy multi-labels in RS images based on three
main modules: 1) discrepancy module; 2) group lasso module; and 3) swap module.
The discrepancy module ensures that the two networks learn diverse features
while producing the same predictions. The group lasso module detects
potentially noisy labels assigned to the multi-labeled training images, while
the swap module exchanges ranking information between the two networks.
Unlike existing methods that make assumptions
about the noise distribution, our proposed RCML does not make any prior
assumption about the type of noise in the training set. Our code is publicly
available online: http://www.noisy-labels-in-rs.org
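The exchange of ranking information between two networks can be illustrated with a co-teaching-style selection, in which each network ranks the samples of a batch by its own loss and the other network trains on the low-loss subset. This is a swapped-in, simplified stand-in rather than the RCML swap module itself; the function names, the `keep_ratio` parameter, and the loss values are hypothetical.

```python
def low_loss_indices(losses, keep_ratio):
    """Indices of the keep_ratio fraction of samples with the smallest loss."""
    keep = max(1, int(len(losses) * keep_ratio))
    ranked = sorted(range(len(losses)), key=lambda index: losses[index])
    return sorted(ranked[:keep])

def swap_selections(losses_a, losses_b, keep_ratio):
    """Each network receives the samples the *other* network ranked as clean."""
    train_a = low_loss_indices(losses_b, keep_ratio)  # chosen by network B
    train_b = low_loss_indices(losses_a, keep_ratio)  # chosen by network A
    return train_a, train_b

# Toy per-sample losses (assumed) for one batch of four images.
losses_a = [0.1, 2.5, 0.3, 0.2]
losses_b = [0.2, 0.1, 3.0, 0.4]
train_a, train_b = swap_selections(losses_a, losses_b, keep_ratio=0.5)
```

Exchanging the selections, instead of letting each network filter its own batch, reduces the chance that one network reinforces its own mistakes on noisy samples.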