Very high-resolution (VHR) remote sensing (RS) scene classification is a
challenging task owing to high inter-class similarity and intra-class
variability. Deep learning (DL)-based methods have recently shown great
promise for VHR RS scene classification, but their classification performance
remains unstable. To address this problem, in this letter we propose a novel
DL-based approach. Specifically, we devise an enhanced VHR attention module
(EAM), followed by atrous spatial pyramid pooling (ASPP) and global average
pooling (GAP), which yields enhanced features at each level of the backbone.
The multi-level features are then fused. Experimental results on two widely
used VHR RS datasets
show that the proposed approach yields competitive and robust classification
performance, with a standard deviation as low as 0.001. Further, the highest
overall accuracies on the AID and NWPU datasets are 95.39% and 93.04%,
respectively.

Comment: This paper is under consideration in the International Journal of
Intelligent Systems (Wiley) journal. Based on the journal's policy and
restrictions, this version may be updated or deleted.
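As an illustration of the pooling and fusion steps the abstract describes, below is a minimal NumPy sketch of global average pooling applied to per-level feature maps followed by multi-level feature fusion. The shapes, function names, and the choice of fusion by concatenation are assumptions made for illustration only, not the authors' implementation; the EAM and ASPP stages are omitted.

```python
import numpy as np

def global_average_pool(feat):
    """Global average pooling: reduce a (C, H, W) feature map to a (C,) vector."""
    return feat.mean(axis=(1, 2))

def fuse_multilevel(feats):
    """Fuse features from several backbone levels by pooling then concatenating.
    Concatenation is one plausible fusion choice, assumed here for illustration."""
    return np.concatenate([global_average_pool(f) for f in feats])

# Hypothetical example: three backbone levels with 64, 128, and 256 channels.
rng = np.random.default_rng(0)
levels = [rng.random((c, 8, 8)) for c in (64, 128, 256)]
fused = fuse_multilevel(levels)
print(fused.shape)  # (448,) — one vector of 64 + 128 + 256 fused features
```

The fused vector would then feed a classifier head; in practice each level's features would first pass through the attention and ASPP stages described above.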