Remote sensing data has been widely used for various Earth Observation (EO)
tasks such as land use and land cover classification, weather forecasting,
agricultural management, and environmental monitoring. Most existing remote
sensing models rely on supervised learning, which requires large and
representative human-labelled datasets for training; producing these labels is
costly and time-consuming. Recently, self-supervised learning (SSL) has enabled
models to learn a representation from orders of magnitude more unlabelled data. This
representation has been shown to boost the performance of downstream tasks and
has potential for remote sensing applications. The success of SSL is heavily
dependent on a pre-designed pretext task, which introduces an inductive bias
into the model from a large amount of unlabelled data. Since remote sensing
imagery has rich spectral information beyond the standard RGB colour space, the
pretext tasks established in computer vision for RGB images may not extend
straightforwardly to the multispectral and hyperspectral domains. To address
this challenge, this work designs a novel SSL framework capable of learning
representations from both the spectral and spatial information of unlabelled
data. The framework contains two novel pretext tasks for object-based and
pixel-based remote sensing data analysis methods, respectively. Evaluation on
two typical downstream tasks (a multi-label land cover classification task on
Sentinel-2 multispectral datasets and a ground soil parameter retrieval task on
hyperspectral datasets) demonstrates that the representation learned by the
proposed SSL framework yields a significant improvement in model performance.