In this paper, we propose a fully supervised pre-training scheme based on
contrastive learning particularly tailored to dense classification tasks. The
proposed Context-Self Contrastive Loss (CSCL) learns an embedding space that
makes semantic boundaries pop up by using a similarity metric between every
location in a training sample and its local context. For crop type semantic
segmentation from Satellite Image Time Series (SITS), we find performance at
parcel boundaries to be a critical bottleneck and explain how CSCL tackles the
underlying cause of that problem, improving the state-of-the-art performance in
this task. Additionally, using images from the Sentinel-2 (S2) satellite
missions, we compile the largest, to our knowledge, SITS dataset densely
annotated by crop type and parcel identities, which we make publicly available
together with the data generation pipeline. Using that data, we find CSCL, even
with minimal pre-training, to improve all respective baselines and present a
process for semantic segmentation at super-resolution for obtaining crop
classes at a more granular level. The code and instructions to download the
data can be found at https://github.com/michaeltrs/DeepSatModels.

Comment: 15 pages, 17 figures
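
To make the idea of comparing every location with its local context concrete, the following is a minimal sketch rather than the CSCL implementation from the repository. It assumes cosine similarity as the similarity metric, a square window of radius `k` as the local context, and a binary cross-entropy between the rescaled similarity and a same-class indicator as the objective. All function names and these choices are hypothetical and serve only as illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def context_self_pairs(emb, labels, k=1):
    """Pair every location with its neighbours in a (2k+1) x (2k+1) window.

    emb:    H x W grid of embedding vectors (nested lists)
    labels: H x W grid of class ids
    Returns a list of (similarity, target) tuples, where target is 1.0
    when the two locations share a class and 0.0 otherwise.
    """
    height, width = len(labels), len(labels[0])
    pairs = []
    for i in range(height):
        for j in range(width):
            for di in range(-k, k + 1):
                for dj in range(-k, k + 1):
                    ni, nj = i + di, j + dj
                    if (di, dj) == (0, 0):
                        continue  # skip the location itself
                    if not (0 <= ni < height and 0 <= nj < width):
                        continue  # skip neighbours outside the sample
                    sim = cosine(emb[i][j], emb[ni][nj])
                    target = 1.0 if labels[i][j] == labels[ni][nj] else 0.0
                    pairs.append((sim, target))
    return pairs


def local_contrastive_loss(emb, labels, k=1, eps=1e-7):
    """Mean binary cross-entropy over all location-neighbour pairs.

    The similarity in [-1, 1] is rescaled to (0, 1) and treated as the
    probability that both locations belong to the same class. This
    objective is an assumption made for this sketch.
    """
    pairs = context_self_pairs(emb, labels, k)
    total = 0.0
    for sim, target in pairs:
        p = min(max((sim + 1.0) / 2.0, eps), 1.0 - eps)
        total -= target * math.log(p) + (1.0 - target) * math.log(1.0 - p)
    return total / len(pairs)
```

Under these assumptions, the loss is small when neighbouring locations of the same class have similar embeddings and neighbours across a class boundary have dissimilar ones, which is the behaviour that makes boundaries stand out in the embedding space.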