260 research outputs found
Multimodal Probabilistic Latent Semantic Analysis for Sentinel-1 and Sentinel-2 Image Fusion
Probabilistic topic models have recently shown great potential in the remote sensing image fusion field, particularly for land-cover categorization tasks. This letter first studies the application of probabilistic latent semantic analysis (pLSA) and latent Dirichlet allocation to unsupervised land-cover categorization of remote sensing synthetic aperture radar (SAR) and multispectral imaging (MSI) data. Then, a novel pLSA-based image fusion approach is presented, which seeks to uncover multimodal feature patterns from SAR and MSI data in order to effectively fuse and categorize Sentinel-1 and Sentinel-2 remotely sensed data. Experiments conducted over two different data sets reveal the advantages of the proposed approach for unsupervised land-cover categorization tasks.
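To make the topic-model machinery concrete, the following is a minimal pLSA sketch fitted with expectation-maximization over counts of quantized "visual words". The input matrix is synthetic and the idea of concatenating SAR- and MSI-derived word counts per patch is only an illustration of the multimodal setting, not the letter's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical input: per-patch counts of quantized SAR+MSI "visual words"
# (e.g., columns 0-19 from a SAR codebook, 20-39 from an MSI codebook).
X = rng.integers(0, 5, size=(30, 40)).astype(float)

def plsa(X, n_topics=3, n_iter=50):
    """Fit pLSA by EM: X is approximated via sum_z P(z|d) P(w|z)."""
    n_docs, n_words = X.shape
    p_z_d = rng.random((n_docs, n_topics))        # P(z|d), topics per patch
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))       # P(w|z), words per topic
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: responsibilities P(z|d,w), shape (docs, words, topics).
        joint = p_z_d[:, None, :] * p_w_z.T[None, :, :]
        joint /= joint.sum(axis=2, keepdims=True) + 1e-12
        # M-step: re-estimate both distributions from expected counts.
        counts = X[:, :, None] * joint
        p_w_z = counts.sum(axis=0).T
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = counts.sum(axis=1)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d, p_w_z

p_z_d, p_w_z = plsa(X)
labels = p_z_d.argmax(axis=1)   # unsupervised land-cover category per patch
```

The dominant topic per patch then serves directly as an unsupervised land-cover label, which is the sense in which topic models support categorization here.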
Sentinel-2 and Sentinel-3 Intersensor Vegetation Estimation via Constrained Topic Modeling
This letter presents a novel intersensor vegetation estimation framework, which aims at combining Sentinel-2 (S2) spatial resolution with Sentinel-3 (S3) spectral characteristics in order to generate fused vegetation maps. On the one hand, the multispectral instrument (MSI), carried by S2, provides high spatial resolution images. On the other hand, the Ocean and Land Color Instrument (OLCI), one of the instruments of S3, captures the Earth's surface at a substantially coarser spatial resolution but using narrower spectral bandwidths, which makes the OLCI data better suited to highlighting specific spectral features and motivates the development of synergistic fusion products. In this scenario, the approach presented here takes advantage of the proposed constrained probabilistic latent semantic analysis (CpLSA) model to produce intersensor vegetation estimations, which synergistically exploit MSI's spatial resolution and OLCI's spectral characteristics. Initially, CpLSA is used to uncover the MSI reflectance patterns that are able to represent the OLCI-derived vegetation. Then, the original MSI data are projected onto this higher abstraction-level representation space in order to generate a high-resolution version of the vegetation captured in the OLCI domain. Our experimental comparison, conducted using four data sets, three different regression algorithms, and two vegetation indices, reveals that the proposed framework is able to provide a competitive advantage in terms of quantitative and qualitative vegetation estimation results.
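The two-stage idea (learn at the coarse sensor's scale, then apply at the fine sensor's scale) can be sketched with a plain least-squares regression standing in for the CpLSA model. All arrays below are synthetic, and the 6x resolution ratio is only illustrative; the actual framework uses constrained topic representations rather than a direct linear fit.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical scene: a 60x60 MSI image with 4 bands, and an OLCI-derived
# vegetation index on a 6x coarser grid (10x10 coarse pixels).
msi = rng.random((60, 60, 4))
block = 6
# Aggregate MSI to the coarse OLCI grid by block averaging.
msi_coarse = msi.reshape(10, block, 10, block, 4).mean(axis=(1, 3))

# Synthetic coarse vegetation target (in practice, an OLCI vegetation index).
veg_coarse = msi_coarse @ np.array([0.1, -0.4, 0.8, 0.2]) \
    + 0.05 * rng.random((10, 10))

# Stage 1: learn the mapping from coarse reflectance patterns to vegetation
# (least squares with a bias column as a stand-in for CpLSA + regression).
A = np.c_[msi_coarse.reshape(-1, 4), np.ones(100)]
coef, *_ = np.linalg.lstsq(A, veg_coarse.ravel(), rcond=None)

# Stage 2: project full-resolution MSI through the learned mapping to get a
# high-resolution version of the OLCI-domain vegetation estimate.
veg_highres = (np.c_[msi.reshape(-1, 4), np.ones(60 * 60)] @ coef).reshape(60, 60)
```

The key design point is that the mapping is learned only where both sensors can be compared (the coarse grid) and then transferred to the resolution where only MSI is available.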
X-ModalNet: A Semi-Supervised Deep Cross-Modal Network for Classification of Remote Sensing Data
This paper addresses the problem of semi-supervised transfer learning with limited cross-modality data in remote sensing. A large amount of multi-modal earth observation images, such as multispectral imagery (MSI) or synthetic aperture radar (SAR) data, are openly available on a global scale, enabling the parsing of global urban scenes through remote sensing imagery. However, their ability to identify materials (pixel-wise classification) remains limited, due to the noisy collection environment, poor discriminative information, and the limited number of well-annotated training images. To this end, we propose a novel cross-modal deep-learning framework, called X-ModalNet, with three well-designed modules: a self-adversarial module, an interactive learning module, and a label propagation module, which learn to transfer more discriminative information from a small-scale hyperspectral image (HSI) into a classification task using large-scale MSI or SAR data. Significantly, X-ModalNet generalizes well, owing to propagating labels on an updatable graph constructed from high-level features at the top of the network, yielding semi-supervised cross-modality learning. We evaluate X-ModalNet on two multi-modal remote sensing datasets (HSI-MSI and HSI-SAR) and achieve a significant improvement over several state-of-the-art methods.
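The graph-based label propagation ingredient can be illustrated in isolation. The sketch below builds an RBF affinity graph over toy 2-D "high-level features" and iteratively diffuses a handful of seed labels; the cluster geometry, kernel bandwidth, and damping factor alpha are all illustrative choices, not X-ModalNet's actual graph construction.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy setup: 20 samples in two well-separated feature clusters, with only
# one labeled sample per class (labels: 0, 1; -1 means unlabeled).
feats = np.vstack([rng.normal(0, 0.3, (10, 2)), rng.normal(3, 0.3, (10, 2))])
labels = -np.ones(20, dtype=int)
labels[0], labels[10] = 0, 1

# Dense RBF affinity graph over the features, row-normalized to a
# transition matrix.
d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2 / 0.5)
np.fill_diagonal(W, 0.0)
S = W / W.sum(axis=1, keepdims=True)

# One-hot seed matrix; unlabeled rows start at zero.
Y0 = np.zeros((20, 2))
Y0[labels >= 0, labels[labels >= 0]] = 1.0

# Iterative propagation: diffuse along the graph while re-injecting seeds.
Y, alpha = Y0.copy(), 0.9
for _ in range(100):
    Y = alpha * S @ Y + (1 - alpha) * Y0

pred = Y.argmax(axis=1)   # every sample inherits the nearest seed's label
```

In the paper's setting, the graph is additionally updated during training because the high-level features themselves evolve, which is what makes the propagation "updatable".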
Weakly Supervised Learning for Multi-Image Synthesis
Machine learning-based approaches have been achieving state-of-the-art results on many computer vision tasks. While deep learning and convolutional networks have been incredibly popular, these approaches come at the expense of the huge amounts of labeled data required for training. Manually annotating large amounts of data, often millions of images in a single dataset, is costly and time-consuming. To deal with the problem of data annotation, the research community has been exploring approaches that require smaller amounts of labeled data.
The central problem that we consider in this research is image synthesis without any manual labeling. Image synthesis is a classic computer vision task that requires understanding of image contents and their semantic and geometric properties. We propose that we can train image synthesis models by relying on sequences of videos and using weakly supervised learning. Large amounts of unlabeled data are freely available on the internet. We propose to set up the training in a multi-image setting so that we can use one of the images as the target - this allows us to rely only on images for training and removes the need for manual annotations. We demonstrate three main contributions in this work.
First, we present a method of fusing multiple noisy overhead images to make a single, artifact-free image. We present a weakly supervised method that relies on crowd-sourced labels from online maps and a completely unsupervised variant that only requires a series of satellite images as inputs. Second, we propose a single-image novel view synthesis method for complex, outdoor scenes. We propose a learning-based method that uses pairs of nearby images captured on urban roads and their respective GPS coordinates as supervision. We show that a model trained with this automatically captured data can render a new view of a scene that can be as far as 10 meters from the input image. Third, we consider the problem of synthesizing new images of a scene under different conditions, such as time of day and season, based on a single input image. As opposed to existing methods, we do not need manual annotations for transient attributes, such as fog or snow, for training. We train our model by using streams of images captured from outdoor webcams and time-lapse videos.
Through these applications, we show several settings where we can train state-of-the-art deep learning methods without manual annotations. This work focuses on three image synthesis tasks. We propose weakly supervised learning and remove the requirement for manual annotations by relying on sequences of images. Our approach is in line with research efforts that aim to minimize the labels required for training machine learning methods.
Sentinel-3/FLEX Biophysical Product Confidence Using Sentinel-2 Land-Cover Spatial Distributions
The estimation of biophysical variables from remote sensing data raises important challenges in terms of the acquisition technology and its limitations. In particular, some vegetation parameters, such as chlorophyll fluorescence, require sensors with a high spectral resolution that constrains the spatial resolution while significantly increasing the subpixel land-cover heterogeneity. This spatial variability often causes rather different canopy structures to be aggregated together, which eventually generates important deviations in the corresponding parameter quantification. In the context of the Copernicus program (and other related Earth Explorer missions), this article proposes a new statistical methodology to manage the subpixel spatial heterogeneity problem in Sentinel-3 (S3) and FLuorescence EXplorer (FLEX) data by taking advantage of the higher spatial resolution of Sentinel-2 (S2). Specifically, the proposed approach first characterizes the subpixel spatial patterns of S3/FLEX using intersensor data from S2. Then, a multivariate analysis is conducted to model the influence of these spatial patterns on the errors of the estimated chlorophyll-related biophysical variables, which are used as fluorescence proxies. Finally, these modeled distributions are employed to predict the confidence of S3/FLEX products on demand. Our experiments, conducted using multiple operational S2 and simulated S3 data products, reveal the advantages of the proposed methodology to effectively measure the confidence and expected deviations of different vegetation parameters with respect to standard regression algorithms. The source codes of this work will be available at https://github.com/rufernan/PixelS3.
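A simple way to quantify the subpixel heterogeneity that drives these deviations is the Shannon entropy of fine-resolution land-cover fractions within each coarse pixel. The sketch below uses a synthetic S2-like class map and an illustrative 6x resolution ratio (the real S2-to-S3 ratio is much larger); it only demonstrates the heterogeneity characterization step, not the full error modeling.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical S2-like land-cover map (60x60 pixels, 4 classes) and an
# S3-like coarse grid of 10x10 pixels (6x6 fine pixels per coarse pixel).
lc = rng.integers(0, 4, size=(60, 60))
block, n_classes = 6, 4

# Group the fine pixels belonging to each coarse pixel: shape (10, 10, 36).
tiles = (lc.reshape(10, block, 10, block)
           .transpose(0, 2, 1, 3)
           .reshape(10, 10, -1))

# Per-coarse-pixel class fractions, then Shannon entropy as heterogeneity:
# 0 for a pure pixel, log(n_classes) for maximal land-cover mixing.
fractions = np.stack([(tiles == c).mean(axis=2) for c in range(n_classes)],
                     axis=-1)
p = np.clip(fractions, 1e-12, 1.0)
entropy = -(p * np.log(p)).sum(axis=-1)
```

Per-pixel heterogeneity scores like these are the kind of spatial-pattern descriptor that can then be regressed against observed parameter-retrieval errors to predict product confidence.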
SSL4EO-S12: A Large-Scale Multi-Modal, Multi-Temporal Dataset for Self-Supervised Learning in Earth Observation
Self-supervised pre-training bears the potential to generate expressive representations without human annotation. Most pre-training in Earth observation (EO) is based on ImageNet or medium-size, labeled remote sensing (RS) datasets. We share an unlabeled RS dataset, SSL4EO-S12 (Self-Supervised Learning for Earth Observation - Sentinel-1/2), assembling a large-scale, global, multimodal, and multi-seasonal corpus of satellite imagery from the ESA Sentinel-1 & -2 satellite missions. For EO applications, we demonstrate that SSL4EO-S12 succeeds in self-supervised pre-training for a set of methods: MoCo-v2, DINO, MAE, and data2vec. The resulting models yield downstream performance close to, or surpassing, the accuracy of supervised learning. In addition, pre-training on SSL4EO-S12 excels compared to existing datasets. We make the dataset, related source code, and pre-trained models openly available at https://github.com/zhu-xlab/SSL4EO-S12. (Accepted by IEEE Geoscience and Remote Sensing Magazine; 18 pages.)
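Contrastive methods such as MoCo-v2 optimize the InfoNCE objective: pull two views of the same patch together and push views of different patches apart. The numpy sketch below evaluates that loss on toy embeddings; the dimensions, temperature, and the "seasonal views" framing are illustrative, and real pre-training computes this inside a deep network with a momentum encoder and key queue.

```python
import numpy as np

rng = np.random.default_rng(4)

def info_nce(q, k_pos, k_neg, tau=0.2):
    """InfoNCE loss as used by MoCo-style contrastive pre-training.

    q:     (n, d) query embeddings (one augmented view per sample)
    k_pos: (n, d) positive keys (the other view of the same sample)
    k_neg: (m, d) negative keys (views of different samples / queue entries)
    """
    # L2-normalize so that dot products are cosine similarities.
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    k_pos = k_pos / np.linalg.norm(k_pos, axis=1, keepdims=True)
    k_neg = k_neg / np.linalg.norm(k_neg, axis=1, keepdims=True)
    l_pos = (q * k_pos).sum(axis=1, keepdims=True) / tau   # (n, 1)
    l_neg = q @ k_neg.T / tau                              # (n, m)
    logits = np.concatenate([l_pos, l_neg], axis=1)
    # Cross-entropy with the positive key always at index 0.
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[:, 0].mean()

emb = rng.normal(size=(8, 16))
# Two "seasonal views" of the same patches (nearly identical embeddings)
# give a low loss; unrelated embeddings as positives give a high loss.
loss_aligned = info_nce(emb, emb + 0.01 * rng.normal(size=(8, 16)),
                        rng.normal(size=(32, 16)))
loss_random = info_nce(emb, rng.normal(size=(8, 16)),
                       rng.normal(size=(32, 16)))
```

The multi-seasonal design of SSL4EO-S12 matters precisely because different-season acquisitions of the same location provide natural, annotation-free positive pairs.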
Multitemporal Mosaicing for Sentinel-3/FLEX Derived Level-2 Product Composites
The increasing availability of remote sensing data raises important challenges in terms of operational data provision and spatial coverage for conducting global studies and analyses. In this regard, existing multitemporal mosaicing techniques are generally limited to producing spectral image composites, without considering the particular features of higher-level biophysical and other derived products, such as those provided by the Sentinel-3 (S3) and FLuorescence EXplorer (FLEX) tandem missions. To address these limitations, this article proposes a novel multitemporal mosaicing algorithm specially designed for operational S3-derived products and also studies its applicability within the FLEX mission context. Specifically, we design a new operational methodology to automatically produce multitemporal mosaics from derived S3/FLEX products, with the objective of facilitating the automatic processing of high-level data products, where weekly, monthly, seasonal, or annual biophysical mosaics can be generated by means of four processes proposed in this work: 1) operational data acquisition; 2) spatial mosaicing and rearrangement; 3) temporal compositing; and 4) confidence measures. The experimental part of the work tests the consistency of the proposed framework over different S3 product collections while showing its advantages with respect to other standard mosaicing alternatives. The source codes of this work will be made available for reproducible research.
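The temporal compositing and confidence-measure steps can be sketched with NaN-aware reductions over a time stack. The stack below is synthetic (random values with ~30% of observations masked as invalid), and the median/MAD choice is one common compositing strategy, not necessarily the article's exact operators.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical stack: 8 weekly acquisitions of a derived biophysical product
# on a 50x50 grid, with NaN marking cloudy or otherwise invalid pixels.
stack = rng.random((8, 50, 50))
stack[rng.random((8, 50, 50)) < 0.3] = np.nan

# Temporal compositing: per-pixel median over the valid acquisitions only.
composite = np.nanmedian(stack, axis=0)

# Confidence measures: how many valid observations support each pixel, and
# the temporal spread (median absolute deviation) around the composite.
n_valid = np.isfinite(stack).sum(axis=0)
mad = np.nanmedian(np.abs(stack - composite[None]), axis=0)
```

A weekly, monthly, or seasonal mosaic then simply changes which acquisitions enter the stack, while `n_valid` and `mad` travel with the composite as per-pixel confidence layers.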