26 research outputs found

    Code-Aligned Autoencoders for Unsupervised Change Detection in Multimodal Remote Sensing Images

    Get PDF
    Image translation with convolutional autoencoders has recently been used as an approach to multimodal change detection (CD) in bitemporal satellite images. A main challenge is the alignment of the code spaces by reducing the contribution of change pixels to the learning of the translation function. Many existing approaches train the networks by exploiting supervised information of the change areas, which, however, is not always available. We propose to extract relational pixel information captured by domain-specific affinity matrices at the input and use this to enforce alignment of the code spaces and reduce the impact of change pixels on the learning objective. A change prior is derived in an unsupervised fashion from pixel pair affinities that are comparable across domains. To achieve code space alignment, we enforce pixels with similar affinity relations in the input domains to be correlated also in code space. We demonstrate the utility of this procedure in combination with cycle consistency. The proposed approach is compared with the state-of-the-art machine learning and deep learning algorithms. Experiments conducted on four real and representative datasets show the effectiveness of our methodology

    Unsupervised Image Regression for Heterogeneous Change Detection

    Get PDF
    Change detection (CD) in heterogeneous multitemporal satellite images is an emerging and challenging topic in remote sensing. In particular, one of the main challenges is to tackle the problem in an unsupervised manner. In this paper, we propose an unsupervised framework for bitemporal heterogeneous CD based on the comparison of affinity matrices and image regression. First, our method quantifies the similarity of affinity matrices computed from colocated image patches in the two images. This is done to automatically identify pixels that are likely to be unchanged. With the identified pixels as pseudotraining data, we learn a transformation to map the first image to the domain of the other image and vice versa. Four regression methods are selected to carry out the transformation: Gaussian process regression, support vector regression, random forest regression (RFR), and a recently proposed kernel regression method called homogeneous pixel transformation. To evaluate the potentials and limitations of our framework and also the benefits and disadvantages of each regression method, we perform experiments on two real data sets. The results indicate that the comparison of the affinity matrices can already be considered a CD method by itself. However, image regression is shown to improve the results obtained by the previous step alone and produces accurate CD maps despite of the heterogeneity of the multitemporal input data. Notably, the RFR approach excels by achieving similar accuracy as the other methods, but with a significantly lower computational cost and with fast and robust tuning of hyperparameters

    Deep Image Translation With an Affinity-Based Change Prior for Unsupervised Multimodal Change Detection

    Get PDF
    © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.Image translation with convolutional neural networks has recently been used as an approach to multimodal change detection. Existing approaches train the networks by exploiting supervised information of the change areas, which, however, is not always available. A main challenge in the unsupervised problem setting is to avoid that change pixels affect the learning of the translation function. We propose two new network architectures trained with loss functions weighted by priors that reduce the impact of change pixels on the learning objective. The change prior is derived in an unsupervised fashion from relational pixel information captured by domain-specific affinity matrices. Specifically, we use the vertex degrees associated with an absolute affinity difference matrix and demonstrate their utility in combination with cycle consistency and adversarial training. The proposed neural networks are compared with the state-of-the-art algorithms. Experiments conducted on three real data sets show the effectiveness of our methodology

    Analyse hiérarchique d'images multimodales

    Get PDF
    There is a growing interest in the development of adapted processing tools for multimodal images (several images acquired over the same scene with different characteristics). Allowing a more complete description of the scene, multimodal images are of interest in various image processing fields, but their optimal handling and exploitation raise several issues. This thesis extends hierarchical representations, a powerful tool for classical image analysis and processing, to multimodal images in order to better exploit the additional information brought by the multimodality and improve classical image processing techniques. %when applied to real applications. This thesis focuses on three different multimodalities frequently encountered in the remote sensing field. We first investigate the spectral-spatial information of hyperspectral images. Based on an adapted construction and processing of the hierarchical representation, we derive a segmentation which is optimal with respect to the spectral unmixing operation. We then focus on the temporal multimodality and sequences of hyperspectral images. Using the hierarchical representation of the frames in the sequence, we propose a new method to achieve object tracking and apply it to chemical gas plume tracking in thermal infrared hyperspectral video sequences. Finally, we study the sensorial multimodality, being images acquired with different sensors. Relying on the concept of braids of partitions, we propose a novel methodology of image segmentation, based on an energetic minimization framework.Il y a un intérêt grandissant pour le développement d’outils de traitements adaptés aux images multimodales (plusieurs images de la même scène acquises avec différentes caractéristiques). Permettant une représentation plus complète de la scène, ces images multimodales ont de l'intérêt dans plusieurs domaines du traitement d'images, mais les exploiter et les manipuler de manière optimale soulève plusieurs questions. Cette thèse étend les représentations hiérarchiques, outil puissant pour le traitement et l’analyse d’images classiques, aux images multimodales afin de mieux exploiter l’information additionnelle apportée par la multimodalité et améliorer les techniques classiques de traitement d’images. Cette thèse se concentre sur trois différentes multimodalités fréquemment rencontrées dans le domaine de la télédétection. Nous examinons premièrement l’information spectrale-spatiale des images hyperspectrales. Une construction et un traitement adaptés de la représentation hiérarchique nous permettent de produire une carte de segmentation de l'image optimale vis-à-vis de l'opération de démélange spectrale. Nous nous concentrons ensuite sur la multimodalité temporelle, traitant des séquences d’images hyperspectrales. En utilisant les représentations hiérarchiques des différentes images de la séquence, nous proposons une nouvelle méthode pour effectuer du suivi d’objet et l’appliquons au suivi de nuages de gaz chimique dans des séquences d’images hyperspectrales dans le domaine thermique infrarouge. Finalement, nous étudions la multimodalité sensorielle, c’est-à-dire les images acquises par différents capteurs. Nous appuyant sur le concept des tresses de partitions, nous proposons une nouvelle méthodologie de segmentation se basant sur un cadre de minimisation d’énergie

    Connected Attribute Filtering Based on Contour Smoothness

    Get PDF

    Deep Vision in Optical Imagery: From Perception to Reasoning

    Get PDF
    Deep learning has achieved extraordinary success in a wide range of tasks in computer vision field over the past years. Remote sensing data present different properties as compared to natural images/videos, due to their unique imaging technique, shooting angle, etc. For instance, hyperspectral images usually have hundreds of spectral bands, offering additional information, and the size of objects (e.g., vehicles) in remote sensing images is quite limited, which brings challenges for detection or segmentation tasks. This thesis focuses on two kinds of remote sensing data, namely hyper/multi-spectral and high-resolution images, and explores several methods to try to find answers to the following questions: - In comparison with natural images or videos in computer vision, the unique asset of hyper/multi-spectral data is their rich spectral information. But what this “additional” information brings for learning a network? And how do we take full advantage of these spectral bands? - Remote sensing images at high resolution have pretty different characteristics, bringing challenges for several tasks, for example, small object segmentation. Can we devise tailored networks for such tasks? - Deep networks have produced stunning results in a variety of perception tasks, e.g., image classification, object detection, and semantic segmentation. While the capacity to reason about relations over space is vital for intelligent species. Can a network/module with the capacity of reasoning benefit to parsing remote sensing data? To this end, a couple of networks are devised to figure out what a network learns from hyperspectral images and how to efficiently use spectral bands. In addition, a multi-task learning network is investigated for the instance segmentation of vehicles from aerial images and videos. Finally, relational reasoning modules are designed to improve semantic segmentation of aerial images
    corecore