15 research outputs found

    Deep constrained clustering applied to satellite image time series

    The advent of satellite imagery is generating an unprecedented number of remote sensing images. Current satellites achieve frequent revisits and high mission availability, providing series of images of the Earth captured at different dates that can be treated as time series. Analyzing satellite image time series enables continuous, wide-range Earth observation, with applications in agricultural mapping, environmental disaster monitoring, etc. However, the lack of large quantities of labeled data generally makes supervised methods hard to apply. Unsupervised methods, by contrast, do not require expert knowledge but sometimes provide poor results. In this context, constrained clustering, a class of semi-supervised learning algorithms, is an alternative that offers a good trade-off in the amount of supervision required. In this paper, we explore the use of constraints with deep clustering approaches to process satellite image time series. Our experimental study relies on deep embedded clustering and the deep constrained framework using pairwise constraints (must-link and cannot-link). Experiments on a real dataset composed of 11 satellite images show promising results and open many perspectives for applying deep constrained clustering to satellite image time series.

    Constrained Distance Based Clustering for Satellite Image Time-Series

    The advent of high-resolution instruments for time-series sampling adds complexity to the formal definition of thematic classes in the remote sensing domain (which supervised methods require), while unsupervised methods ignore expert knowledge and intuition. Constrained clustering is becoming an increasingly popular approach in data mining because it offers a solution to these problems; however, its application in remote sensing is relatively unknown. This article addresses this divide by adapting publicly available constrained clustering implementations to use the dynamic time warping (DTW) dissimilarity measure, which is sometimes used for time-series analysis. A comparative study is presented in which the performance of these implementations is evaluated using both DTW and Euclidean distances. Adding constraints to the clustering problem is found to increase accuracy compared to unconstrained clustering. The output of such algorithms is homogeneous in spatially defined regions. Declarative approaches and k-Means based algorithms are simple to apply, requiring little or no choice of parameter values. Spectral methods offer the highest accuracy but require careful tuning, which is unrealistic in a semi-supervised setting. These conclusions were drawn from two applications: crop clustering using 11 multi-spectral Landsat images non-uniformly sampled over a period of eight months in 2007; and tree-cut detection using 10 NDVI Sentinel-2 images non-uniformly sampled between 2016 and 2018.

    Clustering et apprentissage profond sous contraintes pour l'analyse de séries temporelles : application à l'analyse temporelle incrémentale en télédétection

    Current satellites achieve frequent revisits and high mission availability, providing series of images of the Earth captured at different dates that can be treated as time series. Their analysis allows continuous observation of the Earth, with applications in agricultural mapping, environmental disaster monitoring, etc. However, this phenomenon is not limited to the field of remote sensing. Similar growth can be observed in many fields, such as medicine or finance. Whether in remote sensing or in other domains, the analysis of these data faces the same issues. A large amount of data does not always come with sufficient labeling, which generally prevents supervised methods from being applied well. Indeed, labeling remains a very time-consuming task, and also a complex one, as it requires expertise on the analyzed data. On the other hand, unsupervised methods do not require the expert's knowledge but sometimes give poor results or results that are far from the expert's expectations. In this context, constrained clustering, a form of semi-supervised learning, is an alternative that offers a good compromise in terms of investment for the expert. However, constrained clustering methods are subject to important limitations on the quality of the obtained results. We show in this thesis that two factors strongly limit the impact of constraints: consistency, which is the amount of information in the set of constraints that the algorithm can determine by its own bias, and coherence, which is the degree of agreement between the constraints themselves. In order to address the consistency problem, we propose a new method, I-SAMARAH, based on collaborative clustering and incremental integration of constraints. However, we also show that the consistency problem remains an important scientific challenge that we propose to address in a more prospective way with methods based on deep learning.

    Constrained clustering and deep learning for time series analysis : with application to incremental temporal analysis for remote sensing

    Current satellites achieve frequent revisits and high mission availability, providing series of images of the Earth captured at different dates that can be treated as time series. Their analysis allows continuous observation of the Earth, with applications in agricultural mapping, environmental disaster monitoring, etc. However, this phenomenon is not limited to the field of remote sensing. Similar growth can be observed in many fields, such as medicine or finance. Whether in remote sensing or in other domains, the analysis of these data faces the same issues. A large amount of data does not always come with sufficient labeling, which generally prevents supervised methods from being applied well. Indeed, labeling remains a very time-consuming task, and also a complex one, as it requires expertise on the analyzed data. On the other hand, unsupervised methods do not require the expert's knowledge but sometimes give poor results or results that are far from the expert's expectations. In this context, constrained clustering, a form of semi-supervised learning, is an alternative that offers a good compromise in terms of investment for the expert. However, constrained clustering methods are subject to important limitations on the quality of the obtained results. We show in this thesis that two factors strongly limit the impact of constraints: consistency, which is the amount of information in the set of constraints that the algorithm can determine by its own bias, and coherence, which is the degree of agreement between the constraints themselves. In order to address the consistency problem, we propose a new method, I-SAMARAH, based on collaborative clustering and incremental integration of constraints. However, we also show that the consistency problem remains an important scientific challenge that we propose to address in a more prospective way with methods based on deep learning.

    Constrained Distance Based K-Means Clustering for Satellite Image Time-Series

    The advent of high-resolution instruments for time-series sampling adds complexity to the formal definition of thematic classes in the remote sensing domain (which supervised methods require), while unsupervised methods ignore expert knowledge and intuition. Constrained clustering is becoming an increasingly popular approach in data mining because it offers a solution to these problems; however, its application in remote sensing is relatively unknown. This article addresses this divide by adapting publicly available k-Means constrained clustering implementations to use the dynamic time warping (DTW) dissimilarity measure, which is thought to be more appropriate for time-series analysis. Adding constraints to the clustering problem increases accuracy compared to unconstrained clustering. The output of such algorithms is homogeneous in spatially defined regions.

    Deep Clustering Methods Study Applied to Satellite Images Time Series

    Clustering is an essential tool for data analysis and visualization. It is particularly useful when labels are lacking, which prevents the use of supervised methods. The analysis of satellite images is particularly prone to this problem, especially when they are studied as time series, because access to this type of data is still recent. Among all clustering methods, those based on Deep Neural Networks (DNNs) have attracted increasing interest lately, but only a few works have so far addressed time series. This paper aims to give more insight into how current DNN-based clustering methods can be applied to Satellite Images Time Series (SITS), and it shows that, with a proper configuration, they can perform better than classical non-deep methods.

    Grad Centroid Activation Mapping for Convolutional Neural Networks

    Significant research effort has recently been dedicated to understanding the decision mechanism of deep neural networks. Among the resulting methods, Class Activation Mapping (CAM) and its variations have proved their capacity to provide useful insights into the decisions of Convolutional Neural Network (CNN) models. However, these methods remain limited to the supervised case, despite CNN-based advances in unsupervised tasks such as clustering. To fill this gap, we propose a new method called Grad-CeAM for centroid-based clustering methods used on CNN representations. Through an experimental study, we show that our method is able to localize the discriminating features used by a CNN model to create its representation and that it can be used to explain cluster assignments. We also show that this method can be used in different application domains by providing use cases on time series and image clustering.

    End-to-end deep representation learning for time series clustering: a comparative study

    Time series are ubiquitous in data mining applications. As with other types of data, annotations can be challenging to acquire, which prevents the training of Time Series Classification (TSC) models. In this context, clustering methods can be an appropriate alternative, as they create homogeneous groups that allow a better analysis of the data structure. Time series clustering has been investigated for many years and multiple approaches have already been proposed. Following the advent of deep learning in computer vision, researchers recently started to study the use of deep clustering to cluster time series data. The existing approaches mostly rely on representation learning (imported from computer vision), which consists of learning a representation of the data and performing the clustering task using this new representation. The goal of this paper is to provide a careful study and an experimental comparison of the existing literature on time series representation learning for deep clustering. In this paper, we went beyond the sole comparison of existing approaches and proposed to decompose deep clustering methods into three main components: (1) network architecture, (2) pretext loss, and (3) clustering loss. We evaluated all combinations of these components (totaling 300 different models) with the objective of studying their relative influence on th
