12 research outputs found

    Supervised and Semi-Supervised Self-Organizing Maps for Regression and Classification Focusing on Hyperspectral Data

    Machine learning approaches are valuable methods in hyperspectral remote sensing, especially for the classification of land cover or for the regression of physical parameters. While the recording of hyperspectral data has become affordable with innovative technologies, the acquisition of reference data (ground truth) has remained expensive and time-consuming. There is a need for methodological approaches that can handle datasets with significantly more hyperspectral input data than reference data. We introduce the Supervised Self-organizing Maps (SuSi) framework, which can perform unsupervised, supervised and semi-supervised classification as well as regression on high-dimensional data. The methodology of the SuSi framework is presented and compared to other frameworks. Its different parts are evaluated on two hyperspectral datasets. The results of the evaluations can be summarized in four major findings: (1) The supervised and semi-supervised self-organizing maps (SOMs) outperform random forest in the regression of soil moisture. (2) In the classification of land cover, the supervised and semi-supervised SOMs reveal great potential. (3) The unsupervised SOM is a valuable tool to understand the data. (4) The SuSi framework is versatile, flexible, and easy to use. The SuSi framework is provided as an open-source Python package on GitHub.
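As a rough illustration of the unsupervised SOM that this abstract builds on, the sketch below performs a single generic Kohonen update step on a small grid. It is not the SuSi API: the function names `best_matching_unit` and `som_update`, the Gaussian neighborhood, and the default `learning_rate` and `sigma` values are assumptions chosen for illustration only.

```python
import math

def best_matching_unit(weights, x):
    """Return the (row, col) of the grid node whose weight vector is closest to x."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best

def som_update(weights, x, learning_rate=0.5, sigma=1.0):
    """One in-place SOM step: pull every node toward x, weighted by grid distance to the BMU.

    Hypothetical helper for illustration; real frameworks also decay
    learning_rate and sigma over the training iterations.
    """
    br, bc = best_matching_unit(weights, x)
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            grid_dist_sq = (r - br) ** 2 + (c - bc) ** 2
            # Gaussian neighborhood: 1.0 at the BMU, smaller for distant nodes.
            h = math.exp(-grid_dist_sq / (2.0 * sigma ** 2))
            row[c] = [wi + learning_rate * h * (xi - wi) for wi, xi in zip(w, x)]
    return (br, bc)

# A 1 x 2 grid of two-dimensional weight vectors and one input sample.
grid = [[[0.0, 0.0], [1.0, 1.0]]]
bmu = som_update(grid, [0.9, 0.9])
```

In a supervised or semi-supervised variant, a second map of outputs would typically be trained on top of the node positions found this way; that part is not shown here.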

    FACE IMAGE RETRIEVAL SYSTEM USING COMBINATION METHOD OF SELF ORGANIZING MAP AND NORMALIZED CROSS CORRELATION

    Content-based image retrieval (CBIR) is a computer vision method that is widely applied in many fields. This study combines two algorithms, the self-organizing map (SOM) and normalized cross correlation (NCC), and tests the combination in a face image retrieval system. The SOM algorithm performs the learning in the system, and the NCC method calculates the proximity value between the input image and the images in the database, which determines the images displayed as the retrieval result. In testing, the proposed method achieves a face image retrieval accuracy of 93.62%, which is higher than the 91.62% achieved with the usual SOM method alone.
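The proximity computation described above can be sketched as follows. This is a minimal illustration using the zero-mean form of NCC on flattened pixel lists; whether the study uses this variant is an assumption, and the names `normalized_cross_correlation` and `rank_by_ncc` as well as the toy data are hypothetical.

```python
import math

def normalized_cross_correlation(a, b):
    """Zero-mean NCC between two equal-length sequences of pixel intensities.

    The result lies in [-1, 1]; values near 1 indicate very similar images.
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        # A constant image has no variance, so NCC is undefined;
        # returning 0.0 here is a convention chosen for this sketch.
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom

def rank_by_ncc(query, database):
    """Return database keys ordered from most to least similar to the query."""
    return sorted(
        database,
        key=lambda name: normalized_cross_correlation(query, database[name]),
        reverse=True,
    )

# Toy "images" given as flattened intensity lists (hypothetical data).
database = {"face_a": [2, 4, 6, 8], "face_b": [8, 6, 4, 2], "face_c": [1, 1, 2, 1]}
ranking = rank_by_ncc([1, 2, 3, 4], database)
```

In a retrieval system of the kind described, the top-ranked entries would be the images shown to the user.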

    On the Application of Data Clustering Algorithm used in Information Retrieval for Satellite Imagery Segmentation

    This study proposes an automated technique for segmenting satellite imagery using unsupervised learning. Autoencoders, a type of neural network, are employed for dimensionality reduction and feature extraction. The study evaluates different segmentation architectures and encoders and identifies the best performing combination as the DeepLabv3+ architecture with a ResNet-152 encoder. This approach achieves high performance scores across multiple metrics and can be beneficial in various fields, including agriculture, land use monitoring, and disaster response

    An improved pheromone-based kohonen self-organising map in clustering and visualising balanced and imbalanced datasets

    The data distribution issue remains an unsolved clustering problem in data mining, especially in dealing with imbalanced datasets. The Kohonen Self-Organising Map (KSOM) is one of the well-known clustering algorithms that can solve various problems without a pre-defined number of clusters. However, similar to other clustering algorithms, this algorithm requires sufficient data for its unsupervised learning process. An inadequate amount of class label data in a dataset significantly affects the clustering learning process, leading to inefficient and unreliable results. Numerous studies have hybridised and optimised the KSOM algorithm with various optimisation techniques. Nevertheless, problems remain unsolved, especially those of separation boundaries and overlapping clusters. Therefore, this research proposes an improved pheromone-based PKSOM algorithm, known as iPKSOM, to solve these problems. Six different datasets, i.e. the Iris, Seed, Glass, Titanic, WDBC, and Tropical Wood datasets, were chosen to investigate the effectiveness of the iPKSOM algorithm. The results on all datasets were compared with the original KSOM results. The modification significantly impacted the clustering process by improving and refining the scatteredness of clustering data and reducing overlapping clusters. Therefore, the proposed algorithm can be applied to clustering other complex datasets, such as high-dimensional and streaming data.

    X-ModalNet: A Semi-Supervised Deep Cross-Modal Network for Classification of Remote Sensing Data

    This paper addresses the problem of semi-supervised transfer learning with limited cross-modality data in remote sensing. A large amount of multi-modal earth observation images, such as multispectral imagery (MSI) or synthetic aperture radar (SAR) data, are openly available on a global scale, enabling parsing global urban scenes through remote sensing imagery. However, their ability to identify materials (pixel-wise classification) remains limited, due to the noisy collection environment and poor discriminative information as well as the limited number of well-annotated training images. To this end, we propose a novel cross-modal deep-learning framework, called X-ModalNet, with three well-designed modules: a self-adversarial module, an interactive learning module, and a label propagation module. The framework learns to transfer more discriminative information from a small-scale hyperspectral image (HSI) into the classification task using large-scale MSI or SAR data. Significantly, X-ModalNet generalizes well, owing to propagating labels on an updatable graph constructed from high-level features on the top of the network, yielding semi-supervised cross-modality learning. We evaluate X-ModalNet on two multi-modal remote sensing datasets (HSI-MSI and HSI-SAR) and achieve a significant improvement in comparison with several state-of-the-art methods.

    Deep Learning for Land Cover Change Detection

    Land cover and its change are crucial for many environmental applications. This study focuses on land cover classification and change detection with multitemporal and multispectral Sentinel-2 satellite data. To address the challenging land cover change detection task, we rely on two different deep learning architectures and selected pre-processing steps. For example, we define an excluded class and deal with temporal water shoreline changes in the pre-processing. We employ a fully convolutional neural network (FCN), and we combine the FCN with long short-term memory (LSTM) networks. The FCN can only handle monotemporal input data, while the FCN combined with LSTM can use sequential (multitemporal) information. In addition, we provide fixed and variable sequences as training sequences for the combined FCN and LSTM approach. The former refers to using six defined satellite images, while the latter consists of image sequences from an extended training pool of ten images. Further, we propose measures for the robustness concerning the selection of Sentinel-2 image data as evaluation metrics. With these metrics, we can distinguish between actual land cover changes and misclassifications of the deep learning approaches. According to the provided metrics, both multitemporal LSTM approaches outperform the monotemporal FCN approach by about 3 to 5 percentage points (p.p.). The LSTM approach trained on the variable sequences detects 3 p.p. more land cover changes than the LSTM approach trained on the fixed sequences. In addition, applying our selected pre-processing improves the water classification and avoids reducing the dataset effectively by 17.6%. Since we published the code of the deep learning models, the presented LSTM approaches can be modified to work with a variable number of image sequences. The Sentinel-2 data and the ground truth are also freely available.

    Machine Learning Framework for the Estimation of Average Speed in Rural Road Networks with OpenStreetMap Data

    Average speed information, which is essential for routing applications, is often missing in the freely available OpenStreetMap (OSM) road network. In this contribution, we propose an estimation framework, including different machine learning (ML) models that estimate rural roads’ average speed based on current road information in OSM. We rely on three datasets covering two regions in Chile and Australia. Google Directions API data serves as reference data. An appropriate estimation framework is presented, which involves supervised ML models, unsupervised clustering, and dimensionality reduction to generate new input features. The regression performance of each model with different input feature modes is evaluated on each dataset. The best performing model results in a coefficient of determination R² = 80.43%, which is significantly better than previous approaches relying on domain knowledge. Overall, the potential of the ML-based estimation framework to estimate the average speed with OSM road network data is demonstrated. This ML-based approach is data-driven and does not require any domain knowledge. In the future, we intend to focus on the generalization ability of the estimation framework concerning its application in different regions worldwide. The implementation of our estimation framework for an exemplary dataset is provided on GitHub.
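For readers unfamiliar with the metric reported above, the coefficient of determination can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses this standard definition; the function name `r_squared` and the sample speed values are hypothetical and are not taken from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined when all reference values are equal")
    return 1.0 - ss_res / ss_tot

# Hypothetical reference and estimated average speeds in km/h.
reference = [60.0, 80.0, 100.0]
estimated = [62.0, 78.0, 101.0]
score = r_squared(reference, estimated)
print(f"R^2 = {score:.2%}")
```

A model that always predicts the mean of the reference values scores 0, and a perfect model scores 1, so a value reported as a percentage corresponds to this score multiplied by 100.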

    Development and Applications of Machine Learning Methods for Hyperspectral Data

    Hyperspectral remote sensing of the Earth relies on data from passive optical sensors mounted on platforms such as satellites and unmanned aerial vehicles. Hyperspectral data contain information for identifying materials and for monitoring environmental variables such as soil texture, soil moisture, chlorophyll a, and land cover. Data analysis methods are required to extract information from hyperspectral data. A powerful tool for analyzing hyperspectral data is machine learning, a subset of artificial intelligence. Machine learning methods can solve nonlinear correlations and scale with increasing amounts of data. Every dataset and every machine learning method brings new challenges that require innovative solutions. The aim of this thesis is the development and application of machine learning methods for hyperspectral remote sensing data. The thesis presents studies that address three main challenges: (I) datasets that contain only a few data points with associated output data, (II) the limited potential of non-deep machine learning methods on hyperspectral data, and (III) differences between the distributions of the training and test datasets. The studies on challenge (I) lead to the development and release of a framework of self-organizing maps (SOMs) for unsupervised, supervised, and semi-supervised learning. The SOM is applied to a hyperspectral dataset in the (semi-)supervised regression of soil moisture and outperforms a standard machine learning method. The SOM framework shows adequate performance in the (semi-)supervised classification of land cover. It offers additional visualization options that improve the understanding of the underlying dataset.
In the studies addressing challenge (II), three innovative one-dimensional convolutional neural network (CNN) architectures are developed. The CNNs are applied to a freely available hyperspectral dataset for soil texture classification. Their performance is compared with two existing CNN approaches and a random forest. The two main findings can be summarized as follows: first, the CNN approaches perform significantly better than the applied non-deep random forest approach. Second, adding information about hyperspectral band numbers to the input layer of a CNN improves the performance with respect to the individual classes. The studies on challenge (III) are based on a dataset acquired in 2019 at five different measurement areas in Peru. The differences between the measurement areas are analyzed with qualitative methods and with unsupervised machine learning methods such as principal component analysis and autoencoders. Based on the results, a supervised regression of soil moisture is performed on different combinations of measurement areas. In addition, the dataset is augmented with Monte Carlo methods to investigate the effects of the shift in the dataset distributions on the regression. The applied SOM regressor is relatively robust to the noise of the soil moisture sensor and performs well on small datasets, while the applied random forest performs best on the full dataset. The shift in the distributions makes this regression task difficult; some combinations of measurement areas form a considerably more meaningful training dataset than others. Overall, the presented studies, which address the three main challenges, show promising results.
Finally, the thesis gives indications of how the developed machine learning methods can be further improved in future research.