    Jaccard Metric Losses: Optimizing the Jaccard Index with Soft Labels

    IoU losses are surrogates that directly optimize the Jaccard index. In semantic segmentation, including IoU losses in the loss function has been shown to yield better Jaccard index scores than optimizing pixel-wise losses such as the cross-entropy loss alone. The most notable IoU losses are the soft Jaccard loss and the Lovász-Softmax loss. However, these losses are incompatible with soft labels, which are ubiquitous in machine learning. In this paper, we propose Jaccard metric losses (JMLs), which are identical to the soft Jaccard loss in the standard setting with hard labels but are compatible with soft labels. With JMLs, we study two of the most popular use cases of soft labels: label smoothing and knowledge distillation. With a variety of architectures, our experiments show significant improvements over the cross-entropy loss on three semantic segmentation datasets (Cityscapes, PASCAL VOC and DeepGlobe Land), and our simple approach outperforms state-of-the-art knowledge distillation methods by a large margin. Code is available at https://github.com/zifuwanggg/JDTLosses. Comment: Submitted to ICML 2023.
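The incompatibility with soft labels mentioned in this abstract can be illustrated with a minimal sketch. The function names below are hypothetical, and the L1-based variant is an assumed formulation that matches the abstract's description (identical to the soft Jaccard loss on hard labels, well-behaved on soft labels); it is not taken from the authors' code and may differ from their exact definition.

```python
def soft_jaccard_loss(pred, target):
    """Standard soft Jaccard loss: 1 - <p, y> / (sum(p) + sum(y) - <p, y>)."""
    inter = sum(p * y for p, y in zip(pred, target))
    union = sum(pred) + sum(target) - inter
    return 1.0 - inter / union if union > 0 else 0.0


def l1_jaccard_loss(pred, target):
    """Assumed L1-based variant: intersection and union expressed via |p - y|."""
    diff = sum(abs(p - y) for p, y in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (total - diff) / (total + diff) if total + diff > 0 else 0.0


pred = [0.9, 0.2, 0.6, 0.1]
hard = [1, 0, 1, 0]
soft = [0.9, 0.1, 0.9, 0.1]  # e.g. a smoothed version of the hard labels

# On hard labels the two losses coincide.
print(soft_jaccard_loss(pred, hard), l1_jaccard_loss(pred, hard))
# With a soft target, a prediction equal to the target is still penalized
# by the soft Jaccard loss, while the L1-based variant is zero.
print(soft_jaccard_loss(soft, soft), l1_jaccard_loss(soft, soft))
```

The second print shows the issue: a loss that is not minimized when the prediction matches a soft target cannot be used directly for label smoothing or knowledge distillation.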

    The Outcome of the 2022 Landslide4Sense Competition: Advanced Landslide Detection from Multi-Source Satellite Imagery

    The scientific outcomes of the 2022 Landslide4Sense (L4S) competition organized by the Institute of Advanced Research in Artificial Intelligence (IARAI) are presented here. The objective of the competition is to automatically detect landslides based on large-scale multiple sources of satellite imagery collected globally. The 2022 L4S aims to foster interdisciplinary research on recent developments in deep learning (DL) models for the semantic segmentation task using satellite imagery. In the past few years, DL-based models have achieved performance that meets expectations on image interpretation, due to the development of convolutional neural networks (CNNs). The main objective of this article is to present the details and the best-performing algorithms featured in this competition. The winning solutions are elaborated with state-of-the-art models like the Swin Transformer, SegFormer, and U-Net. Advanced machine learning techniques and strategies such as hard example mining, self-training, and mix-up data augmentation are also considered. Moreover, we describe the L4S benchmark data set in order to facilitate further comparisons, and report the results of the accuracy assessment online. The data is accessible on the Future Development Leaderboard for future evaluation at https://www.iarai.ac.at/landslide4sense/challenge/, and researchers are invited to submit more prediction results, evaluate the accuracy of their methods, compare them with those of other users, and, ideally, improve the landslide detection results reported in this article.

    Post-analysis of OSM-GAN Spatial Change Detection

    Keeping crowdsourced maps up-to-date is important for a wide range of location-based applications (route planning, urban planning, navigation, tourism, etc.). We propose a novel map updating mechanism that combines the latest freely available remote sensing data with the current state of online vector map data to train a Deep Learning (DL) neural network. It uses a Generative Adversarial Network (GAN) to perform image-to-image translation, followed by segmentation and raster-vector comparison processes to identify changes to map features (e.g. buildings, roads, etc.) when compared to existing map data. This paper evaluates various GAN models trained with sixteen different datasets designed for use by our change detection/map updating procedure. Each GAN model is evaluated quantitatively and qualitatively to select the most accurate DL model for use in future spatial change detection applications.

    Land cover and forest segmentation using deep neural networks

    Abstract. Land Use and Land Cover (LULC) information is important for a variety of applications, notably those related to forestry. The segmentation of remotely sensed images has attracted considerable research interest. However, this is no easy task: the challenges include the complexity of satellite images, the difficulty of obtaining them, and the lack of ready-made datasets. It has become clear that classifying multiple classes requires more elaborate methods such as Deep Learning (DL), and Deep Neural Networks (DNNs) are a promising candidate for the task. However, DNNs require a huge amount of training data, including Ground Truth (GT) data. In this thesis, a DL pixel-based approach backed by state-of-the-art semantic segmentation methods is followed to tackle the problem of LULC mapping. The DNN used is based on the DeepLabv3 network with an encoder-decoder architecture. To address the lack of data, imagery from the Sentinel-2 satellite, provided for free by Copernicus, was used together with GT mapping from Corine Land Cover (CLC), provided by Copernicus and modified by Tyke to a higher resolution. From the multispectral Sentinel-2 images, the Red Green Blue (RGB) and Near Infra Red (NIR) channels were extracted, the fourth channel being extremely useful for detecting vegetation. A DNN based on ResNet-50 achieved quite good accuracy, reaching 0.53 on the Mean Intersection over Union (MIoU) metric. It was then possible to transfer the learning to data from the Pleiades-1 satellite, which has a much better, Very High Resolution (VHR). The results were excellent, especially when compared with training directly on that data, reaching an accuracy of 0.98 and an MIoU of 0.85.
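For readers unfamiliar with the metric reported in this abstract, the following is a minimal sketch of mean Intersection over Union on flattened label maps. The function name and the toy inputs are illustrative assumptions; the thesis may compute the metric differently (for example, by accumulating a confusion matrix over the whole dataset).

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes that occur in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In the toy example the per-class IoUs are 1/3, 2/3 and 1/2, so pixel accuracy (4 of 6 correct) and mean IoU differ, which is why the two figures are reported separately.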

    A review of technical factors to consider when designing neural networks for semantic segmentation of Earth Observation imagery

    Semantic segmentation (classification) of Earth Observation imagery is a crucial task in remote sensing. This paper presents a comprehensive review of technical factors to consider when designing neural networks for this purpose. The review focuses on Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Adversarial Networks (GANs), and transformer models, discussing prominent design patterns for these ANN families and their implications for semantic segmentation. Common pre-processing techniques for ensuring optimal data preparation are also covered. These include methods for image normalization and chipping, as well as strategies for addressing data imbalance in training samples, and techniques for overcoming limited data, including augmentation techniques, transfer learning, and domain adaptation. By encompassing both the technical aspects of neural network design and the data-related considerations, this review provides researchers and practitioners with a comprehensive and up-to-date understanding of the factors involved in designing effective neural networks for semantic segmentation of Earth Observation imagery. Comment: 145 pages with 32 figures.

    Sentinel-1-based water and flood mapping: benchmarking convolutional neural networks against an operational rule-based processing chain

    In this study, the effectiveness of several convolutional neural network architectures (AlbuNet-34/FCN/DeepLabV3+/U-Net/U-Net++) for water and flood mapping using Sentinel-1 amplitude data is compared to an operational rule-based processor (S-1FS). This comparison is made using a globally distributed dataset of Sentinel-1 scenes and the corresponding ground truth water masks derived from Sentinel-2 data to evaluate the performance of the classifiers on a global scale in various environmental conditions. The impact of using single versus dual-polarized input data on the segmentation capabilities of AlbuNet-34 is evaluated. The weighted cross entropy loss is combined with the Lovász loss and various data augmentation methods are investigated. Furthermore, the concept of atrous spatial pyramid pooling used in DeepLabV3+ and the multiscale feature fusion inherent in U-Net++ are assessed. Finally, the generalization capacity of AlbuNet-34 is tested in a realistic flood mapping scenario by using additional data from two flood events and the Sen1Floods11 dataset. The model trained using dual-polarized data significantly outperforms the S-1FS and increases the intersection over union (IoU) score by 5%. Using a weighted combination of the cross entropy and the Lovász loss increases the IoU score by another 2%. Geometric data augmentation degrades the performance while radiometric data augmentation leads to better testing results. FCN/DeepLabV3+/U-Net/U-Net++ do not perform significantly differently from AlbuNet-34. Models trained on data showing no distinct inundation perform very well in mapping the water extent during two flood events, reaching IoU scores of 0.96 and 0.94, respectively, and perform comparatively well on the Sen1Floods11 dataset.
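The weighted combination of cross entropy and Lovász loss described in this abstract can be sketched as follows for the binary water/non-water case. This is an illustration under stated assumptions: it uses the binary Lovász hinge variant, and the function names, `alpha` and `pos_weight` are hypothetical parameters, not the values or the exact formulation used in the study.

```python
import math


def lovasz_grad(gt_sorted):
    """Gradient of the Lovász extension of the Jaccard loss for sorted errors."""
    total_pos = sum(gt_sorted)
    grad, prev_jaccard = [], 0.0
    cum_pos = cum_neg = 0
    for g in gt_sorted:
        cum_pos += g
        cum_neg += 1 - g
        jaccard = 1.0 - (total_pos - cum_pos) / (total_pos + cum_neg)
        grad.append(jaccard - prev_jaccard)
        prev_jaccard = jaccard
    return grad


def lovasz_hinge(logits, labels):
    """Binary Lovász hinge loss on flat lists of logits and 0/1 labels."""
    errors = [1.0 - z * (2 * y - 1) for z, y in zip(logits, labels)]
    order = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    grad = lovasz_grad([labels[i] for i in order])
    return sum(max(errors[i], 0.0) * g for i, g in zip(order, grad))


def weighted_bce(logits, labels, pos_weight=1.0):
    """Binary cross entropy on logits with a weight on the positive class."""
    total = 0.0
    for z, y in zip(logits, labels):
        log_term = math.log1p(math.exp(-abs(z)))
        neg_log_sigmoid = max(-z, 0.0) + log_term      # -log(sigmoid(z))
        neg_log_one_minus = max(z, 0.0) + log_term     # -log(1 - sigmoid(z))
        total += pos_weight * y * neg_log_sigmoid + (1 - y) * neg_log_one_minus
    return total / len(logits)


def combined_loss(logits, labels, alpha=0.5, pos_weight=1.0):
    """alpha * weighted cross entropy + (1 - alpha) * Lovász hinge."""
    return (alpha * weighted_bce(logits, labels, pos_weight)
            + (1.0 - alpha) * lovasz_hinge(logits, labels))


logits = [2.0, -1.0, 0.5, -2.0]
labels = [1, 0, 0, 1]
print(combined_loss(logits, labels, alpha=0.5, pos_weight=2.0))
```

The cross-entropy term penalizes each pixel independently, while the Lovász term is a surrogate for the IoU computed over all pixels at once; mixing the two is one common way to trade per-pixel calibration against region overlap.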

    Pre-Trained Driving in Localized Surroundings with Semantic Radar Information and Machine Learning

    Following the signal-processing chain from radar detections to vehicle control, this thesis discusses a semantic radar segmentation, a radar SLAM built on top of it, and an autonomous parking function realized by combining the two. The radar segmentation of the (static) environment is achieved by a radar-specific neural network, RadarNet. This segmentation enables the development of the semantic radar graph SLAM SERALOC. Based on the semantic radar SLAM map, an exemplary autonomous parking function is implemented in a real test vehicle. Along a recorded reference path, the function parks solely on the basis of radar perception with previously unattained positioning accuracy. In the first step, a dataset of 8.2 · 10^6 point-wise semantically labeled radar point clouds is generated over a distance of 2507.35 m. No comparable datasets with this annotation level and radar specification are publicly available. The supervised training of the semantic segmentation network RadarNet achieves 28.97% mIoU on six classes. In addition, an automated radar labeling framework, SeRaLF, is presented, which supports radar labeling multimodally using reference cameras and LiDAR. For coherent mapping, a radar signal pre-filter based on an activation map is designed, which suppresses noise and other dynamic multipath reflections. A graph SLAM front end specifically adapted to radar, with radar odometry edges between sub-maps and semantically separate NDT registration, assembles the pre-filtered semantic radar scans into a consistent metric map. Mapping accuracy and data association are thereby improved, and the first semantic radar graph SLAM for arbitrary static environments is realized.
Integrated into a real test vehicle, the interplay of the live RadarNet segmentation and the semantic radar graph SLAM is evaluated using a purely radar-based autonomous parking function. Averaged over 42 autonomous parking maneuvers (∅3.73 km/h) with an average maneuver length of ∅172.75 m, a median absolute pose error of 0.235 m and an end pose error of 0.2443 m are achieved, surpassing comparable radar localization results by ≈ 50%. The map accuracy of changing, newly mapped locations over a mapping distance of ∅165 m yields ≈ 56% map consistency at a deviation of ∅0.163 m. For the autonomous parking, a given trajectory planner and controller approach was used.

    Deep learning for the early detection of harmful algal blooms and improving water quality monitoring

    Climate change will affect how water sources are managed and monitored. The frequency of algal blooms will increase with climate change as it presents favourable conditions for the reproduction of phytoplankton. During monitoring, possible sensory failures in monitoring systems result in partially filled data which may affect critical systems. Therefore, imputation becomes necessary to decrease error and increase data quality. This work investigates two issues in water quality data analysis: improving data quality and anomaly detection. It consists of three main topics: data imputation, early algal bloom detection using in-situ data and early algal bloom detection using multiple modalities. The data imputation problem is addressed by experimenting with various methods on a water quality dataset that includes four locations around the North Sea and the Irish Sea with different characteristics and high miss rates, testing model generalisability. A novel neural network architecture with self-attention is proposed in which imputation is done in a single pass, reducing execution time. The self-attention components increase the interpretability of the imputation process at each stage of the network, providing knowledge to domain experts. After data curation, algal activity is predicted from 1 to 7 days ahead using transformer networks, and the importance of the input with regard to the output of the prediction model is explained using SHAP, aiming to explain model behaviour to domain experts, which is overlooked in previous approaches. The prediction model improves bloom detection performance by 5% on average and the explanation summarizes the complex structure of the model as input-output relationships. Performance improvements on the initial unimodal bloom detection model are made by incorporating multiple modalities into the detection process which were only used for validation purposes previously.
The problem of missing data is also tackled by using coordinated representations, replacing low quality in-situ data with satellite data and vice versa, instead of imputation which may result in biased results.