
    Deep domain adaptation by weighted entropy minimization for the classification of aerial images

    Fully convolutional neural networks (FCNs) are successfully used for the automated pixel-wise classification of aerial images and possibly additional data. However, they require many labelled training samples to perform well. One approach addressing this issue is semi-supervised domain adaptation (SSDA). Here, labelled training samples from a source domain and unlabelled samples from a target domain are used jointly to obtain a target domain classifier, without requiring any labelled samples from the target domain. In this paper, a two-step approach for SSDA is proposed. The first step corresponds to supervised training on the source domain, making use of strong data augmentation to increase the initial performance on the target domain. In the second step, the model is adapted by entropy minimization using a novel weighting strategy. The approach is evaluated on the basis of five domains, corresponding to five cities. Several training variants and adaptation scenarios are tested, indicating that proper data augmentation can already improve the initial target domain performance significantly, resulting in an average overall accuracy of 77.5%. The weighted entropy minimization improves the overall accuracy on the target domains in 19 out of 20 scenarios, by 1.8% on average. In all experiments, a novel FCN architecture is used that yields results comparable to those of the best-performing models on the ISPRS labelling challenge while having an order of magnitude fewer parameters than commonly used FCNs. © 2020 Copernicus GmbH. All rights reserved.
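
    As a rough sketch of the adaptation step, the snippet below minimizes a weighted prediction entropy over unlabelled target-domain images. The abstract does not detail the novel weighting strategy, so the confidence-based pixel weight used here is a placeholder assumption.

    ```python
    import torch
    import torch.nn.functional as F

    def weighted_entropy_loss(logits: torch.Tensor) -> torch.Tensor:
        """logits: (B, C, H, W) raw class scores for unlabelled target images."""
        probs = F.softmax(logits, dim=1)
        log_probs = F.log_softmax(logits, dim=1)
        entropy = -(probs * log_probs).sum(dim=1)   # per-pixel entropy, (B, H, W)
        # Placeholder weight (assumption): emphasize confident pixels.
        weights = probs.max(dim=1).values.detach()
        return (weights * entropy).mean()
    ```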

    Building Change Detection in Airborne Laser Scanning and Dense Image Matching Point Clouds Using a Residual Neural Network

    National Mapping Agencies (NMAs) acquire nation-wide point cloud data from Airborne Laser Scanning (ALS) sensors as well as from Dense Image Matching (DIM) applied to aerial images. As these datasets are often captured years apart, they contain implicit information about changes in the real world. While detecting changes within point clouds is not a new topic per se, detecting changes in point clouds from different sensors, which consequently have different point densities, point distributions and characteristics, is still an ongoing problem. As such, we approach this task using a residual neural network, which detects building changes using height and class information on a raster level. In the experiments, we show that this approach is capable of detecting building changes automatically and reliably, independent of the given point clouds and for various building sizes, achieving mean F1-scores of 80.5% and 79.8% for ALS-ALS and ALS-DIM point clouds at the object level, and F1-scores of 91.1% and 86.3% at the raster level, respectively.
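
    A minimal sketch of such a raster-based change detector is given below. The input channels (height and class rasters from the two epochs) and the depth of four residual blocks are assumptions; the abstract does not specify the exact architecture.

    ```python
    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        def __init__(self, channels: int):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.BatchNorm2d(channels),
            )

        def forward(self, x):
            return torch.relu(x + self.body(x))

    class ChangeNet(nn.Module):
        """Maps stacked epoch rasters (e.g. heights and classes from both
        epochs) to per-pixel building-change logits."""
        def __init__(self, in_channels: int = 4, width: int = 32, depth: int = 4):
            super().__init__()
            self.stem = nn.Conv2d(in_channels, width, 3, padding=1)
            self.blocks = nn.Sequential(*[ResidualBlock(width) for _ in range(depth)])
            self.head = nn.Conv2d(width, 1, 1)  # sigmoid gives change probability

        def forward(self, x):
            return self.head(self.blocks(self.stem(x)))
    ```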

    Using semantically paired images to improve domain adaptation for the semantic segmentation of aerial images

    Modern machine learning, especially deep learning, which is used in a variety of applications, requires a lot of labelled data for model training. An insufficient number of training examples leads to models which do not generalize well to new input instances. This is a particularly significant problem for tasks involving aerial images: often, training data is only available for a limited geographical area and a narrow time window, thus leading to models which perform poorly in different regions, at different times of day, or during different seasons. Domain adaptation can mitigate this issue by using labelled source domain training examples and unlabelled target domain images to train a model which performs well on both domains. Modern adversarial domain adaptation approaches use unpaired data. We propose using pairs of semantically similar images, i.e., images whose segmentations are accurate predictions of each other, for improved model performance. In this paper we show that, as an upper limit based on ground truth, using semantically paired aerial images during training almost always increases model performance, with an average improvement of 4.2% accuracy and 0.036 mean intersection-over-union (mIoU). Using a practical estimate of semantic similarity, we still achieve improvements in more than half of all cases, with average improvements of 2.5% accuracy and 0.017 mIoU in those cases. © 2020 Copernicus GmbH. All rights reserved.
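
    One plausible pairing criterion, sketched below, treats two images as semantically paired when their label maps predict each other well, measured by mean IoU. The use of mIoU as the similarity score and the threshold value are illustrative assumptions.

    ```python
    import numpy as np

    def mean_iou(labels_a: np.ndarray, labels_b: np.ndarray, num_classes: int) -> float:
        """Mean IoU between two label maps of identical shape."""
        ious = []
        for c in range(num_classes):
            a, b = labels_a == c, labels_b == c
            union = np.logical_or(a, b).sum()
            if union == 0:
                continue  # class absent in both maps; skip it
            ious.append(np.logical_and(a, b).sum() / union)
        return float(np.mean(ious)) if ious else 0.0

    def are_paired(labels_a, labels_b, num_classes, threshold=0.5):
        """Treat two images as semantically paired if their label maps agree."""
        return mean_iou(labels_a, labels_b, num_classes) >= threshold
    ```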

    Uncertainty Representation and Quantification of 3D Building Models

    The quality of environmental perception is of great interest for localization tasks in autonomous systems. Maps, generated from the sensed information, are often used as additional spatial references in these applications. The quantification of the map uncertainties gives an insight into how reliable and complete the map is, avoiding potential systematic deviations in pose estimation. Mapping 3D buildings in urban areas using Light Detection and Ranging (LiDAR) point clouds is a challenging task, as it is often subject to uncertain error sources in the real world such as sensor noise and occlusions, which should be well represented in the 3D models for the downstream localization tasks. In this paper, we propose a method to model 3D building façades in complex urban scenes with uncertainty quantification, where the uncertainties of windows and façades are indicated in a probabilistic fashion. The potential locations of the missing objects (here: windows) are inferred from the available data and layout patterns using a Monte Carlo (MC) sampling approach. The proposed 3D building model and uncertainty measures are evaluated using real-world LiDAR point clouds collected by a Riegl Mobile Mapping System. The experimental results show that our uncertainty representation conveys the quality information of the estimated locations and shapes of the modelled map objects.
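
    As a toy illustration of the MC idea, the sketch below estimates where missing windows are likely to sit along one façade row, by sampling regular-grid layouts consistent with the observed window positions. The one-dimensional grid prior and all parameter values are assumptions for illustration; the paper's layout model is not specified in the abstract.

    ```python
    import numpy as np

    rng = np.random.default_rng(42)

    def window_occupancy(observed_x, x_min, x_max, n_bins=20, n_samples=2000):
        """Expected number of windows per horizontal bin of one facade row,
        estimated by sampling regular grids around the observed positions."""
        xs = np.sort(np.asarray(observed_x, dtype=float))
        spacing_mean = float(np.mean(np.diff(xs)))
        edges = np.linspace(x_min, x_max, n_bins + 1)
        hist = np.zeros(n_bins)
        for _ in range(n_samples):
            # Sample a plausible spacing and grid origin (assumed priors).
            spacing = rng.normal(spacing_mean, 0.05 * spacing_mean)
            origin = rng.choice(xs) % spacing
            grid = np.arange(origin, x_max, spacing)
            hist += np.histogram(grid, bins=edges)[0]
        return hist / n_samples

    # High values at positions without an observed window hint at occluded ones.
    occupancy = window_occupancy([1.0, 3.0, 7.0, 9.0], x_min=0.0, x_max=10.0)
    ```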

    Addressing Class Imbalance in Multi-Class Image Classification by Means of Auxiliary Feature Space Restrictions

    Learning from imbalanced class distributions generally leads to a classifier that is not able to distinguish classes with few training examples from the other classes. In the context of cultural heritage, addressing this problem becomes important when existing digital online collections, consisting of images depicting artifacts and assigned semantic annotations, are to be completed automatically; images with known annotations can be used to train a classifier that predicts missing information, where training data is often highly imbalanced. In the present paper, combining a classification loss with an auxiliary clustering loss is proposed to improve the classification performance, particularly for underrepresented classes; in addition, different sampling strategies are applied. The proposed auxiliary loss aims to cluster feature vectors with respect to the semantic annotations as well as to visual properties of the images to be classified and is thus supposed to help the classifier distinguish individual classes. We conduct an ablation study on a dataset consisting of images depicting silk fabrics, together with annotations for different silk-related classification tasks. Experimental results show improvements of up to 10.5% in average F1-score and up to 20.8% in the F1-score averaged over the underrepresented classes in some classification tasks.
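
    A minimal sketch of such a combined objective is shown below, using a centre-loss-style pull of feature vectors towards per-class centres as the clustering term. The paper's clustering loss also accounts for visual properties of the images; that part, and the weighting factor, are omitted or assumed here.

    ```python
    import torch
    import torch.nn.functional as F

    def combined_loss(logits, features, labels, centres, weight=0.1):
        """logits: (B, C) class scores; features: (B, D) embeddings;
        centres: (C, D) learnable per-class feature centres."""
        ce = F.cross_entropy(logits, labels)
        # Pull each feature vector towards the centre of its annotated class.
        cluster = ((features - centres[labels]) ** 2).sum(dim=1).mean()
        return ce + weight * cluster
    ```

    In practice, centres would typically be an nn.Parameter of shape (num_classes, feature_dim), optimized jointly with the network.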

    Deep learning based feature matching and its application in image orientation

    Matching images containing large viewpoint and viewing direction changes, resulting in large perspective differences, is still a very challenging problem. Affine shape estimation, orientation assignment and feature description algorithms based on detected hand-crafted features have been shown to be error-prone. In this paper, affine shape estimation, orientation assignment and description of local features are achieved through deep learning. These three modules are trained based on loss functions optimizing the matching performance of input patch pairs. The trained descriptors are first evaluated on the Brown dataset (Brown et al., 2011), a standard descriptor performance benchmark. The whole pipeline is then tested on images of small blocks acquired with an aerial penta camera, to compute image orientation. The results show that learned features perform significantly better than alternatives based on hand-crafted features. © 2020 Copernicus GmbH. All rights reserved.
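
    The abstract does not name the loss functions; a common choice for optimizing the matching performance of patch pairs is a hardest-in-batch triplet margin loss, sketched below as an assumed stand-in.

    ```python
    import torch
    import torch.nn.functional as F

    def hard_triplet_loss(desc_a, desc_p, margin=1.0):
        """desc_a, desc_p: (B, D) L2-normalised descriptors of matching patches."""
        dist = torch.cdist(desc_a, desc_p)  # (B, B) pairwise descriptor distances
        pos = dist.diag()                   # distances of the matching pairs
        # Mask the diagonal, then take the hardest (closest) non-matching patch.
        masked = dist + 1e9 * torch.eye(len(dist), device=dist.device)
        hardest_neg = torch.minimum(masked.min(dim=0).values, masked.min(dim=1).values)
        return F.relu(margin + pos - hardest_neg).mean()
    ```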

    Using redundant information from multiple aerial images for the detection of bomb craters based on marked point processes

    Many countries were the target of air strikes during World War II. Numerous unexploded bombs still remain in the ground. These duds can be tracked down with the help of bomb craters, indicating areas where unexploded bombs may be located. Such areas are documented in so-called impact maps based on detected bomb craters. In this paper, a stochastic approach based on marked point processes (MPPs) for the automatic detection of bomb craters in aerial images taken during World War II is presented. As most areas are covered by multiple images, the influence of redundant image information on the object detection result is investigated: we compare the results generated based on single images with those obtained by our new approach that combines the individual detection results of multiple images covering the same location. The bomb craters are modelled as circles. Our MPP approach determines the most likely configuration of objects within the scene. This goal is reached by minimizing an energy function that describes the conformity with a predefined model, using Reversible Jump Markov Chain Monte Carlo sampling in combination with simulated annealing. Afterwards, a probability map is generated from the automatic detections via kernel density estimation. By setting a threshold, areas around the detections are classified as contaminated or uncontaminated sites, which results in an impact map. Our results show a significant improvement in the quality of the impact map when redundant image information is used. © 2020 Copernicus GmbH. All rights reserved.
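
    The final mapping step is straightforward to sketch: kernel density estimation over the detected crater centres, followed by a threshold. The bandwidth handling, the normalisation, and the threshold value below are illustrative assumptions.

    ```python
    import numpy as np
    from scipy.stats import gaussian_kde

    def impact_map(crater_xy, grid_x, grid_y, threshold=0.5):
        """crater_xy: (N, 2) detected crater centres. Returns a boolean raster
        marking cells classified as contaminated."""
        kde = gaussian_kde(np.asarray(crater_xy).T)   # bandwidth: scipy default
        xx, yy = np.meshgrid(grid_x, grid_y)
        density = kde(np.vstack([xx.ravel(), yy.ravel()])).reshape(xx.shape)
        density /= density.max()   # normalise to [0, 1] before thresholding
        return density >= threshold
    ```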

    Self-Supervised Adversarial Shape Completion

    The goal of this paper is 3D shape completion: given an incomplete instance of a known category, hallucinate a complete version of it that is geometrically plausible. We develop an adversarial framework that makes it possible to learn shape completion in a self-supervised fashion, only from incomplete examples. This is enabled by a discriminator network that rejects incomplete shapes, via a loss function that separately assesses local sub-regions of the generated example and accepts only regions with a sufficiently high point count. This inductive bias against empty regions forces the generator to output complete shapes. We demonstrate the effectiveness of this approach on synthetic data from ShapeNet and ModelNet, and on a real mobile mapping dataset with nearly 9,000 incomplete cars. Moreover, we apply it to the KITTI autonomous driving dataset without retraining, to highlight its ability to generalise to different data characteristics.
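
    The region-selection idea can be sketched as follows: partition the generated point cloud into cells and let the discriminator judge only cells that contain enough points, so empty regions never pass as real. The regular-grid partition, cell count, and minimum point count are assumptions.

    ```python
    import torch

    def populated_regions(points: torch.Tensor, grid: int = 4, min_points: int = 32):
        """points: (N, 3), coordinates normalised to [-1, 1]^3. Returns the point
        subsets of all grid cells holding at least `min_points` points."""
        cells = ((points + 1.0) * 0.5 * grid).long().clamp(0, grid - 1)
        idx = cells[:, 0] * grid * grid + cells[:, 1] * grid + cells[:, 2]
        regions = []
        for i in range(grid ** 3):
            mask = idx == i
            if mask.sum() >= min_points:
                regions.append(points[mask])
        return regions  # the discriminator scores only these subsets
    ```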

    A hybrid global image orientation method for simultaneously estimating global rotations and global translations

    In recent years, the determination of global image orientation, i.e., global SfM, has gained a lot of attention from researchers, mainly due to its time efficiency. Most global methods take relative rotations and translations as input for a two-step strategy consisting of global rotation averaging and global translation averaging. This paper, by contrast, presents a hybrid approach that aims to solve for global rotations and translations simultaneously, but hierarchically. We first extract an optimal minimum cover connected image triplet set (OMCTS), which includes all available images with a minimum number of triplets, each with three related relative orientations that are compatible with each other. For non-collinear triplets in the OMCTS, we introduce some basic characterizations of the corresponding essential matrices and solve for the image pose parameters by averaging the constrained essential matrices. For the collinear triplets, on the other hand, the image pose parameters are estimated by relative orientation, using the depth of object points from individual local spatial intersection. Finally, all image orientations are estimated in a common coordinate frame by traversing every solved triplet using a similarity transformation. We show results of our method on different benchmarks and demonstrate the performance and capability of the proposed approach by comparing it with other global SfM methods. © 2020 Copernicus GmbH. All rights reserved.
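
    A standard way to test that the three relative orientations of a triplet are mutually compatible is a rotation loop-closure check, sketched below; the angular threshold is an illustrative assumption.

    ```python
    import numpy as np

    def rotation_angle_deg(R: np.ndarray) -> float:
        """Rotation angle of a 3x3 rotation matrix, in degrees."""
        cos = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
        return float(np.degrees(np.arccos(cos)))

    def triplet_consistent(R_ij, R_jk, R_ki, max_deg=2.0) -> bool:
        """Relative rotations i->j, j->k and k->i should chain to the identity."""
        loop = R_ki @ R_jk @ R_ij
        return rotation_angle_deg(loop) <= max_deg
    ```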

    Exploring semantic relationships for hierarchical land use classification based on convolutional neural networks

    Land use (LU) is an important information source commonly stored in geospatial databases. Most current work on automatic LU classification for updating topographic databases considers only one category level (e.g., residential or agricultural) consisting of a small number of classes. However, LU databases frequently contain very detailed information, using a hierarchical object catalogue where the number of categories differs depending on the hierarchy level. This paper presents a method for the classification of LU on the basis of aerial images that differentiates a fine-grained class structure, exploiting the hierarchical relationship between categories at different levels of the class catalogue. Starting from a convolutional neural network (CNN) for classifying the categories of all levels, we propose a strategy to explicitly learn the semantic dependencies between the different category levels simultaneously. The input to the CNN consists of aerial images and derived data as well as land cover information derived from semantic segmentation. Its output consists of class scores at three different semantic levels, from which predictions consistent with the class hierarchy are derived. We evaluate our method using two test sites and show how the classification accuracy depends on the semantic category level. While at the coarsest level an overall accuracy on the order of 90% can be achieved, at the finest level this accuracy is reduced to around 65%. Our experiments also show which classes are particularly hard to differentiate. © 2020 Copernicus GmbH. All rights reserved.
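
    The abstract does not specify how consistent predictions are derived from the per-level scores; one simple rule, sketched below with a toy two-level catalogue (the paper uses three levels), scores every root-to-leaf path and reads all levels off the best path. The catalogue and the summed-log-score rule are illustrative assumptions.

    ```python
    import numpy as np

    # Toy catalogue (assumption): PARENT[f] is the coarse class of fine class f.
    PARENT = {0: 0, 1: 0, 2: 1, 3: 1}

    def consistent_prediction(coarse_scores, fine_scores):
        """Return the (coarse, fine) label pair with the best joint score among
        all hierarchy-consistent combinations; scores are assumed to be
        softmax probabilities."""
        best, best_score = None, -np.inf
        for fine, coarse in PARENT.items():
            score = np.log(coarse_scores[coarse]) + np.log(fine_scores[fine])
            if score > best_score:
                best, best_score = (coarse, fine), score
        return best
    ```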