5 research outputs found

    Automatic building detection in aerial and satellite images

    Abstract—Automatic creation of 3D urban city maps could be an innovative way of providing geometric data for a variety of applications, such as civilian emergency response, natural disaster management, military operations, and urban planning. Reliable and consistent extraction of quantitative information from remotely sensed imagery is crucial to the success of any of these applications. This paper describes the development of an automated roof detection system operating on single monocular electro-optic satellite images. The system employs a fresh approach in which each input image is segmented at several levels. The borderline definitions of these segments, combined with line segments detected in the original image, are used to generate a set of quadrilateral rooftop hypotheses. For each hypothesis, a probability score is computed that represents the evidence for a true building according to the image gradient field and the line segment definitions. The presented results demonstrate that the system can detect small gabled residential rooftops with varying light reflection properties at high positional accuracy.
    Index Terms—Building extraction, satellite image processing, aerial image processing, photogrammetry, computer vision, geometrical shape extraction.
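    To make the gradient-based hypothesis scoring concrete, here is a minimal sketch of how evidence along the edges of a quadrilateral rooftop hypothesis might be accumulated. It is an illustrative reading of the abstract, not the paper's exact model: the Sobel gradients, the sampling density, and the final squashing into [0, 1] are all assumptions made here.

        import numpy as np
        import cv2

        def quad_evidence(gray: np.ndarray, quad: np.ndarray, samples: int = 50) -> float:
            """Pseudo-probability that `quad` (4x2 array of x,y corners) outlines
            a rooftop, based on image gradient evidence along its four edges.
            Illustrative sketch only; not the paper's exact scoring function."""
            gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
            gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
            mag = np.hypot(gx, gy)

            edge_scores = []
            ts = np.linspace(0.0, 1.0, samples)[:, None]
            for i in range(4):
                p0, p1 = quad[i], quad[(i + 1) % 4]
                pts = (1.0 - ts) * p0 + ts * p1  # sample points along the edge
                rows = np.clip(pts[:, 1].astype(int), 0, gray.shape[0] - 1)
                cols = np.clip(pts[:, 0].astype(int), 0, gray.shape[1] - 1)
                edge_scores.append(mag[rows, cols].mean())

            # Squash the mean edge response into [0, 1] relative to the global
            # gradient level, so strong rooftop outlines score close to 1.
            return float(1.0 - np.exp(-np.mean(edge_scores) / (mag.mean() + 1e-9)))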

    MERGING DIGITAL SURFACE MODELS IMPLEMENTING BAYESIAN APPROACHES


    Performance and Transferability Assessment of Convolutional Neural Network (CNN) Based Building Detection Models for Emergency Response

    Remote sensing data from Earth Observation (EO) are used for a wide variety of applications. Over the last decade, the importance of using georeferenced products from satellite and aerial imagery in the event of a natural calamity has been on the rise. They play a vital role in helping first responders by providing valuable information in the form of hazard-zone maps that assist in relocating people, in post-disaster evaluation to better understand the impact on the disaster zone, and in the rehabilitation and reconstruction of damaged property. In remote sensing-based emergency mapping, there are major limitations during the acquisition and processing of EO data. In most cases, satellite data can be acquired only from the set of EO satellites that are in orbit over the hazard zone at the time of the disaster. This can be compensated for by deploying sensors on board airplanes and Unmanned Aerial Vehicles (UAVs) such as drones. This gives rise to an archive of multimodal data with different acquisition geometries, radiometries, acquisition conditions, and Ground Sampling Distances, forcing the data processing and analysis team to be equipped with methods that can readily handle such versatile data.

    Given the dominance of artificial intelligence in Earth observation, this thesis focuses on developing a Convolutional Neural Network (CNN) model that performs robustly when detecting exposed buildings in optical data from different kinds of sensors and platforms. The thesis starts by training a region-based network to obtain a baseline model, which is then improved gradually using techniques such as data augmentation and fine-tuning. A comprehensive performance evaluation is carried out under different training-testing scenarios, and the influence of tile size on detection performance is tested. The resulting model is evaluated on an independent validation dataset acquired during a rapid-mapping activation of the Centre for Satellite-Based Crisis Information (ZKI) during the floods in Germany in July 2021. Contrary to intuition, the model trained with augmentation on the global xView dataset shows the best transferability. Due to resource limitations, the pipeline has been trained on only a small sliver of the available dataset; model weights obtained by retraining on the entire dataset with more powerful machines would provide new benchmarks for transferable object detection models. By combining the resulting exposure information with hazard information, a first insight can be gained into which areas are likely to be affected in the event of a catastrophe. The importance of this work is that it provides an up-to-date picture of the building stock, compared with OpenStreetMap or cadastre data, at different phases of a disaster.
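    As a rough illustration of the kind of region-based baseline and fine-tuning described above, the sketch below adapts a COCO-pretrained Faster R-CNN from torchvision to a single 'building' class. The dataset loader, learning rate, and training schedule are placeholder assumptions; the thesis' actual architecture and hyperparameters may differ.

        import torch
        import torchvision
        from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

        # Start from a COCO-pretrained region-based detector and replace its
        # box-classification head with a 2-class one (background + building).
        model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
        in_features = model.roi_heads.box_predictor.cls_score.in_features
        model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)

        device = "cuda" if torch.cuda.is_available() else "cpu"
        model.to(device)
        optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

        def train_one_epoch(loader):
            # `loader` yields (images, targets); each target is a dict with
            # 'boxes' (N, 4) and 'labels' (N,). Augmentations such as flips or
            # colour jitter would be applied inside the dataset itself.
            model.train()
            for images, targets in loader:
                images = [img.to(device) for img in images]
                targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
                # In train mode, torchvision detectors return a dict of losses.
                loss = sum(model(images, targets).values())
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()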

    Label Efficient 3D Scene Understanding

    3D scene understanding models are becoming increasingly integrated into modern society. With applications ranging across autonomous driving, Augmented Reality, Virtual Reality, robotics, and mapping, the demand for well-behaved models is rapidly increasing. A key requirement for training modern 3D models is high-quality manually labelled training data. Collecting training data is often the time and monetary bottleneck, limiting the size of datasets. As modern data-driven neural networks require very large datasets to achieve good generalisation, many industries seek alternative strategies to manual labelling.

    In this thesis, we present a comprehensive study on achieving 3D scene understanding with fewer labels. Specifically, we evaluate four approaches: existing data, synthetic data, weak supervision, and self-supervision. The existing-data approach looks at the potential of using readily available national mapping data as coarse labels for training a building segmentation model. We further introduce an energy-based active contour snake algorithm to improve label quality by utilising co-registered LiDAR data (see the sketch following this abstract). This is attractive because, whilst the models may still require manual labels, these labels already exist. The synthetic-data approach also exploits existing data that was not originally designed for training neural networks. We demonstrate a pipeline for generating a synthetic Mobile Laser Scanner dataset and experimentally evaluate whether such a synthetic dataset can be used for pre-training on smaller real-world datasets, improving generalisation with less data.

    A weakly-supervised approach is presented which achieves competitive performance on challenging real-world benchmark 3D scene understanding datasets with up to 95% less data. We propose a novel learning approach in which the loss function itself is learnt. Our key insight is that the loss function is a local function and can therefore be trained with less data on a simpler task. Once trained, our loss function can be used to train a 3D object detector using only unlabelled scenes. Our method is both flexible and very scalable, even performing well across datasets.

    Finally, we propose a method which requires only a single geometric representation of each object class as supervision for 3D monocular object detection. We discuss why typical L2-like losses do not work for 3D object detection when using differentiable renderer-based optimisation, and show that the undesirable local minima that L2-like losses fall into can be avoided by including a Generative Adversarial Network-like loss. We achieve state-of-the-art performance on the challenging 6DoF LineMOD dataset, without any scene-level labels.
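    The energy-based active contour idea from the first approach can be sketched with an off-the-shelf snake, here scikit-image's active_contour, snapping a coarse mapping-agency polygon to height discontinuities in a co-registered LiDAR nDSM. The energy weights and the densification step are assumptions for illustration; the thesis defines its own energy terms.

        import numpy as np
        from skimage.filters import gaussian
        from skimage.segmentation import active_contour

        def refine_footprint(ndsm: np.ndarray, coarse_poly: np.ndarray) -> np.ndarray:
            """Snap a coarse building outline (array of (row, col) vertices,
            e.g. from national mapping data) to height discontinuities in a
            co-registered LiDAR nDSM. Illustrative sketch only."""
            # Smooth the height raster so the snake sees a well-behaved surface.
            smooth = gaussian(ndsm, sigma=2.0, preserve_range=True)
            # Densify the polygon so the snake has enough control points.
            dense = np.concatenate([
                np.linspace(coarse_poly[i], coarse_poly[(i + 1) % len(coarse_poly)], 25)
                for i in range(len(coarse_poly))
            ])
            # alpha/beta control elasticity and rigidity; w_edge pulls the
            # contour towards strong gradients (the roof-to-ground height step).
            return active_contour(smooth, dense, alpha=0.015, beta=10.0,
                                  w_edge=1.0, gamma=0.001)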

    Merging digital surface models sourced from multi-satellite imagery and their consequent application in automating 3D building modelling

    Recently, especially within the last two decades, the demand for DSMs (Digital Surface Models) and 3D city models has increased dramatically. This has arisen from the emergence of new applications beyond construction and analysis, and a consequent focus on accuracy and cost. This thesis addresses two linked subjects: first, improving the quality of DSMs by merging different source DSMs using a Bayesian approach; and second, extracting building footprints using several approaches, including Bayesian ones, and producing 3D models.

    Regarding the first topic, a probabilistic model has been developed, based on the Bayesian approach, to merge DSMs from different sensors. The Bayesian approach is well suited to cases where the data are limited, since this can be compensated for by introducing a priori information. The implemented prior is based on the hypothesis that building roof outlines are smooth; for that reason, local entropy has been used to infer the a priori data. In addition to the prior estimation, the quality of the DSMs is assessed using field checkpoints from differential GNSS. The validation results show that the model successfully improved the quality of the DSMs, improving characteristics such as the roof surfaces and consequently leading to better representations. The developed model has also been compared with a Maximum Likelihood model; the two showed similar quantitative statistical results, with the developed model giving better qualitative results. It is worth mentioning that, although the DSMs used in the merging were produced from satellite images, the model can be applied to any type of DSM.

    The second topic is building footprint extraction from satellite imagery. An efficient flow-line for automatic building footprint extraction and 3D model construction from both stereo panchromatic and multispectral satellite imagery was developed, and applied to an area containing different building types with both hipped and sloped roofs. The flow-line consists of multiple stages. First, data preparation: digital orthoimagery and DSMs are created from WorldView-1 imagery, and Pleiades imagery is used to create a vegetation mask. The orthoimagery then undergoes binary classification into 'foreground' (including buildings, shadows, open water, roads, and trees) and 'background' (including grass, bare soil, and clay). From the foreground class, shadows and open water are removed after creating a shadow mask by thresholding the same orthoimagery. Likewise, roads are removed, for the time being, using a mask created interactively from the orthoimagery. NDVI processing of the Pleiades imagery is used to create a mask for removing the trees. An 'edge map' is produced from enhanced orthoimagery using Canny edge detection to define the exact building boundary outlines, and a normalised digital surface model (nDSM) is produced from the original DSM using smoothing and subtracting techniques, as sketched below.
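    A minimal sketch of two of the preparation steps above, the NDVI vegetation mask and the nDSM obtained by smoothing and subtracting, is given below. The band inputs, the NDVI threshold, and the smoothing window are illustrative assumptions rather than the thesis' published values.

        import numpy as np
        from scipy.ndimage import uniform_filter

        def vegetation_mask(red: np.ndarray, nir: np.ndarray, thresh: float = 0.3) -> np.ndarray:
            """NDVI-based tree/vegetation mask from multispectral bands.
            The 0.3 threshold is a common rule of thumb, not the thesis' value."""
            ndvi = (nir - red) / (nir + red + 1e-9)
            return ndvi > thresh

        def normalised_dsm(dsm: np.ndarray, window: int = 101) -> np.ndarray:
            """nDSM by 'smoothing and subtracting': a large moving-average
            window approximates the terrain surface, and subtracting it leaves
            above-ground features such as buildings as elevated 'blobs'."""
            terrain = uniform_filter(dsm, size=window)
            return np.clip(dsm - terrain, 0.0, None)  # negative residuals are not buildings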
    Second, building detection and extraction: buildings are detected, in part, in the nDSM as isolated, relatively elevated 'blobs'. These nDSM 'blobs' are uniquely labelled to identify rudimentary buildings. Each 'blob' is paired with its corresponding 'foreground' area from the orthoimagery, and each 'foreground' area is used as an initial building boundary, which is then vectorised and simplified. Some unnecessary detail in the 'edge map', particularly on the roofs of the buildings, can be removed using mathematical morphology. Some building edges are not detected in the 'edge map' owing to low contrast in parts of the orthoimagery; the 'edge map' is therefore further improved, also using mathematical morphology, leading to the 'modified edge map'. Finally, a Bayesian approach is used to find the most probable coordinates of the building footprints, based on the 'modified edge map'. The proposed footprint a priori data is based on creating a PDF which assumes that the most probable footprint angle at a corner is 90° and along an edge is 180°, with lower probability given to other angles such as 45° and 135° (a hedged sketch of such a prior follows below). The 3D model is constructed by extracting the elevation of the buildings from the DSM and combining it with the regularised building boundary. Validation, both quantitative and qualitative, has shown that the developed process and associated algorithms successfully extract building footprints and create 3D models.
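    To illustrate the footprint angle prior, the sketch below builds a simple mixture-of-Gaussians PDF over vertex angles peaking at 90° and 180°, with smaller modes at 45° and 135°. The mode weights and spread are assumptions; the thesis' actual construction of the prior may differ.

        import numpy as np

        def angle_prior(theta_deg: np.ndarray, sigma: float = 8.0) -> np.ndarray:
            """Prior density over footprint vertex angles: most mass at 90°
            (corners) and 180° (straight runs), smaller modes at 45° and 135°.
            Mode weights and spread are assumptions for illustration."""
            modes = np.array([90.0, 180.0, 45.0, 135.0])
            weights = np.array([0.4, 0.4, 0.1, 0.1])  # sum to 1
            dens = sum(w * np.exp(-0.5 * ((theta_deg - m) / sigma) ** 2)
                       for w, m in zip(weights, modes))
            # Shared Gaussian normaliser; truncation to (0, 360) is ignored
            # for simplicity, so this is only approximately a proper PDF.
            return dens / (sigma * np.sqrt(2.0 * np.pi))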