4,143 research outputs found

    Advancing Land Cover Mapping in Remote Sensing with Deep Learning

    Get PDF
    Automatic mapping of land cover in remote sensing data plays an increasingly significant role in several earth observation (EO) applications, such as sustainable development, autonomous agriculture, and urban planning. Due to the complexity of the real ground surface and environment, accurate classification of land cover types is facing many challenges. This thesis provides novel deep learning-based solutions to land cover mapping challenges such as how to deal with intricate objects and imbalanced classes in multi-spectral and high-spatial resolution remote sensing data. The first work presents a novel model to learn richer multi-scale and global contextual representations in very high-resolution remote sensing images, namely the dense dilated convolutions' merging (DDCM) network. The proposed method is light-weighted, flexible and extendable, so that it can be used as a simple yet effective encoder and decoder module to address different classification and semantic mapping challenges. Intensive experiments on different benchmark remote sensing datasets demonstrate that the proposed method can achieve better performance but consume much fewer computation resources compared with other published methods. Next, a novel graph model is developed for capturing long-range pixel dependencies in remote sensing images to improve land cover mapping. One key component in the method is the self-constructing graph (SCG) module that can effectively construct global context relations (latent graph structure) without requiring prior knowledge graphs. The proposed SCG-based models achieved competitive performance on different representative remote sensing datasets with faster training and lower computational cost compared to strong baseline models. The third work introduces a new framework, namely the multi-view self-constructing graph (MSCG) network, to extend the vanilla SCG model to be able to capture multi-view context representations with rotation invariance to achieve improved segmentation performance. Meanwhile, a novel adaptive class weighting loss function is developed to alleviate the issue of class imbalance commonly found in EO datasets for semantic segmentation. Experiments on benchmark data demonstrate the proposed framework is computationally efficient and robust to produce improved segmentation results for imbalanced classes. To address the key challenges in multi-modal land cover mapping of remote sensing data, namely, 'what', 'how' and 'where' to effectively fuse multi-source features and to efficiently learn optimal joint representations of different modalities, the last work presents a compact and scalable multi-modal deep learning framework (MultiModNet) based on two novel modules: the pyramid attention fusion module and the gated fusion unit. The proposed MultiModNet outperforms the strong baselines on two representative remote sensing datasets with fewer parameters and at a lower computational cost. Extensive ablation studies also validate the effectiveness and flexibility of the framework

    A Weakly Supervised Approach for Estimating Spatial Density Functions from High-Resolution Satellite Imagery

    Full text link
    We propose a neural network component, the regional aggregation layer, that makes it possible to train a pixel-level density estimator using only coarse-grained density aggregates, which reflect the number of objects in an image region. Our approach is simple to use and does not require domain-specific assumptions about the nature of the density function. We evaluate our approach on several synthetic datasets. In addition, we use this approach to learn to estimate high-resolution population and housing density from satellite imagery. In all cases, we find that our approach results in better density estimates than a commonly used baseline. We also show how our housing density estimator can be used to classify buildings as residential or non-residential.Comment: 10 pages, 8 figures. ACM SIGSPATIAL 2018, Seattle, US

    Uncertainty, interpretability and dataset limitations in Deep Learning

    Full text link
    [eng] Deep Learning (DL) has gained traction in the last years thanks to the exponential increase in compute power. New techniques and methods are published at a daily basis, and records are being set across multiple disciplines. Undeniably, DL has brought a revolution to the machine learning field and to our lives. However, not everything has been resolved and some considerations must be taken into account. For instance, obtaining uncertainty measures and bounds is still an open problem. Models should be able to capture and express the confidence they have in their decisions, and Artificial Neural Networks (ANN) are known to lack in this regard. Be it through out of distribution samples, adversarial attacks, or simply unrelated or nonsensical inputs, ANN models demonstrate an unfounded and incorrect tendency to still output high probabilities. Likewise, interpretability remains an unresolved question. Some fields not only need but rely on being able to provide human interpretations of the thought process of models. ANNs, and specially deep models trained with DL, are hard to reason about. Last but not least, there is a tendency that indicates that models are getting deeper and more complex. At the same time, to cope with the increasing number of parameters, datasets are required to be of higher quality and, usually, larger. Not all research, and even less real world applications, can keep with the increasing demands. Therefore, taking into account the previous issues, the main aim of this thesis is to provide methods and frameworks to tackle each of them. These approaches should be applicable to any suitable field and dataset, and are employed with real world datasets as proof of concept. First, we propose a method that provides interpretability with respect to the results through uncertainty measures. The model in question is capable of reasoning about the uncertainty inherent in data and leverages that information to progressively refine its outputs. In particular, the method is applied to land cover segmentation, a classification task that aims to assign a type of land to each pixel in satellite images. The dataset and application serve to prove that the final uncertainty bound enables the end-user to reason about the possible errors in the segmentation result. Second, Recurrent Neural Networks are used as a method to create robust models towards lacking datasets, both in terms of size and class balance. We apply them to two different fields, road extraction in satellite images and Wireless Capsule Endoscopy (WCE). The former demonstrates that contextual information in the temporal axis of data can be used to create models that achieve comparable results to state-of-the-art while being less complex. The latter, in turn, proves that contextual information for polyp detection can be crucial to obtain models that generalize better and obtain higher performance. Last, we propose two methods to leverage unlabeled data in the model creation process. Often datasets are easier to obtain than to label, which results in many wasted opportunities with traditional classification approaches. Our approaches based on self-supervised learning result in a novel contrastive loss that is capable of extracting meaningful information out of pseudo-labeled data. Applying both methods to WCE data proves that the extracted inherent knowledge creates models that perform better in extremely unbalanced datasets and with lack of data. To summarize, this thesis demonstrates potential solutions to obtain uncertainty bounds, provide reasonable explanations of the outputs, and to combat lack of data or unbalanced datasets. Overall, the presented methods have a positive impact on the DL field and could have a real and tangible effect for the society.[cat] És innegable que el Deep Learning ha causat una revolució en molts aspectes no solament de l’aprenentatge automàtic però també de les nostres vides diàries. Tot i així, encara queden aspectes a millorar. Les xarxes neuronals tenen problemes per estimar la seva confiança en les prediccions, i sovint reporten probabilitats altes en casos que no tenen relació amb el model o que directament no tenen sentit. De la mateixa forma, interpretar els resultats d’un model profund i complex resulta una tasca extremadament complicada. Aquests mateixos models, cada cop amb més paràmetres i més potents, requereixen també de dades més ben etiquetades i més completes. Tenint en compte aquestes limitacions, l’objectiu principal és el de buscar mètodes i algoritmes per trobar-ne solució. Primerament, es proposa la creació d’un mètode capaç d’obtenir incertesa en imatges satèl·lit i d’utilitzar-la per crear models més robustos i resultats interpretables. En segon lloc, s’utilitzen Recurrent Neural Networks (RNN) per combatre la falta de dades mitjançant l’obtenció d’informació contextual de dades temporals. Aquestes s’apliquen per l’extracció de carreteres d’imatges satèl·lit i per la classificació de pòlips en imatges obtingudes amb Wireless Capsule Endoscopy (WCE). Finalment, es plantegen dos mètodes per tractar amb la falta de dades etiquetades i desbalancejos en les classes amb l’ús de Self-supervised Learning (SSL). Seqüències no etiquetades d’imatges d’intestins s’incorporen en el models en una fase prèvia a la classificació tradicional. Aquesta tesi demostra que les solucions proposades per obtenir mesures d’incertesa són efectives per donar explicacions raonables i interpretables sobre els resultats. Igualment, es prova que el context en dades de caràcter temporal, obtingut amb RNNs, serveix per obtenir models més simples que poden arribar a solucionar els problemes derivats de la falta de dades. Per últim, es mostra que SSL serveix per combatre de forma efectiva els problemes de generalització degut a dades no balancejades en diversos dominis de WCE. Concloem que aquesta tesi presenta mètodes amb un impacte real en diversos aspectes de DL a la vegada que demostra la capacitat de tenir un impacte positiu en la societat

    Reducing the Burden of Aerial Image Labelling Through Human-in-the-Loop Machine Learning Methods

    Get PDF
    This dissertation presents an introduction to human-in-the-loop deep learning methods for remote sensing applications. It is motivated by the need to decrease the time spent by volunteers on semantic segmentation of remote sensing imagery. We look at two human-in-the-loop approaches of speeding up the labelling of the remote sensing data: interactive segmentation and active learning. We develop these methods specifically in response to the needs of the disaster relief organisations who require accurately labelled maps of disaster-stricken regions quickly, in order to respond to the needs of the affected communities. To begin, we survey the current approaches used within the field. We analyse the shortcomings of these models which include outputs ill-suited for uploading to mapping databases, and an inability to label new regions well, when the new regions differ from the regions trained on. The methods developed then look at addressing these shortcomings. We first develop an interactive segmentation algorithm. Interactive segmentation aims to segment objects with a supervisory signal from a user to assist the model. Work within interactive segmentation has focused largely on segmenting one or few objects within an image. We make a few adaptions to allow an existing method to scale to remote sensing applications where there are tens of objects within a single image that needs to be segmented. We show a quantitative improvements of up to 18% in mean intersection over union, as well as qualitative improvements. The algorithm works well when labelling new regions, and the qualitative improvements show outputs more suitable for uploading to mapping databases. We then investigate active learning in the context of remote sensing. Active learning looks at reducing the number of labelled samples required by a model to achieve an acceptable performance level. Within the context of deep learning, the utility of the various active learning strategies developed is uncertain, with conflicting results within the literature. We evaluate and compare a variety of sample acquisition strategies on the semantic segmentation tasks in scenarios relevant to disaster relief mapping. Our results show that all active learning strategies evaluated provide minimal performance increases over a simple random sample acquisition strategy. However, we present analysis of the results illustrating how the various strategies work and intuition of when certain active learning strategies might be preferred. This analysis could be used to inform future research. We conclude by providing examples of the synergies of these two approaches, and indicate how this work, on reducing the burden of aerial image labelling for the disaster relief mapping community, can be further extended

    Dense Dilated Convolutions Merging Network for Land Cover Classification

    Get PDF
    Land cover classification of remote sensing images is a challenging task due to limited amounts of annotated data, highly imbalanced classes, frequent incorrect pixel-level annotations, and an inherent complexity in the semantic segmentation task. In this article, we propose a novel architecture called the dense dilated convolutions' merging network (DDCM-Net) to address this task. The proposed DDCM-Net consists of dense dilated image convolutions merged with varying dilation rates. This effectively utilizes rich combinations of dilated convolutions that enlarge the network's receptive fields with fewer parameters and features compared with the state-of-the-art approaches in the remote sensing domain. Importantly, DDCM-Net obtains fused local- and global-context information, in effect incorporating surrounding discriminative capability for multiscale and complex-shaped objects with similar color and textures in very high-resolution aerial imagery. We demonstrate the effectiveness, robustness, and flexibility of the proposed DDCM-Net on the publicly available ISPRS Potsdam and Vaihingen data sets, as well as the DeepGlobe land cover data set. Our single model, trained on three-band Potsdam and Vaihingen data sets, achieves better accuracy in terms of both mean intersection over union (mIoU) and F1-score compared with other published models trained with more than three-band data. We further validate our model on the DeepGlobe data set, achieving state-of-the-art result 56.2% mIoU with much fewer parameters and at a lower computational cost compared with related recent work. Code available at https://github.com/samleoqh/DDCM-Semantic-Segmentation-PyTorchComment: Semantic Segmentation, 12 pages, TGRS-2020 early access in IEEE Transactions on Geoscience and Remote Sensing. 2020, Code available at https://github.com/samleoqh/DDCM-Semantic-Segmentation-PyTorc

    Geoscience-aware deep learning:A new paradigm for remote sensing

    Get PDF
    Information extraction is a key activity for remote sensing images. A common distinction exists between knowledge-driven and data-driven methods. Knowledge-driven methods have advanced reasoning ability and interpretability, but have difficulty in handling complicated tasks since prior knowledge is usually limited when facing the highly complex spatial patterns and geoscience phenomena found in reality. Data-driven models, especially those emerging in machine learning (ML) and deep learning (DL), have achieved substantial progress in geoscience and remote sensing applications. Although DL models have powerful feature learning and representation capabilities, traditional DL has inherent problems including working as a black box and generally requiring a large number of labeled training data. The focus of this paper is on methods that integrate domain knowledge, such as geoscience knowledge and geoscience features (GK/GFs), into the design of DL models. The paper introduces the new paradigm of geoscience-aware deep learning (GADL), in which GK/GFs and DL models are combined deeply to extract information from remote sensing data. It first provides a comprehensive summary of GK/GFs used in GADL, which forms the basis for subsequent integration of GK/GFs with DL models. This is followed by a taxonomy of approaches for integrating GK/GFs with DL models. Several approaches are detailed using illustrative examples. Challenges and research prospects in GADL are then discussed. Developing more novel and advanced methods in GADL is expected to become the prevailing trend in advancing remotely sensed information extraction in the future.</p
    • …
    corecore