
    Deep learning in remote sensing: a review

    Standing at the paradigm shift towards data-intensive science, machine learning techniques are becoming increasingly important. In particular, as a major breakthrough in the field, deep learning has proven to be an extremely powerful tool in many fields. Shall we embrace deep learning as the key to all? Or should we resist a 'black-box' solution? There are controversial opinions in the remote sensing community. In this article, we analyze the challenges of using deep learning for remote sensing data analysis, review the recent advances, and provide resources to make deep learning in remote sensing ridiculously simple to start with. More importantly, we encourage remote sensing scientists to bring their expertise into deep learning, and use it as an implicit general model to tackle unprecedented large-scale influential challenges, such as climate change and urbanization. Comment: Accepted for publication in IEEE Geoscience and Remote Sensing Magazine

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, inevitably RS draws from many of the same theories as CV; e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL. Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing

    Dense semantic labeling of sub-decimeter resolution images with convolutional neural networks

    Semantic labeling (or pixel-level land-cover classification) in ultra-high resolution imagery (< 10cm) requires statistical models able to learn high level concepts from spatial data, with large appearance variations. Convolutional Neural Networks (CNNs) achieve this goal by discriminatively learning a hierarchy of representations of increasing abstraction. In this paper we present a CNN-based system relying on a downsample-then-upsample architecture. Specifically, it first learns a rough spatial map of high-level representations by means of convolutions and then learns to upsample them back to the original resolution by deconvolutions. By doing so, the CNN learns to densely label every pixel at the original resolution of the image. This results in many advantages, including i) state-of-the-art numerical accuracy, ii) improved geometric accuracy of predictions and iii) high efficiency at inference time. We test the proposed system on the Vaihingen and Potsdam sub-decimeter resolution datasets, involving semantic labeling of aerial images of 9cm and 5cm resolution, respectively. These datasets are composed of many large and fully annotated tiles, allowing an unbiased evaluation of models making use of spatial information. We do so by comparing two standard CNN architectures to the proposed one: standard patch classification, prediction of local label patches by employing only convolutions, and full patch labeling by employing deconvolutions. All the systems compare favorably with or outperform a state-of-the-art baseline relying on superpixels and powerful appearance descriptors. The proposed full patch labeling CNN outperforms these models by a large margin, also showing a very appealing inference time. Comment: Accepted in IEEE Transactions on Geoscience and Remote Sensing, 201
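
    The downsample-then-upsample idea in the abstract above can be illustrated with plain size arithmetic. The sketch below is not the paper's network: the kernel sizes, strides, paddings and the 256-pixel tile width are assumptions chosen for illustration. It only shows how strided convolutions shrink a feature map and how transposed convolutions ("deconvolutions") can restore the original resolution, using the standard output-size formulas for both layer types (no dilation, no output padding).

```python
def conv_out(size: int, kernel: int, stride: int, pad: int) -> int:
    """Spatial output size of a standard convolution (dilation 1)."""
    return (size + 2 * pad - kernel) // stride + 1


def deconv_out(size: int, kernel: int, stride: int, pad: int) -> int:
    """Spatial output size of a transposed convolution
    (dilation 1, no output padding)."""
    return (size - 1) * stride - 2 * pad + kernel


def trace_sizes(size: int, n_stages: int = 2) -> list[int]:
    """Track one spatial dimension through a hypothetical
    downsample-then-upsample stack."""
    sizes = [size]
    # Downsampling path: assumed 3x3 convolutions, stride 2, padding 1.
    for _ in range(n_stages):
        size = conv_out(size, kernel=3, stride=2, pad=1)
        sizes.append(size)
    # Upsampling path: assumed 4x4 transposed convolutions, stride 2, padding 1.
    for _ in range(n_stages):
        size = deconv_out(size, kernel=4, stride=2, pad=1)
        sizes.append(size)
    return sizes


if __name__ == "__main__":
    # A 256-pixel-wide tile (assumed size) shrinks to 64 and is restored to 256.
    print(trace_sizes(256))
```

    With these assumed settings each stage halves or doubles the width exactly, so the final label map has one prediction per input pixel, which is the property the abstract describes as dense labeling at the original resolution.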

    An explainable convolutional autoencoder model for unsupervised change detection

    Transfer learning methods reuse a deep learning model developed for one task on another task. Such methods have been remarkably successful in a wide range of image processing applications. Following the trend, a few transfer learning based methods have been proposed for unsupervised multi-temporal image analysis and change detection (CD). In spite of their success, the transfer learning based CD methods suffer from limited explainability. In this paper, we propose an explainable convolutional autoencoder model for CD. The model is trained in: 1) an unsupervised way using, as the bi-temporal images, patches extracted from the same geographic location; 2) a greedy fashion, one encoder and decoder layer pair at a time. A number of features relevant for CD are chosen from the encoder layer. To build an explainable model, only the selected features from the encoder layer are retained and the rest are discarded. Following this, another encoder and decoder layer pair is added to the model in similar fashion until convergence. We further visualize the features to better interpret them. We validated the proposed method on a Landsat-8 dataset obtained in Spain. Using a set of experiments, we demonstrate the explainability and effectiveness of the proposed model.

    Overcoming Missing and Incomplete Modalities with Generative Adversarial Networks for Building Footprint Segmentation

    The integration of information acquired with different modalities, spatial resolutions and spectral bands has been shown to improve predictive accuracy. Data fusion is therefore one of the key challenges in remote sensing. Most prior work on multi-modal fusion assumes that modalities are always available during inference. This assumption limits the applications of multi-modal models, since in practice the data collection process is likely to generate data with missing, incomplete or corrupted modalities. In this paper, we show that Generative Adversarial Networks can be effectively used to overcome the problems that arise when modalities are missing or incomplete. Focusing on semantic segmentation of building footprints with missing modalities, our approach achieves an improvement of about 2% on the Intersection over Union (IoU) against the same network that relies only on the available modality.
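
    For readers unfamiliar with the metric quoted above, Intersection over Union (IoU) for a binary segmentation mask is the number of pixels labeled as building in both the prediction and the ground truth, divided by the number labeled as building in either. The sketch below is a minimal illustration, not the paper's evaluation code: the function name, the nested-list mask format and the toy masks are assumptions, and returning 1.0 when both masks are empty is one common convention among several.

```python
def iou(pred, truth):
    """Intersection over Union of two binary masks.

    Both masks are nested lists of 0/1 values with identical shape
    (an assumed format chosen to keep the example dependency-free).
    """
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Convention (assumption): two empty masks agree perfectly.
    return intersection / union if union else 1.0


if __name__ == "__main__":
    # Toy 2x3 masks: 1 marks a building pixel.
    predicted = [[1, 1, 0],
                 [0, 1, 0]]
    reference = [[1, 0, 0],
                 [0, 1, 1]]
    print(iou(predicted, reference))
```

    In the toy example two pixels are marked in both masks and four pixels are marked in at least one, so the score is 2/4.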

    Deep Learning for Remote Sensing Image Processing

    Remote sensing images have many applications such as ground object detection, environmental change monitoring, urban growth monitoring and natural disaster damage assessment. As of 2019, there were roughly 700 satellites listing “earth observation” as their primary application. Both spatial and temporal resolutions of satellite images have improved consistently in recent years and provided opportunities in resolving fine details on the Earth's surface. In the past decade, deep learning techniques have revolutionized many applications in the field of computer vision but have not been fully explored in remote sensing image processing. In this dissertation, several state-of-the-art deep learning models have been investigated and customized for satellite image processing in the applications of landcover classification and ground object detection. First, a simple and effective Convolutional Neural Network (CNN) model is developed to detect fresh soil from tunnel digging activities near the U.S. and Mexico border by using pansharpened synthetic hyperspectral images. These tunnels’ exits are usually hidden under warehouses and are used for illegal activities, for example, by drug dealers. Detecting fresh soil nearby is an indirect way to search for these tunnels. While multispectral images have been used widely and regularly in remote sensing since the 1970s, with the fast advances in hyperspectral sensors, hyperspectral imagery is becoming popular. A combination of 80 synthetic hyperspectral channels with the original eight multispectral channels collected by the WorldView-2 satellite is used by the CNN to detect fresh soil. Experimental results show that detection performance can be significantly improved by the combination of synthetic hyperspectral images with those original multispectral channels. 
Second, an end-to-end, pixel-level Fully Convolutional Network (FCN) model is implemented to estimate the number of refugee tents in the Rukban area near the Syrian-Jordan border using high-resolution multispectral satellite images collected by WorldView-2. Rukban is a desert area crossing the border between Syria and Jordan, and thousands of Syrian refugees have fled into this area since the Syrian civil war in 2014. In the past few years, the number of refugee shelters for the forcibly displaced Syrian refugees in this area has increased rapidly. Estimating the location and number of refugee tents has become a key factor in maintaining the sustainability of the refugee shelter camps. Manually counting the shelters is labor-intensive and sometimes prohibitive given the large quantities. In addition, these shelters/tents are usually small in size, irregular in shape, and sparsely distributed over a very large area, and could easily be missed by traditional image-analysis techniques, which makes image-based approaches challenging as well. The FCN model is also boosted by transfer learning with the knowledge in the pre-trained VGG-16 model. Experimental results show that the FCN model is very accurate, with an error of less than 2%. Last, we investigate Generative Adversarial Networks (GANs) to augment the training data and improve the training of the FCN model for refugee tent detection. Segmentation-based methods like FCN require a large amount of finely labeled images for training. In practice, producing these labels is labor-intensive, time-consuming, and tedious. The data-hungry problem is currently a big hurdle for this application. Experimental results show that the GAN model is a better tool for data augmentation than traditional methods. Overall, our research made a significant contribution to remote sensing image processing