7 research outputs found

    A Convolutional Neural Network with Parallel Multi-Scale Spatial Pooling to Detect Temporal Changes in SAR Images

    In synthetic aperture radar (SAR) image change detection, it is challenging to extract change information from a difference image corrupted by speckle noise. In this paper, we propose a multi-scale spatial pooling (MSSP) network to exploit the change information in the noisy difference image. Unlike a traditional convolutional network with only mono-scale pooling kernels, the proposed network is equipped with multi-scale pooling kernels to exploit the spatial context information on changed regions from the difference image. Furthermore, to verify the generalization of the proposed method, we apply it to cross-dataset bitemporal SAR image change detection, where the MSSP network (MSSP-Net) is trained on one dataset and then applied to an unknown testing dataset. We compare the proposed method with other state-of-the-art methods on four challenging datasets of bitemporal SAR images. Experimental results demonstrate that the proposed method obtains results comparable to S-PCA-Net on the YR-A and YR-B datasets and outperforms other state-of-the-art methods, especially on the Sendai-A and Sendai-B datasets, which contain more complex scenes. More importantly, MSSP-Net is more efficient than S-PCA-Net and convolutional neural networks (CNNs), requiring less execution time in both the training and testing phases.
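The idea of pooling one feature map at several scales in parallel can be illustrated with a minimal sketch. This is not the MSSP-Net implementation: the kernel sizes, the use of max pooling, and the function names `max_pool2d` and `multi_scale_pool` are assumptions chosen for illustration only.

```python
def max_pool2d(fmap, k):
    """Non-overlapping k x k max pooling on a 2D list.

    Rows or columns that do not fill a complete window are dropped.
    """
    rows, cols = len(fmap), len(fmap[0])
    return [
        [max(fmap[r + i][c + j] for i in range(k) for j in range(k))
         for c in range(0, cols - k + 1, k)]
        for r in range(0, rows - k + 1, k)
    ]


def multi_scale_pool(fmap, kernel_sizes=(1, 2, 4)):
    """Pool the same map at several scales and concatenate the flattened results.

    The kernel sizes are hypothetical; the abstract does not state them.
    """
    features = []
    for k in kernel_sizes:
        pooled = max_pool2d(fmap, k)
        features.extend(v for row in pooled for v in row)
    return features


# A toy 4 x 4 "difference image" patch with two changed regions.
patch = [
    [0, 1, 0, 0],
    [1, 9, 0, 0],
    [0, 0, 0, 2],
    [0, 0, 3, 0],
]
feats = multi_scale_pool(patch, kernel_sizes=(2, 4))
print(feats)
```

The small kernel keeps the location of each changed region, while the large kernel summarizes the whole patch, which is the kind of spatial context a single pooling scale would miss.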

    Domain Adaptive Transfer Attack (DATA)-based Segmentation Networks for Building Extraction from Aerial Images

    Semantic segmentation models based on convolutional neural networks (CNNs) have gained much attention in remote sensing and have achieved remarkable performance in extracting buildings from high-resolution aerial images. However, the issue of limited generalization to unseen images remains. When there is a domain gap between the training and test datasets, CNN-based segmentation models trained on the training dataset fail to segment buildings in the test dataset. In this paper, we propose segmentation networks based on a domain adaptive transfer attack (DATA) scheme for building extraction from aerial images. The proposed system combines the domain transfer and adversarial attack concepts. Based on the DATA scheme, the distribution of the input images can be shifted to that of the target images while turning images into adversarial examples against a target network. Defending against adversarial examples adapted to the target domain can overcome the performance degradation due to the domain gap and increase the robustness of the segmentation model. Cross-dataset experiments and an ablation study are conducted on three different datasets: the Inria aerial image labeling dataset, the Massachusetts building dataset, and the WHU East Asia dataset. Compared to the segmentation network without the DATA scheme, the proposed method shows improvements in the overall IoU. Moreover, the proposed method is verified to outperform both feature adaptation (FA) and output space adaptation (OSA).

    Semantic Segmentation Using Modified U-Net Architecture for Crack Detection

    Visual inspection of concrete cracks is essential to keeping a bridge in good condition during its service life. Visual inspection has traditionally been done manually by inspectors, but the results are subjective. Automated visual inspection approaches, on the other hand, are faster and less subjective. Concrete cracks are an important deficiency type assessed by inspectors. Recently, various Convolutional Neural Networks (CNNs) have become a prominent strategy for spotting concrete cracks automatically. CNNs outperform traditional image processing approaches in accuracy on high-level recognition tasks. Among them, U-Net, a CNN-based semantic segmentation method, has been one of the most popular in deep learning because of its excellent performance in open-source crack classification. Although the results of a trained U-Net look good on some datasets, the model still requires further improvement on hard examples of concrete cracks that contain stains, water spots, and small-width cracks. In this paper, we address the challenging problem of accurately detecting a thin concrete crack. To overcome this challenge, we designed a U-Net-like structure that has a contracting path and an expansive path, and compared it to current models, including the original U-Net and a pyramid pooling module network. The proposed architecture utilizes multiple feature maps in the down-sampling path to obtain higher pixel-level segmentation precision. The down-sampled feature is then up-sampled from the output of the pyramid pooling module [13], giving a binary crack and non-crack semantic segmentation. For the experiment, we collected hard examples and evaluated the approach on them. Experimental results demonstrate that the proposed network outperforms the U-Net and a pyramid pooling module network in detecting a thin crack in a noisy environment.

    MACU-Net for Semantic Segmentation of Fine-Resolution Remotely Sensed Images

    Semantic segmentation of remotely sensed images plays an important role in land resource management, yield estimation, and economic assessment. U-Net, a deep encoder-decoder architecture, has been used frequently for image segmentation with high accuracy. In this Letter, we incorporate multi-scale features generated by different layers of U-Net and design a multi-scale skip connected and asymmetric-convolution-based U-Net (MACU-Net) for segmentation of fine-resolution remotely sensed images. Our design has the following advantages: (1) the multi-scale skip connections combine and realign semantic features contained in both low-level and high-level feature maps; (2) the asymmetric convolution block strengthens the feature representation and feature extraction capability of a standard convolution layer. Experiments conducted on two remotely sensed datasets captured by different satellite sensors demonstrate that the proposed MACU-Net outperforms U-Net, U-NetPPL, and U-Net 3+, among other benchmark approaches.

    Dense Refinement Residual Network for Road Extraction From Aerial Imagery Data

    Extraction of roads from high-resolution aerial images with a high degree of accuracy is a prerequisite in various applications. In aerial images, road pixels and background pixels are generally in the ratio of ones-to-tens, which implies a class imbalance problem. Existing semantic segmentation architectures generally do well in road-dominated cases but fail in background-dominated scenarios. This paper proposes a dense refinement residual network (DRR Net) for semantic segmentation of aerial imagery data. The proposed architecture is composed of multiple DRR modules for the extraction of diversified roads, alleviating the class imbalance problem. Each module utilizes dense convolutions at various scales, only in the encoder, for feature learning. Residual connections in each module provide a guided learning path by propagating the combined features to subsequent DRR modules. Segmentation maps undergo various levels of refinement depending on the number of DRR modules utilized in the architecture. To place more emphasis on small object instances, the proposed architecture is trained with a composite loss function. Qualitative and quantitative results are reported on the Massachusetts roads dataset. The experimental results show that the proposed architecture provides better results than other recent architectures.

    Objects Segmentation From High-Resolution Aerial Images Using U-Net With Pyramid Pooling Layers

    Extracting manufactured features such as buildings, roads, and water from aerial images is critical for urban planning, traffic management, and industrial development. Recently, convolutional neural networks (CNNs) have become a popular strategy to capture contextual features automatically. Training CNNs requires a large amount of training data, but freely accessible data sets are not straightforward to use because of imperfect labeling. To address this issue, we build large-scale data sets from RGB aerial images of the metropolitan area of Seoul in South Korea and convert them to digital maps with location information such as roads, buildings, and water. The numbers of training and test data are 72 400 and 9600, respectively. Based on our self-made data sets, we design a multiobject segmentation system and propose an algorithm that utilizes pyramid pooling layers (PPLs) to improve U-Net. Test results indicate that U-Net with PPLs, called UNetPPL, learns fine-grained classification maps and outperforms other algorithms based on the fully convolutional network and U-Net, achieving a mean intersection over union (mIOU) of 79.52 and a pixel accuracy of 87.61% for four types of objects (i.e., building, road, water, and background).
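The two metrics reported above, mIOU and pixel accuracy, can both be derived from a confusion matrix. The sketch below uses the standard definitions and a made-up set of eight labelled pixels; it is not the authors' evaluation code, and the class indices (0 to 3 standing in for the four object types) and function names are assumptions.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """cm[t][p] counts pixels whose true class is t and predicted class is p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of all pixels that were assigned the correct class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN), skipping absent classes."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Hypothetical flattened labels for eight pixels and four classes.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 1, 1, 1, 2, 2, 3, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=4)
print(pixel_accuracy(cm), mean_iou(cm))
```

Pixel accuracy rewards getting the dominant classes right, whereas mIOU weights every class equally, which is why the two numbers in an abstract like this one can differ noticeably.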

    Optimised U-Net for Land Use–Land Cover Classification Using Aerial Photography

    Convolutional Neural Networks (CNN) consist of various hyper-parameters which need to be specified or can be altered when defining a deep learning architecture. There are numerous studies which have tested different types of networks (e.g. U-Net, DeepLabv3+) or created new architectures, benchmarked against well-known test datasets. However, there is a lack of real-world mapping applications demonstrating the effects of changing network hyper-parameters on model performance for land use and land cover (LULC) semantic segmentation. In this paper, we analysed the effects on training time and classification accuracy by altering parameters such as the number of initial convolutional filters, kernel size, network depth, kernel initialiser and activation functions, loss and loss optimiser functions, and learning rate. We achieved this using a well-known top performing architecture, the U-Net, in conjunction with LULC training data and two multispectral aerial images from North Queensland, Australia. A 2018 image was used to train and test CNN models with different parameters and a 2015 image was used for assessing the optimised parameters. We found more complex models with a larger number of filters and larger kernel size produce classifications of higher accuracy but take longer to train. Using an accuracy-time ranking formula, we found using 56 initial filters with kernel size of 5×5 provide the best compromise between training time and accuracy. When fully training a model using these parameters and testing on the 2015 image, we achieved a kappa score of 0.84. This compares to the original U-Net parameters which achieved a kappa score of 0.73.
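The kappa scores quoted above are most likely Cohen's kappa, which corrects observed agreement for the agreement expected by chance; that reading is an assumption, since the abstract does not name the variant. A minimal sketch with a made-up two-class confusion matrix (the function name `cohen_kappa` and the counts are illustrative only):

```python
def cohen_kappa(cm):
    """Cohen's kappa from a square confusion matrix (rows: reference, cols: predicted).

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e is the agreement expected by chance from the row and column totals.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_o = sum(cm[i][i] for i in range(n)) / total
    row_totals = [sum(cm[i]) for i in range(n)]
    col_totals = [sum(cm[r][c] for r in range(n)) for c in range(n)]
    p_e = sum(row_totals[i] * col_totals[i] for i in range(n)) / (total * total)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical counts for 100 reference pixels in two land-cover classes.
cm = [
    [40, 10],
    [5, 45],
]
kappa = cohen_kappa(cm)
print(round(kappa, 3))
```

Here 85% of pixels agree, but half of that agreement would be expected by chance, so kappa is lower than the raw accuracy.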