7 research outputs found
A Convolutional Neural Network with Parallel Multi-Scale Spatial Pooling to Detect Temporal Changes in SAR Images
In synthetic aperture radar (SAR) image change detection, it is quite
challenging to extract change information from a difference image corrupted
by speckle noise. In this paper, we propose a multi-scale spatial pooling
(MSSP) network to extract change information from the noisy difference
image. Unlike traditional convolutional networks with only mono-scale
pooling kernels, the proposed method equips a convolutional network with
multi-scale pooling kernels to exploit spatial context information about
changed regions in the difference image. Furthermore, to verify the
generalization of the proposed method, we apply it to cross-dataset
bitemporal SAR image change detection, where the MSSP network (MSSP-Net) is
trained on one dataset and then applied to an unseen test dataset. We
compare the proposed method with state-of-the-art methods on four
challenging datasets of bitemporal SAR images. Experimental results
demonstrate that the proposed method obtains results comparable with
S-PCA-Net on the YR-A and YR-B datasets and outperforms other
state-of-the-art methods, especially on the Sendai-A and Sendai-B datasets
with more complex scenes. More importantly, MSSP-Net is more efficient than
S-PCA-Net and conventional convolutional neural networks (CNNs), requiring
less execution time in both the training and testing phases.
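The core idea of parallel multi-scale pooling can be illustrated with a minimal numpy sketch. This is an illustration of the general technique, not the authors' implementation; the kernel sizes and the nearest-neighbour upsampling back to the input size are assumptions:

```python
import numpy as np

def max_pool2d(x, k):
    """Non-overlapping k x k max pooling on a 2-D array (H and W divisible by k)."""
    h, w = x.shape
    return x.reshape(h // k, k, w // k, k).max(axis=(1, 3))

def upsample_nearest(x, k):
    """Nearest-neighbour upsampling by factor k, restoring the original size."""
    return np.repeat(np.repeat(x, k, axis=0), k, axis=1)

def multi_scale_spatial_pooling(feature_map, kernel_sizes=(2, 4, 8)):
    """Pool the same feature map at several scales in parallel, upsample each
    result back to the input resolution, and stack the branches as channels."""
    branches = [upsample_nearest(max_pool2d(feature_map, k), k)
                for k in kernel_sizes]
    return np.stack(branches, axis=0)  # shape: (len(kernel_sizes), H, W)

x = np.arange(64, dtype=float).reshape(8, 8)
out = multi_scale_spatial_pooling(x)
print(out.shape)  # (3, 8, 8)
```

Each branch summarizes spatial context at a different receptive-field size; concatenating them lets later layers weigh fine and coarse context jointly.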
Domain Adaptive Transfer Attack (DATA)-based Segmentation Networks for Building Extraction from Aerial Images
Semantic segmentation models based on convolutional neural networks (CNNs)
have gained much attention in relation to remote sensing and have achieved
remarkable performance for the extraction of buildings from high-resolution
aerial images. However, the issue of limited generalization for unseen images
remains. When there is a domain gap between the training and test datasets,
CNN-based segmentation models trained by a training dataset fail to segment
buildings for the test dataset. In this paper, we propose segmentation networks
based on a domain adaptive transfer attack (DATA) scheme for building
extraction from aerial images. The proposed system combines the domain transfer
and adversarial attack concepts. Based on the DATA scheme, the distribution of
the input images can be shifted to that of the target images while turning
images into adversarial examples against a target network. Defending
adversarial examples adapted to the target domain can overcome the performance
degradation due to the domain gap and increase the robustness of the
segmentation model. Cross-dataset experiments and the ablation study are
conducted for the three different datasets: the Inria aerial image labeling
dataset, the Massachusetts building dataset, and the WHU East Asia dataset.
Compared to the performance of the segmentation network without the DATA
scheme, the proposed method shows improvements in the overall IoU. Moreover, it
is verified that the proposed method also outperforms feature adaptation
(FA) and output space adaptation (OSA).
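The attack half of the scheme turns inputs into adversarial examples against a target network. A minimal fast-gradient-sign sketch, assuming the loss gradient with respect to the input is available (this is the generic FGSM step for illustration, not the authors' full DATA scheme, which additionally shifts the input distribution toward the target domain):

```python
import numpy as np

def fgsm_perturb(image, grad, eps=0.03):
    """Fast-gradient-sign step: nudge the input in the direction that
    increases the target network's loss, then clip to a valid pixel range."""
    adv = image + eps * np.sign(grad)
    return np.clip(adv, 0.0, 1.0)

rng = np.random.default_rng(0)
img = rng.random((4, 4))            # stand-in for a normalized input image
grad = rng.standard_normal((4, 4))  # stand-in for d(loss)/d(input)
adv = fgsm_perturb(img, grad)
print(bool(np.abs(adv - img).max() <= 0.03 + 1e-9))  # True
```

Training the segmentation network to defend such domain-adapted adversarial examples is what the abstract credits with closing the domain gap.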
Semantic Segmentation Using Modified U-Net Architecture for Crack Detection
Visual inspection of concrete cracks is essential to keeping a bridge in good condition over its service life. Such inspection has traditionally been performed manually by inspectors, but unfortunately the results are subjective. Automated visual inspection approaches, on the other hand, are faster and less subjective. Concrete cracking is an important deficiency type assessed by inspectors. Recently, various Convolutional Neural Networks (CNNs) have become a prominent strategy for spotting concrete cracks automatically. CNNs outperform traditional image processing approaches in accuracy on high-level recognition tasks. Among them, U-Net, a CNN-based semantic segmentation method, has been one of the most popular deep learning models because of its excellent performance on open-source crack classification. Although the results of a trained U-Net look good on some datasets, the model still requires further improvement on hard examples of concrete cracks containing stains, water spots, and small-width cracks. In this paper, we address the challenging problem of accurately detecting thin concrete cracks. We designed a U-Net-like structure with a contracting path and an expansive path to overcome this challenge and compared it to current models, including the original U-Net and a pyramid pooling module network. The proposed architecture utilizes multiple feature maps in the down-sampling path to obtain higher pixel-level segmentation precision. The down-sampled features are then up-sampled from the output of the pyramid pooling module [13], giving a binary crack/non-crack semantic segmentation. In the experiment, we collected hard examples and evaluated the approach. Experimental results demonstrate that the proposed network outperforms U-Net and a pyramid pooling module network in detecting thin cracks in a noisy environment.
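The contracting/expansive structure with a skip connection can be sketched at one level of depth. This is a toy illustration of the data flow only; real U-Net variants interleave learned convolutions at every level:

```python
import numpy as np

def down(x):
    """Contracting step: non-overlapping 2x2 max pooling."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def up(x):
    """Expansive step: nearest-neighbour upsampling by a factor of 2."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def unet_like(x):
    """One-level encoder-decoder with a skip connection: the decoder sees
    both the upsampled coarse features and the full-resolution input."""
    skip = x
    coarse = down(x)
    decoded = up(coarse)
    return np.stack([decoded, skip], axis=0)  # channel-wise concatenation

x = np.arange(16, dtype=float).reshape(4, 4)
y = unet_like(x)
print(y.shape)  # (2, 4, 4)
```

The skip channel is what preserves the thin, pixel-scale detail (such as narrow cracks) that pooling alone would blur away.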
MACU-Net for Semantic Segmentation of Fine-Resolution Remotely Sensed Images
Semantic segmentation of remotely sensed images plays an important role in land resource management, yield estimation, and economic assessment. U-Net, a deep encoder-decoder architecture, has been used frequently for image segmentation with high accuracy. In this Letter, we incorporate multi-scale features generated by different layers of U-Net and design a multi-scale skip connected and asymmetric-convolution-based U-Net (MACU-Net) for segmentation of fine-resolution remotely sensed images. Our design has the following advantages: (1) the multi-scale skip connections combine and realign semantic features contained in both low-level and high-level feature maps; (2) the asymmetric convolution block strengthens the feature representation and feature extraction capability of a standard convolution layer. Experiments conducted on two remotely sensed datasets captured by different satellite sensors demonstrate that the proposed MACU-Net outperforms U-Net, U-NetPPL, and U-Net 3+, amongst other benchmark approaches.
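An asymmetric convolution block trains parallel square, horizontal, and vertical kernels, then fuses them into a single square kernel for inference; by linearity of convolution the two forms are equivalent. A minimal numpy sketch of the fusion identity (an illustration of the technique, not the authors' code; the 3x3/1x3/3x1 kernel sizes follow the usual asymmetric-convolution setup):

```python
import numpy as np

def conv2d_same(x, k):
    """'Same'-padded 2-D cross-correlation with an odd-sized kernel, stride 1."""
    kh, kw = k.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = (xp[i:i + kh, j:j + kw] * k).sum()
    return out

rng = np.random.default_rng(1)
x = rng.random((6, 6))
k_sq = rng.random((3, 3))  # square branch
k_h = rng.random((1, 3))   # horizontal branch
k_v = rng.random((3, 1))   # vertical branch

# Training time: three parallel branches, outputs summed.
branch_sum = conv2d_same(x, k_sq) + conv2d_same(x, k_h) + conv2d_same(x, k_v)

# Inference time: fold the 1x3 and 3x1 kernels into the 3x3 kernel.
fused = k_sq.copy()
fused[1:2, :] += k_h
fused[:, 1:2] += k_v
print(np.allclose(branch_sum, conv2d_same(x, fused)))  # True
```

The extra branches strengthen the kernel's central row and column during training while costing nothing extra at inference.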
Dense Refinement Residual Network for Road Extraction From Aerial Imagery Data
Extraction of roads from high-resolution aerial images with a high degree of accuracy is a prerequisite in various applications. In aerial images, road pixels and background pixels are generally in a ratio on the order of one to tens, which implies a class imbalance problem. Existing semantic segmentation architectures generally do well in road-dominated cases but fail in background-dominated scenarios. This paper proposes a dense refinement residual network (DRR Net) for semantic segmentation of aerial imagery data. The proposed semantic segmentation architecture is composed of multiple DRR modules for the extraction of diverse roads while alleviating the class imbalance problem. Each module of the proposed architecture utilizes dense convolutions at various scales, only in the encoder, for feature learning. Residual connections in each module provide a guided learning path by propagating the combined features to subsequent DRR modules. Segmentation maps undergo various levels of refinement depending on the number of DRR modules utilized in the architecture. To place greater emphasis on small object instances, the proposed architecture is trained with a composite loss function. Qualitative and quantitative results are reported on the Massachusetts roads dataset. The experimental results show that the proposed architecture provides better results than other recent architectures.
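The abstract does not spell out the composite loss. A common way to combine a pixel-wise term with an overlap-based term that is less sensitive to class imbalance is a weighted BCE + Dice sum; the specific choice of terms and the 0.5 weighting below are assumptions for illustration, not the paper's definition:

```python
import numpy as np

def bce(p, t, eps=1e-7):
    """Pixel-wise binary cross-entropy between probabilities p and targets t."""
    p = np.clip(p, eps, 1 - eps)
    return -(t * np.log(p) + (1 - t) * np.log(1 - p)).mean()

def dice_loss(p, t, eps=1e-7):
    """1 - Dice coefficient: overlap-based, robust to many background pixels."""
    inter = (p * t).sum()
    return 1 - (2 * inter + eps) / (p.sum() + t.sum() + eps)

def composite_loss(p, t, alpha=0.5):
    """Weighted sum: BCE supervises every pixel, Dice emphasizes the rare
    foreground class (e.g. thin roads) regardless of its pixel share."""
    return alpha * bce(p, t) + (1 - alpha) * dice_loss(p, t)

pred = np.array([[0.9, 0.1], [0.2, 0.8]])
target = np.array([[1.0, 0.0], [0.0, 1.0]])
loss = composite_loss(pred, target)
```

Because Dice is computed over the foreground overlap rather than per pixel, a tiny road region contributes as much to it as a large one, which counters the one-to-tens imbalance described above.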
Objects Segmentation From High-Resolution Aerial Images Using U-Net With Pyramid Pooling Layers
Extracting man-made features such as buildings, roads, and water from aerial images is critical for urban planning, traffic management, and industrial development. Recently, convolutional neural networks (CNNs) have become a popular strategy for capturing contextual features automatically. Training CNNs requires large amounts of data, but it is not straightforward to use freely accessible datasets because of imperfect labeling. To address this issue, we build a large-scale dataset using RGB aerial images and convert them to digital maps with location information such as roads, buildings, and water from the metropolitan area of Seoul in South Korea. The numbers of training and test samples are 72,400 and 9,600, respectively. Based on our self-made datasets, we design a multi-object segmentation system and propose an algorithm that utilizes pyramid pooling layers (PPLs) to improve U-Net. Test results indicate that U-Net with PPLs, called UNetPPL, learns fine-grained classification maps and outperforms the fully convolutional network and U-Net, achieving a mean intersection over union (mIoU) of 79.52 and a pixel accuracy of 87.61% for four types of objects (i.e., building, road, water, and background).
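The two metrics reported here, mean intersection over union and pixel accuracy, are both derived from a confusion matrix of predicted versus reference label maps. A minimal sketch (the toy 3x3 label maps are invented for illustration):

```python
import numpy as np

def miou_and_pixel_acc(pred, target, n_classes):
    """Mean IoU and pixel accuracy from integer label maps of equal shape."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    for p, t in zip(pred.ravel(), target.ravel()):
        cm[t, p] += 1                       # rows: reference, cols: prediction
    inter = np.diag(cm)                     # per-class correct pixels
    union = cm.sum(axis=0) + cm.sum(axis=1) - inter
    iou = inter / np.maximum(union, 1)      # guard empty classes
    return iou.mean(), inter.sum() / cm.sum()

pred = np.array([[0, 0, 1], [1, 2, 2], [3, 3, 0]])
target = np.array([[0, 0, 1], [1, 2, 3], [3, 3, 0]])
miou, acc = miou_and_pixel_acc(pred, target, 4)
```

Pixel accuracy is dominated by frequent classes such as background, which is why mIoU, an unweighted average over classes, is reported alongside it.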
Optimised U-Net for Land Use–Land Cover Classification Using Aerial Photography
Convolutional Neural Networks (CNNs) have various hyper-parameters which need to be specified, or can be altered, when defining a deep learning architecture. There are numerous studies which have tested different types of networks (e.g. U-Net, DeepLabv3+) or created new architectures, benchmarked against well-known test datasets. However, there is a lack of real-world mapping applications demonstrating the effects of changing network hyper-parameters on model performance for land use and land cover (LULC) semantic segmentation. In this paper, we analysed the effects on training time and classification accuracy of altering parameters such as the number of initial convolutional filters, kernel size, network depth, kernel initialiser and activation functions, loss and loss optimiser functions, and learning rate. We achieved this using a well-known top-performing architecture, the U-Net, in conjunction with LULC training data and two multispectral aerial images from North Queensland, Australia. A 2018 image was used to train and test CNN models with different parameters, and a 2015 image was used for assessing the optimised parameters. We found that more complex models with a larger number of filters and a larger kernel size produce classifications of higher accuracy but take longer to train. Using an accuracy-time ranking formula, we found that 56 initial filters with a kernel size of 5×5 provide the best compromise between training time and accuracy. When fully training a model using these parameters and testing on the 2015 image, we achieved a kappa score of 0.84. This compares to the original U-Net parameters, which achieved a kappa score of 0.73.
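The kappa score used for assessment is Cohen's kappa, which corrects raw pixel agreement for the agreement expected by chance. A minimal sketch (the toy label vectors are invented for illustration):

```python
import numpy as np

def cohens_kappa(pred, target, n_classes):
    """Cohen's kappa between predicted and reference class labels:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    for p, t in zip(pred, target):
        cm[t, p] += 1
    n = cm.sum()
    po = np.diag(cm).sum() / n                    # observed agreement
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2  # chance agreement
    return (po - pe) / (1 - pe)

pred = np.array([0, 0, 1, 1, 2, 2, 2, 0])
target = np.array([0, 0, 1, 2, 2, 2, 2, 1])
kappa = cohens_kappa(pred, target, 3)
```

Because the chance term depends on class frequencies, kappa is a more demanding score than pixel accuracy when one LULC class dominates the scene.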