Domain Adaptive Transfer Attack (DATA)-based Segmentation Networks for Building Extraction from Aerial Images
Semantic segmentation models based on convolutional neural networks (CNNs)
have gained much attention in relation to remote sensing and have achieved
remarkable performance for the extraction of buildings from high-resolution
aerial images. However, the issue of limited generalization for unseen images
remains. When there is a domain gap between the training and test datasets,
CNN-based segmentation models trained on the training dataset fail to segment
buildings in the test dataset. In this paper, we propose segmentation networks
based on a domain adaptive transfer attack (DATA) scheme for building
extraction from aerial images. The proposed system combines the domain transfer
and adversarial attack concepts. Based on the DATA scheme, the distribution of
the input images can be shifted to that of the target images while turning
images into adversarial examples against a target network. Defending against
adversarial examples adapted to the target domain can overcome the performance
degradation due to the domain gap and increase the robustness of the
segmentation model. Cross-dataset experiments and the ablation study are
conducted for the three different datasets: the Inria aerial image labeling
dataset, the Massachusetts building dataset, and the WHU East Asia dataset.
Compared to the performance of the segmentation network without the DATA
scheme, the proposed method shows improvements in the overall IoU. Moreover,
the proposed method is shown to outperform feature adaptation (FA) and output
space adaptation (OSA).
Comment: 11 pages, 12 figures
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as technical aesthetics
powered by deep learning, then turning back the clock 20 years, we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, and text detection, and provides an
in-depth analysis of their challenges as well as technical improvements in
recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible
publication
Road Segmentation for Remote Sensing Images using Adversarial Spatial Pyramid Networks
Road extraction in remote sensing images is of great importance for a wide
range of applications. Because of the complex background and high density,
most of the existing methods fail to accurately extract a road network that
appears correct and complete. Moreover, they suffer from either insufficient
training data or high costs of manual annotation. To address these problems, we
introduce a new model to apply structured domain adaptation for synthetic image
generation and road segmentation. We incorporate a feature pyramid network into
generative adversarial networks to minimize the difference between the source
and target domains. A generator is trained to produce high-quality synthetic
images, and a discriminator attempts to distinguish them from real ones. We
also propose a feature
pyramid network that improves the performance of the proposed model by
extracting effective features from all the layers of the network for describing
objects at different scales. Indeed, a novel scale-wise architecture is introduced
to learn from the multi-level feature maps and improve the semantics of the
features. For optimization, the model is trained by a joint reconstruction loss
function, which minimizes the difference between the fake images and the real
ones. A wide range of experiments on three datasets demonstrates the superior
performance of the proposed approach in terms of accuracy and efficiency. In
particular, our model achieves a state-of-the-art 78.86 IoU on the Massachusetts
dataset with 14.89M parameters and 86.78B FLOPs, with 4x fewer FLOPs but higher
accuracy (+3.47% IoU) than the top performer among the state-of-the-art
approaches used in the evaluation.
Generate Your Own Scotland: Satellite Image Generation Conditioned on Maps
Despite recent advancements in image generation, diffusion models still
remain largely underexplored in Earth Observation. In this paper we show that
state-of-the-art pretrained diffusion models can be conditioned on cartographic
data to generate realistic satellite images. We provide two large datasets of
paired OpenStreetMap images and satellite views over the region of Mainland
Scotland and the Central Belt. We train a ControlNet model and qualitatively
evaluate the results, demonstrating that both image quality and map fidelity
are achievable. Finally, we provide some insights into the opportunities and
challenges of applying these models for remote sensing. Our model weights and
code for creating the dataset are publicly available at
https://github.com/miquel-espinosa/map-sat.
Comment: 13 pages, 6 figures. preprint