Building-road Collaborative Extraction from Remotely Sensed Images via Cross-Interaction
Buildings are the basic carrier of social production and human life; roads
are the links that interconnect social networks. Building and road information
has important application value in the frontier fields of regional coordinated
development, disaster prevention, autonomous driving, etc. Mapping buildings and
roads from very high-resolution (VHR) remote sensing images has become a hot
research topic. However, the existing methods often ignore the strong spatial
correlation between roads and buildings and extract them in isolation. To fully
utilize the complementary advantages between buildings and roads, we propose a
building-road collaborative extraction method based on multi-task and
cross-scale feature interaction to improve the accuracy of both tasks in a
complementary way. A multi-task interaction module is proposed to exchange
information across tasks while preserving the unique information of each task,
which tackles the seesaw phenomenon in multitask learning. By considering the
variation in appearance and structure between buildings and roads, a
cross-scale interaction module is designed to automatically learn the optimal
receptive field for different tasks. Compared with many existing methods that
train each task individually, the proposed collaborative extraction method can
utilize the complementary advantages between buildings and roads by the
proposed inter-task and inter-scale feature interactions, and automatically
select the optimal receptive field for different tasks. Experiments on a wide
range of urban and rural scenarios show that the proposed algorithm can achieve
building-road extraction with outstanding performance and efficiency.
Comment: 34 pages, 9 figures, submitted to ISPRS Journal of Photogrammetry and
Remote Sensing.
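The cross-task exchange described in this abstract can be sketched as a gated residual interaction between the two task branches. The snippet below is a minimal numpy illustration, not the paper's actual module: the function name `cross_task_interaction` and the gate weights `w_b`, `w_r` are assumed for illustration. Each branch keeps its own features (preserving task-specific information) and adds a gated copy of the other branch's features (the cross-task interaction).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cross_task_interaction(f_building, f_road, w_b, w_r):
    """Gated exchange of features between two task branches.

    Each task keeps its own features and receives a gated copy of the
    other task's features, so shared cues flow across tasks while
    task-specific information is preserved.
    """
    gate_b = sigmoid(f_building @ w_b)  # how much road info the building branch admits
    gate_r = sigmoid(f_road @ w_r)      # how much building info the road branch admits
    f_building_new = f_building + gate_b * f_road
    f_road_new = f_road + gate_r * f_building
    return f_building_new, f_road_new

rng = np.random.default_rng(0)
fb = rng.standard_normal((4, 8))        # 4 spatial positions, 8 channels (building)
fr = rng.standard_normal((4, 8))        # same layout for the road branch
wb = rng.standard_normal((8, 8)) * 0.1  # hypothetical learned gate weights
wr = rng.standard_normal((8, 8)) * 0.1
nb, nr = cross_task_interaction(fb, fr, wb, wr)
print(nb.shape, nr.shape)
```

With zero gate weights the sigmoid outputs 0.5 everywhere, so each branch receives exactly half of the other branch's features on top of its own; the learned weights modulate that exchange per channel and position.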
Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition
Aerial scene recognition is a fundamental task in remote sensing and has
recently received increased interest. While the visual information from
overhead images, processed with powerful models and efficient algorithms,
yields considerable performance on scene recognition, it still suffers from
variations in ground objects, lighting conditions, etc. Inspired by the
multi-channel perception theory in cognitive science, in this paper, to
improve performance on aerial scene recognition, we explore a novel
audiovisual aerial scene recognition task using both images and sounds as
input. Based on an observation that some specific sound events are more likely
to be heard at a given geographic location, we propose to exploit the knowledge
from the sound events to improve the performance on the aerial scene
recognition. For this purpose, we have constructed a new dataset named AuDio
Visual Aerial sceNe reCognition datasEt (ADVANCE). With the help of this
dataset, we evaluate three proposed approaches for transferring the sound event
knowledge to the aerial scene recognition task in a multimodal learning
framework, and show the benefit of exploiting the audio information for the
aerial scene recognition. The source code is publicly available for
reproducibility purposes.
Comment: ECCV 2020.
Machine Learning for Robust Understanding of Scene Materials in Hyperspectral Images
The major challenges in hyperspectral (HS) imaging and data analysis are expensive sensors, high dimensionality of the signal, limited ground truth, and spectral variability. This dissertation develops and analyzes machine learning based methods to address these problems. In the first part, we examine one of the most important HS data analysis tasks: vegetation parameter estimation. We present two Gaussian process based approaches for improving the accuracy of vegetation parameter retrieval when ground truth is limited and/or spectral variability is high. The first is the adoption of covariance functions based on well-established metrics, such as spectral angle and spectral correlation, which are known to be better measures of similarity for spectral data. The second is the joint modeling of related vegetation parameters by multitask Gaussian processes so that the prediction accuracy of the vegetation parameter of interest can be improved with the aid of related vegetation parameters for which a larger set of ground truth is available. The efficacy of the proposed methods is demonstrated by comparing them against state-of-the-art approaches on three real-world HS datasets and one synthetic dataset.
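A covariance function built on the spectral angle can be sketched as follows. This is a minimal numpy illustration of the general idea (a squared-exponential kernel with spectral angle as the distance); the function names and the exact kernel form used in the dissertation are assumptions here:

```python
import numpy as np

def spectral_angle(x, y):
    """Angle between two spectra; invariant to overall brightness scaling,
    which makes it a natural similarity measure for spectral data."""
    cos = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def spectral_angle_kernel(X, Y, length_scale=0.5):
    """Squared-exponential covariance with spectral angle as the distance."""
    K = np.empty((len(X), len(Y)))
    for i, x in enumerate(X):
        for j, y in enumerate(Y):
            K[i, j] = np.exp(-spectral_angle(x, y) ** 2 / (2 * length_scale ** 2))
    return K

x = np.array([1.0, 2.0, 3.0])
# A brightness-scaled copy of a spectrum has zero angle, hence covariance 1.
print(spectral_angle_kernel([x], [2.5 * x]))
```

Because the angle ignores magnitude, two spectra that differ only in illumination intensity are treated as identical, which is exactly the robustness to spectral variability the abstract refers to.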
In the second part, we demonstrate how Bayesian optimization can be applied to jointly tune the different components of hyperspectral data analysis frameworks for better performance. Experimental validation on the spatial-spectral classification framework consisting of a classifier and a Markov random field is provided.
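The joint tuning idea can be sketched as a single search loop that samples hyperparameters for both pipeline components together. The sketch below uses random search as a simple stand-in for a Bayesian-optimization surrogate, and the parameter names (`svm_C`, `mrf_beta`) are hypothetical stand-ins for a classifier's regularization strength and an MRF's spatial-smoothness weight:

```python
import numpy as np

def joint_random_search(objective, n_trials=50, seed=0):
    """Jointly sample classifier and MRF hyperparameters and keep the best.

    Random search is used here as a stand-in for a Bayesian-optimization
    surrogate; the point illustrated is that both pipeline components are
    tuned together rather than one at a time.
    """
    rng = np.random.default_rng(seed)
    best_params, best_score = None, -np.inf
    for _ in range(n_trials):
        params = {
            "svm_C": 10 ** rng.uniform(-2, 2),   # classifier regularization (log scale)
            "mrf_beta": rng.uniform(0.0, 5.0),   # MRF spatial smoothness weight
        }
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy objective with a known optimum at svm_C = 1, mrf_beta = 2.
def toy_objective(params):
    return -(np.log10(params["svm_C"])) ** 2 - (params["mrf_beta"] - 2.0) ** 2

best, score = joint_random_search(toy_objective, n_trials=200, seed=1)
print(best, score)
```

Tuning the two components jointly matters because the best MRF smoothness depends on how confident the classifier's probabilities are, so per-component tuning can miss the jointly optimal setting.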
In the third part, we investigate whether high dimensional HS spectra can be reconstructed from low dimensional multispectral (MS) signals, which can be obtained from much cheaper, lower spectral resolution sensors. A novel end-to-end convolutional residual neural network architecture is proposed that can simultaneously optimize both the MS bands and the transformation to reconstruct HS spectra from MS signals by analyzing a large quantity of HS data. The learned bands can be implemented in sensor hardware, and the learned transformation can be incorporated in the data processing pipeline to build a low-cost hyperspectral data collection system. Using a diverse set of real-world datasets, we show how the proposed approach of optimizing MS bands along with the transformation, rather than just optimizing the transformation with fixed bands as proposed by previous studies, can drastically increase the reconstruction accuracy. Additionally, we also investigate the prospects of using reconstructed HS spectra for land cover classification.
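The fixed-band baseline that the proposed approach improves upon can be sketched in a few lines. The snippet below is a toy numpy illustration (synthetic spectra, Gaussian band responses, and a linear least-squares transformation in place of the paper's convolutional network): HS spectra that lie near a low-dimensional subspace can be recovered from a handful of broad MS bands.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy hyperspectral library: 200 spectra with 60 bands, lying in a
# 5-dimensional subspace (real reflectance spectra are similarly low-rank).
basis = rng.standard_normal((5, 60))
H = rng.standard_normal((200, 5)) @ basis

# Fixed multispectral band responses: 6 broad Gaussian bands across the range.
centers = np.linspace(5, 55, 6)
idx = np.arange(60)
B = np.exp(-(idx[None, :] - centers[:, None]) ** 2 / (2 * 4.0 ** 2))

M = H @ B.T                                 # simulate the MS measurements
W, *_ = np.linalg.lstsq(M, H, rcond=None)   # learn the MS -> HS transformation
H_hat = M @ W                               # reconstruct HS from MS
err = np.linalg.norm(H - H_hat) / np.linalg.norm(H)
print(err)
```

Here the bands `B` are fixed; the paper's contribution is to make `B` itself learnable (and implementable in hardware) alongside a nonlinear transformation, which is what drives the reported accuracy gains.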
Hierarchical Disentanglement-Alignment Network for Robust SAR Vehicle Recognition
Vehicle recognition is a fundamental problem in SAR image interpretation.
However, robustly recognizing vehicle targets is a challenging task in SAR due
to the large intraclass variations and small interclass variations.
Additionally, the lack of large datasets further complicates the task. Inspired
by the analysis of target signature variations and deep learning
explainability, this paper proposes a novel domain alignment framework named
the Hierarchical Disentanglement-Alignment Network (HDANet) to achieve
robustness under various operating conditions. Concisely, HDANet integrates
feature disentanglement and alignment into a unified framework with three
modules: domain data generation, multitask-assisted mask disentanglement, and
domain alignment of target features. The first module generates diverse data
for alignment, and three simple but effective data augmentation methods are
designed to simulate target signature variations. The second module
disentangles the target features from background clutter using the
multitask-assisted mask to prevent clutter from interfering with subsequent
alignment. The third module employs a contrastive loss for domain alignment to
extract robust target features from generated diverse data and disentangled
features. Lastly, the proposed method demonstrates impressive robustness across
nine operating conditions in the MSTAR dataset, and extensive qualitative and
quantitative analyses validate the effectiveness of our framework.
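The domain-alignment step can be sketched with an InfoNCE-style contrastive loss: features of the same target under different (generated) operating conditions are pulled together, while features of other targets are pushed apart. The function below is a generic single-anchor sketch with assumed names, not HDANet's exact loss:

```python
import numpy as np

def contrastive_alignment_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss over one anchor.

    anchor/positive: features of the same target under different operating
    conditions; negatives: features of other targets. Lower loss means the
    anchor is closer (in cosine similarity) to its positive than to negatives.
    """
    def cos(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

    logits = np.array([cos(anchor, positive)] +
                      [cos(anchor, n) for n in negatives]) / temperature
    logits -= logits.max()  # numerical stability
    return float(-np.log(np.exp(logits[0]) / np.exp(logits).sum()))

a = np.array([1.0, 0.0])
pos = np.array([0.9, 0.1])    # same target, augmented condition
neg = [np.array([0.0, 1.0])]  # a different target
print(contrastive_alignment_loss(a, pos, neg))
```

Applying this only to disentangled target features (with clutter masked out, as in the second module) keeps background statistics from dominating the alignment.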
HED-UNet: Combined Segmentation and Edge Detection for Monitoring the Antarctic Coastline
Deep learning-based coastline detection algorithms have begun to outshine
traditional statistical methods in recent years. However, they are usually
trained only as single-purpose models to either segment land and water or
delineate the coastline. In contrast to this, a human annotator will usually
keep a mental map of both segmentation and delineation when performing manual
coastline detection. To account for this task duality, we devise a single deep
learning model that unites these two approaches. By
taking inspiration from the main building blocks of a semantic segmentation
framework (UNet) and an edge detection framework (HED), both tasks are combined
in a natural way. Training is made efficient by employing deep supervision on
side predictions at multiple resolutions. Finally, a hierarchical attention
mechanism is introduced to adaptively merge these multiscale predictions into
the final model output. The advantages of this approach over other traditional
and deep learning-based methods for coastline detection are demonstrated on a
dataset of Sentinel-1 imagery covering parts of the Antarctic coast, where
coastline detection is notoriously difficult. An implementation of our method
is available at \url{https://github.com/khdlr/HED-UNet}.
Comment: This work has been accepted by IEEE TGRS for publication. Copyright
may be transferred without notice, after which this version may no longer be
accessible.
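The final merging step can be sketched as pixelwise attention over the multiscale side predictions: each output pixel takes a softmax-weighted combination of the per-resolution predictions. This numpy sketch assumes the side predictions have already been upsampled to a common resolution; names and shapes are illustrative, not the paper's exact implementation:

```python
import numpy as np

def softmax(z, axis=0):
    """Numerically stable softmax over the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def merge_multiscale(preds, attn_logits):
    """Merge per-resolution side predictions with pixelwise attention.

    preds, attn_logits: arrays of shape (n_scales, H, W); predictions are
    assumed already upsampled to the output resolution. At each pixel the
    attention weights sum to 1 over scales.
    """
    w = softmax(attn_logits, axis=0)
    return (w * preds).sum(axis=0)

preds = np.stack([np.zeros((2, 2)), np.ones((2, 2))])  # two side predictions
# Strongly prefer the second (finer) scale everywhere:
logits = np.stack([np.full((2, 2), -100.0), np.full((2, 2), 100.0)])
print(merge_multiscale(preds, logits))
```

Because the weights are per pixel, the model can lean on coarse, context-rich predictions in smooth regions and on fine-resolution predictions near the coastline itself.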