Cross-CBAM: A Lightweight network for Scene Segmentation
Scene parsing is a major challenge for real-time semantic segmentation.
Although traditional semantic segmentation networks have made remarkable
advances in accuracy, their inference speed remains
unsatisfactory. Moreover, this progress has been achieved with fairly large networks
and powerful computational resources. Such large models are difficult to run
on edge computing devices with limited computing power, which
poses a major challenge for real-time semantic segmentation tasks. In this
paper, we present the Cross-CBAM network, a novel lightweight network for
real-time semantic segmentation. Specifically, a Squeeze-and-Excitation Atrous
Spatial Pyramid Pooling (SE-ASPP) module is proposed to obtain a variable
field of view and multiscale information. We also propose a Cross Convolutional
Block Attention Module (CCBAM), which employs a cross-multiply operation
so that high-level semantic information guides low-level
detail information. Unlike previous works, which use attention to
focus on the desired information in the backbone, CCBAM uses cross-attention
for feature fusion in the FPN structure. Extensive experiments on the
Cityscapes and CamVid datasets demonstrate the effectiveness of the
proposed Cross-CBAM model by achieving a promising trade-off between
segmentation accuracy and inference speed. On the Cityscapes test set, we
achieve 73.4% mIoU at 240.9 FPS and 77.2% mIoU at
88.6 FPS on an NVIDIA GTX 1080Ti.
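The "variable field-of-view" that the abstract attributes to atrous pyramid pooling can be illustrated with a minimal sketch of 1D atrous (dilated) convolution. This is not the authors' SE-ASPP module: the toy signal, the 3-tap kernel, and the dilation rates 1, 2, and 4 are assumptions chosen for illustration, and the squeeze-and-excitation part is omitted.

```python
def dilated_conv1d(signal, kernel, rate):
    """Valid-mode 1D atrous (dilated) cross-correlation with the given rate."""
    span = (len(kernel) - 1) * rate  # distance between the first and last tap
    out = []
    for start in range(len(signal) - span):
        # Taps are spaced `rate` samples apart instead of being adjacent.
        out.append(sum(k * signal[start + i * rate] for i, k in enumerate(kernel)))
    return out


def receptive_field(kernel_size, rate):
    """Number of input positions spanned by one dilated kernel."""
    return (kernel_size - 1) * rate + 1


signal = [float(v) for v in range(16)]  # toy 1D "feature map" (assumption)
kernel = [1.0, 0.0, -1.0]               # hypothetical 3-tap filter
rates = [1, 2, 4]                       # hypothetical rates, not taken from the paper

# One parallel branch per rate, as in a pyramid of atrous convolutions.
branches = {r: dilated_conv1d(signal, kernel, r) for r in rates}
fields = {r: receptive_field(len(kernel), r) for r in rates}
```

Each branch uses the same three weights, yet the span of input it covers grows with the rate, which is how parallel atrous branches gather multiscale context without adding parameters.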
Attentive Single-Tasking of Multiple Tasks
In this work we address task interference in universal networks by
considering that a network is trained on multiple tasks, but performs one task
at a time, an approach we refer to as "single-tasking multiple tasks". The
network thus modifies its behaviour through task-dependent feature adaptation,
or task attention. This gives the network the ability to accentuate the
features that are adapted to a task, while shunning irrelevant ones. We further
reduce task interference by forcing the task gradients to be statistically
indistinguishable through adversarial training, ensuring that the common
backbone architecture serving all tasks is not dominated by any of the
task-specific gradients. Results in three multi-task dense labelling problems
consistently show: (i) a large reduction in the number of parameters while
preserving, or even improving, performance and (ii) a smooth trade-off between
computation and multi-task accuracy. We provide our system's code and
pre-trained models at http://vision.ee.ethz.ch/~kmaninis/astmt/.
Comment: CVPR 2019 Camera Read
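The idea of task-dependent feature adaptation can be sketched as a per-task channel gate applied to shared features. This is a simplified illustration, not the released code: the task names, the gate logits, and the fixed (rather than learned or input-dependent) gates are all assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def task_gate(features, gate_logits):
    """Scale each channel of a shared feature map by a task-specific gate in (0, 1)."""
    return [
        [sigmoid(g) * v for v in channel]
        for channel, g in zip(features, gate_logits)
    ]


# Shared backbone features: 3 channels x 2 spatial positions (toy values).
shared = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Hypothetical per-task gate logits; in a trained network these would be learned.
task_logits = {
    "segmentation": [4.0, -4.0, 0.0],  # accentuates channel 0, suppresses channel 1
    "depth": [-4.0, 4.0, 0.0],         # the opposite preference
}

# The network runs one task at a time, so only one gate set is active per pass.
adapted = {task: task_gate(shared, logits) for task, logits in task_logits.items()}
```

The same shared features yield different adapted features depending on the active task, which is the behaviour the abstract describes as accentuating task-relevant features while shunning irrelevant ones.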
Domain Adaptive Transfer Attack (DATA)-based Segmentation Networks for Building Extraction from Aerial Images
Semantic segmentation models based on convolutional neural networks (CNNs)
have gained much attention in relation to remote sensing and have achieved
remarkable performance for the extraction of buildings from high-resolution
aerial images. However, limited generalization to unseen images
remains an issue. When there is a domain gap between the training and test datasets,
CNN-based segmentation models trained on the training dataset fail to segment
buildings in the test dataset. In this paper, we propose segmentation networks
based on a domain adaptive transfer attack (DATA) scheme for building
extraction from aerial images. The proposed system combines the concepts of domain transfer
and adversarial attack. With the DATA scheme, the distribution of
the input images can be shifted to that of the target images while the
images are turned into adversarial examples against a target network. Defending against
adversarial examples adapted to the target domain can overcome the performance
degradation caused by the domain gap and increase the robustness of the
segmentation model. Cross-dataset experiments and an ablation study are
conducted on three datasets: the Inria aerial image labeling
dataset, the Massachusetts building dataset, and the WHU East Asia dataset.
Compared with the segmentation network without the DATA
scheme, the proposed method improves the overall IoU. Moreover, the
proposed method is shown to outperform both feature
adaptation (FA) and output space adaptation (OSA).
Comment: 11 pages, 12 figure
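To make "turning images into adversarial examples against a target network" concrete, here is a minimal sketch using the fast gradient sign method (FGSM) on a toy logistic classifier. FGSM is a standard attack substituted here for illustration; it is not the DATA scheme itself, and the weights, input, and step size `eps` are assumptions.

```python
import math


def predict(weights, bias, x):
    """Probability of the positive class under a logistic model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def bce_loss(p, y):
    """Binary cross-entropy for a single prediction."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(weights, bias, x, y, eps):
    """Perturb x by eps in the direction that increases the loss."""
    p = predict(weights, bias, x)
    # For logistic regression with cross-entropy, d(loss)/d(x_i) = (p - y) * w_i.
    grad = [(p - y) * w for w in weights]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


weights, bias = [2.0, -1.0, 0.5], 0.1  # hypothetical "target network"
x, y = [0.5, 0.2, 0.8], 1              # toy input and its true label

x_adv = fgsm(weights, bias, x, y, eps=0.1)
clean_loss = bce_loss(predict(weights, bias, x), y)
adv_loss = bce_loss(predict(weights, bias, x_adv), y)
```

The perturbed input stays within `eps` of the original in every coordinate while the loss of the target model rises. Training a model to remain accurate on such inputs is the general sense of "defending" used in the abstract.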