4 research outputs found
Self-Calibrated Cross Attention Network for Few-Shot Segmentation
The key to the success of few-shot segmentation (FSS) lies in how to
effectively utilize support samples. Most solutions compress support foreground
(FG) features into prototypes, but lose some spatial details. Instead, others
use cross attention to fuse query features with uncompressed support FG. Query
FG can be fused with support FG; however, query background (BG) cannot find
matching BG features in support FG and thus inevitably integrates dissimilar
features. Besides, as both query FG and BG are combined with support FG, they
get entangled, thereby leading to ineffective segmentation. To cope with these
issues, we design a self-calibrated cross attention (SCCA) block. For efficient
patch-based attention, query and support features are first split into
patches. Then, we design a patch alignment module to align each query patch
with its most similar support patch for better cross attention. Specifically,
SCCA takes a query patch as Q, and groups the patches from the same query image
and the aligned patches from the support image as K&V. In this way, the query
BG features are fused with matched BG features (from query patches), and thus
the aforementioned issues will be mitigated. Moreover, when calculating SCCA,
we design a scaled-cosine mechanism to better utilize the support features for
similarity calculation. Extensive experiments conducted on PASCAL-5^i and
COCO-20^i demonstrate the superiority of our model, e.g., the mIoU score under
5-shot setting on COCO-20^i is 5.6%+ better than previous state-of-the-art methods.
The code is available at https://github.com/Sam1224/SCCAN.
Comment: This paper is accepted by ICCV'2
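
The patch alignment and attention steps described in the abstract above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names (`align_patches`, `scca_token`), the use of one descriptor vector per patch, and the `scale` value are all assumptions made for illustration. The "scaled-cosine" score is approximated here as cosine similarity multiplied by a constant, which may differ from the paper's exact formulation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)

def align_patches(query_patches, support_patches):
    """For each query patch descriptor, return the index of its most similar support patch."""
    aligned = []
    for q in query_patches:
        sims = [cosine(q, s) for s in support_patches]
        aligned.append(max(range(len(sims)), key=sims.__getitem__))
    return aligned

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scca_token(q, query_tokens, support_tokens, scale=10.0):
    """Attend from one query token over query tokens plus aligned support tokens.

    Keys and values are the same vectors here (no learned projections);
    `scale` is a hypothetical temperature applied to the cosine scores.
    """
    keys = query_tokens + support_tokens
    weights = softmax([scale * cosine(q, k) for k in keys])
    return [sum(w * k[d] for w, k in zip(weights, keys)) for d in range(len(q))]

# Toy data: two query patches and two support patches, each a 2-D descriptor.
query_patches = [[1.0, 0.0], [0.0, 1.0]]
support_patches = [[0.0, 2.0], [3.0, 0.1]]
alignment = align_patches(query_patches, support_patches)
fused = scca_token([1.0, 0.0], [[1.0, 0.0]], [[0.0, 1.0]])
```

Because the key/value set contains query tokens as well as support tokens, a query background token can place most of its attention weight on similar query tokens instead of being forced onto dissimilar support foreground, which is the effect the abstract describes.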
Road Extraction with Satellite Images and Partial Road Maps
Road extraction is a process of automatically generating road maps mainly
from satellite images. Existing models all aim to generate roads from
scratch, even though a large quantity of road maps, though incomplete, are
publicly available (e.g. those from OpenStreetMap) and can help with road
extraction. In this paper, we propose to conduct road extraction based on
satellite images and partial road maps, which is new. We then propose a
two-branch Partial to Complete Network (P2CNet) for the task, which has two
prominent components: Gated Self-Attention Module (GSAM) and Missing Part (MP)
loss. GSAM leverages a channel-wise self-attention module and a gate module to
capture long-range semantics, filter out useless information, and better fuse
the features from two branches. MP loss is derived from the partial road maps,
trying to give more attention to the road pixels that do not exist in partial
road maps. Extensive experiments are conducted to demonstrate the effectiveness
of our model, e.g. P2CNet achieves state-of-the-art performance with the IoU
scores of 70.71% and 75.52%, respectively, on the SpaceNet and OSM datasets.
Comment: This paper has been accepted by IEEE Transactions on Geoscience and
Remote Sensin
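
The idea behind the MP loss, giving more attention to road pixels that are absent from the partial road map, can be sketched as a weighted binary cross-entropy. The abstract does not state the actual formula, so everything below is an assumption for illustration: the function names, the `missing_weight` value, and the choice of a weighted mean are hypothetical and may not match P2CNet.

```python
import math

def missing_part_mask(target, partial):
    """Mark pixels that are road in the full target map but absent from the partial map."""
    return [[1 if t == 1 and p == 0 else 0 for t, p in zip(t_row, p_row)]
            for t_row, p_row in zip(target, partial)]

def missing_part_bce(pred, target, partial, missing_weight=2.0, eps=1e-7):
    """Weighted mean binary cross-entropy with extra weight on missing road pixels.

    `missing_weight` is a hypothetical hyperparameter, not a value from the paper.
    """
    mask = missing_part_mask(target, partial)
    total, weight_sum = 0.0, 0.0
    for pred_row, t_row, m_row in zip(pred, target, mask):
        for p, t, m in zip(pred_row, t_row, m_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            pixel_loss = -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            w = missing_weight if m == 1 else 1.0
            total += w * pixel_loss
            weight_sum += w
    return total / weight_sum

# Toy 2x2 example: the top-right road pixel is missing from the partial map.
target = [[1, 1], [0, 0]]
partial = [[1, 0], [0, 0]]
mask = missing_part_mask(target, partial)
loss_miss_unknown = missing_part_bce([[0.9, 0.1], [0.1, 0.1]], target, partial)
loss_miss_known = missing_part_bce([[0.1, 0.9], [0.1, 0.1]], target, partial)
```

In this toy example, failing to predict the road pixel that the partial map lacks is penalized more than failing on a road pixel the partial map already contains.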