Difference features obtained by comparing the images of two periods play an
indispensable role in the change detection (CD) task. However, a pair of
bi-temporal images can exhibit diverse changes, which may cause various
difference features. Assigning changed pixels with different difference features
to the same category is thus a challenge for CD. Most current methods
acquire distinctive difference features implicitly, for example by enhancing
image representations or supervision information. Nevertheless, informative image
features guarantee only that object semantics are modeled; they cannot guarantee
that changed pixels have similar semantics in the difference feature space and
are distinct from unchanged ones. In this work, the generalized
representation of various changes is learned straightforwardly in the
difference feature space, and a novel Changes-Aware Transformer (CAT) for
refining difference features is proposed. This generalized representation can
perceive which pixels are changed and which are unchanged and further guide the
update of pixels' difference features. CAT effectively accomplishes this
refinement process through stacked cosine cross-attention and
self-attention layers. After refinement, the changed pixels in the difference
feature space are closer to each other, which facilitates change detection. In
addition, CAT is compatible with various backbone networks and existing CD
methods. Experiments on remote sensing and street scene CD datasets
show that our method achieves state-of-the-art performance and has excellent
generalization.
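
To make the idea of a cosine cross-attention over difference features concrete,
the following is a minimal sketch and not the CAT implementation. It assumes
that difference features are element-wise absolute differences of the two
images' features, that the generalized change representation is a single vector
(`change_token`, set by hand here rather than learned), and that attention
scores are cosine similarities scaled by a temperature `tau`. All names and
values are hypothetical.

```python
import math


def cosine(u, v, eps=1e-8):
    """Cosine similarity between two vectors (0 for an all-zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine_cross_attention(query, keys, values, tau=0.1):
    """Attend from one query to many keys using cosine scores divided by tau.

    Returns the attention-weighted sum of `values` and the weights themselves.
    """
    weights = softmax([cosine(query, k) / tau for k in keys])
    dim = len(values[0])
    attended = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return attended, weights


# Toy per-pixel features for the two periods (4 pixels, 2 channels).
feats_t1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
feats_t2 = [[1.0, 0.0], [0.0, 1.0], [0.0, 3.0], [2.5, 0.5]]

# Assumption: difference features are element-wise absolute differences.
diff_feats = [[abs(a - b) for a, b in zip(p, q)] for p, q in zip(feats_t1, feats_t2)]

# Assumption: a hand-set stand-in for the learned change representation.
change_token = [1.0, 1.0]

updated_token, weights = cosine_cross_attention(change_token, diff_feats, diff_feats)
print([round(w, 4) for w in weights])
```

In this toy input, pixels 0 and 1 are unchanged (zero difference features) and
pixels 2 and 3 are changed in different ways, so the attention mass falls
almost entirely on the two changed pixels even though their difference features
differ. Cosine scoring depends on direction rather than magnitude, which is one
plausible reason to prefer it when changes vary in strength.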