Domain shift is pervasive in the visual world, and modern deep neural
networks commonly suffer severe performance degradation under it because of
their poor generalization ability, which limits their real-world
applications. The difficulty stems mainly from the limited environmental
variation in the source data and the large distribution gap between source and
unseen target data. To this end, we propose a unified framework, Style-HAllucinated Dual
consistEncy learning (SHADE), to handle such domain shift in various visual
tasks. Specifically, SHADE is built on two consistency constraints,
Style Consistency (SC) and Retrospection Consistency (RC). SC enriches the
variety of source conditions and encourages the model to learn consistent
representations across style-diversified samples. RC leverages general visual
knowledge to prevent the model from overfitting to source data, and thus keeps
its representations largely consistent with those of general visual models.
Furthermore, we present a novel style hallucination module (SHM) to generate
style-diversified samples that are essential to consistency learning. SHM
selects basis styles from the source distribution, enabling the model to
dynamically generate diverse and realistic samples during training. Extensive
experiments demonstrate that our versatile SHADE can significantly enhance the
generalization in various visual recognition tasks, including image
classification, semantic segmentation and object detection, with different
models, i.e., ConvNets and Transformers.

Comment: Accepted by IJCV. Journal extension of arXiv:2204.02548. Code is
available at https://github.com/HeliosZhao/SHADE-VisualD
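
As an illustration of the style hallucination idea summarized in the abstract, the following is a minimal sketch, not the released implementation. It rests on several assumptions the abstract does not state: a "style" is taken to be the channel-wise mean and standard deviation of a feature map, a new style is formed as a random convex combination of basis styles (weights drawn from a Dirichlet distribution), and the feature is re-stylized by normalizing and re-scaling its channels. The basis styles are passed in directly rather than selected from the source distribution, and all function names are hypothetical.

```python
import math
import random


def channel_stats(feature, eps=1e-6):
    """Per-channel mean and std; `feature` is a list of channels (flat lists)."""
    means = [sum(ch) / len(ch) for ch in feature]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in ch) / len(ch) + eps)
        for ch, m in zip(feature, means)
    ]
    return means, stds


def sample_weights(k, rng, concentration=1.0):
    """Convex combination weights: a Dirichlet sample via normalized gammas."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def hallucinate_style(feature, basis_styles, rng):
    """Re-stylize `feature` with a random mix of basis styles (assumed form).

    basis_styles: list of (means, stds) pairs, one value per channel.
    """
    means, stds = channel_stats(feature)
    weights = sample_weights(len(basis_styles), rng)
    n_ch = len(feature)
    # Mix the basis statistics channel by channel.
    new_means = [
        sum(w * b[0][c] for w, b in zip(weights, basis_styles)) for c in range(n_ch)
    ]
    new_stds = [
        sum(w * b[1][c] for w, b in zip(weights, basis_styles)) for c in range(n_ch)
    ]
    # Remove the original style, then apply the hallucinated one.
    return [
        [(v - means[c]) / stds[c] * new_stds[c] + new_means[c] for v in feature[c]]
        for c in range(n_ch)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy feature map: 2 channels, 4 spatial positions each.
    feature = [[0.0, 1.0, 2.0, 3.0], [10.0, 12.0, 14.0, 16.0]]
    # Two hypothetical basis styles (per-channel means, per-channel stds).
    basis = [([0.0, 0.0], [1.0, 1.0]), ([4.0, -4.0], [3.0, 2.0])]
    stylized = hallucinate_style(feature, basis, rng)
    print(channel_stats(stylized))
```

Because the weights are non-negative and sum to one, each hallucinated statistic stays within the range spanned by the basis styles, which is one plausible reading of how the generated samples remain realistic while still varying from one training step to the next.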