Continuous Video Domain Adaptation (CVDA) is a scenario where a source model
must adapt continuously to a series of individually available, changing target
domains without source data or target supervision. It has wide applications,
such as robotic vision and autonomous driving. The main
underlying challenge of CVDA is to learn useful information from the
unsupervised target data alone while avoiding catastrophic forgetting of
previously learned knowledge, which is beyond the capability of prior
Video-based Unsupervised Domain Adaptation methods. Therefore, we propose a
Confidence-Attentive network with geneRalization enhanced self-knowledge
disTillation (CART) to address the challenge in CVDA. Firstly, to learn from
unlabeled target domains, we adopt pseudo-labeling. However, in
continuous adaptation, prediction errors accumulate rapidly in pseudo
labels, and CART tackles this problem with two key modules.
Specifically, the first module generates refined pseudo labels from model
predictions and deploys a novel attentive learning strategy. The second module
compares the outputs of augmented data from the current model with the outputs of
weakly augmented data from the source model, forming a novel consistency
regularization on the model that alleviates the accumulation of prediction errors.
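The abstract does not specify the exact form of this consistency regularization; a minimal sketch of one common instantiation, a KL-divergence penalty between the frozen source model's predictions on weakly augmented clips (the reference) and the current model's predictions on augmented clips, might look like the following. The function names and the choice of KL divergence are illustrative assumptions, not the paper's definitive loss.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def consistency_loss(current_logits, source_logits):
    """KL(source || current), averaged over the batch.

    current_logits: current (adapting) model on augmented clips.
    source_logits:  frozen source model on weakly augmented clips,
                    used as the reference distribution.
    (Hypothetical sketch; the paper's actual loss may differ.)
    """
    p = softmax(source_logits)                    # reference distribution
    log_p = np.log(p + 1e-12)
    log_q = np.log(softmax(current_logits) + 1e-12)
    # Per-sample KL divergence, then mean over the batch.
    return float(np.mean(np.sum(p * (log_p - log_q), axis=-1)))
```

When the two models agree, the penalty vanishes; as the adapting model drifts from the source model's predictions, the penalty grows, which is how such a term can curb the accumulation of pseudo-label errors.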
Extensive experiments suggest that CART outperforms
existing CVDA methods by a considerable margin.

Comment: 16 pages, 9 tables, 10 figures