Recently, large-scale pre-trained models have shown clear advantages on many
tasks. However, due to their huge computational complexity and storage
requirements, it is challenging to deploy such models in real-world scenarios.
A common solution is knowledge distillation, which treats the large-scale model
as a teacher and trains a small student model to achieve competitive
performance. Cross-task knowledge distillation further expands the application
scenarios of large-scale pre-trained models. Existing knowledge distillation
methods focus on directly mimicking the final predictions or the intermediate
layers of the teacher model, which capture global-level characteristics and are
task-specific. To alleviate the constraint of different label spaces, capturing
invariant, intrinsic local object characteristics (such as the shape
characteristics of the legs and tails of cattle and horses) plays a key role.
Considering the complexity and variability of real-world tasks, we
propose a Prototype-guided Cross-task Knowledge Distillation (ProC-KD) approach
to transfer the intrinsic local-level object knowledge of a large-scale teacher
network to various task scenarios. First, to better transfer the generalized
knowledge of the teacher model in cross-task scenarios, we propose a prototype
learning module that learns the essential feature representations of objects
from the teacher model. Second, for diverse downstream tasks, we propose a
task-adaptive feature augmentation module that enhances the student features
with the learned generalized prototype features and guides the training of the
student model to improve its generalization ability. Experimental results on
various visual tasks demonstrate the effectiveness of our approach in
large-scale model cross-task knowledge distillation scenarios.
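
To make the two modules above concrete, a minimal PyTorch sketch is given
below. The class names, the EMA-style prototype update, the cosine-similarity
prototype matching, and the gated augmentation weight are illustrative
assumptions rather than the exact design of ProC-KD.

```python
# Minimal sketch of prototype learning and task-adaptive feature augmentation.
# The update rule, matching scheme, and gating are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PrototypeLearning(nn.Module):
    """Maintains K prototypes summarizing local object features of the teacher."""

    def __init__(self, num_prototypes: int, dim: int, momentum: float = 0.99):
        super().__init__()
        self.register_buffer("prototypes", torch.randn(num_prototypes, dim))
        self.momentum = momentum

    @torch.no_grad()
    def update(self, teacher_feats: torch.Tensor) -> None:
        # teacher_feats: (N, dim) local features extracted from the teacher.
        sim = F.normalize(teacher_feats, dim=1) @ F.normalize(self.prototypes, dim=1).T
        assign = sim.argmax(dim=1)  # assign each feature to its nearest prototype
        for k in range(self.prototypes.size(0)):
            mask = assign == k
            if mask.any():  # EMA update of the prototype with its assigned features
                mean_feat = teacher_feats[mask].mean(dim=0)
                self.prototypes[k].mul_(self.momentum).add_(mean_feat, alpha=1 - self.momentum)


class TaskAdaptiveAugmentation(nn.Module):
    """Enhances student features with matched prototype features."""

    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(dim, 1)  # per-sample weight for the prototype term

    def forward(self, student_feats: torch.Tensor, prototypes: torch.Tensor) -> torch.Tensor:
        # student_feats: (N, dim); prototypes: (K, dim)
        sim = F.normalize(student_feats, dim=1) @ F.normalize(prototypes, dim=1).T
        matched = sim.softmax(dim=1) @ prototypes          # soft prototype retrieval
        alpha = torch.sigmoid(self.gate(student_feats))    # task-adaptive gate in (0, 1)
        return student_feats + alpha * matched             # augmented student features


# Usage: update prototypes from teacher features, then distill on augmented features.
proto = PrototypeLearning(num_prototypes=64, dim=256)
augment = TaskAdaptiveAugmentation(dim=256)
teacher_feats, student_feats = torch.randn(32, 256), torch.randn(32, 256)
proto.update(teacher_feats)
distill_loss = F.mse_loss(augment(student_feats, proto.prototypes), teacher_feats)
```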