Orderly Dual-Teacher Knowledge Distillation for Lightweight Human Pose Estimation
Although deep convolutional neural networks (DCNNs) have achieved excellent
performance in human pose estimation, these networks often have many
parameters and a high computational cost, leading to slow inference. An
effective solution to this issue is knowledge distillation, which transfers
knowledge from a large pre-trained network (teacher) to a small network
(student). However, existing approaches have several defects: (I) only
a single teacher is adopted, neglecting the potential for a student to learn
from multiple teachers; (II) the human segmentation mask, which can serve as
additional prior information to constrain keypoint locations, is never
utilized; (III) a student with a small number of parameters cannot fully
imitate the heatmaps provided by datasets and teachers; (IV) the heatmaps
generated by teachers contain noise, which degrades the model. To overcome
these defects, we propose an orderly dual-teacher knowledge distillation (ODKD)
framework, which consists of two teachers with different capabilities.
Specifically, the weaker one (primary teacher, PT) teaches keypoint
information, while the stronger one (senior teacher, ST) transfers both
segmentation and keypoint information by adding the human segmentation mask.
With the two teachers combined, an orderly learning strategy is proposed to
improve knowledge absorption. Moreover, we employ a binarization operation
which further improves the learning ability of the student and reduces noise in
heatmaps. Experimental results on the COCO and OCHuman keypoint datasets show
that our proposed ODKD improves the performance of different lightweight models
by a large margin, and HRNet-W16 equipped with ODKD achieves state-of-the-art
performance for lightweight human pose estimation.
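
The abstract does not give the form of the binarization operation or the orderly learning schedule, so the following is only a minimal sketch under stated assumptions, not the ODKD implementation. It assumes that binarization thresholds a teacher heatmap to {0, 1}, that the student is trained with a mean-squared-error loss, and that "orderly" means learning from the PT first and the ST second. The names binarize_heatmap, staged_distillation_loss, threshold, and stage are hypothetical.

```python
import numpy as np


def binarize_heatmap(heatmap, threshold=0.5):
    """Map a heatmap to {0, 1}: 1 where the value is >= threshold, else 0.

    Assumption: a fixed threshold; the abstract does not specify one.
    """
    return (np.asarray(heatmap) >= threshold).astype(np.float32)


def mse(a, b):
    """Mean-squared error between two arrays of the same shape."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    return float(np.mean((a - b) ** 2))


def staged_distillation_loss(student, pt_heatmap, st_heatmap, stage, threshold=0.5):
    """Loss against one teacher at a time, selected by training stage.

    Assumption: stage 0 uses the primary teacher (PT) and stage 1 uses the
    senior teacher (ST); the actual schedule is not given in the abstract.
    """
    if stage == 0:
        target = pt_heatmap
    elif stage == 1:
        target = st_heatmap
    else:
        raise ValueError("stage must be 0 (PT) or 1 (ST)")
    # The student regresses the binarized teacher heatmap, not the raw one.
    return mse(student, binarize_heatmap(target, threshold))


# Toy 2x2 heatmaps standing in for real network outputs.
pt = np.array([[0.1, 0.6], [0.5, 0.2]])
st = np.array([[0.9, 0.1], [0.1, 0.1]])
student = np.array([[0.0, 1.0], [1.0, 0.0]])

loss_pt = staged_distillation_loss(student, pt, st, stage=0)
loss_st = staged_distillation_loss(student, pt, st, stage=1)
```

In this toy case the student already matches the binarized PT heatmap, so the stage-0 loss is zero, while the stage-1 loss is nonzero because the binarized ST heatmap differs in three of the four cells. How the segmentation mask enters the ST signal is left out, since the abstract does not describe it.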