For tackling the task of 2D human pose estimation, the great majority of the
recent methods regard this task as a heatmap estimation problem, and optimize
the heatmap prediction using the Gaussian-smoothed heatmap as the optimization
objective and using the pixel-wise loss (e.g. MSE) as the loss function. In
this paper, we show that optimizing the heatmap prediction in such a way, the
model performance of body joint localization, which is the intrinsic objective
of this task, may not be consistently improved during the optimization process
of the heatmap prediction. To address this problem, from a novel perspective,
we propose to formulate the optimization of the heatmap prediction as a
distribution matching problem between the predicted heatmap and the dot
annotation of the body joint directly. By doing so, our proposed method does
not need to construct the Gaussian-smoothed heatmap and can achieve a more
consistent model performance improvement during the optimization of the heatmap
prediction. We show the effectiveness of our proposed method through extensive
experiments on the COCO dataset and the MPII dataset.Comment: NeurIPS 202