2D heatmap-based approaches have dominated Human Pose Estimation (HPE)
for years due to their high performance. However, the long-standing quantization
error problem in 2D heatmap-based methods leads to several well-known
drawbacks: 1) performance on low-resolution inputs is limited; 2) multiple
costly upsampling layers are required to raise the feature map resolution for
higher localization precision; 3) extra post-processing is needed to
reduce the quantization error. To address these issues, we explore a
new scheme, called \textit{SimCC}, which reformulates HPE as two
classification tasks for horizontal and vertical coordinates. The proposed
SimCC uniformly divides each pixel into several bins, thus achieving
\emph{sub-pixel} localization precision and low quantization error. Benefiting
from that, SimCC can omit additional refinement post-processing and exclude
upsampling layers under certain settings, resulting in a simpler and
more effective pipeline for HPE. Extensive experiments on the COCO,
CrowdPose, and MPII datasets show that SimCC outperforms its heatmap-based
counterparts, with an especially large margin in low-resolution settings.
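
To make the coordinate-classification idea concrete, the following is a minimal sketch and not the reference implementation. It assumes that each axis of width \texttt{width} pixels is split into \texttt{width * bins\_per\_pixel} classes, that a ground-truth coordinate is assigned to its nearest bin, and that decoding takes the highest-scoring bin and divides by \texttt{bins\_per\_pixel}. All function and parameter names are hypothetical and chosen for illustration only.

```python
# Illustrative sketch only: names and the nearest-bin rounding rule are assumptions.

def encode_coordinate(coord, width, bins_per_pixel):
    """Map a continuous pixel coordinate to a class (bin) index on one axis."""
    n_bins = width * bins_per_pixel
    index = int(round(coord * bins_per_pixel))
    # Clamp so that coordinates at the border stay inside the label range.
    return max(0, min(index, n_bins - 1))


def decode_coordinate(scores, bins_per_pixel):
    """Recover a coordinate from per-bin scores: best bin divided by bins per pixel."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best / bins_per_pixel


def roundtrip_error(coord, width, bins_per_pixel):
    """Quantization error left after a perfect (one-hot) prediction on one axis."""
    n_bins = width * bins_per_pixel
    target = encode_coordinate(coord, width, bins_per_pixel)
    scores = [1.0 if i == target else 0.0 for i in range(n_bins)]
    return abs(decode_coordinate(scores, bins_per_pixel) - coord)


if __name__ == "__main__":
    # The same procedure is applied independently to the x and y axes.
    for k in (1, 2, 4):
        print(k, roundtrip_error(10.3, width=64, bins_per_pixel=k))
```

Under these assumptions, the error for a coordinate away from the image border is at most $1/(2k)$ pixels for $k$ bins per pixel, so increasing $k$ yields sub-pixel precision without enlarging the feature map.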