Traditional geometric registration based estimation methods only exploit the
CAD model implicitly, which leads to their dependence on observation quality
and deficiency to occlusion. To address the problem,the paper proposes a
bidirectional correspondence prediction network with a point-wise
attention-aware mechanism. This network not only requires the model points to
predict the correspondence but also explicitly models the geometric
similarities between observations and the model prior. Our key insight is that
the correlations between each model point and scene point provide essential
information for learning point-pair matches. To further tackle the correlation
noises brought by feature distribution divergence, we design a simple but
effective pseudo-siamese network to improve feature homogeneity. Experimental
results on the public datasets of LineMOD, YCB-Video, and Occ-LineMOD show that
the proposed method achieves better performance than other state-of-the-art
methods under the same evaluation criteria. Its robustness in estimating poses
is greatly improved, especially in an environment with severe occlusions