In this work we address the problem of finding reliable pixel-level
correspondences under difficult imaging conditions. We propose an approach
where a single convolutional neural network plays a dual role: It is
simultaneously a dense feature descriptor and a feature detector. By postponing
the detection to a later stage, the obtained keypoints are more stable than
their traditional counterparts based on early detection of low-level
structures. We show that this model can be trained using pixel correspondences
extracted from readily available large-scale SfM reconstructions, without any
further annotations. The proposed method obtains state-of-the-art performance
on both the difficult Aachen Day-Night localization dataset and the InLoc
indoor localization benchmark, as well as competitive performance on other
benchmarks for image matching and 3D reconstruction.Comment: Accepted at CVPR 201