We introduce a comprehensive benchmark for local features and robust
estimation algorithms, focusing on the downstream task -- the accuracy of the
reconstructed camera pose -- as our primary metric. Our pipeline's modular
structure allows easy integration, configuration, and combination of different
methods and heuristics. We demonstrate this by embedding and evaluating dozens
of popular algorithms, from seminal works to the cutting edge of
machine learning research. We show that with proper settings, classical
solutions may still outperform the perceived state of the art.
Besides establishing the actual state of the art, the conducted experiments
reveal unexpected properties of Structure from Motion (SfM) pipelines that can
help improve their performance, for both algorithmic and learned methods. Data
and code are available online at https://github.com/vcg-uvic/image-matching-benchmark,
providing an easy-to-use and flexible framework for the benchmarking of local
features and robust estimation methods, both alongside and against
top-performing methods. This work provides a basis for the Image Matching
Challenge https://vision.uvic.ca/image-matching-challenge.

Comment: Added: KeyNet-SOSNet, AffNet-HardNet, TFeat, MKD from korni