We introduce a simple modification of local image descriptors, such as SIFT,
based on pooling gradient orientations across different domain sizes, in
addition to spatial locations. The resulting descriptor, which we call
DSP-SIFT, outperforms other methods in wide-baseline matching benchmarks,
including those based on convolutional neural networks, despite having the same
dimension as SIFT and requiring no training.

Comment: Extended version of the CVPR 2015 paper. Technical Report UCLA CSD
14002
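
The idea of pooling gradient orientations across domain sizes, as well as across spatial locations, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simplified SIFT-like histogram (a 4x4 spatial grid with 8 orientation bins, and no Gaussian weighting, interpolation, or clamping), nearest-neighbour resampling, and uniform averaging over domain sizes. The function names and the `half_sizes` parameter are hypothetical, chosen only for illustration.

```python
import numpy as np

def sample_patch(image, cx, cy, half_size, out_size=16):
    """Resample a square domain around (cx, cy) to out_size x out_size (nearest neighbour)."""
    offsets = np.linspace(-half_size, half_size, out_size)
    ys = np.clip(np.rint(cy + offsets).astype(int), 0, image.shape[0] - 1)
    xs = np.clip(np.rint(cx + offsets).astype(int), 0, image.shape[1] - 1)
    return image[np.ix_(ys, xs)].astype(float)

def orientation_histogram(patch, cells=4, bins=8):
    """Simplified SIFT-like layout: cells x cells spatial grid, `bins` orientations per cell."""
    gy, gx = np.gradient(patch)  # derivatives along rows (y) and columns (x)
    magnitude = np.hypot(gx, gy)
    angle = np.mod(np.arctan2(gy, gx), 2 * np.pi)
    bin_idx = np.minimum((angle / (2 * np.pi) * bins).astype(int), bins - 1)
    cell = patch.shape[0] // cells
    hist = np.zeros((cells, cells, bins))
    for i in range(cells):
        for j in range(cells):
            sl = (slice(i * cell, (i + 1) * cell), slice(j * cell, (j + 1) * cell))
            # Magnitude-weighted orientation histogram for this spatial cell.
            hist[i, j] = np.bincount(bin_idx[sl].ravel(),
                                     weights=magnitude[sl].ravel(),
                                     minlength=bins)
    return hist.ravel()

def domain_size_pooled_descriptor(image, cx, cy, half_sizes):
    """Average the per-size histograms (assumed uniform weighting), then L2-normalize."""
    hists = [orientation_histogram(sample_patch(image, cx, cy, s)) for s in half_sizes]
    pooled = np.mean(hists, axis=0)
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled

# Usage: pool over four domain sizes around the centre of a random test image.
rng = np.random.default_rng(0)
img = rng.random((64, 64))
desc = domain_size_pooled_descriptor(img, 32, 32, half_sizes=[6, 8, 10, 12])
print(desc.shape)  # (128,)
```

Because the histograms are averaged rather than concatenated, the pooled descriptor in this sketch keeps the 128 dimensions of a single-size histogram, which mirrors the abstract's point that the dimension does not grow.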