thesis

Learning to detect good image features

Abstract

State-of-the-art keypoint detection algorithms are designed to extract specific structures from images and to achieve high keypoint repeatability, meaning that they should find the same points in images undergoing specific transformations. However, this criterion does not guarantee that the selected keypoints are optimal for the subsequent matching step. The approach developed in this thesis aims to extract keypoints that maximize matching performance for a pre-selected image descriptor. To this end, a classifier is trained on a set of “good” and “bad” descriptors extracted from training images affected by a set of pre-defined nuisances. The “good” set consists of the descriptor vectors computed at points that gave correct matches during an initial matching step. In contrast, randomly chosen points that are far from the positives are labeled as “bad” keypoints. The descriptors computed at the “good” and “bad” locations form the feature set used to train the classifier, which then judges each pixel of an unseen input image as a good or bad candidate and thereby drives the extraction of keypoints. This approach, however, requires the descriptor to be computed at every pixel of the image, which is computationally expensive. Moreover, the descriptor extractor used during training must also be used at test time. To overcome these problems, the last part of this thesis focuses on the design and training of a convolutional neural network (CNN) that uses as positive samples the patches centered at the locations that give correct correspondences during the matching step. Finally, the results and performance of the developed algorithm are compared to the state of the art on a public benchmark.
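
The labeling scheme described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the minimum-distance threshold `min_dist`, the one-negative-per-positive ratio, and the (x, y) coordinate convention are all assumptions made for illustration. Keypoints that matched correctly become positives, and random pixel locations far from every positive become negatives.

```python
import numpy as np


def sample_negatives(positives, image_shape, n_negatives, min_dist, rng, max_tries=10000):
    """Draw random (x, y) locations at least `min_dist` pixels from every positive."""
    height, width = image_shape
    negatives = []
    tries = 0
    while len(negatives) < n_negatives and tries < max_tries:
        tries += 1
        candidate = np.array(
            [rng.integers(0, width), rng.integers(0, height)], dtype=float
        )
        # Accept the candidate only if it is far from all positive keypoints.
        if positives.size == 0 or np.min(
            np.linalg.norm(positives - candidate, axis=1)
        ) >= min_dist:
            negatives.append(candidate)
    return np.array(negatives, dtype=float).reshape(-1, 2)


def build_training_labels(keypoints, matched_correctly, image_shape, min_dist=10.0, seed=0):
    """Return (locations, labels): label 1 for "good" points, 0 for "bad" points.

    keypoints         : (N, 2) array-like of (x, y) keypoint locations
    matched_correctly : length-N booleans, True if the keypoint gave a correct match
    image_shape       : (height, width) of the training image
    """
    rng = np.random.default_rng(seed)
    keypoints = np.asarray(keypoints, dtype=float).reshape(-1, 2)
    mask = np.asarray(matched_correctly, dtype=bool)

    positives = keypoints[mask]
    # Assumed ratio: one negative per positive (the abstract does not specify one).
    negatives = sample_negatives(positives, image_shape, len(positives), min_dist, rng)

    locations = np.vstack([positives, negatives])
    labels = np.concatenate(
        [np.ones(len(positives), dtype=int), np.zeros(len(negatives), dtype=int)]
    )
    return locations, labels
```

In the pipeline the abstract describes, the chosen descriptor would then be computed at each returned location to form the classifier's training features; for the CNN variant, image patches centered at the positive locations would be used instead.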
