Large Scale Image Search
We address the problem of large scale image search, for which many recent methods use a bag-of-features image representation. We show the sub-optimality of such a representation for matching descriptors and derive a more precise representation based on 1) Hamming embedding (HE) and 2) weak geometric consistency constraints (WGC). HE provides binary signatures that refine the matching based on visual words. WGC filters out matching descriptors that are not consistent in terms of angle and scale. HE and WGC are integrated within an inverted file system and are efficiently exploited even in the case of very large datasets. Experiments performed on a dataset of one million images show a significant improvement due to the binary signatures and the weak geometric consistency constraints, as well as their efficiency. Estimation of the full geometric transformation, i.e., a re-ranking step on a short list of images, is complementary to our weak geometric consistency constraints and further improves the accuracy. This is joint work with H. Jegou and M. Douz
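The Hamming embedding step described above can be illustrated with a minimal sketch. It assumes that each descriptor has already been quantized to a visual word and given a binary signature packed into an integer; the function names, the signature length, and the threshold are hypothetical choices for illustration, not values taken from the paper.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-packed binary signatures."""
    return bin(a ^ b).count("1")


def score_images(query_feats, inverted_file, threshold):
    """Vote for database images using visual words refined by binary signatures.

    query_feats:   list of (visual_word, signature) pairs for the query image.
    inverted_file: dict mapping visual_word -> list of (image_id, signature).
    threshold:     maximum Hamming distance for a match (illustrative value).
    """
    votes = {}
    for word, sig in query_feats:
        # Only descriptors assigned to the same visual word are candidates.
        for image_id, db_sig in inverted_file.get(word, []):
            # The signature check rejects candidates that share the word
            # but are far apart inside the quantization cell.
            if hamming(sig, db_sig) <= threshold:
                votes[image_id] = votes.get(image_id, 0) + 1
    return votes
```

For example, with `inverted_file = {7: [("a", 0b1010), ("b", 0b0101)]}`, a query feature `(7, 0b1011)` and a threshold of 1 votes only for image `"a"`, although both database descriptors share visual word 7.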
Packing and Padding: Coupled Multi-index for Accurate Image Retrieval
In Bag-of-Words (BoW) based image retrieval, the SIFT visual word has low
discriminative power, so false positive matches are prevalent. Apart from
the information loss during quantization, another cause is that the SIFT
feature only describes the local gradient distribution. To address this
problem, this paper proposes a coupled Multi-Index (c-MI) framework to perform
feature fusion at indexing level. Basically, complementary features are coupled
into a multi-dimensional inverted index. Each dimension of c-MI corresponds to
one kind of feature, and the retrieval process votes for images similar in both
SIFT and other feature spaces. Specifically, we fuse a local
color feature into c-MI. While the precision of visual matching is greatly
enhanced, we adopt Multiple Assignment to improve recall. The joint cooperation
of SIFT and color features significantly reduces the impact of false positive
matches.
Extensive experiments on several benchmark datasets demonstrate that c-MI
improves the retrieval accuracy significantly, while consuming only half of the
query time compared to the baseline. Importantly, we show that c-MI is
complementary to many prior techniques. Combining these methods, we
obtain an mAP of 85.8% and an N-S score of 3.85 on the Holidays and Ukbench
datasets, respectively, which compare favorably with the state of the art.
Comment: 8 pages, 7 figures, 6 tables. Accepted to CVPR 201
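The coupled index described in this abstract can be sketched as a two-dimensional inverted index keyed by a pair of words. This is a simplified illustration under the assumption that both the SIFT and the color feature are already quantized to integer word ids; `build_index` and `query_index` are hypothetical names, and the weighting used in the paper is omitted.

```python
from collections import defaultdict


def build_index(db_features):
    """Build a 2-D inverted index keyed by (sift_word, color_word).

    db_features: iterable of (image_id, sift_word, color_word) triples.
    """
    index = defaultdict(list)
    for image_id, sift_word, color_word in db_features:
        index[(sift_word, color_word)].append(image_id)
    return index


def query_index(index, query_features):
    """Vote for images whose features agree in both feature spaces.

    query_features: list of (sift_words, color_words) pairs. Each element is
    a tuple of candidate words, so passing more than one candidate per
    dimension acts as a simple form of Multiple Assignment.
    """
    votes = defaultdict(int)
    for sift_words, color_words in query_features:
        for s in sift_words:
            for c in color_words:
                for image_id in index.get((s, c), []):
                    votes[image_id] += 1
    return dict(votes)
```

With a single candidate per dimension, a query feature only reaches entries that match in both SIFT and color, which is where the gain in precision comes from. Adding a second color candidate widens the lookup and recovers some of the lost recall.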
Adding Cues to Binary Feature Descriptors for Visual Place Recognition
In this paper we propose an approach to embed continuous and selector cues in
binary feature descriptors used for visual place recognition. The embedding is
achieved by extending each feature descriptor with a binary string that encodes
a cue and supports the Hamming distance metric. Augmenting the descriptors in
such a way has the advantage of being transparent to the procedure used to
compare them. We present two concrete applications of our methodology,
demonstrating the two considered types of cues. In addition, we
conducted a broad quantitative and comparative evaluation of these applications,
covering five benchmark datasets and several state-of-the-art image retrieval
approaches in combination with various binary descriptor types.
Comment: 8 pages, 8 figures, source: www.gitlab.com/srrg-software/srrg_bench,
submitted to ICRA 201
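One way to see why appending a cue string can be transparent to Hamming-based matching is a unary (thermometer) code, in which the Hamming distance between two codes equals the difference of their quantized levels. The sketch below assumes such a code for a continuous cue normalized to [0, 1]; the encoding, the function names, and the bit width are illustrative assumptions and are not necessarily the scheme used in the paper.

```python
def unary_code(value: float, n_bits: int) -> int:
    """Encode a value in [0, 1] as an integer with the lowest `level` bits set."""
    clamped = min(max(value, 0.0), 1.0)
    level = round(clamped * n_bits)
    return (1 << level) - 1


def augment(descriptor: int, cue_value: float, n_bits: int = 8) -> int:
    """Append the cue bits to an integer-packed binary descriptor."""
    return (descriptor << n_bits) | unary_code(cue_value, n_bits)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-packed bit strings."""
    return bin(a ^ b).count("1")
```

Under this assumption, the distance between two augmented descriptors is the descriptor distance plus the absolute difference of the cue levels, so any matcher that already relies on Hamming distance can be used unchanged.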
A location-aware embedding technique for accurate landmark recognition
The current state of the research in landmark recognition highlights the good
accuracy which can be achieved by embedding techniques, such as Fisher vector
and VLAD. None of these techniques exploits spatial information, i.e., they
consider all the features and the corresponding descriptors without embedding
their location in the image. This paper presents a new variant of the
well-known VLAD (Vector of Locally Aggregated Descriptors) embedding technique
which accounts, to a certain degree, for the location of features. The driving
motivation comes from the observation that, usually, the most interesting part
of an image (e.g., the landmark to be recognized) is near the center of
the image, while the features at the borders are irrelevant and do
not depend on the landmark. The proposed variant, called locVLAD (location-aware
VLAD), computes the mean of two global descriptors: the VLAD computed on
the entire original image, and the one computed on a cropped image which
removes a certain percentage of the image borders. This simple variant shows an
accuracy greater than the existing state-of-the-art approach. Experiments are
conducted on two public datasets (ZuBuD and Holidays) which are used both for
training and testing. Moreover, a more balanced version of ZuBuD is proposed.
Comment: 6 pages, 5 figures, ICDSC 201
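The averaging step described for locVLAD can be sketched in a few lines. This illustration makes several assumptions that are not in the abstract: the crop is approximated by keeping only the keypoints that fall inside the central region (rather than re-extracting features from a cropped image), the VLAD is normalized with a plain L2 norm, and the `border` fraction is a hypothetical parameter.

```python
import math


def vlad(descriptors, centroids):
    """Plain VLAD: sum residuals to the nearest centroid, then L2-normalize."""
    k, d = len(centroids), len(centroids[0])
    acc = [[0.0] * d for _ in range(k)]
    for desc in descriptors:
        nearest = min(
            range(k),
            key=lambda i: sum((x - c) ** 2 for x, c in zip(desc, centroids[i])),
        )
        for j in range(d):
            acc[nearest][j] += desc[j] - centroids[nearest][j]
    flat = [v for row in acc for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


def loc_vlad(features, centroids, width, height, border=0.1):
    """Mean of the VLAD over all features and the VLAD over central features.

    features: list of (x, y, descriptor) with pixel coordinates.
    border:   fraction of each image side treated as border (assumed value).
    """
    x0, x1 = border * width, (1 - border) * width
    y0, y1 = border * height, (1 - border) * height
    all_desc = [f[2] for f in features]
    central = [f[2] for f in features if x0 <= f[0] <= x1 and y0 <= f[1] <= y1]
    full = vlad(all_desc, centroids)
    cropped = vlad(central, centroids)
    return [(a + b) / 2 for a, b in zip(full, cropped)]
```

Because central features contribute to both terms and border features to only one, the mean down-weights the borders without discarding them entirely.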
Hamming Embedding and Weak Geometry Consistency for Large Scale Image Search - extended version
This technical report presents and extends a recent paper we have proposed for large scale image search. State-of-the-art methods build on the bag-of-features image representation. We first analyze bag-of-features in the framework of approximate nearest neighbor search. This shows the sub-optimality of such a representation for matching descriptors and leads us to derive a more precise representation based on 1) Hamming embedding (HE) and 2) weak geometric consistency constraints (WGC). HE provides binary signatures that refine the matching based on visual words. WGC filters out matching descriptors that are not consistent in terms of angle and scale. HE and WGC are integrated within an inverted file and are efficiently exploited for all images, even in the case of very large datasets. Experiments performed on a dataset of one million images show a significant improvement due to the binary signature and the weak geometric consistency constraints, as well as their efficiency. Estimation of the full geometric transformation, i.e., a re-ranking step on a short list of images, is complementary to our weak geometric consistency constraints and further improves the accuracy.
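The weak geometric consistency idea, filtering matches whose angle and scale differences disagree with the dominant transformation, can be sketched as histogram voting. The sketch below is an assumption-laden illustration: the bin counts, the use of rounded log2 scale ratios, and the choice to score an image by the smaller of the two histogram peaks are choices made here for clarity, and `wgc_score` is a hypothetical name.

```python
import math
from collections import Counter


def wgc_score(matches, angle_bins=8):
    """Score one database image from its tentative descriptor matches.

    matches: list of (q_angle, q_scale, db_angle, db_scale) tuples, with
             angles in radians and strictly positive scales.
    """
    if not matches:
        return 0
    angle_hist = Counter()
    scale_hist = Counter()
    for q_angle, q_scale, db_angle, db_scale in matches:
        # Quantize the rotation between the two keypoints.
        delta_angle = (db_angle - q_angle) % (2 * math.pi)
        bin_index = int(delta_angle / (2 * math.pi) * angle_bins) % angle_bins
        angle_hist[bin_index] += 1
        # Quantize the scale change on a log2 axis.
        scale_hist[round(math.log2(db_scale / q_scale))] += 1
    # Matches only count if they agree with a dominant rotation AND scaling.
    return min(max(angle_hist.values()), max(scale_hist.values()))
```

An image whose matches all share roughly the same rotation and scale change keeps its full vote count, while an image with the same number of geometrically scattered matches is scored much lower.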