An Efficient Index for Visual Search in Appearance-based SLAM
Vector-quantization can be a computationally expensive step in visual
bag-of-words (BoW) search when the vocabulary is large. A BoW-based appearance
SLAM needs to tackle this problem for efficient real-time operation. We
propose an effective method to speed up the vector-quantization process in
BoW-based visual SLAM. We employ a graph-based nearest neighbor search (GNNS)
algorithm for this purpose, and experimentally show that it can outperform the
state-of-the-art. The graph-based search structure used in GNNS can efficiently
be integrated into the BoW model and the SLAM framework. The graph-based index,
which is a k-NN graph, is built over the vocabulary words and can be extracted
from the BoW's vocabulary construction procedure, by adding one iteration to
the k-means clustering, which adds little extra cost. Moreover, because the
images acquired for appearance-based SLAM are sequential, the GNNS search can
be initialized judiciously, which considerably increases the speedup of the
quantization process.
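
The idea of quantizing a descriptor by walking a k-NN graph built over the vocabulary words can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`build_knn_graph`, `gnns_quantize`), the brute-force graph construction, and the plain greedy hill climbing are assumptions made for illustration. The abstract derives the graph from an extra k-means iteration instead, and it does not specify how the starting node is chosen; the `start` argument here simply stands in for that choice (for example, the word assigned to a similar feature in the previous frame).

```python
import math

def build_knn_graph(words, k):
    """Brute-force k-NN graph over vocabulary words (illustrative only;
    the abstract obtains the graph from the k-means procedure instead)."""
    graph = []
    for i, w in enumerate(words):
        others = [j for j in range(len(words)) if j != i]
        others.sort(key=lambda j: math.dist(w, words[j]))
        graph.append(others[:k])
    return graph

def gnns_quantize(query, words, graph, start):
    """Greedy graph search: hop to the neighbor closest to the query
    until no neighbor is closer than the current word."""
    current = start
    best = math.dist(query, words[current])
    while True:
        nxt = min(graph[current], key=lambda j: math.dist(query, words[j]))
        d = math.dist(query, words[nxt])
        if d >= best:          # local minimum on the graph: stop
            return current
        current, best = nxt, d

# Toy vocabulary: ten 1-D "words" at 0.0, 1.0, ..., 9.0.
words = [(float(i),) for i in range(10)]
graph = build_knn_graph(words, k=2)
word_id = gnns_quantize((6.2,), words, graph, start=0)
```

A start node close to the answer shortens the walk, which is the intuition behind seeding the search from the previous frame. On real descriptors a greedy walk may stop at a local minimum, so the result is approximate.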
Learning to Navigate the Energy Landscape
In this paper, we present a novel and efficient architecture for addressing
computer vision problems that use 'Analysis by Synthesis'. Analysis by
synthesis involves the minimization of the reconstruction error which is
typically a non-convex function of the latent target variables.
State-of-the-art methods adopt a hybrid scheme where discriminatively trained
predictors like Random Forests or Convolutional Neural Networks are used to
initialize local search algorithms. While these methods have been shown to
produce promising results, they often get stuck in local optima. Our method
goes beyond the conventional hybrid architecture by not only proposing multiple
accurate initial solutions but by also defining a navigational structure over
the solution space that can be used for extremely efficient gradient-free local
search. We demonstrate the efficacy of our approach on the challenging problem
of RGB Camera Relocalization. To make the RGB camera relocalization problem
particularly challenging, we introduce a new dataset of 3D environments which
are significantly larger than those found in other publicly-available datasets.
Our experiments reveal that the proposed method is able to achieve
state-of-the-art camera relocalization results. We also demonstrate the
generalizability of our approach on Hand Pose Estimation and Image Retrieval
tasks.
Offline Pseudo Relevance Feedback for Efficient and Effective Single-pass Dense Retrieval
Dense retrieval has made significant advancements in information retrieval
(IR) by achieving high levels of effectiveness while maintaining online
efficiency during a single-pass retrieval process. However, the application of
pseudo relevance feedback (PRF) to further enhance retrieval effectiveness
results in a doubling of online latency. To address this challenge, this paper
presents a single-pass dense retrieval framework that shifts the PRF process
offline by using pre-generated pseudo-queries. As a result,
online retrieval is reduced to a single matching with the pseudo-queries, hence
providing faster online retrieval. The effectiveness of the proposed approach
is evaluated on the standard TREC DL and HARD datasets, and the results
demonstrate its promise. Our code is openly available at
https://github.com/Rosenberg37/OPRF.
Comment: Accepted at SIGIR202
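
The offline/online split described in this abstract can be sketched as follows. This is a hedged illustration under stated assumptions, not the code from the linked repository: the helper names (`prf_offline`, `build_cache`, `retrieve_online`), the dot-product scoring, the averaging-based feedback step, and the reciprocal-rank merging are all hypothetical choices made to keep the example self-contained.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rank(query_vec, vecs):
    """Return the ids in `vecs` sorted by descending dot-product score."""
    return sorted(vecs, key=lambda i: dot(query_vec, vecs[i]), reverse=True)

def prf_offline(pq_vec, doc_vecs, fb_docs, top_k):
    """Two-pass retrieval for one pseudo-query, run offline (assumed scheme)."""
    feedback = rank(pq_vec, doc_vecs)[:fb_docs]           # first pass
    refined = [                                           # average query with feedback docs
        (pq_vec[i] + sum(doc_vecs[d][i] for d in feedback)) / (1 + len(feedback))
        for i in range(len(pq_vec))
    ]
    return rank(refined, doc_vecs)[:top_k]                # second pass

def build_cache(pseudo_queries, doc_vecs, fb_docs=1, top_k=2):
    """Offline: store the PRF result list of every pseudo-query."""
    return {pq: prf_offline(v, doc_vecs, fb_docs, top_k)
            for pq, v in pseudo_queries.items()}

def retrieve_online(query_vec, pseudo_queries, cache, n_matches=1):
    """Online: one matching against pseudo-queries, then merge cached lists."""
    scores = {}
    for pq in rank(query_vec, pseudo_queries)[:n_matches]:
        weight = dot(query_vec, pseudo_queries[pq])
        for position, doc_id in enumerate(cache[pq]):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (position + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Toy data: 2-D embeddings for three documents and two pseudo-queries.
docs = {"d1": (1.0, 0.0), "d2": (0.9, 0.1), "d3": (0.0, 1.0)}
pseudo = {"pA": (1.0, 0.0), "pB": (0.0, 1.0)}
cache = build_cache(pseudo, docs)
results = retrieve_online((1.0, 0.05), pseudo, cache)
```

The point of the sketch is that the second retrieval pass happens only inside `build_cache`, so the online path performs a single matching step against the pseudo-queries.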
Self-supervised Vector-Quantization in Visual SLAM using Deep Convolutional Autoencoders
In this paper, we introduce AE-FABMAP, a new self-supervised bag-of-words-based
SLAM method. We also present AE-ORB-SLAM, a modified version of the current
state-of-the-art BoW-based path planning algorithm. That is, we use a deep
convolutional autoencoder to find loop closures. In the context of bag-of-words
visual SLAM, vector quantization (VQ) is considered the most time-consuming
part of the SLAM procedure, and it is usually performed in the offline phase of
the SLAM algorithm using unsupervised algorithms such as Kmeans++. We address
the loop closure detection part of BoW-based SLAM methods in a self-supervised
manner by integrating an autoencoder to perform vector quantization. This
approach can increase the accuracy of large-scale SLAM, where plenty of
unlabeled data is available. The main advantage of a self-supervised approach
is that it can help reduce the amount of labeling. Furthermore, experiments
show that autoencoders are far more efficient than semi-supervised methods such
as graph convolutional neural networks in terms of speed and memory
consumption. We integrated this method into FABMAP2, the state-of-the-art
long-range appearance-based visual bag-of-words SLAM, and into ORB-SLAM.
Experiments on indoor and outdoor datasets demonstrate the superiority of this
approach over regular FABMAP2 in all cases; it achieves higher accuracy in loop
closure detection and trajectory generation.
Semi-supervised Vector-Quantization in Visual SLAM using HGCN
In this paper, two semi-supervised appearance-based loop closure detection
techniques, HGCN-FABMAP and HGCN-BoW, are introduced. Furthermore, an extension
to the current state-of-the-art localization SLAM algorithm, ORB-SLAM, is
presented. The proposed HGCN-FABMAP method is implemented in an offline manner,
incorporating a Bayesian probabilistic schema for loop detection decision
making. Specifically, we let a Hyperbolic Graph Convolutional Neural Network
(HGCN) operate over the SURF feature graph space and perform the vector
quantization part of the SLAM procedure. This part was previously performed in
an unsupervised manner using algorithms such as HKmeans and kmeans++. The main
advantage of using HGCN is that it scales linearly in the number of graph
edges. Experimental results show that the HGCN-FABMAP algorithm needs far more
cluster centroids than HGCN-ORB; otherwise it fails to detect loop closures.
Therefore, we consider HGCN-ORB to be more efficient in terms of memory
consumption. We also conclude the superiority of HGCN-BoW and HGCN-FABMAP with
respect to other algorithms.