Modelling uncertainty in deep learning for camera relocalization
We present a robust and real-time monocular six degree of freedom visual
relocalization system. We use a Bayesian convolutional neural network to
regress the 6-DOF camera pose from a single RGB image. It is trained in an
end-to-end manner with no need of additional engineering or graph optimisation.
The algorithm can operate indoors and outdoors in real time, taking under 6ms
to compute. It obtains approximately 2m and 6 degrees accuracy for very large
scale outdoor scenes and 0.5m and 10 degrees accuracy indoors. Using a Bayesian
convolutional neural network implementation we obtain an estimate of the
model's relocalization uncertainty and improve state of the art localization
accuracy on a large scale outdoor dataset. We leverage the uncertainty measure
to estimate metric relocalization error and to detect the presence or absence
of the scene in the input image. We show that the model's uncertainty is caused
by images being dissimilar to the training dataset in either pose or
appearance.
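As a rough illustration of how a relocalization uncertainty could be read off a Bayesian network, the sketch below assumes the network is evaluated several times per image with stochastic predictions. The abstract does not state the sampling scheme, so that is an assumption. It summarizes the sampled positions by their mean and by the trace of their sample covariance. The function name `summarize_pose_samples`, the sample values, and the choice of trace as the uncertainty score are all hypothetical.

```python
def summarize_pose_samples(positions):
    """Return (mean position, uncertainty) for a list of sampled 3D positions.

    Uncertainty is taken to be the trace of the sample covariance, i.e. the
    sum of the per-axis variances. This is an assumed, illustrative choice.
    """
    n = len(positions)
    if n < 2:
        raise ValueError("need at least two samples to estimate a variance")
    mean = [sum(p[i] for p in positions) / n for i in range(3)]
    trace = sum(
        sum((p[i] - mean[i]) ** 2 for p in positions) / (n - 1)
        for i in range(3)
    )
    return mean, trace


# Hypothetical position samples (metres) from repeated stochastic passes.
tight = [(1.0, 2.0, 0.5), (1.1, 2.0, 0.5), (0.9, 2.0, 0.5)]
spread = [(1.0, 2.0, 0.5), (3.0, 0.0, 0.5), (-1.0, 4.0, 0.5)]

mean_tight, unc_tight = summarize_pose_samples(tight)
mean_spread, unc_spread = summarize_pose_samples(spread)
```

Under these assumptions, samples that agree closely give a small score and samples that disagree give a large one, which is the kind of signal that could be thresholded to flag images unlike the training data.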
A hybrid probabilistic model for camera relocalization
We present a hybrid deep learning method for modelling the uncertainty of camera relocalization from a single RGB image. The proposed system leverages the discriminative deep image representation from a convolutional neural network, and uses Gaussian Process regressors to generate the probability distribution of the six degree of freedom (6DoF) camera pose in an end-to-end fashion. This results in a network that can generate uncertainties over its inferences with no need to sample many times. Furthermore, we show that our objective based on KL divergence reduces the dependence on the choice of hyperparameters. The results show that, compared to the state-of-the-art Bayesian camera relocalization method, our model produces comparable localization uncertainty and improves the system efficiency significantly, without loss of accuracy.
Ming Cai, Chunhua Shen, Ian Rei
Real-Time 6DOF Pose Relocalization for Event Cameras with Stacked Spatial LSTM Networks
We present a new method to relocalize the 6DOF pose of an event camera solely
based on the event stream. Our method first creates the event image from a list
of events that occur in a very short time interval, then a Stacked Spatial
LSTM Network (SP-LSTM) is used to learn the camera pose. Our SP-LSTM is
composed of a CNN to learn deep features from the event images and a stack of
LSTMs to learn spatial dependencies in the image feature space. We show that the
spatial dependency plays an important role in the relocalization task and the
SP-LSTM can effectively learn this information. The experimental results on a
publicly available dataset show that our approach generalizes well and
outperforms recent methods by a substantial margin. Overall, our proposed
method reduces the position error and the orientation error by approximately
6 times and 3 times, respectively, compared to the current state of the art.
The source code and trained models will be released.
Comment: 7 pages, 5 figures
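The first step described above, building an event image from the events in a short time window, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the event tuple layout `(x, y, timestamp, polarity)`, the function name `events_to_image`, and the choice to sum signed polarities per pixel are assumptions.

```python
def events_to_image(events, width, height):
    """Accumulate events into a height x width image (a list of rows).

    Each event is (x, y, timestamp, polarity) with polarity +1 or -1.
    Summing signed polarities per pixel is an assumed encoding; events
    that fall outside the image bounds are ignored.
    """
    image = [[0] * width for _ in range(height)]
    for x, y, _timestamp, polarity in events:
        if 0 <= x < width and 0 <= y < height:
            image[y][x] += polarity
    return image


# Hypothetical events from one short time window; the last one is out of bounds.
events = [(0, 0, 0.001, 1), (0, 0, 0.002, 1), (2, 1, 0.003, -1), (5, 5, 0.004, 1)]
img = events_to_image(events, width=3, height=2)
```

The resulting 2D array is the kind of input a CNN feature extractor could consume before any recurrent layers.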
Scene Coordinate Regression with Angle-Based Reprojection Loss for Camera Relocalization
Image-based camera relocalization is an important problem in computer vision
and robotics. Recent works utilize convolutional neural networks (CNNs) to
regress for pixels in a query image their corresponding 3D world coordinates in
the scene. The final pose is then solved via a RANSAC-based optimization scheme
using the predicted coordinates. Usually, the CNN is trained with ground truth
scene coordinates, but it has also been shown that the network can discover 3D
scene geometry automatically by minimizing single-view reprojection loss.
However, due to the deficiencies of the reprojection loss, the network needs to
be carefully initialized. In this paper, we present a new angle-based
reprojection loss, which resolves the issues of the original reprojection loss.
With this new loss function, the network can be trained without careful
initialization, and the system achieves more accurate results. The new loss
also enables us to utilize available multi-view constraints, which further
improve performance.
Comment: ECCV 2018 Workshop (Geometry Meets Deep Learning)
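To make the idea of an angle-based reprojection error concrete, the sketch below measures the angle between the viewing ray through a pixel and the ray from the camera centre to a predicted 3D point expressed in camera coordinates. The paper's exact loss may differ. The pinhole intrinsics `fx`, `fy`, `cx`, `cy`, the function name `angle_loss`, and the example values are assumptions.

```python
import math


def angle_loss(pixel, point_cam, fx, fy, cx, cy):
    """Angle in radians between a pixel's viewing ray and the ray to a 3D point.

    `point_cam` is the predicted scene coordinate already transformed into the
    camera frame (assumed to have been done with the known training pose).
    """
    u, v = pixel
    # Back-project the pixel to a ray direction using a pinhole camera model.
    ray = ((u - cx) / fx, (v - cy) / fy, 1.0)
    dot = sum(a * b for a, b in zip(ray, point_cam))
    norm = math.sqrt(sum(a * a for a in ray)) * math.sqrt(sum(b * b for b in point_cam))
    if norm == 0.0:
        raise ValueError("point_cam must not be the camera centre")
    cosine = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.acos(cosine)


# Hypothetical intrinsics; the pixel is the principal point.
K = dict(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
in_front = angle_loss((320.0, 240.0), (0.0, 0.0, 2.0), **K)
behind = angle_loss((320.0, 240.0), (0.0, 0.0, -2.0), **K)
oblique = angle_loss((320.0, 240.0), (1.0, 0.0, 1.0), **K)
```

In this sketch the angle stays well defined even when the predicted point lies behind the camera, a case where a pixel-distance error based on perspective division is not meaningful.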
Exploiting Points and Lines in Regression Forests for RGB-D Camera Relocalization
Camera relocalization plays a vital role in many robotics and computer vision
tasks, such as global localization, recovery from tracking failure and loop
closure detection. Recent random forests based methods exploit randomly sampled
pixel comparison features to predict 3D world locations for 2D image locations
to guide the camera pose optimization. However, these image features are only
sampled randomly in the images, without considering the spatial structures or
geometric information, leading to large errors or failure cases in
poorly textured areas or under motion blur. Line segment features are
more robust in these environments. In this work, we propose to jointly exploit
points and lines within the framework of uncertainty driven regression forests.
The proposed approach is thoroughly evaluated on three publicly available
datasets against several strong state-of-the-art baselines in terms of several
different error metrics. Experimental results prove the efficacy of our method,
showing superior or on-par state-of-the-art performance.
Comment: published as a conference paper at 2018 IEEE/RSJ International
Conference on Intelligent Robots and Systems (IROS)