Data Efficient Visual Place Recognition Using Extremely JPEG-Compressed Images
Visual Place Recognition (VPR) is the ability of a robotic platform to
correctly interpret visual stimuli from its on-board cameras in order to
determine whether it is currently located in a previously visited place,
despite changes in viewpoint, illumination and appearance. JPEG is a
widely used image compression standard that is capable of significantly
reducing the size of an image at the cost of image clarity. For applications
where several robotic platforms are simultaneously deployed, the visual data
gathered must be transmitted remotely between each robot. Hence, JPEG
compression can be employed to drastically reduce the amount of data
transmitted over a communication channel, since working with limited bandwidth
can prove challenging for VPR. However, the effects of JPEG
compression on the performance of current VPR techniques have not been
previously studied. For this reason, this paper presents an in-depth study of
JPEG compression in VPR related scenarios. We use a selection of
well-established VPR techniques on 8 datasets with various amounts of
compression applied. We show that introducing compression drastically reduces
VPR performance, especially at higher compression levels. To overcome the
negative effects of JPEG compression on VPR performance, we present a
fine-tuned CNN optimized for JPEG-compressed data, and show that it performs
more consistently under the image transformations introduced by extreme JPEG
compression.
Comment: 8 pages, 8 figures
Panoramic Annular Localizer: Tackling the Variation Challenges of Outdoor Localization Using Panoramic Annular Images and Active Deep Descriptors
Visual localization is the attractive problem of estimating the camera
location of a query image by matching it against database images. It is a crucial
task for various applications, such as autonomous vehicles, assistive
navigation and augmented reality. The challenging issues of the task lie in
various appearance variations between query and database images, including
illumination variations, dynamic object variations and viewpoint variations. In
order to tackle these challenges, this paper proposes the Panoramic Annular
Localizer, which incorporates a panoramic annular lens and robust deep image
descriptors. The panoramic annular images captured by a single camera are
processed and fed into the NetVLAD network to form the active deep
descriptor, and sequential matching is utilized to generate the localization
result. Experiments carried out on public datasets and in the field validate
the proposed system.
Comment: Accepted by ITSC 201
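A minimal sketch of the sequential-matching step, assuming descriptors (e.g. NetVLAD vectors) have already been extracted; the window size, descriptor dimensionality and dummy data are illustrative assumptions.

    # Minimal sketch of sequential matching over precomputed image descriptors
    # (e.g. NetVLAD vectors); the descriptor extraction itself is assumed.
    import numpy as np

    def sequential_match(query_seq: np.ndarray, database: np.ndarray,
                         win: int = 5) -> int:
        """Return the database index whose window of descriptors best matches
        the query sequence (smaller summed distance = better)."""
        best_idx, best_cost = -1, np.inf
        for i in range(len(database) - win + 1):
            cost = np.linalg.norm(database[i:i + win] - query_seq, axis=1).sum()
            if cost < best_cost:
                best_idx, best_cost = i, cost
        return best_idx

    # Hypothetical 4096-D descriptors, L2-normalised, one per database frame.
    db = np.random.randn(1000, 4096).astype(np.float32)
    db /= np.linalg.norm(db, axis=1, keepdims=True)
    query = db[420:425] + 0.01 * np.random.randn(5, 4096).astype(np.float32)
    print(sequential_match(query, db))  # expected to print ~420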
LookUP: Vision-Only Real-Time Precise Underground Localisation for Autonomous Mining Vehicles
A key capability for autonomous underground mining vehicles is real-time
accurate localisation. While significant progress has been made, currently
deployed systems have several limitations ranging from dependence on costly
additional infrastructure to failure of both visual and range sensor-based
techniques in highly aliased or visually challenging environments. In our
previous work, we presented a lightweight coarse vision-based localisation
system that could map and then localise to within a few metres in an
underground mining environment. However, this level of precision is
insufficient for providing a cheaper, more reliable vision-based automation
alternative to current range sensor-based systems. Here we present a new
precision localisation system dubbed "LookUP", which learns a
neural-network-based pixel sampling strategy for estimating homographies based
on ceiling-facing cameras without requiring any manual labelling. This new
system runs in real time on limited computational resources and is demonstrated
on two different underground mine sites, achieving ~5 frames per second and a
much improved average localisation error of ~1.2 metres.
Comment: 7 pages, 7 figures, accepted for IEEE ICRA 201
Training a Convolutional Neural Network for Appearance-Invariant Place Recognition
Place recognition is one of the most challenging problems in computer vision,
and has become a key component of mobile robotics and autonomous driving
applications for performing loop closure in visual SLAM systems. Moreover, the
difficulty of recognizing a revisited location increases with appearance
changes caused, for instance, by weather or illumination variations, which
hinders the long-term application of such algorithms in real environments. In
this paper we present a convolutional neural network (CNN), trained for the
first time with the purpose of recognizing revisited locations under severe
appearance changes, which maps images to a low dimensional space where
Euclidean distances represent place dissimilarity. In order for the network to
learn the desired invariances, we train it with triplets of images selected
from datasets which present a challenging variability in visual appearance. The
triplets are selected in such a way that two samples are from the same location
and the third one is taken from a different place. We validate our system
through extensive experimentation, demonstrating better performance than
state-of-the-art algorithms on a number of popular datasets.
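A minimal PyTorch sketch of this triplet objective; the embedding network, margin value and dummy batches are illustrative assumptions rather than the paper's architecture or settings.

    # Minimal sketch of training an embedding with a triplet margin loss, so
    # that Euclidean distances in the output space reflect place dissimilarity.
    import torch
    import torch.nn as nn

    embed = nn.Sequential(  # placeholder CNN mapping images to a low-dim space
        nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 64),
    )
    loss_fn = nn.TripletMarginLoss(margin=0.5)  # Euclidean distance by default
    opt = torch.optim.Adam(embed.parameters(), lr=1e-4)

    # anchor/positive: same place, different appearance; negative: other place
    anchor = torch.randn(8, 3, 128, 128)    # dummy batches in place of images
    positive = torch.randn(8, 3, 128, 128)
    negative = torch.randn(8, 3, 128, 128)

    loss = loss_fn(embed(anchor), embed(positive), embed(negative))
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(float(loss))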
Addressing Challenging Place Recognition Tasks using Generative Adversarial Networks
Place recognition is an essential component of Simultaneous Localization And
Mapping (SLAM). Under severe appearance change, reliable place recognition is a
difficult perception task since the same place is perceptually very different
in the morning, at night, or over different seasons. This work addresses place
recognition as a domain translation task. Using a pair of coupled Generative
Adversarial Networks (GANs), we show that it is possible to generate the
appearance of one domain (such as summer) from another (such as winter) without
requiring image-to-image correspondences across the domains. Mapping between
domains is learned from sets of images in each domain without knowing the
instance-to-instance correspondence by enforcing a cyclic consistency
constraint. In the process, meaningful feature spaces are learned for each
domain, and distances in these spaces can be used for the task of place recognition.
Experiments show that learned features correspond to visual similarity and can
be effectively used for place recognition across seasons.
Comment: Accepted for publication in IEEE International Conference on Robotics and Automation (ICRA), 201
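A minimal PyTorch sketch of the cycle-consistency term alone (the adversarial losses of the coupled GANs are omitted for brevity); the toy generators and unpaired batches are placeholder assumptions.

    # Minimal sketch of the cycle-consistency constraint that lets coupled
    # GANs learn cross-season mappings without paired images; G_ab / G_ba
    # are placeholder generators.
    import torch
    import torch.nn as nn

    def make_generator() -> nn.Module:
        # Placeholder image-to-image network (a real one would be much deeper).
        return nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
        )

    G_ab, G_ba = make_generator(), make_generator()  # summer<->winter mappings
    l1 = nn.L1Loss()

    summer = torch.rand(4, 3, 128, 128) * 2 - 1  # dummy unpaired batches
    winter = torch.rand(4, 3, 128, 128) * 2 - 1

    # An image translated to the other domain and back should match itself.
    cycle_loss = l1(G_ba(G_ab(summer)), summer) + l1(G_ab(G_ba(winter)), winter)
    print(float(cycle_loss))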