Adversarial Training for Adverse Conditions: Robust Metric Localisation using Appearance Transfer
We present a method of improving visual place recognition and metric
localisation under very strong appearance change. We learn an invertible
generator that can transform the conditions of images, e.g. from day to
night or summer to winter. This image-transforming filter is explicitly
designed to aid and abet feature-matching using a new loss based on SURF
detector and dense descriptor maps. A network is trained to output synthetic
images optimised for feature matching given only an input RGB image, and these
generated images are used to localise the robot against a previously built map
using traditional sparse matching approaches. We benchmark our results using
multiple traversals of the Oxford RobotCar Dataset over a year-long period,
using one traversal as a map and the other to localise. We show that this
method significantly improves place recognition and localisation under changing
and adverse conditions, while reducing the number of mapping runs needed to
successfully achieve reliable localisation. Comment: Accepted at ICRA201
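A minimal sketch of the localisation stage described above, assuming OpenCV is available: a hypothetical `transfer_appearance` function stands in for the paper's learned generator, and ORB is used in place of SURF (SURF requires the non-free opencv-contrib build). This illustrates sparse matching of a transformed live image against a single map image; it is not the authors' implementation.

```python
# Sketch: appearance transfer followed by sparse feature matching against a map image.
import cv2
import numpy as np

def transfer_appearance(img_bgr):
    # Placeholder: the real system runs a trained image-to-image network here.
    return img_bgr

def localise_against_map(live_bgr, map_bgr, min_inliers=30):
    live = transfer_appearance(live_bgr)
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(cv2.cvtColor(live, cv2.COLOR_BGR2GRAY), None)
    kp2, des2 = orb.detectAndCompute(cv2.cvtColor(map_bgr, cv2.COLOR_BGR2GRAY), None)
    if des1 is None or des2 is None:
        return None
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
    if len(matches) < min_inliers:
        return None
    src = np.float32([kp1[m.queryIdx].pt for m in matches])
    dst = np.float32([kp2[m.trainIdx].pt for m in matches])
    # Robust geometric check; a full system would estimate a metric pose instead.
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return H if mask is not None and int(mask.sum()) >= min_inliers else None
```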
Visual 3-D SLAM from UAVs
The aim of the paper is to present, test and discuss the implementation of Visual SLAM techniques on images taken from Unmanned Aerial Vehicles (UAVs) outdoors, in partially structured environments. Every stage of the whole process is discussed in order to obtain more accurate localization and mapping from UAV flights. First, the issues related to the visual features of objects in the scene, their distance to the UAV, and the image acquisition system and its calibration are evaluated for improving the whole process. Other important issues considered relate to the image processing techniques, such as interest point detection, the matching procedure and the scaling factor. The whole system has been tested using the COLIBRI mini UAV in partially structured environments. The results obtained for localization, tested against the GPS information of the flights, show that Visual SLAM delivers reliable localization and mapping that makes it suitable for some outdoor applications when flying UAVs.
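As a rough illustration of the interest-point matching and relative-pose estimation steps mentioned above (not the COLIBRI system itself), the following OpenCV sketch assumes a calibrated camera matrix K; the recovered translation is only known up to scale, which is why a separate scale factor (e.g. from GPS or an altimeter) is needed in practice.

```python
# Sketch: ORB matching between two frames and up-to-scale relative pose recovery.
import cv2
import numpy as np

def relative_pose(img1_gray, img2_gray, K):
    orb = cv2.ORB_create(nfeatures=1500)
    kp1, des1 = orb.detectAndCompute(img1_gray, None)
    kp2, des2 = orb.detectAndCompute(img2_gray, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    # t has unit norm: metric scale must come from another sensor or known geometry.
    return R, t
```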
Multimedia ontology matching by using visual and textual modalities
Ontologies have been intensively applied for improving multimedia search and retrieval by providing explicit meaning to visual content. Several multimedia ontologies have been recently proposed as knowledge models suitable for narrowing the well-known semantic gap and for enabling the semantic interpretation of images. Since these ontologies have been created in different application contexts, establishing links between them, a task known as ontology matching, promises to fully unlock their potential in support of multimedia search and retrieval. This paper proposes and compares empirically two extensional ontology matching techniques applied to an important semantic image retrieval issue: automatically associating common-sense knowledge to multimedia concepts. First, we extend a previously introduced textual concept matching approach to use both textual and visual representations of images. In addition, a novel matching technique based on a multi-modal graph is proposed. We argue that the textual and visual modalities have to be seen as complementary rather than as exclusive sources of extensional information in order to improve the efficiency of the application of an ontology matching approach in the multimedia domain. An experimental evaluation is included in the paper.
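The following Python sketch illustrates the general idea of extensional matching with complementary modalities, under the assumption that per-image textual and visual embeddings are already available; the data layout, weighting, and aggregation are illustrative and not the paper's actual techniques.

```python
# Sketch: link concepts of two ontologies by comparing their image instances
# through a weighted combination of textual and visual similarity.
import numpy as np

def concept_prototype(image_vectors):
    # Aggregate instance-level vectors into one extensional concept vector.
    m = np.mean(image_vectors, axis=0)
    return m / (np.linalg.norm(m) + 1e-12)

def match_concepts(concepts_a, concepts_b, alpha=0.5):
    """concepts_*: concept name -> {'text': (n, d) array, 'visual': (n, k) array}.
    Returns, for each concept in A, the best-scoring counterpart in B."""
    links = {}
    for name_a, inst_a in concepts_a.items():
        ta, va = concept_prototype(inst_a["text"]), concept_prototype(inst_a["visual"])
        best, best_score = None, -1.0
        for name_b, inst_b in concepts_b.items():
            tb, vb = concept_prototype(inst_b["text"]), concept_prototype(inst_b["visual"])
            score = alpha * float(ta @ tb) + (1 - alpha) * float(va @ vb)
            if score > best_score:
                best, best_score = name_b, score
        links[name_a] = (best, best_score)
    return links
```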
On the use of orientation filters for 3D reconstruction in event-driven stereo vision
The recently developed Dynamic Vision Sensors (DVS) sense visual information asynchronously and code it into trains of events with sub-microsecond temporal resolution. This high temporal precision makes the output of these sensors especially suited for dynamic 3D visual reconstruction, by matching corresponding events generated by two different sensors in a stereo setup. This paper explores the use of Gabor filters to extract information about the orientation of the object edges that produce the events, therefore increasing the number of constraints applied to the matching algorithm. This strategy provides more reliably matched pairs of events, improving the final 3D reconstruction. Funding: ERANET PRI-PIMCHI-2011-0768; Ministerio de Economía y Competitividad TEC2009-10639-C04-01, TEC2012-37868-C04-01; Junta de Andalucía TIC-609
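A small OpenCV sketch of the orientation cue, assuming events have first been accumulated into a frame; the filter-bank parameters are illustrative rather than those used in the paper.

```python
# Sketch: Gabor filter bank over an accumulated event frame, keeping the
# dominant orientation per pixel as an extra matching constraint.
import cv2
import numpy as np

def orientation_map(event_frame, n_orient=4, ksize=15, sigma=3.0, lambd=8.0):
    responses = []
    for i in range(n_orient):
        theta = i * np.pi / n_orient
        kern = cv2.getGaborKernel((ksize, ksize), sigma, theta, lambd, gamma=0.5)
        responses.append(cv2.filter2D(event_frame.astype(np.float32), cv2.CV_32F, kern))
    stack = np.stack(responses, axis=0)
    # Index of the strongest-responding orientation at each pixel.
    return np.argmax(np.abs(stack), axis=0)
```

Stereo matching can then require that candidate left/right events agree on orientation, in addition to the usual temporal and epipolar constraints.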
Event-driven stereo vision with orientation filters
The recently developed Dynamic Vision Sensors
(DVS) sense dynamic visual information asynchronously and
code it into trains of events with sub-microsecond temporal
resolution. This high temporal precision makes the output of
these sensors especially suited for dynamic 3D visual
reconstruction, by matching corresponding events generated by
two different sensors in a stereo setup. This paper explores the
use of Gabor filters to extract information about the orientation
of the object edges that produce the events, applying the
matching algorithm to the events generated by the Gabor filters
and not to those produced by the DVS. This strategy provides
more reliably matched pairs of events, improving the final 3D
reconstruction. Funding: European Union PRI-PIMCHI-2011-0768; Ministerio de Economía y Competitividad TEC2009-10639-C04-01, TEC2012-37868-C04-01; Junta de Andalucía TIC-609
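The distinguishing step here, matching events produced by the Gabor filters rather than raw DVS events, can be sketched as constraint-based pairing, assuming rectified sensors and events tagged with an orientation channel; the fields and thresholds below are illustrative only.

```python
# Sketch: pair left/right filtered events that share an orientation channel,
# coincide in time, and lie on (approximately) the same epipolar row.
import numpy as np

def match_events(left, right, dt=2e-3, dy=1.0):
    """left/right: arrays of rows (t, x, y, orient_channel); returns index pairs."""
    pairs = []
    for i, (t, x, y, o) in enumerate(left):
        cand = np.where(
            (right[:, 3] == o)                 # same Gabor orientation channel
            & (np.abs(right[:, 0] - t) < dt)   # temporal coincidence
            & (np.abs(right[:, 2] - y) < dy)   # epipolar: same row when rectified
        )[0]
        if cand.size:
            j = cand[np.argmin(np.abs(right[cand, 0] - t))]
            pairs.append((i, int(j)))          # disparity would be x - right[j, 1]
    return pairs
```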
DGC-GNN: Descriptor-free Geometric-Color Graph Neural Network for 2D-3D Matching
Direct matching of 2D keypoints in an input image to a 3D point cloud of the
scene without requiring visual descriptors has garnered increased interest due
to its lower memory requirements, inherent privacy preservation, and reduced
need for expensive 3D model maintenance compared to visual descriptor-based
methods. However, existing algorithms often compromise on performance,
resulting in a significant deterioration compared to their descriptor-based
counterparts. In this paper, we introduce DGC-GNN, a novel algorithm that
employs a global-to-local Graph Neural Network (GNN) that progressively
exploits geometric and color cues to represent keypoints, thereby improving
matching robustness. Our global-to-local procedure encodes both Euclidean and
angular relations at a coarse level, forming the geometric embedding to guide
the local point matching. We evaluate DGC-GNN on both indoor and outdoor
datasets, demonstrating that it not only doubles the accuracy of the
state-of-the-art descriptor-free algorithm but also substantially narrows the
performance gap between descriptor-based and descriptor-free methods. The code
and trained models will be made publicly available.
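As a greatly simplified, hypothetical skeleton of descriptor-free 2D-3D matching in this spirit (PyTorch): points carry only coordinates and colour, small MLPs embed both sides, and a similarity matrix yields soft correspondences. The graph message passing and global-to-local clustering that DGC-GNN actually relies on are omitted.

```python
# Sketch: embed 2D keypoints and 3D points from position + colour only,
# then compute row-wise soft assignments over the 3D points.
import torch
import torch.nn as nn

class PointEncoder(nn.Module):
    def __init__(self, in_dim, dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, dim), nn.ReLU(),
                                 nn.Linear(dim, dim), nn.ReLU(),
                                 nn.Linear(dim, dim))
    def forward(self, x):
        return nn.functional.normalize(self.net(x), dim=-1)

enc2d = PointEncoder(2 + 3)   # (x, y) + RGB
enc3d = PointEncoder(3 + 3)   # (X, Y, Z) + RGB

def soft_matches(kpts2d, pts3d, temperature=0.1):
    f2d, f3d = enc2d(kpts2d), enc3d(pts3d)   # (N, d), (M, d)
    scores = f2d @ f3d.t() / temperature     # cosine-similarity logits
    return scores.softmax(dim=-1)            # each 2D keypoint's distribution over 3D points
```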
SeqNet: Learning Descriptors for Sequence-based Hierarchical Place Recognition
Visual Place Recognition (VPR) is the task of matching current visual imagery
from a camera to images stored in a reference map of the environment. While
initial VPR systems used simple direct image methods or hand-crafted visual
features, recent work has focused on learning more powerful visual features and
further improving performance through either some form of sequential matcher /
filter or a hierarchical matching process. In both cases the performance of the
initial single-image based system is still far from perfect, putting
significant pressure on the sequence matching or (in the case of hierarchical
systems) pose refinement stages. In this paper we present a novel hybrid system
that creates a high-performance initial match hypothesis generator using short
learnt sequential descriptors, which enable selective control of sequential score
aggregation using single-image learnt descriptors. Sequential descriptors are
generated using a temporal convolutional network dubbed SeqNet, encoding short
image sequences using 1-D convolutions, which are then matched against the
corresponding temporal descriptors from the reference dataset to provide an
ordered list of place match hypotheses. We then perform selective sequential
score aggregation using shortlisted single image learnt descriptors from a
separate pipeline to produce an overall place match hypothesis. Comprehensive
experiments on challenging benchmark datasets demonstrate the proposed method
outperforming recent state-of-the-art methods using the same amount of
sequential information. Source code and supplementary material can be found at
https://github.com/oravus/seqNet. Comment: Accepted for publication in IEEE RA-L 2021; includes supplementary material.
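A hedged PyTorch sketch of the sequence-descriptor idea: a single 1-D convolution over a short sequence of single-image descriptors, matched to reference sequence descriptors by cosine similarity. Layer sizes are illustrative, and the selective single-image score-aggregation stage is omitted, so this is an illustration rather than the published SeqNet.

```python
# Sketch: 1-D temporal convolution over image descriptors -> sequence descriptor,
# then ranking of reference places by cosine similarity.
import torch
import torch.nn as nn

class SeqDescriptor(nn.Module):
    def __init__(self, in_dim=4096, out_dim=4096, seq_len=5):
        super().__init__()
        self.conv = nn.Conv1d(in_dim, out_dim, kernel_size=seq_len)
    def forward(self, seq):                      # seq: (batch, seq_len, in_dim)
        x = self.conv(seq.transpose(1, 2))       # -> (batch, out_dim, 1)
        return nn.functional.normalize(x.squeeze(-1), dim=-1)

def rank_place_hypotheses(query_seq, ref_seqs, model):
    q = model(query_seq.unsqueeze(0))            # (1, out_dim)
    r = model(ref_seqs)                          # (num_refs, out_dim)
    sims = (q @ r.t()).squeeze(0)                # cosine similarity per reference place
    return torch.argsort(sims, descending=True)  # ordered list of match hypotheses
```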