Geometry-Aware Learning of Maps for Camera Localization
Maps are a key component in image-based camera localization and visual SLAM
systems: they are used to establish geometric constraints between images,
correct drift in relative pose estimation, and relocalize cameras after lost
tracking. The exact definitions of maps, however, are often
application-specific and hand-crafted for different scenarios (e.g. 3D
landmarks, lines, planes, bags of visual words). We propose to represent maps
as a deep neural net called MapNet, which enables learning a data-driven map
representation. Unlike prior work on learning maps, MapNet exploits cheap and
ubiquitous sensory inputs like visual odometry and GPS in addition to images
and fuses them together for camera localization. Geometric constraints
expressed by these inputs, which have traditionally been used in bundle
adjustment or pose-graph optimization, are formulated as loss terms in MapNet
training and also used during inference. In addition to directly improving
localization accuracy, this allows us to update the MapNet (i.e., maps) in a
self-supervised manner using additional unlabeled video sequences from the
scene. We also propose a novel parameterization for camera rotation which is
better suited for deep-learning based camera pose regression. Experimental
results on both the indoor 7-Scenes dataset and the outdoor Oxford RobotCar
dataset show significant performance improvement over prior work. The MapNet
project webpage is https://goo.gl/mRB3Au.
Comment: CVPR 2018 camera-ready paper + supplementary material
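The idea of turning geometric constraints into loss terms can be illustrated with a minimal sketch. It is not the MapNet implementation: the function names, the `vo_weight` parameter, and the use of a plain element-wise difference as the "relative pose" are all assumptions made for illustration (a real system would compose rotations properly, and the target relative poses could come from visual odometry).

```python
def l1(a, b):
    """Sum of absolute differences between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def relative(p_i, p_j):
    """Simplified relative pose between two frames.

    Assumption: poses are flat vectors and the relative pose is their
    element-wise difference; this ignores proper rotation composition.
    """
    return [b - a for a, b in zip(p_i, p_j)]


def geometry_aware_loss(pred, target, vo_weight=1.0):
    """Absolute pose loss plus a relative-pose consistency term.

    pred, target: lists of pose vectors for consecutive frames.
    vo_weight: hypothetical weight on the pairwise (relative) term.
    """
    # Per-frame error against the reference absolute poses.
    absolute = sum(l1(p, t) for p, t in zip(pred, target))
    # Error between predicted and reference motion of consecutive frames.
    pairwise = sum(
        l1(relative(pred[i], pred[i + 1]), relative(target[i], target[i + 1]))
        for i in range(len(pred) - 1)
    )
    return absolute + vo_weight * pairwise
```

With this structure, the pairwise term penalizes predictions whose frame-to-frame motion disagrees with the reference motion, even when each absolute pose error is small.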
Visual 3-D SLAM from UAVs
This paper presents, tests, and discusses the application of Visual SLAM techniques to images taken outdoors from Unmanned Aerial Vehicles (UAVs) in partially structured environments. Each stage of the process is discussed with the aim of obtaining more accurate localization and mapping from UAV flights. First, the visual features of objects in the scene, their distance to the UAV, and the image acquisition system and its calibration are evaluated. Other issues considered relate to the image processing techniques, such as interest point detection, the matching procedure, and the scaling factor. The whole system has been tested using the COLIBRI mini UAV in partially structured environments. The localization results, compared against the GPS information of the flights, show that Visual SLAM delivers reliable localization and mapping, making it suitable for some outdoor UAV applications.
The Event-Camera Dataset and Simulator: Event-based Data for Pose Estimation, Visual Odometry, and SLAM
New vision sensors, such as the Dynamic and Active-pixel Vision sensor
(DAVIS), incorporate a conventional global-shutter camera and an event-based
sensor in the same pixel array. These sensors have great potential for
high-speed robotics and computer vision because they allow us to combine the
benefits of conventional cameras with those of event-based sensors: low
latency, high temporal resolution, and very high dynamic range. However, new
algorithms are required to exploit the sensor characteristics and cope with its
unconventional output, which consists of a stream of asynchronous brightness
changes (called "events") and synchronous grayscale frames. For this purpose,
we present and release a collection of datasets captured with a DAVIS in a
variety of synthetic and real environments, which we hope will motivate
research on new algorithms for high-speed and high-dynamic-range robotics and
computer-vision applications. In addition to global-shutter intensity images
and asynchronous events, we provide inertial measurements and ground-truth
camera poses from a motion-capture system. The latter allows comparing the pose
accuracy of ego-motion estimation algorithms quantitatively. All the data are
released both as standard text files and binary files (i.e., rosbag). This
paper provides an overview of the available data and describes a simulator that
we release open-source to create synthetic event-camera data.
Comment: 7 pages, 4 figures, 3 tables
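As a rough illustration of how ground-truth poses allow a quantitative comparison, the sketch below computes a position error and an orientation error for a single estimated pose. The function names are hypothetical, and it assumes positions are 3-vectors and orientations are unit quaternions; it is not taken from the dataset's own tooling.

```python
import math


def position_error(t_est, t_gt):
    """Euclidean distance between estimated and ground-truth positions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(t_est, t_gt)))


def orientation_error_deg(q_est, q_gt):
    """Angle in degrees between two unit quaternions.

    Assumption: both quaternions are normalized and use the same
    component order. The absolute value handles the fact that q and -q
    represent the same rotation.
    """
    dot = abs(sum(a * b for a, b in zip(q_est, q_gt)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.degrees(2.0 * math.acos(dot))
```

Averaging these two quantities over all timestamps of a sequence gives one simple way to compare ego-motion estimation algorithms.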
Real-Time 6DOF Pose Relocalization for Event Cameras with Stacked Spatial LSTM Networks
We present a new method to relocalize the 6DOF pose of an event camera solely
based on the event stream. Our method first creates an event image from a list
of events that occur in a very short time interval; a Stacked Spatial
LSTM Network (SP-LSTM) then learns the camera pose. Our SP-LSTM is
composed of a CNN to learn deep features from the event images and a stack of
LSTM to learn spatial dependencies in the image feature space. We show that the
spatial dependency plays an important role in the relocalization task and the
SP-LSTM can effectively learn this information. The experimental results on a
publicly available dataset show that our approach generalizes well and
outperforms recent methods by a substantial margin. Overall, our proposed
method reduces the position error by approximately 6 times and the
orientation error by approximately 3 times compared to the current state of the art. The source code and
trained models will be released.
Comment: 7 pages, 5 figures
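The first step described above, building an event image from the events in a short time interval, might look like the following sketch. The abstract does not specify the exact construction, so the event tuple layout `(timestamp, x, y, polarity)`, the signed per-pixel count, and the function name `make_event_image` are all assumptions for illustration.

```python
def make_event_image(events, t_start, duration, width, height):
    """Accumulate events from a short time window into a 2-D image.

    events: iterable of (timestamp, x, y, polarity) tuples (assumed layout),
            where a truthy polarity means a brightness increase.
    Returns a height x width list of lists holding signed event counts.
    """
    image = [[0] * width for _ in range(height)]
    t_end = t_start + duration
    for t, x, y, polarity in events:
        # Keep only events inside the window and inside the sensor area.
        if t_start <= t < t_end and 0 <= x < width and 0 <= y < height:
            image[y][x] += 1 if polarity else -1
    return image
```

The resulting image could then be fed to a CNN feature extractor; a shorter `duration` yields sparser images with less motion blur.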