Dense Prediction Transformer for Scale Estimation in Monocular Visual Odometry
Monocular visual odometry estimates the position of an agent from the images
of a single camera, and it is applied in autonomous vehicles, medical robots,
and augmented reality. However, monocular systems suffer from the scale
ambiguity problem due to the lack of depth information in 2D frames. This
paper contributes by showing an application of the dense prediction
transformer model to scale estimation in monocular visual odometry systems.
Experimental results show that the scale drift problem of monocular systems
can be reduced through accurate estimation of the depth map by this model,
achieving competitive state-of-the-art performance on a visual odometry
benchmark.
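The scale-correction idea in the abstract above can be sketched simply: monocular VO recovers translation only up to an unknown scale, while a dense depth predictor outputs metric depth, so a robust ratio between the two depth sets gives a per-frame scale factor. A minimal sketch, with function and variable names that are illustrative rather than taken from the paper:

```python
import statistics

def estimate_scale(triangulated_depths, predicted_depths):
    """Estimate a metric scale factor for monocular VO.

    triangulated_depths: up-to-scale depths from VO triangulation.
    predicted_depths: metric depths from a dense depth network at the
    same pixels. The median of the per-point ratios is robust to
    outliers in either depth source.
    """
    ratios = [p / t for p, t in zip(predicted_depths, triangulated_depths)
              if t > 0]
    return statistics.median(ratios)

def rescale_translation(t_vec, scale):
    """Apply the recovered scale to an up-to-scale translation vector."""
    return [scale * c for c in t_vec]
```

Applying the median ratio per frame is what keeps the scale from drifting over long trajectories.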
Learning Motion Predictors for Smart Wheelchair using Autoregressive Sparse Gaussian Process
Constructing a smart wheelchair on a commercially available powered
wheelchair (PWC) platform avoids a host of seating, mechanical design and
reliability issues but requires methods of predicting and controlling the
motion of a device never intended for robotics. Analog joystick inputs are
subject to black-box transformations which may produce intuitive and adaptable
motion control for human operators, but complicate robotic control approaches;
furthermore, installation of standard axle mounted odometers on a commercial
PWC is difficult. In this work, we present an integrated hardware and software
system for predicting the motion of a commercial PWC platform that does not
require any physical or electronic modification of the chair beyond plugging
into an industry standard auxiliary input port. This system uses an RGB-D
camera and an Arduino interface board to capture motion data, including visual
odometry and joystick signals, via ROS communication. Future motion is
predicted using an autoregressive sparse Gaussian process model. We evaluate
the proposed system on real-world short-term path prediction experiments.
Experimental results demonstrate the system's efficacy when compared to a
baseline neural network model.
Comment: The paper has been accepted to the International Conference on
Robotics and Automation (ICRA 2018).
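The autoregressive part of the approach above can be illustrated with a generic one-step predictor rolled forward over a horizon: each prediction is appended to the input window and fed back in. In the paper the predictor is a sparse Gaussian process; here any callable works, and all names are ours, not the authors':

```python
def autoregressive_rollout(predict, history, horizon):
    """Multi-step motion prediction by feeding predictions back as inputs.

    predict: maps the current window of past states to the next state
             (a sparse GP in the paper; any one-step model here).
    history: list of the most recent observed states.
    horizon: number of future steps to predict.
    """
    window = list(history)
    predictions = []
    for _ in range(horizon):
        nxt = predict(window)          # one-step prediction
        predictions.append(nxt)
        window = window[1:] + [nxt]    # slide window, feed prediction back
    return predictions
```

Note that with a probabilistic predictor, uncertainty compounds over the horizon, which is why short-term path prediction is the natural evaluation setting.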
OpenVSLAM: A Versatile Visual SLAM Framework
In this paper, we introduce OpenVSLAM, a visual SLAM framework with high
usability and extensibility. Visual SLAM systems are essential for AR devices,
autonomous control of robots and drones, etc. However, conventional open-source
visual SLAM frameworks are not appropriately designed as libraries called from
third-party programs. To overcome this situation, we have developed a novel
visual SLAM framework. This software is designed to be easily used and
extended. It incorporates several useful features and functions for research
and development. OpenVSLAM is released at
https://github.com/xdspacelab/openvslam under the 2-clause BSD license.
Comment: Accepted to ACM Multimedia 2019 Open Source Software Competition.
Video: https://www.youtube.com/watch?v=Ro_s3Lbx5m
Dense Piecewise Planar RGB-D SLAM for Indoor Environments
The paper exploits weak Manhattan constraints to parse the structure of
indoor environments from RGB-D video sequences in an online setting. We extend
the previous approach for single view parsing of indoor scenes to video
sequences and formulate the problem of recovering the floor plan of the
environment as an optimal labeling problem solved using dynamic programming.
The temporal continuity is enforced in a recursive setting, where labeling from
previous frames is used as a prior term in the objective function. In addition
to recovery of piecewise planar weak Manhattan structure of the extended
environment, the orthogonality constraints are also exploited by visual
odometry and pose graph optimization. This yields reliable estimates in the
presence of large motions and absence of distinctive features to track. We
evaluate our method on several challenging indoor sequences, demonstrating
accurate SLAM and dense mapping of low-texture environments. On the existing
TUM benchmark we achieve results competitive with alternative approaches,
which fail in our environments.
Comment: International Conference on Intelligent Robots and Systems (IROS)
201
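The optimal labeling problem mentioned above, solved by dynamic programming with the previous frame's labeling as a prior, has the classic chain-DP (Viterbi-style) structure. A toy sketch under our own naming, where the prior term is assumed to be folded into the unary costs:

```python
def viterbi_labeling(unary, pairwise):
    """Minimum-cost labeling of a chain by dynamic programming.

    unary[t][l]: cost of assigning label l at position t (in an online
                 setting this can include a prior from previous frames).
    pairwise(a, b): transition cost between adjacent labels a and b.
    Returns the label sequence of minimum total cost.
    """
    n, num_labels = len(unary), len(unary[0])
    cost = list(unary[0])
    back = []
    for t in range(1, n):
        new_cost, bp = [], []
        for l in range(num_labels):
            best = min(range(num_labels),
                       key=lambda p: cost[p] + pairwise(p, l))
            new_cost.append(cost[best] + pairwise(best, l) + unary[t][l])
            bp.append(best)
        cost, back = new_cost, back + [bp]
    # Backtrack from the cheapest final label.
    label = min(range(num_labels), key=lambda l: cost[l])
    path = [label]
    for bp in reversed(back):
        label = bp[label]
        path.append(label)
    return list(reversed(path))
```

The pairwise term is what enforces smoothness of the recovered floor-plan labeling between neighboring positions.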
MDN-VO: Estimating Visual Odometry with Confidence
Visual Odometry (VO) is used in many applications including robotics and
autonomous systems. However, traditional approaches based on feature matching
are computationally expensive and do not directly address failure cases,
instead relying on heuristic methods to detect failure. In this work, we
propose a deep learning-based VO model to efficiently estimate 6-DoF poses, as
well as a confidence model for these estimates. We utilise a CNN-RNN hybrid
model to learn feature representations from image sequences. We then employ a
Mixture Density Network (MDN) which estimates camera motion as a mixture of
Gaussians, based on the extracted spatio-temporal representations. Our model
uses pose labels as a source of supervision, but derives uncertainties in an
unsupervised manner. We evaluate the proposed model on the KITTI and nuScenes
datasets and report extensive quantitative and qualitative results to analyse
the performance of both pose and uncertainty estimation. Our experiments show
that the proposed model exceeds state-of-the-art performance in addition to
detecting failure cases using the predicted pose uncertainty.
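The training objective of a Mixture Density Network as described above is the negative log-likelihood of the target under the predicted Gaussian mixture; the spread of that mixture is what provides the uncertainty estimate. A minimal one-dimensional sketch (the paper's outputs are 6-DoF; names here are ours):

```python
import math

def mdn_nll(pi, mu, sigma, y):
    """Negative log-likelihood of scalar target y under a 1-D Gaussian
    mixture, the standard MDN training loss.

    pi: mixture weights (should sum to 1).
    mu, sigma: component means and standard deviations.
    """
    likelihood = sum(
        w * math.exp(-0.5 * ((y - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for w, m, s in zip(pi, mu, sigma)
    )
    return -math.log(likelihood)
```

Because the loss depends only on pose labels, the mixture variances are never supervised directly, which is the sense in which the uncertainties are learned unsupervised.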
ProSLAM: Graph SLAM from a Programmer's Perspective
In this paper we present ProSLAM, a lightweight stereo visual SLAM system
designed with simplicity in mind. Our work stems from the experience gathered
by the authors while teaching SLAM to students and aims at providing a highly
modular system that can be easily implemented and understood. Rather than
focusing on the well known mathematical aspects of Stereo Visual SLAM, in this
work we highlight the data structures and the algorithmic aspects that one
needs to tackle during the design of such a system. We implemented ProSLAM
using the C++ programming language in combination with a minimal set of
well-known external libraries. In addition to an open-source implementation,
we provide several code snippets that address the core aspects of our approach
directly in this paper. The results of a thorough validation performed on
standard benchmark datasets show that our approach achieves accuracy
comparable to state-of-the-art methods, while requiring substantially less
computational resources.
Comment: 8 pages, 8 figures
Visual SLAM with RGB-D cameras based on pose graph optimization
In this work we address the simultaneous localization and mapping (SLAM)
problem using only the information obtained from an RGB-D camera. The main
goal is to develop a SLAM system capable of estimating the complete trajectory
of the sensor and generating a consistent 3D representation of the environment
in real time. To achieve this, the system relies on a method for estimating
the sensor motion from dense depth information and on place-recognition
techniques based on visual features. From these algorithms, spatial
constraints are extracted between carefully selected frames. These spatial
constraints are used to build a pose graph, which is employed to infer the
most likely trajectory. The system is designed to run in two parallel threads:
one for tracking and the other for building the consistent representation. The
system is evaluated on publicly available datasets, achieving accuracy
comparable to state-of-the-art SLAM systems. Moreover, the tracking thread
runs at 60 Hz on a laptop of modest specifications. Tests are also carried out
in more realistic situations, processing observations acquired while moving
the sensor through two different indoor environments.
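The pose-graph back end described above minimizes the squared error of the spatial constraints between selected frames. A toy one-dimensional version by gradient descent shows the structure (real systems use sparse least squares over SE(3); names and parameters here are ours):

```python
def optimize_pose_graph_1d(n, constraints, iters=500, lr=0.1):
    """Toy 1-D pose-graph optimization by gradient descent.

    n: number of poses; pose 0 is held fixed as the gauge anchor.
    constraints: list of (i, j, z) meaning pose_j - pose_i should
                 equal the measured offset z (odometry or loop closure).
    Minimizes the sum of squared constraint residuals.
    """
    poses = [0.0] * n
    for _ in range(iters):
        grad = [0.0] * n
        for i, j, z in constraints:
            r = (poses[j] - poses[i]) - z   # constraint residual
            grad[j] += r
            grad[i] -= r
        for k in range(1, n):               # never move the anchor
            poses[k] -= lr * grad[k]
    return poses
```

When odometry and loop-closure constraints disagree, the optimizer distributes the error over the trajectory, which is what makes the map globally consistent.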
Computationally efficient solutions for tracking people with a mobile robot: an experimental evaluation of Bayesian filters
Modern service robots will soon become an essential part of society. As they have to move and act in human environments, it is essential for them to be provided with a fast and reliable tracking system that localizes people in the neighbourhood. It is therefore important to select the most appropriate filter to estimate the position of these people.
This paper presents three efficient implementations of multisensor human tracking based on different Bayesian estimators: the Extended Kalman Filter (EKF), the Unscented Kalman Filter (UKF) and the Sampling Importance Resampling (SIR) particle filter. The system implemented on a mobile robot is explained, introducing the methods used to detect and estimate the position of multiple people. Then, the solutions based on the three filters are discussed in detail. Several real experiments are conducted to evaluate their performance, which is compared in terms of accuracy, robustness and execution time of the estimation. The results show that a solution based on the UKF can perform as well as particle filters and is often a better choice when computational efficiency is a key issue.
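All three filters compared above share the same predict/update cycle; the EKF and UKF differ only in how they propagate uncertainty through nonlinear models. For a linear scalar state the cycle reduces to the plain Kalman filter, sketched here as orientation (names are ours, not from the paper):

```python
def kalman_predict(x, P, u, Q):
    """Propagate a scalar state estimate by motion u.

    x, P: state mean and variance; Q: process-noise variance.
    Prediction shifts the mean and inflates the variance.
    """
    return x + u, P + Q

def kalman_update(x, P, z, R):
    """Fuse a position measurement z with noise variance R.

    The Kalman gain K weights the measurement against the prediction;
    the update always shrinks the variance.
    """
    K = P / (P + R)
    return x + K * (z - x), (1.0 - K) * P
```

The EKF linearizes the models around the current estimate, while the UKF propagates a deterministic set of sigma points, which is what lets it match particle-filter accuracy at a fraction of the cost.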