Fast, Accurate Thin-Structure Obstacle Detection for Autonomous Mobile Robots
Safety is paramount for mobile robotic platforms such as self-driving cars
and unmanned aerial vehicles. This work is devoted to a task that is
indispensable for safety yet was largely overlooked in the past -- detecting
obstacles with very thin structures, such as wires, cables and tree
branches. This is a challenging problem, as thin objects can be problematic for
active sensors such as lidar and sonar and even for stereo cameras. In this
work, we propose to use video sequences for thin obstacle detection. We
represent obstacles with edges in the video frames, and reconstruct them in 3D
using efficient edge-based visual odometry techniques. We provide both a
monocular camera solution and a stereo camera solution. The former incorporates
Inertial Measurement Unit (IMU) data to solve scale ambiguity, while the latter
enjoys a novel, purely vision-based solution. Experiments demonstrated that the
proposed methods are fast and able to detect thin obstacles robustly and
accurately under various conditions.
Comment: Appeared at IEEE CVPR 2017 Workshop on Embedded Vision
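The abstract only outlines the edge-based pipeline, so here is a minimal sketch of the underlying idea: detect edge pixels, track them into the next frame, and triangulate them into 3D given camera poses from visual odometry. The function name, thresholds, and the naive optical-flow matching are illustrative assumptions, not the authors' implementation.

```python
import cv2
import numpy as np

def reconstruct_thin_edges(frame0, frame1, P0, P1, max_points=500):
    """Triangulate Canny edge pixels tracked across two grayscale frames.

    P0, P1: 3x4 projection matrices from a visual-odometry estimate.
    Hypothetical sketch of the edge-based reconstruction idea; the
    paper's efficient edge-based VO is not reproduced here.
    """
    edges = cv2.Canny(frame0, 50, 150)
    pts0 = np.argwhere(edges > 0)[:max_points, ::-1].astype(np.float32)  # (x, y)

    # Track edge pixels into the next frame with sparse optical flow.
    pts1, status, _ = cv2.calcOpticalFlowPyrLK(
        frame0, frame1, pts0.reshape(-1, 1, 2), None)
    good = status.ravel() == 1
    p0 = pts0[good].T.astype(np.float64)              # 2xN
    p1 = pts1.reshape(-1, 2)[good].T.astype(np.float64)

    # Triangulate to homogeneous 3D points, then dehomogenize.
    X = cv2.triangulatePoints(P0.astype(np.float64),
                              P1.astype(np.float64), p0, p1)
    return (X[:3] / X[3]).T  # Nx3 points lying on thin structures
```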
Learning-based Image Enhancement for Visual Odometry in Challenging HDR Environments
One of the main open challenges in visual odometry (VO) is the robustness to
difficult illumination conditions or high dynamic range (HDR) environments. The
main difficulties in these situations come from both the limitations of the
sensors and the inability to perform a successful tracking of interest points
because of the strong assumptions made in VO, such as brightness constancy. We address
this problem from a deep learning perspective, for which we first fine-tune a
Deep Neural Network (DNN) with the purpose of obtaining enhanced
representations of the sequences for VO. Then, we demonstrate how the insertion
of Long Short Term Memory (LSTM) allows us to obtain temporally consistent
sequences, as the estimation depends on previous states. However, very deep
networks cannot be inserted into a real-time VO framework; therefore, we also
propose a Convolutional Neural Network (CNN) of reduced size that runs faster.
Finally, we validate the enhanced representations
by evaluating the sequences produced by the two architectures in several
state-of-the-art VO algorithms, such as ORB-SLAM and DSO.
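To make the speed trade-off concrete, a reduced-size enhancement network could look like the sketch below. The layer widths, the residual formulation, and the class name are assumptions for illustration; the paper's actual architecture is not reproduced here.

```python
import torch
import torch.nn as nn

class SmallEnhanceCNN(nn.Module):
    """Lightweight image-enhancement CNN (illustrative sketch only).

    Maps a poorly exposed frame to an enhanced frame that is friendlier
    to interest-point tracking; layer sizes are assumptions.
    """
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, x):
        # Predict a residual correction so the mapping starts near identity.
        return torch.clamp(x + self.net(x), 0.0, 1.0)

enhanced = SmallEnhanceCNN()(torch.rand(1, 1, 480, 640))  # one grayscale frame
```

A network this shallow keeps per-frame cost low, which is the property that makes insertion into a real-time VO front end feasible.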
SelfOdom: Self-supervised Egomotion and Depth Learning via Bi-directional Coarse-to-Fine Scale Recovery
Accurately perceiving location and scene is crucial for autonomous driving
and mobile robots. Recent advances in deep learning have made it possible to
learn egomotion and depth from monocular images in a self-supervised manner,
without requiring highly precise labels to train the networks. However,
monocular vision methods suffer from a limitation known as scale ambiguity,
which restricts their application when absolute scale is necessary. To address
this, we propose SelfOdom, a self-supervised dual-network framework that can
robustly and consistently learn and generate pose and depth estimates in global
scale from monocular images. In particular, we introduce a novel coarse-to-fine
training strategy that enables the metric scale to be recovered in a two-stage
process. Furthermore, SelfOdom is flexible and can fuse inertial data with
images through an attention-based fusion module, which improves its robustness
in challenging scenarios. Our model excels in both normal and challenging
lighting conditions, including difficult night scenes. Extensive experiments on
public datasets have demonstrated that SelfOdom outperforms representative
traditional and learning-based VO and VIO models.
Comment: 14 pages, 8 figures, in submission
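For readers unfamiliar with the self-supervised setup, the core training signal in such methods is a view-synthesis (photometric) loss between a target frame and a source frame warped with the predicted depth and pose. The sketch below shows the common SSIM-plus-L1 form of that loss; it is the generic formulation used across self-supervised VO, not SelfOdom's exact objective or its scale-recovery terms.

```python
import torch
import torch.nn.functional as F

def photometric_loss(target, warped, alpha=0.85):
    """Generic SSIM + L1 view-synthesis loss (illustrative only).

    `warped` is the source frame warped into the target view using the
    predicted depth and egomotion; gradients flow back to both networks.
    """
    l1 = (target - warped).abs().mean()
    # Local means/variances via 3x3 average pooling give a simple SSIM.
    mu_x, mu_y = F.avg_pool2d(target, 3, 1, 1), F.avg_pool2d(warped, 3, 1, 1)
    sig_x = F.avg_pool2d(target ** 2, 3, 1, 1) - mu_x ** 2
    sig_y = F.avg_pool2d(warped ** 2, 3, 1, 1) - mu_y ** 2
    sig_xy = F.avg_pool2d(target * warped, 3, 1, 1) - mu_x * mu_y
    c1, c2 = 0.01 ** 2, 0.03 ** 2
    ssim = ((2 * mu_x * mu_y + c1) * (2 * sig_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (sig_x + sig_y + c2))
    ssim_loss = ((1 - ssim) / 2).clamp(0, 1).mean()
    return alpha * ssim_loss + (1 - alpha) * l1
```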
Event-based Simultaneous Localization and Mapping: A Comprehensive Survey
In recent decades, visual simultaneous localization and mapping (vSLAM) has
gained significant interest in both academia and industry. It estimates camera
motion and reconstructs the environment concurrently using visual sensors on a
moving robot. However, conventional cameras suffer from hardware limitations
such as motion blur and low dynamic range, which can degrade performance in
challenging scenarios like high-speed motion and high dynamic range
illumination. Recent studies have demonstrated that event cameras, a new type
of bio-inspired visual sensor, offer advantages such as high temporal
resolution, high dynamic range, low power consumption, and low latency. This paper
presents a timely and comprehensive review of event-based vSLAM algorithms that
exploit the benefits of asynchronous and irregular event streams for
localization and mapping tasks. The review covers the working principle of
event cameras and various event representations for preprocessing event data.
It also categorizes event-based vSLAM methods into four main categories:
feature-based, direct, motion-compensation, and deep learning methods, with
detailed discussions and practical guidance for each approach. Furthermore, the
paper evaluates the state-of-the-art methods on various benchmarks,
highlighting current challenges and future opportunities in this emerging
research area. A public repository will be maintained to keep track of the
rapid developments in this field at https://github.com/kun150kun/ESLAM-survey.
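As background for the event representations the survey covers, the simplest option is to accumulate a short window of events into a polarity image. The sketch below shows that baseline representation; the array layout and function name are illustrative assumptions, and richer representations (time surfaces, voxel grids) build on the same idea.

```python
import numpy as np

def events_to_frame(events, height, width):
    """Accumulate events into a signed polarity image (illustrative).

    `events` is an (N, 4) array of (t, x, y, polarity) rows with
    polarity in {-1, +1}.
    """
    frame = np.zeros((height, width), dtype=np.float32)
    _, x, y, p = events.T
    # np.add.at handles repeated pixel coordinates correctly.
    np.add.at(frame, (y.astype(int), x.astype(int)), p)
    return frame
```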