Visual 3-D SLAM from UAVs
The aim of this paper is to present, test, and discuss the application of Visual SLAM techniques to images taken outdoors from Unmanned Aerial Vehicles (UAVs) in partially structured environments. Each stage of the process is discussed in order to obtain more accurate localization and mapping from UAV flights. First, the issues related to the visual features of objects in the scene, their distance to the UAV, and the image acquisition system and its calibration are evaluated to improve the whole process. Other important issues considered relate to the image processing techniques, such as interest point detection, the matching procedure, and the scaling factor. The whole system has been tested using the COLIBRI mini UAV in partially structured environments. The localization results, compared against the GPS information of the flights, show that Visual SLAM delivers reliable localization and mapping, making it suitable for some outdoor UAV applications.
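As a rough illustration of the scaling factor and the comparison against GPS mentioned in this abstract, the sketch below fits a single scale between a monocular SLAM track and a GPS track and reports the remaining position error. It is not the paper's method: the function names are hypothetical, and it assumes the two tracks are time-synchronized, share the same axis orientation, and are given as N x 3 arrays in metres-compatible units.

```python
import numpy as np

def estimate_scale(slam_xyz, gps_xyz):
    """Least-squares scale s minimizing ||s * slam_c - gps_c||^2 on centered tracks."""
    slam_c = slam_xyz - slam_xyz.mean(axis=0)  # remove translation from the SLAM track
    gps_c = gps_xyz - gps_xyz.mean(axis=0)     # remove translation from the GPS track
    return float(np.sum(slam_c * gps_c) / np.sum(slam_c ** 2))

def position_rmse(slam_xyz, gps_xyz):
    """RMSE between the GPS track and the scaled, translated SLAM track."""
    s = estimate_scale(slam_xyz, gps_xyz)
    offset = gps_xyz.mean(axis=0) - s * slam_xyz.mean(axis=0)
    residual = s * slam_xyz + offset - gps_xyz
    return float(np.sqrt(np.mean(np.sum(residual ** 2, axis=1))))
```

A full evaluation would also estimate the rotation between the two frames (for example with a similarity alignment); that step is left out here to keep the sketch short.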
Keyframe-based monocular SLAM: design, survey, and future directions
Extensive research in the field of monocular SLAM over the past fifteen years
has yielded workable systems that found their way into various applications in
robotics and augmented reality. Although filter-based monocular SLAM systems
were common at some time, the more efficient keyframe-based solutions are
becoming the de facto methodology for building a monocular SLAM system. The
objective of this paper is threefold: first, the paper serves as a guideline
for people seeking to design their own monocular SLAM according to specific
environmental constraints. Second, it presents a survey that covers the various
keyframe-based monocular SLAM systems in the literature, detailing the
components of their implementation, and critically assessing the specific
strategies made in each proposed solution. Third, the paper provides insight
into the direction of future research in this field, to address the major
limitations still facing monocular SLAM, namely illumination
changes, initialization, highly dynamic motion, poorly textured scenes,
repetitive textures, map maintenance, and failure recovery.
A Non-Rigid Map Fusion-Based RGB-Depth SLAM Method for Endoscopic Capsule Robots
In the gastrointestinal (GI) tract endoscopy field, ingestible wireless
capsule endoscopy is considered a novel, minimally invasive diagnostic
technology to inspect the entire GI tract and to diagnose various diseases and
pathologies. Since the development of this technology, medical device companies
and many groups have made significant progress to turn such passive capsule
endoscopes into robotic active capsule endoscopes to achieve almost all
functions of current active flexible endoscopes. However, the use of robotic
capsule endoscopy still has some challenges. One such challenge is the precise
localization of such active devices in the 3D world, which is essential for
precise three-dimensional (3D) mapping of the inner organ. A reliable 3D map of
the explored inner organ could help doctors make more intuitive and
accurate diagnoses. In this paper, we propose, to our knowledge for the first
time in the literature, a visual simultaneous localization and mapping (SLAM) method
specifically developed for endoscopic capsule robots. The proposed RGB-Depth
SLAM method is capable of capturing comprehensive dense globally consistent
surfel-based maps of the inner organs explored by an endoscopic capsule robot
in real time. This is achieved by using dense frame-to-model camera tracking
and windowed surfel-based fusion coupled with frequent model refinement through
non-rigid surface deformations.
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
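The working principle described in this abstract (per-pixel brightness changes reported as events carrying time, location, and sign) can be sketched with a simplified, frame-driven model. This is an illustrative assumption rather than anything taken from the survey: the function name `generate_events`, the contrast `threshold` value, and the use of intensity frames as input are all hypothetical, and a real sensor fires asynchronously and may emit several events for one large change.

```python
import numpy as np

def generate_events(frames, timestamps, threshold=0.2, eps=1e-6):
    """Return a list of (t, x, y, polarity) tuples from a sequence of intensity frames."""
    # Per-pixel reference log brightness, updated only where an event fires.
    ref = np.log(frames[0].astype(np.float64) + eps)
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        log_i = np.log(frame.astype(np.float64) + eps)
        diff = log_i - ref
        # A pixel fires when its log-brightness change reaches the contrast threshold.
        ys, xs = np.nonzero(np.abs(diff) >= threshold)
        for y, x in zip(ys, xs):
            polarity = 1 if diff[y, x] > 0 else -1  # sign of the brightness change
            events.append((t, int(x), int(y), polarity))
            ref[y, x] = log_i[y, x]                 # reset the reference at that pixel
    return events
```

Pixels whose brightness does not change produce no output at all, which is why the event stream is sparse compared with fixed-rate frames.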
CROSSFIRE: Camera Relocalization On Self-Supervised Features from an Implicit Representation
Beyond novel view synthesis, Neural Radiance Fields are useful for
applications that interact with the real world. In this paper, we use them as
an implicit map of a given scene and propose a camera relocalization algorithm
tailored for this representation. The proposed method computes, in real time,
the precise position of a device during navigation using a single RGB camera.
In contrast with previous work, we do not rely on pose
regression or photometric alignment but rather use dense local features
obtained through volumetric rendering which are specialized on the scene with a
self-supervised objective. As a result, our algorithm is more accurate than
competitors, able to operate in dynamic outdoor environments with changing
lighting conditions and can be readily integrated in any volumetric neural
renderer.
Comment: Accepted to ICCV 202
DH-PTAM: A Deep Hybrid Stereo Events-Frames Parallel Tracking And Mapping System
This paper presents a robust approach for a visual parallel tracking and
mapping (PTAM) system that excels in challenging environments. Our proposed
method combines the strengths of heterogeneous multi-modal visual sensors,
including stereo event-based and frame-based sensors, in a unified reference
frame through a novel spatio-temporal synchronization of stereo visual frames
and stereo event streams. We employ deep learning-based feature extraction and
description for estimation to enhance robustness further. We also introduce an
end-to-end parallel tracking and mapping optimization layer complemented by a
simple loop-closure algorithm for efficient SLAM behavior. Through
comprehensive experiments on both small-scale and large-scale real-world
sequences of VECtor and TUM-VIE benchmarks, our proposed method (DH-PTAM)
demonstrates superior performance compared to state-of-the-art methods in terms
of robustness and accuracy in adverse conditions. Our implementation's
research-based Python API is publicly available on GitHub for further research
and development: https://github.com/AbanobSoliman/DH-PTAM.
Comment: Submitted for publication in IEEE RA-