Fast and Robust Detection of Fallen People from a Mobile Robot
This paper deals with the problem of detecting fallen people lying on the
floor by means of a mobile robot equipped with a 3D depth sensor. In the
proposed algorithm, inspired by semantic segmentation techniques, the 3D scene
is over-segmented into small patches. Fallen people are then detected by means
of two SVM classifiers: the first one labels each patch, while the second one
captures the spatial relations between them. This novel approach proved to be
both robust and fast. Thanks to the use of small patches, fallen people are
correctly detected even in real cluttered scenes with objects lying side by side.
Moreover, the algorithm can be executed on a mobile robot fitted with a
standard laptop making it possible to exploit the 2D environmental map built by
the robot and the multiple points of view obtained during the robot navigation.
Additionally, this algorithm is robust to illumination changes since it does
not rely on RGB data but on depth data. All the methods have been thoroughly
validated on the IASLAB-RGBD Fallen Person Dataset, which is published online
as a further contribution. It consists of several static and dynamic sequences
with 15 different people and 2 different environments.
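The two-stage classification described above can be sketched as follows. This is a minimal illustration with scikit-learn, not the paper's implementation: the feature dimensions, labels, and random data are all assumptions standing in for the patch descriptors and spatial-relation features the authors actually use.

```python
# Hedged sketch of a two-stage SVM pipeline over 3D scene patches.
# All features and labels here are synthetic placeholders.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stage 1: label each over-segmented patch (1 = candidate person part)
# from per-patch descriptors (e.g. surface normals, height, extent).
patch_features = rng.normal(size=(200, 8))    # 200 patches, 8-D descriptors
patch_labels = rng.integers(0, 2, size=200)

patch_svm = SVC(kernel="rbf").fit(patch_features, patch_labels)
candidate_mask = patch_svm.predict(patch_features) == 1

# Stage 2: classify groups of neighbouring candidate patches using
# spatial-relation features (relative positions, adjacency statistics).
group_features = rng.normal(size=(50, 6))     # 50 candidate groups
group_labels = rng.integers(0, 2, size=50)    # 1 = fallen person

relation_svm = SVC(kernel="rbf").fit(group_features, group_labels)
detections = relation_svm.predict(group_features)
```

Splitting the problem this way lets the first classifier stay cheap and local, while the second one encodes the context that distinguishes a fallen person from clutter of similar size.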
Robust Digital-Twin Localization via An RGBD-based Transformer Network and A Comprehensive Evaluation on a Mobile Dataset
The potential of digital-twin technology, involving the creation of precise
digital replicas of physical objects, to reshape AR experiences in 3D object
tracking and localization scenarios is significant. However, enabling robust 3D
object tracking in dynamic mobile AR environments remains a formidable
challenge. These scenarios often require a more robust pose estimator capable
of handling the inherent sensor-level measurement noise. In this paper,
recognizing the challenges of comprehensive solutions in existing literature,
we propose a transformer-based 6DoF pose estimator designed to achieve
state-of-the-art accuracy under real-world noisy data. To systematically
validate the new solution's performance against the prior art, we also
introduce a novel RGBD dataset called Digital Twin Tracking Dataset v2 (DTTD2),
which is focused on digital-twin object tracking scenarios. Expanding on the
existing DTTD v1 (DTTD1), the new dataset adds digital-twin data captured with
a cutting-edge mobile RGBD sensor suite on the Apple iPhone 14 Pro, extending
the applicability of our approach to iPhone sensor data. Through extensive
experimentation and in-depth analysis, we illustrate the effectiveness of our
methods under significant depth data errors, surpassing the performance of
existing baselines. Code and dataset are made publicly available at:
https://github.com/augcog/DTTD
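The core idea of attending over fused RGB-D point features before regressing a 6DoF pose can be sketched in NumPy as below. This is a toy, hedged illustration: the shapes, the single attention head, and the quaternion-plus-translation pose head are assumptions, not the DTTD2 architecture, which is specified in the paper and repository.

```python
# Minimal sketch: scaled dot-product attention over per-point RGB-D
# features, mean-pooled into a global descriptor, then a linear 6DoF
# pose head (4-D quaternion + 3-D translation). Shapes are illustrative.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Attention lets every point weigh evidence from all others,
    # which helps average out per-point depth measurement noise.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

rng = np.random.default_rng(0)
n_pts, d = 64, 32
feats = rng.normal(size=(n_pts, d))          # fused per-point RGB-D features
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
Wpose = rng.normal(size=(d, 7)) * 0.1        # untrained placeholder weights

attended = attention(feats @ Wq, feats @ Wk, feats @ Wv)
pooled = attended.mean(axis=0)               # global scene/object descriptor
pose = pooled @ Wpose
quat, trans = pose[:4], pose[4:]
quat = quat / np.linalg.norm(quat)           # normalise to a valid rotation
```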
Increasing the Efficiency of 6-DoF Visual Localization Using Multi-Modal Sensory Data
Localization is a key requirement for mobile robot autonomy and human-robot
interaction. Vision-based localization is accurate and flexible, however, it
incurs a high computational burden which limits its application on many
resource-constrained platforms. In this paper, we address the problem of
performing real-time localization in large-scale 3D point cloud maps of
ever-growing size. While most systems using multi-modal information reduce
localization time by employing side-channel information in a coarse manner
(e.g., WiFi for a rough prior position estimate), we propose to interweave the map
with rich sensory data. This multi-modal approach achieves two key goals
simultaneously. First, it enables us to harness additional sensory data to
localise in real time against a map covering a vast area; second, it allows
us to roughly localise devices which are not equipped with a camera. The
key to our approach is a localization policy based on a sequential Monte Carlo
estimator. The localiser uses this policy to attempt point-matching only in
nodes where it is likely to succeed, significantly increasing the efficiency of
the localization process. The proposed multi-modal localization system is
evaluated extensively in a large museum building. The results show that our
multi-modal approach not only increases the localization accuracy but also
significantly reduces computational time.
Comment: Presented at IEEE-RAS International Conference on Humanoid Robots
(Humanoids) 201
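The localization policy described above can be sketched as a particle filter over map nodes, where expensive point-matching is attempted only at nodes holding enough probability mass. This is a hedged toy version: the node graph, the side-channel likelihood model, and the mass threshold are all assumptions, not the paper's actual sensor models.

```python
# Hedged sketch of a sequential Monte Carlo localization policy.
# Particles hold map-node hypotheses; a cheap side-channel cue
# (e.g. WiFi) updates weights, and point-matching is only attempted
# at nodes where the filter concentrates enough weight.
import numpy as np

rng = np.random.default_rng(0)
n_nodes, n_particles = 20, 500
particles = rng.integers(0, n_nodes, size=n_particles)   # node hypotheses
weights = np.full(n_particles, 1.0 / n_particles)

def side_channel_likelihood(nodes, observed_node=7):
    # Toy likelihood: higher near the node suggested by the coarse cue.
    return np.exp(-0.5 * (nodes - observed_node) ** 2)

# Measurement update from the cheap side-channel signal.
weights *= side_channel_likelihood(particles)
weights /= weights.sum()

# Policy: attempt expensive point-matching only where mass concentrates.
node_mass = np.bincount(particles, weights=weights, minlength=n_nodes)
candidates = np.flatnonzero(node_mass > 0.05)

# Systematic resampling keeps the particle set focused on likely nodes.
cum = np.cumsum(weights)
u = (rng.random() + np.arange(n_particles)) / n_particles
particles = particles[np.searchsorted(cum, u)]
```

Gating the matcher on particle mass is what saves computation: most map nodes are never compared against, yet the estimator still recovers when the coarse cue is ambiguous, because multiple nodes can retain mass simultaneously.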