Increasing the Efficiency of 6-DoF Visual Localization Using Multi-Modal Sensory Data
Localization is a key requirement for mobile robot autonomy and human-robot
interaction. Vision-based localization is accurate and flexible, however, it
incurs a high computational burden which limits its application on many
resource-constrained platforms. In this paper, we address the problem of
performing real-time localization in large-scale 3D point cloud maps of
ever-growing size. While most systems using multi-modal information reduce
localization time by employing side-channel information in a coarse manner (e.g.,
WiFi for a rough prior position estimate), we propose to inter-weave the map
with rich sensory data. This multi-modal approach achieves two key goals
simultaneously. First, it enables us to harness additional sensory data to
localise against a map covering a vast area in real-time; and secondly, it also
allows us to roughly localise devices which are not equipped with a camera. The
key to our approach is a localization policy based on a sequential Monte Carlo
estimator. The localiser uses this policy to attempt point-matching only in
nodes where it is likely to succeed, significantly increasing the efficiency of
the localization process. The proposed multi-modal localization system is
evaluated extensively in a large museum building. The results show that our
multi-modal approach not only increases the localization accuracy but
significantly reduces computational time.
Comment: Presented at IEEE-RAS International Conference on Humanoid Robots (Humanoids) 201
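As an illustrative sketch (not the paper's actual system), a sequential Monte Carlo localization policy of this kind can be pictured as a particle filter over map nodes: particles are re-weighted by a side-channel likelihood (e.g. WiFi signal strength) and expensive point matching is attempted only in nodes that hold sufficient particle mass. All node counts and likelihood values below are hypothetical.

```python
import random

def resample(particles, weights):
    """Systematic resampling: draw len(particles) new particles
    proportionally to their weights."""
    total = sum(weights)
    step = total / len(particles)
    start = random.uniform(0.0, step)
    new, cum, i = [], weights[0], 0
    for k in range(len(particles)):
        u = start + k * step
        while u > cum:
            i += 1
            cum += weights[i]
        new.append(particles[i])
    return new

def nodes_to_match(particles, num_nodes, mass_threshold=0.2):
    """Return the map nodes holding enough particle mass to justify
    an expensive point-matching attempt."""
    counts = [0] * num_nodes
    for p in particles:
        counts[p] += 1
    n = len(particles)
    return [node for node, c in enumerate(counts) if c / n >= mass_threshold]

# Toy run: 3 map nodes, a side-channel likelihood favouring node 1.
random.seed(0)
particles = [random.randrange(3) for _ in range(200)]  # uniform prior
likelihood = {0: 0.1, 1: 0.8, 2: 0.1}                  # e.g. from WiFi RSSI
weights = [likelihood[p] for p in particles]
particles = resample(particles, weights)
print(nodes_to_match(particles, 3))
```

After resampling, nearly all particle mass sits in node 1, so the localiser would attempt point matching only there, skipping the other nodes entirely.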
Low-effort place recognition with WiFi fingerprints using deep learning
Using WiFi signals is the main localization modality of existing
personal indoor localization systems operating on mobile devices. WiFi
fingerprinting is also used for mobile robots, as WiFi signals are
usually available indoors and can provide a rough initial position estimate
or can be used together with other positioning systems. Currently, the best
solutions rely on filtering, manual data analysis, and time-consuming parameter
tuning to achieve reliable and accurate localization. In this work, we propose
to use deep neural networks to significantly reduce the manual effort in
localization system design, while still achieving satisfactory results.
Following the state-of-the-art hierarchical approach, we employ the DNN system
for building/floor classification. We show that stacked autoencoders
efficiently reduce the feature space, enabling robust and precise
classification. The proposed architecture is verified on the publicly available
UJIIndoorLoc dataset and the results are compared with other solutions.
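A minimal sketch of the feature-reduction idea, using synthetic RSSI fingerprints rather than the UJIIndoorLoc data: two tied-weight autoencoder layers compress 40 access-point readings into an 8-D code that a downstream building/floor classifier could consume. The layer sizes, learning rate, and data here are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic WiFi fingerprints: 200 scans x 40 access points, RSSI
# rescaled to [0, 1]; a real system would use site-survey data.
X = rng.uniform(0.0, 1.0, size=(200, 40))

def train_autoencoder(X, hidden, epochs=150, lr=0.02):
    """One tied-weight autoencoder layer trained by plain gradient
    descent on the squared reconstruction error."""
    n, d = X.shape
    W = rng.normal(0.0, 0.1, size=(d, hidden))
    for _ in range(epochs):
        H = np.tanh(X @ W)           # encode
        R = H @ W.T                  # decode with tied weights
        E = R - X                    # reconstruction error
        dH = (E @ W) * (1 - H ** 2)  # backprop through tanh
        W -= lr * (X.T @ dH + E.T @ H) / n
    return W

# Stack two layers, 40 -> 16 -> 8, mirroring hierarchical reduction.
W1 = train_autoencoder(X, 16)
H1 = np.tanh(X @ W1)
W2 = train_autoencoder(H1, 8)
codes = np.tanh(H1 @ W2)  # compact features fed to a classifier
print(codes.shape)        # (200, 8)
```

Each layer is trained greedily on the previous layer's output, which is the usual stacked-autoencoder recipe; the final 8-D codes replace the raw 40-D fingerprints as classifier input.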
Range-only SLAM schemes exploiting robot-sensor network cooperation
Simultaneous localization and mapping (SLAM) is a key problem in robotics. A robot with no previous knowledge of the environment builds a map of this environment and localizes itself in that map. Range-only SLAM is a particularization of the SLAM problem which only uses the information provided by range sensors. This PhD Thesis describes the design, integration, evaluation and validation of a set of schemes for accurate and efficient range-only simultaneous localization and mapping exploiting the cooperation between robots and sensor networks. This PhD Thesis proposes a general architecture for range-only simultaneous localization and mapping (RO-SLAM) with cooperation between robots and sensor networks. The adopted architecture has two main characteristics. First, it exploits the sensing, computational and communication capabilities of sensor network nodes: both the robot and the beacons actively participate in the execution of the RO-SLAM filter. Second, it integrates not only robot-beacon measurements but also range measurements between two different beacons, the so-called inter-beacon measurements. Most reported RO-SLAM methods are executed in a centralized manner in the robot. In these methods all tasks in RO-SLAM are executed in the robot, including measurement gathering, integration of measurements in RO-SLAM and the Prediction stage. These fully centralized RO-SLAM methods impose a high computational burden on the robot and have very poor scalability. This PhD Thesis proposes three different schemes that work under the aforementioned architecture. These schemes exploit the advantages of cooperation between robots and sensor networks and intend to minimize the drawbacks of this cooperation. The first scheme proposed in this PhD Thesis is a RO-SLAM scheme with dynamically configurable measurement gathering.
Integrating inter-beacon measurements in RO-SLAM significantly improves map estimation but involves high consumption of resources, such as the energy required to gather and transmit measurements, the bandwidth required by the measurement collection protocol and the computational burden necessary to integrate the larger number of measurements. The objective of this scheme is to reduce the increment in resource consumption resulting from the integration of inter-beacon measurements by adopting a centralized mechanism running in the robot that adapts measurement gathering. The second scheme of this PhD Thesis consists in a distributed RO-SLAM scheme based on the Sparse Extended Information Filter (SEIF). This scheme reduces the increment in resource consumption resulting from the integration of inter-beacon measurements by adopting a distributed SLAM filter in which each beacon is responsible for gathering its measurements to the robot and to other beacons and computing the SLAM Update stage in order to integrate its measurements in SLAM. Moreover, it inherits the scalability of the SEIF. The third scheme of this PhD Thesis is a resource-constrained RO-SLAM scheme based on the distributed SEIF previously presented. This scheme includes the two mechanisms developed in the previous contributions (measurement gathering control and distribution of the RO-SLAM Update stage between beacons) in order to reduce the increment in resource consumption resulting from the integration of inter-beacon measurements. This scheme exploits robot-beacon cooperation to improve SLAM accuracy and efficiency while meeting a given resource consumption bound. The resource consumption bound is expressed in terms of the maximum number of measurements that can be integrated in SLAM per iteration. The sensing channel capacity used, the beacon energy consumed or the computational capacity employed, among others, are proportional to the number of measurements that are gathered and integrated in SLAM.
The proposed schemes have been analyzed and compared with each other and with existing works, and are validated in real experiments with aerial robots. This PhD Thesis proves that the cooperation between robots and sensor networks provides many advantages in solving the RO-SLAM problem. Resource consumption is an important constraint in sensor networks. The proposed architecture allows the exploitation of the cooperation advantages, while the proposed schemes address the resource limitation without degrading performance.
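The resource-consumption bound can be pictured with a small sketch (hypothetical, not the thesis' actual filter): each iteration, candidate range measurements are ranked by an assumed utility score (e.g. expected covariance reduction) and only the top `budget` of them are gathered and integrated, so energy, bandwidth, and compute all scale with the bound rather than with the number of available measurements.

```python
def select_measurements(candidates, budget):
    """Keep the `budget` highest-utility measurements; the rest are
    simply not gathered this iteration, saving energy and bandwidth.
    `candidates` maps (node_i, node_j) pairs, either robot-beacon or
    inter-beacon, to an assumed utility score."""
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    return [pair for pair, _ in ranked[:budget]]

# Hypothetical scores: one robot surrounded by beacons b1..b3.
candidates = {
    ("robot", "b1"): 0.9,  # robot-beacon range
    ("robot", "b2"): 0.4,
    ("b1", "b2"): 0.7,     # inter-beacon range
    ("b1", "b3"): 0.2,
    ("b2", "b3"): 0.6,
}
print(select_measurements(candidates, budget=3))
```

With a budget of 3, only three of the five available ranges are gathered and integrated in this iteration; the two lowest-utility ranges cost nothing.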
Real-time single image depth perception in the wild with handheld devices
Depth perception is paramount to tackle real-world problems, ranging from
autonomous driving to consumer applications. For the latter, depth estimation
from a single image represents the most versatile solution, since a standard
camera is available on almost any handheld device. Nonetheless, two main issues
limit its practical deployment: i) the low reliability when deployed
in-the-wild and ii) the demanding resource requirements to achieve real-time
performance, often not compatible with such devices. Therefore, in this paper,
we investigate these issues in depth, showing that both can be addressed by
adopting appropriate network design and training strategies, also outlining
how to map the resulting networks on handheld devices to achieve real-time
performance. Our thorough evaluation highlights the ability of such fast
networks to generalize well to new environments, a crucial feature required to
tackle the extremely varied contexts faced in real applications. Indeed, to
further support this evidence, we report experimental results concerning
real-time depth-aware augmented reality and image blurring with smartphones
in-the-wild.
Comment: 11 pages, 9 figures
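The depth-aware blurring application can be sketched as follows, assuming a per-pixel depth map has already been predicted by the network; the synthetic image, depth values, and the simple box blur are illustrative stand-ins for a real camera frame and a proper lens-blur kernel.

```python
import numpy as np

def box_blur(img, k=5):
    """Naive k x k box blur with edge padding."""
    pad = k // 2
    p = np.pad(img.astype(float), pad, mode="edge")
    acc = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            acc += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return acc / (k * k)

def depth_aware_blur(img, depth, focus_depth, tolerance=0.5):
    """Keep pixels near the focus depth sharp, blur the rest."""
    in_focus = np.abs(depth - focus_depth) <= tolerance
    return np.where(in_focus, img.astype(float), box_blur(img))

# Toy frame: left half is a close subject (depth 1 m), right half is
# background (depth 5 m); focusing at 1 m blurs only the background.
rng = np.random.default_rng(0)
img = rng.random((8, 8))
depth = np.where(np.arange(8) < 4, 1.0, 5.0)[None, :].repeat(8, axis=0)
out = depth_aware_blur(img, depth, focus_depth=1.0)
```

In-focus pixels pass through untouched while background pixels are averaged, which is the core of the depth-aware AR and synthetic-bokeh effects the paper demonstrates on smartphones.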
Real-time 3D object detection and SLAM fusion in a low-cost LiDAR test vehicle setup
Recently released research on deep learning for autonomous-driving perception focuses heavily on LiDAR point cloud data as input to neural networks, highlighting the importance of LiDAR technology in the field of Autonomous Driving (AD). Accordingly, a large share of the vehicle platforms used to create the datasets released for the development of these networks, as well as some commercial AD solutions on the market, invest heavily in extensive sensor arrays spanning several modalities. These costs, however, create a barrier to entry for low-cost solutions performing critical perception tasks such as Object Detection and SLAM. This paper surveys current vehicle platforms and proposes a low-cost, LiDAR-based test vehicle platform capable of running critical perception tasks (Object Detection and SLAM) in real time. Additionally, we propose a deep learning-based inference model for Object Detection deployed on a resource-constrained device, as well as a graph-based SLAM implementation. We discuss the considerations raised by the real-time processing requirement and present results demonstrating the usability of the developed work on the proposed low-cost platform.
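At its core, graph-based SLAM solves a least-squares problem over poses. A toy 1-D sketch (illustrative, not the paper's implementation): three odometry edges and one loop-closure edge, with the first pose anchored to fix the gauge, solved directly with `numpy.linalg.lstsq`.

```python
import numpy as np

# Poses x0..x3 along a corridor; x0 is anchored at 0 to fix the gauge.
# Each edge (i, j, z) says "x_j - x_i should equal z metres".
edges = [
    (0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0),  # odometry constraints
    (0, 3, 2.7),                            # loop-closure constraint
]
A = np.zeros((len(edges), 3))  # unknowns x1..x3 (x0 fixed at 0)
b = np.zeros(len(edges))
for row, (i, j, z) in enumerate(edges):
    if j > 0:
        A[row, j - 1] += 1.0
    if i > 0:
        A[row, i - 1] -= 1.0
    b[row] = z
x, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.round(x, 3))  # the 0.3 m loop error is spread over the chain
```

Odometry alone would place x3 at 3.0 m, but the loop closure says 2.7 m; least squares distributes the discrepancy evenly across the edges, which is exactly what a full 2-D/3-D pose-graph optimiser does at larger scale.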
Machine Learning for Microcontroller-Class Hardware -- A Review
Advancements in machine learning have opened a new opportunity to bring
intelligence to low-end Internet-of-Things nodes such as microcontrollers.
Conventional machine learning deployment has a high memory and compute
footprint, hindering direct deployment on ultra resource-constrained
microcontrollers. This paper highlights the unique requirements of enabling
onboard machine learning for microcontroller class devices. Researchers use a
specialized model development workflow for resource-limited applications to
ensure the compute and latency budget is within the device limits while still
maintaining the desired performance. We characterize a widely applicable
closed-loop workflow of machine learning model development for microcontroller
class devices and show that several classes of applications adopt a specific
instance of it. We present both qualitative and numerical insights into
different stages of model development by showcasing several use cases. Finally,
we identify the open research challenges and unsolved questions demanding
careful consideration moving forward.
Comment: Accepted for publication at IEEE Sensors Journal
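A common stage in such model-development workflows is post-training quantization, which shrinks a model's memory footprint to fit microcontroller budgets. A minimal sketch of symmetric int8 weight quantization (illustrative arithmetic, not tied to any specific framework):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor post-training quantization: map float
    weights into int8 with a single scale factor."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

# Hypothetical layer weights: 64 x 32 float32 matrix.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=(64, 32)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = q.astype(np.float32) * scale  # dequantized approximation
err = float(np.abs(w - w_hat).max())
print(q.dtype, err)
```

The quantized tensor is 4x smaller than float32, and the worst-case rounding error is bounded by half the scale factor, which is why this step usually costs little accuracy while making flash and RAM budgets feasible.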
Deep probabilistic methods for improved radar sensor modelling and pose estimation
Radar’s ability to sense under adverse conditions and at far range makes it a valuable alternative to vision and lidar for mobile robotic applications. However, its complex, scene-dependent sensing process and significant noise artefacts make working with radar challenging. Moving past classical rule-based approaches, which have dominated the literature to date, this thesis investigates deep and data-driven solutions across a range of tasks in robotics.
Firstly, a deep approach is developed for mapping raw sensor measurements to a grid-map of occupancy probabilities, outperforming classical filtering approaches by a significant margin. A distribution over the occupancy state is captured, additionally allowing uncertainty in predictions to be identified and managed. The approach is trained entirely using partial labels generated automatically from lidar, without requiring manual labelling.
Next, a deep model is proposed for generating stochastic radar measurements from simulated elevation maps. The model is trained by learning the forward and backward processes side-by-side, using adversarial and cyclical consistency constraints together with a partial alignment loss, with labels generated in lidar. By faithfully replicating the radar sensing process, new models can be trained for down-stream tasks, using labels that are readily available in simulation. In this case, segmentation models trained on simulated radar measurements, when deployed in the real world, are shown to approach the performance of a model trained entirely on real-world measurements.
Finally, the potential of deep approaches applied to the radar odometry task is explored. A learnt feature space is combined with a classical correlative scan matching procedure and optimised for pose prediction, allowing the proposed method to outperform the previous state-of-the-art by a significant margin. Through a probabilistic formulation, the uncertainty in the pose is also successfully characterised. Building upon this success, properties of the Fourier Transform are then utilised to separate the search for translation and angle. It is shown that this decoupled search results in a significant boost to run-time performance, allowing the approach to run in real time on CPUs and embedded devices, whilst remaining competitive with other radar odometry methods proposed in the literature.
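The Fourier-based decoupling builds on classical phase correlation, where translation is recovered from the cross-power spectrum independently of any rotation search. A minimal sketch on synthetic data (not the thesis' learnt-feature pipeline):

```python
import numpy as np

def phase_correlation(a, b):
    """Estimate the integer (row, col) shift taking image `b` to `a`
    via the Fourier shift theorem: the phase-only cross-power spectrum
    of two shifted signals inverse-transforms to a peak at the shift."""
    F = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
    F /= np.abs(F) + 1e-12  # keep phase only
    corr = np.abs(np.fft.ifft2(F))
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Map peaks past the midpoint to negative shifts.
    return tuple(p - s if p > s // 2 else p for p, s in zip(peak, a.shape))

# Synthetic "radar scan" shifted by (5, -3) pixels.
rng = np.random.default_rng(0)
scan = rng.random((64, 64))
shifted = np.roll(scan, (5, -3), axis=(0, 1))
print(phase_correlation(shifted, scan))
```

Because the translation falls out of a single FFT correlation rather than an exhaustive 2-D search, decoupling it from the angle search is what enables the real-time CPU and embedded performance described above.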
LE-HGR: A Lightweight and Efficient RGB-based Online Gesture Recognition Network for Embedded AR Devices
Online hand gesture recognition (HGR) techniques are essential in augmented
reality (AR) applications for enabling natural human-to-computer interaction
and communication. In recent years, the consumer market for low-cost AR devices
has been rapidly growing, while the technology maturity in this domain is still
limited. Those devices typically have low prices, limited memory, and
resource-constrained computational units, which makes online HGR a challenging
problem. To tackle this problem, we propose a lightweight and computationally
efficient HGR framework, namely LE-HGR, to enable real-time gesture recognition
on embedded devices with low computing power. We also show that the proposed
method achieves high accuracy and robustness, reaching high-end
performance in a variety of complicated interaction environments. To achieve
our goal, we first propose a cascaded multi-task convolutional neural network
(CNN) to simultaneously predict probabilities of hand detection and regress
hand keypoint locations online. We show that, with the proposed cascaded
architecture design, false-positive estimates can be largely eliminated.
Additionally, an associated mapping approach is introduced to track the hand
trace via the predicted locations, which addresses the interference of
multi-handedness. Subsequently, we propose a trace sequence neural network
(TraceSeqNN) to recognize the hand gesture by exploiting the motion features of
the tracked trace. Finally, we provide a variety of experimental results to
show that the proposed framework is able to achieve state-of-the-art accuracy
with significantly reduced computational cost, which are the key properties for
enabling real-time applications in low-cost commercial devices such as mobile
devices and AR/VR headsets.
Comment: Published in: 2019 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct)
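The trace-tracking step that handles multi-handedness can be pictured as frame-to-frame data association. A hypothetical sketch using greedy nearest-neighbour matching (the paper's associated mapping approach may differ):

```python
import math

def associate(tracks, detections, max_dist=0.5):
    """Greedily associate per-frame hand detections to existing traces
    so multiple hands keep separate gesture traces across frames."""
    assigned = set()
    for trace in tracks:
        last = trace[-1]
        best, best_d = None, max_dist
        for i, det in enumerate(detections):
            if i in assigned:
                continue
            d = math.dist(last, det)
            if d < best_d:
                best, best_d = i, d
        if best is not None:
            trace.append(detections[best])
            assigned.add(best)
    for i, det in enumerate(detections):
        if i not in assigned:
            tracks.append([det])  # a new hand enters the frame
    return tracks

# Two hands moving in opposite directions over three frames
# (normalized image coordinates, purely illustrative).
tracks = []
for frame in [[(0.0, 0.0), (1.0, 1.0)],
              [(0.1, 0.0), (0.9, 1.0)],
              [(0.2, 0.0), (0.8, 1.0)]]:
    tracks = associate(tracks, frame)
print(tracks)
```

Each hand accumulates its own trace of predicted locations; a sequence model like the paper's TraceSeqNN would then classify the gesture from these per-hand motion traces.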