Faster than FAST: GPU-Accelerated Frontend for High-Speed VIO
The recent introduction of powerful embedded graphics processing units (GPUs)
has allowed for unforeseen improvements in real-time computer vision
applications. It has enabled algorithms to run onboard, well above the standard
video rates, yielding not only higher information processing capability, but
also reduced latency. This work focuses on the applicability of efficient
low-level, GPU hardware-specific instructions to improve on existing computer
vision algorithms in the field of visual-inertial odometry (VIO). While most
steps of a VIO pipeline work on visual features, they rely on image data for
detection and tracking, both of which are well suited for parallelization.
Non-maxima suppression and the subsequent feature selection, in particular,
are prominent contributors to the overall image processing latency.
Our work first revisits the problem of non-maxima suppression for feature
detection specifically on GPUs, and proposes a solution that selects local
response maxima, imposes spatial feature distribution, and extracts features
simultaneously. Our second contribution introduces an enhanced FAST feature
detector that applies the aforementioned non-maxima suppression method.
Finally, we compare our method to other state-of-the-art CPU and GPU
implementations, consistently outperforming all of them in feature tracking and
detection, resulting in over 1000fps throughput on an embedded Jetson TX2
platform. Additionally, we demonstrate our work integrated in a VIO pipeline
achieving a metric state estimation at ~200fps.
Comment: IEEE International Conference on Intelligent Robots and Systems
(IROS), 2020. Open-source implementation available at
https://github.com/uzh-rpg/vili
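The combination of local-maximum selection and spatial feature distribution described in this abstract can be illustrated with a small CPU reference sketch. This is not the authors' GPU implementation: the function name `grid_nms`, the 3x3 neighbourhood, and the one-feature-per-cell rule are assumptions chosen for illustration.

```python
def grid_nms(response, cell_size, threshold=0.0):
    """Keep at most one feature per grid cell (hypothetical helper).

    A pixel is a candidate if its response exceeds `threshold` and is
    strictly greater than all 8 neighbours. Among the candidates that fall
    into the same cell, only the strongest one is kept, which spreads the
    features over the image. Border pixels are skipped for simplicity.
    """
    height = len(response)
    width = len(response[0]) if height else 0
    best = {}  # (cell_row, cell_col) -> (score, row, col)
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            score = response[r][c]
            if score <= threshold:
                continue
            neighbours = [
                response[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            ]
            if score <= max(neighbours):
                continue  # not a strict local maximum
            cell = (r // cell_size, c // cell_size)
            if cell not in best or score > best[cell][0]:
                best[cell] = (score, r, c)
    # Return (row, col, score) tuples in row-major order.
    return sorted((r, c, score) for score, r, c in best.values())
```

On a GPU, each cell maps naturally to an independent group of threads, which is one reason a per-cell formulation parallelizes well; the sketch above only shows the selection logic.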
GyroFlow: Gyroscope-Guided Unsupervised Optical Flow Learning
Existing optical flow methods are error-prone in challenging scenes such as
fog, rain, and night, because the basic optical flow assumptions, such as
brightness and gradient constancy, are broken. To address this problem, we
present an unsupervised learning approach that fuses gyroscope data into optical
flow learning. Specifically, we first convert gyroscope readings into a motion
field, named the gyro field. Then, we design a self-guided fusion module to fuse
the background motion extracted from the gyro field with the optical flow and
guide the network to focus on motion details. To the best of our knowledge,
this is the first deep learning-based framework that fuses gyroscope data and
image content for optical flow learning. To validate our method, we propose a
new dataset that covers regular and challenging scenes. Experiments show that
our method outperforms state-of-the-art methods in both regular and challenging
scenes.
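One way to picture how gyroscope readings become a motion field is the pure-rotation model, in which each pixel ray is rotated and reprojected. The sketch below makes several assumptions that are not stated in the abstract: a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, a constant angular rate over the interval `dt`, and a rotation that maps ray directions from the first frame to the second. The function names are hypothetical.

```python
import math


def rotation_from_gyro(omega, dt):
    """Rodrigues' formula for a constant angular rate `omega` (rad/s) over `dt`."""
    wx, wy, wz = omega
    rate = math.sqrt(wx * wx + wy * wy + wz * wz)
    theta = rate * dt
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    if theta < 1e-12:
        return identity
    ax, ay, az = wx / rate, wy / rate, wz / rate
    k = [[0.0, -az, ay], [az, 0.0, -ax], [-ay, ax, 0.0]]  # skew matrix of the axis
    k2 = [[sum(k[i][m] * k[m][j] for m in range(3)) for j in range(3)]
          for i in range(3)]
    s, c = math.sin(theta), 1.0 - math.cos(theta)
    return [[identity[i][j] + s * k[i][j] + c * k2[i][j] for j in range(3)]
            for i in range(3)]


def gyro_field(rotation, fx, fy, cx, cy, width, height):
    """Per-pixel displacement (du, dv) induced by a pure camera rotation.

    field[v][u] is None where the rotated ray no longer points forward.
    """
    field = []
    for v in range(height):
        row = []
        for u in range(width):
            ray = ((u - cx) / fx, (v - cy) / fy, 1.0)  # back-project the pixel
            x, y, z = (sum(rotation[i][j] * ray[j] for j in range(3))
                       for i in range(3))
            if z <= 0.0:
                row.append(None)
                continue
            row.append((fx * x / z + cx - u, fy * y / z + cy - v))
        field.append(row)
    return field
```

Because this field depends only on rotation and intrinsics, it captures background motion caused by camera rotation but not scene depth or independently moving objects, which is consistent with the abstract's use of it as guidance rather than as the final flow.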
Symmetric Kullback-Leibler Metric Based Tracking Behaviors for Bioinspired Robotic Eyes
A tracking system based on the symmetric Kullback-Leibler metric, capable of tracking moving targets, is presented for a bionic spherical parallel mechanism; it minimizes a tracking error function to simulate the smooth pursuit of human eyes. More specifically, we propose a real-time moving target tracking algorithm that uses spatial histograms compared under the symmetric Kullback-Leibler metric. In the proposed algorithm, the key spatial histograms are extracted and fed into a particle filtering framework. Once the target is identified, an image-based control scheme drives the bionic spherical parallel mechanism so that the identified target is kept at the center of the captured images. Meanwhile, the robot motion information is fed forward to develop an adaptive smooth tracking controller inspired by the vestibulo-ocular reflex mechanism. The proposed tracking system is designed to let the robot track dynamic objects while it travels through transmittable terrains, especially bumpy environments. Experimental results under violent attitude variation in such bumpy environments demonstrate the effectiveness and robustness, including the bump resistance, of our bioinspired tracking system, which uses a bionic spherical parallel mechanism inspired by head-eye coordination.
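For readers unfamiliar with the measure, the following minimal sketch computes a symmetric Kullback-Leibler divergence between two histograms. It uses one common definition, the sum of the two directed divergences, and plain histograms rather than the spatial histograms of the abstract; the smoothing constant `eps` and the function names are assumptions added for illustration.

```python
import math


def normalise(hist, eps=1e-9):
    """Add a small constant to every bin and scale the bins to sum to 1."""
    smoothed = [h + eps for h in hist]
    total = sum(smoothed)
    return [s / total for s in smoothed]


def symmetric_kl(p_hist, q_hist, eps=1e-9):
    """KL(p||q) + KL(q||p), which simplifies to sum((p - q) * log(p / q))."""
    if len(p_hist) != len(q_hist):
        raise ValueError("histograms must have the same number of bins")
    p = normalise(p_hist, eps)
    q = normalise(q_hist, eps)
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))
```

In a particle filter, a distance of this kind between a reference histogram and each candidate region's histogram would typically be turned into a particle weight, with smaller distances giving larger weights.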
Real-time aerial vehicle detection and tracking with depth-aided vision sensing
We study the problem of detecting and tracking flying objects in real time with color and depth images. We improve the sparse part-based representation learning approach by using depth data from a depth vision sensor to achieve much faster detection while maintaining high detection accuracy. We revised some of the algorithms presented in the part-based representation method to obtain marginally better performance. We then introduce a novel data preprocessing method, based on edge detection and contour selection, that generates possible vehicle locations before the image is processed by the classifier. This approach can be applied to any object with distinguishable parts in relatively fixed spatial configurations; our target here is a flying vehicle in an indoor environment. Since flying objects tend to change pose and location quickly and frequently, the detection algorithm needs to run fast so that the tracking algorithm can keep tracking the detected object. We also use hardware acceleration tools to further increase algorithm speed. The results of vehicle localization and tracking are shown, and a critical evaluation of our approaches is presented.
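The idea of generating candidate vehicle locations before running a classifier can be sketched as follows. Note the substitution: instead of the edge detection and contour selection named in the abstract, this simplified stand-in thresholds the depth image and groups pixels into 4-connected components. The function name and the `near`, `far`, and `min_area` parameters are hypothetical.

```python
from collections import deque


def propose_regions(depth, near, far, min_area=4):
    """Return bounding boxes (top, left, bottom, right) of candidate regions.

    A pixel is a candidate if its depth lies in [near, far]. Candidates are
    grouped into 4-connected components by breadth-first search, and
    components smaller than `min_area` pixels are discarded as noise.
    """
    height = len(depth)
    width = len(depth[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if seen[r][c] or not (near <= depth[r][c] <= far):
                continue
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right, area = r, c, r, c, 0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx]
                            and near <= depth[ny][nx] <= far):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:
                boxes.append((top, left, bottom, right))
    return boxes
```

Only the returned boxes would then be passed to the classifier, which is where the speed-up of a proposal stage comes from.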
Kimera: from SLAM to Spatial Perception with 3D Dynamic Scene Graphs
Humans are able to form a complex mental model of the environment they move
in. This mental model captures geometric and semantic aspects of the scene,
describes the environment at multiple levels of abstraction (e.g., objects,
rooms, buildings), and includes static and dynamic entities and their relations
(e.g., a person is in a room at a given time). In contrast, current robots'
internal representations still provide a partial and fragmented understanding
of the environment, either in the form of a sparse or dense set of geometric
primitives (e.g., points, lines, planes, voxels) or as a collection of objects.
This paper attempts to reduce the gap between robot and human perception by
introducing a novel representation, a 3D Dynamic Scene Graph (DSG), that
seamlessly captures metric and semantic aspects of a dynamic environment. A DSG
is a layered graph where nodes represent spatial concepts at different levels
of abstraction, and edges represent spatio-temporal relations among nodes. Our
second contribution is Kimera, the first fully automatic method to build a DSG
from visual-inertial data. Kimera includes state-of-the-art techniques for
visual-inertial SLAM, metric-semantic 3D reconstruction, object localization,
human pose and shape estimation, and scene parsing. Our third contribution is a
comprehensive evaluation of Kimera in real-life datasets and photo-realistic
simulations, including a newly released dataset, uHumans2, which simulates a
collection of crowded indoor and outdoor scenes. Our evaluation shows that
Kimera achieves state-of-the-art performance in visual-inertial SLAM, estimates
an accurate 3D metric-semantic mesh model in real-time, and builds a DSG of a
complex indoor environment with tens of objects and humans in minutes. Our
final contribution shows how to use a DSG for real-time hierarchical semantic
path-planning. The core modules in Kimera are open-source.
Comment: 34 pages, 25 figures, 9 tables. arXiv admin note: text overlap with
arXiv:2002.0628
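The layered-graph idea in this abstract, with nodes at different abstraction levels and edges carrying spatio-temporal relations, can be illustrated with a toy data structure. This is not Kimera's API: the class `SceneGraph`, its methods, and the layer and relation names are all hypothetical and chosen only to mirror the examples in the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Toy layered graph: each node belongs to one abstraction layer."""
    layers: dict = field(default_factory=dict)  # node id -> layer name
    edges: list = field(default_factory=list)   # (source, relation, target, time)

    def add_node(self, node_id, layer):
        self.layers[node_id] = layer

    def add_edge(self, source, relation, target, time=None):
        # Both endpoints must already exist in the graph.
        if source not in self.layers or target not in self.layers:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append((source, relation, target, time))

    def nodes_in_layer(self, layer):
        return sorted(n for n, l in self.layers.items() if l == layer)

    def related(self, target, relation, time=None):
        """Sources linked to `target` by `relation`, optionally at one time."""
        return sorted(
            s for s, rel, t, ts in self.edges
            if t == target and rel == relation and (time is None or ts == time)
        )


graph = SceneGraph()
graph.add_node("building_1", "buildings")
graph.add_node("room_1", "rooms")
graph.add_node("person_1", "agents")
graph.add_edge("room_1", "is_in", "building_1")
graph.add_edge("person_1", "is_in", "room_1", time=3.0)  # relation at a given time
```

Keeping the timestamp on the edge rather than on the node lets a static entity such as a room coexist with dynamic entities whose relations change over time.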