The recent introduction of powerful embedded graphics processing units (GPUs)
has allowed for unforeseen improvements in real-time computer vision
applications. It has enabled algorithms to run onboard, well above the standard
video rates, yielding not only higher information processing capability, but
also reduced latency. This work focuses on the applicability of efficient
low-level, GPU hardware-specific instructions to improve on existing computer
vision algorithms in the field of visual-inertial odometry (VIO). While most
steps of a VIO pipeline work on visual features, they rely on image data for
detection and tracking, of which both steps are well suited for
parallelization. Especially non-maxima suppression and the subsequent feature
selection are prominent contributors to the overall image processing latency.
Our work first revisits the problem of non-maxima suppression for feature
detection specifically on GPUs, and proposes a solution that selects local
response maxima, imposes spatial feature distribution, and extracts features
simultaneously. Our second contribution introduces an enhanced FAST feature
detector that applies the aforementioned non-maxima suppression method.
Finally, we compare our method to other state-of-the-art CPU and GPU
implementations, where we always outperform all of them in feature tracking and
detection, resulting in over 1000fps throughput on an embedded Jetson TX2
platform. Additionally, we demonstrate our work integrated in a VIO pipeline
achieving a metric state estimation at ~200fps.Comment: IEEE International Conference on Intelligent Robots and Systems
(IROS), 2020. Open-source implementation available at
https://github.com/uzh-rpg/vili