
    Faster than FAST: GPU-Accelerated Frontend for High-Speed VIO

    The recent introduction of powerful embedded graphics processing units (GPUs) has enabled unforeseen improvements in real-time computer vision applications. It allows algorithms to run onboard, well above standard video rates, yielding not only higher information processing capability but also reduced latency. This work focuses on applying efficient low-level, GPU hardware-specific instructions to improve existing computer vision algorithms in the field of visual-inertial odometry (VIO). While most steps of a VIO pipeline operate on visual features, they rely on image data for feature detection and tracking, both of which are well suited to parallelization. Non-maxima suppression and the subsequent feature selection in particular are prominent contributors to the overall image processing latency. Our work first revisits the problem of non-maxima suppression for feature detection specifically on GPUs, and proposes a solution that selects local response maxima, imposes a spatial feature distribution, and extracts features simultaneously. Our second contribution introduces an enhanced FAST feature detector that applies this non-maxima suppression method. Finally, we compare our method to other state-of-the-art CPU and GPU implementations, outperforming all of them in both feature detection and tracking and reaching over 1000 fps throughput on an embedded Jetson TX2 platform. Additionally, we demonstrate our work integrated in a VIO pipeline achieving metric state estimation at ~200 fps.
    Comment: IEEE International Conference on Intelligent Robots and Systems (IROS), 2020. Open-source implementation available at https://github.com/uzh-rpg/vili
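The grid-based selection the abstract describes can be illustrated with a minimal CPU sketch (the function name, cell size, and threshold are our illustrative choices, not the paper's; the paper's version runs these per-cell reductions in parallel on the GPU): each cell of a coarse grid contributes at most its single strongest response, which simultaneously suppresses non-maxima, enforces a spatial feature distribution, and selects features.

```python
import numpy as np

def grid_nms(response, cell=32):
    """Keep at most one feature per grid cell: the cell's response maximum.

    response: 2-D array of corner response scores (e.g. FAST scores).
    Returns a list of (row, col, score) tuples, one per non-empty cell.
    """
    h, w = response.shape
    keypoints = []
    for y0 in range(0, h, cell):
        for x0 in range(0, w, cell):
            patch = response[y0:y0 + cell, x0:x0 + cell]
            dy, dx = np.unravel_index(np.argmax(patch), patch.shape)
            if patch[dy, dx] > 0:  # keep only cells with a positive response
                keypoints.append((int(y0 + dy), int(x0 + dx), float(patch[dy, dx])))
    return keypoints
```

Because each cell yields at most one keypoint, the spatial distribution constraint and the final feature selection fall out of the same pass over the response map.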

    Event-Based Noise Filtration with Point-of-Interest Detection and Tracking for Space Situational Awareness

    This thesis explores an asynchronous noise-suppression technique to be used in conjunction with asynchronous, Gaussian-blob tracking on dynamic vision sensor (DVS) data. This type of sensor is a member of a relatively new class of neuromorphic sensing devices that emulate the change-based detection properties of the human eye. By leveraging a biologically inspired mode of operation, these sensors can achieve significantly higher sampling rates as compared to conventional cameras, while also eliminating redundant data generated by static backgrounds. The resulting high dynamic range and fast acquisition time of DVS recordings enables the imaging of high-velocity targets despite ordinarily problematic lighting conditions. The technique presented here relies on treating each pixel of the sensor as a spiking cell keeping track of its own activity over time, which in turn can be filtered out of the resulting sensor event stream by user-configurable threshold values that form a temporal bandpass filter. In addition, asynchronous blob-tracking is supplemented with double-exponential smoothing prediction and Bezier curve-fitting in order to smooth tracker movement and interpolate target trajectory respectively. This overall scheme is intended to achieve asynchronous point-source tracking using a DVS for space-based applications, particularly in tracking distant, dim satellites. In the space environment, radiation effects are expected to introduce transient, and possibly persistent, noise into the asynchronous event-stream of the DVS. Given the large distances between objects in space, targets of interest may be no larger than a single pixel and can therefore appear similar to such noise-induced events. In this thesis, the asynchronous approach is experimentally compared to a more traditional approach applied to reconstructed frame data for both performance and accuracy metrics. 
The results of this research show that the asynchronous approach can produce comparable or even better tracking accuracy, while reducing the execution time of the process by a factor of seven on average.
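The per-pixel activity scheme described above can be sketched as a temporal bandpass on inter-event times (the threshold names and values here are illustrative, not the thesis's configuration): each pixel remembers when it last fired, and an event passes only if the elapsed time since that pixel's previous event falls inside a configurable window, discarding both hot pixels that fire too quickly and isolated noise events with no recent history.

```python
def bandpass_filter(events, t_min=1e-4, t_max=1e-2):
    """Per-pixel temporal bandpass filter for a DVS event stream.

    events: iterable of (x, y, timestamp, polarity) tuples, time-ordered.
    An event is kept only if the same pixel fired between t_min and
    t_max seconds earlier; a pixel's first event never passes.
    """
    last = {}   # (x, y) -> timestamp of the previous event at that pixel
    kept = []
    for (x, y, t, pol) in events:
        prev = last.get((x, y))
        last[(x, y)] = t
        if prev is not None and t_min <= (t - prev) <= t_max:
            kept.append((x, y, t, pol))
    return kept
```

The dictionary of last-fire times is the asynchronous analogue of a frame buffer: state is updated event by event, so no frame reconstruction is needed before filtering.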

    Segmentation-assisted detection of dirt impairments in archived film sequences

    A novel segmentation-assisted method for film dirt detection is proposed. We exploit the fact that film dirt manifests in the spatial domain as a cluster of connected pixels whose intensity differs substantially from that of its neighborhood, and we employ a segmentation-based approach to identify this type of structure. A key feature of our approach is the computation of a measure of confidence attached to detected dirt regions, which can be utilized for performance fine-tuning. Another important feature of our algorithm is the avoidance of the computational complexity associated with motion estimation. Our experimental framework benefits from the availability of manually derived as well as objective ground truth data obtained using infrared scanning. Our results demonstrate that the proposed method compares favorably with standard spatial, temporal and multistage median filtering approaches and provides efficient and robust detection for a wide variety of test material.
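The core idea can be sketched as follows, assuming a grayscale frame and a simple local-mean background model (the 3x3 neighborhood, thresholds, and confidence normalization are our illustrative choices, not the paper's): pixels deviating strongly from their neighborhood are grouped into connected components, and each component receives a confidence score derived from its mean deviation, with no motion estimation involved.

```python
import numpy as np
from collections import deque

def detect_dirt(frame, thresh=40, min_size=2):
    """Detect dirt as connected clusters of pixels whose intensity
    deviates strongly from a local 3x3-mean background estimate.
    Returns a list of (pixel_list, confidence) pairs."""
    f = frame.astype(float)
    # Local background estimate: 3x3 box mean via edge-padded shifts.
    pad = np.pad(f, 1, mode="edge")
    h, w = f.shape
    bg = sum(pad[dy:dy + h, dx:dx + w]
             for dy in range(3) for dx in range(3)) / 9.0
    dev = np.abs(f - bg)
    mask = dev > thresh
    seen = np.zeros_like(mask)
    regions = []
    for sy in range(h):
        for sx in range(w):
            if mask[sy, sx] and not seen[sy, sx]:
                # Flood-fill one 4-connected component.
                q = deque([(sy, sx)])
                seen[sy, sx] = True
                pix = []
                while q:
                    y, x = q.popleft()
                    pix.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            q.append((ny, nx))
                if len(pix) >= min_size:
                    conf = float(np.mean([dev[p] for p in pix])) / 255.0
                    regions.append((pix, conf))
    return regions
```

The confidence value lets downstream restoration trade false positives against missed dirt by simply raising or lowering an acceptance threshold.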

    DeepProposals: Hunting Objects and Actions by Cascading Deep Convolutional Layers

    In this paper, a new method for generating object and action proposals in images and videos is proposed. It builds on activations of different convolutional layers of a pretrained CNN, combining the localization accuracy of the early layers with the high informativeness (and hence recall) of the later layers. To this end, we build an inverse cascade that, going backward from the later to the earlier convolutional layers of the CNN, selects the most promising locations and refines them in a coarse-to-fine manner. The method is efficient because i) it reuses the same features extracted for detection, ii) it aggregates features using integral images, and iii) it avoids a dense evaluation of the proposals thanks to the inverse coarse-to-fine cascade. The method is also accurate. We show that our DeepProposals outperform most of the previously proposed object proposal and action proposal approaches and, when plugged into a CNN-based object detector, produce state-of-the-art detection performance.
    Comment: 15 pages
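The integral-image aggregation in point ii) can be sketched as follows (function names are ours): after one summed-area-table pass over a feature map, the total feature activation inside any candidate window costs four lookups, regardless of window size, which is what makes scoring many proposal boxes cheap.

```python
import numpy as np

def integral(feat):
    """Summed-area table with a zero top row and left column,
    so box sums need no bounds checks."""
    ii = np.zeros((feat.shape[0] + 1, feat.shape[1] + 1))
    ii[1:, 1:] = feat.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, y0, x0, y1, x1):
    """Sum of feat[y0:y1, x0:x1] in O(1) via four table lookups."""
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]
```

Precomputing one table per feature channel amortizes the cumulative sums across every proposal evaluated at that layer of the cascade.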

    Repulsion Loss: Detecting Pedestrians in a Crowd

    Detecting individual pedestrians in a crowd remains a challenging problem, since pedestrians often gather together and occlude each other in real-world scenarios. In this paper, we first explore experimentally how a state-of-the-art pedestrian detector is harmed by crowd occlusion, providing insights into the crowd occlusion problem. Then, we propose a novel bounding box regression loss specifically designed for crowd scenes, termed repulsion loss. This loss is driven by two motivations: attraction by the target, and repulsion by other surrounding objects. The repulsion term prevents the proposal from shifting to surrounding objects, thus leading to more crowd-robust localization. Our detector trained with repulsion loss outperforms all the state-of-the-art methods with a significant improvement in occlusion cases.
    Comment: Accepted to IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018
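The repulsion idea can be sketched as follows (a simplified illustration, not the paper's exact formulation, which applies a smooth-ln penalty to the intersection-over-ground-truth with the second-best matched ground truth): overlap with non-target ground-truth boxes is measured by IoG and penalized with a term that grows as the prediction drifts onto a neighboring pedestrian.

```python
import math

def iog(pred, gt):
    """Intersection over the ground-truth box area (IoG).
    Boxes are (x0, y0, x1, y1)."""
    ix = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    iy = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    area_gt = (gt[2] - gt[0]) * (gt[3] - gt[1])
    return (ix * iy) / area_gt

def repulsion_term(pred, other_gts):
    """Penalty that grows as the prediction overlaps non-target ground
    truths; zero when the prediction is clear of all of them.
    IoG is clamped below 1 to keep the log finite."""
    return sum(-math.log(1.0 - min(iog(pred, g), 0.99)) for g in other_gts)
```

In training, such a term would be weighted and added to the usual attraction (regression-to-target) loss, so minimizing the total pulls the box toward its own target while pushing it away from its neighbors.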