416 research outputs found

    TFNet: Exploiting Temporal Cues for Fast and Accurate LiDAR Semantic Segmentation

    LiDAR semantic segmentation plays a crucial role in enabling autonomous driving systems and robots to understand their surroundings accurately and robustly. There are different types of methods, such as point-based, range-image-based, polar-based, and hybrid methods. Among these, range-image-based methods are widely used due to their efficiency. However, they face a significant challenge known as the "many-to-one" problem, caused by the range image's limited horizontal and vertical angular resolution. As a result, around 20% of the 3D points can be occluded. In this paper, we present TFNet, a range-image-based LiDAR semantic segmentation method that utilizes temporal information to address this issue. Specifically, we incorporate a temporal fusion layer to extract useful information from previous scans and integrate it with the current scan. We then design a max-voting-based post-processing technique to correct false predictions, particularly those caused by the "many-to-one" issue. We evaluate the approach on two benchmarks and demonstrate that the post-processing technique is generic and can be applied to various networks. We will release our code and models.
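
    The abstract leaves the voting scheme unspecified; a minimal sketch, assuming the vote is taken among the 3D points that fall onto the same range-image pixel (the "many-to-one" case), might look like the following Python, where the function name and grouping strategy are illustrative assumptions:

        import numpy as np

        def majority_vote_relabel(pixel_idx, labels, num_classes):
            # Hypothetical post-processing sketch: points that project to the
            # same range-image pixel all receive the label that wins a
            # majority vote among them. `pixel_idx` maps each of the N points
            # to its flattened pixel index; `labels` holds the N predictions.
            voted = labels.copy()
            order = np.argsort(pixel_idx)
            sorted_pix = pixel_idx[order]
            # Split the sorted point indices into groups sharing one pixel.
            boundaries = np.flatnonzero(np.diff(sorted_pix)) + 1
            for group in np.split(order, boundaries):
                if len(group) > 1:
                    counts = np.bincount(labels[group], minlength=num_classes)
                    voted[group] = counts.argmax()
            return voted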

    Deconvolutional networks for point-cloud vehicle detection and tracking in driving scenarios

    Vehicle detection and tracking is a core ingredient for developing autonomous driving applications in urban scenarios. Recent image-based Deep Learning (DL) techniques are obtaining breakthrough results in these perceptive tasks. However, DL research has not yet advanced much towards processing 3D point clouds from lidar range-finders. These sensors are very common in autonomous vehicles since, despite not providing information as semantically rich as images, they are more robust under harsh weather conditions than vision sensors. In this paper we present a full vehicle detection and tracking system that works with 3D lidar information only. Our detection step uses a Convolutional Neural Network (CNN) that receives as input a featured representation of the 3D information provided by a Velodyne HDL-64 sensor and returns a per-point classification of whether it belongs to a vehicle or not. The classified point cloud is then geometrically processed to generate observations for a multi-object tracking system implemented via a number of Multi-Hypothesis Extended Kalman Filters (MH-EKF) that estimate the position and velocity of the surrounding vehicles. The system is thoroughly evaluated on the KITTI tracking dataset, and we show the performance boost provided by our CNN-based vehicle detector over a standard geometric approach. Our lidar-based approach uses about 4% of the data needed for an image-based detector, with similarly competitive results.
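
    As an illustration of the filtering core only (the multi-hypothesis data association the abstract describes is omitted, and the class name, time step, and noise magnitudes below are assumptions), a constant-velocity predict/update cycle for one tracked vehicle could be sketched in Python as:

        import numpy as np

        class ConstantVelocityFilter:
            # Single-hypothesis sketch: state [x, y, vx, vy] with
            # position-only observations; with this linear model the EKF
            # reduces to the standard Kalman predict/update cycle below.
            def __init__(self, x0, y0, dt=0.1):
                self.x = np.array([x0, y0, 0.0, 0.0])   # state estimate
                self.P = np.eye(4)                      # state covariance
                self.F = np.eye(4)                      # constant-velocity model
                self.F[0, 2] = self.F[1, 3] = dt
                self.Q = 0.01 * np.eye(4)               # process noise (assumed)
                self.H = np.eye(2, 4)                   # observe (x, y) only
                self.R = 0.1 * np.eye(2)                # measurement noise (assumed)

            def predict(self):
                self.x = self.F @ self.x
                self.P = self.F @ self.P @ self.F.T + self.Q

            def update(self, z):
                # Kalman gain and correction for a position observation z.
                S = self.H @ self.P @ self.H.T + self.R
                K = self.P @ self.H.T @ np.linalg.inv(S)
                self.x = self.x + K @ (z - self.H @ self.x)
                self.P = (np.eye(4) - K @ self.H) @ self.P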

    Comparative Study of Model-Based and Learning-Based Disparity Map Fusion Methods

    Creating an accurate depth map has several valuable applications, including augmented/virtual reality, autonomous navigation, indoor/outdoor mapping, object segmentation, and aerial topography. Current hardware solutions for precise 3D scanning are relatively expensive. To combat hardware costs, software alternatives based on stereoscopic images have previously been proposed. However, software solutions are less accurate than hardware solutions, such as laser scanning, and are subject to a variety of irregularities. Notably, disparity maps generated from stereo images typically fall short in cases of occlusion, near object boundaries, and on repetitive-texture or texture-less regions. Several post-processing methods are examined in an effort to combine strong algorithm results and alleviate erroneous disparity regions. These methods include basic statistical combinations, histogram-based voting, edge detection guidance, support vector machines (SVMs), and bagged trees. Individual errors and average errors are compared between the newly introduced fusion methods and the existing disparity algorithms. Several acceptable solutions are identified to bridge the gap between 3D scanning and stereo imaging. It is shown that fusing disparity maps can result in lower error rates than individual algorithms across the dataset while maintaining a high level of robustness.
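
    As a sketch of the simplest of these combinations, a per-pixel statistical fusion of several disparity maps (here a median, with the function name and invalid-pixel convention as assumptions) could be written in Python as:

        import warnings
        import numpy as np

        def fuse_disparities(disparity_maps, invalid=0):
            # Fuse several (H, W) disparity maps from different stereo
            # algorithms by taking the per-pixel median of the valid
            # estimates; pixels equal to `invalid` are treated as missing.
            stack = np.stack(disparity_maps).astype(float)  # (K, H, W)
            stack[stack == invalid] = np.nan                # mask missing pixels
            with warnings.catch_warnings():
                warnings.simplefilter("ignore", RuntimeWarning)
                fused = np.nanmedian(stack, axis=0)         # all-NaN pixels stay NaN
            return np.nan_to_num(fused, nan=invalid)        # restore invalid marker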