
    SkipcrossNets: Adaptive Skip-cross Fusion for Road Detection

    Multi-modal fusion is increasingly used for autonomous driving tasks, as images from different modalities provide unique information for feature extraction. However, existing two-stream networks fuse only at a specific network layer, which requires many manual attempts to set up. As a CNN deepens, the features of the two modalities become more advanced and abstract, and fusion across a large feature-level gap can easily hurt performance. In this study, we propose a novel fusion architecture called skip-cross networks (SkipcrossNets), which adaptively combines LiDAR point clouds and camera images without being bound to a certain fusion epoch. Specifically, skip-cross connects each layer to each layer in a feed-forward manner: for each layer, the feature maps of all previous layers of the other modality are used as input, and its own feature maps are used as input to all subsequent layers of the other modality, enhancing feature propagation and multi-modal feature fusion. This strategy facilitates selection of the most similar feature layers from the two data pipelines, providing a complementary effect for sparse point cloud features during fusion. The network is also divided into several blocks to reduce the complexity of feature fusion and the number of model parameters. The advantages of skip-cross fusion were demonstrated on the KITTI and A2D2 datasets, achieving a MaxF score of 96.85% on KITTI and an F1 score of 84.84% on A2D2. The model parameters require only 2.33 MB of memory at a speed of 68.24 FPS, making the model viable for mobile terminals and embedded devices.
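    To make the skip-cross connectivity concrete, the following is a minimal PyTorch sketch of one fusion block, written from the abstract alone: the `SkipCrossBlock` name, the conv-BN-ReLU layer design, and the channel bookkeeping are illustrative assumptions, not the authors' released code. Each layer of one stream consumes its own latest feature map together with every feature map produced so far by the other stream.

```python
# Minimal sketch of dense "skip-cross" fusion between two streams.
# Names and layer choices are illustrative, not the paper's implementation.
import torch
import torch.nn as nn

def _conv(in_ch: int, out_ch: int) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class SkipCrossBlock(nn.Module):
    """One block: layer i of each stream takes its own latest map plus
    all (i + 1) maps produced so far by the other modality."""
    def __init__(self, channels: int, num_layers: int = 3):
        super().__init__()
        self.cam_layers = nn.ModuleList()
        self.lidar_layers = nn.ModuleList()
        for i in range(num_layers):
            in_ch = channels * (i + 2)  # own map + (i + 1) cross maps
            self.cam_layers.append(_conv(in_ch, channels))
            self.lidar_layers.append(_conv(in_ch, channels))

    def forward(self, cam: torch.Tensor, lidar: torch.Tensor):
        cam_feats, lidar_feats = [cam], [lidar]
        for cam_layer, lidar_layer in zip(self.cam_layers, self.lidar_layers):
            # Cross connections: each stream reads the other's full history.
            new_cam = cam_layer(torch.cat([cam_feats[-1], *lidar_feats], dim=1))
            new_lidar = lidar_layer(torch.cat([lidar_feats[-1], *cam_feats], dim=1))
            cam_feats.append(new_cam)
            lidar_feats.append(new_lidar)
        return cam_feats[-1], lidar_feats[-1]

block = SkipCrossBlock(channels=32)
cam = torch.randn(1, 32, 48, 160)    # camera feature map
lidar = torch.randn(1, 32, 48, 160)  # projected point-cloud feature map
fused_cam, fused_lidar = block(cam, lidar)
```

    Grouping layers into blocks like this keeps the concatenated channel count bounded, which matches the abstract's point about dividing the network into blocks to limit fusion complexity and parameter count.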

    Towards Safe Autonomous Driving: Capture Uncertainty in the Deep Neural Network For Lidar 3D Vehicle Detection

    To assure that an autonomous car is driving safely on public roads, its object detection module should not only work correctly but also report its prediction confidence. Previous object detectors driven by deep learning do not explicitly model uncertainties in the neural network. We tackle this problem by presenting practical methods to capture uncertainties in a 3D vehicle detector for Lidar point clouds. The proposed probabilistic detector represents reliable epistemic uncertainty and aleatoric uncertainty in the classification and localization tasks. Experimental results show that the epistemic uncertainty is related to the detection accuracy, whereas the aleatoric uncertainty is influenced by vehicle distance and occlusion. The results also show that modeling the aleatoric uncertainty improves detection performance by 1%-5%. Comment: Accepted for presentation at the 21st IEEE International Conference on Intelligent Transportation Systems (ITSC 2018).
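    As a concrete illustration of the two uncertainty types, here is a minimal sketch in the standard heteroscedastic-loss plus Monte Carlo dropout style: a predicted log-variance captures aleatoric uncertainty, and the spread across stochastic forward passes captures epistemic uncertainty. The head design and dimensions are assumptions for illustration; the paper's detector details will differ.

```python
# Sketch of aleatoric (predicted variance) and epistemic (MC dropout)
# uncertainty for a box-regression head; an illustration, not the
# paper's exact detector.
import torch
import torch.nn as nn

class BoxHead(nn.Module):
    """Predicts box parameters plus a per-output log-variance."""
    def __init__(self, in_dim: int = 256, out_dim: int = 7):  # (x,y,z,w,l,h,yaw)
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Dropout(p=0.5))
        self.mean = nn.Linear(128, out_dim)
        self.log_var = nn.Linear(128, out_dim)  # aleatoric uncertainty

    def forward(self, x):
        h = self.trunk(x)
        return self.mean(h), self.log_var(h)

def attenuated_l2_loss(mean, log_var, target):
    # Heteroscedastic regression loss: residuals are down-weighted where
    # predicted variance is high; the log-variance penalty keeps the
    # network from claiming unbounded uncertainty everywhere.
    return (0.5 * torch.exp(-log_var) * (mean - target) ** 2
            + 0.5 * log_var).mean()

@torch.no_grad()
def mc_dropout_predict(head, x, n_samples: int = 20):
    # Epistemic uncertainty: keep dropout active at test time and measure
    # the spread of the sampled predictions.
    head.train()  # leaves dropout enabled
    means = torch.stack([head(x)[0] for _ in range(n_samples)])
    return means.mean(dim=0), means.var(dim=0)  # prediction, epistemic variance
```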

    Multi-Sensor Fusion for 3D Object Detection

    Sensing and modelling the surrounding environment is crucial for many problems in intelligent machines such as self-driving cars, autonomous robots, and augmented reality displays. The performance, reliability, and safety of autonomous agents rely heavily on how the environment is modelled. Two-dimensional models are inadequate to capture the three-dimensional nature of real-world scenes; three-dimensional models are necessary to achieve the standards the autonomy stack requires for intelligent agents to work alongside humans. Data-driven deep learning methodologies for three-dimensional scene modelling have evolved greatly in the past few years because of the availability of huge amounts of data from a variety of sensors in the form of well-designed datasets. 3D object detection and localization are two key requirements for tasks such as obstacle avoidance, agent-to-agent interaction, and path planning. Most object detection methodologies work on data from a single sensor, such as a camera or LiDAR: camera sensors provide feature-rich scene data, while LiDAR provides 3D geometric information. More advanced object detection and localization can be achieved by leveraging the information from both camera and LiDAR sensors. To effectively quantify the uncertainty of each sensor channel, an appropriate fusion strategy is needed to fuse the independently encoded point clouds from LiDAR with the RGB images from standard vision cameras. In this work, we introduce a fusion strategy and develop a multimodal pipeline that utilizes existing state-of-the-art deep-learning-based data encoders to produce robust 3D object detection and localization in real time. The performance of the proposed fusion model is evaluated on the popular KITTI 3D benchmark dataset.
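    The abstract does not spell out the fusion strategy, so the sketch below shows one common pattern consistent with its description: encode each modality independently, then fuse the encodings with a small MLP head. All class names, encoder choices, and dimensions are illustrative assumptions, not the thesis' exact pipeline.

```python
# Hedged sketch of middle fusion: independent camera and LiDAR encoders
# whose features are concatenated and decoded by a small head.
import torch
import torch.nn as nn

class PointEncoder(nn.Module):
    """PointNet-style encoder: shared per-point MLP followed by max-pooling."""
    def __init__(self, out_dim: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, out_dim), nn.ReLU())

    def forward(self, points):                     # (B, N, 3)
        return self.mlp(points).max(dim=1).values  # (B, out_dim)

class FusionDetector(nn.Module):
    def __init__(self, img_dim: int = 256, pts_dim: int = 256, out_dim: int = 7):
        super().__init__()
        self.img_encoder = nn.Sequential(          # stand-in for a CNN backbone
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, img_dim))
        self.pts_encoder = PointEncoder(pts_dim)
        self.head = nn.Sequential(                 # e.g. a 3D box parameterization
            nn.Linear(img_dim + pts_dim, 256), nn.ReLU(),
            nn.Linear(256, out_dim))

    def forward(self, image, points):
        # Encode each modality independently, then fuse by concatenation.
        fused = torch.cat([self.img_encoder(image),
                           self.pts_encoder(points)], dim=1)
        return self.head(fused)

model = FusionDetector()
boxes = model(torch.randn(2, 3, 192, 640), torch.randn(2, 1024, 3))
```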

    Distant Vehicle Detection Using Radar and Vision

    For autonomous vehicles to operate successfully, they need to be aware of other vehicles with sufficient time to make safe, stable plans. Given the possible closing speeds between two vehicles, this necessitates the ability to accurately detect distant vehicles. Many current image-based object detectors using convolutional neural networks exhibit excellent performance on existing datasets such as KITTI. However, the performance of these networks falls when detecting small (distant) objects. We demonstrate that incorporating radar data can boost performance in these difficult situations. We also introduce an efficient automated method for training-data generation using cameras of different focal lengths.
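    One simple way to incorporate radar into an image detector, sketched below under our own assumptions (the function name, calibration inputs, and channel layout are illustrative, not this paper's method), is to project radar returns into the image plane and rasterize them as an extra input channel alongside RGB.

```python
# Hedged sketch: project radar returns into the image with a pinhole model
# and rasterize range as a sparse extra channel for the detector.
import numpy as np

def radar_to_image_channel(radar_xyz, radar_range, K, T_cam_radar, hw):
    """radar_xyz: (N, 3) points in the radar frame; radar_range: (N,) ranges;
    K: (3, 3) camera intrinsics; T_cam_radar: (4, 4) extrinsics; hw: (H, W)."""
    H, W = hw
    pts_h = np.hstack([radar_xyz, np.ones((len(radar_xyz), 1))])
    cam = (T_cam_radar @ pts_h.T)[:3]      # radar frame -> camera frame
    in_front = cam[2] > 0.1                # drop points behind the camera
    uv = K @ cam[:, in_front]              # pinhole projection
    uv = (uv[:2] / uv[2]).round().astype(int)
    rng = radar_range[in_front]

    chan = np.zeros((1, H, W), dtype=np.float32)
    valid = (uv[0] >= 0) & (uv[0] < W) & (uv[1] >= 0) & (uv[1] < H)
    chan[0, uv[1, valid], uv[0, valid]] = rng[valid]  # sparse range image
    return chan  # stack with the RGB tensor before the first conv layer
```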