
    LIDAR-Camera Fusion for Road Detection Using Fully Convolutional Neural Networks

    In this work, a deep learning approach has been developed to carry out road detection by fusing LIDAR point clouds and camera images. An unstructured and sparse point cloud is first projected onto the camera image plane and then upsampled to obtain a set of dense 2D images encoding spatial information. Several fully convolutional neural networks (FCNs) are then trained to carry out road detection, either by using data from a single sensor or by using three fusion strategies: early, late, and the newly proposed cross fusion. Whereas in the former two fusion approaches the integration of multimodal information is carried out at a predefined depth level, the cross fusion FCN is designed to learn directly from data where to integrate information; this is accomplished by using trainable cross connections between the LIDAR and the camera processing branches. To further highlight the benefits of using a multimodal system for road detection, a data set consisting of visually challenging scenes was extracted from driving sequences of the KITTI raw data set. It was then demonstrated that, as expected, a purely camera-based FCN severely underperforms on this data set, whereas a multimodal system is still able to provide high accuracy. Finally, the proposed cross fusion FCN was evaluated on the KITTI road benchmark, where it achieved excellent performance with a MaxF score of 96.03%, ranking it among the top-performing approaches.
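
    The trainable cross connections can be pictured as learnable scalar weights that mix the two processing branches at each depth. The sketch below, assuming PyTorch and arbitrary layer sizes, illustrates the idea; the class name and initialisation are illustrative, not taken from the paper.

```python
# Minimal sketch of trainable cross connections between a LiDAR and a
# camera branch (illustrative; layer sizes and names are assumptions).
import torch
import torch.nn as nn

class CrossFusionBlock(nn.Module):
    """One stage of two parallel branches with learnable cross connections."""
    def __init__(self, channels):
        super().__init__()
        self.cam_conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.lid_conv = nn.Conv2d(channels, channels, 3, padding=1)
        # Scalar weights let the network learn how much of the other
        # branch to mix in at this depth (initialised to 0 = no fusion).
        self.cam_from_lid = nn.Parameter(torch.zeros(1))
        self.lid_from_cam = nn.Parameter(torch.zeros(1))

    def forward(self, cam, lid):
        cam_out = torch.relu(self.cam_conv(cam) + self.cam_from_lid * lid)
        lid_out = torch.relu(self.lid_conv(lid) + self.lid_from_cam * cam)
        return cam_out, lid_out

# Usage: stack several blocks and combine the two branches before the
# road / non-road classifier.
cam = torch.randn(1, 32, 64, 64)
lid = torch.randn(1, 32, 64, 64)
block = CrossFusionBlock(32)
cam, lid = block(cam, lid)
```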

    Enhanced free space detection in multiple lanes based on single CNN with scene identification

    Many systems for autonomous vehicle navigation rely on lane detection. Traditional algorithms usually estimate only the position of the lanes on the road, but an autonomous control system may also need to know whether a lane marking can be crossed and what portion of space inside the lane is free from obstacles, in order to make safer control decisions. Free space detection algorithms, on the other hand, only detect navigable areas, without any information about lanes. State-of-the-art algorithms use CNNs for both tasks, with significant consumption of computing resources. We propose a novel approach that estimates the free space inside each lane with a single CNN. Additionally, at the cost of only a small amount of extra GPU RAM, we infer the road type, which is useful for path planning. To achieve this, we train a multi-task CNN. We then further process the network output to extract polygons that can be used effectively in navigation control. Finally, we provide a computationally efficient implementation, based on ROS, that can be executed in real time. Our code and trained models are available online. Comment: Will appear in the 2019 IEEE Intelligent Vehicles Symposium (IV 2019).
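
    A single backbone feeding both a dense per-lane free-space head and a cheap global road-type classifier is one way to realise such a multi-task CNN. The following sketch assumes PyTorch; the class counts, channel sizes, and head names are hypothetical.

```python
# Sketch of a shared-backbone multi-task head: per-lane free-space
# segmentation plus a scene/road-type classifier (all sizes assumed).
import torch
import torch.nn as nn

class MultiTaskHead(nn.Module):
    def __init__(self, in_ch=256, seg_classes=4, road_types=3):
        super().__init__()
        # Dense head: e.g. background / ego-lane free space /
        # adjacent-lane free space / non-crossable marking.
        self.seg = nn.Conv2d(in_ch, seg_classes, 1)
        # Cheap global head: road type for path planning
        # (e.g. highway / urban / unmarked road).
        self.cls = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, road_types))

    def forward(self, feats):
        return self.seg(feats), self.cls(feats)

feats = torch.randn(1, 256, 32, 64)          # backbone features
seg_logits, road_logits = MultiTaskHead()(feats)
```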

    Instance Segmentation and Object Detection in Road Scenes using Inverse Perspective Mapping of 3D Point Clouds and 2D Images

    Instance segmentation and object detection are important tasks in smart car applications. Recently, a variety of neural network-based approaches have been proposed. One of the challenges is that objects in a scene appear at various scales, which requires the neural network to have a large receptive field to deal with the scale variations. In other words, the neural network must have a deep architecture, which slows down computation. In smart car applications, accurate detection and segmentation of vehicles and pedestrians is critical. Moreover, 2D images lack distance information but provide rich visual appearance, whereas 3D point clouds provide strong evidence of the existence of objects. The fusion of 2D images and 3D point clouds can therefore provide more information for finding objects in a scene. This paper proposes a series of fronto-parallel virtual planes and inverse perspective mapping of the input image onto these planes to deal with scale variations. I use 3D point clouds obtained from a LiDAR sensor and 2D images obtained from stereo cameras mounted on top of a vehicle to estimate the ground area of the scene and to define the virtual planes. A region of fixed height above the ground area in the 2D images is cropped to focus on objects on flat roads. The point cloud is then used to filter out false alarms among the over-detections generated by an off-the-shelf deep neural network, Mask R-CNN. The experimental results show that the proposed approach outperforms Mask R-CNN without this pre-processing on the KITTI benchmark dataset [9].
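
    One plausible reading of the false-alarm filtering step is to keep only those 2D detections that enclose enough projected LiDAR returns. The sketch below assumes NumPy, a pinhole camera model, and an arbitrary point-count threshold; none of these details are taken from the paper.

```python
# Sketch: reject 2D detections that contain too few projected LiDAR
# points (camera intrinsics, threshold and box format are assumptions).
import numpy as np

def project_to_image(points_cam, K):
    """Project Nx3 points in the camera frame with intrinsics K (3x3)."""
    uvw = points_cam @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

def filter_detections(boxes, points_cam, K, min_points=20):
    """Keep boxes (x1, y1, x2, y2) that enclose enough LiDAR returns."""
    points_cam = points_cam[points_cam[:, 2] > 0]   # points in front of the camera
    uv = project_to_image(points_cam, K)
    kept = []
    for (x1, y1, x2, y2) in boxes:
        inside = ((uv[:, 0] >= x1) & (uv[:, 0] <= x2) &
                  (uv[:, 1] >= y1) & (uv[:, 1] <= y2))
        if inside.sum() >= min_points:
            kept.append((x1, y1, x2, y2))
    return kept
```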

    A Deep Learning-based Radar and Camera Sensor Fusion Architecture for Object Detection

    Object detection in camera images using deep learning has proven successful in recent years. Rising detection rates and computationally efficient network structures are pushing this technique towards application in production vehicles. Nevertheless, the sensor quality of the camera is limited in severe weather conditions and by increased sensor noise in sparsely lit areas and at night. Our approach enhances current 2D object detection networks by fusing camera data and projected sparse radar data in the network layers. The proposed CameraRadarFusionNet (CRF-Net) automatically learns at which level the fusion of the sensor data is most beneficial for the detection result. Additionally, we introduce BlackIn, a training strategy inspired by Dropout, which focuses the learning on a specific sensor type. We show that the fusion network is able to outperform a state-of-the-art image-only network on two different datasets. The code for this research will be made available to the public at: https://github.com/TUMFTM/CameraRadarFusionNet. Comment: Accepted at 2019 Sensor Data Fusion: Trends, Solutions, Applications (SDF).
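
    BlackIn is described as a Dropout-inspired strategy that focuses learning on a specific sensor. A minimal sketch, assuming PyTorch and an arbitrary blackout probability, could zero out whole camera samples during training so that the network must rely on the radar branch.

```python
# Sketch of a Dropout-inspired sensor blackout: with probability p the
# whole camera tensor of a sample is zeroed so the network must rely on
# radar (probability and placement are assumptions, not the paper's values).
import torch

def blackout_camera(camera_batch, p=0.2, training=True):
    if not training:
        return camera_batch
    keep = (torch.rand(camera_batch.shape[0], 1, 1, 1,
                       device=camera_batch.device) > p).float()
    return camera_batch * keep

cam = torch.randn(8, 3, 360, 640)   # batch of camera images
cam = blackout_camera(cam)          # some samples become all-zero
```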

    SNE-RoadSeg: Incorporating Surface Normal Information into Semantic Segmentation for Accurate Freespace Detection

    Freespace detection is an essential component of visual perception for self-driving cars. The recent efforts made in data-fusion convolutional neural networks (CNNs) have significantly improved semantic driving scene segmentation. Freespace can be hypothesized as a ground plane on which the points have similar surface normals. Hence, in this paper, we first introduce a novel module, named surface normal estimator (SNE), which can infer surface normal information from dense depth/disparity images with high accuracy and efficiency. Furthermore, we propose a data-fusion CNN architecture, referred to as RoadSeg, which can extract and fuse features from both RGB images and the inferred surface normal information for accurate freespace detection. For research purposes, we publish a large-scale synthetic freespace detection dataset, named Ready-to-Drive (R2D) road dataset, collected under different illumination and weather conditions. The experimental results demonstrate that our proposed SNE module can benefit all the state-of-the-art CNNs for freespace detection, and our SNE-RoadSeg achieves the best overall performance across different datasets. Comment: ECCV 2020.
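
    Surface normals can be inferred from a dense depth map by back-projecting pixels to 3D and crossing the local image-axis gradients. The sketch below shows this generic approach in PyTorch; it is not necessarily the paper's exact SNE formulation, and the camera intrinsics are placeholders.

```python
# Sketch of estimating per-pixel surface normals from a dense depth map
# (a generic gradient-based method, not necessarily the paper's SNE).
import torch
import torch.nn.functional as F

def normals_from_depth(depth, fx, fy, cx, cy):
    """depth: (1, 1, H, W) metric depth; returns (1, 3, H, W) unit normals."""
    _, _, H, W = depth.shape
    v, u = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    z = depth[0, 0]
    x = (u - cx) / fx * z                               # back-project to 3D
    y = (v - cy) / fy * z
    pts = torch.stack((x, y, z), dim=0)                 # (3, H, W)
    dpdx = pts[:, :, 1:] - pts[:, :, :-1]               # horizontal gradient
    dpdy = pts[:, 1:, :] - pts[:, :-1, :]               # vertical gradient
    dpdx = F.pad(dpdx, (0, 1, 0, 0))
    dpdy = F.pad(dpdy, (0, 0, 0, 1))
    n = torch.cross(dpdx, dpdy, dim=0)                  # normal = dx x dy
    return F.normalize(n, dim=0).unsqueeze(0)

depth = torch.rand(1, 1, 48, 64) + 0.5                  # dummy depth map
normals = normals_from_depth(depth, fx=60.0, fy=60.0, cx=32.0, cy=24.0)
```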

    Lidar–camera semi-supervised learning for semantic segmentation

    In this work, we investigated two issues: (1) how the fusion of lidar and camera data can improve semantic segmentation performance compared with the individual sensor modalities in a supervised learning context; and (2) how fusion can also be leveraged for semi-supervised learning in order to further improve performance and to adapt to new domains without requiring any additional labelled data. A comparative study was carried out by providing an experimental evaluation of networks trained in different setups using various scenarios, from sunny days to rainy night scenes. The networks were tested on challenging and less common scenarios in which cameras or lidars individually would not provide a reliable prediction. Our results suggest that semi-supervised learning and fusion techniques increase the overall performance of the network in challenging scenarios while using fewer data annotations.
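
    One common way to leverage fusion for semi-supervision is to let the fused lidar-camera network pseudo-label unlabelled frames for a single-modality student. The sketch below, in PyTorch with dummy networks and an arbitrary confidence threshold, illustrates this generic scheme; the paper's actual training procedure may differ.

```python
# Generic pseudo-labelling sketch: a fused lidar+camera teacher labels
# unlabelled frames for a camera-only student (illustrative only).
import torch
import torch.nn.functional as F

def pseudo_label_step(teacher, student, optimizer, cam, lidar, conf_thresh=0.9):
    with torch.no_grad():
        probs = torch.softmax(teacher(cam, lidar), dim=1)   # (B, C, H, W)
        conf, labels = probs.max(dim=1)
        labels[conf < conf_thresh] = 255                    # ignore uncertain pixels
    loss = F.cross_entropy(student(cam), labels, ignore_index=255)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example wiring with tiny dummy networks (illustrative only).
teacher = lambda cam, lidar: torch.randn(cam.shape[0], 5, *cam.shape[2:])
student = torch.nn.Conv2d(3, 5, 1)
opt = torch.optim.SGD(student.parameters(), lr=1e-3)
loss = pseudo_label_step(teacher, student, opt,
                         torch.randn(2, 3, 64, 64), torch.randn(2, 1, 64, 64))
```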

    LOCATOR: Low-power ORB accelerator for autonomous cars

    Simultaneous Localization And Mapping (SLAM) is crucial for autonomous navigation. ORB-SLAM is a state-of-the-art camera-based Visual SLAM system used in self-driving cars. In this paper, we propose a high-performance, energy-efficient, and functionally accurate hardware accelerator for ORB-SLAM, focusing on its most time-consuming stage: Oriented FAST and Rotated BRIEF (ORB) feature extraction. The Rotated BRIEF (rBRIEF) descriptor generation is the main bottleneck in ORB computation, as it exhibits highly irregular access patterns to local on-chip memories, causing a high performance penalty due to bank conflicts. We introduce a technique, based on a genetic algorithm, to find an optimal static pattern for performing parallel accesses to the banks. Furthermore, we propose the combination of an rBRIEF pixel duplication cache, selective port replication, and pipelining to reduce latency without compromising cost. The accelerator achieves a reduction in energy consumption of 14597× and 9609× with respect to high-end CPU and GPU platforms, respectively. This work has been supported by the CoCoUnit ERC Advanced Grant of the EU’s Horizon 2020 program (grant No 833057), the Spanish State Research Agency (MCIN/AEI) under grant PID2020-113172RB-I00, the ICREA Academia program, and the FPU grant FPU18/04413.
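
    The genetic search for a conflict-free static access pattern can be illustrated with a toy example: permutations of descriptor-pattern accesses are scored by how many same-bank collisions occur within each group of parallel reads, and the best permutations are kept and mutated. The bank count, group size, bank mapping, and GA parameters below are all illustrative.

```python
# Toy genetic algorithm that reorders rBRIEF pattern accesses so that
# pixels read in the same cycle fall in different memory banks
# (bank count, group size and GA parameters are illustrative).
import random

N_ACCESSES, N_BANKS, GROUP = 256, 32, 32

def bank(access):                      # assumed mapping: address mod banks
    return access % N_BANKS

def conflicts(order):
    """Count same-bank collisions inside each group of parallel accesses."""
    total = 0
    for i in range(0, len(order), GROUP):
        group = [bank(a) for a in order[i:i + GROUP]]
        total += len(group) - len(set(group))
    return total

def mutate(order):
    a, b = random.sample(range(len(order)), 2)
    child = order[:]
    child[a], child[b] = child[b], child[a]
    return child

population = [random.sample(range(N_ACCESSES), N_ACCESSES) for _ in range(50)]
for _ in range(200):                   # evolve towards a conflict-free order
    population.sort(key=conflicts)
    population = population[:25] + [mutate(random.choice(population[:25]))
                                    for _ in range(25)]
best = min(population, key=conflicts)
print("remaining conflicts:", conflicts(best))
```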