67 research outputs found

    Optimal Sensor Data Fusion Architecture for Object Detection in Adverse Weather Conditions

    Full text link
    Robust sensor data fusion in diverse weather conditions is a challenging task. Several fusion architectures exist in the literature: the sensor data can be fused right at the beginning (Early Fusion), or the modalities can first be processed separately and concatenated later (Late Fusion). In this work, different fusion architectures are compared and evaluated on object detection tasks, in which the goal is to recognize and localize predefined objects in a stream of data. State-of-the-art object detectors based on neural networks are usually highly optimized for good weather, since the well-known benchmarks consist only of sensor data recorded in optimal weather conditions. As a result, the performance of these approaches degrades severely, or they fail entirely, in adverse weather. In this work, different sensor fusion architectures are compared in good and adverse weather conditions to find the optimal fusion architecture for diverse weather situations. A new training strategy is also introduced that greatly improves the object detector's performance in adverse weather scenarios or when a sensor fails. Furthermore, the paper addresses the question of whether detection accuracy can be increased further by providing the neural network with a-priori knowledge such as the spatial calibration of the sensors. Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
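
    A minimal sketch of the early-versus-late fusion distinction described above, assuming a PyTorch setting with a camera image and a projected LiDAR map as the two modalities; the channel sizes, layer depths, and class names are illustrative and do not reproduce the paper's architecture.

import torch
import torch.nn as nn

class EarlyFusion(nn.Module):
    """Concatenate the raw inputs first, then run a single shared backbone."""
    def __init__(self, cam_ch=3, lidar_ch=1, feat_ch=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(cam_ch + lidar_ch, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(),
        )

    def forward(self, cam, lidar):
        return self.backbone(torch.cat([cam, lidar], dim=1))  # fuse at the input

class LateFusion(nn.Module):
    """Process each modality separately, then concatenate the feature maps."""
    def __init__(self, cam_ch=3, lidar_ch=1, feat_ch=32):
        super().__init__()
        self.cam_net = nn.Sequential(nn.Conv2d(cam_ch, feat_ch, 3, padding=1), nn.ReLU())
        self.lidar_net = nn.Sequential(nn.Conv2d(lidar_ch, feat_ch, 3, padding=1), nn.ReLU())

    def forward(self, cam, lidar):
        return torch.cat([self.cam_net(cam), self.lidar_net(lidar)], dim=1)  # fuse late

# Quick check on dummy tensors (batch of 2, 64x64 inputs).
cam, lidar = torch.rand(2, 3, 64, 64), torch.rand(2, 1, 64, 64)
print(EarlyFusion()(cam, lidar).shape, LateFusion()(cam, lidar).shape)

    The only difference is where the concatenation happens: before any processing (early) or after per-modality feature extraction (late); a detection head would then operate on the fused feature map.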

    Multi-View 3D Object Detection Network for Autonomous Driving

    Full text link
    This paper aims at high-accuracy 3D object detection in autonomous driving scenarios. We propose Multi-View 3D networks (MV3D), a sensor-fusion framework that takes both LIDAR point clouds and RGB images as input and predicts oriented 3D bounding boxes. We encode the sparse 3D point cloud with a compact multi-view representation. The network is composed of two subnetworks: one for 3D object proposal generation and another for multi-view feature fusion. The proposal network generates 3D candidate boxes efficiently from the bird's-eye-view representation of the 3D point cloud. We design a deep fusion scheme to combine region-wise features from multiple views and enable interactions between intermediate layers of different paths. Experiments on the challenging KITTI benchmark show that our approach outperforms the state-of-the-art by around 25% and 30% AP on the tasks of 3D localization and 3D detection, respectively. In addition, for 2D detection, our approach obtains 10.3% higher AP than the state-of-the-art among LIDAR-based methods on the hard data. Comment: To appear in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017.
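
    A short NumPy sketch of a bird's-eye-view encoding of a LiDAR point cloud, in the spirit of the multi-view input described above; the grid extent, cell resolution, and the choice of height/intensity/density channels are assumptions for illustration, not MV3D's exact parameters.

import numpy as np

def lidar_to_bev(points, x_range=(0.0, 70.0), y_range=(-40.0, 40.0), res=0.1):
    """points: (N, 4) array of x, y, z, intensity -> (H, W, 3) BEV map."""
    keep = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[keep]
    H = int(round((x_range[1] - x_range[0]) / res))
    W = int(round((y_range[1] - y_range[0]) / res))
    xi = np.clip(((pts[:, 0] - x_range[0]) / res).astype(int), 0, H - 1)  # forward -> rows
    yi = np.clip(((pts[:, 1] - y_range[0]) / res).astype(int), 0, W - 1)  # lateral -> cols
    bev = np.zeros((H, W, 3))
    np.maximum.at(bev[:, :, 0], (xi, yi), pts[:, 2])  # channel 0: max height per cell
    np.maximum.at(bev[:, :, 1], (xi, yi), pts[:, 3])  # channel 1: max intensity per cell
    np.add.at(bev[:, :, 2], (xi, yi), 1.0)            # channel 2: point count per cell
    bev[:, :, 2] = np.log1p(bev[:, :, 2])             # compress the density channel
    return bev

# Quick check on a random point cloud (x, y, z, intensity).
cloud = np.random.rand(10000, 4) * [70.0, 80.0, 3.0, 1.0] - [0.0, 40.0, 1.0, 0.0]
print(lidar_to_bev(cloud).shape)

    A per-view convolutional branch can then extract region-wise features from this map, from the RGB image, and from any other view, to be combined by the fusion subnetwork.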

    Pedestrian Detection at Day/Night Time with Visible and FIR Cameras: A Comparison

    Get PDF
    Other grants: DGT (SPIP2014-01352). Despite all the significant advances in pedestrian detection brought by computer vision for driving assistance, it remains a challenging problem. One reason is the extremely varying lighting conditions under which such a detector should operate, namely daytime and nighttime. Recent research has shown that combining visible and non-visible imaging modalities may increase detection accuracy, where the infrared spectrum plays a critical role. The goal of this paper is to assess the accuracy gain of different pedestrian models (holistic, part-based, patch-based) when training with images in the far infrared spectrum. Specifically, we want to compare detection accuracy on test images recorded at day and nighttime when trained (and tested) using (a) plain color images; (b) just infrared images; and (c) both of them. In order to obtain results for the last item, we propose an early fusion approach to combine features from both modalities. We base the evaluation on a new dataset that we have built for this purpose as well as on the publicly available KAIST multispectral dataset.
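
    The early fusion in option (c) can be sketched as follows, assuming HOG descriptors and a linear SVM as stand-ins for the holistic/part-based/patch-based pedestrian models actually evaluated; the window size, labels, and training call are purely illustrative.

import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def fused_descriptor(visible, fir):
    """visible, fir: aligned 2-D detection windows -> one joint feature vector."""
    f_vis = hog(visible, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    f_fir = hog(fir, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    return np.concatenate([f_vis, f_fir])  # early fusion: concatenate before the classifier

# Quick check with random 128x64 windows standing in for pedestrian-sized crops.
X = np.stack([fused_descriptor(np.random.rand(128, 64), np.random.rand(128, 64))
              for _ in range(20)])
y = np.array([0, 1] * 10)  # dummy background / pedestrian labels
clf = LinearSVC(dual=False).fit(X, y)
print(clf.score(X, y))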

    Connecting the dots for real-time LiDAR-based object detection with YOLO

    Full text link
    © 2018 Australasian Robotics and Automation Association. All rights reserved. In this paper we introduce a generic method for people and vehicle detection using LiDAR data only, leveraging a pre-trained Convolutional Neural Network (CNN) from the RGB domain. With machine learning algorithms there is typically an inherent trade-off between the amount of training data available and the need for engineered features. Current state-of-the-art object detection and classification methods rely heavily on deep CNNs trained on enormous RGB image datasets. To take advantage of this built-in knowledge, we propose to fine-tune the You Only Look Once (YOLO) network, transferring its understanding of object shapes to upsampled LiDAR images. Our method creates a dense depth/intensity map, which highlights object contours, from the 3D point cloud of a LiDAR scan. The proposed method is hardware agnostic and can therefore be used with any LiDAR data, independent of the number of channels or beams. Overall, the proposed pipeline exploits the notable similarity between upsampled LiDAR images and RGB images, removing the need to train a deep CNN from scratch. This transfer learning makes our method data efficient while avoiding the creation of heavily engineered features. Evaluation results show that our proposed LiDAR-only detection model achieves performance equivalent to its RGB-only counterpart.
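
    A rough sketch of turning a LiDAR scan into a dense depth/intensity image that an RGB-pretrained detector such as YOLO could be fine-tuned on; the spherical projection parameters and the dilation-based densification are illustrative assumptions, not the authors' exact upsampling pipeline.

import numpy as np
from scipy.ndimage import grey_dilation

def lidar_to_dense_image(points, h=64, w=512, fov_up=3.0, fov_down=-25.0):
    """points: (N, 4) x, y, z, intensity -> (h, w, 2) dense depth/intensity image."""
    x, y, z, inten = points.T
    depth = np.sqrt(x**2 + y**2 + z**2)
    yaw = np.arctan2(y, x)
    pitch = np.arcsin(z / np.maximum(depth, 1e-6))
    fov = np.radians(fov_up) - np.radians(fov_down)
    u = ((1.0 - (yaw + np.pi) / (2 * np.pi)) * w).astype(int).clip(0, w - 1)
    v = ((1.0 - (pitch - np.radians(fov_down)) / fov) * h).astype(int).clip(0, h - 1)
    img = np.zeros((h, w, 2))           # sparse front-view projection
    img[v, u, 0] = depth
    img[v, u, 1] = inten
    for c in range(2):                  # densify: fill the gaps between beams
        img[:, :, c] = grey_dilation(img[:, :, c], size=(3, 3))
    return img

# Quick check on a random scan.
print(lidar_to_dense_image(np.random.randn(20000, 4)).shape)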

    Simultaneous fusion, classification, and tracking of moving obstacles by LIDAR and camera using a Bayesian algorithm

    Get PDF
    In the near future, preventing collisions with fixed or moving, animate or inanimate obstacles will be a severe challenge due to the increased use of Unmanned Ground Vehicles (UGVs). Light Detection and Ranging (LIDAR) sensors and cameras are commonly used on UGVs to detect obstacles. Accurate tracking and classification of moving obstacles is a significant dimension of advanced driver assistance systems, and it is believed that the perceived model of the situation can be improved by incorporating obstacle classification. The present study presents a multi-hypothesis tracking and classification approach, which resolves ambiguities that arise with previous methods of associating and classifying targets and tracks in a highly volatile vehicular environment. The method was tested on real data from various driving scenarios, focusing on two obstacle classes of interest: vehicles and pedestrians.
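
    The recursive Bayesian class update implied by the abstract can be sketched as follows; the two-class setup and the per-frame likelihood values are illustrative assumptions rather than the study's actual measurement model.

import numpy as np

CLASSES = ("vehicle", "pedestrian")

def bayes_update(prior, likelihood):
    """prior, likelihood: arrays over CLASSES -> normalized posterior belief."""
    posterior = prior * likelihood
    return posterior / posterior.sum()

# A newly created track starts from an uninformative class belief.
belief = np.array([0.5, 0.5])

# Per-frame likelihoods P(measurement | class), e.g. combining LIDAR shape
# cues and camera appearance cues (values are made up for illustration).
frame_likelihoods = [np.array([0.3, 0.7]),
                     np.array([0.2, 0.8]),
                     np.array([0.4, 0.6])]

for lik in frame_likelihoods:
    belief = bayes_update(belief, lik)
    print(dict(zip(CLASSES, belief.round(3))))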