67 research outputs found
Optimal Sensor Data Fusion Architecture for Object Detection in Adverse Weather Conditions
Robust sensor data fusion in diverse weather conditions is a challenging task. Several fusion architectures exist in the literature: the sensor data can be fused right at the beginning (Early Fusion), or first processed separately and concatenated later (Late Fusion). In this work, different fusion architectures are compared and evaluated on object detection tasks, in which the goal is to recognize and localize predefined objects in a stream of data. State-of-the-art object detectors based on neural networks are usually highly optimized for good weather conditions, since the well-known benchmarks consist only of sensor data recorded in optimal weather. Consequently, the performance of these approaches degrades severely, or fails entirely, in adverse weather conditions. In this work, different sensor fusion architectures are compared under good and adverse weather conditions to find the optimal fusion architecture for diverse weather situations. A new training strategy is also introduced that greatly enhances the performance of the object detector in adverse weather scenarios or if a sensor fails. Furthermore, the paper addresses the question of whether detection accuracy can be increased further by providing the neural network with a-priori knowledge such as the spatial calibration of the sensors.
Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
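The early/late fusion distinction described above can be illustrated with a toy sketch. All shapes, the stand-in feature extractor, and the input vectors below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy inputs: a camera "image" and a LiDAR "depth map", flattened to vectors.
camera = rng.standard_normal(16)
lidar = rng.standard_normal(16)

def extractor(x, out_dim=8):
    """Stand-in feature extractor: a fixed linear projection + ReLU."""
    w = np.full((out_dim, x.shape[0]), 0.1)
    return np.maximum(w @ x, 0.0)

# Early Fusion: concatenate the raw sensor data, then extract features jointly,
# so the network can exploit cross-sensor correlations from the first layer on.
early_features = extractor(np.concatenate([camera, lidar]))

# Late Fusion: extract features per sensor independently, then concatenate,
# so each branch can specialize (and keep working if the other sensor fails).
late_features = np.concatenate([extractor(camera), extractor(lidar)])
```

The structural trade-off is visible even in this sketch: the early-fusion branch sees one joint input, while the late-fusion branches never interact until their outputs are concatenated.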
Multi-View 3D Object Detection Network for Autonomous Driving
This paper aims at high-accuracy 3D object detection in autonomous driving
scenario. We propose Multi-View 3D networks (MV3D), a sensory-fusion framework
that takes both LIDAR point cloud and RGB images as input and predicts oriented
3D bounding boxes. We encode the sparse 3D point cloud with a compact
multi-view representation. The network is composed of two subnetworks: one for
3D object proposal generation and another for multi-view feature fusion. The
proposal network generates 3D candidate boxes efficiently from the bird's eye
view representation of 3D point cloud. We design a deep fusion scheme to
combine region-wise features from multiple views and enable interactions
between intermediate layers of different paths. Experiments on the challenging
KITTI benchmark show that our approach outperforms the state-of-the-art by
around 25% and 30% AP on the tasks of 3D localization and 3D detection. In
addition, for 2D detection, our approach obtains 10.3% higher AP than the
state-of-the-art on the hard data among the LIDAR-based methods.
Comment: To appear in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017.
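The bird's eye view encoding that MV3D's proposal network operates on can be sketched as a simple discretization of the point cloud into height and density channels. Grid resolution, ranges, and the toy points below are illustrative assumptions; the paper's full encoding uses multiple height slices and an intensity channel:

```python
import numpy as np

# Toy point cloud: rows of (x, y, z) in metres; values are illustrative.
points = np.array([[1.2, 0.4, 0.3],
                   [1.3, 0.5, 1.1],
                   [4.8, 2.1, 0.2]])

res = 1.0           # grid cell size in metres (assumed)
x_range = (0, 5)    # forward extent (assumed)
y_range = (0, 3)    # lateral extent (assumed)

nx = int((x_range[1] - x_range[0]) / res)
ny = int((y_range[1] - y_range[0]) / res)
height = np.zeros((nx, ny))   # max point height per cell
density = np.zeros((nx, ny))  # number of points per cell

for x, y, z in points:
    i = int((x - x_range[0]) / res)
    j = int((y - y_range[0]) / res)
    if 0 <= i < nx and 0 <= j < ny:
        height[i, j] = max(height[i, j], z)
        density[i, j] += 1
```

Stacking such 2D maps yields a compact image-like tensor, which is what lets a standard 2D proposal network generate 3D candidate boxes efficiently.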
Pedestrian Detection at Day/Night Time with Visible and FIR Cameras : A Comparison
Other grants: DGT (SPIP2014-01352). Despite all the significant advances in pedestrian detection brought by computer vision for driving assistance, it is still a challenging problem. One reason is the extremely varying lighting conditions under which such a detector should operate, namely day and nighttime. Recent research has shown that the combination of visible and non-visible imaging modalities may increase detection accuracy, where the infrared spectrum plays a critical role. The goal of this paper is to assess the accuracy gain of different pedestrian models (holistic, part-based, patch-based) when training with images in the far infrared spectrum. Specifically, we want to compare detection accuracy on test images recorded at day and nighttime if trained (and tested) using (a) plain color images; (b) just infrared images; and (c) both of them. In order to obtain results for the last item, we propose an early fusion approach to combine features from both modalities. We base the evaluation on a new dataset that we have built for this purpose as well as on the publicly available KAIST multispectral dataset.
Connecting the dots for real-time LiDAR-based object detection with YOLO
© 2018 Australasian Robotics and Automation Association. All rights reserved. In this paper we introduce a generic method for people and vehicle detection using LiDAR data only, leveraging a pre-trained Convolutional Neural Network (CNN) from the RGB domain. Typically with machine learning algorithms, there is an inherent trade-off between the amount of training data available and the need for engineered features. Current state-of-the-art object detection and classification methods rely heavily on deep CNNs trained on enormous RGB image datasets. To take advantage of this inbuilt knowledge, we propose to fine-tune the You Only Look Once (YOLO) network, transferring its understanding of object shapes to upsampled LiDAR images. Our method creates a dense depth/intensity map, which highlights object contours, from the 3D point cloud of a LiDAR scan. The proposed method is hardware agnostic, hence can be used with any LiDAR data, independently of the number of channels or beams. Overall, the proposed pipeline exploits the notable similarity between upsampled LiDAR images and RGB images, avoiding the need to train a deep CNN from scratch. This transfer learning makes our method data efficient while avoiding the creation of heavily engineered features. Evaluation results show that our proposed LiDAR-only detection model has equivalent performance to its RGB-only counterpart.
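Producing an image-like depth/intensity map from a LiDAR scan typically starts with a spherical projection of the points onto a pixel grid, as sketched below. The image size, field of view, and toy scan are illustrative assumptions; the paper additionally upsamples the result into a dense map, which this sketch omits:

```python
import numpy as np

# Toy LiDAR scan: rows of (x, y, z, intensity); values are illustrative.
scan = np.array([[5.0, 0.0, 0.5, 0.9],
                 [0.0, 5.0, 1.0, 0.4]])

H, W = 16, 64                    # target image size (assumed)
fov_up, fov_down = 15.0, -15.0   # vertical field of view in degrees (assumed)

x, y, z, intensity = scan.T
r = np.sqrt(x**2 + y**2 + z**2)   # range to each point
yaw = np.arctan2(y, x)            # azimuth angle in [-pi, pi]
pitch = np.arcsin(z / r)          # elevation angle

# Map angles to pixel coordinates: column from azimuth, row from elevation.
u = ((1.0 - yaw / np.pi) / 2.0 * W).astype(int) % W
fov = np.radians(fov_up - fov_down)
v = ((np.radians(fov_up) - pitch) / fov * H).astype(int).clip(0, H - 1)

img = np.zeros((H, W, 2))  # two channels: depth and intensity
img[v, u, 0] = r
img[v, u, 1] = intensity
```

Because the result is a regular 2D image, a network such as YOLO pre-trained on RGB data can be fine-tuned on it without architectural changes.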
Simultaneous fusion, classification, and tracking of moving obstacles by LIDAR and camera using a Bayesian algorithm
In the near future, preventing collisions with fixed or moving, alive, and inanimate obstacles will be a severe challenge due to the increased use of Unmanned Ground Vehicles (UGVs). Light Detection and Ranging (LIDAR) sensors and cameras are usually used in UGVs to detect obstacles. The definite tracking and classification of moving obstacles is a significant dimension of advanced driver assistance systems. It is believed that the perceived model of the situation can be improved by incorporating the obstacle classification. The present study presents a multi-hypothesis monitoring and classification approach, which resolves ambiguities that arise with previous methods of associating and classifying targets and tracks in a highly volatile vehicular situation. This method was tested on real data from various driving scenarios, focusing on two obstacles of interest: vehicles and pedestrians.
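The core of Bayesian track classification is a recursive belief update over classes as detections arrive. The sketch below shows one such update; the class set, prior, and likelihood values are illustrative assumptions, not the paper's model:

```python
# Minimal recursive Bayes update for a track's class belief.
classes = ["vehicle", "pedestrian"]
belief = {"vehicle": 0.5, "pedestrian": 0.5}  # uniform prior over classes

# P(observation | true class) when the detector reports "vehicle"
# (illustrative confusion values, not from the paper).
likelihood = {"vehicle": 0.8, "pedestrian": 0.2}

def bayes_update(belief, likelihood):
    """Posterior ∝ prior × likelihood, renormalized over all classes."""
    posterior = {c: belief[c] * likelihood[c] for c in belief}
    norm = sum(posterior.values())
    return {c: p / norm for c, p in posterior.items()}

belief = bayes_update(belief, likelihood)
```

Feeding each new detection through the same update accumulates evidence over time, which is what lets the class estimate stabilize even when individual detections are ambiguous.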