PIXOR: Real-time 3D Object Detection from Point Clouds
We address the problem of real-time 3D object detection from point clouds in
the context of autonomous driving. Computation speed is critical as detection
is a necessary component for safety. Existing approaches are, however,
expensive in computation due to high dimensionality of point clouds. We utilize
the 3D data more efficiently by representing the scene from the Bird's Eye View
(BEV), and propose PIXOR, a proposal-free, single-stage detector that outputs
oriented 3D object estimates decoded from pixel-wise neural network
predictions. The input representation, network architecture, and model
optimization are especially designed to balance high accuracy and real-time
efficiency. We validate PIXOR on two datasets: the KITTI BEV object detection
benchmark, and a large-scale 3D vehicle detection benchmark. In both datasets
we show that the proposed detector surpasses other state-of-the-art methods
notably in terms of Average Precision (AP), while still running at >28 FPS.
Comment: Update of CVPR2018 paper: correct timing, fix typos, add
acknowledgemen
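The Bird's Eye View representation mentioned in the abstract can be illustrated with a minimal sketch. This is not PIXOR's actual input encoding: the function name `bev_occupancy`, the default ranges, and the cell size are assumptions chosen for illustration, and the sketch discards height entirely, whereas a real detector would typically keep some height information.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in metres, sensor frame (assumed)


def bev_occupancy(
    points: Sequence[Point],
    x_range: Tuple[float, float] = (0.0, 70.0),   # illustrative default
    y_range: Tuple[float, float] = (-40.0, 40.0),  # illustrative default
    cell: float = 0.1,                             # illustrative default
) -> List[List[int]]:
    """Rasterise points into a top-down binary occupancy grid.

    Rows index x (forward), columns index y (lateral); z is ignored.
    """
    n_rows = round((x_range[1] - x_range[0]) / cell)
    n_cols = round((y_range[1] - y_range[0]) / cell)
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y, _z in points:
        # Skip points outside the region of interest.
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue
        # Clamp to guard against floating-point rounding at the upper edge.
        row = min(int((x - x_range[0]) / cell), n_rows - 1)
        col = min(int((y - y_range[0]) / cell), n_cols - 1)
        grid[row][col] = 1
    return grid
```

The resulting grid is a dense 2D array, so it can be fed to an ordinary 2D convolutional network, which is the efficiency argument the abstract makes for working in BEV.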
Towards Safe Autonomous Driving: Capture Uncertainty in the Deep Neural Network For Lidar 3D Vehicle Detection
To assure that an autonomous car is driving safely on public roads, its
object detection module should not only work correctly, but show its prediction
confidence as well. Previous object detectors driven by deep learning do not
explicitly model uncertainties in the neural network. We tackle this
problem by presenting practical methods to capture uncertainties in a 3D
vehicle detector for Lidar point clouds. The proposed probabilistic detector
represents reliable epistemic uncertainty and aleatoric uncertainty in
classification and localization tasks. Experimental results show that the
epistemic uncertainty is related to the detection accuracy, whereas the
aleatoric uncertainty is influenced by vehicle distance and occlusion. The
results also show that we can improve the detection performance by 1%-5% by
modeling the aleatoric uncertainty.
Comment: Accepted for presentation at the 21st IEEE International Conference on
Intelligent Transportation Systems (ITSC 2018)
SqueezeSeg: Convolutional Neural Nets with Recurrent CRF for Real-Time Road-Object Segmentation from 3D LiDAR Point Cloud
In this paper, we address semantic segmentation of road-objects from 3D LiDAR
point clouds. In particular, we wish to detect and categorize instances of
interest, such as cars, pedestrians and cyclists. We formulate this problem as
a point-wise classification problem, and propose an end-to-end pipeline called
SqueezeSeg based on convolutional neural networks (CNN): the CNN takes a
transformed LiDAR point cloud as input and directly outputs a point-wise label
map, which is then refined by a conditional random field (CRF) implemented as a
recurrent layer. Instance-level labels are then obtained by conventional
clustering algorithms. Our CNN model is trained on LiDAR point clouds from the
KITTI dataset, and our point-wise segmentation labels are derived from 3D
bounding boxes from KITTI. To obtain extra training data, we built a LiDAR
simulator into Grand Theft Auto V (GTA-V), a popular video game, to synthesize
large amounts of realistic training data. Our experiments show that SqueezeSeg
achieves high accuracy with astonishingly fast and stable runtime (8.7 ms per
frame), highly desirable for autonomous driving applications. Furthermore,
additionally training on synthesized data boosts validation accuracy on
real-world data. Our source code and synthesized data will be open-sourced.
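Deriving point-wise labels from 3D bounding boxes, as the abstract describes, amounts to a point-in-box test. The sketch below is one plausible way to do it, not the authors' code: the function names, the box layout (geometric centre, length/width/height, yaw about the z axis, all in the LiDAR frame), and the tuple format of `boxes` are assumptions. KITTI's own annotation format uses a different convention and would need converting first.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
# Hypothetical box format: (class_id, centre, (length, width, height), yaw)
Box = Tuple[int, Point, Tuple[float, float, float], float]


def points_in_box(points: Sequence[Point], center: Point,
                  size: Tuple[float, float, float], yaw: float) -> List[bool]:
    """Return one flag per point: True if it lies inside the oriented box."""
    cx, cy, cz = center
    length, width, height = size
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    flags = []
    for x, y, z in points:
        dx, dy, dz = x - cx, y - cy, z - cz
        # Rotate the offset by -yaw so the box becomes axis-aligned.
        local_x = cos_y * dx + sin_y * dy
        local_y = -sin_y * dx + cos_y * dy
        flags.append(
            abs(local_x) <= length / 2
            and abs(local_y) <= width / 2
            and abs(dz) <= height / 2
        )
    return flags


def label_points(points: Sequence[Point], boxes: Sequence[Box],
                 background: int = 0) -> List[int]:
    """Assign each point the class of a box containing it, else background."""
    labels = [background] * len(points)
    for class_id, center, size, yaw in boxes:
        for i, inside in enumerate(points_in_box(points, center, size, yaw)):
            if inside:
                labels[i] = class_id
    return labels
```

If boxes overlap, this sketch lets the later box win; a real pipeline would need an explicit rule for that case.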
Deep Generative Modeling of LiDAR Data
Building models capable of generating structured output is a key challenge
for AI and robotics. While generative models have been explored on many types
of data, little work has been done on synthesizing lidar scans, which play a
key role in robot mapping and localization. In this work, we show that one can
adapt deep generative models for this task by unravelling lidar scans into a 2D
point map. Our approach can generate high quality samples, while simultaneously
learning a meaningful latent representation of the data. We demonstrate
significant improvements against state-of-the-art point cloud generation
methods. Furthermore, we propose a novel data representation that augments the
2D signal with absolute positional information. We show that this helps
robustness to noisy and imputed input; the learned model can recover the
underlying lidar scan from seemingly uninformative data.
Comment: Presented at IROS 201
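The idea of unravelling a lidar scan into a 2D map can be sketched by binning each return by its elevation and azimuth angles. The abstract does not give the exact mapping, so everything here is an assumption: the function name `scan_to_grid`, the grid size, the elevation limits, and the choice to store range in each cell.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def scan_to_grid(
    points: Sequence[Point],
    n_rows: int = 4,          # illustrative number of elevation bins
    n_cols: int = 8,          # illustrative number of azimuth bins
    elev_min: float = -0.4,   # radians, assumed sensor field of view
    elev_max: float = 0.1,    # radians, assumed sensor field of view
) -> List[List[float]]:
    """Bin (x, y, z) returns into a 2D grid indexed by (elevation, azimuth).

    Each cell stores the range of the last point that fell into it;
    0.0 marks a cell with no return.
    """
    grid = [[0.0] * n_cols for _ in range(n_rows)]
    for x, y, z in points:
        rng = math.sqrt(x * x + y * y + z * z)
        if rng == 0.0:
            continue
        azimuth = math.atan2(y, x)        # in [-pi, pi]
        elevation = math.asin(z / rng)    # in [-pi/2, pi/2]
        if not (elev_min <= elevation <= elev_max):
            continue
        # The modulo wraps azimuth == pi back to column 0.
        col = int((azimuth + math.pi) / (2 * math.pi) * n_cols) % n_cols
        row = min(
            int((elevation - elev_min) / (elev_max - elev_min) * n_rows),
            n_rows - 1,
        )
        grid[row][col] = rng
    return grid
```

Because neighbouring cells correspond to neighbouring beams, the grid has image-like structure, which is what lets standard 2D generative models be applied to it.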
LIDAR-Camera Fusion for Road Detection Using Fully Convolutional Neural Networks
In this work, a deep learning approach has been developed to carry out road
detection by fusing LIDAR point clouds and camera images. An unstructured and
sparse point cloud is first projected onto the camera image plane and then
upsampled to obtain a set of dense 2D images encoding spatial information.
Several fully convolutional neural networks (FCNs) are then trained to carry
out road detection, either by using data from a single sensor, or by using
three fusion strategies: early, late, and the newly proposed cross fusion.
Whereas in the former two fusion approaches, the integration of multimodal
information is carried out at a predefined depth level, the cross fusion FCN is
designed to directly learn from data where to integrate information; this is
accomplished by using trainable cross connections between the LIDAR and the
camera processing branches.
To further highlight the benefits of using a multimodal system for road
detection, a data set consisting of visually challenging scenes was extracted
from driving sequences of the KITTI raw data set. It was then demonstrated
that, as expected, a purely camera-based FCN severely underperforms on this
data set. A multimodal system, on the other hand, is still able to provide high
accuracy. Finally, the proposed cross fusion FCN was evaluated on the KITTI
road benchmark where it achieved excellent performance, with a MaxF score of
96.03%, ranking it among the top-performing approaches.
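The first step described in the abstract, projecting the point cloud onto the camera image plane, can be sketched with a plain pinhole model. This is an illustration under assumptions, not the paper's pipeline: the points are assumed to be already expressed in the camera frame (x right, y down, z forward), the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical parameters, and the upsampling to dense images is not shown.

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float, float]


def project_to_image(
    points_cam: Sequence[Point],
    fx: float, fy: float, cx: float, cy: float,
    width: int, height: int,
) -> Dict[Tuple[int, int], float]:
    """Project camera-frame points to pixels with a pinhole model.

    Returns a sparse depth map {(u, v): depth}; when several points hit
    the same pixel, the nearest one is kept.
    """
    depth: Dict[Tuple[int, int], float] = {}
    for x, y, z in points_cam:
        if z <= 0.0:
            continue  # point is behind the camera
        u = math.floor(fx * x / z + cx)
        v = math.floor(fy * y / z + cy)
        if 0 <= u < width and 0 <= v < height:
            if (u, v) not in depth or z < depth[(u, v)]:
                depth[(u, v)] = z
    return depth
```

The output is sparse because a LiDAR returns far fewer points than the camera has pixels, which is why the abstract mentions an upsampling step before the images are passed to the FCNs.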
Training a Fast Object Detector for LiDAR Range Images Using Labeled Data from Sensors with Higher Resolution
In this paper, we describe a strategy for training neural networks for object
detection in range images obtained from one type of LiDAR sensor using labeled
data from a different type of LiDAR sensor. Additionally, an efficient model
for object detection in range images for use in self-driving cars is presented.
Currently, the highest performing algorithms for object detection from LiDAR
measurements are based on neural networks. Training these networks using
supervised learning requires large annotated datasets. Therefore, most research
using neural networks for object detection from LiDAR point clouds is conducted
on a very small number of publicly available datasets. Consequently, only a
small number of sensor types are used. We use an existing annotated dataset to
train a neural network that can be used with a LiDAR sensor that has a lower
resolution than the one used for recording the annotated dataset. This is done
by simulating data from the lower resolution LiDAR sensor based on the higher
resolution dataset. Furthermore, improvements to models that use LiDAR range
images for object detection are presented. The results are validated using both
simulated sensor data and data from an actual lower resolution sensor mounted
to a research vehicle. It is shown that the model can detect objects from
360° range images in real time.
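The abstract does not say how the lower-resolution sensor is simulated. One simple possibility, shown here purely as an assumption, is to keep only every k-th row of a range image, on the premise that each row corresponds to one laser beam. The function name `subsample_rows` and its parameters are hypothetical.

```python
from typing import List, Sequence, Tuple


def subsample_rows(
    range_image: Sequence[Sequence[float]],
    labels: Sequence[Sequence[int]],
    keep_every: int = 2,
    offset: int = 0,
) -> Tuple[List[List[float]], List[List[int]]]:
    """Keep every `keep_every`-th row, starting at `offset`.

    The same rows are kept in the label image so annotations stay aligned
    with the simulated lower-resolution scan.
    """
    kept = range(offset, len(range_image), keep_every)
    return (
        [list(range_image[r]) for r in kept],
        [list(labels[r]) for r in kept],
    )
```

A real sensor may also differ in beam angles, horizontal resolution, and noise, none of which this sketch models.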