6,485 research outputs found
SINet: A Scale-insensitive Convolutional Neural Network for Fast Vehicle Detection
Vision-based vehicle detection approaches achieve incredible success in
recent years with the development of deep convolutional neural network (CNN).
However, existing CNN based algorithms suffer from the problem that the
convolutional features are scale-sensitive in object detection task but it is
common that traffic images and videos contain vehicles with a large variance of
scales. In this paper, we delve into the source of scale sensitivity, and
reveal two key issues: 1) existing RoI pooling destroys the structure of small
scale objects, 2) the large intra-class distance for a large variance of scales
exceeds the representation capability of a single network. Based on these
findings, we present a scale-insensitive convolutional neural network (SINet)
for fast detecting vehicles with a large variance of scales. First, we present
a context-aware RoI pooling to maintain the contextual information and original
structure of small scale objects. Second, we present a multi-branch decision
network to minimize the intra-class distance of features. These lightweight
techniques bring zero extra time complexity but prominent detection accuracy
improvement. The proposed techniques can be equipped with any deep network
architectures and keep them trained end-to-end. Our SINet achieves
state-of-the-art performance in terms of accuracy and speed (up to 37 FPS) on
the KITTI benchmark and a new highway dataset, which contains a large variance
of scales and extremely small objects.Comment: Accepted by IEEE Transactions on Intelligent Transportation Systems
(T-ITS
Fast LIDAR-based Road Detection Using Fully Convolutional Neural Networks
In this work, a deep learning approach has been developed to carry out road
detection using only LIDAR data. Starting from an unstructured point cloud,
top-view images encoding several basic statistics such as mean elevation and
density are generated. By considering a top-view representation, road detection
is reduced to a single-scale problem that can be addressed with a simple and
fast fully convolutional neural network (FCN). The FCN is specifically designed
for the task of pixel-wise semantic segmentation by combining a large receptive
field with high-resolution feature maps. The proposed system achieved excellent
performance and it is among the top-performing algorithms on the KITTI road
benchmark. Its fast inference makes it particularly suitable for real-time
applications
LIDAR-Camera Fusion for Road Detection Using Fully Convolutional Neural Networks
In this work, a deep learning approach has been developed to carry out road
detection by fusing LIDAR point clouds and camera images. An unstructured and
sparse point cloud is first projected onto the camera image plane and then
upsampled to obtain a set of dense 2D images encoding spatial information.
Several fully convolutional neural networks (FCNs) are then trained to carry
out road detection, either by using data from a single sensor, or by using
three fusion strategies: early, late, and the newly proposed cross fusion.
Whereas in the former two fusion approaches, the integration of multimodal
information is carried out at a predefined depth level, the cross fusion FCN is
designed to directly learn from data where to integrate information; this is
accomplished by using trainable cross connections between the LIDAR and the
camera processing branches.
To further highlight the benefits of using a multimodal system for road
detection, a data set consisting of visually challenging scenes was extracted
from driving sequences of the KITTI raw data set. It was then demonstrated
that, as expected, a purely camera-based FCN severely underperforms on this
data set. A multimodal system, on the other hand, is still able to provide high
accuracy. Finally, the proposed cross fusion FCN was evaluated on the KITTI
road benchmark where it achieved excellent performance, with a MaxF score of
96.03%, ranking it among the top-performing approaches
An Empirical Evaluation of Deep Learning on Highway Driving
Numerous groups have applied a variety of deep learning techniques to
computer vision problems in highway perception scenarios. In this paper, we
presented a number of empirical evaluations of recent deep learning advances.
Computer vision, combined with deep learning, has the potential to bring about
a relatively inexpensive, robust solution to autonomous driving. To prepare
deep learning for industry uptake and practical applications, neural networks
will require large data sets that represent all possible driving environments
and scenarios. We collect a large data set of highway data and apply deep
learning and computer vision algorithms to problems such as car and lane
detection. We show how existing convolutional neural networks (CNNs) can be
used to perform lane and vehicle detection while running at frame rates
required for a real-time system. Our results lend credence to the hypothesis
that deep learning holds promise for autonomous driving.Comment: Added a video for lane detectio
Driver Distraction Identification with an Ensemble of Convolutional Neural Networks
The World Health Organization (WHO) reported 1.25 million deaths yearly due
to road traffic accidents worldwide and the number has been continuously
increasing over the last few years. Nearly fifth of these accidents are caused
by distracted drivers. Existing work of distracted driver detection is
concerned with a small set of distractions (mostly, cell phone usage).
Unreliable ad-hoc methods are often used.In this paper, we present the first
publicly available dataset for driver distraction identification with more
distraction postures than existing alternatives. In addition, we propose a
reliable deep learning-based solution that achieves a 90% accuracy. The system
consists of a genetically-weighted ensemble of convolutional neural networks,
we show that a weighted ensemble of classifiers using a genetic algorithm
yields in a better classification confidence. We also study the effect of
different visual elements in distraction detection by means of face and hand
localizations, and skin segmentation. Finally, we present a thinned version of
our ensemble that could achieve 84.64% classification accuracy and operate in a
real-time environment.Comment: arXiv admin note: substantial text overlap with arXiv:1706.0949
- …