Driver Distraction Identification with an Ensemble of Convolutional Neural Networks
The World Health Organization (WHO) reported 1.25 million deaths yearly due
to road traffic accidents worldwide and the number has been continuously
increasing over the last few years. Nearly a fifth of these accidents are
caused by distracted drivers. Existing work on distracted driver detection is
concerned with a small set of distractions (mostly cell phone usage), and
unreliable ad-hoc methods are often used. In this paper, we present the first
publicly available dataset for driver distraction identification with more
distraction postures than existing alternatives. In addition, we propose a
reliable deep learning-based solution that achieves 90% accuracy. The system
consists of a genetically weighted ensemble of convolutional neural networks;
we show that weighting an ensemble of classifiers with a genetic algorithm
yields better classification confidence. We also study the effect of
different visual elements in distraction detection by means of face and hand
localizations, and skin segmentation. Finally, we present a thinned version of
our ensemble that could achieve 84.64% classification accuracy and operate in a
real-time environment.
Comment: arXiv admin note: substantial text overlap with arXiv:1706.0949
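The genetic weighting idea above can be sketched as follows. This is an illustrative numpy sketch, not the authors' implementation: the function name, the GA operators (elitist selection, averaging crossover, additive mutation), and all hyperparameters are assumptions.

```python
import numpy as np

def ga_ensemble_weights(probs, labels, pop_size=30, generations=40,
                        mutation_rate=0.1, rng=None):
    """Search for ensemble weights with a simple genetic algorithm.

    probs  -- array of shape (n_models, n_samples, n_classes) holding each
              model's predicted class probabilities on a validation set
    labels -- array of shape (n_samples,) with the true class indices
    Returns a weight vector over the models, normalized to sum to 1.
    """
    rng = np.random.default_rng(rng)
    n_models = probs.shape[0]

    def fitness(w):
        fused = np.tensordot(w, probs, axes=1)         # weighted average
        return np.mean(fused.argmax(axis=1) == labels)  # validation accuracy

    # Random non-negative weight vectors, normalized to sum to 1.
    pop = rng.random((pop_size, n_models))
    pop /= pop.sum(axis=1, keepdims=True)

    for _ in range(generations):
        scores = np.array([fitness(w) for w in pop])
        order = np.argsort(scores)[::-1]
        elite = pop[order[: pop_size // 2]]            # keep the better half

        # Crossover: average two random elite parents.
        parents = rng.integers(len(elite), size=(pop_size - len(elite), 2))
        children = (elite[parents[:, 0]] + elite[parents[:, 1]]) / 2
        # Mutation: small non-negative perturbation.
        children += mutation_rate * rng.random(children.shape)

        pop = np.vstack([elite, children])
        pop /= pop.sum(axis=1, keepdims=True)

    scores = np.array([fitness(w) for w in pop])
    return pop[scores.argmax()]
```

In practice the fitness would be evaluated on held-out data so the weights do not overfit the models' training predictions.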
Feature Pyramid and Hierarchical Boosting Network for Pavement Crack Detection
Pavement crack detection is a critical task for ensuring road safety. Manual
crack detection is extremely time-consuming. Therefore, an automatic road crack
detection method is required to accelerate this process. However, it remains a
challenging task due to the intensity inhomogeneity of cracks and complexity of
the background, e.g., the low contrast with surrounding pavements and possible
shadows with similar intensity. Inspired by recent advances of deep learning in
computer vision, we propose a novel network architecture, named Feature Pyramid
and Hierarchical Boosting Network (FPHBN), for pavement crack detection. The
proposed network integrates semantic information into low-level features for
crack detection in a feature pyramid fashion. It also balances the
contribution of easy and hard samples to the loss by nested sample reweighting
in a hierarchical way. To demonstrate the superiority and generality of the proposed
method, we evaluate the proposed method on five crack datasets and compare it
with state-of-the-art crack detection, edge detection, and semantic
segmentation methods. Extensive experiments show that the proposed method
outperforms these state-of-the-art methods in terms of accuracy and generality.
Driver Gaze Zone Estimation using Convolutional Neural Networks: A General Framework and Ablative Analysis
Driver gaze has been shown to be an excellent surrogate for driver attention
in intelligent vehicles. With the recent surge of highly autonomous vehicles,
driver gaze can be useful for determining the handoff time to a human driver.
While there has been significant improvement in personalized driver gaze zone
estimation systems, a generalized system which is invariant to different
subjects, perspectives and scales is still lacking. We take a step towards this
generalized system using Convolutional Neural Networks (CNNs). We fine-tune
four popular CNN architectures for this task and provide extensive comparisons of
their outputs. We additionally experiment with different input image patches,
and also examine how image size affects performance. For training and testing
the networks, we collect a large naturalistic driving dataset comprising 11
long drives, driven by 10 subjects in two different cars. Our best performing
model achieves an accuracy of 95.18% during cross-subject testing,
outperforming current state of the art techniques for this task. Finally, we
evaluate our best performing model on the publicly available Columbia Gaze
Dataset comprising images from 56 subjects with varying head pose and gaze
directions. Without any training, our model successfully encodes the different
gaze directions on this diverse dataset, demonstrating good generalization
capabilities.
A Methodological Review of Visual Road Recognition Procedures for Autonomous Driving Applications
The current research interest in autonomous driving is growing at a rapid
pace, attracting great investments from both the academic and corporate
sectors. In order for vehicles to be fully autonomous, it is imperative that
the driver assistance system is adept at road and lane keeping. In this paper,
we present a methodological review of techniques with a focus on visual road
detection and recognition. We adopt a pragmatic outlook in presenting this
review, whereby the procedures of road recognition are emphasised with respect
to its practical implementations. The contribution of this review hence covers
the topic in two parts -- the first part describes the methodological approach
to conventional road detection, which covers the algorithms and approaches
involved to classify and segregate roads from non-road regions; and the other
part focuses on recent state-of-the-art machine learning techniques that are
applied to visual road recognition, with an emphasis on methods that
incorporate convolutional neural networks and semantic segmentation. A
subsequent overview of recent implementations in the commercial sector is also
presented, along with some recent research works pertaining to road detection.
Comment: 14 pages, 6 Figures, 2 Tables. Permission to reprint granted from
original figure author
Multimodal Polynomial Fusion for Detecting Driver Distraction
Distracted driving is deadly, claiming 3,477 lives in the U.S. in 2015 alone.
Although there has been a considerable amount of research on modeling the
distracted behavior of drivers under various conditions, accurate automatic
detection using multiple modalities and especially the contribution of using
the speech modality to improve accuracy has received little attention. This
paper introduces a new multimodal dataset for distracted driving behavior and
discusses automatic distraction detection using features from three modalities:
facial expression, speech and car signals. Detailed multimodal feature analysis
shows that adding more modalities monotonically increases the predictive
accuracy of the model. Finally, a simple and effective multimodal fusion
technique using a polynomial fusion layer shows superior distraction detection
results compared to the baseline SVM and neural network models.Comment: INTERSPEECH 201
Self-Driving Cars: A Survey
We survey research on self-driving cars published in the literature focusing
on autonomous cars developed since the DARPA challenges, which are equipped
with an autonomy system that can be categorized as SAE level 3 or higher. The
architecture of the autonomy system of self-driving cars is typically organized
into the perception system and the decision-making system. The perception
system is generally divided into many subsystems responsible for tasks such as
self-driving-car localization, static obstacles mapping, moving obstacles
detection and tracking, road mapping, traffic signalization detection and
recognition, among others. The decision-making system is commonly partitioned
as well into many subsystems responsible for tasks such as route planning, path
planning, behavior selection, motion planning, and control. In this survey, we
present the typical architecture of the autonomy system of self-driving cars.
We also review research on relevant methods for perception and decision making.
Furthermore, we present a detailed description of the architecture of the
autonomy system of the self-driving car developed at the Universidade Federal
do Esp\'irito Santo (UFES), named Intelligent Autonomous Robotics Automobile
(IARA). Finally, we list prominent self-driving car research platforms
developed by academia and technology companies, and reported in the media.
Intentions of Vulnerable Road Users - Detection and Forecasting by Means of Machine Learning
Avoiding collisions with vulnerable road users (VRUs) using sensor-based
early recognition of critical situations is one of the manifold opportunities
provided by the current development in the field of intelligent vehicles. As
especially pedestrians and cyclists are very agile and have a variety of
movement options, modeling their behavior in traffic scenes is a challenging
task. In this article we propose movement models based on machine learning
methods, in particular artificial neural networks, in order to classify the
current motion state and to predict the future trajectory of VRUs. Both model
types are also combined to enable the application of specifically trained
motion predictors based on a continuously updated pseudo probabilistic state
classification. Furthermore, the architecture is used to evaluate
motion-specific physical models for starting and stopping and video-based
pedestrian motion classification. A comprehensive dataset consisting of 1068
pedestrian and 494 cyclist scenes acquired at an urban intersection is used for
optimization, training, and evaluation of the different models. The results
show substantially higher classification rates and the ability of the machine
learning approaches to recognize motion state changes earlier compared to
interacting multiple model (IMM) Kalman filtering. The trajectory prediction
quality is also improved for all kinds of test scenes, especially when starting
and stopping motions are included. Here, 37% and 41% lower position errors
were achieved on average, respectively.
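The combination of a pseudo-probabilistic state classifier with state-specific predictors can be sketched as a probability-weighted mixture of predictor outputs. This is an assumed formulation for illustration; the article's actual gating mechanism may differ, and the function name and shapes are hypothetical.

```python
import numpy as np

def gated_trajectory_prediction(state_probs, predictor_outputs):
    """Blend specialized trajectory predictors by the classifier's
    pseudo-probabilities over motion states.

    state_probs       -- (n_states,) probabilities over motion states,
                          e.g. waiting/starting/moving/stopping, summing to 1
    predictor_outputs -- (n_states, horizon, 2) predicted xy positions,
                          one trajectory per state-specific predictor
    Returns a (horizon, 2) blended trajectory.
    """
    state_probs = np.asarray(state_probs, dtype=float)
    preds = np.asarray(predictor_outputs, dtype=float)
    # Weighted average of the specialized predictions.
    return np.tensordot(state_probs, preds, axes=1)
```

As the classification is continuously updated, the mixture smoothly hands over from one specialized predictor to another, e.g. from the stopping model to the waiting model.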
Towards Full Automated Drive in Urban Environments: A Demonstration in GoMentum Station, California
Each year, millions of motor vehicle traffic accidents all over the world
cause a large number of fatalities, injuries and significant material loss.
Automated Driving (AD) has potential to drastically reduce such accidents. In
this work, we focus on the technical challenges that arise from AD in urban
environments. We present the overall architecture of an AD system and describe
in detail the perception and planning modules. The AD system, built on a
modified Acura RLX, was demonstrated in a course in GoMentum Station in
California. We demonstrated autonomous handling of 4 scenarios: traffic lights,
cross-traffic at intersections, construction zones and pedestrians. The AD
vehicle displayed safe behavior and performed consistently in repeated
demonstrations with slight variations in conditions. Overall, we completed 44
runs, encompassing 110 km of automated driving with only 3 cases where the
driver intervened in the control of the vehicle, mostly due to errors in GPS
positioning. Our demonstration showed that robust and consistent behavior in
urban scenarios is possible, yet more investigation is necessary for full scale
roll-out on public roads.
Comment: Accepted to Intelligent Vehicles Conference (IV 2017)
Learning to Detect Vehicles by Clustering Appearance Patterns
This paper studies efficient means for dealing with intra-category diversity
in object detection. Strategies for occlusion and orientation handling are
explored by learning an ensemble of detection models from visual and
geometrical clusters of object instances. An AdaBoost detection scheme is
employed with pixel lookup features for fast detection. The analysis provides
insight into the design of a robust vehicle detection system, showing promise
in terms of detection performance and orientation estimation accuracy.
Comment: Preprint version of our T-ITS 2015 paper
No Blind Spots: Full-Surround Multi-Object Tracking for Autonomous Vehicles using Cameras & LiDARs
Online multi-object tracking (MOT) is extremely important for high-level
spatial reasoning and path planning for autonomous and highly-automated
vehicles. In this paper, we present a modular framework for tracking multiple
objects (vehicles), capable of accepting object proposals from different sensor
modalities (vision and range) and a variable number of sensors, to produce
continuous object tracks. This work is a generalization of the MDP framework
for MOT, with some key extensions. First, we track objects across multiple
cameras and across different sensor modalities. This is done by fusing object
proposals across sensors accurately and efficiently. Second, the objects of
interest (targets) are tracked directly in the real world. This is a departure
from traditional techniques where objects are simply tracked in the image
plane. Doing so allows the tracks to be readily used by an autonomous agent for
navigation and related tasks.
To verify the effectiveness of our approach, we test it on real world highway
data collected from a heavily sensorized testbed capable of capturing
full-surround information. We demonstrate that our framework is well-suited to
track objects through entire maneuvers around the ego-vehicle, some of which
take more than a few minutes to complete. We also leverage the modularity of
our approach by comparing the effects of including/excluding different sensors,
changing the total number of sensors, and the quality of object proposals on
the final tracking result.
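Fusing object proposals across sensors in the world frame can be sketched with a simple greedy, distance-gated merge. This is a hypothetical illustration; the framework's actual association method is not described in the abstract, and the gate value is an assumption.

```python
import numpy as np

def fuse_proposals(proposals, gate=1.5):
    """Greedily merge object proposals from multiple sensors.

    proposals -- list of (x, y) world-frame centroids from all sensors
    gate      -- merge radius in meters (illustrative value)
    Returns a list of fused centroids (mean of each merged group).
    """
    pts = [np.asarray(p, dtype=float) for p in proposals]
    fused = []
    used = [False] * len(pts)
    for i, p in enumerate(pts):
        if used[i]:
            continue
        group = [p]
        used[i] = True
        for j in range(i + 1, len(pts)):
            if not used[j] and np.linalg.norm(pts[j] - p) < gate:
                group.append(pts[j])
                used[j] = True
        fused.append(np.mean(group, axis=0))
    return fused
```

Because the merge happens in world coordinates rather than the image plane, the fused centroids can feed directly into the tracker regardless of which camera or range sensor produced each proposal.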