SUPER: A Novel Lane Detection System
AI-based lane detection algorithms have been actively studied over the last few
years, and many have demonstrated superior performance compared with traditional
feature-based methods. Their accuracy, however, is still generally only in the
low-80% to high-90% range, and drops further on challenging images. In this paper, we
propose a real-time lane detection system, called Scene Understanding
Physics-Enhanced Real-time (SUPER) algorithm. The proposed method consists of
two main modules: 1) a hierarchical semantic segmentation network as the scene
feature extractor and 2) a physics enhanced multi-lane parameter optimization
module for lane inference. We train the proposed system using heterogeneous
data from Cityscapes, Vistas and Apollo, and evaluate the performance on four
completely separate datasets that were never seen during training: Tusimple,
Caltech, URBAN KITTI-ROAD, and X-3000. The proposed approach performs as well as
or better than lane-detection models trained on the target dataset, and
generalizes well to datasets it was never trained on. Real-world vehicle
tests were also conducted. Preliminary results show promising real-time
lane-detection performance compared with the Mobileye.
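The two-module pipeline above lends itself to a compact illustration. The sketch below is my own construction, not the authors' code: it treats the segmentation network's per-pixel lane probabilities as weights and fits a polynomial lane model by weighted least squares; the full SUPER optimizer additionally enforces physics-based constraints across lanes, which are omitted here.

```python
# Minimal sketch of the lane-inference idea (not the authors' code):
# fit a quadratic x = a*y^2 + b*y + c per lane by weighted least
# squares, using the segmentation probabilities as weights.
import numpy as np

def fit_lane(prob_map, threshold=0.5):
    """Fit one lane model to a single-lane probability map (H x W)."""
    ys, xs = np.nonzero(prob_map > threshold)   # candidate lane pixels
    w = prob_map[ys, xs]                        # confidence weights
    # Weighted polynomial fit: x as a function of image row y.
    return np.polyfit(ys, xs, deg=2, w=w)       # (a, b, c)

# Usage on a toy probability map containing one synthetic curved lane:
H, W = 100, 200
prob = np.zeros((H, W))
for y in range(H):
    prob[y, int(0.002 * y**2 + 0.5 * y + 40)] = 1.0
print(fit_lane(prob))   # recovers roughly (0.002, 0.5, 40)
```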
Monocular Vehicle Self-localization method based on Compact Semantic Map
High-precision localization is a crucial requirement for autonomous
driving systems. Traditional positioning methods struggle to provide stable and
accurate vehicle poses, especially in urban environments. Herein, we propose a
novel self-localization method using a
monocular camera and a 3D compact semantic map. Pre-collected information of
the road landmarks is stored in a self-defined map with a minimal amount of
data. We recognize landmarks using a deep neural network, followed by a
geometric feature extraction step that improves measurement accuracy.
The vehicle's location and pose are estimated by minimizing a self-defined
re-projection residual that measures the map-to-image registration,
together with a robust association method. We validate the effectiveness of our
approach by applying this method to localize a vehicle in an open dataset,
achieving an RMS accuracy of 0.345 m with a reduced sensor setup and map
storage compared to state-of-the-art approaches. We also evaluate the key
steps and discuss the contribution of each subsystem.
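The pose-estimation step can be sketched concretely. The following toy example minimizes a re-projection residual between 3D map landmarks and their 2D image detections with scipy; the intrinsics and variable names are illustrative, and the paper's robust association step is omitted.

```python
# Hedged sketch of pose estimation by minimizing a re-projection
# residual (illustrative setup, not the paper's implementation).
import numpy as np
from scipy.optimize import least_squares

K = np.array([[700.0, 0.0, 320.0],     # assumed pinhole intrinsics
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])

def project(points_world, pose):
    """Project 3D map landmarks into the image for pose (x, z, yaw)."""
    x, z, yaw = pose
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, 0, -s], [0, 1, 0], [s, 0, c]])  # yaw about Y axis
    t = np.array([x, 0.0, z])
    cam = (points_world - t) @ R.T                    # world -> camera
    uv = (K @ cam.T).T
    return uv[:, :2] / uv[:, 2:3]

def residual(pose, landmarks3d, detections2d):
    return (project(landmarks3d, pose) - detections2d).ravel()

# Toy example: landmarks observed from a camera at (1.0, 0.0, yaw=0.05).
true_pose = np.array([1.0, 0.0, 0.05])
pts = np.array([[0., 0., 10.], [2., 1., 12.], [-1., .5, 8.], [3., -.5, 15.]])
obs = project(pts, true_pose)
est = least_squares(residual, x0=np.zeros(3), args=(pts, obs))
print(est.x)   # converges to approximately true_pose
```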
The ApolloScape Open Dataset for Autonomous Driving and its Application
Autonomous driving has attracted tremendous attention especially in the past
few years. The key techniques for a self-driving car include solving tasks like
3D map construction, self-localization, parsing the driving road and
understanding objects, which enable vehicles to reason and act. However,
large-scale datasets for training and system evaluation are still a bottleneck
for developing robust perception models. In this paper, we present the ApolloScape
dataset [1] and its applications for autonomous driving. Compared with existing
public datasets from real scenes, e.g. KITTI [2] or Cityscapes [3], ApolloScape
contains much larger and richer labelling, including holistic semantically dense
point clouds for each site, stereo imagery, per-pixel semantic labelling,
lane-mark labelling, instance segmentation, 3D car instances, and highly accurate
locations for every frame in various driving videos from multiple sites, cities
and times of day. For each task, it contains at least 15x more images than SOTA
datasets. To label such a complete dataset, we develop various tools and
algorithms tailored to each task to accelerate the labelling process, such as
3D-2D segment labelling tools and active labelling in videos. Building on
ApolloScape, we are able to develop algorithms that jointly consider the learning
and inference of multiple tasks. In this paper, we provide a sensor fusion
scheme integrating camera videos, consumer-grade motion sensors (GPS/IMU), and
a 3D semantic map in order to achieve robust self-localization and semantic
segmentation for autonomous driving. We show that practically, sensor fusion
and joint learning of multiple tasks are beneficial to achieve a more robust
and accurate system. We expect that our dataset and the proposed algorithms will
support and motivate researchers in the further development of multi-sensor
fusion and multi-task learning in the field of computer vision.
Comment: Version 4: Accepted by TPAMI. Version 3: 17 pages, 10 tables, 11
figures, added the application (DeLS-3D) based on the ApolloScape Dataset.
Version 2: 7 pages, 6 figures, added comparison with the BDD100K dataset.
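As a hedged illustration of the fusion idea, the snippet below is my own simplification, not the DeLS-3D pipeline: it shows the basic principle with a scalar Kalman-style update, in which a noisy GPS/IMU prior is corrected by a more precise camera-to-map pose measurement.

```python
# Minimal sketch (my simplification) of fusing a consumer-grade
# GPS/IMU pose prior with a camera-based pose estimate via a
# scalar Kalman-style update, applied per coordinate.
import numpy as np

def fuse(prior, prior_var, measurement, meas_var):
    """Combine two noisy estimates of the same pose coordinate."""
    gain = prior_var / (prior_var + meas_var)
    fused = prior + gain * (measurement - prior)
    fused_var = (1.0 - gain) * prior_var
    return fused, fused_var

# GPS says x = 12.0 m with ~2 m std; camera-to-map registration says
# x = 10.4 m with ~0.3 m std. The fused estimate trusts the camera more.
x, var = fuse(12.0, 2.0**2, 10.4, 0.3**2)
print(x, np.sqrt(var))   # ~10.44 m, ~0.30 m std
```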
A Methodological Review of Visual Road Recognition Procedures for Autonomous Driving Applications
The current research interest in autonomous driving is growing at a rapid
pace, attracting great investments from both the academic and corporate
sectors. In order for vehicles to be fully autonomous, it is imperative that
the driver assistance system is adept at road and lane keeping. In this paper,
we present a methodological review of techniques with a focus on visual road
detection and recognition. We adopt a pragmatic outlook in presenting this
review, whereby the procedures of road recognition are emphasised with respect
to their practical implementations. The contribution of this review hence covers
the topic in two parts -- the first part describes the methodological approach
to conventional road detection, which covers the algorithms and approaches
involved in classifying and segregating roads from non-road regions; and the other
part focuses on recent state-of-the-art machine learning techniques that are
applied to visual road recognition, with an emphasis on methods that
incorporate convolutional neural networks and semantic segmentation. A
subsequent overview of recent implementations in the commercial sector is also
presented, along with some recent research works pertaining to road detection.
Comment: 14 pages, 6 figures, 2 tables. Permission to reprint granted from the
original figure author.
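To make the conventional road-detection part concrete, here is a deliberately minimal toy of my own construction, not taken from the review: it classifies road versus non-road pixels by sampling a seed patch assumed to be road and thresholding on colour distance to it.

```python
# Toy illustration of the classical "road vs. non-road" idea:
# assume the bottom-centre patch of the frame is road, then mark
# pixels whose colour is close to that patch's mean colour.
import numpy as np

def segment_road(image, tol=30.0):
    """image: H x W x 3 uint8 array. Returns a boolean road mask."""
    H, W, _ = image.shape
    seed = image[int(0.9 * H):, int(0.4 * W):int(0.6 * W)]
    mean = seed.reshape(-1, 3).mean(axis=0)
    dist = np.linalg.norm(image.astype(float) - mean, axis=2)
    return dist < tol

# Usage on a synthetic frame: grey road below a blue sky.
img = np.zeros((100, 160, 3), dtype=np.uint8)
img[:50] = (60, 120, 220)    # sky
img[50:] = (110, 110, 110)   # road
mask = segment_road(img)
print(mask[80, 80], mask[10, 80])   # True False
```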
Efficient Road Lane Marking Detection with Deep Learning
Lane mark detection is an important element of road scene analysis for
Advanced Driver Assistance Systems (ADAS). Limited by onboard computing
power, it is still a challenge to reduce system complexity and maintain high
accuracy at the same time. In this paper, we propose a Lane Marking Detector
(LMD) using a deep convolutional neural network to extract robust lane marking
features. To improve its performance with a target of lower complexity, the
dilated convolution is adopted. A shallower and thinner structure is designed
to decrease the computational cost. Moreover, we also design post-processing
algorithms that construct 3rd-order polynomial models to fit the curved
lanes. Our system shows promising results on captured road scenes.
Comment: Accepted at International Conference on Digital Signal Processing
(DSP) 201
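The polynomial post-processing step is simple enough to sketch. The snippet below is illustrative rather than the authors' implementation: it fits a 3rd-order polynomial x = f(y) to the lane-marking pixels the detector outputs, so each curved lane is summarized by four coefficients.

```python
# Hedged sketch of the post-processing step: fit a 3rd-order
# polynomial to the lane-marking pixels detected by the CNN.
import numpy as np

def fit_curved_lane(lane_mask):
    """lane_mask: H x W boolean mask for one lane marking."""
    ys, xs = np.nonzero(lane_mask)
    return np.polyfit(ys, xs, deg=3)   # [a, b, c, d]

def sample_lane(coeffs, ys):
    """Evaluate the fitted lane at the given image rows."""
    return np.polyval(coeffs, ys)

# Toy mask following x = 1e-5*y^3 + 0.3*y + 50:
H, W = 200, 400
mask = np.zeros((H, W), dtype=bool)
for y in range(H):
    mask[y, int(1e-5 * y**3 + 0.3 * y + 50)] = True
coeffs = fit_curved_lane(mask)
print(np.round(sample_lane(coeffs, np.array([0, 100, 199]))))
```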
Real-time Dynamic Object Detection for Autonomous Driving using Prior 3D-Maps
Lidar has become an essential sensor for autonomous driving as it provides
reliable depth estimation. Lidar is also the primary sensor used in building 3D
maps which can be used even in the case of low-cost systems which do not use
Lidar. Computation on Lidar point clouds is intensive as it requires processing
of millions of points per second. Additionally, there are many subsequent tasks
such as clustering, detection, tracking and classification, which make
real-time execution challenging. In this paper, we discuss real-time dynamic
object detection algorithms that leverage previously mapped Lidar point
clouds to reduce processing. The prior 3D maps provide a static background
model and we formulate dynamic object detection as a background subtraction
problem. Computation and modeling challenges in the mapping and online
execution pipeline are described. We propose a rejection cascade architecture
to subtract road regions and other 3D regions separately. We implemented an
initial version of our proposed algorithm and evaluated its accuracy on the
CARLA simulator.
Comment: Preprint submission to ECCVW AutoNUE 2018 - v2: author name accent
correction
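The background-subtraction formulation can be sketched as a voxel lookup. In the toy below, the data layout and voxel size are my assumptions: the prior map is voxelised once offline, and online scan points falling outside the static voxels are flagged as dynamic candidates.

```python
# Minimal sketch of dynamic-object detection as background
# subtraction against a voxelised prior map (assumed data layout).
import numpy as np

VOXEL = 0.2  # voxel edge length in metres

def voxel_keys(points):
    """Map N x 3 points to a set of integer voxel coordinates."""
    return set(map(tuple, np.floor(points / VOXEL).astype(int)))

def dynamic_points(scan, static_voxels):
    """Return the scan points not explained by the static map."""
    keys = np.floor(scan / VOXEL).astype(int)
    mask = np.array([tuple(k) not in static_voxels for k in keys])
    return scan[mask]

# Prior map: a wall of static points. Online scan: the wall plus a
# point where no static geometry exists (e.g. a moving car).
wall = np.array([[x * 0.1, 5.0, 0.0] for x in range(50)])
static = voxel_keys(wall)
scan = np.vstack([wall[::5], [[2.0, 2.0, 0.0]]])
print(dynamic_points(scan, static))   # -> [[2. 2. 0.]]
```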
Can we unify monocular detectors for autonomous driving by using the pixel-wise semantic segmentation of CNNs?
Autonomous driving is a challenging topic that requires complex solutions in
perception tasks such as recognition of road, lanes, traffic signs or lights,
vehicles and pedestrians. Through years of research, computer vision has grown
capable of tackling these tasks with monocular detectors that can provide
remarkable detection rates with relatively low processing times. However, the
recent appearance of Convolutional Neural Networks (CNNs) has revolutionized
the computer vision field and has made it possible to perform full pixel-wise
semantic segmentation at near real-time speeds (even on hardware
that can be carried on a vehicle). In this paper, we propose to use full image
segmentation as an approach to simplify and unify most of the detection tasks
required in the perception module of an autonomous vehicle, analyzing major
concerns such as computation time and detection performance.
Comment: Extended abstract presented at IV16-WS Deepdriving
(http://iv2016.berkeleyvision.org/)
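The unification idea can be illustrated in a few lines: once a CNN yields a per-pixel class map, per-class detections fall out of connected-component analysis rather than separate specialized detectors. The class ids and label map below are illustrative.

```python
# Sketch of deriving "detections" from a semantic segmentation map
# via connected components (illustrative class ids).
import numpy as np
from scipy import ndimage

def boxes_from_segmentation(class_map, class_id):
    """Return bounding boxes (y0, x0, y1, x1) for one semantic class."""
    components, n = ndimage.label(class_map == class_id)
    boxes = []
    for i in range(1, n + 1):
        ys, xs = np.nonzero(components == i)
        boxes.append((ys.min(), xs.min(), ys.max(), xs.max()))
    return boxes

# Toy class map with two "vehicle" blobs (class id 2):
seg = np.zeros((60, 80), dtype=int)
seg[10:20, 5:15] = 2
seg[30:45, 50:70] = 2
print(boxes_from_segmentation(seg, class_id=2))
# -> [(10, 5, 19, 14), (30, 50, 44, 69)]
```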
Driving Policy Transfer via Modularity and Abstraction
End-to-end approaches to autonomous driving have high sample complexity and
are difficult to scale to realistic urban driving. Simulation can help
end-to-end driving systems by providing a cheap, safe, and diverse training
environment. Yet training driving policies in simulation brings up the problem
of transferring such policies to the real world. We present an approach to
transferring driving policies from simulation to reality via modularity and
abstraction. Our approach is inspired by classic driving systems and aims to
combine the benefits of modular architectures and end-to-end deep learning
approaches. The key idea is to encapsulate the driving policy such that it is
not directly exposed to raw perceptual input or low-level vehicle dynamics. We
evaluate the presented approach in simulated urban environments and in the real
world. In particular, we transfer a driving policy trained in simulation to a
1/5-scale robotic truck that is deployed in a variety of conditions, with no
finetuning, on two continents. The supplementary video can be viewed at
https://youtu.be/BrMDJqI6H5U
Comment: Accepted at the Conference on Robot Learning (CoRL'18),
http://proceedings.mlr.press/v87/mueller18a.htm
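A schematic sketch of the proposed modularity, with interfaces that are mine rather than the paper's code: the learned policy sees only a semantic representation and emits an abstract command, while a separate platform-specific controller handles the vehicle dynamics, which is what makes the policy transferable.

```python
# Schematic sketch of the modular policy interface (names are mine).
from dataclasses import dataclass

@dataclass
class Command:
    heading: float   # desired heading change, radians
    speed: float     # desired speed, m/s

def perception(raw_image):
    """Stand-in for a segmentation network: pixels -> semantic map."""
    return raw_image  # placeholder; a real system runs a CNN here

def driving_policy(semantic_map) -> Command:
    """Learned module: semantic map -> abstract command."""
    return Command(heading=0.02, speed=5.0)  # placeholder output

def low_level_controller(cmd: Command, vehicle_state):
    """Classic controller: command + dynamics -> steering/throttle."""
    steering = 2.5 * cmd.heading              # proportional steering
    throttle = 0.1 * (cmd.speed - vehicle_state["speed"])
    return steering, throttle

# The same policy transfers across vehicles because only the
# controller below it knows about the platform's dynamics.
print(low_level_controller(driving_policy(perception(None)),
                           {"speed": 4.0}))
```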
A Dataset for Lane Instance Segmentation in Urban Environments
Autonomous vehicles require knowledge of the surrounding road layout, which
can be predicted by state-of-the-art CNNs. This work addresses the current lack
of data for determining lane instances, which are needed for various driving
manoeuvres. The main issue is the time-consuming manual labelling process,
typically applied per image. We notice that driving the car is itself a form of
annotation. Therefore, we propose a semi-automated method that allows for
efficient labelling of image sequences by utilising an estimated road plane in
3D based on where the car has driven and projecting labels from this plane into
all images of the sequence. The average labelling time per image is reduced to
5 seconds and only an inexpensive dash-cam is required for data capture. We are
releasing a dataset of 24,000 images and additionally show experimental
semantic segmentation and instance segmentation results.
Comment: ECCV camera ready
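The core of the semi-automated labelling can be sketched as a projection. In the toy below, the intrinsics and pose are illustrative placeholders: a lane boundary annotated once as 3D points on the estimated road plane is projected into a frame using that frame's camera pose.

```python
# Hedged sketch of label propagation: project road-plane annotations
# into each frame of a sequence (illustrative intrinsics and pose).
import numpy as np

K = np.array([[800.0, 0.0, 640.0],
              [0.0, 800.0, 360.0],
              [0.0, 0.0, 1.0]])

def project_labels(points_plane, R, t):
    """Project N x 3 road-plane points into one frame (R, t: world->cam)."""
    cam = points_plane @ R.T + t
    cam = cam[cam[:, 2] > 0]            # keep points in front of camera
    uv = (K @ cam.T).T
    return uv[:, :2] / uv[:, 2:3]

# Lane boundary on the road plane (1.5 m below the camera), projected
# into a frame where the car has moved 2 m forward along the track.
lane = np.array([[1.8, 1.5, z] for z in np.arange(4.0, 30.0, 2.0)])
R = np.eye(3)
t = np.array([0.0, 0.0, -2.0])
print(project_labels(lane, R, t)[:3])
```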
Affordance Learning In Direct Perception for Autonomous Driving
Recent development in autonomous driving involves high-level computer vision
and detailed road scene understanding. Today, most autonomous vehicles are
using a mediated-perception approach for path planning and control, which relies
heavily on high-definition 3D maps and real-time sensors. Recent research efforts
aim to substitute these massive HD maps with coarse road attributes. In this
paper, we follow the direct perception based method to train a deep neural
network for affordance learning in autonomous driving. Our goal in this work is
to develop an affordance-learning model based on freely available Google
Street View panoramas and OpenStreetMap road vector attributes. Driving scene
understanding can be achieved by learning affordances from the images captured
by car-mounted cameras. Such scene understanding by learning affordances may be
useful for corroborating base maps such as HD maps so that the required data
storage space is minimized and available for processing in real time. We
compare the road-attribute identification capability of human volunteers
with that of our model through experimental evaluation. Our results indicate
that this method could serve as a cheaper way to collect training data for
autonomous driving. The cross-validation results also indicate the
effectiveness of our model.
Comment: 9 pages, 13 figures
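As a hedged illustration of a direct-perception affordance head, where the architecture and attribute set are my assumptions rather than the paper's model: a small network maps backbone features from a panorama to a handful of road attributes, with OpenStreetMap attributes supplying the training targets.

```python
# Illustrative affordance head (architecture and attributes are mine):
# image features -> a few road attributes instead of a full HD map.
import torch
import torch.nn as nn

class AffordanceHead(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU())
        self.lane_count = nn.Linear(128, 5)   # 1..5 lanes, classification
        self.one_way = nn.Linear(128, 1)      # binary attribute, logit
        self.speed_class = nn.Linear(128, 4)  # coarse speed-limit class

    def forward(self, feats):
        h = self.trunk(feats)
        return self.lane_count(h), self.one_way(h), self.speed_class(h)

# feats would come from a CNN backbone run on a Street View panorama;
# OpenStreetMap attributes supply the training targets.
head = AffordanceHead()
lanes, one_way, speed = head(torch.randn(2, 512))
print(lanes.shape, one_way.shape, speed.shape)
```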