4,043 research outputs found
Deep Semantic Segmentation for Automated Driving: Taxonomy, Roadmap and Challenges
Semantic segmentation was seen as a challenging computer vision problem a few
years ago. Due to recent advances in deep learning, relatively accurate
solutions are now possible for its use in automated driving. In this paper, the
semantic segmentation problem is explored from the perspective of automated
driving. Most current semantic segmentation algorithms are designed for
generic images and do not incorporate prior structure or the end goal of
automated driving. First, the paper begins with a generic taxonomic survey of
semantic segmentation algorithms and then discusses how they fit in the context
of automated driving. Second, the particular challenges of deploying semantic
segmentation in a safety-critical system that demands a high level of accuracy
and robustness are listed. Third, alternatives to an independent semantic
segmentation module are explored. Finally, an empirical evaluation of various
semantic segmentation architectures is performed on the CamVid dataset in terms
of accuracy and speed. This paper is a preliminary, shorter version of a more
detailed survey which is work in progress.
Comment: To appear in IEEE ITSC 201
Deep Learning for LiDAR Point Clouds in Autonomous Driving: A Review
Recently, the advancement of deep learning in discriminative feature learning
from 3D LiDAR data has led to rapid development in the field of autonomous
driving. However, automatically processing uneven, unstructured, noisy, and
massive 3D point clouds remains a challenging and tedious task. In this paper,
we provide a systematic review of compelling existing deep learning
architectures applied to LiDAR point clouds, detailing their use in specific
autonomous driving tasks such as segmentation, detection, and classification.
Although several published research papers focus on specific topics in computer
vision for autonomous vehicles, to date, no general survey on deep learning
applied to LiDAR point clouds for autonomous vehicles exists. Thus, the goal of
this paper is to narrow that gap. More than 140 key contributions from the past
five years are summarized in this survey, including milestone 3D deep
architectures; notable deep learning applications in 3D semantic segmentation,
object detection, and classification; specific datasets; evaluation metrics;
and state-of-the-art performance. Finally, we discuss the remaining challenges
and directions for future research.
Comment: 21 pages, submitted to IEEE Transactions on Neural Networks and Learning Systems
DSNet for Real-Time Driving Scene Semantic Segmentation
We focus on the very challenging task of semantic segmentation for autonomous
driving systems, which must deliver decent semantic segmentation results for
traffic-critical objects in real time. In this paper, we propose a very
efficient yet powerful deep neural network for driving scene semantic
segmentation, termed the Driving Segmentation Network (DSNet). DSNet achieves a
state-of-the-art balance between accuracy and inference speed through efficient
units and an architecture design inspired by ShuffleNet V2 and ENet. More
importantly, DSNet emphasizes the classes most critical to driving decision
making through our novel Driving Importance-weighted Loss. We evaluate DSNet on
the Cityscapes dataset: it achieves 71.8% mean Intersection-over-Union (IoU) on
the validation set and 69.3% on the test set. Class-wise IoU scores show that
the Driving Importance-weighted Loss improves most driving-critical classes by
a large margin. Compared with ENet, DSNet is 18.9% more accurate and over 1.1
times faster, which implies great potential for autonomous driving
applications.
Comment: We have discovered some reported numbers unreproducible, and decided to redesign the methods, and rewrite most of the paper
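The abstract above does not give the exact form of the Driving Importance-weighted Loss; a minimal sketch of one plausible realization is a per-class weighted cross-entropy, where hypothetical per-class weights up-weight driving-critical classes such as pedestrians and riders (the weight values and class set here are illustrative assumptions, not the paper's):

```python
import numpy as np

def weighted_cross_entropy(logits, labels, class_weights):
    """Per-pixel cross-entropy where each class contributes according to an
    importance weight (a hypothetical stand-in for the paper's
    Driving Importance-weighted Loss)."""
    # logits: (N, C) raw scores, labels: (N,) int class ids, class_weights: (C,)
    shifted = logits - logits.max(axis=1, keepdims=True)       # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    picked = log_probs[np.arange(len(labels)), labels]         # log p(true class)
    weights = class_weights[labels]                            # weight of each pixel
    return -(weights * picked).sum() / weights.sum()           # weighted mean NLL

# Hypothetical weights: pedestrian/rider weighted above background.
weights = np.array([0.5, 2.0, 2.0])          # [background, pedestrian, rider]
logits = np.array([[2.0, 0.1, 0.1],
                   [0.2, 1.5, 0.3]])
labels = np.array([0, 1])
loss = weighted_cross_entropy(logits, labels, weights)
```

With such a loss, misclassifying a pedestrian pixel costs four times as much as misclassifying a background pixel, which is one way a network could be pushed to improve driving-critical classes.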
A Review on Deep Learning Techniques Applied to Semantic Segmentation
Image semantic segmentation is of increasing interest to computer vision and
machine learning researchers. Many applications on the rise need accurate and
efficient segmentation mechanisms: autonomous driving, indoor navigation, and
even virtual or augmented reality systems, to name a few. This demand coincides
with the rise of deep learning approaches in almost every field or application
related to computer vision, including semantic segmentation and scene
understanding. This paper provides a review of deep learning methods for
semantic segmentation applied to various application areas. Firstly, we
describe the terminology of this field as well as mandatory background
concepts. Next, the main datasets and challenges are presented to help
researchers decide which ones best suit their needs and targets. Then, existing
methods are reviewed, highlighting their contributions and their significance
in the field. Finally, quantitative results are given for the described methods
and the datasets on which they were evaluated, followed by a discussion of the
results. Lastly, we point out a set of promising future directions and draw our
own conclusions about the state of the art of semantic segmentation using deep
learning techniques.
Comment: Submitted to TPAMI on Apr. 22, 201
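Nearly every entry in this listing reports mean Intersection-over-Union (mean IoU), the standard quantitative metric such reviews tabulate. A minimal sketch of how it is computed over a pair of label maps (the toy arrays are illustrative, not from any dataset):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection-over-Union: per-class overlap / union,
    averaged over the classes present in either map."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:              # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

pred   = np.array([[0, 0, 1],
                   [1, 2, 2]])
target = np.array([[0, 1, 1],
                   [1, 2, 2]])
miou = mean_iou(pred, target, num_classes=3)  # (1/2 + 2/3 + 1) / 3
```

Averaging per class rather than per pixel is what makes the metric sensitive to rare classes, which is why class-wise IoU breakdowns (as in the DSNet entry above) are also commonly reported.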
DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation
This paper introduces an extremely efficient CNN architecture named DFANet
for semantic segmentation under resource constraints. Our proposed network
starts from a single lightweight backbone and aggregates discriminative
features through sub-network and sub-stage cascades, respectively. Based on
multi-scale feature propagation, DFANet substantially reduces the number of
parameters while still obtaining a sufficient receptive field and enhancing the
model's learning ability, striking a balance between speed and segmentation
performance. Experiments on the Cityscapes and CamVid datasets demonstrate the
superior performance of DFANet, with 8x fewer FLOPs and 2x faster inference
than existing state-of-the-art real-time semantic segmentation methods while
providing comparable accuracy. Specifically, it achieves 70.3% mean IoU on the
Cityscapes test dataset with only 1.7 GFLOPs and a speed of 160 FPS on one
NVIDIA Titan X card, and 71.3% mean IoU with 3.4 GFLOPs while inferring on a
higher-resolution image.
Light-Weight RefineNet for Real-Time Semantic Segmentation
We consider the important task of effective and efficient semantic image
segmentation. In particular, we adapt a powerful semantic segmentation
architecture, called RefineNet, into a more compact one, suitable even for
tasks requiring real-time performance on high-resolution inputs. To this end,
we identify computationally expensive blocks in the original setup and propose
two modifications aimed at decreasing the number of parameters and floating
point operations. By doing so, we achieve a more than twofold model reduction
while keeping performance levels almost intact. Our fastest model speeds up
from 20 FPS to 55 FPS on a generic GPU card with 512x512 inputs, with a solid
81.1% mean IoU on the test set of PASCAL VOC, while our slowest model at 32 FPS
(from the original 17 FPS) shows 82.7% mean IoU on the same dataset.
Alternatively, we showcase that our approach is easily combined with
light-weight classification networks: we attain 79.2% mean IoU on PASCAL VOC
using a model that contains only 3.3M parameters and performs only 9.3B
floating point operations.
Comment: Models are available here: https://github.com/drsleep/light-weight-refinenet, BMVC 201
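The abstract above does not state which blocks were slimmed down; as a back-of-the-envelope sketch of why shrinking convolutional kernels cuts parameters so sharply, consider replacing a 3x3 convolution with a 1x1 one at a hypothetical channel width (the layer choice and width here are illustrative assumptions, not the paper's actual modifications):

```python
def conv_params(c_in, c_out, k):
    """Weight count of a k x k convolution, ignoring the bias term."""
    return c_in * c_out * k * k

c = 256                        # hypothetical channel width
heavy = conv_params(c, c, 3)   # 3x3 convolution
light = conv_params(c, c, 1)   # 1x1 replacement
reduction = heavy / light      # 9x fewer weights for this single layer
```

A 9x per-layer saving on even a handful of such layers is consistent with the more-than-twofold whole-model reduction the abstract reports, since the rest of the network is left untouched.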
Efficient Road Lane Marking Detection with Deep Learning
Lane marking detection is an important element of road scene analysis for
Advanced Driver Assistance Systems (ADAS). Limited by onboard computing power,
it is still a challenge to reduce system complexity and maintain high accuracy
at the same time. In this paper, we propose a Lane Marking Detector (LMD) using
a deep convolutional neural network to extract robust lane marking features. To
improve its performance with a target of lower complexity, dilated convolution
is adopted. A shallower and thinner structure is designed to decrease the
computational cost. Moreover, we also design post-processing algorithms that
fit 3rd-order polynomial models to curved lanes. Our system shows promising
results on captured road scenes.
Comment: Accepted at International Conference on Digital Signal Processing (DSP) 201
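The post-processing step above fits a 3rd-order polynomial to detected lane-marking points; a minimal sketch of that fitting on synthetic noisy detections (the lane shape and noise level are illustrative assumptions, and the paper's actual pipeline is not specified in the abstract):

```python
import numpy as np

# Synthetic lane: x position as a cubic function of distance y along the road.
y = np.linspace(0.0, 10.0, 50)                         # longitudinal samples
x_true = 0.02 * y**3 - 0.1 * y**2 + 0.5 * y + 1.0      # hypothetical lane shape
rng = np.random.default_rng(0)
x_obs = x_true + rng.normal(0.0, 0.05, size=y.shape)   # noisy detections

coeffs = np.polyfit(y, x_obs, deg=3)   # [a3, a2, a1, a0], highest power first
lane = np.poly1d(coeffs)               # callable lane model: x = f(y)
residual = np.abs(lane(y) - x_true).max()
```

A 3rd-order polynomial is a common compromise for lanes: it can represent both curvature and its rate of change, while a least-squares fit over many detected pixels smooths out per-pixel detection noise.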
RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds
We study the problem of efficient semantic segmentation for large-scale 3D
point clouds. Because they rely on expensive sampling techniques or
computationally heavy pre-/post-processing steps, most existing approaches can
only be trained on and operate over small-scale point clouds. In this paper, we
introduce RandLA-Net, an efficient and lightweight neural architecture that
directly infers per-point semantics for large-scale point clouds. The key to
our approach is to use random point sampling instead of more complex point
selection approaches. Although remarkably computation- and memory-efficient,
random sampling can discard key features by chance. To overcome this, we
introduce a novel local feature aggregation module that progressively increases
the receptive field for each 3D point, thereby effectively preserving geometric
details. Extensive experiments show that our RandLA-Net can process 1 million
points in a single pass up to 200x faster than existing approaches. Moreover,
RandLA-Net clearly surpasses state-of-the-art approaches for semantic
segmentation on two large-scale benchmarks, Semantic3D and SemanticKITTI.
Comment: CVPR 2020 Oral. Code and data are available at: https://github.com/QingyongHu/RandLA-Ne
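The key design choice described above, random point sampling, can be sketched in a few lines; the stage structure and decimation ratio below are illustrative assumptions, not RandLA-Net's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(42)
points = rng.random((100_000, 3))      # synthetic XYZ point cloud

def random_sample(pts, ratio, rng):
    """Keep a random subset of points. This is O(N), whereas alternatives
    like farthest-point sampling scale roughly quadratically and dominate
    runtime on million-point clouds."""
    keep = rng.choice(len(pts), size=int(len(pts) * ratio), replace=False)
    return pts[keep]

# Hypothetical encoder: decimate the cloud by 4x at each stage.
stage1 = random_sample(points, 0.25, rng)
stage2 = random_sample(stage1, 0.25, rng)
```

The trade-off the abstract names is visible here: the sampling itself carries no geometric awareness, so whatever detail survives must be recovered by the surrounding feature aggregation, which is the role of the paper's local feature aggregation module.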
ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation
The ability to perform pixel-wise semantic segmentation in real time is of
paramount importance in mobile applications. Recent deep neural networks aimed
at this task have the disadvantage of requiring a large number of floating
point operations and have long run-times that hinder their usability. In this
paper, we propose a novel deep neural network architecture named ENet
(efficient neural network), created specifically for tasks requiring
low-latency operation. ENet is up to 18x faster, requires 75x fewer FLOPs, has
79x fewer parameters, and provides similar or better accuracy than existing
models. We have tested it on the CamVid, Cityscapes, and SUN datasets and
report comparisons with existing state-of-the-art methods, as well as the
trade-offs between accuracy and processing time of a network. We present
performance measurements of the proposed architecture on embedded systems and
suggest possible software improvements that could make ENet even faster.
CGNet: A Light-weight Context Guided Network for Semantic Segmentation
The demand for applying semantic segmentation models on mobile devices has been
increasing rapidly. Current state-of-the-art networks have an enormous number
of parameters and are hence unsuitable for mobile devices, while other
small-memory-footprint models follow the spirit of classification networks and
ignore the inherent characteristics of semantic segmentation. To tackle this
problem, we propose a novel Context Guided Network (CGNet), which is a
light-weight and efficient network for semantic segmentation. We first propose
the Context Guided (CG) block, which learns the joint feature of both local
features and the surrounding context, and further improves the joint feature
with global context. Based on the CG block, we develop CGNet, which captures
contextual information in all stages of the network and is specially tailored
for increasing segmentation accuracy. CGNet is also elaborately designed to
reduce the number of parameters and save memory footprint. With an equivalent
number of parameters, the proposed CGNet significantly outperforms existing
segmentation networks. Extensive experiments on the Cityscapes and CamVid
datasets verify the effectiveness of the proposed approach. Specifically,
without any post-processing or multi-scale testing, the proposed CGNet achieves
64.8% mean IoU on Cityscapes with fewer than 0.5M parameters. The source code
for the complete system can be found at https://github.com/wutianyiRosun/CGNet.
Comment: Code: https://github.com/wutianyiRosun/CGNe