SalsaNet: Fast Road and Vehicle Segmentation in LiDAR Point Clouds for Autonomous Driving
In this paper, we introduce a deep encoder-decoder network, named SalsaNet,
for efficient semantic segmentation of 3D LiDAR point clouds. SalsaNet segments
the road, i.e., the drivable free space, and vehicles in the scene by
employing a bird's-eye-view (BEV) image projection of the point cloud. To
overcome the lack
of annotated point cloud data, in particular for the road segments, we
introduce an auto-labeling process which transfers automatically generated
labels from the camera to the LiDAR. We also explore the role of image-like
projection of LiDAR data in semantic segmentation by comparing BEV with
spherical-front-view projection and show that SalsaNet is projection-agnostic.
We perform quantitative and qualitative evaluations on the KITTI dataset, which
demonstrate that the proposed SalsaNet outperforms other state-of-the-art
semantic segmentation networks in terms of accuracy and computation time. Our
code and data are publicly available at
https://gitlab.com/aksoyeren/salsanet.git
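The abstract does not spell out how the BEV rasterization is computed. As a rough illustration of the general technique (the channel choice, ranges, and grid resolution below are assumptions, not SalsaNet's actual settings), a minimal NumPy sketch could look like this:

```python
import numpy as np

def lidar_to_bev(points, x_range=(0.0, 50.0), y_range=(-25.0, 25.0),
                 z_range=(-2.5, 1.5), resolution=0.25):
    """Rasterize an (N, 4) LiDAR scan [x, y, z, intensity] into a BEV grid
    with mean-height, max-height, and point-density channels (hypothetical
    channel choice for illustration)."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    # Keep only points inside the crop volume.
    keep = ((x >= x_range[0]) & (x < x_range[1]) &
            (y >= y_range[0]) & (y < y_range[1]) &
            (z >= z_range[0]) & (z < z_range[1]))
    x, y, z = x[keep], y[keep], z[keep]

    h = int((x_range[1] - x_range[0]) / resolution)
    w = int((y_range[1] - y_range[0]) / resolution)
    rows = ((x - x_range[0]) / resolution).astype(np.int32)
    cols = ((y - y_range[0]) / resolution).astype(np.int32)

    mean_h = np.zeros((h, w), dtype=np.float32)
    max_h = np.full((h, w), z_range[0], dtype=np.float32)
    density = np.zeros((h, w), dtype=np.float32)

    # Unbuffered scatter-accumulation per grid cell.
    np.add.at(density, (rows, cols), 1.0)
    np.add.at(mean_h, (rows, cols), z)
    np.maximum.at(max_h, (rows, cols), z)

    occupied = density > 0
    mean_h[occupied] /= density[occupied]          # sum -> mean height
    density = np.log1p(density) / np.log(64.0)     # normalized point count
    return np.stack([mean_h, max_h, density], axis=-1)  # (h, w, 3)
```

Such a projection turns the unordered point cloud into a fixed-size image that a standard 2D encoder-decoder can consume.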
Multi-stream CNN based Video Semantic Segmentation for Automated Driving
The majority of semantic segmentation algorithms operate on a single frame,
even in the case of videos. In this work, the goal is to exploit temporal
information within the model to leverage motion cues and temporal
consistency. We propose two simple high-level architectures based on Recurrent
FCN (RFCN) and Multi-Stream FCN (MSFCN) networks. In the case of RFCN, a
recurrent network, namely an LSTM, is inserted between the encoder and
decoder. MSFCN combines
the encoders of different frames into a fused encoder via 1x1 channel-wise
convolution. We use a ResNet50 network as the baseline encoder and construct
three networks, namely MSFCN of order 2 & 3 and RFCN of order 2. MSFCN-3
produces the best results, with accuracy improvements of 9% and 15% for the
Highway and New York-like city scenarios in the SYNTHIA-CVPR'16 dataset using
the mean IoU metric. MSFCN-3 also produced improvements of 11% and 6% over the
baseline FCN network on the SegTrack V2 and DAVIS datasets. We also designed
efficient versions of MSFCN-2 and RFCN-2 using weight sharing between the two
encoders. The efficient MSFCN-2 provided improvements of 11% and 5% for KITTI
and SYNTHIA with a negligible increase in computational complexity compared to
the baseline version.
Comment: Accepted for Oral Presentation at VISAPP 2019
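The abstract describes the fusion step concretely: per-frame encoder features are concatenated and reduced by a 1x1 channel-wise convolution. As a minimal PyTorch sketch of that idea (the shared ResNet-50 trunk, class names, and shapes are assumptions, and the decoder is omitted):

```python
import torch
import torch.nn as nn
import torchvision

class MSFCNFusion(nn.Module):
    """Sketch of a multi-stream fused encoder: one shared ResNet-50 backbone
    encodes each frame, and a 1x1 convolution fuses the concatenated feature
    maps back down to a single-stream channel count."""
    def __init__(self, num_streams=2, feat_channels=2048):
        super().__init__()
        backbone = torchvision.models.resnet50(weights=None)
        # Drop the average pool and classifier; keep the convolutional trunk.
        self.encoder = nn.Sequential(*list(backbone.children())[:-2])
        self.fuse = nn.Conv2d(num_streams * feat_channels, feat_channels,
                              kernel_size=1)

    def forward(self, frames):
        # frames: list of num_streams tensors, each (B, 3, H, W).
        # The same encoder (shared weights) runs on every stream.
        feats = [self.encoder(f) for f in frames]
        # Concatenate along channels, then fuse with a 1x1 convolution.
        return self.fuse(torch.cat(feats, dim=1))
```

Sharing one encoder across streams is what makes the efficient variants cheap: an extra frame adds activations but no extra weights.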
Enhanced free space detection in multiple lanes based on single CNN with scene identification
Many systems for autonomous vehicles' navigation rely on lane detection.
Traditional algorithms usually estimate only the position of the lanes on the
road, but an autonomous control system may also need to know if a lane marking
can be crossed or not, and what portion of space inside the lane is free from
obstacles, to make safer control decisions. On the other hand, free space
detection algorithms only detect navigable areas, without information about
lanes. State-of-the-art algorithms use CNNs for both tasks, with significant
consumption of computing resources. We propose a novel approach that estimates
the free space inside each lane with a single CNN. Additionally, at the cost
of only a small amount of extra GPU RAM, we also infer the road type, which is
useful for path planning. To achieve this, we train a multi-task CNN and then
post-process the network's output to extract polygons that can be used
directly in navigation control. Finally, we provide a
computationally efficient implementation, based on ROS, that can be executed in
real time. Our code and trained models are available online.
Comment: Will appear in the 2019 IEEE Intelligent Vehicles Symposium (IV 2019)
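The abstract leaves the polygon-extraction step unspecified. One plausible minimal post-processing, sketched here with OpenCV (the function name, thresholds, and binary-mask input are assumptions, not the authors' method), is to trace and simplify the contours of the predicted free-space mask:

```python
import cv2
import numpy as np

def mask_to_polygons(free_space_mask, min_area=500.0, epsilon_frac=0.01):
    """Turn a binary free-space mask (H, W) into simplified polygons
    suitable for a navigation stack. Requires OpenCV >= 4."""
    mask = (free_space_mask > 0).astype(np.uint8)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    polygons = []
    for contour in contours:
        if cv2.contourArea(contour) < min_area:
            continue  # drop spurious small blobs
        # Douglas-Peucker simplification, tolerance relative to perimeter.
        epsilon = epsilon_frac * cv2.arcLength(contour, closed=True)
        poly = cv2.approxPolyDP(contour, epsilon, closed=True)
        polygons.append(poly.reshape(-1, 2))  # (num_vertices, 2) arrays
    return polygons
```

A compact vertex list like this is far cheaper to hand to a planner than a full per-pixel mask, which is consistent with the real-time ROS deployment the authors describe.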