13,282 research outputs found
SqueezeSeg: Convolutional Neural Nets with Recurrent CRF for Real-Time Road-Object Segmentation from 3D LiDAR Point Cloud
In this paper, we address semantic segmentation of road-objects from 3D LiDAR
point clouds. In particular, we wish to detect and categorize instances of
interest, such as cars, pedestrians and cyclists. We formulate this problem as
a point- wise classification problem, and propose an end-to-end pipeline called
SqueezeSeg based on convolutional neural networks (CNN): the CNN takes a
transformed LiDAR point cloud as input and directly outputs a point-wise label
map, which is then refined by a conditional random field (CRF) implemented as a
recurrent layer. Instance-level labels are then obtained by conventional
clustering algorithms. Our CNN model is trained on LiDAR point clouds from the
KITTI dataset, and our point-wise segmentation labels are derived from 3D
bounding boxes from KITTI. To obtain extra training data, we built a LiDAR
simulator into Grand Theft Auto V (GTA-V), a popular video game, to synthesize
large amounts of realistic training data. Our experiments show that SqueezeSeg
achieves high accuracy with astonishingly fast and stable runtime (8.7 ms per
frame), highly desirable for autonomous driving applications. Furthermore,
additionally training on synthesized data boosts validation accuracy on
real-world data. Our source code and synthesized data will be open-sourced
SEGCloud: Semantic Segmentation of 3D Point Clouds
3D semantic scene labeling is fundamental to agents operating in the real
world. In particular, labeling raw 3D point sets from sensors provides
fine-grained semantics. Recent works leverage the capabilities of Neural
Networks (NNs), but are limited to coarse voxel predictions and do not
explicitly enforce global consistency. We present SEGCloud, an end-to-end
framework to obtain 3D point-level segmentation that combines the advantages of
NNs, trilinear interpolation(TI) and fully connected Conditional Random Fields
(FC-CRF). Coarse voxel predictions from a 3D Fully Convolutional NN are
transferred back to the raw 3D points via trilinear interpolation. Then the
FC-CRF enforces global consistency and provides fine-grained semantics on the
points. We implement the latter as a differentiable Recurrent NN to allow joint
optimization. We evaluate the framework on two indoor and two outdoor 3D
datasets (NYU V2, S3DIS, KITTI, Semantic3D.net), and show performance
comparable or superior to the state-of-the-art on all datasets.Comment: Accepted as a spotlight at the International Conference of 3D Vision
(3DV 2017
SegICP: Integrated Deep Semantic Segmentation and Pose Estimation
Recent robotic manipulation competitions have highlighted that sophisticated
robots still struggle to achieve fast and reliable perception of task-relevant
objects in complex, realistic scenarios. To improve these systems' perceptive
speed and robustness, we present SegICP, a novel integrated solution to object
recognition and pose estimation. SegICP couples convolutional neural networks
and multi-hypothesis point cloud registration to achieve both robust pixel-wise
semantic segmentation as well as accurate and real-time 6-DOF pose estimation
for relevant objects. Our architecture achieves 1cm position error and
<5^\circ$ angle error in real time without an initial seed. We evaluate and
benchmark SegICP against an annotated dataset generated by motion capture.Comment: IROS camera-read
Fast LIDAR-based Road Detection Using Fully Convolutional Neural Networks
In this work, a deep learning approach has been developed to carry out road
detection using only LIDAR data. Starting from an unstructured point cloud,
top-view images encoding several basic statistics such as mean elevation and
density are generated. By considering a top-view representation, road detection
is reduced to a single-scale problem that can be addressed with a simple and
fast fully convolutional neural network (FCN). The FCN is specifically designed
for the task of pixel-wise semantic segmentation by combining a large receptive
field with high-resolution feature maps. The proposed system achieved excellent
performance and it is among the top-performing algorithms on the KITTI road
benchmark. Its fast inference makes it particularly suitable for real-time
applications
- …