Deep Learning for LiDAR Point Clouds in Autonomous Driving: A Review
Recently, the advancement of deep learning in discriminative feature learning
from 3D LiDAR data has led to rapid development in the field of autonomous
driving. However, automatically processing uneven, unstructured, noisy, and massive
3D point clouds is a challenging and tedious task. In this paper, we provide a
systematic review of existing compelling deep learning architectures applied to
LiDAR point clouds, detailing specific tasks in autonomous driving such as
segmentation, detection, and classification. Although several published
research papers focus on specific topics in computer vision for autonomous
vehicles, to date, no general survey on deep learning applied to LiDAR point
clouds for autonomous vehicles exists. Thus, the goal of this paper is to
narrow the gap in this topic. More than 140 key contributions from the past
five years are summarized in this survey, including the milestone 3D deep
architectures; the remarkable deep learning applications in 3D semantic
segmentation, object detection, and classification; and specific datasets,
evaluation metrics, and state-of-the-art performance. Finally, we summarize
the remaining challenges and future research directions.
Comment: 21 pages, submitted to IEEE Transactions on Neural Networks and
Learning System
Modeling Local Geometric Structure of 3D Point Clouds using Geo-CNN
Recent advances in deep convolutional neural networks (CNNs) have motivated
researchers to adapt CNNs to directly model points in 3D point clouds. Modeling
local structure has been proven to be important for the success of
convolutional architectures, and researchers exploited the modeling of local
point sets in the feature extraction hierarchy. However, limited attention has
been paid to explicitly modeling the geometric structure among points in a local
region. To address this problem, we propose Geo-CNN, which applies a generic
convolution-like operation dubbed GeoConv to each point and its local
neighborhood. Local geometric relationships among points are captured when
extracting edge features between the center and its neighboring points. We
first decompose the edge feature extraction process onto three orthogonal
bases, and then aggregate the extracted features based on the angles between
the edge vector and the bases. This encourages the network to preserve the
geometric structure in Euclidean space throughout the feature extraction
hierarchy. GeoConv is a generic and efficient operation that can be easily
integrated into 3D point cloud analysis pipelines for multiple applications. We
evaluate Geo-CNN on ModelNet40 and KITTI and achieve state-of-the-art
performance.
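The angle-based aggregation described in the Geo-CNN abstract above can be illustrated with a minimal sketch. It assumes the three orthogonal bases are the coordinate axes and that each basis is weighted by the squared cosine of its angle to the edge vector, so the weights sum to one. The function names `angle_weights` and `aggregate_edge_feature` and the hand-picked per-axis features are hypothetical; in a real network the per-axis features would come from learned layers.

```python
def angle_weights(center, neighbor):
    """Squared cosines of the angles between the edge vector and the x, y, z axes.

    Assumption: the three orthogonal bases are the coordinate axes.
    """
    edge = [n - c for n, c in zip(neighbor, center)]
    sq_norm = sum(e * e for e in edge)
    if sq_norm == 0.0:
        # Degenerate edge (neighbor equals center): fall back to equal weights.
        return [1.0 / 3.0] * 3
    # cos(angle to axis i) = edge[i] / |edge|, so cos^2 = edge[i]^2 / |edge|^2.
    return [e * e / sq_norm for e in edge]


def aggregate_edge_feature(center, neighbor, per_axis_features):
    """Blend three direction-specific feature vectors using the angle weights."""
    weights = angle_weights(center, neighbor)
    dim = len(per_axis_features[0])
    return [
        sum(w * feat[i] for w, feat in zip(weights, per_axis_features))
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Hypothetical per-axis features for one neighbor (stand-ins for learned ones).
    feats = [[9.0, 0.0], [0.0, 9.0], [0.0, 0.0]]
    print(aggregate_edge_feature((0.0, 0.0, 0.0), (1.0, 2.0, 2.0), feats))
```

Because the squared cosines sum to one, the blended feature stays on the same scale regardless of the edge direction.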
A state of the art of urban reconstruction: street, street network, vegetation, urban feature
The world population is rising, especially the share of people living in cities.
With increased population and complex roles regarding their inhabitants and
their surroundings, cities concentrate difficulties for design, planning and
analysis. These tasks require a way to reconstruct or model a city. Traditionally,
much attention has been given to building reconstruction, yet an essential
part of the city has been neglected: streets. Street reconstruction has seldom
been researched. Streets are also complex compositions of urban features, and have a
unique role in transportation (as they comprise roads). We aim to complete
the recent state of the art for building reconstruction (Musialski2012) by
considering all other aspects of urban reconstruction. We introduce the need for
city models. Because reconstruction always requires data, we first analyse
which data are available. We then present the state of the art of street
reconstruction, street network reconstruction, urban feature
reconstruction/modelling, vegetation, and urban object
reconstruction/modelling.
Although reconstruction strategies vary widely, we can order them by the role
the model plays, from data-driven approaches, to model-based approaches, to inverse
procedural modelling and model catalogue matching. The main challenges seem to
come from the complex nature of the urban environment and from the limitations of
the available data. Urban features have strong relationships with each other and
with their surroundings, as well as hierarchical relations. Procedural
modelling has the power to express these relations, and could be applied to the
reconstruction of urban features via the Inverse Procedural Modelling paradigm.
Comment: Extracted from PhD (chap1
PnPNet: End-to-End Perception and Prediction with Tracking in the Loop
We tackle the problem of joint perception and motion forecasting in the
context of self-driving vehicles. Towards this goal, we propose PnPNet, an
end-to-end model that takes as input sequential sensor data, and outputs at
each time step object tracks and their future trajectories. The key component
is a novel tracking module that generates object tracks online from detections
and exploits trajectory level features for motion forecasting. Specifically,
the object tracks get updated at each time step by solving both the data
association problem and the trajectory estimation problem. Importantly, the
whole model is end-to-end trainable and benefits from joint optimization of all
tasks. We validate PnPNet on two large-scale driving datasets, and show
significant improvements over the state-of-the-art with better occlusion
recovery and more accurate future prediction.
Comment: CVPR202
RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds
We study the problem of efficient semantic segmentation for large-scale 3D
point clouds. By relying on expensive sampling techniques or computationally
heavy pre/post-processing steps, most existing approaches can only be
trained and operated on small-scale point clouds. In this paper, we introduce
RandLA-Net, an efficient and lightweight neural architecture to directly infer
per-point semantics for large-scale point clouds. The key to our approach is to
use random point sampling instead of more complex point selection approaches.
Although remarkably computation and memory efficient, random sampling can
discard key features by chance. To overcome this, we introduce a novel local
feature aggregation module to progressively increase the receptive field for
each 3D point, thereby effectively preserving geometric details. Extensive
experiments show that our RandLA-Net can process 1 million points in a single
pass, up to 200X faster than existing approaches. Moreover, our RandLA-Net
clearly surpasses state-of-the-art approaches for semantic segmentation on two
large-scale benchmarks, Semantic3D and SemanticKITTI.
Comment: CVPR 2020 Oral. Code and data are available at:
https://github.com/QingyongHu/RandLA-Ne
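The trade-off named in the RandLA-Net abstract above, random point sampling versus a more complex point selection scheme, can be sketched as follows. Farthest point sampling is used here only as one common example of a costlier selection method; the abstract does not say which alternatives were compared. The function names are hypothetical and this is not the authors' code.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def random_sample(points, num_keep, seed=0):
    """Keep num_keep points uniformly at random; cost does not depend on geometry."""
    rng = random.Random(seed)
    return rng.sample(points, num_keep)


def farthest_point_sample(points, num_keep):
    """Greedy farthest point sampling, starting from the first point.

    Each step scans all points, so the total cost grows with
    len(points) * num_keep, which is what makes it expensive at scale.
    """
    chosen = [0]
    min_sq = [sq_dist(p, points[0]) for p in points]
    while len(chosen) < num_keep:
        # Pick the point farthest from everything chosen so far.
        nxt = max(range(len(points)), key=min_sq.__getitem__)
        chosen.append(nxt)
        min_sq = [min(d, sq_dist(p, points[nxt])) for d, p in zip(min_sq, points)]
    return [points[i] for i in chosen]


if __name__ == "__main__":
    cloud = [(float(i), 0.0, 0.0) for i in range(10)]
    print(random_sample(cloud, 3))
    print(farthest_point_sample(cloud, 3))
```

Random sampling gives no coverage guarantee, which is the weakness the abstract says its local feature aggregation module is meant to offset.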
Holistic Parametric Reconstruction of Building Models from Point Clouds
Building models are conventionally reconstructed by planar segmentation of
building roof points and then using a topology graph to group the planes
together. Roof edges and vertices are then mathematically represented by
intersecting segmented planes. Technically, such a solution is based on
sequential local fitting, i.e., the entire data of one building do not
simultaneously participate in determining the building model. As a
consequence, the solution lacks topological integrity and geometric rigor.
Fundamentally different from this traditional approach, we propose a holistic
parametric reconstruction method, which takes into consideration the
entire point cloud of one building simultaneously. In our work, building
models are reconstructed from predefined parametric (roof) primitives. We first
use a well-designed deep neural network to segment and identify primitives in
the given building point clouds. A holistic optimization strategy is then
introduced to simultaneously determine the parameters of a segmented primitive.
In the last step, the optimal parameters are used to generate a watertight
building model in CityGML format. The airborne LiDAR dataset RoofN3D with
predefined roof types is used for our test. It is shown that PointNet++ applied
to the entire dataset can achieve an accuracy of 83% for primitive
classification. For a subset of 910 buildings in RoofN3D, the holistic approach
is then used to determine the parameters of primitives and reconstruct the
buildings. The achieved overall quality of reconstruction is 0.08 meters for
point-surface-distance or 0.7 times RMSE of the input LiDAR points. The study
demonstrates the efficiency and capability of the proposed approach and its
potential to handle large-scale urban point clouds.
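The point-surface-distance quality measure quoted in the abstract above can be illustrated with a minimal sketch. It assumes the reconstructed roof is given as a set of infinite planes `(a, b, c, d)` with `a*x + b*y + c*z + d = 0`, and that each LiDAR point is scored against its nearest plane. A real evaluation would use the bounded faces of the model, and the abstract does not specify the exact formula, so the function names and the RMS aggregation are assumptions.

```python
import math


def point_plane_distance(point, plane):
    """Unsigned distance from a 3D point to the plane a*x + b*y + c*z + d = 0."""
    a, b, c, d = plane
    x, y, z = point
    return abs(a * x + b * y + c * z + d) / math.sqrt(a * a + b * b + c * c)


def rms_point_surface_distance(points, planes):
    """Root-mean-square distance from each point to its nearest plane.

    Assumption: planes are unbounded, which ignores the face boundaries.
    """
    squared = [
        min(point_plane_distance(p, plane) for plane in planes) ** 2
        for p in points
    ]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical flat roof at z = 0 and two noisy LiDAR returns.
    roof = [(0.0, 0.0, 1.0, 0.0)]
    returns = [(0.0, 0.0, 0.1), (1.0, 1.0, -0.1)]
    print(rms_point_surface_distance(returns, roof))
```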
Parsing Geometry Using Structure-Aware Shape Templates
Real-life man-made objects often exhibit strong and easily-identifiable
structure, as a direct result of their design or their intended functionality.
Structure typically appears in the form of individual parts and their
arrangement. Knowing about object structure can be an important cue for object
recognition and scene understanding - a key goal for various AR and robotics
applications. However, commodity RGB-D sensors used in these scenarios only
produce raw, unorganized point clouds, without structural information about the
captured scene. Moreover, the generated data is commonly partial and
susceptible to artifacts and noise, which makes inferring the structure of
scanned objects challenging. In this paper, we organize large shape collections
into parameterized shape templates to capture the underlying structure of the
objects. The templates allow us to transfer the structural information onto new
objects and incomplete scans. We employ a deep neural network that matches the
partial scan with one of the shape templates, then matches and fits it to complete
and detailed models from the collection. This allows us to faithfully label its
parts and to guide the reconstruction of the scanned object. We showcase the
effectiveness of our method by comparing it to other state-of-the-art
approaches.
ConvPoint: Continuous Convolutions for Point Cloud Processing
Point clouds are unstructured and unordered data, as opposed to images. Thus,
most machine learning approaches developed for images cannot be directly
transferred to point clouds. In this paper, we propose a generalization of
discrete convolutional neural networks (CNNs) in order to deal with point
clouds by replacing discrete kernels with continuous ones. This formulation is
simple, allows arbitrary point cloud sizes and can easily be used for designing
neural networks similarly to 2D CNNs. We present experimental results with
various architectures, highlighting the flexibility of the proposed approach.
We obtain competitive results compared to the state-of-the-art on shape
classification, part segmentation and semantic segmentation for large-scale
point clouds.
Comment: 12 page
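The idea of replacing discrete kernels with continuous ones, as stated in the ConvPoint abstract above, can be sketched as follows. Here the kernel is a set of hypothetical kernel points with scalar weights, and the response at any relative offset is a Gaussian-weighted sum over those kernel points. The Gaussian weighting is an assumption chosen for simplicity and is not claimed to be the weighting function used in the paper. Features are scalars to keep the example short.

```python
import math


def continuous_conv(center, neighbors, features, kernel_points, kernel_weights, sigma=1.0):
    """Evaluate a continuous convolution at one center point.

    The kernel is defined for every offset in space, not on a fixed grid:
    each kernel point contributes according to its distance to the offset.
    """
    total = 0.0
    for position, feature in zip(neighbors, features):
        offset = [p - c for p, c in zip(position, center)]
        response = 0.0
        for kernel_point, weight in zip(kernel_points, kernel_weights):
            sq = sum((o - k) ** 2 for o, k in zip(offset, kernel_point))
            response += weight * math.exp(-sq / (2.0 * sigma ** 2))
        total += response * feature
    # Averaging makes the output independent of neighbor order and count.
    return total / len(neighbors)


if __name__ == "__main__":
    kernel_pts = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    kernel_w = [2.0, -1.0]
    nbrs = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.5, 0.5, 0.0)]
    feats = [3.0, 1.0, 2.0]
    print(continuous_conv((0.0, 0.0, 0.0), nbrs, feats, kernel_pts, kernel_w))
```

Since the sum runs over whatever neighbors are present, the same kernel applies to neighborhoods of any size, which matches the abstract's point about arbitrary point cloud sizes.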
SeqLPD: Sequence Matching Enhanced Loop-Closure Detection Based on Large-Scale Point Cloud Description for Self-Driving Vehicles
Place recognition and loop-closure detection are major challenges in the
localization, mapping and navigation tasks of self-driving vehicles. In this
paper, we solve the loop-closure detection problem by combining a
deep-learning based point cloud description method with a coarse-to-fine
sequence matching strategy. More specifically, we propose a deep neural network
to extract a global descriptor from the original large-scale 3D point cloud.
Based on this descriptor, a typical place analysis approach is presented to
investigate the feature space distribution of the global descriptors and select
several super keyframes. Finally, a coarse-to-fine strategy, which includes a
super keyframe based coarse matching stage and a local sequence matching stage,
is presented to ensure loop-closure detection accuracy and real-time
performance simultaneously. Thanks to the sequence matching operation, the
proposed approach obtains an improvement over the existing deep-learning
based methods. Experimental results on a self-driving vehicle validate the
effectiveness of the proposed loop-closure detection algorithm.
Comment: This paper has been accepted by IROS-201
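The two-stage matching described in the SeqLPD abstract above can be sketched under simplifying assumptions. Here each super keyframe is assumed to stand for a cluster of database frame indices, the coarse stage picks the super keyframe nearest to the current descriptor, and the fine stage compares the recent query sequence against consecutive database frames ending at each candidate. The data layout, the squared-distance cost, and the name `coarse_to_fine_match` are all hypothetical and not taken from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two global descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def coarse_to_fine_match(query_seq, database, clusters):
    """Return the database index that best matches the end of query_seq.

    query_seq: recent query descriptors, oldest first.
    database:  list of stored descriptors in temporal order.
    clusters:  dict mapping a super keyframe index to its member frame indices.
    """
    current = query_seq[-1]
    # Coarse stage: compare only against the few super keyframes.
    best_key = min(clusters, key=lambda k: sq_dist(current, database[k]))

    def sequence_cost(candidate):
        start = candidate - len(query_seq) + 1
        if start < 0:
            return float("inf")  # Not enough history before this candidate.
        return sum(sq_dist(q, database[start + i]) for i, q in enumerate(query_seq))

    # Fine stage: sequence matching restricted to the selected cluster.
    return min(clusters[best_key], key=sequence_cost)


if __name__ == "__main__":
    db = [[0.0], [1.0], [2.0], [3.0], [10.0], [11.0], [12.0]]
    groups = {1: [0, 1, 2, 3], 5: [4, 5, 6]}
    print(coarse_to_fine_match([[1.1], [2.1]], db, groups))
```

Restricting the fine search to one cluster is what keeps the cost low, while scoring a whole sequence rather than a single frame reduces the chance of a false match.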
Linked Dynamic Graph CNN: Learning on Point Cloud via Linking Hierarchical Features
Learning on point clouds is in high demand because the point cloud is a
common type of geometric data and can help robots understand environments
robustly. However, the point cloud is sparse, unstructured, and unordered,
and cannot be recognized accurately by either a traditional convolutional neural
network (CNN) or a recurrent neural network (RNN). Fortunately, a graph
convolutional neural network (Graph CNN) can process sparse and unordered data.
Hence, in this paper we propose a linked dynamic graph CNN (LDGCNN) to classify and segment
point clouds directly. We remove the transformation network, link
hierarchical features from dynamic graphs, freeze the feature extractor, and
retrain the classifier to increase the performance of LDGCNN. We explain our
network using theoretical analysis and visualization. Through experiments, we
show that the proposed LDGCNN achieves state-of-the-art performance on two standard
datasets: ModelNet40 and ShapeNet.