Dynamic Graph CNN for Learning on Point Clouds
Point clouds provide a flexible geometric representation suitable for
countless applications in computer graphics; they also comprise the raw output
of most 3D data acquisition devices. While hand-designed features on point
clouds have long been proposed in graphics and vision, the recent
overwhelming success of convolutional neural networks (CNNs) for image analysis
suggests the value of adapting insights from CNNs to the point cloud world. Point
clouds inherently lack topological information, so designing a model to recover
topology can enrich the representation power of point clouds. To this end, we
propose a new neural network module dubbed EdgeConv suitable for CNN-based
high-level tasks on point clouds including classification and segmentation.
EdgeConv acts on graphs dynamically computed in each layer of the network. It
is differentiable and can be plugged into existing architectures. Compared to
existing modules operating in extrinsic space or treating each point
independently, EdgeConv has several appealing properties: It incorporates local
neighborhood information; it can be stacked to learn global shape
properties; and in multi-layer systems, affinity in feature space captures
semantic characteristics over potentially long distances in the original
embedding. We demonstrate the performance of our model on standard benchmarks,
including ModelNet40, ShapeNetPart, and S3DIS.
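As a concrete illustration of the idea, here is a minimal EdgeConv-style layer in PyTorch: the k-NN graph is recomputed from the current features in each layer, each point is paired with its neighbour offsets to form edge features, and a shared MLP followed by max-pooling produces the output. The layer sizes and k are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn as nn


def knn_indices(x, k):
    # x: (B, N, C). Pairwise distances in feature space, then k nearest neighbours.
    dist = torch.cdist(x, x)                                   # (B, N, N)
    return dist.topk(k + 1, largest=False).indices[:, :, 1:]   # drop self -> (B, N, k)


class EdgeConv(nn.Module):
    def __init__(self, in_channels, out_channels, k=20):
        super().__init__()
        self.k = k
        # Shared MLP applied to the edge feature [x_i, x_j - x_i].
        self.mlp = nn.Sequential(nn.Linear(2 * in_channels, out_channels), nn.ReLU())

    def forward(self, x):
        B, N, C = x.shape
        idx = knn_indices(x, self.k)              # graph rebuilt from the current features
        neighbors = torch.gather(
            x.unsqueeze(1).expand(B, N, N, C), 2,
            idx.unsqueeze(-1).expand(B, N, self.k, C))          # (B, N, k, C)
        center = x.unsqueeze(2).expand_as(neighbors)
        edge_feat = torch.cat([center, neighbors - center], dim=-1)
        return self.mlp(edge_feat).max(dim=2).values            # max over neighbours


# Two stacked layers: the second layer's k-NN graph lives in learned feature space.
pts = torch.rand(4, 1024, 3)
feats = EdgeConv(3, 64)(pts)
out = EdgeConv(64, 128)(feats)
```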
Deep Learning for LiDAR Point Clouds in Autonomous Driving: A Review
Recently, the advancement of deep learning in discriminative feature learning
from 3D LiDAR data has led to rapid development in the field of autonomous
driving. However, automated processing of uneven, unstructured, noisy, and massive
3D point clouds is a challenging and tedious task. In this paper, we provide a
systematic review of existing compelling deep learning architectures applied to
LiDAR point clouds, focusing on specific tasks in autonomous driving such as
segmentation, detection, and classification. Although several published
research papers focus on specific topics in computer vision for autonomous
vehicles, to date, no general survey on deep learning applied in LiDAR point
clouds for autonomous vehicles exists. Thus, the goal of this paper is to
narrow the gap in this topic. More than 140 key contributions from the last
five years are summarized in this survey, including the milestone 3D deep
architectures, the remarkable deep learning applications in 3D semantic
segmentation, object detection, and classification; specific datasets,
evaluation metrics, and state-of-the-art performance. Finally, we discuss the
remaining challenges and directions for future research.
Comment: 21 pages, submitted to IEEE Transactions on Neural Networks and Learning Systems
PointConv: Deep Convolutional Networks on 3D Point Clouds
Unlike images, which are represented on regular dense grids, 3D point clouds
are irregular and unordered, so applying convolutions to them can be
difficult. In this paper, we extend the dynamic filter to a new convolution
operation, named PointConv. PointConv can be applied on point clouds to build
deep convolutional networks. We treat convolution kernels as nonlinear
functions of the local coordinates of 3D points, composed of weight and density
functions. For a given point, the weight functions are learned with
multi-layer perceptron networks and the density functions through kernel density
estimation. The most important contribution of this work is a novel
reformulation proposed for efficiently computing the weight functions, which
allowed us to dramatically scale up the network and significantly improve its
performance. The learned convolution kernel can be used to compute
translation-invariant and permutation-invariant convolution on any point set in
the 3D space. In addition, PointConv can also be used as a deconvolution
operator to propagate features from a subsampled point cloud back to its
original resolution. Experiments on ModelNet40, ShapeNet, and ScanNet show that
deep convolutional neural networks built on PointConv achieve state-of-the-art
performance on challenging semantic segmentation benchmarks for 3D point
clouds. Moreover, our experiments converting CIFAR-10 into a point cloud show
that networks built on PointConv can match the performance of 2D convolutional
networks of similar structure.
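A naive, unoptimized sketch of the idea, assuming neighbourhoods are already gathered: per-neighbour weights come from an MLP on relative coordinates, and each contribution is rescaled by an inverse kernel-density estimate. The MLP sizes and the Gaussian bandwidth are illustrative assumptions; the paper's efficient reformulation is not shown here.

```python
import torch
import torch.nn as nn


class NaivePointConv(nn.Module):
    """Weights are an MLP of relative coordinates; neighbour contributions are
    reweighted by inverse kernel-density estimates."""

    def __init__(self, in_channels, out_channels, bandwidth=0.1):
        super().__init__()
        self.bandwidth = bandwidth
        self.in_channels, self.out_channels = in_channels, out_channels
        self.weight_mlp = nn.Sequential(      # W: R^3 -> R^{in*out}
            nn.Linear(3, 32), nn.ReLU(),
            nn.Linear(32, in_channels * out_channels))

    def forward(self, rel_xyz, feats):
        # rel_xyz: (N, k, 3) neighbour coords relative to each centre point
        # feats:   (N, k, C_in) neighbour features
        N, k, _ = rel_xyz.shape
        # Gaussian kernel-density estimate within each neighbourhood.
        sq_dist = (rel_xyz.unsqueeze(2) - rel_xyz.unsqueeze(1)).pow(2).sum(-1)
        density = torch.exp(-sq_dist / (2 * self.bandwidth ** 2)).mean(dim=-1)   # (N, k)
        weights = self.weight_mlp(rel_xyz).view(N, k, self.in_channels, self.out_channels)
        scaled = feats / density.clamp(min=1e-6).unsqueeze(-1)                   # inverse-density scaling
        # Sum over neighbours of W(dx)^T f_j, i.e. a continuous convolution.
        return torch.einsum('nkc,nkco->no', scaled, weights)


conv = NaivePointConv(in_channels=16, out_channels=64)
out = conv(torch.randn(1024, 20, 3), torch.randn(1024, 20, 16))   # (1024, 64)
```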
3D Point Cloud Generative Adversarial Network Based on Tree Structured Graph Convolutions
In this paper, we propose a novel generative adversarial network (GAN) for 3D
point cloud generation, called tree-GAN. To achieve state-of-the-art
performance for multi-class 3D point cloud generation, a tree-structured graph
convolution network (TreeGCN) is introduced as a generator for tree-GAN.
Because TreeGCN performs graph convolutions within a tree, it can use ancestor
information to boost the representational power of its features. To evaluate GANs
for 3D point clouds accurately, we develop a novel evaluation metric called
Frechet point cloud distance (FPD). Experimental results demonstrate that the
proposed tree-GAN outperforms state-of-the-art GANs in terms of both
conventional metrics and FPD, and can generate point clouds for different
semantic parts without prior knowledge.
Comment: 10 pages
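For intuition, the Fréchet point cloud distance follows the same Gaussian Fréchet distance formula as FID, computed on feature vectors extracted from real and generated point clouds. Below is a sketch of that formula; the point cloud feature extractor that produces the vectors is assumed and not shown.

```python
import numpy as np
from scipy import linalg


def frechet_distance(feats_real, feats_gen):
    """feats_*: (num_samples, feat_dim) features of real/generated point clouds."""
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    sigma_r = np.cov(feats_real, rowvar=False)
    sigma_g = np.cov(feats_gen, rowvar=False)
    # Matrix square root of the product of the two covariances.
    covmean, _ = linalg.sqrtm(sigma_r @ sigma_g, disp=False)
    covmean = covmean.real   # discard tiny imaginary parts from numerical error
    diff = mu_r - mu_g
    return diff @ diff + np.trace(sigma_r + sigma_g - 2.0 * covmean)


# Example with random stand-in features.
d = frechet_distance(np.random.randn(500, 128), np.random.randn(500, 128))
```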
Octree guided CNN with Spherical Kernels for 3D Point Clouds
We propose an octree guided neural network architecture and spherical
convolutional kernel for machine learning from arbitrary 3D point clouds. The
network architecture capitalizes on the sparse nature of irregular point
clouds, and hierarchically coarsens the data representation with space
partitioning. At the same time, the proposed spherical kernels systematically
quantize point neighborhoods to identify local geometric structures in the
data, while maintaining the properties of translation-invariance and asymmetry.
We specify spherical kernels with the help of network neurons that in turn are
associated with spatial locations. We exploit this association to avert dynamic
kernel generation during network training, which enables efficient learning with
high-resolution point clouds. The effectiveness of the proposed technique is
established on the benchmark tasks of 3D object classification and
segmentation, achieving new state-of-the-art results on the ShapeNet and RueMonge2014
datasets.
Comment: Accepted at IEEE CVPR 2019. arXiv admin note: substantial text overlap with arXiv:1805.0787
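A minimal sketch of the quantization step: each neighbour of a point is assigned to a radial/azimuth/elevation bin of a spherical kernel, and each bin would then own its own weight matrix. The bin counts and radius here are illustrative assumptions, not the paper's configuration.

```python
import numpy as np


def spherical_bin_index(rel_xyz, radius=1.0, n_radial=3, n_azimuth=8, n_elevation=2):
    """rel_xyz: (k, 3) neighbour coordinates relative to the kernel centre.
    Returns an integer bin id per neighbour in [0, n_radial*n_azimuth*n_elevation)."""
    x, y, z = rel_xyz[:, 0], rel_xyz[:, 1], rel_xyz[:, 2]
    r = np.linalg.norm(rel_xyz, axis=1)
    azimuth = np.arctan2(y, x)                                               # [-pi, pi)
    elevation = np.arcsin(np.clip(z / np.maximum(r, 1e-9), -1.0, 1.0))       # [-pi/2, pi/2]

    r_bin = np.minimum((r / radius * n_radial).astype(int), n_radial - 1)
    a_bin = np.minimum(((azimuth + np.pi) / (2 * np.pi) * n_azimuth).astype(int), n_azimuth - 1)
    e_bin = np.minimum(((elevation + np.pi / 2) / np.pi * n_elevation).astype(int), n_elevation - 1)
    return (r_bin * n_azimuth + a_bin) * n_elevation + e_bin


# Each neighbour is multiplied by the weight matrix of its bin and the results
# are summed, much like a discrete convolution over spherical cells.
bins = spherical_bin_index(np.random.uniform(-1, 1, size=(32, 3)))
```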
Permutation Matters: Anisotropic Convolutional Layer for Learning on Point Clouds
There is a growing demand for efficient representation learning on
point clouds in many 3D computer vision applications. Behind the success of
convolutional neural networks (CNNs) is the fact that the data (e.g., images) are
Euclidean-structured. However, point clouds are irregular and unordered.
Various point neural networks have been developed with isotropic filters or
using weighting matrices to overcome the structure inconsistency on point
clouds. However, isotropic filters or weighting matrices limit the
representation power. In this paper, we propose a permutable anisotropic
convolutional operation (PAI-Conv) that calculates soft-permutation matrices
for each point using dot-product attention according to a set of evenly
distributed kernel points on a sphere's surface and then applies shared anisotropic
filters. The dot product with kernel points is analogous to the dot product
with keys in the Transformer widely used in natural language processing (NLP).
From this perspective, PAI-Conv can be regarded as a transformer for point
clouds; it is physically meaningful and cooperates robustly with the efficient
random point sampling method. Comprehensive
experiments on point clouds demonstrate that PAI-Conv produces competitive
results in classification and semantic segmentation tasks compared to
state-of-the-art methods.
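A rough sketch of the mechanism described above: neighbours are softly aligned to a fixed set of kernel points on the unit sphere via dot-product attention, and a shared anisotropic filter is then applied in that canonical order. The Fibonacci-sphere kernel points and layer sizes are assumptions, not the paper's exact construction.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


def fibonacci_sphere(m):
    """m roughly evenly distributed points on the unit sphere."""
    i = torch.arange(m, dtype=torch.float32)
    phi = math.pi * (3.0 - math.sqrt(5.0)) * i
    z = 1.0 - 2.0 * (i + 0.5) / m
    r = torch.sqrt(1.0 - z * z)
    return torch.stack([r * torch.cos(phi), r * torch.sin(phi), z], dim=1)   # (m, 3)


class SoftPermuteConv(nn.Module):
    def __init__(self, in_channels, out_channels, num_kernel_points=16):
        super().__init__()
        self.register_buffer('kernel_pts', fibonacci_sphere(num_kernel_points))
        self.filter = nn.Linear(num_kernel_points * in_channels, out_channels)

    def forward(self, rel_xyz, feats):
        # rel_xyz: (N, k, 3) relative neighbour coords, feats: (N, k, C)
        attn = torch.einsum('nkd,md->nmk', rel_xyz, self.kernel_pts)   # dot-product scores
        perm = F.softmax(attn, dim=-1)                                  # soft permutation (N, m, k)
        aligned = perm @ feats                                          # neighbours re-ordered, (N, m, C)
        return self.filter(aligned.flatten(1))                          # shared anisotropic filter


conv = SoftPermuteConv(in_channels=32, out_channels=64)
out = conv(torch.randn(1024, 20, 3), torch.randn(1024, 20, 32))         # (1024, 64)
```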
MeshCNN: A Network with an Edge
Polygonal meshes provide an efficient representation for 3D shapes. They
explicitly capture both shape surface and topology, and leverage non-uniformity
to represent large flat regions as well as sharp, intricate features. This
non-uniformity and irregularity, however, inhibits mesh analysis efforts using
neural networks that combine convolution and pooling operations. In this paper,
we utilize the unique properties of the mesh for a direct analysis of 3D shapes
using MeshCNN, a convolutional neural network designed specifically for
triangular meshes. Analogous to classic CNNs, MeshCNN combines specialized
convolution and pooling layers that operate on the mesh edges, by leveraging
their intrinsic geodesic connections. Convolutions are applied on edges and the
four edges of their incident triangles, and pooling is applied via an edge
collapse operation that retains surface topology, thereby generating new mesh
connectivity for the subsequent convolutions. MeshCNN learns which edges to
collapse, thus forming a task-driven process where the network exposes and
expands the important features while discarding the redundant ones. We
demonstrate the effectiveness of our task-driven pooling on various learning
tasks applied to 3D meshes.
Comment: For a two-minute explanation video see https://bit.ly/meshcnnvide
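A condensed sketch of the edge convolution described above: each edge is combined with the four edges of its two incident triangles, using symmetric combinations so the result does not depend on neighbour ordering. The tensor layout and layer sizes are illustrative assumptions; edge-collapse pooling is not shown.

```python
import torch
import torch.nn as nn


class EdgeMeshConv(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        # 5 inputs per edge: the edge itself plus 4 order-invariant neighbour terms.
        self.mlp = nn.Linear(5 * in_channels, out_channels)

    def forward(self, edge_feats, edge_neighbors):
        # edge_feats: (E, C); edge_neighbors: (E, 4) indices of the four edges
        # (a, b, c, d) belonging to the two triangles incident to each edge.
        a, b, c, d = (edge_feats[edge_neighbors[:, i]] for i in range(4))
        e = edge_feats
        # Symmetric combinations remove the ambiguity in neighbour ordering.
        sym = torch.cat([e, (a - c).abs(), a + c, (b - d).abs(), b + d], dim=-1)
        return self.mlp(sym)


# Toy example on a mesh with 10 edges and 8-dimensional edge features.
feats = torch.randn(10, 8)
nbrs = torch.randint(0, 10, (10, 4))
out = EdgeMeshConv(8, 16)(feats, nbrs)    # (10, 16)
```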
SAWNet: A Spatially Aware Deep Neural Network for 3D Point Cloud Processing
Deep neural networks have established themselves as the state-of-the-art
methodology in almost all computer vision tasks to date. But their application
to processing data lying on non-Euclidean domains is still a very active area
of research. One such area is the analysis of point cloud data which poses a
challenge due to its lack of order. Many recent techniques have been proposed,
spearheaded by the PointNet architecture. These techniques use either global or
local information from the point clouds to extract a latent representation for
the points, which is then used for the task at hand
(classification/segmentation). In our work, we introduce a neural network layer
that combines both global and local information to produce better embeddings of
these points. We enhance our architecture with residual connections, to pass
information between the layers, which also makes the network easier to train.
We achieve state-of-the-art results on the ModelNet40 dataset with our
architecture, and our results are also highly competitive with the
state-of-the-art on the ShapeNet part segmentation dataset and the indoor scene
segmentation dataset. We plan to open-source our pre-trained models on GitHub
to encourage the research community to test our networks on their data, or
simply use them for benchmarking purposes.
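A generic sketch of one way a layer can fuse local neighbourhood context with a global descriptor through a residual connection, in the spirit of the description above; this is not SAWNet's published layer, and k, the layer sizes, and the fusion scheme are all assumptions.

```python
import torch
import torch.nn as nn


class LocalGlobalBlock(nn.Module):
    def __init__(self, channels, k=16):
        super().__init__()
        self.k = k
        self.local_mlp = nn.Linear(channels, channels)
        self.global_mlp = nn.Linear(channels, channels)
        self.fuse = nn.Linear(2 * channels, channels)

    def forward(self, xyz, feats):
        # xyz: (B, N, 3) coordinates, feats: (B, N, C) per-point features.
        B, N, C = feats.shape
        idx = torch.cdist(xyz, xyz).topk(self.k, largest=False).indices       # (B, N, k)
        nbr = torch.gather(feats.unsqueeze(1).expand(B, N, N, C), 2,
                           idx.unsqueeze(-1).expand(B, N, self.k, C))
        local = torch.relu(self.local_mlp(nbr)).max(dim=2).values             # local context
        glob = torch.relu(self.global_mlp(feats.max(dim=1).values))           # (B, C) global descriptor
        glob = glob.unsqueeze(1).expand(B, N, C)
        # Residual connection eases optimisation across stacked blocks.
        return feats + self.fuse(torch.cat([local, glob], dim=-1))


out = LocalGlobalBlock(64)(torch.randn(2, 512, 3), torch.randn(2, 512, 64))   # (2, 512, 64)
```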
ConvPoint: Continuous Convolutions for Point Cloud Processing
Point clouds are unstructured and unordered data, as opposed to images. Thus,
most machine learning approaches developed for images cannot be directly
transferred to point clouds. In this paper, we propose a generalization of
discrete convolutional neural networks (CNNs) to deal with point
clouds by replacing discrete kernels with continuous ones. This formulation is
simple, allows arbitrary point cloud sizes and can easily be used for designing
neural networks similarly to 2D CNNs. We present experimental results with
various architectures, highlighting the flexibility of the proposed approach.
We obtain competitive results compared to the state-of-the-art on shape
classification, part segmentation and semantic segmentation for large-scale
point clouds.
Comment: 12 pages
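A sketch of a continuous convolution in this spirit: a small set of kernel elements with learnable positions replaces the discrete grid, and the correlation between an input point and a kernel element is a continuous (MLP) function of their relative position. All sizes are illustrative assumptions rather than the ConvPoint reference code.

```python
import torch
import torch.nn as nn


class ContinuousConv(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=8):
        super().__init__()
        self.kernel_pos = nn.Parameter(torch.randn(kernel_size, 3) * 0.1)   # learnable kernel locations
        self.relation = nn.Sequential(nn.Linear(3, 16), nn.ReLU(), nn.Linear(16, 1))
        self.weight = nn.Parameter(torch.randn(kernel_size, in_channels, out_channels) * 0.01)

    def forward(self, rel_xyz, feats):
        # rel_xyz: (N, k, 3) neighbour offsets from each output point; feats: (N, k, C_in)
        offsets = rel_xyz.unsqueeze(2) - self.kernel_pos           # (N, k, K, 3)
        corr = self.relation(offsets).squeeze(-1)                  # continuous correlation, (N, k, K)
        gathered = torch.einsum('nkK,nkc->nKc', corr, feats)       # distribute features onto kernel elements
        return torch.einsum('nKc,Kco->no', gathered, self.weight)  # apply per-element weights and sum


conv = ContinuousConv(in_channels=16, out_channels=64)
out = conv(torch.randn(2048, 16, 3), torch.randn(2048, 16, 16))    # (2048, 64)
```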
RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds
We study the problem of efficient semantic segmentation for large-scale 3D
point clouds. Because they rely on expensive sampling techniques or computationally
heavy pre/post-processing steps, most existing approaches can only be
trained on and operate over small-scale point clouds. In this paper, we introduce
RandLA-Net, an efficient and lightweight neural architecture to directly infer
per-point semantics for large-scale point clouds. The key to our approach is to
use random point sampling instead of more complex point selection approaches.
Although remarkably computation- and memory-efficient, random sampling can
discard key features by chance. To overcome this, we introduce a novel local
feature aggregation module to progressively increase the receptive field for
each 3D point, thereby effectively preserving geometric details. Extensive
experiments show that our RandLA-Net can process 1 million points in a single
pass, up to 200x faster than existing approaches. Moreover, our RandLA-Net
clearly surpasses state-of-the-art approaches for semantic segmentation on two
large-scale benchmarks, Semantic3D and SemanticKITTI.
Comment: CVPR 2020 Oral. Code and data are available at:
https://github.com/QingyongHu/RandLA-Ne
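A compact sketch of the two ingredients highlighted above, random point sampling and attentive local aggregation, assuming neighbour features are already gathered by k-NN; the layer sizes, sampling ratio, and exact local encoding are assumptions, not the released architecture.

```python
import torch
import torch.nn as nn


def random_sample(xyz, feats, ratio=0.25):
    """Keep a random subset of points; O(1) per point, unlike farthest-point sampling."""
    n_keep = max(1, int(xyz.shape[1] * ratio))
    idx = torch.randperm(xyz.shape[1])[:n_keep]
    return xyz[:, idx], feats[:, idx]


class AttentivePooling(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Linear(channels, channels)

    def forward(self, nbr_feats):
        # nbr_feats: (B, N, k, C) features of each point's neighbours.
        attn = torch.softmax(self.score(nbr_feats), dim=2)   # learned per-neighbour weights
        return (attn * nbr_feats).sum(dim=2)                 # (B, N, C) aggregated local feature


# Usage with stand-in neighbour features (in practice gathered by k-NN); stacking
# such aggregate-then-subsample blocks grows the receptive field even though each
# sampling step is random.
nbr_feats = torch.randn(2, 4096, 16, 32)
pooled = AttentivePooling(32)(nbr_feats)                      # (2, 4096, 32)
xyz_ds, feats_ds = random_sample(torch.rand(2, 4096, 3), pooled)
```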