Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds
We propose a spherical kernel for efficient graph convolution of 3D point
clouds. Our metric-based kernels systematically quantize the local 3D space to
identify distinctive geometric relationships in the data. Similar to the
regular grid CNN kernels, the spherical kernel maintains translation-invariance
and asymmetry properties, where the former guarantees weight sharing among
similar local structures in the data and the latter facilitates fine geometric
learning. The proposed kernel is applied to graph neural networks without
edge-dependent filter generation, making it computationally attractive for
large point clouds. In our graph networks, each vertex is associated with a
single point location and edges connect the neighborhood points within a
defined range. The graph is coarsened in the network with farthest point
sampling. Analogous to the standard CNNs, we define pooling and unpooling
operations for our network. We demonstrate the effectiveness of the proposed
spherical kernel with graph neural networks for point cloud classification and
semantic segmentation using ModelNet, ShapeNet, RueMonge2014, ScanNet and S3DIS
datasets. The source code and the trained models can be downloaded from
https://github.com/hlei-ziyan/SPH3D-GCN.
Comment: Accepted to TPAMI.
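As a rough illustration of the kernel's binning step, the sketch below quantizes neighbor offsets into azimuth-elevation-radius bins and shares one weight matrix per bin; the function name, bin counts, and radius are illustrative assumptions, not the released SPH3D-GCN implementation.

```python
import numpy as np

def spherical_bin_index(rel_xyz, n_azim=8, n_elev=2, n_rad=3, radius=0.2):
    """Quantize relative neighbor offsets into spherical bins.

    rel_xyz: (N, 3) offsets of neighbors from the center point.
    Returns an integer bin id per neighbor in [0, n_azim * n_elev * n_rad).
    """
    x, y, z = rel_xyz[:, 0], rel_xyz[:, 1], rel_xyz[:, 2]
    r = np.linalg.norm(rel_xyz, axis=1) + 1e-12
    azim = np.arctan2(y, x)                                  # [-pi, pi)
    elev = np.arcsin(np.clip(z / r, -1.0, 1.0))              # [-pi/2, pi/2]

    a = np.clip(((azim + np.pi) / (2 * np.pi) * n_azim).astype(int), 0, n_azim - 1)
    e = np.clip(((elev + np.pi / 2) / np.pi * n_elev).astype(int), 0, n_elev - 1)
    d = np.clip((r / radius * n_rad).astype(int), 0, n_rad - 1)
    return (d * n_elev + e) * n_azim + a

# Each bin owns its own weight matrix; neighbors falling into the same bin
# share weights, the spherical analogue of weight sharing in grid CNN kernels.
rng = np.random.default_rng(0)
offsets = rng.uniform(-0.2, 0.2, size=(16, 3))   # neighbors of one vertex
bins = spherical_bin_index(offsets)
weights = rng.normal(size=(8 * 2 * 3, 4, 4))     # (num_bins, C_in, C_out)
feats = rng.normal(size=(16, 4))                 # neighbor features
out = np.einsum('nc,nco->o', feats, weights[bins]) / len(bins)
print(bins, out.shape)
```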
Spherical CNNs on Unstructured Grids
We present an efficient convolution kernel for Convolutional Neural Networks
(CNNs) on unstructured grids using parameterized differential operators while
focusing on spherical signals such as panorama images or planetary signals. To
this end, we replace conventional convolution kernels with linear combinations
of differential operators that are weighted by learnable parameters.
Differential operators can be efficiently estimated on unstructured grids using
one-ring neighbors, and learnable parameters can be optimized through standard
back-propagation. As a result, we obtain extremely efficient neural networks
that match or outperform state-of-the-art network architectures in terms of
performance but with a significantly lower number of network parameters. We
evaluate our algorithm in an extensive series of experiments on a variety of
computer vision and climate science tasks, including shape classification,
climate pattern segmentation, and omnidirectional image semantic segmentation.
Overall, we present (1) a novel CNN approach on unstructured grids using
parameterized differential operators for spherical signals, and (2) we show
that our unique kernel parameterization allows our model to achieve the same or
higher accuracy with significantly fewer network parameters.
Comment: Accepted as a conference paper at ICLR 2019. Code available at
https://github.com/maxjiang93/ugscn
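A minimal sketch of the idea, assuming precomputed sparse one-ring operators (the names `grad_x`, `grad_y`, and `laplacian` are placeholders, not the released ugscn API): the layer is simply a learned linear mix of the identity and a few differential operators applied to the vertex signal.

```python
import torch
import torch.nn as nn

class DiffOpConv(nn.Module):
    """Convolution as a learned mix of differential operators (sketch)."""
    def __init__(self, c_in, c_out):
        super().__init__()
        # one pointwise linear map per operator: identity, d/dx, d/dy, Laplacian
        self.mix = nn.ModuleList([nn.Linear(c_in, c_out, bias=False) for _ in range(4)])
        self.bias = nn.Parameter(torch.zeros(c_out))

    def forward(self, feats, grad_x, grad_y, laplacian):
        # feats: (V, c_in) signal on the vertices; operators: sparse (V, V)
        ops = [feats,
               torch.sparse.mm(grad_x, feats),
               torch.sparse.mm(grad_y, feats),
               torch.sparse.mm(laplacian, feats)]
        return sum(lin(op) for lin, op in zip(self.mix, ops)) + self.bias

# toy usage with identity operators standing in for the real one-ring estimates
V, c_in, c_out = 6, 3, 5
eye = torch.eye(V).to_sparse()
layer = DiffOpConv(c_in, c_out)
print(layer(torch.randn(V, c_in), eye, eye, eye).shape)  # (6, 5)
```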
VV-Net: Voxel VAE Net with Group Convolutions for Point Cloud Segmentation
We present a novel algorithm for point cloud segmentation. Our approach
transforms unstructured point clouds into regular voxel grids, and further uses
a kernel-based interpolated variational autoencoder (VAE) architecture to
encode the local geometry within each voxel. Traditionally, the voxel
representation only comprises Boolean occupancy information which fails to
capture the sparsely distributed points within voxels in a compact manner. In
order to handle sparse distributions of points, we further employ radial basis
functions (RBF) to compute a local, continuous representation within each
voxel. Our approach results in a good volumetric representation that
effectively tackles noisy point cloud datasets and is more robust for learning.
Moreover, we further introduce group equivariant CNNs into 3D, by defining the
convolution operator on a symmetry group acting on $\mathbb{Z}^3$ and its
isomorphic sets. This improves the expressive capacity without increasing
parameters, leading to more robust segmentation results. We highlight the
performance on standard benchmarks and show that our approach outperforms
state-of-the-art segmentation algorithms on the ShapeNet and S3DIS datasets.
Comment: Accepted by the International Conference on Computer Vision (ICCV) 2019.
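To illustrate the radial-basis-function step, the sketch below turns the sparse points inside one voxel into a fixed-size continuous descriptor by superposing Gaussian kernels on a small sample lattice; the paper solves a proper RBF interpolation system, so treat this as a simplified stand-in with made-up parameter names.

```python
import numpy as np

def voxel_rbf_descriptor(points, grid_res=4, sigma=0.1):
    """Continuous occupancy inside one voxel via Gaussian RBFs (sketch).

    points: (N, 3) coordinates normalized to the unit voxel [0, 1]^3.
    Returns a (grid_res**3,) vector sampled on a regular lattice, which could
    then be fed to the VAE encoder; grid_res and sigma are illustrative.
    """
    ticks = (np.arange(grid_res) + 0.5) / grid_res
    gx, gy, gz = np.meshgrid(ticks, ticks, ticks, indexing='ij')
    samples = np.stack([gx, gy, gz], axis=-1).reshape(-1, 3)   # (grid_res^3, 3)

    if len(points) == 0:
        return np.zeros(len(samples))
    d2 = ((samples[:, None, :] - points[None, :, :]) ** 2).sum(-1)  # squared distances
    return np.exp(-d2 / (2 * sigma ** 2)).sum(axis=1)               # RBF superposition

pts = np.random.rand(7, 3)                 # sparse points inside one voxel
print(voxel_rbf_descriptor(pts).shape)     # (64,)
```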
To the Point: Efficient 3D Object Detection in the Range Image with Graph Convolution Kernels
3D object detection is vital for many robotics applications. For tasks where
a 2D perspective range image exists, we propose to learn a 3D representation
directly from this range image view. To this end, we designed a 2D
convolutional network architecture that carries the 3D spherical coordinates of
each pixel throughout the network. Its layers can consume any arbitrary
convolution kernel in place of the default inner product kernel and exploit the
underlying local geometry around each pixel. We outline four such kernels: a
dense kernel according to the bag-of-words paradigm, and three graph kernels
inspired by recent graph neural network advances: the Transformer, the
PointNet, and the Edge Convolution. We also explore cross-modality fusion with
the camera image, facilitated by operating in the perspective range image view.
Our method performs competitively on the Waymo Open Dataset and improves the
state-of-the-art AP for pedestrian detection from 69.7% to 75.5%. It is also
efficient in that our smallest model, which still outperforms the popular
PointPillars in quality, requires 180 times fewer FLOPS and model parameters.
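The sketch below shows one of the interchangeable kernels, an Edge-Convolution-style kernel that gathers a k x k pixel neighborhood from the range image and feeds the center feature, the neighbor feature difference, and the relative 3D offset of each neighbor to a small MLP; tensor layouts and names are assumptions, not the paper's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RangeEdgeConv(nn.Module):
    """EdgeConv-style kernel over a k x k range-image neighborhood (sketch)."""
    def __init__(self, c_in, c_out, k=3):
        super().__init__()
        self.k = k
        # edge MLP sees (center feature, neighbor - center feature, relative xyz)
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * c_in + 3, c_out), nn.ReLU(), nn.Linear(c_out, c_out))

    def forward(self, feats, xyz):
        # feats: (B, C, H, W) pixel features; xyz: (B, 3, H, W) per-pixel 3D points
        B, C, H, W = feats.shape
        nf = F.unfold(feats, self.k, padding=self.k // 2).view(B, C, self.k ** 2, H * W)
        nx = F.unfold(xyz, self.k, padding=self.k // 2).view(B, 3, self.k ** 2, H * W)
        cf = feats.reshape(B, C, 1, H * W)
        cx = xyz.reshape(B, 3, 1, H * W)

        edge = torch.cat([cf.expand_as(nf), nf - cf, nx - cx], dim=1)  # (B, 2C+3, k^2, HW)
        edge = self.edge_mlp(edge.permute(0, 2, 3, 1))                 # (B, k^2, HW, c_out)
        out = edge.max(dim=1).values                                   # max-pool over neighbors
        return out.permute(0, 2, 1).reshape(B, -1, H, W)

layer = RangeEdgeConv(c_in=8, c_out=16)
y = layer(torch.randn(2, 8, 16, 32), torch.randn(2, 3, 16, 32))
print(y.shape)  # torch.Size([2, 16, 16, 32])
```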
Sim2Real 3D Object Classification using Spherical Kernel Point Convolution and a Deep Center Voting Scheme
While object semantic understanding is essential for most service robotic
tasks, 3D object classification is still an open problem. Learning from
artificial 3D models alleviates the cost of annotation necessary to approach
this problem, but most methods still struggle with the differences existing
between artificial and real 3D data. We conjecture that the cause of those
issue is the fact that many methods learn directly from point coordinates,
instead of the shape, as the former is hard to center and to scale under
variable occlusions reliably. We introduce spherical kernel point convolutions
that directly exploit the object surface, represented as a graph, and a voting
scheme to limit the impact of poor segmentation on the classification results.
Our proposed approach improves upon state-of-the-art methods by up to 36% when
transferring from artificial objects to real objects.
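A toy sketch of the voting idea: each point votes for the object center with a predicted offset, and votes far from the consensus are discarded before averaging, so badly segmented points have limited influence; the median-based inlier rule is an assumption, not the paper's exact scheme.

```python
import numpy as np

def aggregate_center_votes(points, offsets, inlier_radius=0.1):
    """Aggregate per-point votes for an object center (sketch).

    points: (N, 3) observed points; offsets: (N, 3) predicted offsets to the
    center. Votes far from the median vote are rejected as outliers.
    """
    votes = points + offsets                        # (N, 3) candidate centers
    median = np.median(votes, axis=0)
    keep = np.linalg.norm(votes - median, axis=1) < inlier_radius
    return votes[keep].mean(axis=0), keep

pts = np.random.rand(100, 3)
offs = np.array([0.5, 0.5, 0.5]) - pts + 0.01 * np.random.randn(100, 3)
center, keep = aggregate_center_votes(pts, offs)
print(center, keep.sum())
```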
A-CNN: Annularly Convolutional Neural Networks on Point Clouds
Analyzing the geometric and semantic properties of 3D point clouds through
deep networks is still challenging due to the irregular and sparse sampling of
their geometric structures. This paper presents a new method to
define and compute convolution directly on 3D point clouds by the proposed
annular convolution. This new convolution operator can better capture the local
neighborhood geometry of each point by specifying the (regular and dilated)
ring-shaped structures and directions in the computation. It can adapt to the
geometric variability and scalability at the signal processing level. We apply
it to the developed hierarchical neural networks for object classification,
part segmentation, and semantic segmentation in large-scale scenes. The
extensive experiments and comparisons demonstrate that our approach outperforms
the state-of-the-art methods on a variety of standard benchmark datasets (e.g.,
ModelNet10, ModelNet40, ShapeNet-part, S3DIS, and ScanNet).
Comment: 17 pages, 14 figures. To appear in Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition (CVPR), June 2019.
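The sketch below illustrates the ring construction behind annular convolution: neighbors whose distance falls inside one (possibly dilated) annulus are projected onto the tangent plane and ordered by angle, so a 1D convolution can then be slid around the ring; function and parameter names are illustrative.

```python
import numpy as np

def annular_neighbors(points, center, normal, r_inner, r_outer):
    """Select and angularly order neighbors in one ring (sketch).

    Neighbors with distance in [r_inner, r_outer) are projected onto the
    tangent plane of `normal` and sorted by angle around it, mimicking the
    ring-shaped neighborhoods used by annular convolution.
    """
    rel = points - center
    d = np.linalg.norm(rel, axis=1)
    ring = rel[(d >= r_inner) & (d < r_outer)]

    # build a tangent-plane basis and sort neighbors by angle around the normal
    n = normal / np.linalg.norm(normal)
    u = np.cross(n, [1.0, 0.0, 0.0])
    if np.linalg.norm(u) < 1e-6:                 # normal parallel to the x axis
        u = np.cross(n, [0.0, 1.0, 0.0])
    u /= np.linalg.norm(u)
    v = np.cross(n, u)
    angles = np.arctan2(ring @ v, ring @ u)
    return ring[np.argsort(angles)]

pts = np.random.rand(200, 3)
ring = annular_neighbors(pts, center=pts.mean(0),
                         normal=np.array([0.0, 0.0, 1.0]),
                         r_inner=0.1, r_outer=0.3)
print(ring.shape)
```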
Deep Learning for LiDAR Point Clouds in Autonomous Driving: A Review
Recently, the advancement of deep learning in discriminative feature learning
from 3D LiDAR data has led to rapid development in the field of autonomous
driving. However, the automated processing of uneven, unstructured, noisy, and massive
3D point clouds is a challenging and tedious task. In this paper, we provide a
systematic review of existing compelling deep learning architectures applied in
LiDAR point clouds, detailing their use in specific autonomous driving tasks such as
segmentation, detection, and classification. Although several published
research papers focus on specific topics in computer vision for autonomous
vehicles, to date, no general survey on deep learning applied in LiDAR point
clouds for autonomous vehicles exists. Thus, the goal of this paper is to
narrow this gap. More than 140 key contributions from the recent
five years are summarized in this survey, including the milestone 3D deep
architectures, the remarkable deep learning applications in 3D semantic
segmentation, object detection, and classification; specific datasets,
evaluation metrics, and state-of-the-art performance. Finally, we discuss
the remaining challenges and future research directions.
Comment: 21 pages, submitted to IEEE Transactions on Neural Networks and
Learning Systems.
Blended Convolution and Synthesis for Efficient Discrimination of 3D Shapes
Existing networks directly learn feature representations on 3D point clouds
for shape analysis. We argue that 3D point clouds are highly redundant and hold
irregular (permutation-invariant) structure, which makes it difficult to
achieve inter-class discrimination efficiently. In this paper, we propose a
two-faceted solution to this problem that is seamlessly integrated in a single
`Blended Convolution and Synthesis' layer. This fully differentiable layer
performs two critical tasks in succession. In the first step, it projects the
input 3D point clouds into a latent 3D space to synthesize a highly compact and
more inter-class discriminative point cloud representation. Since 3D point
clouds do not follow a Euclidean topology, standard 2/3D Convolutional Neural
Networks offer limited representation capability. Therefore, in the second
step, it uses a novel 3D convolution operator functioning inside the unit ball
($\mathbb{B}^3$) to extract useful volumetric features. We extensively derive
formulae to achieve both translation and rotation of our novel convolution
kernels. Finally, using the proposed techniques we present an extremely
light-weight, end-to-end architecture that achieves compelling results on 3D
shape recognition and retrieval.
Comment: 10 pages; corrected typos and added affiliations. The IEEE Winter
Conference on Applications of Computer Vision (WACV), 2020.
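As a compact illustration of the two-step layer, the sketch below first synthesizes a smaller latent cloud with an MLP and then applies a heavily simplified ball kernel that only weights latent points by their radial bin; the real layer uses analytically derived kernels inside $\mathbb{B}^3$, so every size and name here is an assumption.

```python
import torch
import torch.nn as nn

class BlendedConvSynthesis(nn.Module):
    """Two-step 'synthesize then convolve' layer (sketch)."""
    def __init__(self, n_in=1024, n_latent=256, c_out=32, n_r=4):
        super().__init__()
        # step 1: project the input cloud to a compact latent cloud
        self.synth = nn.Sequential(nn.Linear(n_in * 3, 512), nn.ReLU(),
                                   nn.Linear(512, n_latent * 3), nn.Tanh())
        # step 2: toy ball kernel, one weight vector per radial bin
        self.kernel = nn.Parameter(torch.randn(n_r, c_out))
        self.n_latent, self.n_r = n_latent, n_r

    def forward(self, xyz):                        # xyz: (B, n_in, 3) inside B^3
        B = xyz.shape[0]
        latent = self.synth(xyz.reshape(B, -1)).view(B, self.n_latent, 3)
        r = latent.norm(dim=-1).clamp(max=1.0)     # radius of each latent point
        bins = (r * self.n_r).long().clamp(max=self.n_r - 1)
        return self.kernel[bins].mean(dim=1)       # (B, c_out) pooled response

layer = BlendedConvSynthesis()
print(layer(torch.rand(2, 1024, 3) * 2 - 1).shape)  # (2, 32)
```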
Hausdorff Point Convolution with Geometric Priors
Without a shape-aware response, it is hard to characterize the 3D geometry of
a point cloud efficiently with a compact set of kernels. In this paper, we
advocate the use of Hausdorff distance as a shape-aware distance measure for
calculating point convolutional responses. The technique we present, coined
Hausdorff Point Convolution (HPC), is shape-aware. We show that HPC constitutes
a powerful point feature learning method with a rather compact set of only four
types of geometric priors as kernels. We further develop an HPC-based deep neural
network (HPC-DNN). Task-specific learning can be achieved by tuning the network
weights for combining the shortest distances between input and kernel point
sets. We also realize hierarchical feature learning by designing a multi-kernel
HPC for multi-scale feature encoding. Extensive experiments demonstrate that
HPC-DNN outperforms strong point convolution baselines (e.g., KPConv),
achieving 2.8% mIoU performance boost on S3DIS and 1.5% on SemanticKITTI for
the semantic segmentation task.
Comment: 10 pages, 8 figures.
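The sketch below mimics an HPC response for one neighborhood: for each kernel point set (a geometric prior such as a plane or line segment), take the shortest distance from every kernel point to the neighborhood and combine the distances with learned weights; the aggregation details are assumptions rather than the paper's exact formulation.

```python
import numpy as np

def hpc_response(neighborhood, kernel_sets, weights):
    """Hausdorff-style point convolution response (sketch).

    neighborhood: (N, 3) local points; kernel_sets: list of (M_k, 3) priors;
    weights: learned combination over all kernel points.
    """
    responses = []
    for K in kernel_sets:
        d = np.linalg.norm(K[:, None, :] - neighborhood[None, :, :], axis=-1)
        responses.append(d.min(axis=1))       # shortest distance per kernel point
    responses = np.concatenate(responses)     # (sum of M_k,)
    return responses @ weights                # learned combination of distances

# two toy geometric priors: a planar patch and a line segment
plane = np.stack(np.meshgrid(np.linspace(-1, 1, 3), np.linspace(-1, 1, 3), [0.0]),
                 axis=-1).reshape(-1, 3)
edge = np.stack([np.linspace(-1, 1, 9), np.zeros(9), np.zeros(9)], axis=-1)
neigh = np.random.randn(32, 3)
w = np.random.randn(len(plane) + len(edge))
print(hpc_response(neigh, [plane, edge], w))
```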
Multiresolution Tree Networks for 3D Point Cloud Processing
We present multiresolution tree-structured networks to process point clouds
for 3D shape understanding and generation tasks. Our network represents a 3D
shape as a set of locality-preserving 1D ordered lists of points at multiple
resolutions. This allows efficient feed-forward processing through 1D
convolutions, coarse-to-fine analysis through a multi-grid architecture, and it
leads to faster convergence and a small memory footprint during training. The
proposed tree-structured encoders can be used to classify shapes and outperform
existing point-based architectures on shape classification benchmarks, while
tree-structured decoders can be used for generating point clouds directly and
they outperform existing approaches for image-to-shape inference tasks learned
using the ShapeNet dataset. Our model also allows unsupervised learning of
point-cloud based shapes by using a variational autoencoder, leading to
higher-quality generated shapes.
Comment: Accepted to ECCV 2018. 23 pages, including supplemental material.
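A small sketch of the representation: points are recursively sorted kd-tree style into a locality-preserving 1D order, downsampled into several resolutions by average pooling, and each resolution is processed with ordinary 1D convolutions; the ordering and pooling choices here are illustrative, not the paper's exact pipeline.

```python
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

def kd_order(points, depth=0):
    """Recursively sort points kd-tree style into a locality-preserving
    1D ordering (sketch of the preprocessing idea)."""
    if len(points) <= 1:
        return points
    axis = depth % 3
    pts = points[np.argsort(points[:, axis])]
    mid = len(pts) // 2
    return np.vstack([kd_order(pts[:mid], depth + 1), kd_order(pts[mid:], depth + 1)])

pts = np.random.rand(1024, 3).astype(np.float32)
ordered = torch.from_numpy(kd_order(pts)).T.contiguous().unsqueeze(0)  # (1, 3, 1024)

# multiple resolutions via average pooling, each processed by a 1D convolution
conv = nn.Conv1d(3, 16, kernel_size=3, padding=1)
multires = [ordered, F.avg_pool1d(ordered, 2), F.avg_pool1d(ordered, 4)]
print([conv(x).shape for x in multires])  # lengths 1024, 512, 256
```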