Addressing Overfitting on Pointcloud Classification using Atrous XCRF
Advances in techniques for automated classification of pointcloud data
introduce great opportunities for many new and existing applications. However,
with a limited number of labeled points, automated classification by a machine
learning model is prone to overfitting and poor generalization. The present
paper addresses this problem by inducing controlled noise (on a trained model)
generated by invoking conditional random field similarity penalties using
nearby features. The method is called Atrous XCRF and works by forcing a
trained model to respect the similarity penalties provided by unlabeled data.
In a benchmark study carried out using the ISPRS 3D labeling dataset, our
technique achieves 84.97% in terms of overall accuracy and 71.05% in terms of
F1 score. The result is on par with the current best model for the benchmark
dataset and has the highest F1 score.
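The similarity-penalty idea above can be illustrated with a minimal NumPy sketch (a toy pairwise loss, not the paper's actual Atrous XCRF inference; `similarity_penalty` and all parameters are hypothetical): nearby points with similar features are penalized for receiving different class distributions.

```python
import numpy as np

def similarity_penalty(probs, feats, coords, radius=1.0, sigma=0.5):
    """Toy CRF-style penalty: nearby points with similar features
    should receive similar class distributions.
    probs:  (N, C) predicted class probabilities
    feats:  (N, F) per-point features
    coords: (N, 3) point positions
    """
    n = len(probs)
    penalty = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(coords[i] - coords[j]) < radius:
                # Gaussian feature-similarity weight
                w = np.exp(-np.sum((feats[i] - feats[j]) ** 2) / (2 * sigma ** 2))
                # squared difference of the predicted distributions
                penalty += w * np.sum((probs[i] - probs[j]) ** 2)
    return penalty
```

Minimizing such a term on unlabeled points pushes a trained model to respect neighborhood similarity, which is the intuition the abstract describes.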
3D Object Recognition with Ensemble Learning --- A Study of Point Cloud-Based Deep Learning Models
In this study, we present an analysis of model-based ensemble learning for 3D
point-cloud object classification and detection. An ensemble of multiple model
instances is known to outperform a single model instance, but ensemble
learning for 3D point clouds has received little study. First, an ensemble
of multiple model instances trained on the same part of the
dataset was tested for seven deep learning, point
cloud-based classification algorithms: ,
, , ,
, , and . Second, the
ensemble of different architectures was tested. Results of our experiments show
that the tested ensemble learning methods improve over state-of-the-art on the
dataset, from to for the ensemble of
single architecture instances, for two different architectures, and
for five different architectures. We show that the ensemble of two
models with different architectures can be as effective as the ensemble of 10
models with the same architecture. Third, classic bagging (i.e., with
different subsets used for training multiple model instances) was tested, and
sources of ensemble accuracy growth were investigated for the best-performing
architecture, i.e. . We also investigate the ensemble learning
of approach in the task of 3D object detection,
increasing the average precision of 3D box detection on the
dataset from to using only three model instances. We measure
the inference time of all 3D classification architectures on a , a common embedded computer for mobile robots, to allude to the
use of these models in real-life applications.
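The core ensembling step can be sketched minimally: average the per-class probabilities of several model instances and take the argmax (a hypothetical `ensemble_predict` helper; the paper's exact combination rule may differ).

```python
import numpy as np

def ensemble_predict(list_of_logits):
    """Average the softmax outputs of several model instances and
    take the argmax -- a common way to combine an ensemble."""
    def softmax(z):
        e = np.exp(z - z.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)
    mean_probs = np.mean([softmax(l) for l in list_of_logits], axis=0)
    return mean_probs.argmax(axis=-1)
```

Because probabilities are averaged before the argmax, a single overconfident instance cannot flip the ensemble's decision on its own, which is one source of the accuracy gains the abstract reports.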
ConvPoint: Continuous Convolutions for Point Cloud Processing
Point clouds are unstructured and unordered data, as opposed to images. Thus,
most machine learning approaches developed for images cannot be directly
transferred to point clouds. In this paper, we propose a generalization of
discrete convolutional neural networks (CNNs) in order to deal with point
clouds by replacing discrete kernels by continuous ones. This formulation is
simple, allows arbitrary point cloud sizes and can easily be used for designing
neural networks similarly to 2D CNNs. We present experimental results with
various architectures, highlighting the flexibility of the proposed approach.
We obtain competitive results compared to the state-of-the-art on shape
classification, part segmentation and semantic segmentation for large-scale
point clouds. Comment: 12 pages
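The replacement of a discrete kernel with a continuous one can be sketched as follows (a hypothetical `continuous_conv` helper under simplifying assumptions; ConvPoint itself learns the weighting function, here it is passed in):

```python
import numpy as np

def continuous_conv(points, feats, center, weight_fn):
    """One continuous-convolution output: neighbor features are
    combined with weights computed from their relative position to
    the center, instead of from a fixed discrete grid.
    points: (N, 3), feats: (N, F), center: (3,)
    weight_fn: maps a (3,) relative position to a scalar weight
    """
    rel = points - center                       # relative positions
    w = np.array([weight_fn(r) for r in rel])   # continuous kernel
    return (w[:, None] * feats).sum(axis=0) / max(len(points), 1)
```

Since the weight depends only on the continuous relative position, the operator accepts arbitrary point cloud sizes, which is the flexibility the abstract emphasizes.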
A-CNN: Annularly Convolutional Neural Networks on Point Clouds
Analyzing the geometric and semantic properties of 3D point clouds through
the deep networks is still challenging due to the irregularity and sparsity of
samplings of their geometric structures. This paper presents a new method to
define and compute convolution directly on 3D point clouds by the proposed
annular convolution. This new convolution operator can better capture the local
neighborhood geometry of each point by specifying the (regular and dilated)
ring-shaped structures and directions in the computation. It can adapt to the
geometric variability and scalability at the signal processing level. We apply
it to the developed hierarchical neural networks for object classification,
part segmentation, and semantic segmentation in large-scale scenes. The
extensive experiments and comparisons demonstrate that our approach outperforms
the state-of-the-art methods on a variety of standard benchmark datasets (e.g.,
ModelNet10, ModelNet40, ShapeNet-part, S3DIS, and ScanNet). Comment: 17 pages, 14 figures. To appear, Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition (CVPR), June 201
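The ring-shaped neighborhood underlying annular convolution can be sketched in a few lines (a hypothetical `ring_neighbors` helper; the paper's operator also orders and convolves the selected points, which is omitted here):

```python
import numpy as np

def ring_neighbors(points, center, r_inner, r_outer):
    """Select indices of points whose distance to `center` falls
    inside an annulus (ring) -- the neighborhood shape used by
    annular convolutions; a dilated ring just moves r_inner out."""
    d = np.linalg.norm(points - center, axis=1)
    return np.where((d >= r_inner) & (d < r_outer))[0]
```

Varying `r_inner` and `r_outer` yields the regular and dilated rings the abstract mentions, letting the operator adapt to sampling density.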
Modeling Point Clouds with Self-Attention and Gumbel Subset Sampling
Geometric deep learning is increasingly important thanks to the popularity of
3D sensors. Inspired by recent advances in the NLP domain, the self-attention
transformer is introduced to consume the point clouds. We develop Point
Attention Transformers (PATs), using a parameter-efficient Group Shuffle
Attention (GSA) to replace the costly Multi-Head Attention. We demonstrate its
ability to process size-varying inputs, and prove its permutation equivariance.
Besides, prior work uses heuristics dependent on the input data (e.g.,
Furthest Point Sampling) to hierarchically select subsets of input points.
Hence, we propose, for the first time, an end-to-end learnable and
task-agnostic sampling operation, named Gumbel Subset Sampling (GSS), to select
a representative subset of input points. Equipped with Gumbel-Softmax, it
produces a "soft" continuous subset in the training phase, and a "hard" discrete
subset in test phase. By selecting representative subsets in a hierarchical
fashion, the networks learn a stronger representation of the input sets with
lower computation cost. Experiments on classification and segmentation
benchmarks show the effectiveness and efficiency of our methods. Furthermore,
we propose a novel application, processing event camera streams as point
clouds, and achieve state-of-the-art performance on the DVS128 Gesture
Dataset. Comment: CVPR'201
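The soft/hard behavior of Gumbel-Softmax that GSS builds on can be sketched as follows (a minimal NumPy version for a single categorical choice; the paper applies it to select whole point subsets, and in a real network the soft branch is what carries gradients):

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax(logits, tau=0.5, hard=False):
    """Gumbel-Softmax: a 'soft' relaxed sample during training
    (hard=False) and a 'hard' one-hot choice at test time (hard=True)."""
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel noise
    y = np.exp((logits + g) / tau)
    y = y / y.sum()
    if hard:
        one_hot = np.zeros_like(y)
        one_hot[y.argmax()] = 1.0
        return one_hot
    return y
```

Lowering `tau` sharpens the soft sample toward one-hot, so the train-time and test-time behaviors converge, which is why the same learned logits serve both phases.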
Large-Scale 3D Shape Reconstruction and Segmentation from ShapeNet Core55
We introduce a large-scale 3D shape understanding benchmark using data and
annotations from the ShapeNet 3D object database. The benchmark consists of two
tasks: part-level segmentation of 3D shapes and 3D reconstruction from single
view images. Ten teams have participated in the challenge and the best
performing teams have outperformed state-of-the-art approaches on both tasks. A
few novel deep learning architectures have been proposed on various 3D
representations on both tasks. We report the techniques used by each team and
the corresponding performances. In addition, we summarize the major discoveries
from the reported results and possible trends for future work in the field.
Permutation Matters: Anisotropic Convolutional Layer for Learning on Point Clouds
There has been a growing demand for efficient representation learning on
point clouds in many 3D computer vision applications. Behind the success story
of convolutional neural networks (CNNs) is that the data (e.g., images) are
Euclidean structured. However, point clouds are irregular and unordered.
Various point neural networks have been developed with isotropic filters or
using weighting matrices to overcome the structure inconsistency on point
clouds. However, isotropic filters or weighting matrices limit the
representation power. In this paper, we propose a permutable anisotropic
convolutional operation (PAI-Conv) that calculates soft-permutation matrices
for each point using dot-product attention according to a set of evenly
distributed kernel points on a sphere's surface and applies shared anisotropic
filters. In fact, the dot product with kernel points is analogous to the
dot product with keys in the Transformer widely used in natural language
processing (NLP). From this perspective, PAI-Conv can be regarded as a
transformer for point clouds, which is physically meaningful and cooperates
robustly with the efficient random point sampling method. Comprehensive
experiments on point clouds demonstrate that PAI-Conv produces competitive
results in classification and semantic segmentation tasks compared to
state-of-the-art methods.
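The soft-permutation step can be sketched minimally (a hypothetical `soft_permutation` helper under simplifying assumptions; PAI-Conv additionally projects positions and features before the attention):

```python
import numpy as np

def soft_permutation(neighbor_pos, kernel_points, tau=0.1):
    """Soft-permutation matrix in the spirit of PAI-Conv:
    dot-product attention between neighbor positions and fixed
    kernel points; a row-wise softmax softly assigns neighbors
    to kernel-point 'slots' so shared filters can be applied."""
    scores = kernel_points @ neighbor_pos.T / tau   # (K, N) similarities
    e = np.exp(scores - scores.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)         # rows sum to 1
```

Each row is a distribution over neighbors, so the matrix reorders an unordered neighborhood into a canonical, kernel-point-aligned layout without a hard (non-differentiable) permutation.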
Unsupervised Detection of Distinctive Regions on 3D Shapes
This paper presents a novel approach to learn and detect distinctive regions
on 3D shapes. Unlike previous works, which require labeled data, our method is
unsupervised. We conduct the analysis on point sets sampled from 3D shapes,
then formulate and train a deep neural network for an unsupervised shape
clustering task to learn local and global features for distinguishing shapes
with respect to a given shape set. To drive the network to learn in an
unsupervised manner, we design a clustering-based nonparametric softmax
classifier with an iterative re-clustering of shapes, and an adapted
contrastive loss for enhancing the feature embedding quality and stabilizing
the learning process. In this way, we encourage the network to learn the point
distinctiveness on the input shapes. We extensively evaluate various aspects of
our approach and present its applications for distinctiveness-guided shape
retrieval, sampling, and view selection in 3D scenes. Comment: Accepted by ACM TO
BAE-NET: Branched Autoencoder for Shape Co-Segmentation
We treat shape co-segmentation as a representation learning problem and
introduce BAE-NET, a branched autoencoder network, for the task. The
unsupervised BAE-NET is trained with a collection of un-segmented shapes, using
a shape reconstruction loss, without any ground-truth labels. Specifically, the
network takes an input shape and encodes it using a convolutional neural
network, whereas the decoder concatenates the resulting feature code with a
point coordinate and outputs a value indicating whether the point is
inside/outside the shape. Importantly, the decoder is branched: each branch
learns a compact representation for one commonly recurring part of the shape
collection, e.g., airplane wings. By complementing the shape reconstruction
loss with a label loss, BAE-NET is easily tuned for one-shot learning. We show
unsupervised, weakly supervised, and one-shot learning results by BAE-NET,
demonstrating that using only a couple of exemplars, our network can generally
outperform state-of-the-art supervised methods trained on hundreds of segmented
shapes. Code is available at https://github.com/czq142857/BAE-NET. Comment: Accepted to ICCV 2019.
Supplementary material: https://www.sfu.ca/~zhiqinc/imseg/sup.pd
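The branched read-out described above can be sketched in a few lines (a hypothetical `branched_occupancy` helper; in BAE-NET each branch value comes from a decoder evaluated at the query point, which is omitted here):

```python
import numpy as np

def branched_occupancy(branch_values):
    """BAE-NET-style read-out: each branch scores whether a query
    point lies inside one (emergent) part; the shape occupancy is
    the max over branches, and the argmax labels the part."""
    branch_values = np.asarray(branch_values)
    occupancy = float(branch_values.max())
    part = int(branch_values.argmax())
    return occupancy, part
```

Because only the max branch explains each inside point, reconstruction pressure alone drives the branches to specialize on recurring parts, which is how the co-segmentation emerges without labels.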
LSANet: Feature Learning on Point Sets by Local Spatial Aware Layer
Directly learning features from the point cloud has become an active research
direction in 3D understanding. Existing learning-based methods usually
construct local regions from the point cloud and extract the corresponding
features. However, most of these processes do not adequately take the spatial
distribution of the point cloud into account, limiting the ability to perceive
fine-grained patterns. We design a novel Local Spatial Aware (LSA) layer, which
learns to generate Spatial Distribution Weights (SDWs) hierarchically, based
on the spatial relationships in a local region, for spatially independent
operations. This establishes the relationship between these operations and the
spatial distribution, and thus captures the local geometric structure
sensitively. We further propose LSANet, a network based on the LSA layer that
better aggregates the spatial information with the associated features at each
layer. The experiments show that our LSANet can achieve on
par or better performance than the state-of-the-art methods when evaluating on
the challenging benchmark datasets. For example, our LSANet can achieve 93.2%
accuracy on ModelNet40 dataset using only 1024 points, significantly higher
than other methods under the same conditions. The source code is available at
https://github.com/LinZhuoChen/LSANet
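The spatial-weighting idea can be illustrated with a toy stand-in (a hypothetical `spatial_distribution_weights` helper; LSANet learns these weights with a network, here a fixed Gaussian of distance is used instead):

```python
import numpy as np

def spatial_distribution_weights(neighbors, center, sigma=1.0):
    """Toy stand-in for Spatial Distribution Weights: each neighbor
    gets a weight from its spatial relation to the center (here a
    Gaussian of squared distance), normalized with a softmax so the
    local geometry modulates how features are aggregated."""
    d2 = np.sum((neighbors - center) ** 2, axis=1)
    logits = -d2 / (2 * sigma ** 2)
    e = np.exp(logits - logits.max())
    return e / e.sum()
```

Closer neighbors receive larger weights, so aggregation becomes sensitive to the local spatial distribution rather than treating all neighbors uniformly.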