Homogeneous vector bundles and G-equivariant convolutional neural networks
G-equivariant convolutional neural networks (GCNNs) are a geometric deep learning model for data defined on a homogeneous G-space M. GCNNs are designed to respect the global symmetry in M, thereby facilitating learning. In this paper, we analyze GCNNs on homogeneous spaces M = G/K in the case of unimodular Lie groups G and compact subgroups K ≤ G. We demonstrate that homogeneous vector bundles are the natural setting for GCNNs. We also use reproducing kernel Hilbert spaces (RKHS) to obtain a sufficient criterion for expressing G-equivariant layers as convolutional layers. Finally, stronger results are obtained for some groups via a connection between RKHS and bandwidth.
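The core property GCNNs enforce — that convolution commutes with the group action — can be checked directly for a finite group, even though the paper works with continuous unimodular Lie groups. A minimal numpy sketch of group correlation on the cyclic group C4 (the multiplication-table encoding and all names here are illustrative, not the paper's construction):

```python
import numpy as np

def group_conv(f, psi, inv, mul):
    """(f * psi)(g) = sum_h f(h) psi(h^{-1} g) over a finite group,
    with the group given by its multiplication table `mul` and inverse map `inv`."""
    n = len(f)
    out = np.zeros(n)
    for g in range(n):
        for h in range(n):
            out[g] += f[h] * psi[mul[inv[h], g]]
    return out

# Cyclic group C4: elements 0..3, multiplication = addition mod 4.
n = 4
mul = np.array([[(a + b) % n for b in range(n)] for a in range(n)])
inv = np.array([(-a) % n for a in range(n)])

rng = np.random.default_rng(0)
f, psi = rng.normal(size=n), rng.normal(size=n)

# Equivariance check: translating the input by t translates the output by t,
# i.e. group_conv(L_t f) == L_t group_conv(f), where (L_t f)(g) = f(t^{-1} g).
t = 1
shifted = f[(np.arange(n) - t) % n]
assert np.allclose(group_conv(shifted, psi, inv, mul),
                   np.roll(group_conv(f, psi, inv, mul), t))
```

For C4 the group correlation reduces to a circular convolution, which is why `np.roll` expresses the translated output; the same identity is what a G-equivariant layer guarantees in general.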
Disentangling Geometric Deformation Spaces in Generative Latent Shape Models
A complete representation of 3D objects requires characterizing the space of deformations in an interpretable manner, from articulations of a single instance to changes in shape across categories. In this work, we improve on a prior generative model of geometric disentanglement for 3D shapes, wherein the space of object geometry is factorized into rigid orientation, non-rigid pose, and intrinsic shape. The resulting model can be trained from raw 3D shapes, without correspondences, labels, or even rigid alignment, using a combination of classical spectral geometry and probabilistic disentanglement of a structured latent representation space. Our improvements include more sophisticated handling of rotational invariance and the use of a diffeomorphic flow network to bridge latent and spectral space. The geometric structuring of the latent space imparts an interpretable characterization of the deformation space of an object. Furthermore, it enables tasks like pose transfer and pose-aware retrieval without requiring supervision. We evaluate our model on its generative modelling, representation learning, and disentanglement performance, showing improved rotation invariance and intrinsic-extrinsic factorization quality over the prior model.
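The "classical spectral geometry" the abstract relies on can be illustrated with a toy example: eigenvalues of a Laplacian built from pairwise distances are unchanged by rigid motion, which is what makes spectral quantities useful as intrinsic descriptors. A minimal numpy sketch (the Gaussian-affinity graph Laplacian and the `sigma` parameter are illustrative stand-ins, not the paper's mesh operator):

```python
import numpy as np

def laplacian_spectrum(points, sigma=1.0):
    """Sorted eigenvalues of a Gaussian-affinity graph Laplacian.
    Built from pairwise distances only, hence invariant to rigid motion."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma**2))
    np.fill_diagonal(w, 0.0)
    lap = np.diag(w.sum(1)) - w          # combinatorial Laplacian L = D - W
    return np.sort(np.linalg.eigvalsh(lap))

rng = np.random.default_rng(1)
pts = rng.normal(size=(20, 3))

# A random orthogonal transform plus a translation leaves the spectrum unchanged.
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
moved = pts @ q.T + np.array([5.0, -2.0, 0.3])
assert np.allclose(laplacian_spectrum(pts), laplacian_spectrum(moved))
```

Because the spectrum ignores pose and position entirely, it captures only intrinsic shape — the same separation the model's intrinsic-shape factor aims for.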
Advancements in Point Cloud Data Augmentation for Deep Learning: A Survey
Point clouds have a wide range of applications in areas such as autonomous driving, mapping, navigation, scene reconstruction, and medical imaging. Due to their great potential in these applications, point cloud processing has gained considerable attention in the field of computer vision. Among various point cloud processing techniques, deep learning (DL) has become one of the mainstream and effective methods for tasks such as detection, segmentation, and classification. To reduce overfitting when training DL models and to improve model performance, especially when the amount and/or diversity of training data are limited, augmentation is often crucial. Although various point cloud data augmentation methods have been widely used in different point cloud processing tasks, there are currently no published systematic surveys or reviews of these methods. Therefore, this article surveys and discusses these methods and categorizes them into a taxonomy framework. Through a comprehensive evaluation and comparison of the augmentation methods, this article identifies their potentials and limitations and suggests possible future research directions. This work helps researchers gain a holistic understanding of the current status of point cloud data augmentation and promotes its wider application and development.
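Typical per-cloud augmentations covered by such surveys — random rotation, uniform scaling, Gaussian jitter — can be sketched in a few lines of numpy (the function name and default parameters are illustrative choices, not taken from the survey):

```python
import numpy as np

def augment(points, rng, jitter_std=0.01, scale_range=(0.8, 1.25)):
    """Compose three common point cloud augmentations:
    random rotation about the z-axis, uniform scaling, and per-point jitter."""
    theta = rng.uniform(0, 2 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot_z = np.array([[c, -s, 0.0],
                      [s,  c, 0.0],
                      [0.0, 0.0, 1.0]])
    scale = rng.uniform(*scale_range)
    noise = rng.normal(scale=jitter_std, size=points.shape)
    return scale * points @ rot_z.T + noise

rng = np.random.default_rng(42)
cloud = rng.uniform(-1, 1, size=(1024, 3))
aug = augment(cloud, rng)
assert aug.shape == cloud.shape
```

Rotating only about the vertical axis is a common choice for outdoor scans, where gravity fixes the up direction; full SO(3) rotation is the usual alternative for object-level datasets.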
Deep Learning for 3D Information Extraction from Indoor and Outdoor Point Clouds
This thesis focuses on the challenges and opportunities that come with deep learning in the extraction of 3D information from point clouds. To achieve this, 3D information such as point-based or object-based attributes needs to be extracted from highly accurate and information-rich 3D data, which are commonly collected by LiDAR or RGB-D cameras from real-world environments. Driven by the breakthroughs brought by deep learning techniques and the accessibility of reliable 3D datasets, 3D deep learning frameworks have been investigated with a string of empirical successes. However, two main challenges lead to the complexity of deep-learning-based per-point labeling and object detection in real scenes. First, the variation of sensing conditions and unconstrained environments results in unevenly distributed point clouds with various geometric patterns and incomplete shapes. Second, the irregular data format and the requirements for both accurate and efficient algorithms pose problems for deep learning models.
To deal with the above two challenges, this doctoral dissertation mainly considers the following four features when constructing 3D deep models for point-based or object-based information extraction: (1) the exploration of geometric correlations between local points when defining convolution kernels, (2) hierarchical local and global feature learning within an end-to-end trainable framework, (3) relation feature learning from nearby objects, and (4) leveraging 2D images for 3D object detection from point clouds. Correspondingly, this doctoral thesis proposes a set of deep learning frameworks for 3D information extraction, specifically scene segmentation and object detection from indoor and outdoor point clouds.
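Feature (1) — encoding geometric correlations between local points inside the convolution kernel — is commonly realized by building per-edge features from the relative positions of k-nearest neighbors, in the style of EdgeConv layers. A simplified numpy sketch (a single linear map stands in for the shared MLP; shapes and names are illustrative, not the thesis's architecture):

```python
import numpy as np

def edge_conv(points, feats, weight, k=8):
    """EdgeConv-style layer: per-edge features [f_i, f_j - f_i] over the
    k-nearest neighbors of each point, a shared linear map + ReLU,
    then max pooling over each neighborhood."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    knn = np.argsort(d2, axis=1)[:, 1:k + 1]           # skip self at index 0
    f_i = np.repeat(feats[:, None, :], k, axis=1)      # (N, k, C)
    f_j = feats[knn]                                   # (N, k, C)
    edge = np.concatenate([f_i, f_j - f_i], axis=-1)   # (N, k, 2C)
    h = np.maximum(edge @ weight, 0.0)                 # shared linear + ReLU
    return h.max(axis=1)                               # (N, C_out)

rng = np.random.default_rng(7)
pts = rng.normal(size=(64, 3))
w = rng.normal(size=(6, 16))           # 2*3 input channels -> 16 output channels
out = edge_conv(pts, pts, w)           # coordinates double as input features
assert out.shape == (64, 16)
```

The relative term `f_j - f_i` is what injects local geometric structure into the kernel, while max pooling keeps the layer invariant to the ordering of points within each neighborhood.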
Firstly, an end-to-end geometric graph convolution architecture operating on the graph representation of a point cloud is proposed for semantic scene segmentation. Secondly, a 3D proposal-based object detection framework is constructed to extract the geometric information of objects and relation features among proposals for bounding box reasoning. Thirdly, a 2D-driven approach is proposed to detect 3D objects from point clouds in indoor and outdoor scenes; both semantic features from 2D images and context information in 3D space are explicitly exploited to enhance 3D detection performance. Qualitative and quantitative experiments against existing state-of-the-art models on indoor and outdoor datasets demonstrate the effectiveness of the proposed frameworks. A list of remaining challenges and future research directions that would help advance the development of deep learning approaches for the extraction of 3D information from point clouds is presented at the end of this thesis.