Homogeneous vector bundles and G-equivariant convolutional neural networks
G-equivariant convolutional neural networks (GCNNs) are a geometric deep learning model for data defined on a homogeneous G-space M. GCNNs are designed to respect the global symmetry in M, thereby facilitating learning. In this paper, we analyze GCNNs on homogeneous spaces M = G/K in the case of unimodular Lie groups G and compact subgroups K ≤ G. We demonstrate that homogeneous vector bundles are the natural setting for GCNNs. We also use reproducing kernel Hilbert spaces (RKHS) to obtain a sufficient criterion for expressing G-equivariant layers as convolutional layers. Finally, stronger results are obtained for some groups via a connection between RKHS and bandwidth.
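The core property GCNNs enforce — that convolution commutes with the group action — can be checked directly for a finite group, even though the paper works with continuous unimodular Lie groups. A minimal numpy sketch of group correlation on the cyclic group C4 (the multiplication-table encoding and all names here are illustrative, not the paper's construction):

```python
import numpy as np

def group_conv(f, psi, inv, mul):
    """(f * psi)(g) = sum_h f(h) psi(h^{-1} g) over a finite group,
    with the group given by its multiplication table `mul` and inverse map `inv`."""
    n = len(f)
    out = np.zeros(n)
    for g in range(n):
        for h in range(n):
            out[g] += f[h] * psi[mul[inv[h], g]]
    return out

# Cyclic group C4: elements 0..3, multiplication = addition mod 4.
n = 4
mul = np.array([[(a + b) % n for b in range(n)] for a in range(n)])
inv = np.array([(-a) % n for a in range(n)])

rng = np.random.default_rng(0)
f, psi = rng.normal(size=n), rng.normal(size=n)

# Equivariance check: translating the input by t translates the output by t,
# i.e. group_conv(L_t f) == L_t group_conv(f), where (L_t f)(g) = f(t^{-1} g).
t = 1
shifted = f[(np.arange(n) - t) % n]
assert np.allclose(group_conv(shifted, psi, inv, mul),
                   np.roll(group_conv(f, psi, inv, mul), t))
```

For C4 the group correlation reduces to a circular convolution, which is why `np.roll` expresses the translated output; the same identity is what a G-equivariant layer guarantees in general.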
Disentangling Geometric Deformation Spaces in Generative Latent Shape Models
A complete representation of 3D objects requires characterizing the space of deformations in an interpretable manner, from articulations of a single instance to changes in shape across categories. In this work, we improve on a prior generative model of geometric disentanglement for 3D shapes, wherein the space of object geometry is factorized into rigid orientation, non-rigid pose, and intrinsic shape. The resulting model can be trained from raw 3D shapes, without correspondences, labels, or even rigid alignment, using a combination of classical spectral geometry and probabilistic disentanglement of a structured latent representation space. Our improvements include more sophisticated handling of rotational invariance and the use of a diffeomorphic flow network to bridge latent and spectral space. The geometric structuring of the latent space imparts an interpretable characterization of the deformation space of an object. Furthermore, it enables tasks like pose transfer and pose-aware retrieval without requiring supervision. We evaluate our model on its generative modelling, representation learning, and disentanglement performance, showing improved rotation invariance and intrinsic-extrinsic factorization quality over the prior model.
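The "classical spectral geometry" the abstract relies on can be illustrated with a toy example: eigenvalues of a Laplacian built from pairwise distances are unchanged by rigid motion, which is what makes spectral quantities useful as intrinsic descriptors. A minimal numpy sketch (the Gaussian-affinity graph Laplacian and the `sigma` parameter are illustrative stand-ins, not the paper's mesh operator):

```python
import numpy as np

def laplacian_spectrum(points, sigma=1.0):
    """Sorted eigenvalues of a Gaussian-affinity graph Laplacian.
    Built from pairwise distances only, hence invariant to rigid motion."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma**2))
    np.fill_diagonal(w, 0.0)
    lap = np.diag(w.sum(1)) - w          # combinatorial Laplacian L = D - W
    return np.sort(np.linalg.eigvalsh(lap))

rng = np.random.default_rng(1)
pts = rng.normal(size=(20, 3))

# A random orthogonal transform plus a translation leaves the spectrum unchanged.
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
moved = pts @ q.T + np.array([5.0, -2.0, 0.3])
assert np.allclose(laplacian_spectrum(pts), laplacian_spectrum(moved))
```

Because the spectrum ignores pose and position entirely, it captures only intrinsic shape — the same separation the model's intrinsic-shape factor aims for.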
Advancements in Point Cloud Data Augmentation for Deep Learning: A Survey
Point clouds have a wide range of applications in areas such as autonomous driving, mapping, navigation, scene reconstruction, and medical imaging. Due to their great potential in these applications, point cloud processing has gained considerable attention in the field of computer vision. Among various point cloud processing techniques, deep learning (DL) has become one of the mainstream and effective methods for tasks such as detection, segmentation, and classification. To reduce overfitting when training DL models and to improve model performance, especially when the amount and/or diversity of training data are limited, augmentation is often crucial. Although various point cloud data augmentation methods have been widely used in different point cloud processing tasks, there are currently no published systematic surveys or reviews of these methods. Therefore, this article surveys and discusses these methods and categorizes them into a taxonomy framework. Through a comprehensive evaluation and comparison of the augmentation methods, this article identifies their potentials and limitations and suggests possible future research directions. This work helps researchers gain a holistic understanding of the current status of point cloud data augmentation and promotes its wider application and development.
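Typical per-cloud augmentations covered by such surveys — random rotation, uniform scaling, Gaussian jitter — can be sketched in a few lines of numpy (the function name and default parameters are illustrative choices, not taken from the survey):

```python
import numpy as np

def augment(points, rng, jitter_std=0.01, scale_range=(0.8, 1.25)):
    """Compose three common point cloud augmentations:
    random rotation about the z-axis, uniform scaling, and per-point jitter."""
    theta = rng.uniform(0, 2 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot_z = np.array([[c, -s, 0.0],
                      [s,  c, 0.0],
                      [0.0, 0.0, 1.0]])
    scale = rng.uniform(*scale_range)
    noise = rng.normal(scale=jitter_std, size=points.shape)
    return scale * points @ rot_z.T + noise

rng = np.random.default_rng(42)
cloud = rng.uniform(-1, 1, size=(1024, 3))
aug = augment(cloud, rng)
assert aug.shape == cloud.shape
```

Rotating only about the vertical axis is a common choice for outdoor scans, where gravity fixes the up direction; full SO(3) rotation is the usual alternative for object-level datasets.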
Deep Learning for 3D Information Extraction from Indoor and Outdoor Point Clouds
This thesis focuses on the challenges and opportunities that come with deep learning in the extraction of 3D information from point clouds. To achieve this, 3D information such as point-based or object-based attributes needs to be extracted from highly accurate and information-rich 3D data, which are commonly collected by LiDAR or RGB-D cameras from real-world environments. Driven by the breakthroughs brought by deep learning techniques and the accessibility of reliable 3D datasets, 3D deep learning frameworks have been investigated with a string of empirical successes. However, two main challenges lead to the complexity of deep-learning-based per-point labeling and object detection in real scenes. First, the variation of sensing conditions and unconstrained environments results in unevenly distributed point clouds with various geometric patterns and incomplete shapes. Second, the irregular data format and the requirements for both accurate and efficient algorithms pose problems for deep learning models.
To deal with the above two challenges, this doctoral dissertation mainly considers the following four features when constructing 3D deep models for point-based or object-based information extraction: (1) the exploration of geometric correlations between local points when defining convolution kernels, (2) hierarchical local and global feature learning within an end-to-end trainable framework, (3) relation feature learning from nearby objects, and (4) leveraging 2D images for 3D object detection from point clouds. Correspondingly, this doctoral thesis proposes a set of deep learning frameworks for 3D information extraction, specifically scene segmentation and object detection from indoor and outdoor point clouds.
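Feature (1) — encoding geometric correlations between local points inside the convolution kernel — is commonly realized by building per-edge features from the relative positions of k-nearest neighbors, in the style of EdgeConv layers. A simplified numpy sketch (a single linear map stands in for the shared MLP; shapes and names are illustrative, not the thesis's architecture):

```python
import numpy as np

def edge_conv(points, feats, weight, k=8):
    """EdgeConv-style layer: per-edge features [f_i, f_j - f_i] over the
    k-nearest neighbors of each point, a shared linear map + ReLU,
    then max pooling over each neighborhood."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    knn = np.argsort(d2, axis=1)[:, 1:k + 1]           # skip self at index 0
    f_i = np.repeat(feats[:, None, :], k, axis=1)      # (N, k, C)
    f_j = feats[knn]                                   # (N, k, C)
    edge = np.concatenate([f_i, f_j - f_i], axis=-1)   # (N, k, 2C)
    h = np.maximum(edge @ weight, 0.0)                 # shared linear + ReLU
    return h.max(axis=1)                               # (N, C_out)

rng = np.random.default_rng(7)
pts = rng.normal(size=(64, 3))
w = rng.normal(size=(6, 16))           # 2*3 input channels -> 16 output channels
out = edge_conv(pts, pts, w)           # coordinates double as input features
assert out.shape == (64, 16)
```

The relative term `f_j - f_i` is what injects local geometric structure into the kernel, while max pooling keeps the layer invariant to the ordering of points within each neighborhood.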
Firstly, an end-to-end geometric graph convolution architecture operating on the graph representation of a point cloud is proposed for semantic scene segmentation. Secondly, a 3D proposal-based object detection framework is constructed to extract the geometric information of objects and relation features among proposals for bounding box reasoning. Thirdly, a 2D-driven approach is proposed to detect 3D objects from point clouds in indoor and outdoor scenes; both semantic features from 2D images and context information in 3D space are explicitly exploited to enhance 3D detection performance. Qualitative and quantitative experiments against existing state-of-the-art models on indoor and outdoor datasets demonstrate the effectiveness of the proposed frameworks. A list of remaining challenges and future research directions that would help advance the development of deep learning approaches for the extraction of 3D information from point clouds is presented at the end of this thesis.