Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution

Author: A Geiger
D Stamoulis
H Cai
K Wang
PS Wang
R Pagh
Y Wang
Y Yan
Publication date: 13/08/2020

Self-driving cars need to understand 3D scenes efficiently and accurately in order to drive safely. Given the limited hardware resources, existing 3D perception models are not able to recognize small instances (e.g., pedestrians, cyclists) very well due to the low-resolution voxelization and aggressive downsampling. To this end, we propose Sparse Point-Voxel Convolution (SPVConv), a lightweight 3D module that equips the vanilla Sparse Convolution with the high-resolution point-based branch. With negligible overhead, this point-based branch is able to preserve the fine details even from large outdoor scenes. To explore the spectrum of efficient 3D models, we first define a flexible architecture design space based on SPVConv, and we then present 3D Neural Architecture Search (3D-NAS) to search the optimal network architecture over this diverse design space efficiently and effectively. Experimental results validate that the resulting SPVNAS model is fast and accurate: it outperforms the state-of-the-art MinkowskiNet by 3.3%, ranking 1st on the competitive SemanticKITTI leaderboard. It also achieves 8x computation reduction and 3x measured speedup over MinkowskiNet with higher accuracy. Finally, we transfer our method to 3D object detection, and it achieves consistent improvements over the one-stage detection baseline on KITTI. Comment: ECCV 2020. The first two authors contributed equally to this work. Project page: http://spvnas.mit.edu

arXiv.org e-Print Archive

Crossref

Improving 3D Semantic Segmentation with Twin-Representation Networks

Author: Duerr Fabian
Publication venue: KIT Scientific Publishing
Publication date: 06/07/2021

The growing importance of 3D scene understanding and interpretation is inherently connected to the rise of autonomous driving and robotics. Semantic segmentation of 3D point clouds is a key enabler for this task, providing geometric information enhanced with semantics. To use Convolutional Neural Networks, a proper representation of the point clouds must be chosen. Various representations have been proposed, with different advantages and disadvantages. In this work, we present a twin-representation architecture, which is composed of a 3D point-based and a 2D range image branch, to efficiently extract and refine point-wise features, supported by strong context information. Additionally, a feature propagation strategy is proposed to connect both branches. The approach is evaluated on the challenging SemanticKITTI dataset [2] and considerably outperforms the baseline overall as well as for every individual class. Especially the predictions for distant points are significantly improved.

KITopen

Panoster: End-to-end Panoptic Segmentation of LiDAR Point Clouds

Author: Gasperini Stefano
Mahani Mohammad-Ali Nikouei
Marcos-Ramiro Alvaro
Navab Nassir
Tombari Federico
Publication date: 12/02/2021

Panoptic segmentation has recently unified semantic and instance segmentation, previously addressed separately, thus taking a step further towards creating more comprehensive and efficient perception systems. In this paper, we present Panoster, a novel proposal-free panoptic segmentation method for LiDAR point clouds. Unlike previous approaches relying on several steps to group pixels or points into objects, Panoster proposes a simplified framework incorporating a learning-based clustering solution to identify instances. At inference time, this acts as a class-agnostic segmentation, allowing Panoster to be fast, while outperforming prior methods in terms of accuracy. Without any post-processing, Panoster reached state-of-the-art results among published approaches on the challenging SemanticKITTI benchmark, and further increased its lead by exploiting heuristic techniques. Additionally, we showcase how our method can be flexibly and effectively applied on diverse existing semantic architectures to deliver panoptic predictions. Comment: Preprint of IEEE RA-L article

arXiv.org e-Print Archive

COARSE3D: Class-Prototypes for Contrastive Learning in Weakly-Supervised 3D Point Cloud Segmentation

Author: Cao Anh-Quan
de Charette Raoul
Li Rong
Publication date: 07/10/2022

Annotation of large-scale 3D data is notoriously cumbersome and costly. As an alternative, weakly-supervised learning alleviates such a need by reducing the annotation by several orders of magnitude. We propose COARSE3D, a novel architecture-agnostic contrastive learning strategy for 3D segmentation. Since contrastive learning requires rich and diverse examples as keys and anchors, we leverage a prototype memory bank capturing class-wise global dataset information efficiently into a small number of prototypes acting as keys. An entropy-driven sampling technique then allows us to select good pixels from predictions as anchors. Experiments on three projection-based backbones show we outperform baselines on three challenging real-world outdoor datasets, working with as little as 0.001% of the annotations.

arXiv.org e-Print Archive

Two Heads are Better than One: Geometric-Latent Attention for Point Cloud Classification and Segmentation

Author: Cuevas-Velasquez Hanz
Fisher Robert B
Gallego Antonio Javier
Publication date: 30/10/2021

We present an innovative two-headed attention layer that combines geometric and latent features to segment a 3D scene into semantically meaningful subsets. Each head combines local and global information, using either the geometric or latent features, of a neighborhood of points and uses this information to learn better local relationships. This Geometric-Latent attention layer (Ge-Latto) is combined with a sub-sampling strategy to capture global features. Our method is invariant to permutation thanks to the use of shared-MLP layers, and it can also be used with point clouds with varying densities because the local attention layer does not depend on the neighbor order. Our proposal is simple yet robust, which allows it to achieve competitive results on the ShapeNetPart and ModelNet40 datasets, and the state-of-the-art when segmenting the complex dataset S3DIS, with 69.2% IoU on Area 5, and 89.7% overall accuracy using K-fold cross-validation on the 6 areas. Comment: Accepted in BMVC 202

arXiv.org e-Print Archive

Edinburgh Research Explorer

Sparse Single Sweep LiDAR Point Cloud Segmentation via Learning Contextual Shape Priors from Scene Completion

Author: Cui Shuguang
Gao Jiantao
Huang Rui
Li Jie
Li Zhen
Yan Xu
Zhang Ruimao
Publication date: 07/12/2020

LiDAR point cloud analysis is a core task for 3D computer vision, especially for autonomous driving. However, due to the severe sparsity and noise interference in the single sweep LiDAR point cloud, accurate semantic segmentation is non-trivial to achieve. In this paper, we propose a novel sparse LiDAR point cloud semantic segmentation framework assisted by learned contextual shape priors. In practice, an initial semantic segmentation (SS) of a single sweep point cloud can be achieved by any appealing network and then flows into the semantic scene completion (SSC) module as the input. By merging multiple frames in the LiDAR sequence as supervision, the optimized SSC module learns the contextual shape priors from sequential LiDAR data, completing the sparse single sweep point cloud to a dense one. Thus, it inherently improves SS optimization through fully end-to-end training. Besides, a Point-Voxel Interaction (PVI) module is proposed to further enhance the knowledge fusion between SS and SSC tasks, i.e., promoting the interaction of incomplete local geometry of point cloud and complete voxel-wise global structure. Furthermore, the auxiliary SSC and PVI modules can be discarded during inference without extra burden for SS. Extensive experiments confirm that our JS3C-Net achieves superior performance on both SemanticKITTI and SemanticPOSS benchmarks, i.e., 4% and 3% improvement, respectively. Comment: To appear in AAAI 2021. Codes are available at https://github.com/yanx27/JS3C-Ne

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

COARSE3D: Class-Prototypes for Contrastive Learning in Weakly-Supervised 3D Point Cloud Segmentation

Author: Cao Anh-Quan
de Charette Raoul
Li Rong
Publication venue: HAL CCSD
Publication date: 21/11/2022

Annotation of large-scale 3D data is notoriously cumbersome and costly. As an alternative, weakly-supervised learning alleviates such a need by reducing the annotation by several orders of magnitude. We propose COARSE3D, a novel architecture-agnostic contrastive learning strategy for 3D segmentation. Since contrastive learning requires rich and diverse examples as keys and anchors, we leverage a prototype memory bank capturing class-wise global dataset information efficiently into a small number of prototypes acting as keys. An entropy-driven sampling technique then allows us to select good pixels from predictions as anchors. Experiments on three projection-based backbones show we outperform baselines on three challenging real-world outdoor datasets, working with as little as 0.001% of the annotations.

INRIA a CCSD electronic archive server