14 research outputs found
Regularized Evolutionary Algorithm for Dynamic Neural Topology Search
Designing neural networks for object recognition requires considerable
architecture engineering. As a remedy, neuro-evolutionary network architecture
search, which automatically searches for optimal network architectures using
evolutionary algorithms, has recently become very popular. Although very
effective, evolutionary algorithms rely heavily on having a large population of
individuals (i.e., network architectures) and are therefore memory-expensive. In
this work, we propose a Regularized Evolutionary Algorithm with low memory
footprint to evolve a dynamic image classifier. In detail, we introduce novel
custom operators that regularize the evolutionary process of a micro-population
of 10 individuals. We conduct experiments on three different digit datasets
(MNIST, USPS, SVHN) and show that our evolutionary method obtains competitive
results with the current state of the art.
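To make the memory argument concrete, the following is a minimal sketch of an evolutionary loop over a fixed micro-population of 10 individuals. It is not the paper's algorithm: the custom regularizing operators are not reproduced, and the genome encoding (`WIDTHS`), `toy_init`, `toy_mutate` and `toy_fitness` are hypothetical stand-ins. In a real search, the fitness would be the validation accuracy of a trained network.

```python
import random

WIDTHS = [16, 32, 64, 128]  # hypothetical search space of layer widths


def evolve(fitness, mutate, init, pop_size=10, generations=200, seed=0):
    """Evolve a fixed-size population; memory never exceeds pop_size individuals."""
    rng = random.Random(seed)
    population = [init(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Tournament selection: mutate the fittest of a few random contenders.
        contenders = rng.sample(population, min(3, pop_size))
        parent = max(contenders, key=fitness)
        child = mutate(parent, rng)
        # The child replaces the current worst individual if it is at least as fit.
        worst = min(range(pop_size), key=lambda i: fitness(population[i]))
        if fitness(child) >= fitness(population[worst]):
            population[worst] = child
    return max(population, key=fitness), population


def toy_init(rng):
    # A genome is a list of three layer widths (assumed encoding).
    return [rng.choice(WIDTHS) for _ in range(3)]


def toy_mutate(genome, rng):
    # Change one randomly chosen layer width; the parent is left untouched.
    child = list(genome)
    child[rng.randrange(len(child))] = rng.choice(WIDTHS)
    return child


def toy_fitness(genome):
    # Placeholder objective: prefer widths close to 64.
    return -sum(abs(width - 64) for width in genome)


best, final_population = evolve(toy_fitness, toy_mutate, toy_init)
print(best, toy_fitness(best))
```

Replacing only the worst individual keeps the population size constant, which is the property that bounds memory in this sketch.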
Walking Your LiDOG: A Journey Through Multiple Domains for LiDAR Semantic Segmentation
The ability to deploy robots that can operate safely in diverse environments
is crucial for developing embodied intelligent agents. As a community, we have
made tremendous progress in within-domain LiDAR semantic segmentation. However,
do these methods generalize across domains? To answer this question, we design
the first experimental setup for studying domain generalization (DG) for LiDAR
semantic segmentation (DG-LSS). Our results confirm a significant gap between
methods, evaluated in a cross-domain setting: for example, a model trained on
the source dataset (SemanticKITTI) obtains mIoU on the target data,
compared to mIoU obtained by the model trained on the target domain
(nuScenes). To tackle this gap, we propose the first method specifically
designed for DG-LSS, which obtains mIoU on the target domain,
outperforming all baselines. Our method augments a sparse-convolutional
encoder-decoder 3D segmentation network with an additional, dense 2D
convolutional decoder that learns to classify a birds-eye view of the point
cloud. This simple auxiliary task encourages the 3D network to learn features
that are robust to sensor placement shifts and resolution, and are transferable
across domains. With this work, we aim to inspire the community to develop and
evaluate future models in such cross-domain conditions.
Comment: Accepted at ICCV 202
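For readers unfamiliar with the metric quoted in this abstract, here is a minimal sketch of mean intersection-over-union (mIoU) on per-point labels. The function name `mean_iou`, the `ignore_index` parameter and the toy labels are illustrative assumptions, not part of the paper's evaluation code. Predictions are assumed to lie in `range(num_classes)`.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabelled points do not contribute
        if p == t:
            intersection[t] += 1  # true positive for class t
            union[t] += 1
        else:
            union[t] += 1  # false negative for the true class
            union[p] += 1  # false positive for the predicted class
    ious = [intersection[c] / union[c] for c in range(num_classes) if union[c] > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and five points.
pred = [0, 0, 1, 1, 2]
target = [0, 1, 1, 1, 2]
score = mean_iou(pred, target, num_classes=3)
print(round(score, 4))
```

Classes that appear in neither the predictions nor the ground truth are skipped, so they do not drag the average down.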
Compositional Semantic Mix for Domain Adaptation in Point Cloud Segmentation
Deep-learning models for 3D point cloud semantic segmentation exhibit limited
generalization capabilities when trained and tested on data captured with
different sensors or in varying environments due to domain shift. Domain
adaptation methods can be employed to mitigate this domain shift, for instance,
by simulating sensor noise, developing domain-agnostic generators, or training
point cloud completion networks. Often, these methods are tailored for range
view maps or necessitate multi-modal input. In contrast, domain adaptation in
the image domain can be executed through sample mixing, which emphasizes input
data manipulation rather than employing distinct adaptation modules. In this
study, we introduce compositional semantic mixing for point cloud domain
adaptation, representing the first unsupervised domain adaptation technique for
point cloud segmentation based on semantic and geometric sample mixing. We
present a two-branch symmetric network architecture capable of concurrently
processing point clouds from a source domain (e.g. synthetic) and point clouds
from a target domain (e.g. real-world). Each branch operates within one domain
by integrating selected data fragments from the other domain and utilizing
semantic information derived from source labels and target (pseudo) labels.
Additionally, our method can leverage a limited number of human point-level
annotations (semi-supervised) to further enhance performance. We assess our
approach in both synthetic-to-real and real-to-real scenarios using LiDAR
datasets and demonstrate that it significantly outperforms state-of-the-art
methods in both unsupervised and semi-supervised settings.
Comment: TPAMI. arXiv admin note: text overlap with arXiv:2207.0977
Novel Class Discovery for 3D Point Cloud Semantic Segmentation
Novel class discovery (NCD) for semantic segmentation is the task of learning a model that can segment unlabelled (novel) classes using only the supervision from labelled (base) classes. This problem has recently been pioneered for 2D image data, but no work exists for 3D point cloud data. In fact, the assumptions made for 2D are loosely applicable to 3D in this case. This paper is presented to advance the state of the art on point cloud data analysis in four directions. Firstly, we address the new problem of NCD for point cloud semantic segmentation. Secondly, we show that the transposition of the only existing NCD method for 2D semantic segmentation to 3D data is sub-optimal. Thirdly, we present a new method for NCD based on online clustering that exploits uncertainty quantification to produce prototypes for pseudo-labelling the points of the novel classes. Lastly, we introduce a new evaluation protocol to assess the performance of NCD for point cloud semantic segmentation. We thoroughly evaluate our method on SemanticKITTI and SemanticPOSS datasets, showing that it can significantly outperform the baseline. Project page: https://github.com/LuigiRiz/NOPS
SF-UDA-3D: Source-Free Unsupervised Domain Adaptation for LiDAR-Based 3D Object Detection
3D object detectors based only on LiDAR point clouds hold the state of the art on modern street-view benchmarks. However, LiDAR-based detectors generalize poorly across domains due to domain shift. In the case of LiDAR, domain shift is not only due to changes in the environment and in object appearance, as for visual data from RGB cameras, but is also related to the geometry of the point clouds (e.g., point density variations). This paper proposes SF-UDA-3D, the first Source-Free Unsupervised Domain Adaptation (SF-UDA) framework to domain-adapt the state-of-the-art PointRCNN 3D detector to target domains for which we have no annotations (unsupervised), while holding neither images nor annotations of the source domain (source-free). SF-UDA-3D is novel in both aspects. Our approach is based on pseudo-annotations, reversible scale-transformations and motion coherency. SF-UDA-3D outperforms both previous domain adaptation techniques based on feature alignment and state-of-the-art 3D object detection methods that additionally use few-shot target annotations or target annotation statistics. This is demonstrated by extensive experiments on two large-scale datasets, i.e., KITTI and nuScenes.
CoSMix: Compositional Semantic Mix for Domain Adaptation in 3D LiDAR Segmentation
3D LiDAR semantic segmentation is fundamental for autonomous driving. Several Unsupervised Domain Adaptation (UDA) methods for point cloud data have been recently proposed to improve model generalization for different sensors and environments. Researchers working on UDA problems in the image domain have shown that sample mixing can mitigate domain shift. We propose a new approach of sample mixing for point cloud UDA, namely Compositional Semantic Mix (CoSMix), the first UDA approach for point cloud segmentation based on sample mixing. CoSMix consists of a two-branch symmetric network that can process labelled synthetic data (source) and real-world unlabelled point clouds (target) concurrently. Each branch operates on one domain by mixing selected pieces of data from the other one, and by using the semantic information derived from source labels and target pseudo-labels. We evaluate CoSMix on two large-scale datasets, showing that it outperforms state-of-the-art methods by a large margin. Our code is available at https://github.com/saltoricristiano/cosmix-uda
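The mixing step described in this abstract can be illustrated with a small sketch: points of selected classes are taken from one domain and concatenated with the full point cloud of the other domain, in both directions. This is a simplified illustration, not the released CoSMix code: the helper names `select_by_class` and `mix`, the choice of classes, and the toy data are assumptions, and steps such as patch augmentation or pseudo-label confidence filtering are omitted.

```python
def select_by_class(points, labels, classes):
    """Return the points (and their labels) whose label is in `classes`."""
    keep = [i for i, label in enumerate(labels) if label in classes]
    return [points[i] for i in keep], [labels[i] for i in keep]


def mix(host_points, host_labels, donor_points, donor_labels, donor_classes):
    """Paste the donor's selected semantic patches into the host point cloud."""
    patch_points, patch_labels = select_by_class(
        donor_points, donor_labels, set(donor_classes)
    )
    # Concatenation builds new lists, so the inputs are not modified.
    return host_points + patch_points, host_labels + patch_labels


# Toy data: source labels are ground truth, target labels are pseudo-labels.
source_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
source_labels = [0, 1, 1]
target_points = [(5.0, 5.0, 0.0), (6.0, 5.0, 0.0)]
target_pseudo = [2, 0]

# Target branch: target scene plus source patches of class 1.
mixed_t_points, mixed_t_labels = mix(
    target_points, target_pseudo, source_points, source_labels, [1]
)
# Source branch: source scene plus target patches of (pseudo-)class 2.
mixed_s_points, mixed_s_labels = mix(
    source_points, source_labels, target_points, target_pseudo, [2]
)
print(mixed_t_labels, mixed_s_labels)
```

Each mixed cloud would then be fed to its own branch, supervised by the concatenated labels.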
Overlap-guided Gaussian Mixture Models for Point Cloud Registration
Probabilistic 3D point cloud registration methods have shown competitive performance in overcoming noise, outliers, and density variations. However, registering point cloud pairs in the case of partial overlap is still a challenge. This paper proposes a novel overlap-guided probabilistic registration approach that computes the optimal transformation from matched Gaussian Mixture Model (GMM) parameters. We reformulate the registration problem as the problem of aligning two Gaussian mixtures such that a statistical discrepancy measure between the two corresponding mixtures is minimized. We introduce a Transformer-based detection module to detect overlapping regions, and represent the input point clouds using GMMs by guiding their alignment through overlap scores computed by this detection module. Experiments show that our method achieves higher registration accuracy and efficiency than state-of-the-art methods when handling point clouds with partial overlap and different densities on synthetic and real-world datasets. https://github.com/gfmei/ogm
Data Augmentation-free Unsupervised Learning for 3D Point Cloud Understanding
Unsupervised learning on 3D point clouds has undergone a rapid evolution, especially thanks to data augmentation-based contrastive methods. However, data augmentation is not ideal as it requires a careful selection of the type of augmentations to perform, which in turn can affect the geometric and semantic information learned by the network during self-training. To overcome this issue, we propose an augmentation-free unsupervised approach for point clouds to learn transferable point-level features via soft clustering, named SoftClu.
SoftClu assumes that the points belonging to a cluster should be close to each other in both geometric and feature spaces. This differs from typical contrastive learning, which builds similar representations for a whole point cloud and its augmented versions. We exploit the affiliation of points to their clusters as a proxy to enable self-training through a pseudo-label prediction task. Under the constraint that these pseudo-labels induce the equipartition of the point cloud, we cast SoftClu as an optimal transport problem. We formulate an unsupervised loss to minimize the standard cross-entropy between pseudo-labels and predicted labels. Experiments on downstream applications, such as 3D object classification, part segmentation, and semantic segmentation, show the effectiveness of our framework in outperforming state-of-the-art techniques.
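The equipartition constraint mentioned above can be illustrated with Sinkhorn-Knopp normalisation, a common solver for entropy-regularised optimal transport. Whether SoftClu uses exactly this solver is an assumption of this sketch; the function name `sinkhorn_assign`, the temperature `epsilon` and the toy scores are hypothetical. Scores are assumed to be bounded (e.g. similarities in [-1, 1]) so that the exponentials do not underflow.

```python
import math


def sinkhorn_assign(scores, epsilon=0.1, iterations=50):
    """Turn point-to-cluster scores into soft pseudo-labels with balanced clusters."""
    n, k = len(scores), len(scores[0])
    top = max(max(row) for row in scores)  # subtracted for numerical stability
    q = [[math.exp((s - top) / epsilon) for s in row] for row in scores]
    for _ in range(iterations):
        # Column step: every cluster receives total mass 1/k (equipartition).
        col = [sum(q[i][j] for i in range(n)) for j in range(k)]
        q = [[q[i][j] / (col[j] * k) for j in range(k)] for i in range(n)]
        # Row step: every point carries total mass 1/n.
        row = [sum(r) for r in q]
        q = [[v / (row[i] * n) for v in q[i]] for i in range(n)]
    # Rescale so that each row is a probability distribution over clusters.
    return [[v * n for v in r] for r in q]


# Four points, two clusters; every point scores higher for cluster 0.
scores = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]
soft = sinkhorn_assign(scores)
hard = [max(range(len(r)), key=lambda j: r[j]) for r in soft]
print(hard)
```

A plain argmax over `scores` would put all four points in cluster 0. The balanced assignment instead splits them evenly, which is the behaviour the equipartition constraint is meant to enforce.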