Regularized Evolutionary Algorithm for Dynamic Neural Topology Search
Designing neural networks for object recognition requires considerable
architecture engineering. As a remedy, neuro-evolutionary network architecture
search, which automatically searches for optimal network architectures using
evolutionary algorithms, has recently become very popular. Although very
effective, evolutionary algorithms rely heavily on a large population of
individuals (i.e., network architectures) and are therefore memory-expensive. In
this work, we propose a Regularized Evolutionary Algorithm with a low memory
footprint to evolve a dynamic image classifier. In detail, we introduce novel
custom operators that regularize the evolutionary process of a micro-population
of 10 individuals. We conduct experiments on three digit datasets
(MNIST, USPS, SVHN) and show that our evolutionary method obtains results
competitive with the current state of the art.
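The paper's custom operators are not spelled out in the abstract, but the core loop it builds on, regularized (aging) evolution with tournament selection, can be sketched as follows. This is a generic illustration on a toy bit-string search space, not the paper's actual operators; all function names here are hypothetical.

```python
import random
from collections import deque

def regularized_evolution(evaluate, mutate, random_arch,
                          population_size=10, sample_size=3, cycles=50, seed=0):
    """Aging evolution: every step removes the *oldest* individual rather
    than the worst, which regularizes the search and keeps the memory
    footprint bounded by the (micro-)population size."""
    rng = random.Random(seed)
    population = deque()
    for _ in range(population_size):
        arch = random_arch(rng)
        population.append((arch, evaluate(arch)))
    best = max(population, key=lambda p: p[1])
    for _ in range(cycles):
        # Tournament selection from a random sample of the population.
        sample = rng.sample(list(population), sample_size)
        parent = max(sample, key=lambda p: p[1])
        child = mutate(parent[0], rng)
        population.append((child, evaluate(child)))
        population.popleft()  # age-based removal, not fitness-based
        if population[-1][1] > best[1]:
            best = population[-1]
    return best

# Toy search space: an "architecture" is a bit string, fitness counts set bits.
random_arch = lambda rng: [rng.randint(0, 1) for _ in range(8)]
def flip_one_bit(arch, rng):
    child = list(arch)
    i = rng.randrange(len(child))
    child[i] ^= 1
    return child

best_arch, best_fitness = regularized_evolution(sum, flip_one_bit, random_arch)
```

With `population_size=10` the deque mirrors the micro-population setting described in the abstract: memory cost stays constant regardless of how many cycles the search runs.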
Walking Your LiDOG: A Journey Through Multiple Domains for LiDAR Semantic Segmentation
The ability to deploy robots that can operate safely in diverse environments
is crucial for developing embodied intelligent agents. As a community, we have
made tremendous progress in within-domain LiDAR semantic segmentation. However,
do these methods generalize across domains? To answer this question, we design
the first experimental setup for studying domain generalization (DG) for LiDAR
semantic segmentation (DG-LSS). Our results confirm a significant gap between
methods evaluated in a cross-domain setting: for example, a model trained on
the source dataset (SemanticKITTI) reaches a markedly lower mIoU on the target
data than a model trained directly on the target domain (nuScenes). To close
this gap, we propose the first method specifically designed for DG-LSS; it
improves the target-domain mIoU and outperforms all baselines. Our method
augments a sparse-convolutional
encoder-decoder 3D segmentation network with an additional, dense 2D
convolutional decoder that learns to classify a bird's-eye view of the point
cloud. This simple auxiliary task encourages the 3D network to learn features
that are robust to shifts in sensor placement and resolution, and that transfer
across domains. With this work, we aim to inspire the community to develop and
evaluate future models in such cross-domain conditions.
Comment: Accepted at ICCV 202
Novel class discovery meets foundation models for 3D semantic segmentation
The task of Novel Class Discovery (NCD) in semantic segmentation entails
training a model able to accurately segment unlabelled (novel) classes, relying
on the available supervision from annotated (base) classes. Although
extensively investigated in 2D image data, the extension of the NCD task to the
domain of 3D point clouds represents a pioneering effort, characterized by
assumptions and challenges that are not present in the 2D case. This paper
represents an advancement in the analysis of point cloud data in four
directions. Firstly, it introduces the novel task of NCD for point cloud
semantic segmentation. Secondly, it demonstrates that directly transposing the
only existing NCD method for 2D image semantic segmentation to 3D data yields
suboptimal results. Thirdly, a new NCD approach based on online clustering,
uncertainty estimation, and semantic distillation is presented. Lastly, a novel
evaluation protocol is proposed to rigorously assess the performance of NCD in
point cloud semantic segmentation. Through comprehensive evaluations on the
SemanticKITTI, SemanticPOSS, and S3DIS datasets, the paper demonstrates
substantial superiority of the proposed method over the considered baselines.
Comment: arXiv admin note: substantial text overlap with arXiv:2303.1161
Compositional Semantic Mix for Domain Adaptation in Point Cloud Segmentation
Deep-learning models for 3D point cloud semantic segmentation exhibit limited
generalization capabilities when trained and tested on data captured with
different sensors or in varying environments due to domain shift. Domain
adaptation methods can be employed to mitigate this domain shift, for instance,
by simulating sensor noise, developing domain-agnostic generators, or training
point cloud completion networks. Often, these methods are tailored to
range-view maps or require multi-modal input. In contrast, domain adaptation in
the image domain can be executed through sample mixing, which emphasizes input
data manipulation rather than employing distinct adaptation modules. In this
study, we introduce compositional semantic mixing for point cloud domain
adaptation, representing the first unsupervised domain adaptation technique for
point cloud segmentation based on semantic and geometric sample mixing. We
present a two-branch symmetric network architecture capable of concurrently
processing point clouds from a source domain (e.g. synthetic) and point clouds
from a target domain (e.g. real-world). Each branch operates within one domain
by integrating selected data fragments from the other domain and utilizing
semantic information derived from source labels and target (pseudo) labels.
Additionally, our method can leverage a limited number of human point-level
annotations (semi-supervised) to further enhance performance. We assess our
approach in both synthetic-to-real and real-to-real scenarios using LiDAR
datasets and demonstrate that it significantly outperforms state-of-the-art
methods in both unsupervised and semi-supervised settings.
Comment: TPAMI. arXiv admin note: text overlap with arXiv:2207.0977
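The core mixing idea, cutting class-selected fragments from one domain and pasting them into a scan from the other, with labels travelling along, can be sketched in a few lines. This is a simplified illustration of semantic sample mixing, not the paper's full two-branch pipeline; the function name is hypothetical.

```python
import numpy as np

def semantic_mix(src_pts, src_lbl, tgt_pts, tgt_pseudo, paste_classes, rng):
    """Compose a mixed scene: points of the selected semantic classes are
    cut from the source scan and pasted into the target scan, so the mixed
    scene carries source ground truth and target pseudo-labels together."""
    mask = np.isin(src_lbl, paste_classes)
    pts = np.concatenate([tgt_pts, src_pts[mask]])
    lbl = np.concatenate([tgt_pseudo, src_lbl[mask]])
    order = rng.permutation(len(pts))  # hide the source/target ordering
    return pts[order], lbl[order]

rng = np.random.default_rng(0)
src_pts = rng.normal(size=(5, 3))
src_lbl = np.array([1, 1, 2, 3, 2])   # source ground-truth labels
tgt_pts = rng.normal(size=(4, 3))
tgt_pseudo = np.array([0, 0, 1, 1])   # target pseudo-labels
mixed_pts, mixed_lbl = semantic_mix(src_pts, src_lbl, tgt_pts,
                                    tgt_pseudo, [2], rng)
```

In the symmetric two-branch setup described above, the same operation runs in both directions: source fragments into target scans and target fragments (with pseudo-labels) into source scans.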
Novel Class Discovery for 3D Point Cloud Semantic Segmentation
Novel class discovery (NCD) for semantic segmentation is the task of learning a model that can segment unlabelled (novel) classes using only the supervision from labelled (base) classes. This problem has recently been pioneered for 2D image data, but no work exists for 3D point cloud data. In fact, the assumptions made for 2D are loosely applicable to 3D in this case. This paper is presented to advance the state of the art on point cloud data analysis in four directions. Firstly, we address the new problem of NCD for point cloud semantic segmentation. Secondly, we show that the transposition of the only existing NCD method for 2D semantic segmentation to 3D data is sub-optimal. Thirdly, we present a new method for NCD based on online clustering that exploits uncertainty quantification to produce prototypes for pseudo-labelling the points of the novel classes. Lastly, we introduce a new evaluation protocol to assess the performance of NCD for point cloud semantic segmentation. We thoroughly evaluate our method on SemanticKITTI and SemanticPOSS datasets, showing that it can significantly outperform the baseline. Project page: https://github.com/LuigiRiz/NOPS
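The uncertainty-aware pseudo-labelling step, assigning novel points to prototypes while discarding ambiguous assignments, can be sketched with a simple distance-margin test. This is an illustrative stand-in for the uncertainty quantification used in the paper; the function name and margin threshold are assumptions.

```python
import numpy as np

def pseudo_label(features, prototypes, margin=0.2):
    """Nearest-prototype pseudo-labelling with a basic uncertainty test:
    points whose two closest prototypes are nearly equidistant get label
    -1 (ignored by the loss) instead of a noisy pseudo-label."""
    # Pairwise distances: (num_points, num_prototypes).
    d = np.linalg.norm(features[:, None, :] - prototypes[None, :, :], axis=-1)
    nearest = d.argmin(axis=1)
    two_closest = np.sort(d, axis=1)[:, :2]
    confident = (two_closest[:, 1] - two_closest[:, 0]) > margin
    return np.where(confident, nearest, -1)

prototypes = np.array([[0.0, 0.0], [10.0, 10.0]])  # one per novel class
features = np.array([[0.1, 0.0], [9.0, 10.0], [5.0, 5.0]])
labels = pseudo_label(features, prototypes)
```

The third point sits exactly between the two prototypes, so it is marked -1 rather than being forced into either novel class; the online-clustering loop would then refine the prototypes using only the confident points.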
SF-UDA-3D: Source-Free Unsupervised Domain Adaptation for LiDAR-Based 3D Object Detection
3D object detectors based only on LiDAR point clouds hold the state of the art on modern street-view benchmarks. However, LiDAR-based detectors generalize poorly across domains due to domain shift. In the case of LiDAR, in fact, domain shift stems not only from changes in the environment and in object appearance, as for visual data from RGB cameras, but also from the geometry of the point clouds (e.g., point-density variations). This paper proposes SF-UDA-3D, the first Source-Free Unsupervised Domain Adaptation (SF-UDA) framework to domain-adapt the state-of-the-art PointRCNN 3D detector to target domains for which we have no annotations (unsupervised) and for which we hold neither images nor annotations of the source domain (source-free). SF-UDA-3D is novel in both respects. Our approach is based on pseudo-annotations, reversible scale transformations, and motion coherency. SF-UDA-3D outperforms both previous domain adaptation techniques based on feature alignment and state-of-the-art 3D object detection methods that additionally use few-shot target annotations or target annotation statistics. This is demonstrated by extensive experiments on two large-scale datasets, i.e., KITTI and nuScenes.
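The "reversible scale transformation" idea can be illustrated in isolation: scale the point cloud before detection to bridge object-size differences between domains, then map the predicted boxes back to the original metric frame by undoing the scaling. PointRCNN and the pseudo-annotation pipeline are out of scope here; `aabb_detector` is a hypothetical stand-in detector.

```python
import numpy as np

def detect_with_scale(points, detector, scale):
    """Reversible scale transformation: scale the cloud, run the detector,
    then undo the scaling on the predicted boxes so they live in the
    original metric frame."""
    boxes = detector(points * scale)  # boxes: (M, 6) = cx, cy, cz, l, w, h
    return boxes / scale              # all six fields are lengths, so divide

# Stand-in "detector": one axis-aligned box around the whole cloud.
def aabb_detector(pts):
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    return np.concatenate([(lo + hi) / 2, hi - lo])[None, :]

points = np.array([[0.0, 0.0, 0.0], [2.0, 4.0, 1.0], [1.0, 1.0, 0.5]])
boxes = detect_with_scale(points, aabb_detector, scale=1.1)
```

Because the transformation is exactly invertible, several candidate scales can be tried and their rescaled predictions compared in a single common frame, which is what makes the scale a tunable adaptation knob rather than a destructive preprocessing step.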