Panoptic Vision-Language Feature Fields
Recently, methods have been proposed for 3D open-vocabulary semantic
segmentation. Such methods are able to segment scenes into arbitrary classes
given at run-time using their text description. In this paper, we propose, to
our knowledge, the first algorithm for open-vocabulary panoptic segmentation,
simultaneously performing both semantic and instance segmentation. Our
algorithm, Panoptic Vision-Language Feature Fields (PVLFF), learns a feature
field of the scene, jointly learning vision-language features and hierarchical
instance features through a contrastive loss function from 2D instance segment
proposals on input frames. Our method achieves comparable performance against
state-of-the-art closed-set 3D panoptic systems on the HyperSim, ScanNet, and
Replica datasets, and outperforms current 3D open-vocabulary systems in terms of
semantic segmentation. We additionally ablate our method to demonstrate the
effectiveness of our model architecture. Our code will be available at
https://github.com/ethz-asl/autolabel.
Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
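The contrastive objective described in the abstract (pulling together features of points that fall in the same 2D instance segment, pushing apart features from different segments) can be illustrated with a minimal sketch. This is not the PVLFF implementation; the function name, the InfoNCE-style formulation, and the temperature value are illustrative assumptions.

```python
import numpy as np

def instance_contrastive_loss(feats, seg_ids, temperature=0.1):
    """Illustrative InfoNCE-style contrastive loss (not the paper's code):
    features of points sharing a 2D instance segment id are treated as
    positives, all other points as negatives."""
    # Normalize features so similarity is a cosine similarity.
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = f @ f.T / temperature
    # Positive-pair mask: same segment id, excluding self-pairs.
    same = (seg_ids[:, None] == seg_ids[None, :]).astype(float)
    np.fill_diagonal(same, 0.0)
    # Log-softmax over all other points (self excluded via -inf).
    logits = sim - sim.max(axis=1, keepdims=True)
    np.fill_diagonal(logits, -np.inf)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    log_prob = np.where(np.isfinite(log_prob), log_prob, 0.0)
    # Average negative log-likelihood over each point's positives.
    pos_counts = same.sum(axis=1)
    valid = pos_counts > 0
    loss = -(same * log_prob).sum(axis=1)[valid] / pos_counts[valid]
    return loss.mean()
```

With features that cluster cleanly by segment id the loss is near zero, while mismatched ids drive it up, which is the behaviour a contrastive instance objective needs.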
Investigate Indistinguishable Points in Semantic Segmentation of 3D Point Cloud
This paper investigates the indistinguishable points (difficult to predict
label) in semantic segmentation for large-scale 3D point clouds. The
indistinguishable points consist of those located on complex boundaries, points
with similar local textures but different categories, and points in small,
isolated hard regions, all of which largely harm the performance of 3D semantic
segmentation. To address this challenge, we propose a novel Indistinguishable
Area Focalization Network (IAF-Net), which selects indistinguishable points
adaptively by utilizing hierarchical semantic features and enhances the
fine-grained features of points, especially the indistinguishable ones. We
also introduce a multi-stage loss to improve the feature representation in a
progressive way. Moreover, in order to analyze the segmentation performances of
indistinguishable areas, we propose a new evaluation metric called
Indistinguishable Points Based Metric (IPBM). Our IAF-Net achieves results
comparable to the state of the art on several popular 3D point cloud datasets,
e.g., S3DIS and ScanNet, and clearly outperforms other methods on the IPBM.
Comment: AAAI202
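The abstract's idea of adaptively selecting indistinguishable points can be sketched with simple proxies. This is not the IAF-Net selection mechanism: the scoring rule (prediction entropy plus label disagreement among feature-space neighbours), the function name, and the parameters `k` and `ratio` are all assumptions for illustration.

```python
import numpy as np

def select_indistinguishable(points_feat, pred_logits, k=5, ratio=0.2):
    """Illustrative hard-point selection (not the paper's method): score
    each point by prediction uncertainty plus label disagreement among
    its k nearest neighbours in feature space, and return the indices of
    the hardest `ratio` fraction."""
    n = len(points_feat)
    # Softmax entropy as a per-point uncertainty measure.
    z = pred_logits - pred_logits.max(axis=1, keepdims=True)
    p = np.exp(z)
    p /= p.sum(axis=1, keepdims=True)
    entropy = -(p * np.log(p + 1e-12)).sum(axis=1)
    # Fraction of k nearest neighbours with a different predicted label.
    d = np.linalg.norm(points_feat[:, None] - points_feat[None, :], axis=2)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]
    labels = p.argmax(axis=1)
    disagree = (labels[nn] != labels[:, None]).mean(axis=1)
    # Highest-scoring points are the "indistinguishable" candidates.
    score = entropy + disagree
    m = max(1, int(ratio * n))
    return np.argsort(-score)[:m]
```

A point sitting between two confidently labelled clusters, with an uncertain prediction, scores highest under this rule, matching the intuition of points on complex boundaries being the hardest to label.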
Discovering Class-Specific Pixels for Weakly-Supervised Semantic Segmentation
We propose an approach to discover class-specific pixels for the
weakly-supervised semantic segmentation task. We show that properly combining
saliency and attention maps allows us to obtain reliable cues capable of
significantly boosting the performance. First, we propose a simple yet powerful
hierarchical approach to discover class-agnostic salient regions, obtained
using a salient object detector, that would otherwise be ignored. Second, we
use fully convolutional attention maps to reliably localize the class-specific
regions in a given image. We combine these two cues to discover class-specific
pixels which are then used as an approximate ground truth for training a CNN.
While solving the weakly-supervised semantic segmentation task, we ensure that
the image-level classification task is also solved, which forces the CNN to
assign at least one pixel to each object present in the image.
Experimentally, on the PASCAL VOC12 val and test sets, we obtain mIoU scores of
60.8% and 61.9%, achieving gains of 5.1% and 5.2% compared to
the published state-of-the-art results. The code is made publicly available.
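The cue combination described in the abstract (class-agnostic saliency gating class-specific attention maps to form pseudo ground truth) can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the function name, the threshold `tau`, and the 1-based class-id convention are assumptions.

```python
import numpy as np

def pseudo_labels(saliency, cams, image_classes, bg_label=0, tau=0.5):
    """Illustrative fusion of the two cues (not the paper's exact rule):
    a pixel gets the class whose attention map (CAM) is strongest there,
    but only where the class-agnostic saliency map marks foreground.
    saliency: (H, W) in [0, 1]; cams: (C, H, W) float; image_classes:
    1-based ids of classes known to be present at image level."""
    labels = np.full(saliency.shape, bg_label, dtype=int)
    fg = saliency > tau
    # Restrict the argmax to classes present in the image-level label.
    masked = np.full_like(cams, -np.inf)
    for c in image_classes:
        masked[c - 1] = cams[c - 1]
    best = masked.argmax(axis=0) + 1  # back to 1-based class ids
    labels[fg] = best[fg]
    return labels
```

The resulting label map can then serve as approximate ground truth for training a segmentation CNN, with non-salient pixels kept as background.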