1,209 research outputs found
3DFeat-Net: Weakly Supervised Local 3D Features for Point Cloud Registration
In this paper, we propose 3DFeat-Net, which learns both a 3D feature
detector and a descriptor for point cloud matching using weak supervision. Unlike
many existing works, we do not require manual annotation of matching point
clusters. Instead, we leverage alignment and attention mechanisms to learn
feature correspondences from GPS/INS-tagged 3D point clouds without explicitly
specifying them. We create training and benchmark outdoor Lidar datasets, and
experiments show that 3DFeat-Net obtains state-of-the-art performance on these
gravity-aligned datasets.
Comment: 17 pages, 6 figures. Accepted in ECCV 201
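Learning correspondences without annotated matches is often done by softly matching descriptors across two clouds and letting the alignment signal supervise the result. A minimal numpy sketch of such soft correspondences (a hypothetical illustration, not the paper's actual formulation; `temperature` is an assumed hyperparameter):

```python
import numpy as np

def soft_correspondences(desc_a, desc_b, temperature=0.1):
    """Row i is a probability distribution over points in cloud B
    for point i in cloud A, derived from descriptor similarity."""
    sim = desc_a @ desc_b.T                    # pairwise descriptor similarity
    w = np.exp(sim / temperature)
    return w / w.sum(axis=1, keepdims=True)    # normalize per source point

rng = np.random.default_rng(0)
desc = rng.standard_normal((32, 16))
desc /= np.linalg.norm(desc, axis=1, keepdims=True)  # unit-normalize descriptors

P = soft_correspondences(desc, desc)
# With identical clouds, each point matches itself most strongly.
assert np.all(P.argmax(axis=1) == np.arange(32))
```

Because the matching is soft and differentiable, a pose or alignment loss can flow back into the descriptors, which is the sense in which GPS/INS pose tags alone can supervise feature learning.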
Weakly Supervised Point Clouds Transformer for 3D Object Detection
Annotating 3D datasets is required for semantic segmentation and
object detection in scene understanding. In this paper we present a framework
for weak supervision of a point cloud transformer used for 3D object
detection. The aim is to decrease the amount of supervision needed for
training, given the high cost of annotating 3D datasets. We propose an
Unsupervised Voting Proposal Module, which learns randomly preset anchor
points and uses a voting network to select high-quality anchor points.
Information is then distilled between teacher and student networks. For the
student network, we apply a ResNet to efficiently extract local
characteristics, although this can lose much global information. To provide
the student network with input that incorporates both global and local
information, we adopt the self-attention mechanism of a transformer to
extract global features, and ResNet layers to extract region proposals. The
teacher network supervises the classification and regression of the student
network using a model pre-trained on ImageNet. On the challenging KITTI
datasets, our experimental results achieve the highest average precision
compared with the most recent weakly supervised 3D object detectors.
Comment: International Conference on Intelligent Transportation Systems
(ITSC), 202
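The teacher-to-student supervision described above is commonly implemented as a knowledge-distillation loss on softened class predictions. A minimal numpy sketch under that assumption (the temperature `T` and the KL form are the standard distillation recipe, not details taken from this abstract):

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between temperature-softened class distributions,
    scaled by T^2 as in standard knowledge distillation."""
    p = softmax(teacher_logits / T)            # teacher targets
    q = softmax(student_logits / T)            # student predictions
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)) * T * T)

rng = np.random.default_rng(1)
teacher = rng.standard_normal((8, 3))          # 8 proposals, 3 classes
loss_same = distillation_loss(teacher, teacher)            # ~0 when student matches teacher
loss_diff = distillation_loss(teacher + 1.5 * rng.standard_normal((8, 3)), teacher)
```

The loss is zero exactly when the student reproduces the teacher's distribution, so minimizing it transfers the teacher's (ImageNet-pretrained) predictions to the student without extra 3D labels.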
Context-Aware Transformer for 3D Point Cloud Automatic Annotation
3D automatic annotation has received increased attention since manually
annotating 3D point clouds is laborious. However, existing methods are usually
complicated, e.g., pipelined training for 3D foreground/background
segmentation, cylindrical object proposals, and point completion. Furthermore,
they often overlook the inter-object feature relation that is particularly
informative to hard samples for 3D annotation. To this end, we propose a simple
yet effective end-to-end Context-Aware Transformer (CAT) as an automated 3D-box
labeler to generate precise 3D box annotations from 2D boxes, trained with a
small number of human annotations. We adopt the general encoder-decoder
architecture, where the CAT encoder consists of an intra-object encoder (local)
and an inter-object encoder (global), performing self-attention along the
sequence and batch dimensions, respectively. The former models intra-object
interactions among points, and the latter extracts feature relations among
different objects, thus boosting scene-level understanding. With the local and
global encoders, CAT can generate high-quality 3D box annotations with a
streamlined workflow, outperforming the existing state of the art by up
to 1.79% 3D AP on the hard task of the KITTI test set.
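The key structural idea, self-attention along the sequence dimension (points within an object) versus the batch dimension (across objects), can be sketched in a few lines of numpy by simply swapping axes before attending. This is a single-head illustration under assumed shapes, not the CAT implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Single-head scaled dot-product self-attention over the
    second-to-last axis of x (shape (..., L, C))."""
    c = x.shape[-1]
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(c)   # (..., L, L)
    return softmax(scores) @ x                          # (..., L, C)

B, N, C = 4, 16, 8   # objects in a batch, points per object, channels
feats = np.random.default_rng(2).standard_normal((B, N, C))

# Intra-object (local): attend along the point/sequence dimension N.
intra = self_attention(feats)
# Inter-object (global): attend along the batch dimension B by moving it
# into the sequence slot, then restoring the original layout.
inter = self_attention(feats.transpose(1, 0, 2)).transpose(1, 0, 2)
```

The two calls share one attention routine; only the axis ordering decides whether points interact within an object or corresponding features interact across objects in the batch.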