Methods for Feature Detection in Point Clouds
This paper gives an overview of several techniques for detecting features, and in particular sharp features, on point-sampled geometry. In addition, a new technique using the Gauss map is presented. Given an unstructured point cloud, this method performs Gauss map clustering on local neighborhoods in order to discard all points that are unlikely to belong to a sharp feature. A single parameter is used in this stage to control the sensitivity of the feature detection.
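The Gauss-map idea can be illustrated with a small sketch: estimate a normal per point by PCA over its neighborhood, then flag points whose neighborhood normals spread widely on the unit sphere. This is a simplified, brute-force illustration, not the paper's clustering algorithm; the angular threshold stands in for the single sensitivity parameter.

```python
import numpy as np

def estimate_normals(points, k=10):
    """Estimate unit normals via PCA over the k nearest neighbors (brute force)."""
    normals = np.zeros_like(points)
    for i in range(len(points)):
        d = np.linalg.norm(points - points[i], axis=1)
        nbrs = points[np.argsort(d)[:k]]
        w, v = np.linalg.eigh(np.cov(nbrs.T))
        normals[i] = v[:, 0]  # eigenvector of the smallest eigenvalue
    return normals

def sharp_feature_candidates(points, k=10, angle_thresh_deg=30.0):
    """Flag points whose neighborhood normals spread widely on the Gauss sphere.

    angle_thresh_deg plays the role of the single sensitivity parameter:
    raising it discards more points as unlikely sharp-feature candidates.
    """
    normals = estimate_normals(points, k)
    flags = np.zeros(len(points), dtype=bool)
    cos_t = np.cos(np.radians(angle_thresh_deg))
    for i in range(len(points)):
        d = np.linalg.norm(points - points[i], axis=1)
        idx = np.argsort(d)[:k]
        # maximum pairwise normal deviation within the neighborhood
        dots = np.abs(normals[idx] @ normals[idx].T)
        flags[i] = dots.min() < cos_t
    return flags
```

On two planes meeting at a crease, points deep inside either plane see near-parallel neighbor normals and are discarded, while points near the crease see normals from both planes and survive as candidates.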
A Generalized Multi-Modal Fusion Detection Framework
LiDAR point clouds have become the most common data source in autonomous
driving. However, due to the sparsity of point clouds, accurate and reliable
detection cannot be achieved in certain scenarios. Because images are
complementary to point clouds, they are receiving increasing attention.
Despite some success, existing fusion methods either perform hard fusion
or do not fuse in a direct manner. In this paper, we propose a generic 3D
detection framework called MMFusion, using multi-modal features. The framework
aims to achieve accurate fusion between LiDAR and images to improve 3D
detection in complex scenes. Our framework consists of two separate streams:
the LiDAR stream and the camera stream, which can be compatible with any
single-modal feature extraction network. The Voxel Local Perception Module in
the LiDAR stream enhances local feature representation, and then the
Multi-modal Feature Fusion Module selectively combines feature output from
different streams to achieve better fusion. Extensive experiments have shown
that our framework not only outperforms existing baselines but also improves
their detection, especially for cyclists and pedestrians on the KITTI
benchmark, with strong robustness and generalization. Hopefully,
our work will stimulate more research into multi-modal fusion for autonomous
driving tasks.
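The abstract does not specify how the Multi-modal Feature Fusion Module "selectively combines" the two streams; one common realization of selective fusion is a learned sigmoid gate over the concatenated features. The sketch below is an illustrative gated fusion under assumed shapes and weights, not the paper's actual module.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(f_lidar, f_cam, w_gate, b_gate):
    """Selectively blend two spatially aligned feature maps.

    A sigmoid gate computed from the concatenated features weights each
    channel of each cell, so the model can favour LiDAR features in
    geometry-rich regions and camera features where points are sparse.
    f_lidar, f_cam: (H, W, C); w_gate: (2C, C); b_gate: (C,).
    """
    both = np.concatenate([f_lidar, f_cam], axis=-1)  # (H, W, 2C)
    g = sigmoid(both @ w_gate + b_gate)               # (H, W, C), in (0, 1)
    return g * f_lidar + (1.0 - g) * f_cam            # per-cell convex blend
```

Because the gate lies in (0, 1), every fused value is a convex combination of the two inputs, which keeps the fusion stable regardless of which stream dominates.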
MS23D: A 3D Object Detection Method Using Multi-Scale Semantic Feature Points to Construct 3D Feature Layer
Lidar point clouds, as a type of data with accurate distance perception, can
effectively represent the motion and posture of objects in three-dimensional
space. However, the sparsity and disorderliness of point clouds make it
challenging to extract features directly from them. Many studies have addressed
this issue by transforming point clouds into regular voxel representations.
However, these methods often lead to the loss of fine-grained local feature
information due to downsampling. Moreover, the sparsity of point clouds poses
difficulties in efficiently aggregating features in the 3D feature layer using
voxel-based two-stage methods. To address these issues, this paper proposes a
two-stage 3D detection framework called MS23D. In MS23D, we utilize
small-sized voxels to extract fine-grained local features and large-sized
voxels to capture long-range local features. Additionally, we propose a method
for constructing the 3D feature layer using multi-scale semantic feature points,
enabling the transformation of sparse 3D feature layers into more compact
representations. Furthermore, we compute the offset between feature points in
the 3D feature layer and the centroid of objects, aiming to bring them as close
as possible to the object's center. This significantly enhances the efficiency
of feature aggregation. To validate its effectiveness, we evaluated our method
on the KITTI and ONCE datasets.
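The two voxel scales can be illustrated with a toy voxelizer: small voxels preserve fine-grained local structure, while large voxels summarize long-range context. A hedged sketch (simple hash-grid averaging, not the paper's network):

```python
import numpy as np
from collections import defaultdict

def voxel_centroids(points, voxel_size):
    """Average the points falling into each cubic voxel of the given edge length."""
    buckets = defaultdict(list)
    for p in points:
        key = tuple(np.floor(p / voxel_size).astype(int))
        buckets[key].append(p)
    return {k: np.mean(v, axis=0) for k, v in buckets.items()}
```

Running the same cloud through a small and a large voxel size yields the two granularities that a multi-scale feature layer can combine: the fine grid keeps nearby points separated, the coarse grid merges them into one long-range summary.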
SSN: Shape Signature Networks for Multi-class Object Detection from Point Clouds
Multi-class 3D object detection aims to localize and classify objects of
multiple categories from point clouds. Due to the nature of point clouds, i.e.
unstructured, sparse and noisy, some features benefitting multi-class
discrimination are underexploited, such as shape information. In this paper, we
propose a novel 3D shape signature to explore the shape information from point
clouds. By incorporating operations of symmetry, convex hull and Chebyshev
fitting, the proposed shape signature is not only compact and effective but
also robust to noise, which serves as a soft constraint to improve the
feature capability of multi-class discrimination. Based on the proposed shape
signature, we develop the shape signature networks (SSN) for 3D object
detection, which consist of pyramid feature encoding part, shape-aware grouping
heads and explicit shape encoding objective. Experiments show that the proposed
method performs remarkably better than existing methods on two large-scale
datasets. Furthermore, our shape signature can act as a plug-and-play component
and the ablation study shows its effectiveness and good scalability.
Comment: Code is available at https://github.com/xinge008/SS
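Two of the three ingredients named above, convex hull and Chebyshev fitting, can be sketched concretely: take the bird's-eye-view footprint of an object's points, compute its convex hull, and compress the hull's radial profile into a few Chebyshev coefficients. This is an illustrative reading of those two steps only (the symmetry operation is omitted), not the paper's exact signature.

```python
import numpy as np

def convex_hull_2d(pts):
    """Andrew's monotone chain; returns hull vertices in CCW order."""
    pts = sorted(map(tuple, pts))
    def cross(o, a, b):
        return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return np.array(lower[:-1] + upper[:-1])

def shape_signature(points_bev, degree=3):
    """Compact radial profile of the convex hull, compressed into
    a few Chebyshev coefficients (length degree + 1)."""
    hull = convex_hull_2d(points_bev)
    c = hull.mean(axis=0)
    ang = np.arctan2(hull[:, 1] - c[1], hull[:, 0] - c[0])
    r = np.linalg.norm(hull - c, axis=1)
    order = np.argsort(ang)
    x = ang[order] / np.pi  # map angles to [-1, 1] for the Chebyshev basis
    return np.polynomial.chebyshev.chebfit(x, r[order], degree)
```

A few low-order coefficients make the descriptor compact, and fitting (rather than sampling the raw boundary) smooths out noise in the hull.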
MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection
Accurate and reliable 3D detection is vital for many applications including
autonomous driving vehicles and service robots. In this paper, we present a
flexible and high-performance 3D detection framework, named MPPNet, for 3D
temporal object detection with point cloud sequences. We propose a novel
three-hierarchy framework with proxy points for multi-frame feature encoding
and interactions to achieve better detection. The three hierarchies conduct
per-frame feature encoding, short-clip feature fusion, and whole-sequence
feature aggregation, respectively. To enable processing long-sequence point
clouds with reasonable computational resources, intra-group feature mixing and
inter-group feature attention are proposed to form the second and third feature
encoding hierarchies, which are recurrently applied for aggregating multi-frame
trajectory features. The proxy points not only act as consistent object
representations for each frame, but also serve as the courier to facilitate
feature interaction between frames. Experiments on the large-scale Waymo Open
Dataset show that our approach outperforms state-of-the-art methods by large
margins
when applied to both short (e.g., 4-frame) and long (e.g., 16-frame) point
cloud sequences. Code is available at https://github.com/open-mmlab/OpenPCDet.
Comment: Accepted by ECCV 202
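Proxy points give every frame a common set of query locations. One simple way to attach a frame's features to those fixed locations, used here as an illustrative stand-in for the paper's learned per-frame encoding, is inverse-distance interpolation over the k nearest raw points:

```python
import numpy as np

def gather_at_proxies(proxies, frame_points, frame_feats, k=3):
    """Attach per-frame features to a fixed set of proxy points by
    inverse-distance interpolation over the k nearest raw points.
    proxies: (P, 3); frame_points: (N, 3); frame_feats: (N, C)."""
    out = np.zeros((len(proxies), frame_feats.shape[1]))
    for i, q in enumerate(proxies):
        d = np.linalg.norm(frame_points - q, axis=1)
        idx = np.argsort(d)[:k]
        w = 1.0 / (d[idx] + 1e-8)   # closer raw points weigh more
        w /= w.sum()
        out[i] = w @ frame_feats[idx]
    return out
```

Stacking the result over a sequence, one call per frame with the same proxies, yields a (frames, proxies, channels) trajectory feature in which each row of proxies refers to the same object location across time, which is what makes cross-frame mixing and attention well defined.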
A Unified BEV Model for Joint Learning of 3D Local Features and Overlap Estimation
Pairwise point cloud registration is a critical task for many applications,
which heavily depends on finding correct correspondences from the two point
clouds. However, the low overlap between input point clouds causes the
registration to fail easily, leading to mistaken overlap estimates and mismatched
correspondences, especially in scenes where non-overlapping regions contain
similar structures. In this paper, we present a unified bird's-eye view (BEV)
model for joint learning of 3D local features and overlap estimation to
fulfill pairwise registration and loop closure. Feature description is
performed by a sparse UNet-like network based on the BEV representation; 3D
keypoints are extracted by a detection head for 2D locations and a regression
head for heights. For overlap detection, a cross-attention module is applied
to exchange contextual information between the input point clouds, followed by
a classification head to estimate the overlapping region. We evaluate our unified
model extensively on the KITTI dataset and Apollo-SouthBay dataset. The
experiments demonstrate that our method significantly outperforms existing
methods on overlap estimation, especially in scenes with small overlaps. It
also achieves top registration performance on both datasets in terms of
translation and rotation errors.
Comment: 8 pages. Accepted by ICRA-202
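The cross-attention interaction between the two clouds can be sketched in its generic single-head form; shapes and weight matrices below are assumed for illustration, not taken from the paper.

```python
import numpy as np

def cross_attention(q_feats, kv_feats, wq, wk, wv):
    """Single-head cross-attention: BEV cells of cloud A query cloud B.
    q_feats: (Na, d); kv_feats: (Nb, d); wq/wk/wv: (d, dk)."""
    Q, K, V = q_feats @ wq, kv_feats @ wk, kv_feats @ wv
    scores = Q @ K.T / np.sqrt(Q.shape[1])       # (Na, Nb) similarity
    scores -= scores.max(axis=1, keepdims=True)  # numerically stable softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ V                               # (Na, dk) contextualized output
```

Each output row is a convex combination of cloud B's value vectors, so every cell of cloud A ends up summarizing the parts of cloud B it resembles, which is the contextual signal a downstream classification head can use to decide overlap.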