16 research outputs found

    SSN: Shape Signature Networks for Multi-class Object Detection from Point Clouds

    Multi-class 3D object detection aims to localize and classify objects of multiple categories from point clouds. Because point clouds are unstructured, sparse, and noisy, features that benefit multi-class discrimination, such as shape information, are underexploited. In this paper, we propose a novel 3D shape signature to exploit the shape information in point clouds. By incorporating operations of symmetry, convex hull, and Chebyshev fitting, the proposed shape signature is not only compact and effective but also robust to noise, and it serves as a soft constraint that improves the features' capability for multi-class discrimination. Based on the proposed shape signature, we develop the Shape Signature Networks (SSN) for 3D object detection, which consist of a pyramid feature encoding part, shape-aware grouping heads, and an explicit shape encoding objective. Experiments show that the proposed method performs remarkably better than existing methods on two large-scale datasets. Furthermore, our shape signature can act as a plug-and-play component, and an ablation study shows its effectiveness and good scalability.

    Comment: Code is available at https://github.com/xinge008/SS
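    The abstract names three ingredients of the signature: symmetry, convex hull, and Chebyshev fitting. A minimal sketch of how such a descriptor could be assembled is shown below; it is not the authors' implementation (see their repository for that), and the bird's-eye-view projection, mirroring axis, and fitting degree are all illustrative assumptions:

    ```python
    import numpy as np

    def convex_hull_2d(pts):
        """Andrew's monotone chain: returns hull vertices of an (N, 2) array."""
        pts = sorted(map(tuple, pts))
        def half(seq):
            h = []
            for p in seq:
                # Pop while the last two kept points and p make a non-left turn
                while len(h) >= 2 and ((h[-1][0] - h[-2][0]) * (p[1] - h[-2][1])
                                       - (h[-1][1] - h[-2][1]) * (p[0] - h[-2][0])) <= 0:
                    h.pop()
                h.append(p)
            return h
        lower, upper = half(pts), half(list(reversed(pts)))
        return np.array(lower[:-1] + upper[:-1])

    def shape_signature(points, degree=8):
        """Illustrative shape signature: symmetrize, take the convex hull,
        and compress the hull's radial profile with a Chebyshev fit."""
        # Project to the ground plane (bird's-eye view) and center
        xy = points[:, :2] - points[:, :2].mean(axis=0)
        # Symmetry: mirror about the x-axis so both halves contribute equally
        sym = np.vstack([xy, xy * [1.0, -1.0]])
        hull = convex_hull_2d(sym)
        # Radial profile of the hull boundary, ordered by polar angle
        order = np.argsort(np.arctan2(hull[:, 1], hull[:, 0]))
        radii = np.linalg.norm(hull[order], axis=1)
        # Chebyshev fit yields a compact, fixed-length, noise-robust vector
        t = np.linspace(-1.0, 1.0, radii.size)
        return np.polynomial.chebyshev.chebfit(t, radii, degree)
    ```

    The fixed-length coefficient vector is what makes the descriptor usable as a soft constraint: clouds with different point counts map to vectors of the same size.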

    Real-time large-scale dense RGB-D SLAM with volumetric fusion

    We present a new simultaneous localization and mapping (SLAM) system capable of producing high-quality globally consistent surface reconstructions over hundreds of meters in real time with only a low-cost commodity RGB-D sensor. By using a fused volumetric surface reconstruction we achieve a much higher quality map over what would be achieved using raw RGB-D point clouds. In this paper we highlight three key techniques associated with applying a volumetric fusion-based mapping system to the SLAM problem in real time. First, the use of a GPU-based 3D cyclical buffer trick to efficiently extend dense every-frame volumetric fusion of depth maps to function over an unbounded spatial region. Second, overcoming camera pose estimation limitations in a wide variety of environments by combining both dense geometric and photometric camera pose constraints. Third, efficiently updating the dense map according to place recognition and subsequent loop closure constraints by the use of an ‘as-rigid-as-possible’ space deformation. We present results on a wide variety of aspects of the system and show through evaluation on de facto standard RGB-D benchmarks that our system performs strongly in terms of trajectory estimation, map quality and computational performance in comparison to other state-of-the-art systems.

    Funding: Science Foundation Ireland (Strategic Research Cluster Grant 07/SRC/I1168); Irish Research Council (Embark Initiative); United States Office of Naval Research (Grants N00014-10-1-0936, N00014-11-1-0688, N00014-12-1-0093, N00014-12-10020); National Science Foundation (U.S.) (Grant IIS-1318392)
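    The "3D cyclical buffer trick" amounts to addressing a fixed-size volume with modular arithmetic: as the camera moves, the volume shifts by recycling only the slab of voxels that falls out of range, with no memory copy. A minimal CPU sketch follows; the paper's version runs on the GPU over a TSDF volume, and every class and method name here is hypothetical:

    ```python
    import numpy as np

    class CyclicalVolume:
        """Illustrative 3D ring buffer for volumetric (TSDF-style) fusion.

        World voxel coordinates wrap modulo the buffer size, so shifting
        the active region only requires resetting the recycled slab."""

        def __init__(self, size=8):
            self.size = size
            # TSDF value 1.0 denotes "empty / far from any surface"
            self.tsdf = np.ones((size, size, size), dtype=np.float32)
            self.origin = np.zeros(3, dtype=int)  # world coord of volume corner

        def _index(self, world_voxel):
            # Modular addressing: world coordinates wrap into buffer indices
            return tuple(np.mod(world_voxel, self.size))

        def write(self, world_voxel, value):
            self.tsdf[self._index(world_voxel)] = value

        def read(self, world_voxel):
            return self.tsdf[self._index(world_voxel)]

        def shift_x(self, delta=1):
            """Move the active region +delta voxels along x by clearing only
            the slabs that leave it -- no data is copied."""
            for d in range(delta):
                recycled = (self.origin[0] + d) % self.size
                self.tsdf[recycled, :, :] = 1.0
            self.origin[0] += delta
    ```

    Because only the recycled slab is touched, a shift costs O(size²) voxel resets instead of the O(size³) copy a naive re-centering would need, which is what makes every-frame fusion over an unbounded region feasible.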

    On the use of the tree structure of depth levels for comparing 3D object views

    Today, the ready availability of 3D sensory data, the evolution of 3D representations, and their application to object recognition and scene analysis tasks promise to improve the autonomy and flexibility of robots in several domains. However, there has been little research into what can be gained through the explicit inclusion of the structural relations between parts of objects when quantifying the similarity of their shape, and hence for shape-based object category recognition. We propose a Mathematical Morphology inspired hierarchical decomposition of 3D object views into peak components at evenly spaced depth levels, casting the 3D shape similarity problem into a tree of more elementary similarity problems. The matching of these trees of peak components is here compared to matching the individual components through optimal and greedy assignment in a simple feature space, seeking maximum-weight maximal-match assignments. The matching thus achieved provides a metric of total shape similarity between object views. The three matching strategies are evaluated and compared through the category recognition accuracy on objects from a public set of 3D models. It turns out that all three methods yield similar accuracy on the simple features we used, while the greedy method is fastest.
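    The greedy assignment the abstract refers to can be sketched in a few lines: repeatedly take the highest remaining pairwise similarity between components of the two views, then exclude that row and column. This is an illustrative sketch, not the authors' code, and the similarity matrix is assumed given:

    ```python
    import numpy as np

    def greedy_match(sim):
        """Greedy maximum-weight maximal matching on a similarity matrix.

        sim[i, j] is the similarity between component i of one object view
        and component j of the other. Returns the matched index pairs and
        the total similarity, which serves as the shape-similarity score."""
        sim = np.asarray(sim, dtype=float).copy()
        pairs, total = [], 0.0
        for _ in range(min(sim.shape)):
            # Take the best remaining pair...
            i, j = np.unravel_index(np.argmax(sim), sim.shape)
            total += sim[i, j]
            pairs.append((int(i), int(j)))
            # ...and remove its row and column from further consideration
            sim[i, :] = -np.inf
            sim[:, j] = -np.inf
        return pairs, total
    ```

    The optimal variant would solve the same assignment exactly (e.g. with the Hungarian algorithm) at higher cost; the abstract's finding is that, on its simple features, the cheaper greedy strategy loses little accuracy.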