    A Novel Method for Alignment of 3D Models

    In this paper we present a new method for the alignment of 3D models. The approach is based on symmetry properties and exploits the fact that PCA techniques behave well with respect to planar reflective symmetry. A fast search for the optimal alignment axes among the PCA eigenvectors is the essential first step of our alignment process, with planar reflective symmetry serving as the selection criterion. This pre-processing transforms the alignment problem into an indexing scheme based on the number of retained PCA axes. We also introduce a local translational invariance cost (LTIC) that captures a measure of the local translational symmetries of a shape with respect to a given direction. Experimental results show that the proposed method finds the rotation that best aligns a 3D mesh.
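
    The core of such a pipeline can be illustrated with a short sketch: compute the PCA eigenvectors of the vertex set, then keep only those axes whose reflection plane actually scores as symmetric. Below is a minimal NumPy sketch under assumed inputs (an (N, 3) vertex array); the function names, the brute-force symmetry score, and the threshold are illustrative, not the paper's implementation.

        import numpy as np

        def pca_axes(points):
            """Return centred points and PCA eigenvectors (as columns)."""
            centred = points - points.mean(axis=0)
            cov = centred.T @ centred / len(points)
            _, vecs = np.linalg.eigh(cov)  # columns in ascending eigenvalue order
            return centred, vecs

        def reflective_symmetry_cost(points, normal):
            """Mean distance from each point, reflected about the plane with the
            given normal, to its nearest original point (lower = more symmetric)."""
            reflected = points - 2.0 * np.outer(points @ normal, normal)
            # brute-force nearest neighbour; adequate for a small sketch
            d = np.linalg.norm(points[None, :, :] - reflected[:, None, :], axis=2)
            return d.min(axis=1).mean()

        def select_alignment_axes(points, tol=0.1):
            """Keep the PCA axes whose reflection plane scores as symmetric."""
            centred, vecs = pca_axes(points)
            scale = np.linalg.norm(centred, axis=1).mean()
            return [vecs[:, i] for i in range(3)
                    if reflective_symmetry_cost(centred, vecs[:, i]) < tol * scale]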

    A Segmentation Transfer Approach for Rigid Models

    In this paper, we propose using a segmented example model to perform a semantically oriented segmentation of rigid 3D models of the same class (tables, chairs, etc.). To this end, we introduce an alignment method that maps the meaningful parts of the models, and we develop a novel approach based on random walks to transfer a consistent segmentation from the example to the target model. The example-driven segmentation is fast and entirely automatic. We demonstrate the effectiveness of our approach through multiple inter-shape segmentation transfer results for different classes of rigid models.
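
    As a rough illustration of this kind of random-walk label transfer (not the paper's implementation), the sketch below solves the classic random-walker linear system on a graph built over mesh elements, assuming seed labels have already been mapped from the aligned example onto some target nodes; all names are hypothetical.

        import numpy as np

        def random_walk_labels(n_nodes, edges, weights, seeds):
            """Propagate seed labels (dict: node -> label) to all nodes by
            solving the random walker's combinatorial Dirichlet problem."""
            L = np.zeros((n_nodes, n_nodes))  # weighted graph Laplacian
            for (i, j), w in zip(edges, weights):
                L[i, i] += w; L[j, j] += w
                L[i, j] -= w; L[j, i] -= w
            seeded = np.array(sorted(seeds), dtype=int)
            free = np.array([i for i in range(n_nodes) if i not in seeds], dtype=int)
            labels = sorted(set(seeds.values()))
            Lu = L[np.ix_(free, free)]      # unseeded block
            B = L[np.ix_(free, seeded)]     # unseeded-to-seeded block
            probs = np.zeros((len(free), len(labels)))
            for k, lab in enumerate(labels):
                m = np.array([1.0 if seeds[s] == lab else 0.0 for s in seeded])
                # probability that a walk from each free node reaches `lab` first
                probs[:, k] = np.linalg.solve(Lu, -B @ m)
            out = np.empty(n_nodes, dtype=int)
            out[seeded] = [labels.index(seeds[s]) for s in seeded]
            out[free] = probs.argmax(axis=1)
            return out                      # indices into the sorted label list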

    Global rigid registration of CT to video in laparoscopic liver surgery

    PURPOSE: Image-guidance systems have the potential to aid laparoscopic interventions by providing sub-surface structure information and tumour localisation. The registration of a preoperative 3D image with the intraoperative laparoscopic video feed is an important component of image guidance, and it should be fast, robust and minimally disruptive to the surgical procedure. Most methods for rigid and non-rigid registration require a good initial alignment, yet in most research systems for abdominal surgery the user has to rotate and translate the models manually, which is difficult to do quickly and intuitively. METHODS: We propose a fast, global method for the initial rigid alignment between a 3D mesh derived from a preoperative CT of the liver and a surface reconstruction of the intraoperative scene. We formulate the shape matching problem as a quadratic assignment problem that minimises the dissimilarity between feature descriptors while enforcing geometric consistency between all the feature points, and we incorporate a novel constraint based on the liver contours that deals specifically with the challenges introduced by laparoscopic data. RESULTS: We validate the proposed method on synthetic data, on a liver phantom and on retrospective clinical data acquired during a laparoscopic liver resection, showing robustness to reduced partial-surface coverage and to increasing levels of deformation. Our results on the phantom and on the real data show good initial alignment, from which fine alignment techniques can successfully converge to the correct position. Furthermore, since the CT scan can be pre-processed before surgery, the proposed method runs faster than current algorithms. CONCLUSION: The proposed shape matching method provides a fast, global initial registration that can be further refined by fine alignment methods, leading to a more usable and intuitive image-guidance system for laparoscopic liver surgery.
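
    For intuition, here is a hedged sketch of one standard way to attack such a problem: the spectral relaxation of the quadratic assignment problem, which scores candidate correspondences by descriptor similarity while rewarding pairs of correspondences that preserve inter-point distances. It omits the paper's liver-contour constraint, every name is illustrative, and the brute-force affinity matrix limits it to small point sets.

        import numpy as np

        def spectral_match(desc_a, desc_b, pts_a, pts_b, sigma=0.1):
            """Match feature points of shape A to shape B via a QAP relaxation."""
            cands = [(i, j) for i in range(len(desc_a)) for j in range(len(desc_b))]
            M = np.zeros((len(cands), len(cands)))
            for a, (i, j) in enumerate(cands):
                # unary term on the diagonal: descriptor similarity
                M[a, a] = np.exp(-np.linalg.norm(desc_a[i] - desc_b[j]))
                for b, (k, l) in enumerate(cands):
                    if a != b and i != k and j != l:
                        # pairwise term: geometric consistency of the two matches
                        d = abs(np.linalg.norm(pts_a[i] - pts_a[k])
                                - np.linalg.norm(pts_b[j] - pts_b[l]))
                        M[a, b] = np.exp(-d / sigma)
            _, vecs = np.linalg.eigh(M)
            score = np.abs(vecs[:, -1])     # principal eigenvector of the affinity
            used_i, used_j, matches = set(), set(), []
            for idx in np.argsort(-score):  # greedy one-to-one discretisation
                i, j = cands[idx]
                if i not in used_i and j not in used_j:
                    matches.append((i, j))
                    used_i.add(i); used_j.add(j)
            return matches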

    3DMIT: 3D Multi-modal Instruction Tuning for Scene Understanding

    The remarkable potential of multi-modal large language models (MLLMs) in comprehending both visual and language information has been widely acknowledged. However, the scarcity of 3D scene-language pairs in comparison to their 2D counterparts, coupled with the limited ability of existing approaches to give LLMs an understanding of 3D scenes, poses a significant challenge. In response, we collect and construct an extensive dataset comprising 75K instruction-response pairs tailored for 3D scenes. This dataset addresses tasks related to 3D VQA, 3D grounding, and 3D conversation. To further enhance the integration of 3D spatial information into LLMs, we introduce a novel and efficient prompt tuning paradigm, 3DMIT. This paradigm eliminates the alignment stage between 3D scenes and language, and extends the instruction prompt with 3D modality information covering the entire scene and segmented objects. We evaluate the effectiveness of our method across diverse tasks in the 3D scene domain and find that our approach serves as a strategic means to enrich LLMs' comprehension of the 3D world. Our code is available at https://github.com/staymylove/3DMIT.
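
    The idea of extending the prompt with 3D modality tokens might look like the hypothetical PyTorch sketch below, where projected scene and object embeddings replace placeholder tokens before the LLM forward pass; the class, token names, and dimensions are assumptions, and the repository above holds the actual implementation.

        import torch
        import torch.nn as nn

        class SceneProjector(nn.Module):
            """Project scene- and object-level 3D features into the LLM's
            token-embedding space so they can be spliced into the prompt."""
            def __init__(self, feat_dim=512, llm_dim=4096):
                super().__init__()
                self.proj = nn.Linear(feat_dim, llm_dim)

            def forward(self, scene_feat, object_feats):
                # scene_feat: (feat_dim,); object_feats: (num_objects, feat_dim)
                tokens = torch.cat([scene_feat[None, :], object_feats], dim=0)
                return self.proj(tokens)    # (1 + num_objects, llm_dim)

        def build_prompt(instruction):
            # <scene>/<obj> are placeholders swapped for the projected embeddings;
            # training the projector jointly with instruction tuning removes the
            # need for a separate 3D-language alignment stage.
            return f"<scene> <obj> USER: {instruction} ASSISTANT:"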

    OpenShape: Scaling Up 3D Shape Representation Towards Open-World Understanding

    We introduce OpenShape, a method for learning multi-modal joint representations of text, images, and point clouds. We adopt the commonly used multi-modal contrastive learning framework for representation alignment, but with a specific focus on scaling up 3D representations to enable open-world 3D shape understanding. To achieve this, we scale up training data by ensembling multiple 3D datasets and propose several strategies to automatically filter and enrich noisy text descriptions. We also explore and compare strategies for scaling 3D backbone networks and introduce a novel hard negative mining module for more efficient training. We evaluate OpenShape on zero-shot 3D classification benchmarks and demonstrate its superior capabilities for open-world recognition. Specifically, OpenShape achieves a zero-shot accuracy of 46.8% on the 1,156-category Objaverse-LVIS benchmark, compared to less than 10% for existing methods. OpenShape also achieves an accuracy of 85.3% on ModelNet40, outperforming previous zero-shot baseline methods by 20% and performing on par with some fully-supervised methods. Furthermore, we show that our learned embeddings encode a wide range of visual and semantic concepts (e.g., subcategories, color, shape, style) and facilitate fine-grained text-3D and image-3D interactions. Due to their alignment with CLIP embeddings, our learned shape representations can also be integrated with off-the-shelf CLIP-based models for various applications, such as point cloud captioning and point cloud-conditioned image generation. Project website: https://colin97.github.io/OpenShape
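
    A minimal sketch of such a contrastive alignment objective, a standard symmetric InfoNCE loss rather than OpenShape's exact training code, could look as follows in PyTorch.

        import torch
        import torch.nn.functional as F

        def contrastive_loss(pc_emb, other_emb, temperature=0.07):
            """InfoNCE between point-cloud embeddings and the matching text (or
            image) embeddings; row i of both tensors describes the same shape."""
            pc = F.normalize(pc_emb, dim=-1)
            other = F.normalize(other_emb, dim=-1)
            logits = pc @ other.t() / temperature   # (B, B) cosine similarities
            targets = torch.arange(len(pc), device=pc.device)
            # symmetric over both matching directions
            return 0.5 * (F.cross_entropy(logits, targets)
                          + F.cross_entropy(logits.t(), targets))

        # the full objective would sum branches against frozen CLIP embeddings:
        # loss = contrastive_loss(pc, clip_text) + contrastive_loss(pc, clip_image)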

    3D registration and integrated segmentation framework for heterogeneous unmanned robotic systems

    The paper proposes a novel framework for registering and segmenting 3D point clouds of large-scale natural terrain and complex environments acquired by a multi-sensor heterogeneous robotic system consisting of unmanned aerial and ground vehicles. The framework comprises data acquisition and pre-processing, 3D heterogeneous registration, and integrated multi-sensor segmentation modules. The first module provides robust and accurate homogeneous registration of 3D environmental models from measurements acquired by the ground (UGV) and aerial (UAV) robots. For 3D UGV registration, we propose a novel local minima escape ICP (LME-ICP) method, which extends the well-known iterative closest point (ICP) algorithm with local minima estimation and local minima escape mechanisms. It does not require any prior pose estimate from sensing systems such as odometry, a global positioning system (GPS), or inertial measurement units (IMU). The 3D UAV registration is performed using the Structure from Motion (SfM) approach. To improve and speed up outlier removal for large-scale outdoor environments, we introduce the Fast Cluster Statistical Outlier Removal (FCSOR) method, which filters out noise and downsamples the input data, sparing computational and memory resources for subsequent processing steps. We then co-register a point cloud acquired by a laser range finder (UGV) with a point cloud generated from images (UAV) by the SfM method. The 3D heterogeneous registration module consists of a semi-automated 3D scan registration system, developed to overcome the shortcomings of existing fully automated 3D registration approaches. This semi-automated system is based on the novel Scale Invariant Registration Method (SIRM), which provides the initial scaling between two heterogeneous point clouds and an adaptive mechanism for tuning the mean scale, based on the difference between two consecutive estimates of the point clouds' alignment error. Once aligned, the resulting homogeneous ground-aerial point cloud is further processed by a segmentation module, for which we propose a system for integrated multi-sensor segmentation of 3D point clouds. This system follows a two-step sequence: ground-object segmentation and colour-based region-growing segmentation. The experimental validation of the proposed 3D heterogeneous registration and integrated segmentation framework was performed on large-scale datasets representing unstructured outdoor environments, demonstrating the potential and benefits of the proposed semi-automated 3D registration system in real-world environments.
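
    For reference, the sketch below implements plain statistical outlier removal, the baseline that FCSOR speeds up by first clustering the cloud; the clustered, downsampling variant itself is not reproduced here, and the parameters are illustrative.

        import numpy as np
        from scipy.spatial import cKDTree

        def statistical_outlier_removal(points, k=16, std_ratio=2.0):
            """Drop points whose mean distance to their k nearest neighbours is
            more than std_ratio standard deviations above the global mean."""
            tree = cKDTree(points)
            dists, _ = tree.query(points, k=k + 1)  # k+1: nearest hit is the point itself
            mean_d = dists[:, 1:].mean(axis=1)
            thresh = mean_d.mean() + std_ratio * mean_d.std()
            return points[mean_d < thresh]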