    Efficient Feature Matching for Large-scale Images based on Cascade Hash and Local Geometric Constraint

    Feature matching plays a crucial role in 3D reconstruction by providing correspondences between overlapping images, and its accuracy and efficiency significantly affect reconstruction performance. The widely used framework, which combines exhaustive nearest neighbor search (NNS) between descriptors with RANSAC-based geometric estimation, is inefficient and unreliable for large-scale UAV images. Inspired by indexing-based NNS, this paper implements an efficient feature matching method for large-scale images based on Cascade Hashing and local geometric constraints. The proposed method improves on traditional feature matching approaches by combining image retrieval, data scheduling, and GPU-accelerated Cascade Hashing, and it applies a local geometric constraint to filter the matching results. On the one hand, GPU-accelerated Cascade Hashing generates compact and discriminative hash codes from image features, which speeds up the initial matching and significantly reduces the search space and time complexity. On the other hand, once the initial matching is complete, a local geometric constraint filters the initial matches to improve their accuracy. Together these form a three-tier framework based on data scheduling, GPU-accelerated Cascade Hashing, and local geometric constraints. We conducted experiments on two sets of large-scale UAV image data, comparing our method with SIFTGPU in terms of initial matching, outlier rejection, and 3D reconstruction. The results demonstrate that our method achieves a feature matching speed 2.0 times that of SIFTGPU while maintaining matching accuracy and producing comparable reconstruction results. This suggests that our method holds promise for efficiently addressing large-scale image matching
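
    The hash-then-verify idea in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's GPU-accelerated Cascade Hashing: plain random-hyperplane hashing stands in for it, candidates are ranked by Hamming distance and then re-ranked by Euclidean distance with a ratio test, and the local geometric constraint is omitted. All function names and parameters (`n_bits`, `top_k`, `ratio`) are hypothetical.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (assumed stand-in for a learned hash)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(desc, planes):
    """Pack the sign of each projection into one integer bit string."""
    code = 0
    for plane in planes:
        dot = sum(p * d for p, d in zip(plane, desc))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer hash codes."""
    return bin(a ^ b).count("1")


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match(query_descs, train_descs, n_bits=32, top_k=4, ratio=0.8, seed=0):
    """Return (query_index, train_index) pairs that pass the ratio test."""
    planes = make_hyperplanes(len(train_descs[0]), n_bits, seed)
    train_codes = [hash_code(d, planes) for d in train_descs]
    matches = []
    for qi, q in enumerate(query_descs):
        qc = hash_code(q, planes)
        # Coarse step: keep the top_k candidates by Hamming distance.
        cand = sorted(range(len(train_descs)),
                      key=lambda j: hamming(qc, train_codes[j]))[:top_k]
        # Fine step: re-rank the short list by Euclidean distance.
        ranked = sorted(cand, key=lambda j: sq_dist(q, train_descs[j]))
        best = ranked[0]
        if len(ranked) > 1:
            d1 = sq_dist(q, train_descs[best])
            d2 = sq_dist(q, train_descs[ranked[1]])
            # Ratio test on squared distances, hence ratio ** 2.
            if d1 > (ratio ** 2) * d2:
                continue
        matches.append((qi, best))
    return matches
```

    Comparing integer codes with XOR and a bit count is what makes the coarse step cheap; only the short candidate list needs full descriptor distances.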

    Geometry-driven feature detection

    Matching images taken from different viewpoints is a fundamental step for many computer vision applications including 3D reconstruction, scene recognition, virtual reality, robot localization, etc. The typical approaches detect feature keypoints based on local properties to achieve robustness to viewpoint changes, and establish correspondences between keypoints to recover the 3D geometry or determine the similarity between images. The complexity of perspective distortion challenges the detection of viewpoint invariant features; the lack of 3D geometric information about local features makes their matching inefficient. In this thesis, I explore feature detection based on 3D geometric information for improved projective invariance. The main novel research contributions of this thesis are as follows. First, I give a projective invariant feature detection method that exploits 3D structures recovered from simple stereo matching. By leveraging the rich geometric information of the detected features, I present an efficient 3D matching algorithm to handle large viewpoint changes. Second, I propose a compact high-level feature detector that robustly extracts repetitive structures in urban scenes, which allows efficient wide-baseline matching. I further introduce a novel single-view reconstruction approach to recover the 3D dense geometry of the repetition-based features

    3D Reconstruction of external anatomical structures from image sequences

    Three-dimensional (3D) reconstruction of objects from images has been one of the major topics in Computer Vision. Recently, volumetric methods have been successfully used in 3D reconstruction of objects with complex shapes. Compared with stereo-based methods, they are more efficient in building 3D models of smooth objects [1]. They work in the object volumetric space and do not require a matching process between the images used, which is usually very complex with smooth objects. Volumetric methods represent the object with a finite set of geometric primitives, usually called voxels [1, 2]. The objective of the work described here is to build a 3D model of the object with good precision and photorealistic appearance. To that end, an image sequence around the object is acquired with an off-the-shelf CCD camera, without imposing any restriction on the motion involved. The camera is then calibrated using Zhang's method [3], the object is segmented in all input images, and finally its 3D model is built. The volumetric approach employed uses octrees to represent the volume of the object, which is interactively refined in order to achieve the final shape of the object. Two objects were used to test the approach experimentally, a parallelepiped and a human hand model, and the results obtained were quite satisfactory. Future work will address the implementation of an auto-calibration method and the 3D reconstruction of deformable objects
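
    To make concrete how a volumetric method can avoid image-to-image matching, here is a minimal Python sketch of silhouette-based voxel carving, one common volumetric technique. It is an assumption that it resembles the approach in the abstract above, which uses octrees and a calibrated camera; this sketch uses a flat voxel list and hypothetical axis-aligned orthographic projections.

```python
def carve(voxels, views):
    """Keep only voxels whose projection lies inside every silhouette.

    voxels: iterable of (x, y, z) integer voxel coordinates.
    views:  list of (project, silhouette) pairs, where project maps a voxel
            to a pixel (u, v) and silhouette is the set of object pixels.
    """
    return [v for v in voxels
            if all(project(v) in silhouette for project, silhouette in views)]


def project_along_z(voxel):
    """Hypothetical orthographic camera looking down the z axis."""
    x, y, _ = voxel
    return (x, y)


def project_along_x(voxel):
    """Hypothetical orthographic camera looking down the x axis."""
    _, y, z = voxel
    return (y, z)


# A 3x3x3 block of candidate voxels and two one-pixel silhouettes.
grid = [(x, y, z) for x in range(3) for y in range(3) for z in range(3)]
views = [(project_along_z, {(1, 1)}), (project_along_x, {(1, 1)})]
kept = carve(grid, views)
```

    Each voxel is tested independently against the segmented images, so no correspondences between images are needed.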

    Semantic 3D Occupancy Mapping through Efficient High Order CRFs

    Semantic 3D mapping can be used for many applications such as robot navigation and virtual interaction. In recent years, there has been great progress in semantic segmentation and geometric 3D mapping. However, it is still challenging to combine these two tasks for accurate and large-scale semantic mapping from images. In this paper, we propose an incremental and (near) real-time semantic mapping system. A 3D scrolling occupancy grid map is built to represent the world, which is memory- and computation-efficient and bounded for large-scale environments. We use the CNN segmentation as a prior prediction and further optimize the 3D grid labels through a novel CRF model. Superpixels are used to enforce smoothness and form a robust P^N high-order potential. An efficient mean-field inference is developed for the graph optimization. We evaluate our system on the KITTI dataset and improve the segmentation accuracy by 10% over existing systems. Comment: IROS 201
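
    The mean-field step mentioned in the abstract above can be sketched in Python on a simplified model. This sketch assumes a pairwise Potts CRF with parallel updates, not the paper's high-order P^N potentials or its efficient inference scheme; the function names, the `weight` parameter, and the toy probabilities are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_field(unary, neighbors, weight=1.0, iters=10):
    """Approximate per-node label marginals of a pairwise Potts CRF.

    unary[i][l]:  cost (negative log prior probability) of label l at node i.
    neighbors[i]: indices of the nodes connected to node i.
    """
    n_labels = len(unary[0])
    # Initialise each marginal from the unary term alone.
    q = [softmax([-c for c in costs]) for costs in unary]
    for _ in range(iters):
        new_q = []
        for i, costs in enumerate(unary):
            scores = []
            for l in range(n_labels):
                # Expected Potts penalty: neighbours' mass on other labels.
                penalty = weight * sum(1.0 - q[j][l] for j in neighbors[i])
                scores.append(-costs[l] - penalty)
            new_q.append(softmax(scores))
        q = new_q  # parallel update of all nodes
    return q


# Toy chain of three cells: the middle one weakly prefers label 1,
# while both of its neighbours strongly prefer label 0.
priors = [[0.9, 0.1], [0.4, 0.6], [0.9, 0.1]]
unary = [[-math.log(p) for p in node] for node in priors]
neighbors = [[1], [0, 2], [1]]
marginals = mean_field(unary, neighbors, weight=2.0)
labels = [max(range(2), key=lambda l: m[l]) for m in marginals]
```

    In this toy chain the smoothness term overrides the weak prior of the middle cell, which is the effect the CRF is meant to have on noisy CNN predictions.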

    Semantically Informed Multiview Surface Refinement

    We present a method to jointly refine the geometry and semantic segmentation of 3D surface meshes. Our method alternates between updating the shape and the semantic labels. In the geometry refinement step, the mesh is deformed with variational energy minimization, such that it simultaneously maximizes photo-consistency and the compatibility of the semantic segmentations across a set of calibrated images. Label-specific shape priors account for interactions between the geometry and the semantic labels in 3D. In the semantic segmentation step, the labels on the mesh are updated with MRF inference, such that they are compatible with the semantic segmentations in the input images. Also, this step includes prior assumptions about the surface shape of different semantic classes. The priors induce a tight coupling, where semantic information influences the shape update and vice versa. Specifically, we introduce priors that favor (i) adaptive smoothing, depending on the class label; (ii) straightness of class boundaries; and (iii) semantic labels that are consistent with the surface orientation. The novel mesh-based reconstruction is evaluated in a series of experiments with real and synthetic data. We compare both to state-of-the-art, voxel-based semantic 3D reconstruction, and to purely geometric mesh refinement, and demonstrate that the proposed scheme yields improved 3D geometry as well as an improved semantic segmentation

    Semantic Visual Localization

    Robust visual localization under a wide range of viewing conditions is a fundamental problem in computer vision. Handling the difficult cases of this problem is not only very challenging but also of high practical relevance, e.g., in the context of life-long localization for augmented reality or autonomous robots. In this paper, we propose a novel approach based on a joint 3D geometric and semantic understanding of the world, enabling it to succeed under conditions where previous approaches failed. Our method leverages a novel generative model for descriptor learning, trained on semantic scene completion as an auxiliary task. The resulting 3D descriptors are robust to missing observations by encoding high-level 3D geometric and semantic information. Experiments on several challenging large-scale localization datasets demonstrate reliable localization under extreme viewpoint, illumination, and geometry changes