
    Location Estimation of a Photo: A Geo-signature MapReduce Workflow

    Location estimation of a photo, a new branch of image retrieval, is the task of finding where the photo was taken. Large numbers of photos are shared on social media, and the location of photos without geo-tags can be estimated with the help of millions of geo-tagged photos from the same sources. Recent research has addressed location estimation of photos, but most of it neglects to define what makes one place unique, that is, what distinguishes it entirely from other places. In this paper, we design a workflow named G-sigMR (Geo-signature MapReduce) to improve recognition performance. Our workflow generates a Geo-signature, a representation of the uniqueness of a location, which is summarized from visual synonyms and uses the MapReduce structure for indexing a large-scale dataset. To validate it for image retrieval, G-sigMR was quantitatively evaluated on the standard benchmark for location estimation and compared with other well-known approaches (IM2GPS, SC, CS, MSER, VSA and VCG) in terms of average recognition rate. In these results, G-sigMR outperformed the previous approaches.
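
    The abstract above does not describe G-sigMR's internals, so the following is only a generic sketch of MapReduce-style indexing of visual words by location, not the authors' method. The toy data, the function names, and the voting rule are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: photo id -> (location label, visual word ids).
PHOTOS = {
    "p1": ("eiffel", [3, 7, 7]),
    "p2": ("eiffel", [3, 9]),
    "p3": ("louvre", [7, 11]),
}


def map_phase(photos):
    """Emit one (visual_word, location) pair per word occurrence."""
    for location, words in photos.values():
        for word in words:
            yield word, location


def reduce_phase(pairs):
    """Group pairs by visual word and count occurrences per location."""
    index = defaultdict(lambda: defaultdict(int))
    for word, location in pairs:
        index[word][location] += 1
    return {word: dict(counts) for word, counts in index.items()}


def estimate_location(index, query_words):
    """Vote for the location whose indexed words best match the query.

    Simple count-based voting is an assumption, not the paper's scoring.
    """
    votes = defaultdict(int)
    for word in query_words:
        for location, count in index.get(word, {}).items():
            votes[location] += count
    return max(votes, key=votes.get) if votes else None


index = reduce_phase(map_phase(PHOTOS))
```

    In a real deployment the map and reduce phases would run on separate workers over shards of the photo collection; here they are plain functions so the data flow stays visible.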

    On-device Scalable Image-based Localization via Prioritized Cascade Search and Fast One-Many RANSAC.

    We present the design of an entire on-device system for large-scale urban localization using images. The proposed design integrates compact image retrieval and 2D-3D correspondence search to estimate the location in extensive city regions. Our design is GPS agnostic and does not require network connection. In order to overcome the resource constraints of mobile devices, we propose a system design that leverages the scalability advantage of image retrieval and accuracy of 3D model-based localization. Furthermore, we propose a new hashing-based cascade search for fast computation of 2D-3D correspondences. In addition, we propose a new one-many RANSAC for accurate pose estimation. The new one-many RANSAC addresses the challenge of repetitive building structures (e.g. windows, balconies) in urban localization. Extensive experiments demonstrate that our 2D-3D correspondence search achieves state-of-the-art localization accuracy on multiple benchmark datasets. Furthermore, our experiments on a large Google Street View (GSV) image dataset show the potential of large-scale localization entirely on a typical mobile device

    InLoc: Indoor Visual Localization with Dense Matching and View Synthesis

    We seek to predict the 6 degree-of-freedom (6DoF) pose of a query photograph with respect to a large indoor 3D map. The contributions of this work are three-fold. First, we develop a new large-scale visual localization method targeted for indoor environments. The method proceeds along three steps: (i) efficient retrieval of candidate poses that ensures scalability to large-scale environments, (ii) pose estimation using dense matching rather than local features to deal with textureless indoor scenes, and (iii) pose verification by virtual view synthesis to cope with significant changes in viewpoint, scene layout, and occluders. Second, we collect a new dataset with reference 6DoF poses for large-scale indoor localization. Query photographs are captured by mobile phones at a different time than the reference 3D map, thus presenting a realistic indoor localization scenario. Third, we demonstrate that our method significantly outperforms current state-of-the-art indoor localization approaches on this new challenging data

    Coding local and global binary visual features extracted from video sequences

    Binary local features represent an effective alternative to real-valued descriptors, leading to comparable results for many visual analysis tasks, while being characterized by significantly lower computational complexity and memory requirements. When dealing with large collections, a more compact representation based on global features is often preferred, which can be obtained from local features by means of, e.g., the Bag-of-Visual-Word (BoVW) model. Several applications, including for example visual sensor networks and mobile augmented reality, require visual features to be transmitted over a bandwidth-limited network, thus calling for coding techniques that aim at reducing the required bit budget, while attaining a target level of efficiency. In this paper we investigate a coding scheme tailored to both local and global binary features, which aims at exploiting both spatial and temporal redundancy by means of intra- and inter-frame coding. In this respect, the proposed coding scheme can be conveniently adopted to support the Analyze-Then-Compress (ATC) paradigm. That is, visual features are extracted from the acquired content, encoded at remote nodes, and finally transmitted to a central controller that performs visual analysis. This is in contrast with the traditional approach, in which visual content is acquired at a node, compressed and then sent to a central unit for further processing, according to the Compress-Then-Analyze (CTA) paradigm. In this paper we experimentally compare ATC and CTA by means of rate-efficiency curves in the context of two different visual analysis tasks: homography estimation and content-based retrieval. Our results show that the novel ATC paradigm based on the proposed coding primitives can be competitive with CTA, especially in bandwidth-limited scenarios. Comment: submitted to IEEE Transactions on Image Processing
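
    As a rough illustration of why binary descriptors are cheap to compare and why temporal redundancy helps, the sketch below computes a Hamming distance and an XOR residual between descriptors from consecutive frames. The abstract does not specify the actual coding scheme, so the XOR residual and all names here are assumptions, not the paper's method.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def inter_frame_residual(current: int, reference: int) -> int:
    """XOR residual against a descriptor from the previous frame.

    When the two descriptors are similar the residual has few set bits,
    which an entropy coder could represent compactly (assumed, not shown).
    """
    return current ^ reference


def reconstruct(residual: int, reference: int) -> int:
    """Invert the residual at the decoder using the same reference."""
    return residual ^ reference


# Hypothetical 8-bit descriptors from two consecutive frames.
previous_frame = 0b10110010
current_frame = 0b10110110

residual = inter_frame_residual(current_frame, previous_frame)
```

    Because XOR is its own inverse, the decoder recovers the current descriptor exactly from the residual and the reference.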

    Understanding the Limitations of CNN-based Absolute Camera Pose Regression

    Visual localization is the task of accurate camera pose estimation in a known scene. It is a key problem in computer vision and robotics, with applications including self-driving cars, Structure-from-Motion, SLAM, and Mixed Reality. Traditionally, the localization problem has been tackled using 3D geometry. Recently, end-to-end approaches based on convolutional neural networks have become popular. These methods learn to directly regress the camera pose from an input image. However, they do not achieve the same level of pose accuracy as 3D structure-based methods. To understand this behavior, we develop a theoretical model for camera pose regression. We use our model to predict failure cases for pose regression techniques and verify our predictions through experiments. We furthermore use our model to show that pose regression is more closely related to pose approximation via image retrieval than to accurate pose estimation via 3D structure. A key result is that current approaches do not consistently outperform a handcrafted image retrieval baseline. This clearly shows that additional research is needed before pose regression algorithms are ready to compete with structure-based methods. Comment: Initial version of a paper accepted to CVPR 201
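
    Pose approximation via image retrieval, as mentioned in the abstract above, can be pictured as returning the known pose of the most similar database image. The sketch below is a minimal version of that idea under assumed inputs: the descriptors, the Euclidean distance, and the `nearest_pose` helper are hypothetical and are not the baseline evaluated in the paper.

```python
import math


def nearest_pose(query_descriptor, database):
    """Return the pose stored with the closest database descriptor.

    `database` is a list of (descriptor, pose) pairs, where each
    descriptor is a sequence of floats of the same length as the query.
    """
    best = min(database, key=lambda entry: math.dist(query_descriptor, entry[0]))
    return best[1]


# Hypothetical database: global image descriptor -> (x, y, heading in degrees).
DATABASE = [
    ((0.1, 0.9), (0.0, 0.0, 90.0)),
    ((0.8, 0.2), (5.0, 1.0, 180.0)),
    ((0.5, 0.5), (2.5, 0.5, 135.0)),
]

approx_pose = nearest_pose((0.75, 0.25), DATABASE)
```

    The returned pose can never be more accurate than the spacing of the database images, which is the sense in which retrieval approximates rather than estimates the pose.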