15,280 research outputs found

    High-Precision Localization Using Ground Texture

    Full text link
    Location-aware applications play an increasingly critical role in everyday life. However, satellite-based localization (e.g., GPS) has limited accuracy and can be unusable in dense urban areas and indoors. We introduce an image-based global localization system that is accurate to a few millimeters and performs reliable localization both indoors and outside. The key idea is to capture and index distinctive local keypoints in ground textures. This is based on the observation that ground textures including wood, carpet, tile, concrete, and asphalt may look random and homogeneous, but all contain cracks, scratches, or unique arrangements of fibers. These imperfections are persistent, and can serve as local features. Our system incorporates a downward-facing camera to capture the fine texture of the ground, together with an image processing pipeline that locates the captured texture patch in a compact database constructed offline. We demonstrate the capability of our system to robustly, accurately, and quickly locate test images on various types of outdoor and indoor ground surfaces

    Attention and Anticipation in Fast Visual-Inertial Navigation

    Get PDF
    We study a Visual-Inertial Navigation (VIN) problem in which a robot needs to estimate its state using an on-board camera and an inertial sensor, without any prior knowledge of the external environment. We consider the case in which the robot can allocate limited resources to VIN, due to tight computational constraints. Therefore, we answer the following question: under limited resources, what are the most relevant visual cues to maximize the performance of visual-inertial navigation? Our approach has four key ingredients. First, it is task-driven, in that the selection of the visual cues is guided by a metric quantifying the VIN performance. Second, it exploits the notion of anticipation, since it uses a simplified model for forward-simulation of robot dynamics, predicting the utility of a set of visual cues over a future time horizon. Third, it is efficient and easy to implement, since it leads to a greedy algorithm for the selection of the most relevant visual cues. Fourth, it provides formal performance guarantees: we leverage submodularity to prove that the greedy selection cannot be far from the optimal (combinatorial) selection. Simulations and real experiments on agile drones show that our approach ensures state-of-the-art VIN performance while maintaining a lean processing time. In the easy scenarios, our approach outperforms appearance-based feature selection in terms of localization errors. In the most challenging scenarios, it enables accurate visual-inertial navigation while appearance-based feature selection fails to track robot's motion during aggressive maneuvers.Comment: 20 pages, 7 figures, 2 table

    Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age

    Get PDF
    Simultaneous Localization and Mapping (SLAM)consists in the concurrent construction of a model of the environment (the map), and the estimation of the state of the robot moving within it. The SLAM community has made astonishing progress over the last 30 years, enabling large-scale real-world applications, and witnessing a steady transition of this technology to industry. We survey the current state of SLAM. We start by presenting what is now the de-facto standard formulation for SLAM. We then review related work, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers. This paper simultaneously serves as a position paper and tutorial to those who are users of SLAM. By looking at the published research with a critical eye, we delineate open challenges and new research issues, that still deserve careful scientific investigation. The paper also contains the authors' take on two questions that often animate discussions during robotics conferences: Do robots need SLAM? and Is SLAM solved

    MLPnP - A Real-Time Maximum Likelihood Solution to the Perspective-n-Point Problem

    Get PDF
    In this paper, a statistically optimal solution to the Perspective-n-Point (PnP) problem is presented. Many solutions to the PnP problem are geometrically optimal, but do not consider the uncertainties of the observations. In addition, it would be desirable to have an internal estimation of the accuracy of the estimated rotation and translation parameters of the camera pose. Thus, we propose a novel maximum likelihood solution to the PnP problem, that incorporates image observation uncertainties and remains real-time capable at the same time. Further, the presented method is general, as is works with 3D direction vectors instead of 2D image points and is thus able to cope with arbitrary central camera models. This is achieved by projecting (and thus reducing) the covariance matrices of the observations to the corresponding vector tangent space.Comment: Submitted to the ISPRS congress (2016) in Prague. Oral Presentation. Published in ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., III-3, 131-13

    Probabilistic Global Scale Estimation for MonoSLAM Based on Generic Object Detection

    Full text link
    This paper proposes a novel method to estimate the global scale of a 3D reconstructed model within a Kalman filtering-based monocular SLAM algorithm. Our Bayesian framework integrates height priors over the detected objects belonging to a set of broad predefined classes, based on recent advances in fast generic object detection. Each observation is produced on single frames, so that we do not need a data association process along video frames. This is because we associate the height priors with the image region sizes at image places where map features projections fall within the object detection regions. We present very promising results of this approach obtained on several experiments with different object classes.Comment: Int. Workshop on Visual Odometry, CVPR, (July 2017

    Efficient Constellation-Based Map-Merging for Semantic SLAM

    Full text link
    Data association in SLAM is fundamentally challenging, and handling ambiguity well is crucial to achieve robust operation in real-world environments. When ambiguous measurements arise, conservatism often mandates that the measurement is discarded or a new landmark is initialized rather than risking an incorrect association. To address the inevitable `duplicate' landmarks that arise, we present an efficient map-merging framework to detect duplicate constellations of landmarks, providing a high-confidence loop-closure mechanism well-suited for object-level SLAM. This approach uses an incrementally-computable approximation of landmark uncertainty that only depends on local information in the SLAM graph, avoiding expensive recovery of the full system covariance matrix. This enables a search based on geometric consistency (GC) (rather than full joint compatibility (JC)) that inexpensively reduces the search space to a handful of `best' hypotheses. Furthermore, we reformulate the commonly-used interpretation tree to allow for more efficient integration of clique-based pairwise compatibility, accelerating the branch-and-bound max-cardinality search. Our method is demonstrated to match the performance of full JC methods at significantly-reduced computational cost, facilitating robust object-based loop-closure over large SLAM problems.Comment: Accepted to IEEE International Conference on Robotics and Automation (ICRA) 201

    Person re-identification via efficient inference in fully connected CRF

    Full text link
    In this paper, we address the problem of person re-identification problem, i.e., retrieving instances from gallery which are generated by the same person as the given probe image. This is very challenging because the person's appearance usually undergoes significant variations due to changes in illumination, camera angle and view, background clutter, and occlusion over the camera network. In this paper, we assume that the matched gallery images should not only be similar to the probe, but also be similar to each other, under suitable metric. We express this assumption with a fully connected CRF model in which each node corresponds to a gallery and every pair of nodes are connected by an edge. A label variable is associated with each node to indicate whether the corresponding image is from target person. We define unary potential for each node using existing feature calculation and matching techniques, which reflect the similarity between probe and gallery image, and define pairwise potential for each edge in terms of a weighed combination of Gaussian kernels, which encode appearance similarity between pair of gallery images. The specific form of pairwise potential allows us to exploit an efficient inference algorithm to calculate the marginal distribution of each label variable for this dense connected CRF. We show the superiority of our method by applying it to public datasets and comparing with the state of the art.Comment: 7 pages, 4 figure
    • …
    corecore