904 research outputs found

    Planar PØP: feature-less pose estimation with applications in UAV localization

    Get PDF
    © 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.We present a featureless pose estimation method that, in contrast to current Perspective-n-Point (PnP) approaches, it does not require n point correspondences to obtain the camera pose, allowing for pose estimation from natural shapes that do not necessarily have distinguished features like corners or intersecting edges. Instead of using n correspondences (e.g. extracted with a feature detector) we will use the raw polygonal representation of the observed shape and directly estimate the pose in the pose-space of the camera. This method compared with a general PnP method, does not require n point correspondences neither a priori knowledge of the object model (except the scale), which is registered with a picture taken from a known robot pose. Moreover, we achieve higher precision because all the information of the shape contour is used to minimize the area between the projected and the observed shape contours. To emphasize the non-use of n point correspondences between the projected template and observed contour shape, we call the method Planar PØP. The method is shown both in simulation and in a real application consisting on a UAV localization where comparisons with a precise ground-truth are provided.Peer ReviewedPostprint (author's final draft

    Map-Based Localization for Unmanned Aerial Vehicle Navigation

    Get PDF
    Unmanned Aerial Vehicles (UAVs) require precise pose estimation when navigating in indoor and GNSS-denied / GNSS-degraded outdoor environments. The possibility of crashing in these environments is high, as spaces are confined, with many moving obstacles. There are many solutions for localization in GNSS-denied environments, and many different technologies are used. Common solutions involve setting up or using existing infrastructure, such as beacons, Wi-Fi, or surveyed targets. These solutions were avoided because the cost should be proportional to the number of users, not the coverage area. Heavy and expensive sensors, for example a high-end IMU, were also avoided. Given these requirements, a camera-based localization solution was selected for the sensor pose estimation. Several camera-based localization approaches were investigated. Map-based localization methods were shown to be the most efficient because they close loops using a pre-existing map, thus the amount of data and the amount of time spent collecting data are reduced as there is no need to re-observe the same areas multiple times. This dissertation proposes a solution to address the task of fully localizing a monocular camera onboard a UAV with respect to a known environment (i.e., it is assumed that a 3D model of the environment is available) for the purpose of navigation for UAVs in structured environments. Incremental map-based localization involves tracking a map through an image sequence. When the map is a 3D model, this task is referred to as model-based tracking. A by-product of the tracker is the relative 3D pose (position and orientation) between the camera and the object being tracked. State-of-the-art solutions advocate that tracking geometry is more robust than tracking image texture because edges are more invariant to changes in object appearance and lighting. However, model-based trackers have been limited to tracking small simple objects in small environments. An assessment was performed in tracking larger, more complex building models, in larger environments. A state-of-the art model-based tracker called ViSP (Visual Servoing Platform) was applied in tracking outdoor and indoor buildings using a UAVs low-cost camera. The assessment revealed weaknesses at large scales. Specifically, ViSP failed when tracking was lost, and needed to be manually re-initialized. Failure occurred when there was a lack of model features in the cameras field of view, and because of rapid camera motion. Experiments revealed that ViSP achieved positional accuracies similar to single point positioning solutions obtained from single-frequency (L1) GPS observations standard deviations around 10 metres. These errors were considered to be large, considering the geometric accuracy of the 3D model used in the experiments was 10 to 40 cm. The first contribution of this dissertation proposes to increase the performance of the localization system by combining ViSP with map-building incremental localization, also referred to as simultaneous localization and mapping (SLAM). Experimental results in both indoor and outdoor environments show sub-metre positional accuracies were achieved, while reducing the number of tracking losses throughout the image sequence. It is shown that by integrating model-based tracking with SLAM, not only does SLAM improve model tracking performance, but the model-based tracker alleviates the computational expense of SLAMs loop closing procedure to improve runtime performance. Experiments also revealed that ViSP was unable to handle occlusions when a complete 3D building model was used, resulting in large errors in its pose estimates. The second contribution of this dissertation is a novel map-based incremental localization algorithm that improves tracking performance, and increases pose estimation accuracies from ViSP. The novelty of this algorithm is the implementation of an efficient matching process that identifies corresponding linear features from the UAVs RGB image data and a large, complex, and untextured 3D model. The proposed model-based tracker improved positional accuracies from 10 m (obtained with ViSP) to 46 cm in outdoor environments, and improved from an unattainable result using VISP to 2 cm positional accuracies in large indoor environments. The main disadvantage of any incremental algorithm is that it requires the camera pose of the first frame. Initialization is often a manual process. The third contribution of this dissertation is a map-based absolute localization algorithm that automatically estimates the camera pose when no prior pose information is available. The method benefits from vertical line matching to accomplish a registration procedure of the reference model views with a set of initial input images via geometric hashing. Results demonstrate that sub-metre positional accuracies were achieved and a proposed enhancement of conventional geometric hashing produced more correct matches - 75% of the correct matches were identified, compared to 11%. Further the number of incorrect matches was reduced by 80%

    On Rendering Synthetic Images for Training an Object Detector

    Get PDF
    We propose a novel approach to synthesizing images that are effective for training object detectors. Starting from a small set of real images, our algorithm estimates the rendering parameters required to synthesize similar images given a coarse 3D model of the target object. These parameters can then be reused to generate an unlimited number of training images of the object of interest in arbitrary 3D poses, which can then be used to increase classification performances. A key insight of our approach is that the synthetically generated images should be similar to real images, not in terms of image quality, but rather in terms of features used during the detector training. We show in the context of drone, plane, and car detection that using such synthetically generated images yields significantly better performances than simply perturbing real images or even synthesizing images in such way that they look very realistic, as is often done when only limited amounts of training data are available

    Cross-View Visual Geo-Localization for Outdoor Augmented Reality

    Full text link
    Precise estimation of global orientation and location is critical to ensure a compelling outdoor Augmented Reality (AR) experience. We address the problem of geo-pose estimation by cross-view matching of query ground images to a geo-referenced aerial satellite image database. Recently, neural network-based methods have shown state-of-the-art performance in cross-view matching. However, most of the prior works focus only on location estimation, ignoring orientation, which cannot meet the requirements in outdoor AR applications. We propose a new transformer neural network-based model and a modified triplet ranking loss for joint location and orientation estimation. Experiments on several benchmark cross-view geo-localization datasets show that our model achieves state-of-the-art performance. Furthermore, we present an approach to extend the single image query-based geo-localization approach by utilizing temporal information from a navigation pipeline for robust continuous geo-localization. Experimentation on several large-scale real-world video sequences demonstrates that our approach enables high-precision and stable AR insertion.Comment: IEEE VR 202

    RPAS AND TLS TECNIQUES FOR ARCHAEOLOGICAL SURVEY: THE CASE STUDY OF THE ARCHAEOLOGICAL SITE OF ERACLEA MINOA (ITALY)

    Get PDF
    Digital documentation and 3D modelling of archaeological sites are important for understanding, definition and recognition of the values of the sites and of the archaeological finds. The most part of archaeological sites are outdoor location, but a cover to preserve the ruins protects often parts of the sites. The possibility to acquire data with different techniques and merge them by using a single reference system allows creating multi-parties models in which 3D representations of the individual objects can be inserted. The paper presents the results of a recent study carried out by Geomatics Laboratory of University of Palermo for the digital documentation and 3D modelling of Eraclea Minoa archaeological site. This site is located near Agrigento, in the south of Sicily (Italy) and is one of the most famous ancient Greek colonies of Sicily. The paper presents the results of the integration of different data source to survey the Eraclea Minoa archaeological site. The application of two highly versatile recording systems, the TLS (Terrestrial Laser Scanning) and the RPAS (Remotely Piloted Aircraft System), allowed the Eraclea Minoa site to be documented in high resolution and with high accuracy. The integration of the two techniques has demonstrated the possibility to obtain high quality and accurate 3D models in archaeological survey

    FROM 3D SURVEY TO DIGITAL REALITY OF A COMPLEX ARCHITECTURE: A DIGITAL WORKFLOW FOR CULTURAL HERITAGE PROMOTION

    Get PDF
    In recent years, the digitalization and dissemination of historical heritage have become crucial nodes in the preservation and valorization of Cultural Heritage (CH). Technologies such as Unmanned Aerial Vehicle (UAV) and terrestrial photogrammetry, Terrestrial Laser Scanning (TLS) and handheld Simultaneous Localisation and Mapping (SLAM) laser scanning allow the generation of digital models of architecture that can be explored through interactive web platforms, such as those based on WebGL graphic library. These are considered one of the most promising innovations for digitizing and sharing CH site due to their application in a wide range of contexts, promoting new forms of interaction with architecture at different scales. Additionally, the use of geomatic tools allows for a more complete 3D reconstruction and evaluation of the results by comparing different techniques. The article focuses on digitization as a tool for documenting and sharing CH assets, with the aim of developing a replicable prototype platform for an immersive Virtual Tour (VT) of an art collection and the architectural complex in which it is resided. In addition, this paper presents the results of a case study conducted at the Ricci Oddi Gallery of Modern Art in Piacenza, Italy. The source code of the implemented application is available on GitHub to permit replicability for other case studies

    Sensor fusion of camera, GPS and IMU using fuzzy adaptive multiple motion models

    Get PDF
    A tracking system that will be used for augmented reality applications has two main requirements: accuracy and frame rate. The first requirement is related to the performance of the pose estimation algorithm and how accurately the tracking system can find the position and orientation of the user in the environment. Accuracy problems of current tracking devices, considering that they are low-cost devices, cause static errors during this motion estimation process. The second requirement is related to dynamic errors (the end-to-end system delay, occurring because of the delay in estimating the motion of the user and displaying images based on this estimate). This paper investigates combining the vision-based estimates with measurements from other sensors, GPS and IMU, in order to improve the tracking accuracy in outdoor environments. The idea of using Fuzzy Adaptive Multiple Models was investigated using a novel fuzzy rule-based approach to decide on the model that results in improved accuracy and faster convergence for the fusion filter. Results show that the developed tracking system is more accurate than a conventional GPS–IMU fusion approach due to additional estimates from a camera and fuzzy motion models. The paper also presents an application in cultural heritage context running at modest frame rates due to the design of the fusion algorithm
    • …
    corecore