28 research outputs found

    Planar PØP: feature-less pose estimation with applications in UAV localization

    Get PDF
    © 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.We present a featureless pose estimation method that, in contrast to current Perspective-n-Point (PnP) approaches, it does not require n point correspondences to obtain the camera pose, allowing for pose estimation from natural shapes that do not necessarily have distinguished features like corners or intersecting edges. Instead of using n correspondences (e.g. extracted with a feature detector) we will use the raw polygonal representation of the observed shape and directly estimate the pose in the pose-space of the camera. This method compared with a general PnP method, does not require n point correspondences neither a priori knowledge of the object model (except the scale), which is registered with a picture taken from a known robot pose. Moreover, we achieve higher precision because all the information of the shape contour is used to minimize the area between the projected and the observed shape contours. To emphasize the non-use of n point correspondences between the projected template and observed contour shape, we call the method Planar PØP. The method is shown both in simulation and in a real application consisting on a UAV localization where comparisons with a precise ground-truth are provided.Peer ReviewedPostprint (author's final draft

    Metrological Characterization of a Laser-Camera 3D Vision System Through Perspective-n-Point Pose Computation and Monte Carlo Simulations

    Get PDF
    Abstract. This study focuses on the metrological characterization of a 3D vision system consisting of the fusion of a CMOS camera sensor with a 2D laser scanner for contactless dimensional measurements. The purpose is to obtain enhanced measurement information by combining two different data sources. On one side, the pose of the target measurand can be estimated by solving the well-known Perspective-n-Point (PnP) problem from the calibrated camera. On the other side, the 2D laser scanner generates a discrete point cloud that describes the profile of the intercepted surface of the same target object. This solution allows the target's geometrical parameters to be estimated through fit-for-purpose algorithms that take the data acquired by the overall system as input. The measurement uncertainty is evaluated by applying the Monte Carlo Method (MCM) to estimate the uncertainty deriving from the Probability Distribution Functions (PDF) of the input variables. The effects of different influence factors were evaluated through a Design of Experiments (DOE) model.
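
    A minimal sketch of the Monte Carlo uncertainty evaluation described above: perturb the PnP image points according to an assumed Gaussian pixel-noise PDF, re-solve the pose in each trial, and report the spread of the resulting poses as the standard uncertainty. The geometry, noise level, and trial count are illustrative assumptions.

        import numpy as np
        import cv2

        K = np.array([[1000.0, 0, 640], [0, 1000.0, 480], [0, 0, 1]])
        dist = np.zeros(5)
        # Four coplanar reference points on the measurand and their nominal image locations.
        object_pts = np.array([[0, 0, 0], [0.1, 0, 0], [0.1, 0.1, 0], [0, 0.1, 0]],
                              dtype=np.float64)
        image_pts = np.array([[600, 400], [700, 405], [695, 505], [595, 500]],
                             dtype=np.float64)

        rng = np.random.default_rng(0)
        sigma_px = 0.5       # assumed standard deviation of image-point noise (pixels)
        n_trials = 10_000
        tvecs = np.empty((n_trials, 3))
        for i in range(n_trials):
            noisy = image_pts + rng.normal(0.0, sigma_px, image_pts.shape)
            ok, rvec, tvec = cv2.solvePnP(object_pts, noisy, K, dist,
                                          flags=cv2.SOLVEPNP_IPPE)
            assert ok, "PnP failed for this noise draw"
            tvecs[i] = tvec.ravel()

        # GUM-style MCM output: mean and standard uncertainty of the translation, per axis.
        print("t mean [m]:", tvecs.mean(axis=0))
        print("u(t)  [m]:", tvecs.std(axis=0, ddof=1))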

    Sample-Efficient Learning to Solve a Real-World Labyrinth Game Using Data-Augmented Model-Based Reinforcement Learning

    Full text link
    Motivated by the challenge of achieving rapid learning in physical environments, this paper presents the development and training of a robotic system designed to navigate and solve a labyrinth game using model-based reinforcement learning techniques. The method extracts low-dimensional observations from camera images, along with a cropped and rectified image patch centered on the current position within the labyrinth, providing valuable information about the labyrinth layout. The control policy is learned purely on the physical system using model-based reinforcement learning, where progress along the labyrinth's path serves as the reward signal. Additionally, we exploit the system's inherent symmetries to augment the training data. Consequently, our approach learns to successfully solve a popular real-world labyrinth game in record time, with only 5 hours of real-world training data.
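
    A minimal sketch of the symmetry-based data augmentation mentioned above: if the labyrinth dynamics are (approximately) invariant under mirroring, every recorded transition yields a second, mirrored transition for free, doubling the model-learning dataset. The state layout and action convention are illustrative assumptions.

        import numpy as np

        def mirror_transition(state, action, next_state):
            """Reflect a transition about the y-axis: negate the x-position,
            x-velocity, and the tilt command acting along that axis.
            Assumed state layout: [x, y, vx, vy, tilt_x, tilt_y]."""
            flip = np.array([-1, 1, -1, 1, -1, 1])
            return state * flip, action * np.array([-1, 1]), next_state * flip

        def augment(dataset):
            """Double the model-learning dataset using the mirror symmetry."""
            augmented = list(dataset)
            for s, a, s_next in dataset:
                augmented.append(mirror_transition(s, a, s_next))
            return augmented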

    Classification of Safety Driver Attention During Autonomous Vehicle Operation

    Full text link
    Despite the continual advances in Advanced Driver Assistance Systems (ADAS) and the development of high-level autonomous vehicles (AV), there is a general consensus that for the short to medium term, a human supervisor is required to handle the edge cases that inevitably arise. Given this requirement, it is essential that the state of the vehicle operator is monitored to ensure they are contributing to the vehicle's safe operation. This paper introduces a dual-source approach integrating data from an infrared camera facing the vehicle operator and vehicle perception systems to produce a metric for driver alertness in order to promote and ensure safe operator behaviour. The infrared camera detects the driver's head, enabling the calculation of head orientation, which is relevant as the head typically moves according to the individual's focus of attention. By incorporating environmental data from the perception system, it becomes possible to determine whether the vehicle operator observes objects in the surroundings. Experiments were conducted using data collected in Sydney, Australia, simulating AV operations in an urban environment. Our results demonstrate that the proposed system effectively determines a metric for the attention levels of the vehicle operator, enabling interventions such as warnings or reducing autonomous functionality as appropriate. This comprehensive solution shows promise in contributing to the overall safety and efficiency of ADAS and AVs in a real-world setting.
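
    A minimal sketch of how such an attention metric could be computed, combining the head orientation from the driver-facing infrared camera with object positions from the perception system: an object counts as observed when it falls within an assumed angular cone around the head direction. The cone width, coordinate frames, and function names are illustrative assumptions.

        import numpy as np

        def is_observed(head_dir, obj_pos, head_pos, cone_deg=15.0):
            """True if the object lies within cone_deg of the head direction.
            All vectors are expressed in a common vehicle frame."""
            to_obj = np.asarray(obj_pos) - head_pos
            cos_angle = np.dot(head_dir, to_obj) / (
                np.linalg.norm(head_dir) * np.linalg.norm(to_obj))
            return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))) <= cone_deg

        def attention_metric(head_dir, head_pos, objects):
            """Fraction of perceived objects the operator is currently observing."""
            if not objects:
                return 1.0
            hits = sum(is_observed(head_dir, o, head_pos) for o in objects)
            return hits / len(objects)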

    The Investigation on Arabic Word Pose Estimation Algorithm as Marker for Augmented Reality Application

    Get PDF
    This study investigates which pattern matching technique, combined with Infinitesimal Plane-Based Pose Estimation (IPPE), is better suited to estimating the pose of Arabic text images without character segmentation. The pattern matching techniques involved are Speeded-Up Robust Features (SURF) and Affine Scale-Invariant Feature Transform (ASIFT). The experiment is demonstrated on Arabic word images captured from different viewpoint angles. The algorithms are tested on a dataset of words chosen from Surah Al-Fatihah in the Quran. A total of 260 images was captured from the left and right sides of the image plane. A set of sub-words was then recognized and its performance tested. This study focuses on comparing the performance of the techniques on Arabic words in one-sub-word or two-sub-word form. We evaluate performance by analyzing the matching accuracy rate and how it affects the pose estimation. Based on the results obtained for pattern matching performance on Arabic scripts, SURF shows a better accuracy rate and execution time than the other algorithm. This experimental result serves as a guide for estimating the pose of target images with different numbers of sub-words. The overall results of the study indicate that a good IPPE pose does not rely on the accuracy rate of matching inliers with the original interest points. The study also demonstrates that one sub-word yields a better accuracy rate than two sub-words, owing to unnecessary interest points being detected in the latter.
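
    A minimal sketch of the matching-plus-IPPE pipeline investigated above: match features between the planar word template and the scene, lift the matched template pixels to z = 0 object coordinates at a known scale, and solve the pose with OpenCV's IPPE solver. ORB stands in for SURF here because SURF requires a non-free OpenCV contrib build; the file names, intrinsics, and template scale are illustrative assumptions.

        import numpy as np
        import cv2

        K = np.array([[900.0, 0, 320], [0, 900.0, 240], [0, 0, 1]])
        PX_TO_M = 0.2 / 640  # assumed: the 640-px-wide template spans 0.2 m

        template = cv2.imread("word_template.png", cv2.IMREAD_GRAYSCALE)
        scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

        orb = cv2.ORB_create(nfeatures=1000)
        kp_t, des_t = orb.detectAndCompute(template, None)
        kp_s, des_s = orb.detectAndCompute(scene, None)

        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = sorted(matcher.match(des_t, des_s), key=lambda m: m.distance)[:50]

        # Template pixels -> coplanar 3D object points (z = 0) at the known scale.
        obj_pts = np.array([[kp_t[m.queryIdx].pt[0] * PX_TO_M,
                             kp_t[m.queryIdx].pt[1] * PX_TO_M, 0.0] for m in matches])
        img_pts = np.array([kp_s[m.trainIdx].pt for m in matches])

        # RANSAC rejects mismatched interest points before the planar IPPE solve.
        ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj_pts, img_pts, K, np.zeros(5),
                                                     flags=cv2.SOLVEPNP_IPPE)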

    Enhancing feline exercise: A Safe YOLO-based laser toy

    Get PDF
    This project develops a cat laser toy that shines a laser for the cat to chase while ensuring that the laser is not shone on a human or in user-defined regions where the cat does not belong. A night-vision camera with infrared lights allows detection in both day and night lighting conditions, and cats and humans are detected using a custom YOLO machine learning model. Aiming of the camera and laser is accomplished with a pan-tilt robot controlled by a Raspberry Pi. The Raspberry Pi selected for the project processes images more slowly than desired, but fast enough for a functional toy. The code for this project can be found at https://gitlab.com/dean.sieck/gimpy_squirte
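
    A minimal sketch of the safety gating described above, assuming an ultralytics-style YOLO API: a candidate laser point is accepted only if it lies outside every detected human bounding box (with a safety margin) and outside every user-defined exclusion region. The model file, class ids, and margin are assumptions, not the project's actual code.

        from ultralytics import YOLO

        model = YOLO("cat_human.pt")          # assumed custom model: 0 = cat, 1 = human
        EXCLUSION_ZONES = [(0, 0, 200, 480)]  # user-defined no-go boxes (x1, y1, x2, y2)
        MARGIN = 40                           # assumed extra clearance around humans, px

        def in_box(pt, box, margin=0):
            x, y = pt
            x1, y1, x2, y2 = box
            return x1 - margin <= x <= x2 + margin and y1 - margin <= y <= y2 + margin

        def safe_laser_target(frame, candidate):
            """Return candidate (x, y) if it is safe to lase, else None."""
            boxes = model(frame)[0].boxes
            for box, cls in zip(boxes.xyxy.tolist(), boxes.cls.tolist()):
                if int(cls) == 1 and in_box(candidate, box, MARGIN):
                    return None  # never shine the laser on a human
            if any(in_box(candidate, z) for z in EXCLUSION_ZONES):
                return None      # respect user-defined regions
            return candidate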

    Champion-level drone racing using deep reinforcement learning

    Get PDF
    First-person view (FPV) drone racing is a televised sport in which professional competitors pilot high-speed aircraft through a 3D circuit. Each pilot sees the environment from the perspective of their drone by means of video streamed from an onboard camera. Reaching the level of professional pilots with an autonomous drone is challenging because the robot needs to fly at its physical limits while estimating its speed and location in the circuit exclusively from onboard sensors. Here we introduce Swift, an autonomous system that can race physical vehicles at the level of the human world champions. The system combines deep reinforcement learning (RL) in simulation with data collected in the physical world. Swift competed against three human champions, including the world champions of two international leagues, in real-world head-to-head races. Swift won several races against each of the human champions and demonstrated the fastest recorded race time. This work represents a milestone for mobile robotics and machine intelligence, which may inspire the deployment of hybrid learning-based solutions in other physical systems.
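
    A heavily simplified sketch of one way to combine simulation training with real-world data, which the abstract describes only at a high level: fit a residual dynamics correction from real flight logs and apply it inside the simulator, so that the policy trains against the simulation plus the learned residual. This is an illustrative scheme under stated assumptions, not the authors' actual Swift pipeline.

        import numpy as np

        def fit_residual(states, actions, next_states, sim_step):
            """Least-squares linear residual between real and simulated transitions.
            states: (N, ds), actions: (N, da), next_states: (N, ds);
            sim_step(s, a) -> predicted next state from the simulator."""
            X = np.hstack([states, actions])
            pred = np.array([sim_step(s, a) for s, a in zip(states, actions)])
            R = next_states - pred                     # what the simulator misses
            W, *_ = np.linalg.lstsq(X, R, rcond=None)  # (ds + da, ds)
            return W

        def corrected_step(state, action, sim_step, W):
            """Simulator step augmented with the learned residual correction."""
            return sim_step(state, action) + np.hstack([state, action]) @ W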

    sSLAM: Speeded-Up Visual SLAM Mixing Artificial Markers and Temporary Keypoints

    Get PDF
    Environment landmarks are generally employed by visual SLAM (vSLAM) methods in the form of keypoints. However, these landmarks are unstable over time because they belong to areas that tend to change, e.g., shadows or moving objects. To address this, other authors have proposed combining keypoints with artificial markers distributed in the environment so as to facilitate the tracking process in the long run. Artificial markers are special elements (similar to beacons) that can be permanently placed in the environment to facilitate tracking. In any case, these systems keep a set of keypoints that is unlikely to be reused, thus unnecessarily increasing the computing time required for tracking. This paper proposes a novel visual SLAM approach that efficiently combines keypoints and artificial markers, allowing for a substantial reduction in the computing time and memory required without noticeably degrading the tracking accuracy. In the first stage, our system creates a map of the environment using both keypoints and artificial markers, but once the map is created, the keypoints are removed and only the markers are kept. Thus, our map stores only long-lasting features of the environment (i.e., the markers). Then, for localization purposes, our algorithm uses the marker information along with temporary keypoints created only at tracking time, which are removed after a while. Since our algorithm keeps only a small subset of recent keypoints, it is faster than state-of-the-art vSLAM approaches. The experimental results show that our proposed sSLAM compares favorably with ORB-SLAM2, ORB-SLAM3, OpenVSLAM and UcoSLAM in terms of speed, without statistically significant differences in accuracy. This research was funded by project PID2019-103871GB-I00 of the Spanish Ministry of Economy, Industry and Competitiveness, FEDER, project 1380047-F UCOFEDER-2021 of Andalusia, and by the European Union NextGenerationEU for the requalification of the Spanish university system 2021-2023.
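
    A minimal sketch of marker-only localization in the spirit of the approach above: once the map retains only the artificial markers, the camera pose can be recovered in each frame from the detected marker corners and their known world coordinates via PnP. The marker map and intrinsics are illustrative assumptions; the snippet uses the cv2.aruco module from opencv-contrib (OpenCV >= 4.7 API).

        import numpy as np
        import cv2

        K = np.array([[700.0, 0, 320], [0, 700.0, 240], [0, 0, 1]])
        # Map kept after keypoint removal: marker id -> 4 corner positions (world frame).
        MARKER_MAP = {7: np.array([[0, 0, 0], [0.15, 0, 0],
                                   [0.15, 0.15, 0], [0, 0.15, 0]], dtype=np.float64)}

        detector = cv2.aruco.ArucoDetector(
            cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50),
            cv2.aruco.DetectorParameters())

        def localize(frame):
            """Camera pose (rvec, tvec) from all mapped markers visible in the frame."""
            corners, ids, _ = detector.detectMarkers(frame)
            if ids is None:
                return None
            obj, img = [], []
            for marker_corners, marker_id in zip(corners, ids.ravel()):
                if int(marker_id) in MARKER_MAP:
                    obj.append(MARKER_MAP[int(marker_id)])
                    img.append(marker_corners.reshape(4, 2))
            if not obj:
                return None
            ok, rvec, tvec = cv2.solvePnP(np.vstack(obj), np.vstack(img), K, np.zeros(5))
            return (rvec, tvec) if ok else None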
