L6DNet: Light 6 DoF Network for Robust and Precise Object Pose Estimation with Small Datasets
Estimating the 3D pose of an object is a challenging task that can be
considered within augmented reality or robotic applications. In this paper, we
propose a novel approach to perform 6 DoF object pose estimation from a single
RGB-D image. We adopt a hybrid pipeline in two stages: data-driven and
geometric respectively. The data-driven step consists of a classification CNN
to estimate the object 2D location in the image from local patches, followed by
a regression CNN trained to predict the 3D location of a set of keypoints in
the camera coordinate system. To extract the pose information, the geometric
step consists of aligning the 3D points in the camera coordinate system with
the corresponding 3D points in the world coordinate system by minimizing a
registration error, thus computing the pose. Our experiments on the standard
LineMod dataset show that our approach is more robust and accurate than
state-of-the-art methods. The approach is also validated to achieve a 6 DoF
positioning task by visual servoing.
Comment: This work has been accepted at IEEE Robotics and Automation Letters.
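The geometric step described above, aligning camera-frame keypoints with their world-frame counterparts by minimizing a registration error, has a closed-form least-squares solution via the Kabsch algorithm. The following is a minimal NumPy sketch of that idea, not the paper's actual implementation:

```python
import numpy as np

def kabsch_align(P_cam, P_world):
    """Least-squares rigid alignment (Kabsch): find R, t such that
    P_cam ~= R @ P_world + t, minimizing the registration error."""
    P_cam = np.asarray(P_cam, dtype=float)
    P_world = np.asarray(P_world, dtype=float)
    mu_c = P_cam.mean(axis=0)
    mu_w = P_world.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets
    H = (P_world - mu_w).T @ (P_cam - mu_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (det = -1) solution
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_c - R @ mu_w
    return R, t
```

Given the regressed camera-frame keypoints and the known model keypoints, the returned (R, t) is the object pose.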
Robust Digital-Twin Localization via An RGBD-based Transformer Network and A Comprehensive Evaluation on a Mobile Dataset
The potential of digital-twin technology, involving the creation of precise
digital replicas of physical objects, to reshape AR experiences in 3D object
tracking and localization scenarios is significant. However, enabling robust 3D
object tracking in dynamic mobile AR environments remains a formidable
challenge. These scenarios often require a more robust pose estimator capable
of handling the inherent sensor-level measurement noise. In this paper,
recognizing the challenges of comprehensive solutions in existing literature,
we propose a transformer-based 6DoF pose estimator designed to achieve
state-of-the-art accuracy under real-world noisy data. To systematically
validate the new solution's performance against the prior art, we also
introduce a novel RGBD dataset called Digital Twin Tracking Dataset v2 (DTTD2),
which is focused on digital-twin object tracking scenarios. Expanded from an
existing DTTD v1 (DTTD1), the new dataset adds digital-twin data captured using
a cutting-edge mobile RGBD sensor suite on the Apple iPhone 14 Pro, expanding the
applicability of our approach to iPhone sensor data. Through extensive
experimentation and in-depth analysis, we illustrate the effectiveness of our
methods under significant depth data errors, surpassing the performance of
existing baselines. Code and dataset are made publicly available at:
https://github.com/augcog/DTTD
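Pose estimators on LineMod-style benchmarks such as DTTD are commonly compared with the ADD metric (average distance of model points under the predicted versus ground-truth pose). The sketch below shows the conventional formulation, including the usual 10%-of-diameter accuracy threshold; these are standard community choices, not necessarily this paper's exact evaluation protocol:

```python
import numpy as np

def add_metric(R_pred, t_pred, R_gt, t_gt, model_pts):
    """ADD: mean distance between the object's model points
    transformed by the predicted pose and by the ground-truth pose."""
    pred = model_pts @ R_pred.T + t_pred
    gt = model_pts @ R_gt.T + t_gt
    return np.linalg.norm(pred - gt, axis=1).mean()

def add_accuracy(errors, diameters, threshold=0.10):
    """Fraction of test poses whose ADD error is below
    `threshold` * object diameter (the common 10% rule)."""
    errors = np.asarray(errors, dtype=float)
    diameters = np.asarray(diameters, dtype=float)
    return float(np.mean(errors < threshold * diameters))
```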
Estimation of Absolute Scale in Monocular SLAM Using Synthetic Data
This paper addresses the problem of scale estimation in monocular SLAM by
estimating absolute distances between camera centers of consecutive image
frames. These estimates would improve the overall performance of classical (not
deep) SLAM systems and allow metric feature locations to be recovered from a
single monocular camera. We propose several network architectures that lead to
an improvement of scale estimation accuracy over the state of the art. In
addition, we exploit a possibility to train the neural network only with
synthetic data derived from a computer graphics simulator. Our key insight is
that, using only synthetic training inputs, we can achieve similar scale
estimation accuracy as that obtained from real data. This fact indicates that
fully annotated simulated data is a viable alternative to existing
deep-learning-based SLAM systems trained on real (unlabeled) data. Our
experiments with unsupervised domain adaptation also show that the difference
in visual appearance between simulated and real data does not affect scale
estimation results. Our method operates with low-resolution images (0.03MP),
which makes it practical for real-time SLAM applications with a monocular
camera.
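One way the predicted absolute inter-frame distances could be used is to rescale the scale-ambiguous monocular translations when composing a trajectory: keep the estimated translation direction, but fix its magnitude to the network's metric prediction. This is an illustrative sketch under assumed interfaces, not the paper's code:

```python
import numpy as np

def rescale_trajectory(rel_poses, predicted_dists):
    """Compose a metric trajectory from scale-ambiguous relative poses.

    rel_poses: list of (R, t) relative poses from monocular SLAM,
               where only the direction of t is trustworthy.
    predicted_dists: metric distance between consecutive camera
               centers (hypothetical output of the scale network).
    Returns an (N+1, 3) array of camera centers in a metric frame."""
    centers = [np.zeros(3)]
    R_acc = np.eye(3)  # accumulated orientation of the current frame
    for (R, t), d in zip(rel_poses, predicted_dists):
        direction = t / np.linalg.norm(t)  # keep direction, fix magnitude
        centers.append(centers[-1] + R_acc @ (d * direction))
        R_acc = R_acc @ R
    return np.stack(centers)
```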
Semantic localisation via globally unique instance segmentation
In this work we propose a novel approach to semantic localisation. Our work is motivated by the need for environment perception techniques which not only perform self-localisation within a map but also simultaneously recognise surrounding objects. Such capabilities are crucial for computer vision applications which interact with the environment: autonomous driving, augmented reality or robotics. In order to achieve this goal we propose a solution which consists of three key steps. Firstly, a database of panoramic RGB images and corresponding globally unique, per-pixel object instance labels is built for the desired environment, where we typically consider objects from static categories such as "building" or "tree". Secondly, a semantic segmentation network capable of predicting more than 3000 labels is trained on the collected data. Finally, for a given panoramic query image, the corresponding instance label image predicted by the network is used for semantic matching within the database. The matching is performed in two stages: (i) a fast retrieval of a small subset of database images (~100) with highly overlapping instance label histograms, followed by (ii) an explicit approximate 3 DoF (yaw, pitch, roll) alignment of the selected subset of images and the query image. We evaluate our approach in challenging indoor and outdoor navigation scenarios, achieving better or similar performance when compared to state-of-the-art image retrieval-based localisation approaches using key-point matching and image-level embedding. Our contributions include: (i) a description of a novel semantic localisation approach using globally unique instance segmentation, (ii) corresponding quantitative and qualitative analysis and (iii) a novel CamVid-360 dataset containing 986 labelled instances of buildings, trees, road signs and poles.
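The fast retrieval stage (i) can be realised by comparing normalised instance-label histograms, for example with histogram intersection. A minimal sketch under assumed names, not the authors' implementation:

```python
import numpy as np

def instance_histogram(label_img, num_labels):
    """Normalised histogram of per-pixel instance labels for one image."""
    h = np.bincount(label_img.ravel(), minlength=num_labels).astype(float)
    return h / h.sum()

def retrieve_top_k(query_hist, db_hists, k=100):
    """Stage (i): rank database images by histogram intersection with
    the query's instance-label histogram and keep the top-k subset."""
    scores = np.minimum(db_hists, query_hist).sum(axis=1)
    return np.argsort(scores)[::-1][:k]
```

The returned subset would then be passed to the explicit 3 DoF alignment stage (ii).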
Autonomous Localization Of A Uav In A 3d Cad Model
This thesis presents a novel method of indoor localization and autonomous navigation of Unmanned Aerial Vehicles (UAVs) within a building, given a prebuilt Computer Aided Design (CAD) model of the building. The proposed system is novel in that it leverages machine learning and traditional computer vision techniques to provide a robust method of localizing and navigating a drone autonomously in indoor and GPS-denied environments, exploiting preexisting knowledge of the environment. The goal of this work is to devise a method that enables a UAV to deduce its current pose within a CAD model quickly and accurately while also maintaining efficient use of resources. A 3-dimensional CAD model of the building to be navigated is provided as input to the system along with the required goal position. Initially, the UAV has no knowledge of its location within the building. The system, comprising a stereo camera system and an Inertial Measurement Unit (IMU) as its sensors, then generates a globally consistent map of its surroundings using a Simultaneous Localization and Mapping (SLAM) algorithm. In addition to the map, it also stores spatially correlated 3D features. These 3D features are used to generate correspondences between the SLAM map and the 3D CAD model. The correspondences are then used to estimate a transformation between the SLAM map and the 3D CAD model, thus effectively localizing the UAV in the 3D CAD model. Our method successfully localized the UAV in the test building in an average of 15 seconds across the scenarios tested, contingent upon the abundance of target features in the observed data. Due to the absence of a motion capture system, the results were verified by placing tags at strategic known locations on the ground in the building and measuring the error between the projection of the current UAV location on the ground and the tag.
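The step of turning noisy 3D feature correspondences into a SLAM-map-to-CAD-model transformation is typically made robust to outlier matches by wrapping a closed-form rigid fit in a RANSAC loop. The following is a hedged sketch of that general technique, with illustrative parameters, not the thesis's actual implementation:

```python
import numpy as np

def fit_rigid(src, dst):
    """Closed-form least-squares rigid transform (SVD) mapping src -> dst."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, mu_d - R @ mu_s

def ransac_align(slam_pts, cad_pts, iters=500, tol=0.05, rng=None):
    """Robustly estimate the SLAM-map-to-CAD transform from 3D feature
    correspondences, rejecting outlier matches via RANSAC."""
    rng = np.random.default_rng(rng)
    best_inliers = np.zeros(len(slam_pts), dtype=bool)
    for _ in range(iters):
        # Minimal sample: 3 correspondences determine a rigid transform
        idx = rng.choice(len(slam_pts), size=3, replace=False)
        R, t = fit_rigid(slam_pts[idx], cad_pts[idx])
        err = np.linalg.norm(slam_pts @ R.T + t - cad_pts, axis=1)
        inliers = err < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on the consensus set for the final estimate
    return fit_rigid(slam_pts[best_inliers], cad_pts[best_inliers])
```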
Collaborative SLAM using a swarm intelligence-inspired exploration method
Master's thesis in Mechatronics (MAS500).
Efficient exploration in multi-robot SLAM is a challenging task. This thesis describes the design of algorithms that would enable Loomo robots to collaboratively explore an unknown environment. A pose graph-based SLAM algorithm using the on-board sensors of the Loomo was developed from scratch. A YOLOv3-tiny neural network has been trained to recognize other Loomos, and an exploration simulation has been developed to test exploration methods. The bots in the simulation are controlled using swarm intelligence-inspired rules. The system is not finished, and further work is needed to combine the work done in this thesis into a collaborative SLAM system that runs on the Loomo robots.
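Swarm intelligence-inspired exploration rules of the kind described are usually built from simple local behaviours, such as separation from nearby teammates and attraction toward unexplored frontier points. The sketch below illustrates that general idea; the rule names, weights, and interface are assumptions, not the thesis's actual controller:

```python
import numpy as np

def swarm_velocity(pos, neighbors, frontier, sep_radius=1.5,
                   w_sep=1.0, w_frontier=0.5):
    """Illustrative swarm-style exploration rule for one robot:
    repelled from teammates within sep_radius, attracted toward
    the nearest unexplored frontier point. All quantities are 2D."""
    v = np.zeros(2)
    for n in neighbors:
        offset = pos - n
        d = np.linalg.norm(offset)
        if 0 < d < sep_radius:
            v += w_sep * offset / d**2       # separation: push apart
    to_frontier = frontier - pos
    d = np.linalg.norm(to_frontier)
    if d > 0:
        v += w_frontier * to_frontier / d    # attraction toward frontier
    return v
```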