840 research outputs found

    Affine Correspondences between Multi-Camera Systems for Relative Pose Estimation

    Full text link
    We present a novel method to compute the relative pose of multi-camera systems using two affine correspondences (ACs). Existing solutions to the multi-camera relative pose estimation are either restricted to special cases of motion, have too high computational complexity, or require too many point correspondences (PCs). Thus, these solvers impede an efficient or accurate relative pose estimation when applying RANSAC as a robust estimator. This paper shows that the 6DOF relative pose estimation problem using ACs permits a feasible minimal solution, when exploiting the geometric constraints between ACs and multi-camera systems using a special parameterization. We present a problem formulation based on two ACs that encompass two common types of ACs across two views, i.e., inter-camera and intra-camera. Moreover, the framework for generating the minimal solvers can be extended to solve various relative pose estimation problems, e.g., 5DOF relative pose estimation with known rotation angle prior. Experiments on both virtual and real multi-camera systems prove that the proposed solvers are more efficient than the state-of-the-art algorithms, while resulting in a better relative pose accuracy. Source code is available at https://github.com/jizhaox/relpose-mcs-depth

    BO-ICP: Initialization of Iterative Closest Point Based on Bayesian Optimization

    Full text link
    Typical algorithms for point cloud registration such as Iterative Closest Point (ICP) require a favorable initial transform estimate between two point clouds in order to perform a successful registration. State-of-the-art methods for choosing this starting condition rely on stochastic sampling or global optimization techniques such as branch and bound. In this work, we present a new method based on Bayesian optimization for finding the critical initial ICP transform. We provide three different configurations for our method which highlights the versatility of the algorithm to both find rapid results and refine them in situations where more runtime is available such as offline map building. Experiments are run on popular data sets and we show that our approach outperforms state-of-the-art methods when given similar computation time. Furthermore, it is compatible with other improvements to ICP, as it focuses solely on the selection of an initial transform, a starting point for all ICP-based methods.Comment: IEEE International Conference on Robotics and Automation 202

    Shape-Constraint Recurrent Flow for 6D Object Pose Estimation

    Full text link
    Most recent 6D object pose methods use 2D optical flow to refine their results. However, the general optical flow methods typically do not consider the target's 3D shape information during matching, making them less effective in 6D object pose estimation. In this work, we propose a shape-constraint recurrent matching framework for 6D object pose estimation. We first compute a pose-induced flow based on the displacement of 2D reprojection between the initial pose and the currently estimated pose, which embeds the target's 3D shape implicitly. Then we use this pose-induced flow to construct the correlation map for the following matching iterations, which reduces the matching space significantly and is much easier to learn. Furthermore, we use networks to learn the object pose based on the current estimated flow, which facilitates the computation of the pose-induced flow for the next iteration and yields an end-to-end system for object pose. Finally, we optimize the optical flow and object pose simultaneously in a recurrent manner. We evaluate our method on three challenging 6D object pose datasets and show that it outperforms the state of the art significantly in both accuracy and efficiency.Comment: CVPR 202

    LiDAR-Based Place Recognition For Autonomous Driving: A Survey

    Full text link
    LiDAR-based place recognition (LPR) plays a pivotal role in autonomous driving, which assists Simultaneous Localization and Mapping (SLAM) systems in reducing accumulated errors and achieving reliable localization. However, existing reviews predominantly concentrate on visual place recognition (VPR) methods. Despite the recent remarkable progress in LPR, to the best of our knowledge, there is no dedicated systematic review in this area. This paper bridges the gap by providing a comprehensive review of place recognition methods employing LiDAR sensors, thus facilitating and encouraging further research. We commence by delving into the problem formulation of place recognition, exploring existing challenges, and describing relations to previous surveys. Subsequently, we conduct an in-depth review of related research, which offers detailed classifications, strengths and weaknesses, and architectures. Finally, we summarize existing datasets, commonly used evaluation metrics, and comprehensive evaluation results from various methods on public datasets. This paper can serve as a valuable tutorial for newcomers entering the field of place recognition and for researchers interested in long-term robot localization. We pledge to maintain an up-to-date project on our website https://github.com/ShiPC-AI/LPR-Survey.Comment: 26 pages,13 figures, 5 table

    Scene representation and matching for visual localization in hybrid camera scenarios

    Get PDF
    Scene representation and matching are crucial steps in a variety of tasks ranging from 3D reconstruction to virtual/augmented/mixed reality applications, to robotics, and others. While approaches exist that tackle these tasks, they mostly overlook the issue of efficiency in the scene representation, which is fundamental in resource-constrained systems and for increasing computing speed. Also, they normally assume the use of projective cameras, while performance on systems based on other camera geometries remains suboptimal. This dissertation contributes with a new efficient scene representation method that dramatically reduces the number of 3D points. The approach sets up an optimization problem for the automated selection of the most relevant points to retain. This leads to a constrained quadratic program, which is solved optimally with a newly introduced variant of the sequential minimal optimization method. In addition, a new initialization approach is introduced for the fast convergence of the method. Extensive experimentation on public benchmark datasets demonstrates that the approach produces a compressed scene representation quickly while delivering accurate pose estimates. The dissertation also contributes with new methods for scene matching that go beyond the use of projective cameras. Alternative camera geometries, like fisheye cameras, produce images with very high distortion, making current image feature point detectors and descriptors less efficient, since designed for projective cameras. New methods based on deep learning are introduced to address this problem, where feature detectors and descriptors can overcome distortion effects and more effectively perform feature matching between pairs of fisheye images, and also between hybrid pairs of fisheye and perspective images. Due to the limited availability of fisheye-perspective image datasets, three datasets were collected for training and testing the methods. The results demonstrate an increase of the detection and matching rates which outperform the current state-of-the-art methods

    Robotic Burst Imaging for Light-Constrained 3D Reconstruction

    Get PDF
    This thesis proposes a novel input scheme, robotic burst, to improve vision-based 3D reconstruction for robots operating in low-light conditions, where existing state-of-the-art robotic vision algorithms struggle due to low signal-to-noise ratio in low-light images. We aim to improve the correspondence search stage of feature-based reconstruction using robotic burst imaging, including burst-merged images, a burst feature finder, and an end-to-end learning-based feature extractor. Firstly, we establish the use of robotic burst imaging to compute burst-merged images for feature-based reconstruction. We then develop a burst feature finder that locates features with well-defined scale and apparent motion on a burst to deal with limitations of burst-merged images such as misalignment at strong noise. To improve feature matches in burst-based reconstruction, we also present an end-to-end learning-based feature extractor that finds well-defined scale features directly on light-constrained bursts. We evaluate our methods against state-of-the-art reconstruction methods for conventional imaging that uses both classical and learning-based feature extractors. We validate our novel input scheme using burst imagery captured on a robotic arm and drones. We demonstrate progressive improvements in low-light reconstruction using our burst-based methods against conventional approaches and overall, converging 90% of all scenes captured in millilux conditions that otherwise converge with 10% success rate using conventional methods. This work opens up new avenues for applications, including autonomous driving and drone delivery at night, mining, and behavioral studies on nocturnal animals

    Survey on Motion Planning for Multirotor Aerial Vehicles in Plan-based Control Paradigm

    Get PDF
    In general, optimal motion planning can be performed both locally and globally. In such a planning, the choice in favour of either local or global planning technique mainly depends on whether the environmental conditions are dynamic or static. Hence, the most adequate choice is to use local planning or local planning alongside global planning. When designing optimal motion planning both local and global, the key metrics to bear in mind are execution time, asymptotic optimality, and quick reaction to dynamic obstacles. Such planning approaches can address the aforesaid target metrics more efficiently compared to other approaches such as path planning followed by smoothing. Thus, the foremost objective of this study is to analyse related literature in order to understand how the motion planning, especially trajectory planning, problem is formulated, when being applied for generating optimal trajectories in real-time for Multirotor Aerial Vehicles, impacts the listed metrics. As a result of the research, the trajectory planning problem was broken down into a set of subproblems, and the lists of methods for addressing each of the problems were identified and described in detail. Subsequently, the most prominent results from 2010 to 2022 were summarized and presented in the form of a timeline

    Modeling and Robust Control of Flying Robots Using Intelligent Approaches Modélisation et commande robuste des robots volants en utilisant des approches intelligentes

    Get PDF
    This thesis aims to modeling and robust controlling of a flying robot of quadrotor type. Where we focused in this thesis on quadrotor unmanned Aerial Vehicle (QUAV). Intelligent nonlinear controllers and intelligent fractional-order nonlinear controllers are designed to control. The QUAV system is considered as MIMO large-scale system that can be divided on six interconnected single-input–single-output (SISO) subsystems, which define one DOF, i.e., three-angle subsystems with three position subsystems. In addition, nonlinear models is considered and assumed to suffer from the incidence of parameter uncertainty. Every parameters such as mass, inertia of the system are assumed completely unknown and change over time without prior information. Next, basing on nonlinear, Fractional-Order nonlinear and the intelligent adaptive approximate techniques a control law is established for all subsystems. The stability is performed by Lyapunov method and getting the desired output with respect to the desired input. The modeling and control is done using MATLAB/Simulink. At the end, the simulation tests are performed to that, the designed controller is able to maintain best performance of the QUAV even in the presence of unknown dynamics, parametric uncertainties and external disturbance

    Learning Better Keypoints for Multi-Object 6DoF Pose Estimation

    Full text link
    We investigate the impact of pre-defined keypoints for pose estimation, and found that accuracy and efficiency can be improved by training a graph network to select a set of disperse keypoints with similarly distributed votes. These votes, learned by a regression network to accumulate evidence for the keypoint locations, can be regressed more accurately compared to previous heuristic keypoint algorithms. The proposed KeyGNet, supervised by a combined loss measuring both Wassserstein distance and dispersion, learns the color and geometry features of the target objects to estimate optimal keypoint locations. Experiments demonstrate the keypoints selected by KeyGNet improved the accuracy for all evaluation metrics of all seven datasets tested, for three keypoint voting methods. The challenging Occlusion LINEMOD dataset notably improved ADD(S) by +16.4% on PVN3D, and all core BOP datasets showed an AR improvement for all objects, of between +1% and +21.5%. There was also a notable increase in performance when transitioning from single object to multiple object training using KeyGNet keypoints, essentially eliminating the SISO-MIMO gap for Occlusion LINEMOD

    EMDB: The Electromagnetic Database of Global 3D Human Pose and Shape in the Wild

    Full text link
    We present EMDB, the Electromagnetic Database of Global 3D Human Pose and Shape in the Wild. EMDB is a novel dataset that contains high-quality 3D SMPL pose and shape parameters with global body and camera trajectories for in-the-wild videos. We use body-worn, wireless electromagnetic (EM) sensors and a hand-held iPhone to record a total of 58 minutes of motion data, distributed over 81 indoor and outdoor sequences and 10 participants. Together with accurate body poses and shapes, we also provide global camera poses and body root trajectories. To construct EMDB, we propose a multi-stage optimization procedure, which first fits SMPL to the 6-DoF EM measurements and then refines the poses via image observations. To achieve high-quality results, we leverage a neural implicit avatar model to reconstruct detailed human surface geometry and appearance, which allows for improved alignment and smoothness via a dense pixel-level objective. Our evaluations, conducted with a multi-view volumetric capture system, indicate that EMDB has an expected accuracy of 2.3 cm positional and 10.6 degrees angular error, surpassing the accuracy of previous in-the-wild datasets. We evaluate existing state-of-the-art monocular RGB methods for camera-relative and global pose estimation on EMDB. EMDB is publicly available under https://ait.ethz.ch/emdbComment: Accepted to ICCV 202
    corecore