Affine Correspondences between Multi-Camera Systems for Relative Pose Estimation
We present a novel method to compute the relative pose of multi-camera
systems using two affine correspondences (ACs). Existing solutions to multi-camera relative pose estimation are either restricted to special cases of motion, have high computational complexity, or require too many point correspondences (PCs). These solvers therefore impede efficient or accurate relative pose estimation when applied within RANSAC as a robust estimator. This paper
shows that the 6DOF relative pose estimation problem using ACs permits a
feasible minimal solution, when exploiting the geometric constraints between
ACs and multi-camera systems using a special parameterization. We present a
problem formulation based on two ACs that encompasses the two common types of ACs across two views, i.e., inter-camera and intra-camera correspondences. Moreover, the framework
for generating the minimal solvers can be extended to solve various relative
pose estimation problems, e.g., 5DOF relative pose estimation with known
rotation angle prior. Experiments on both virtual and real multi-camera systems
prove that the proposed solvers are more efficient than the state-of-the-art
algorithms, while resulting in a better relative pose accuracy. Source code is
available at https://github.com/jizhaox/relpose-mcs-depth
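As a rough illustration of why a smaller minimal sample matters (this is not code from the paper), the standard RANSAC iteration-count formula shows how a two-AC solver compares with a hypothetical six-point solver at the same inlier ratio:

```python
import math

def ransac_iterations(inlier_ratio, sample_size, confidence=0.99):
    """Iterations needed to draw at least one all-inlier minimal sample
    with the given confidence, under the standard RANSAC model."""
    return math.ceil(math.log(1.0 - confidence) /
                     math.log(1.0 - inlier_ratio ** sample_size))

w = 0.5  # assumed inlier ratio
print(ransac_iterations(w, 2))  # minimal sample of 2 ACs
print(ransac_iterations(w, 6))  # minimal sample of 6 PCs
```

At a 50% inlier ratio the two-correspondence sampler needs roughly an order of magnitude fewer iterations, which is the efficiency argument the abstract makes.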
BO-ICP: Initialization of Iterative Closest Point Based on Bayesian Optimization
Typical algorithms for point cloud registration such as Iterative Closest
Point (ICP) require a favorable initial transform estimate between two point
clouds in order to perform a successful registration. State-of-the-art methods
for choosing this starting condition rely on stochastic sampling or global
optimization techniques such as branch and bound. In this work, we present a
new method based on Bayesian optimization for finding the critical initial ICP
transform. We provide three different configurations for our method, which highlight the versatility of the algorithm: it can both find rapid results and refine them in situations where more runtime is available, such as offline map
building. Experiments are run on popular data sets and we show that our
approach outperforms state-of-the-art methods when given similar computation
time. Furthermore, it is compatible with other improvements to ICP, as it
focuses solely on the selection of an initial transform, a starting point for
all ICP-based methods.
Comment: IEEE International Conference on Robotics and Automation 202
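The sensitivity to initialization that motivates this work can be seen in a toy example. The sketch below is a deliberately minimal 2D ICP (brute-force nearest neighbours plus a closed-form rigid fit), not the authors' method: from an initial rotation near the truth it converges exactly, while a poor initial guess typically lands in a local minimum:

```python
import math

def best_rigid_2d(src, dst):
    """Closed-form 2D rigid fit (theta, tx, ty) minimizing sum |R p + t - q|^2."""
    n = len(src)
    csx = sum(p[0] for p in src) / n; csy = sum(p[1] for p in src) / n
    cdx = sum(q[0] for q in dst) / n; cdy = sum(q[1] for q in dst) / n
    a = b = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        px, py, qx, qy = px - csx, py - csy, qx - cdx, qy - cdy
        a += px * qy - py * qx
        b += px * qx + py * qy
    th = math.atan2(a, b)
    c, s = math.cos(th), math.sin(th)
    return th, cdx - (c * csx - s * csy), cdy - (s * csx + c * csy)

def apply_rt(th, tx, ty, pts):
    c, s = math.cos(th), math.sin(th)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in pts]

src = [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (0, 2)]  # L-shaped cloud
TH, TX, TY = math.radians(100), 0.1, -0.1               # ground-truth motion
dst = apply_rt(TH, TX, TY, src)

def icp(init_th, iters=30):
    """Toy ICP: alternate nearest-neighbour matching and closed-form fitting."""
    th, tx, ty = init_th, 0.0, 0.0
    for _ in range(iters):
        cur = apply_rt(th, tx, ty, src)
        matched = [min(dst, key=lambda q: (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)
                   for p in cur]
        th, tx, ty = best_rigid_2d(src, matched)
    cur = apply_rt(th, tx, ty, src)
    return max(math.dist(p, q) for p, q in zip(cur, dst))

good = icp(math.radians(95))  # initial guess close to the true 100 degrees
bad = icp(0.0)                # identity initialization
```

BO-ICP's contribution is precisely the search over `init_th`-like parameters, replacing random restarts with a Bayesian-optimization surrogate.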
Shape-Constraint Recurrent Flow for 6D Object Pose Estimation
Most recent 6D object pose methods use 2D optical flow to refine their
results. However, general optical flow methods typically do not consider
the target's 3D shape information during matching, making them less effective
in 6D object pose estimation. In this work, we propose a shape-constraint
recurrent matching framework for 6D object pose estimation. We first compute a
pose-induced flow based on the displacement of 2D reprojection between the
initial pose and the currently estimated pose, which embeds the target's 3D
shape implicitly. Then we use this pose-induced flow to construct the
correlation map for the following matching iterations, which reduces the
matching space significantly and is much easier to learn. Furthermore, we use
networks to learn the object pose based on the current estimated flow, which
facilitates the computation of the pose-induced flow for the next iteration and
yields an end-to-end system for object pose. Finally, we optimize the optical
flow and object pose simultaneously in a recurrent manner. We evaluate our
method on three challenging 6D object pose datasets and show that it
outperforms the state of the art significantly in both accuracy and efficiency.
Comment: CVPR 202
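The pose-induced flow at the core of this idea is simply the displacement of 2D reprojections between two pose hypotheses. A minimal sketch with a pinhole model follows; the intrinsics, poses, and model points are made-up illustrative values, not the paper's:

```python
import math

def project(points, R, t, f=500.0, cx=320.0, cy=240.0):
    """Pinhole projection of 3D model points under pose (R, t)."""
    out = []
    for X in points:
        Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
        out.append((f * Xc[0] / Xc[2] + cx, f * Xc[1] / Xc[2] + cy))
    return out

def pose_induced_flow(points, pose_a, pose_b):
    """2D displacement of each reprojected model point between two poses."""
    pa = project(points, *pose_a)
    pb = project(points, *pose_b)
    return [(xb - xa, yb - ya) for (xa, ya), (xb, yb) in zip(pa, pb)]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

pts = [(0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (-0.1, 0.0, 0.05)]
init = (rot_z(0.0), [0.0, 0.0, 1.0])   # hypothetical initial pose
curr = (rot_z(0.1), [0.02, 0.0, 1.0])  # hypothetical current estimate
flow = pose_induced_flow(pts, init, curr)
```

Because the flow is derived from reprojected 3D points, it carries the target's shape implicitly, which is what constrains the matching space in the paper's recurrent framework.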
LiDAR-Based Place Recognition For Autonomous Driving: A Survey
LiDAR-based place recognition (LPR) plays a pivotal role in autonomous
driving, which assists Simultaneous Localization and Mapping (SLAM) systems in
reducing accumulated errors and achieving reliable localization. However,
existing reviews predominantly concentrate on visual place recognition (VPR)
methods. Despite the recent remarkable progress in LPR, to the best of our
knowledge, there is no dedicated systematic review in this area. This paper
bridges the gap by providing a comprehensive review of place recognition
methods employing LiDAR sensors, thus facilitating and encouraging further
research. We commence by delving into the problem formulation of place
recognition, exploring existing challenges, and describing relations to
previous surveys. Subsequently, we conduct an in-depth review of related
research, which offers detailed classifications, strengths and weaknesses, and
architectures. Finally, we summarize existing datasets, commonly used
evaluation metrics, and comprehensive evaluation results from various methods
on public datasets. This paper can serve as a valuable tutorial for newcomers
entering the field of place recognition and for researchers interested in
long-term robot localization. We pledge to maintain an up-to-date project on
our website https://github.com/ShiPC-AI/LPR-Survey.
Comment: 26 pages, 13 figures, 5 tables
Scene representation and matching for visual localization in hybrid camera scenarios
Scene representation and matching are crucial steps in a variety of tasks ranging from 3D reconstruction to virtual/augmented/mixed reality applications, to robotics, and others. While approaches exist that tackle these tasks, they mostly overlook the issue of efficiency in the scene representation, which is fundamental in resource-constrained systems and for increasing computing speed. Also, they normally assume the use of projective cameras, while performance on systems based on other camera geometries remains suboptimal. This dissertation contributes a new efficient scene representation method that dramatically reduces the number of 3D points. The approach sets up an optimization problem for the automated selection of the most relevant points to retain. This leads to a constrained quadratic program, which is solved optimally with a newly introduced variant of the sequential minimal optimization method. In addition, a new initialization approach is introduced for the fast convergence of the method. Extensive experimentation on public benchmark datasets demonstrates that the approach produces a compressed scene representation quickly while delivering accurate pose estimates.
The dissertation also contributes new methods for scene matching that go beyond the use of projective cameras. Alternative camera geometries, like fisheye cameras, produce images with very high distortion, making current image feature point detectors and descriptors less effective, since they were designed for projective cameras. New methods based on deep learning are introduced to address this problem, whereby feature detectors and descriptors can overcome distortion effects and more effectively perform feature matching between pairs of fisheye images, and also between hybrid pairs of fisheye and perspective images. Due to the limited availability of fisheye-perspective image datasets, three datasets were collected for training and testing the methods. The results demonstrate an increase in detection and matching rates, outperforming the current state-of-the-art methods.
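For intuition, scene compression of this kind is often approximated by the classic greedy K-cover baseline: keep the points that jointly cover the most images. The sketch below implements that baseline, not the dissertation's constrained quadratic program or its SMO-style solver:

```python
def k_cover(visibility, budget):
    """Greedy K-cover map compression: repeatedly keep the 3D point seen
    by the most images not yet covered by already-kept points.
    visibility: dict mapping point id -> set of image ids observing it."""
    covered, kept = set(), []
    remaining = dict(visibility)
    for _ in range(budget):
        if not remaining:
            break
        pid = max(remaining, key=lambda p: len(remaining[p] - covered))
        kept.append(pid)
        covered |= remaining.pop(pid)
    return kept, covered
```

The dissertation's contribution is to replace this kind of heuristic with an optimally solved quadratic program, trading a simple greedy rule for a principled selection criterion.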
Robotic Burst Imaging for Light-Constrained 3D Reconstruction
This thesis proposes a novel input scheme, robotic burst, to improve vision-based 3D reconstruction for robots operating in low-light conditions, where existing state-of-the-art robotic vision algorithms struggle due to low signal-to-noise ratio in low-light images. We aim to improve the correspondence search stage of feature-based reconstruction using robotic burst imaging, including burst-merged images, a burst feature finder, and an end-to-end learning-based feature extractor. Firstly, we establish the use of robotic burst imaging to compute burst-merged images for feature-based reconstruction. We then develop a burst feature finder that locates features with well-defined scale and apparent motion on a burst to deal with limitations of burst-merged images such as misalignment at strong noise. To improve feature matches in burst-based reconstruction, we also present an end-to-end learning-based feature extractor that finds well-defined scale features directly on light-constrained bursts.
We evaluate our methods against state-of-the-art reconstruction methods for conventional imaging that use both classical and learning-based feature extractors. We validate our novel input scheme using burst imagery captured on a robotic arm and drones. We demonstrate progressive improvements in low-light reconstruction using our burst-based methods against conventional approaches; overall, 90% of all scenes captured in millilux conditions converge with our methods, compared with a 10% success rate using conventional methods. This work opens up new avenues for applications, including autonomous driving and drone delivery at night, mining, and behavioral studies on nocturnal animals.
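The benefit of burst merging comes from noise averaging: summing N aligned frames grows the signal linearly but the noise only as sqrt(N). A small synthetic check of that effect (illustrative only; the thesis merges real bursts after alignment):

```python
import random

def merge_burst(frames):
    """Average a burst of aligned noisy frames pixel-wise;
    zero-mean noise std shrinks roughly as 1/sqrt(N)."""
    n = len(frames)
    return [sum(px) / n for px in zip(*frames)]

random.seed(0)
truth = [0.2] * 10000  # flat low-light scene, true intensity 0.2

def noisy():
    return [t + random.gauss(0.0, 0.1) for t in truth]

single = noisy()
burst = merge_burst([noisy() for _ in range(16)])

def rmse(img):
    return (sum((a - b) ** 2 for a, b in zip(img, truth)) / len(img)) ** 0.5
```

With 16 frames the RMSE drops by roughly a factor of four, which is the signal-to-noise headroom that makes feature detection feasible on millilux bursts.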
Survey on Motion Planning for Multirotor Aerial Vehicles in Plan-based Control Paradigm
In general, optimal motion planning can be performed both locally and globally. In such planning, the choice between local and global techniques mainly depends on whether the environmental conditions are dynamic or static. Hence, the most adequate choice is to use local planning, or local planning alongside global planning. When designing optimal motion planning, both local and global, the key metrics to bear in mind are execution time, asymptotic optimality, and quick reaction to dynamic obstacles. Such planning approaches can address these target metrics more efficiently than alternatives such as path planning followed by smoothing. Thus, the foremost objective of this study is to analyse the related literature in order to understand how the formulation of the motion planning problem, especially trajectory planning, affects the listed metrics when applied to generating optimal trajectories in real time for Multirotor Aerial Vehicles. As a result of the research, the trajectory planning problem was broken down into a set of subproblems, the methods for addressing each of them were identified and described in detail, and the most prominent results from 2010 to 2022 were summarized and presented in the form of a timeline.
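As one concrete instance of the trajectory-planning subproblems such surveys decompose: a single axis of a multirotor trajectory segment is commonly represented as a polynomial satisfying boundary conditions. The cubic Hermite segment below is a generic textbook example, not a method advocated by any particular surveyed work:

```python
def cubic_hermite(p0, v0, p1, v1, T):
    """Coefficients of p(t) = a0 + a1*t + a2*t^2 + a3*t^3 on [0, T]
    matching endpoint positions and velocities (one axis of a segment)."""
    a0, a1 = p0, v0
    a2 = 3 * (p1 - p0) / T ** 2 - (2 * v0 + v1) / T
    a3 = -2 * (p1 - p0) / T ** 3 + (v0 + v1) / T ** 2
    return a0, a1, a2, a3

def evaluate(coeffs, t):
    a0, a1, a2, a3 = coeffs
    return a0 + a1 * t + a2 * t ** 2 + a3 * t ** 3
```

Minimum-snap planners used on multirotors extend this idea to higher-order polynomials with continuity constraints across many segments.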
Modeling and Robust Control of Flying Robots Using Intelligent Approaches
This thesis addresses the modeling and robust control of a quadrotor unmanned aerial vehicle (QUAV). Intelligent nonlinear controllers and intelligent fractional-order nonlinear controllers are designed for this purpose. The QUAV is treated as a MIMO large-scale system that can be divided into six interconnected single-input single-output (SISO) subsystems, each defining one DOF: three angle subsystems and three position subsystems. The nonlinear models are assumed to suffer from parameter uncertainty: parameters such as the mass and inertia of the system are treated as completely unknown and time-varying, with no prior information. Control laws for all subsystems are then established based on nonlinear, fractional-order nonlinear, and intelligent adaptive approximation techniques. Stability is proven by the Lyapunov method, ensuring that the output tracks the desired reference. Modeling and control are implemented in MATLAB/Simulink. Finally, simulation tests show that the designed controllers maintain good performance of the QUAV even in the presence of unknown dynamics, parametric uncertainties, and external disturbances.
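As a much-simplified stand-in for the thesis's adaptive fractional-order controllers (which handle unknown parameters), the sketch below stabilizes one SISO angle subsystem, modeled as a double integrator, with a plain PD law and a known inertia; all gains and parameters here are made-up values:

```python
def simulate_axis(kp, kd, target, J=0.02, dt=0.001, steps=5000):
    """PD attitude control of one quadrotor axis modeled as the double
    integrator J * theta'' = u, integrated with explicit Euler."""
    theta, omega = 0.0, 0.0
    for _ in range(steps):
        u = kp * (target - theta) - kd * omega  # PD control law
        omega += (u / J) * dt
        theta += omega * dt
    return theta
```

The gains here are hand-tuned for near-critical damping; the point of the thesis's adaptive schemes is precisely to avoid relying on a known `J` or fixed gains.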
Learning Better Keypoints for Multi-Object 6DoF Pose Estimation
We investigate the impact of pre-defined keypoints on pose estimation and find that accuracy and efficiency can be improved by training a graph network to select a set of dispersed keypoints with similarly distributed votes. These
votes, learned by a regression network to accumulate evidence for the keypoint
locations, can be regressed more accurately compared to previous heuristic
keypoint algorithms. The proposed KeyGNet, supervised by a combined loss measuring both Wasserstein distance and dispersion, learns the color and
geometry features of the target objects to estimate optimal keypoint locations.
Experiments demonstrate the keypoints selected by KeyGNet improved the accuracy
for all evaluation metrics of all seven datasets tested, for three keypoint
voting methods. On the challenging Occlusion LINEMOD dataset, ADD(S) notably improved by +16.4% with PVN3D, and all core BOP datasets showed an AR improvement
for all objects, of between +1% and +21.5%. There was also a notable increase
in performance when transitioning from single object to multiple object
training using KeyGNet keypoints, essentially eliminating the SISO-MIMO gap for
Occlusion LINEMOD.
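The two loss terms are easy to state concretely. The sketch below shows a 1D Wasserstein distance between empirical vote distributions and a simple dispersion measure; how KeyGNet actually combines them is not specified here, so `combined_loss` and its weight `lam` are made-up illustrations:

```python
import math

def wasserstein_1d(a, b):
    """W1 distance between two equal-size 1D empirical distributions:
    the mean absolute difference of the sorted samples."""
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)

def dispersion(keypoints):
    """Smallest pairwise distance among 2D keypoints (larger = better spread)."""
    return min(math.dist(p, q)
               for i, p in enumerate(keypoints)
               for q in keypoints[i + 1:])

def combined_loss(votes, target_votes, keypoints, lam=0.1):
    # lam is a hypothetical weight; rewarding dispersion lowers the loss
    return wasserstein_1d(votes, target_votes) - lam * dispersion(keypoints)
```

Minimizing a loss of this shape pushes predicted vote distributions toward the targets while spreading the selected keypoints apart.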
EMDB: The Electromagnetic Database of Global 3D Human Pose and Shape in the Wild
We present EMDB, the Electromagnetic Database of Global 3D Human Pose and
Shape in the Wild. EMDB is a novel dataset that contains high-quality 3D SMPL
pose and shape parameters with global body and camera trajectories for
in-the-wild videos. We use body-worn, wireless electromagnetic (EM) sensors and
a hand-held iPhone to record a total of 58 minutes of motion data, distributed
over 81 indoor and outdoor sequences and 10 participants. Together with
accurate body poses and shapes, we also provide global camera poses and body
root trajectories. To construct EMDB, we propose a multi-stage optimization
procedure, which first fits SMPL to the 6-DoF EM measurements and then refines
the poses via image observations. To achieve high-quality results, we leverage
a neural implicit avatar model to reconstruct detailed human surface geometry
and appearance, which allows for improved alignment and smoothness via a dense
pixel-level objective. Our evaluations, conducted with a multi-view volumetric
capture system, indicate that EMDB has an expected accuracy of 2.3 cm
positional and 10.6 degrees angular error, surpassing the accuracy of previous
in-the-wild datasets. We evaluate existing state-of-the-art monocular RGB
methods for camera-relative and global pose estimation on EMDB. EMDB is
publicly available at https://ait.ethz.ch/emdb
Comment: Accepted to ICCV 202
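The angular error reported for such benchmarks is typically the geodesic distance on SO(3). As a reference, this metric can be computed from rotation matrices as follows (a generic formula, not EMDB's evaluation code):

```python
import math

def matmul_t(a, b):
    """Compute a^T @ b for 3x3 matrices given as nested lists."""
    return [[sum(a[k][i] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def angular_error_deg(Ra, Rb):
    """Geodesic distance between two rotation matrices, in degrees:
    acos((trace(Ra^T Rb) - 1) / 2), clamped for numerical safety."""
    R = matmul_t(Ra, Rb)
    tr = R[0][0] + R[1][1] + R[2][2]
    return math.degrees(math.acos(max(-1.0, min(1.0, (tr - 1.0) / 2.0))))
```

Averaging this per-frame quantity against ground truth yields figures comparable to the 10.6-degree angular error the abstract reports.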