Efficient 2D-3D Matching for Multi-Camera Visual Localization
Visual localization, i.e., determining the position and orientation of a
vehicle with respect to a map, is a key problem in autonomous driving. We
present a multi-camera visual-inertial localization algorithm for large-scale
environments. To efficiently and effectively match features against a pre-built
global 3D map, we propose a prioritized feature matching scheme for
multi-camera systems. In contrast to existing works, designed for monocular
cameras, we (1) tailor the prioritization function to the multi-camera setup
and (2) run feature matching and pose estimation in parallel. This
significantly accelerates the matching and pose estimation stages and allows us
to dynamically adapt the matching efforts based on the surrounding environment.
In addition, we show how pose priors can be integrated into the localization
system to increase efficiency and robustness. Finally, we extend our algorithm
by fusing the absolute pose estimates with motion estimates from a multi-camera
visual-inertial odometry (VIO) pipeline. This results in a system that provides
reliable and drift-less pose estimation. Extensive experiments show that our
localization runs fast and robustly under varying conditions, and that our
extended algorithm enables reliable real-time pose estimation.
Comment: 7 pages, 5 figures
MLPnP - A Real-Time Maximum Likelihood Solution to the Perspective-n-Point Problem
In this paper, a statistically optimal solution to the Perspective-n-Point
(PnP) problem is presented. Many solutions to the PnP problem are geometrically
optimal, but do not consider the uncertainties of the observations. In
addition, it would be desirable to have an internal estimation of the accuracy
of the estimated rotation and translation parameters of the camera pose. Thus,
we propose a novel maximum likelihood solution to the PnP problem that
incorporates image observation uncertainties while remaining real-time
capable. Further, the presented method is general, as it works with 3D
direction vectors instead of 2D image points and is thus able to cope with
arbitrary central camera models. This is achieved by projecting (and thus
reducing) the covariance matrices of the observations to the corresponding
vector tangent space.
Comment: Submitted to the ISPRS congress (2016) in Prague. Oral Presentation.
Published in ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., III-3,
131-13
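The key reduction described in the abstract, projecting a bearing vector's 3x3 observation covariance onto its 2D tangent plane, can be sketched with numpy as below. The tangent-basis construction shown here is one common choice, not necessarily the one used in MLPnP itself.

```python
import numpy as np

def tangent_basis(v):
    """Return a 3x2 matrix whose columns span the plane orthogonal to
    the unit bearing vector v."""
    v = v / np.linalg.norm(v)
    # Pick the coordinate axis least aligned with v to avoid degeneracy.
    helper = np.eye(3)[np.argmin(np.abs(v))]
    t1 = np.cross(v, helper)
    t1 /= np.linalg.norm(t1)
    t2 = np.cross(v, t1)
    return np.stack([t1, t2], axis=1)

def project_covariance(v, cov3):
    """Reduce the 3x3 covariance of bearing vector v to a 2x2 covariance
    in the tangent plane: cov2 = J^T @ cov3 @ J."""
    J = tangent_basis(v)
    return J.T @ cov3 @ J
```

The resulting 2x2 covariances can then weight the residuals of a maximum-likelihood pose estimate, so more certain observations contribute more strongly.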
Calibrated and Partially Calibrated Semi-Generalized Homographies
In this paper, we propose the first minimal solutions for estimating the
semi-generalized homography given a perspective and a generalized camera. The
proposed solvers use five 2D-2D image point correspondences induced by a scene
plane. One of them assumes the perspective camera to be fully calibrated, while
the other solver estimates the unknown focal length together with the absolute
pose parameters. This setup is particularly important in structure-from-motion
and image-based localization pipelines, where a new camera is localized in each
step with respect to a set of known cameras and 2D-3D correspondences might not
be available. As a consequence of a clever parametrization and the elimination
ideal method, our approach only needs to solve a univariate polynomial of
degree five or three. The proposed solvers are stable and efficient, as
demonstrated by a number of synthetic and real-world experiments.
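The final step of such minimal solvers, extracting the real roots of the univariate polynomial of degree five or three, can be sketched generically with numpy. The coefficients below are arbitrary placeholders, not output of the actual solver.

```python
import numpy as np

def real_roots(coeffs, tol=1e-9):
    """Return the sorted real roots of a univariate polynomial, given
    its coefficients in descending degree order. Minimal solvers that
    reduce to a degree-5 (or degree-3) polynomial use this kind of step
    to enumerate candidate solutions."""
    roots = np.roots(coeffs)
    return np.sort(roots[np.abs(roots.imag) < tol].real)
```

Each real root then yields one candidate homography (and, in the partially calibrated case, one candidate focal length), which is typically verified against further correspondences inside a RANSAC loop.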
Video Registration in Egocentric Vision under Day and Night Illumination Changes
With the spread of wearable devices and head mounted cameras, a wide range of
applications requiring precise user localization is now possible. In this paper
we propose to treat the problem of obtaining the user position with respect to
a known environment as a video registration problem. Video registration, i.e.
the task of aligning an input video sequence to a pre-built 3D model, relies on
matching local keypoints extracted from the query sequence to a 3D point
cloud. The overall registration performance is tightly tied to the quality of
this 2D-3D matching, and can degrade under changing environmental conditions,
such as the steep lighting differences between day and night. To effectively
register an egocentric video sequence under these
conditions, we propose to tackle the source of the problem: the matching
process. To overcome the shortcomings of standard matching techniques, we
introduce a novel embedding space that allows us to obtain robust matches by
jointly taking into account local descriptors, their spatial arrangement and
their temporal robustness. The proposal is evaluated using unconstrained
egocentric video sequences both in terms of matching quality and resulting
registration performance using different 3D models of historical landmarks. The
results show that the proposed method can outperform state-of-the-art
registration algorithms, in particular when dealing with the challenges of
day and night sequences.
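The joint embedding idea above, combining a local descriptor with its spatial arrangement and temporal robustness, can be sketched as a simple feature concatenation. This is an illustrative assumption about how such an embedding could be assembled, not the paper's learned embedding; the weights and the track-length cap are placeholders.

```python
import numpy as np

def joint_embedding(desc, xy, img_size, track_len, max_track=30,
                    w_spatial=0.5, w_temporal=0.5):
    """Concatenate an L2-normalized local descriptor with the keypoint's
    normalized image position and a temporal-robustness score (how many
    frames the keypoint has been tracked), so that nearest-neighbour
    matching in the embedded space accounts for appearance, spatial
    layout, and stability jointly."""
    desc = np.asarray(desc, float)
    desc = desc / (np.linalg.norm(desc) + 1e-12)
    spatial = w_spatial * (np.asarray(xy, float) / np.asarray(img_size, float))
    temporal = np.array([w_temporal * min(track_len, max_track) / max_track])
    return np.concatenate([desc, spatial, temporal])
```

Matching in this space penalizes candidate correspondences whose appearance is similar but whose image location or tracking stability disagrees, which is exactly the failure mode of appearance-only matching across day/night changes.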