Projective Bundle Adjustment from Arbitrary Initialization Using the Variable Projection Method
Bundle adjustment is used in structure-from-motion pipelines as the final refinement stage, requiring a sufficiently good initialization to reach a useful local minimum. Starting from an arbitrary initialization almost always gets trapped in a poor minimum. In this work we aim to obtain an initialization-free approach which returns global minima from a large proportion of purely random starting points. Our key inspiration lies in the success of the Variable Projection (VarPro) method for affine factorization problems, which has a close to 100% chance of reaching a global minimum from random initialization. We find empirically that this desirable behaviour does not directly carry over to the projective case, and we consequently design and evaluate strategies to overcome this limitation. Also, by unifying the affine and the projective camera settings, we obtain numerically better conditioned reformulations of original bundle adjustment algorithms.
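The variable projection idea mentioned in this abstract can be illustrated with a minimal sketch for low-rank factorization with missing data. This is not the authors' implementation: the function names (`solve_v`, `reduced_residual`, `varpro`), the finite-difference Jacobian, and the simple damping schedule are all assumptions made for illustration. The point it shows is that one factor (`V`) is eliminated in closed form, so the nonlinear iterations run over the other factor (`U`) only.

```python
import numpy as np

def solve_v(U, M, W):
    """Inner closed-form step: for each column j, fit V[j] by least squares
    using only the rows of M that are observed (W is a boolean mask)."""
    n, r = M.shape[1], U.shape[1]
    V = np.zeros((n, r))
    for j in range(n):
        obs = W[:, j]
        V[j] = np.linalg.lstsq(U[obs], M[obs, j], rcond=None)[0]
    return V

def reduced_residual(u_flat, M, W, r):
    """Residual on observed entries as a function of U alone (V eliminated)."""
    U = u_flat.reshape(M.shape[0], r)
    V = solve_v(U, M, W)
    return (U @ V.T - M)[W]

def numerical_jacobian(f, x, eps=1e-6):
    """Forward-difference Jacobian (an illustrative shortcut; VarPro
    implementations normally use an analytic Jacobian)."""
    f0 = f(x)
    J = np.empty((f0.size, x.size))
    for k in range(x.size):
        xk = x.copy()
        xk[k] += eps
        J[:, k] = (f(xk) - f0) / eps
    return J

def varpro(M, W, r, iters=50, lam=1e-3, seed=0):
    """Damped Gauss-Newton on the reduced problem from a random start."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(M.shape[0] * r)
    f = lambda x: reduced_residual(x, M, W, r)
    costs = [0.5 * np.sum(f(u) ** 2)]
    for _ in range(iters):
        res = f(u)
        J = numerical_jacobian(f, u)
        step = np.linalg.solve(J.T @ J + lam * np.eye(u.size), -J.T @ res)
        new_cost = 0.5 * np.sum(f(u + step) ** 2)
        if new_cost < costs[-1]:
            # Accept the step and relax the damping.
            u, lam = u + step, max(lam * 0.1, 1e-8)
            costs.append(new_cost)
        else:
            # Reject the step and increase the damping.
            lam *= 10.0
    U = u.reshape(M.shape[0], r)
    return U, solve_v(U, M, W), costs

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    M = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 10))
    W = rng.random(M.shape) < 0.8
    W[:2, :] = True  # keep at least two observations per column
    U, V, costs = varpro(M, W, r=2)
    print("initial cost:", costs[0], "final cost:", costs[-1])
```

The sketch only accepts steps that lower the cost, so the recorded cost never increases; it makes no claim about how often a random start reaches a global minimum.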
POSE: Pseudo Object Space Error for Initialization-Free Bundle Adjustment
Bundle adjustment is a nonlinear refinement method for
camera poses and 3D structure requiring sufficiently good
initialization. In recent years, it was experimentally observed
that useful minima can be reached even from arbitrary
initialization for affine bundle adjustment problems
(and fixed-rank matrix factorization instances in general).
The key success factor lies in the use of the variable projection
(VarPro) method, which is known to have a wide basin
of convergence for such problems. In this paper, we propose
the Pseudo Object Space Error (pOSE), which is an objective
with cameras represented as a hybrid between the affine
and projective models. This formulation allows us to obtain
3D reconstructions that are close to the true projective reconstructions
while retaining a bilinear problem structure
suitable for the VarPro method. Experimental results show
that pOSE yields faithful 3D reconstructions from random
initializations with a high success rate, taking one step
towards initialization-free structure from motion.
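One way to write a hybrid objective of the kind described in this abstract is sketched below. The notation is an assumption introduced here, not taken from the listing: $P_i$ is the $3 \times 4$ camera matrix of view $i$ with rows $p_{i,1}^\top, p_{i,2}^\top, p_{i,3}^\top$, $P_{i,1:2}$ stacks its first two rows, $\tilde{x}_j$ is the homogeneous 3D point $j$, $m_{ij}$ is the observed 2D image point, $\Omega$ is the set of observed pairs, and $\eta \in [0, 1]$ is a blending weight. The exact form used in the paper may differ.

```latex
\ell_{\mathrm{hybrid}}
  = \sum_{(i,j) \in \Omega}
      (1 - \eta)\,
      \bigl\| P_{i,1:2}\,\tilde{x}_j - (p_{i,3}^\top \tilde{x}_j)\, m_{ij} \bigr\|_2^2
    + \eta\,
      \bigl\| P_{i,1:2}\,\tilde{x}_j - m_{ij} \bigr\|_2^2
```

Both residuals are linear in $P_i$ for fixed $\tilde{x}_j$ and linear in $\tilde{x}_j$ for fixed $P_i$, which is the bilinear structure that makes the VarPro method applicable. The first term is an object-space-style error and the second is an affine-camera error.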
Widening the basin of convergence for the bundle adjustment type of problems in computer vision
Bundle adjustment is the process of simultaneously optimizing camera poses and 3D structure
given image point tracks. In structure-from-motion, it is typically used as the final refinement
step due to the nonlinearity of the problem, meaning that it requires sufficiently good
initialization. Contrary to this belief, recent literature showed that useful solutions can
be obtained even from arbitrary initialization for fixed-rank matrix factorization problems,
including bundle adjustment with affine cameras. This property of wide convergence basin of
high quality optima is desirable for any nonlinear optimization algorithm since obtaining good
initial values can often be non-trivial. The aim of this thesis is to find the key factor behind the
success of these recent matrix factorization algorithms and explore the potential applicability
of the findings to bundle adjustment, which is closely related to matrix factorization.
The thesis begins by unifying a handful of matrix factorization algorithms and comparing
similarities and differences between them. The theoretical analysis shows that the set
of successful algorithms actually stems from the same root of the optimization method
called variable projection (VarPro). The investigation then extends to address why VarPro
outperforms the joint optimization technique, which is widely used in computer vision. This
algorithmic comparison of these methods yields a larger unification, leading to a conclusion
that VarPro benefits from an unequal trust region assumption between two matrix factors.
The thesis then explores ways to incorporate VarPro into bundle adjustment problems
using projective and perspective cameras. Unfortunately, the added nonlinearity causes
a substantial decrease in the convergence basin of VarPro, and therefore a bootstrapping
strategy is proposed to bypass this issue. Experimental results show that it is possible to
yield feasible metric reconstructions and pose estimations from arbitrary initialization given
relatively clean point tracks, taking one step towards initialization-free structure-from-motion.
Microsoft
Toshiba Research Europ
Autocalibration with the Minimum Number of Cameras with Known Pixel Shape
In 3D reconstruction, the recovery of the calibration parameters of the
cameras is paramount since it provides metric information about the observed
scene, e.g., measures of angles and ratios of distances. Autocalibration
enables the estimation of the camera parameters without using a calibration
device, but by enforcing simple constraints on the camera parameters. In the
absence of information about the internal camera parameters such as the focal
length and the principal point, the knowledge of the camera pixel shape is
usually the only available constraint. Given a projective reconstruction of a
rigid scene, we address the problem of the autocalibration of a minimal set of
cameras with known pixel shape and otherwise arbitrarily varying intrinsic and
extrinsic parameters. We propose an algorithm that only requires 5 cameras (the
theoretical minimum), thus halving the number of cameras required by previous
algorithms based on the same constraint. To this purpose, we introduce as our
basic geometric tool the six-line conic variety (SLCV), consisting of the set
of planes intersecting six given lines of 3D space in points of a conic. We
show that the set of solutions of the Euclidean upgrading problem for three
cameras with known pixel shape can be parameterized in a computationally
efficient way. This parameterization is then used to solve autocalibration from
five or more cameras, reducing the three-dimensional search space to a
two-dimensional one. We provide experiments with real images showing the good
performance of the technique.
Comment: 19 pages, 14 figures, 7 tables, J. Math. Imaging Vi
Visual SLAM from image sequences acquired by unmanned aerial vehicles
This thesis shows that Kalman filter based approaches are sufficient for the task of simultaneous localization and mapping from image sequences acquired by unmanned aerial vehicles. Using solely direction measurements to solve the problem of simultaneous localization and mapping (SLAM) is an important part of autonomous systems. Because of the need for real-time capable systems, recursive estimation techniques, in particular Kalman filter based approaches, are the main focus of interest. Unfortunately, the non-linearity of the triangulation using the direction measurements causes a decrease in the accuracy and consistency of the results. The first contribution of this work is a general derivation of the recursive update of the Kalman filter. This derivation is based on implicit measurement equations, having the classical iterative non-linear as well as the non-iterative and linear Kalman filter as specializations of our general derivation. Second, a new formulation of linear-motion models for the single camera state model and the sliding window camera state model is given, which makes it possible to compute the prediction in a fully linear manner. The third major contribution is a novel method for the initialization of new object points in the Kalman filter. Empirical studies using synthetic and real data of an image sequence of a photogrammetric strip demonstrate and compare the influences of the initialization methods of new object points in the Kalman filter. Fourth, the accuracy potential of monoscopic image sequences from unmanned aerial vehicles for autonomous localization and mapping is theoretically analyzed, which can be used for planning purposes.
Visual simultaneous localization and mapping from image sequences of unmanned aerial vehicles. This thesis shows that a Kalman filter based solution of the triangulation for localization and mapping from image sequences of unmanned aerial vehicles is feasible. Owing to the real-time requirements of autonomous systems, recursive estimation methods, in particular Kalman filter based approaches, are very popular. Unfortunately, the non-linearity of the triangulation gives rise to several effects that significantly affect the consistency and accuracy of the solution with respect to the estimated parameters. The first contribution of this thesis is the derivation of a general method for the recursive update in the Kalman filter with implicit observation equations. We show that the classical Kalman filter methods are a specialization of our approach. In the second contribution, we extend the classical modelling for a single-camera model to a multi-camera model in the Kalman filter. This extension allows us to compute the prediction for a linear motion model in a fully linear manner. As a third main contribution, we present a new method for the initialization of new points in the Kalman filter. Based on empirical studies using simulated and real data of an image sequence of a photogrammetric strip, we show and compare the influence of the initialization methods for new points in the Kalman filter and the accuracies achievable in these scenarios. As a fourth contribution, using image sequences of an unmanned aerial vehicle as an example, we show what accuracy of localization and mapping is possible through triangulation. This theoretical analysis can in turn be used for planning purposes.
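For orientation, the classical linear Kalman filter measurement update, which the abstract names as one specialization of its general derivation, can be sketched as follows. This is the textbook explicit form with measurement model z = H x + noise; it is not the thesis's implicit-measurement derivation, and the function and variable names are assumptions chosen for illustration.

```python
import numpy as np

def kalman_update(x, P, z, H, R):
    """Classical linear Kalman measurement update.

    x: prior state mean, shape (n,)
    P: prior state covariance, shape (n, n)
    z: measurement, shape (m,)
    H: measurement matrix, shape (m, n)
    R: measurement noise covariance, shape (m, m)
    """
    innovation = z - H @ x
    S = H @ P @ H.T + R                 # innovation covariance
    K = np.linalg.solve(S, H @ P).T     # gain K = P H^T S^{-1} (S, P symmetric)
    x_new = x + K @ innovation
    P_new = (np.eye(x.size) - K @ H) @ P
    return x_new, P_new

if __name__ == "__main__":
    x, P = np.array([0.0]), np.array([[1.0]])
    z, H, R = np.array([2.0]), np.array([[1.0]]), np.array([[1.0]])
    print(kalman_update(x, P, z, H, R))
```

In the scalar demo the prior and the measurement have equal variance, so the gain is 0.5 and the updated mean lies halfway between the prior mean and the measurement.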
Affine multi-view modelling for close range object measurement
In photogrammetry, sensor modelling with 3D point estimation is a fundamental topic of research. Perspective frame cameras offer the mathematical basis for close range modelling approaches. The norm is to employ robust bundle adjustments for simultaneous parameter estimation and 3D object measurement. In 2D to 3D modelling strategies image resolution, scale, sampling and geometric distortion are prior factors. Non-conventional image geometries that implement uncalibrated cameras are established in computer vision approaches; these aim for fast solutions at the expense of precision. The projective camera is defined in homogeneous terms and linear algorithms are employed. An attractive sensor model disembodied from projective distortions is the affine. Affine modelling has been studied in the contexts of geometry recovery, feature detection and texturing in vision, however multi-view approaches for precise object measurement are not yet widely available.
This project investigates affine multi-view modelling from a photogrammetric standpoint. A new affine bundle adjustment system has been developed for point-based data observed in close range image networks. The system allows calibration, orientation and 3D point estimation. It is processed as a least squares solution with high redundancy providing statistical analysis. Starting values are recovered from a combination of implicit perspective and explicit affine approaches. System development focuses on retrieval of orientation parameters, 3D point coordinates and internal calibration with definition of system datum, sensor scale and radial lens distortion. Algorithm development is supported with method description by simulation. Initialization and implementation are evaluated with the statistical indicators, algorithm convergence and correlation of parameters. Object space is assessed with evaluation of the 3D point correlation coefficients and error ellipsoids. Sensor scale is checked with comparison of camera systems utilizing quality and accuracy metrics. For independent method evaluation, testing is implemented over a perspective bundle adjustment tool with similar indicators. Test datasets are initialized from precise reference image networks. Real affine image networks are acquired with an optical system (~1M pixel CCD cameras with 0.16x telecentric lens). Analysis of tests ascertains that the affine method results in an RMS image misclosure at a sub-pixel level and precisions of a few tenths of microns in object space
Towards Reliable and Accurate Global Structure-from-Motion
Reconstruction of objects or scenes from sparse point detections across multiple views is one of the most tackled problems in computer vision. Given the coordinates of 2D points tracked in multiple images, the problem consists of estimating the corresponding 3D points and the cameras' calibrations (intrinsic and pose), and can be solved by minimizing reprojection errors using bundle adjustment. However, given bundle adjustment's nonlinear objective function and iterative nature, a good starting guess is required to converge to global minima. Global and Incremental Structure-from-Motion methods appear as ways to provide good initializations to bundle adjustment, each with different properties. While Global Structure-from-Motion has been shown to result in more accurate reconstructions compared to Incremental Structure-from-Motion, the latter has better scalability by starting with a small subset of images and sequentially adding new views, allowing reconstruction of sequences with millions of images. Additionally, both Global and Incremental Structure-from-Motion methods rely on accurate models of the scene or object, and under noisy conditions or high model uncertainty might result in poor initializations for bundle adjustment. Recently pOSE, a class of matrix factorization methods, has been proposed as an alternative to conventional Global SfM methods. These methods use VarPro, a second-order optimization method, to minimize a linear combination of an approximation of reprojection errors and a regularization term based on an affine camera model, and have been shown to converge to global minima at a high rate even when starting from random camera calibration estimates. This thesis aims at improving the reliability and accuracy of global SfM through different approaches.
First, by studying conditions for global optimality of point set registration, leading to a point cloud averaging method that can be used when (incomplete) 3D point clouds of the same scene in different coordinate systems are available. Second, by extending pOSE methods to different Structure-from-Motion problem instances, such as Non-Rigid SfM or radial distortion invariant SfM. Third and finally, by replacing the regularization term of pOSE methods with an exponential regularization on the projective depth of the 3D point estimates, resulting in a loss that achieves reconstructions with accuracy close to bundle adjustment.
A factorization approach to inertial affine structure from motion
We consider the problem of reconstructing a 3-D scene from a moving camera with high frame rate using the affine projection model. This problem is traditionally known as Affine Structure from Motion (Affine SfM), and can be solved using an elegant low-rank factorization formulation. In this paper, we assume that an accelerometer and gyro are rigidly mounted with the camera, so that synchronized linear acceleration and angular velocity measurements are available together with the image measurements. We extend the standard Affine SfM algorithm to integrate these measurements through the use of image derivatives
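The low-rank factorization formulation of standard Affine SfM referred to above can be sketched as a rank-3 truncated SVD of the centered measurement matrix. This is a minimal illustration of the vision-only baseline, not the paper's inertial extension; the function name, the `(2F, P)` layout of the measurement matrix, and the assumption that every point is visible in every frame are choices made here. The result is defined only up to an affine ambiguity.

```python
import numpy as np

def affine_factorization(W):
    """Factor a (2F, P) measurement matrix of P points tracked over F frames.

    Returns cameras (2F, 3), points (3, P) and per-row translations (2F, 1)
    such that cameras @ points + translations approximates W.
    """
    # Subtracting each row's mean removes the translation component.
    t = W.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(W - t, full_matrices=False)
    # Keep the three dominant singular directions (affine rank-3 constraint).
    root = np.sqrt(s[:3])
    cameras = U[:, :3] * root
    points = root[:, None] * Vt[:3]
    return cameras, points, t

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((10, 3))    # 5 frames, two rows per frame
    X = rng.standard_normal((3, 20))    # 20 points
    W = A @ X + rng.standard_normal((10, 1))
    cameras, points, t = affine_factorization(W)
    print("max reprojection error:", np.abs(cameras @ points + t - W).max())
```

On noise-free synthetic data the centered matrix has rank at most 3, so the factorization reproduces the measurements up to numerical precision.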