
    Linear Global Translation Estimation with Feature Tracks

    This paper derives a novel linear position constraint for cameras seeing a common scene point, which leads to a direct linear method for global camera translation estimation. Unlike previous solutions, this method handles collinear camera motion and weak image association at the same time. The final linear formulation does not involve the coordinates of scene points, which makes it efficient even for large-scale data. We solve the linear equation under the L_1 norm, which makes our system more robust to outliers in essential matrices and feature correspondences. We evaluate this method on both sequentially captured images and unordered Internet images. The experiments demonstrate its strength in robustness, accuracy, and efficiency.
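    The robustness claim rests on replacing a least-squares objective with an L_1 one: stacking the per-track position constraints into a linear system A t = b over the unknown camera translations t, the method minimizes ||A t - b||_1 so that gross outliers contribute only linearly to the cost. As a minimal sketch of this step (the reformulation as a linear program and the use of SciPy are my own illustration, not the paper's solver):

```python
import numpy as np
from scipy.optimize import linprog

def solve_l1(A, b):
    """Minimize ||A @ t - b||_1 via an LP with slack variables s:
    min sum(s)  s.t.  -s <= A @ t - b <= s."""
    m, n = A.shape
    # Decision vector: [t (n entries), s (m entries)]; only s is penalized.
    c = np.concatenate([np.zeros(n), np.ones(m)])
    # Encode  A t - s <= b  and  -A t - s <= -b.
    A_ub = np.block([[A, -np.eye(m)], [-A, -np.eye(m)]])
    b_ub = np.concatenate([b, -b])
    bounds = [(None, None)] * n + [(0, None)] * m
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:n]

# Toy example: overdetermined system with one gross outlier.
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 3))
t_true = np.array([1.0, -2.0, 0.5])
b = A @ t_true
b[0] += 50.0  # a single corrupted constraint
print(solve_l1(A, b))  # close to t_true despite the outlier
```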

    Metric 3D-reconstruction from Unordered and Uncalibrated Image Collections

    In this thesis the problem of Structure from Motion (SfM) for uncalibrated and unordered image collections is considered. The proposed framework is an adaptation of the framework for calibrated SfM proposed by Olsson and Enqvist (2011) to the uncalibrated case. Olsson-Enqvist's framework consists of three main steps: pairwise relative rotation estimation, rotation averaging, and geometry estimation with known rotations. For this to work with uncalibrated images, auto-calibration is also performed during the first step. There is a well-known degeneracy for pairwise auto-calibration which occurs when the two principal axes meet in a point, which is unfortunately common for real images. To mitigate this, the rotation estimation is instead performed on image triplets, for which the degenerate configurations are less likely to occur in practice. This is followed by estimation of the pairs which did not get a successful relative rotation from the previous step. The framework is successfully applied to an uncalibrated and unordered collection of images of the cathedral in Lund. It is also applied to the well-known Oxford dinosaur sequence, which consists of turntable motion. Image pairs from turntable motion are in a degenerate configuration for auto-calibration since they both view the same point on the rotation axis.
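    The rotation-averaging step admits a simple chordal L2 sketch: given relative rotations R_ij for image pairs, solve for absolute rotations R_i minimizing the sum of ||R_i - R_ij R_j||_F^2 and project the result back onto SO(3). A minimal illustration in the spirit of Martinec and Pajdla (2007); the function name, the convention R_ij ≈ R_i R_j^T, and the dense solver are assumptions of this sketch, not the thesis's implementation:

```python
import numpy as np

def rotation_averaging(n, edges):
    """Chordal L2 rotation averaging. edges: list of (i, j, R_ij) with
    R_ij ~ R_i @ R_j.T. Solves min sum ||R_i - R_ij @ R_j||_F^2 over the
    stacked blocks R_0..R_{n-1}, then projects each block onto SO(3)."""
    rows, rhs = [], []
    for i, j, R_ij in edges:
        # Row block encoding R_i - R_ij @ R_j = 0.
        A = np.zeros((3, 3 * n))
        A[:, 3*i:3*i+3] = np.eye(3)
        A[:, 3*j:3*j+3] = -R_ij
        rows.append(A)
        rhs.append(np.zeros((3, 3)))
    # Gauge fixing: pin R_0 = I to remove the global rotation ambiguity.
    A0 = np.zeros((3, 3 * n))
    A0[:, 0:3] = np.eye(3)
    rows.append(A0)
    rhs.append(np.eye(3))
    X, *_ = np.linalg.lstsq(np.vstack(rows), np.vstack(rhs), rcond=None)
    # Project each 3x3 block to the nearest rotation via SVD.
    Rs = []
    for i in range(n):
        U, _, Vt = np.linalg.svd(X[3*i:3*i+3])
        R = U @ Vt
        if np.linalg.det(R) < 0:
            R = U @ np.diag([1.0, 1.0, -1.0]) @ Vt
        Rs.append(R)
    return Rs
```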

    Structure from Motion with Higher-level Environment Representations

    Computer vision is an important area focusing on understanding, extracting, and using the information from vision-based sensors. It has many applications, such as vision-based 3D reconstruction, simultaneous localization and mapping (SLAM), and data-driven understanding of the real world. Vision is a fundamental sensing modality in many different fields of application. While traditional structure from motion mostly uses sparse point-based features, this thesis explores the possibility of using higher-order feature representations. It starts with a joint work which uses straight lines as the feature representation and performs bundle adjustment with a straight-line parameterization. We then move to an even higher-order representation, using Bézier splines for parameterization. We start with a simple case in which all contours lie on a plane, parameterize the background curves with Bézier splines, and optimize over both the camera positions and the splines. As an application, we present a complete end-to-end pipeline which produces meaningful dense 3D models from natural data of a 3D object: the target object is placed on a structured but unknown planar background that is modeled with splines, and the data is captured using only a hand-held monocular camera. However, this application is limited to a planar scenario, so we push the parameterization into full 3D. Following the potential of this idea, we introduce a more flexible higher-order extension of points that provides a general model for structural edges in the environment, whether straight or curved. Our model relies on linked Bézier curves, whose geometric intuition proves greatly beneficial during parameter initialization and regularization. We present the first fully automatic pipeline that is able to generate spline-based representations without any human supervision. Besides a full graphical formulation of the problem, we introduce both geometric and photometric cues as well as higher-level concepts such as overall curve visibility and viewing-angle restrictions to automatically manage the correspondences in the graph. Results show that curve-based structure from motion with splines is able to outperform state-of-the-art sparse feature-based methods, as well as to model curved edges in the environment.
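    To make the spline parameterization concrete: a cubic Bézier curve is linear in its four control points under the Bernstein basis, which is what keeps curve terms tractable inside a bundle-adjustment-style optimizer. A minimal sketch (the helper names and the fixed-parameter least-squares fit are illustrative assumptions; the thesis jointly optimizes camera poses and splines):

```python
import numpy as np

def bernstein3(t):
    """Cubic Bernstein basis matrix, one row per parameter value."""
    t = np.asarray(t, dtype=float)[:, None]
    return np.hstack([(1 - t)**3, 3*t*(1 - t)**2, 3*t**2*(1 - t), t**3])

def bezier_eval(P, t):
    """Evaluate a cubic Bezier curve with control points P (4 x d)."""
    return bernstein3(t) @ P

def bezier_fit(X, t):
    """Least-squares fit of 4 control points to curve samples X (m x d)
    at fixed parameters t. The model is linear in the control points."""
    P, *_ = np.linalg.lstsq(bernstein3(t), X, rcond=None)
    return P

# Round trip: sample a known curve, then recover its control points.
P_true = np.array([[0., 0.], [1., 2.], [3., 2.], [4., 0.]])
t = np.linspace(0.0, 1.0, 25)
print(np.allclose(bezier_fit(bezier_eval(P_true, t), t), P_true))  # True
```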

    Structureless Camera Motion Estimation of Unordered Omnidirectional Images

    This work aims at providing a novel camera motion estimation pipeline for large collections of unordered omnidirectional images. In order to keep the pipeline as general and flexible as possible, cameras are modelled as unit spheres, making it possible to incorporate any central camera type. For each camera, an unprojection lookup called a P2S-map (Pixel-to-Sphere map) is generated from the intrinsics, mapping pixels to their corresponding positions on the unit sphere. Consequently, the camera geometry becomes independent of the underlying projection model. The pipeline also generates P2S-maps from world map projections with fewer distortion effects, as known from cartography. Using P2S-maps from camera calibration and world map projection allows converting omnidirectional camera images to an appropriate world map projection in order to apply standard feature extraction and matching algorithms for data association. The proposed estimation pipeline combines the flexibility of SfM (Structure from Motion), which handles unordered image collections, with the efficiency of PGO (Pose Graph Optimization), which is used as the back-end in graph-based Visual SLAM (Simultaneous Localization and Mapping) approaches to optimize camera poses from large image sequences. SfM uses BA (Bundle Adjustment) to jointly optimize camera poses (motion) and 3D feature locations (structure), which becomes computationally expensive for large-scale scenarios. In contrast, PGO solves for camera poses (motion) from measured transformations between cameras, keeping the optimization manageable. The proposed estimation algorithm combines both worlds. It obtains up-to-scale transformations between image pairs using two-view constraints, which are jointly scaled using trifocal constraints. A pose graph is generated from the scaled two-view transformations and solved by PGO to obtain camera motion efficiently, even for large image collections. The obtained results can be used to provide initial pose estimates for further 3D reconstruction purposes, e.g. to build a sparse structure from feature correspondences in an SfM or SLAM framework with further refinement via BA. The pipeline also incorporates fixed extrinsic constraints from multi-camera setups as well as depth information provided by RGBD sensors.
    The entire camera motion estimation pipeline does not need to generate a sparse 3D structure of the captured environment and is thus called SCME (Structureless Camera Motion Estimation).
    Table of contents:
    1 Introduction: 1.1 Motivation (1.1.1 Increasing Interest of Image-Based 3D Reconstruction; 1.1.2 Underground Environments as Challenging Scenario; 1.1.3 Improved Mobile Camera Systems for Full Omnidirectional Imaging); 1.2 Issues (1.2.1 Directional versus Omnidirectional Image Acquisition; 1.2.2 Structure from Motion versus Visual Simultaneous Localization and Mapping); 1.3 Contribution; 1.4 Structure of this Work
    2 Related Work: 2.1 Visual Simultaneous Localization and Mapping (2.1.1 Visual Odometry; 2.1.2 Pose Graph Optimization); 2.2 Structure from Motion (2.2.1 Bundle Adjustment; 2.2.2 Structureless Bundle Adjustment); 2.3 Corresponding Issues; 2.4 Proposed Reconstruction Pipeline
    3 Cameras and Pixel-to-Sphere Mappings with P2S-Maps: 3.1 Types; 3.2 Models (3.2.1 Unified Camera Model; 3.2.2 Polynomial Camera Model; 3.2.3 Spherical Camera Model); 3.3 P2S-Maps - Mapping onto the Unit Sphere via Lookup Table (3.3.1 Lookup Table as Color Image; 3.3.2 Lookup Interpolation; 3.3.3 Depth Data Conversion)
    4 Calibration: 4.1 Overview of Proposed Calibration Pipeline; 4.2 Target Detection; 4.3 Intrinsic Calibration (4.3.1 Selected Examples); 4.4 Extrinsic Calibration (4.4.1 3D-2D Pose Estimation; 4.4.2 2D-2D Pose Estimation; 4.4.3 Pose Optimization; 4.4.4 Uncertainty Estimation; 4.4.5 Pose Graph Representation; 4.4.6 Bundle Adjustment; 4.4.7 Selected Examples)
    5 Full Omnidirectional Image Projections: 5.1 Panoramic Image Stitching; 5.2 World Map Projections; 5.3 World Map Projection Generator for P2S-Maps; 5.4 Conversion between Projections based on P2S-Maps (5.4.1 Proposed Workflow; 5.4.2 Data Storage Format; 5.4.3 Real World Example)
    6 Relations between Two Camera Spheres: 6.1 Forward and Backward Projection; 6.2 Triangulation (6.2.1 Linear Least Squares Method; 6.2.2 Alternative Midpoint Method); 6.3 Epipolar Geometry; 6.4 Transformation Recovery from Essential Matrix (6.4.1 Cheirality; 6.4.2 Standard Procedure; 6.4.3 Simplified Procedure; 6.4.4 Improved Procedure); 6.5 Two-View Estimation (6.5.1 Evaluation Strategy; 6.5.2 Error Metric; 6.5.3 Evaluation of Estimation Algorithms; 6.5.4 Concluding Remarks); 6.6 Two-View Optimization (6.6.1 Epipolar-Based Error Distances; 6.6.2 Projection-Based Error Distances; 6.6.3 Comparison between Error Distances); 6.7 Two-View Translation Scaling (6.7.1 Linear Least Squares Estimation; 6.7.2 Non-Linear Least Squares Optimization; 6.7.3 Comparison between Initial and Optimized Scaling Factor); 6.8 Homography to Identify Degeneracies (6.8.1 Homography for Spherical Cameras; 6.8.2 Homography Estimation; 6.8.3 Homography Optimization; 6.8.4 Homography and Pure Rotation; 6.8.5 Homography in Epipolar Geometry)
    7 Relations between Three Camera Spheres: 7.1 Three-View Geometry; 7.2 Crossing Epipolar Planes Geometry; 7.3 Trifocal Geometry; 7.4 Relation between Trifocal, Three-View and Crossing Epipolar Planes; 7.5 Translation Ratio between Up-To-Scale Two-View Transformations (7.5.1 Structureless Determination Approaches; 7.5.2 Structure-Based Determination Approaches; 7.5.3 Comparison between Proposed Approaches)
    8 Pose Graphs: 8.1 Optimization Principle; 8.2 Solvers (8.2.1 Additional Graph Solvers; 8.2.2 False Loop Closure Detection); 8.3 Pose Graph Generation (8.3.1 Generation of Synthetic Pose Graph Data; 8.3.2 Optimization of Synthetic Pose Graph Data)
    9 Structureless Camera Motion Estimation: 9.1 SCME Pipeline; 9.2 Determination of Two-View Translation Scale Factors; 9.3 Integration of Depth Data; 9.4 Integration of Extrinsic Camera Constraints
    10 Camera Motion Estimation Results: 10.1 Directional Camera Images; 10.2 Omnidirectional Camera Images
    11 Conclusion: 11.1 Summary; 11.2 Outlook and Future Work
    Appendices: A.1 Additional Extrinsic Calibration Results; A.2 Linear Least Squares Scaling; A.3 Proof of Rank Deficiency; A.4 Alternative Derivation of the Midpoint Method; A.5 Simplification of Depth Calculation; A.6 Relation between Epipolar and Circumferential Constraint; A.7 Covariance Estimation; A.8 Uncertainty Estimation from Epipolar Geometry; A.9 Two-View Scaling Factor Estimation: Uncertainty Estimation; A.10 Two-View Scaling Factor Optimization: Uncertainty Estimation; A.11 Depth from Adjoining Two-View Geometries; A.12 Alternative Three-View Derivation (A.12.1 Second Derivation Approach; A.12.2 Third Derivation Approach); A.13 Relation between Trifocal Geometry and Alternative Midpoint Method; A.14 Additional Pose Graph Generation Examples; A.15 Pose Graph Solver Settings; A.16 Additional Pose Graph Optimization Examples
    Bibliography
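    The P2S-map is, at its core, a per-pixel table of unit bearing vectors. A minimal sketch for the special case of an equirectangular world map projection, where the mapping has a closed form (the function name is mine; the thesis builds such tables from calibrated intrinsics of arbitrary central cameras):

```python
import numpy as np

def p2s_map_equirectangular(width, height):
    """Build a P2S-style lookup table for an equirectangular image:
    each pixel (u, v) is mapped to a unit-sphere direction."""
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    lon = (u + 0.5) / width * 2.0 * np.pi - np.pi     # longitude in [-pi, pi)
    lat = np.pi / 2.0 - (v + 0.5) / height * np.pi    # latitude in (pi/2, -pi/2)
    # Unit-sphere directions, one 3-vector per pixel (height x width x 3).
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    return np.stack([x, y, z], axis=-1)

# Once the lookup exists, bearing vectors for matched pixels come from
# simple indexing, independent of the original projection model:
table = p2s_map_equirectangular(1024, 512)
bearing = table[256, 512]       # unit vector for pixel (row 256, col 512)
print(np.linalg.norm(bearing))  # ~1.0
```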

    Hierarchical structure-and-motion recovery from uncalibrated images

    This paper addresses the structure-and-motion problem, which requires finding camera motion and 3D structure from point matches. A new pipeline, dubbed Samantha, is presented that departs from the prevailing sequential paradigm and instead embraces a hierarchical approach. This method has several advantages, such as a provably lower computational complexity, which is necessary to achieve true scalability, and better error containment, leading to more stability and less drift. Moreover, a practical autocalibration procedure allows processing images without ancillary information. Experiments with real data assess the accuracy and the computational efficiency of the method.
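    The hierarchical organisation can be sketched with a toy example: from a pairwise image-similarity matrix, agglomerative clustering yields a balanced merge tree, so errors accumulate over a depth of roughly O(log n) merges rather than an O(n) sequential chain. The names and the average-linkage choice below are illustrative assumptions, not Samantha's actual clustering rules:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

def merge_order(similarity):
    """Derive a hierarchical merge schedule from a symmetric image-
    similarity matrix: the most similar clusters of images are
    reconstructed and merged first, giving a balanced tree instead of
    a sequential chain."""
    d = 1.0 - similarity                     # turn similarity into distance
    iu = np.triu_indices_from(d, k=1)        # condensed form for linkage
    Z = linkage(d[iu], method="average")
    return Z  # each row: (cluster_a, cluster_b, distance, merged size)

# Toy similarity matrix for six images.
rng = np.random.default_rng(1)
S = rng.random((6, 6))
S = (S + S.T) / 2.0
np.fill_diagonal(S, 1.0)
print(merge_order(S))
```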

    EVALUATION OF THE METRIC TRIFOCAL TENSOR FOR RELATIVE THREE-VIEW ORIENTATION

    In photogrammetry and computer vision, the trifocal tensor is used to describe the geometric relation between projections of points in three views. In this paper we analyze the stability and accuracy of the metric trifocal tensor for calibrated cameras. Since a minimal parameterization of the metric trifocal tensor is challenging, the additional constraints of the interior orientation are applied to the well-known projective 6-point and 7-point algorithms for three images. The experimental results show that the linear 7-point algorithm fails for some noise-free degenerate cases, whereas the minimal 6-point algorithm remains competitive even under realistic noise.
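    The relation the tensor encodes can be illustrated by point transfer: a point in the first view together with a line through its correspondence in the second view determines the corresponding point in the third. A minimal sketch of the standard point-line-point transfer x3^k = sum_{i,j} x1^i l2_j T[i,j,k] (helper names are mine; the paper's 6-point and 7-point algorithms estimate the tensor rather than apply it):

```python
import numpy as np

def transfer_point(T, x1, l2):
    """Point-line transfer with a trifocal tensor T (3x3x3): given a
    homogeneous point x1 in view 1 and a line l2 through its match in
    view 2, return the transferred homogeneous point in view 3 via
    x3^k = sum_{i,j} x1^i * l2_j * T[i, j, k]."""
    return np.einsum('i,j,ijk->k', x1, l2, T)

def line_through(x2, direction):
    """A line through homogeneous point x2 along 2D direction (dx, dy);
    any line through the match works except the epipolar line."""
    p = x2 / x2[2]
    dx, dy = direction
    # Line with normal (-dy, dx) passing through (p[0], p[1]).
    return np.array([-dy, dx, dy * p[0] - dx * p[1]])
```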