
764 research outputs found

A factorization-based projective reconstruction algorithm with circular motion constraint

Author: Hung YS
Li Y
Tang WK
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2004

In this paper, we propose a projective reconstruction algorithm for a circular motion image sequence. We first formulate the circular motion constraint in the Euclidean frame, and then deduce its expression in a projective frame. The circular motion constraint is gradually enforced during the iterations of a projective reconstruction. This approach can be used to deal with both constant and varying intrinsic parameters. Experimental results for synthetic and real data are presented to illustrate the performance and improvements of our approach over methods based on general motion. ©2004 IEEE.
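The Euclidean form of the circular motion constraint mentioned in this abstract can be illustrated with a small sketch. It is not the paper's algorithm: it only assumes a turntable rotating about the Y axis and a hypothetical first camera centre `c0`, and checks that all equivalent camera centres stay on one circle around the axis.

```python
import math

def rot_y(theta):
    """Rotation matrix about the Y axis (assumed turntable axis), theta in radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def transpose(m):
    return [[m[c][r] for c in range(3)] for r in range(3)]

def mat_vec(m, v):
    return [sum(m[r][k] * v[k] for k in range(3)) for r in range(3)]

def camera_centers(c0, angles):
    """Camera centres in the object frame for a turntable sequence.

    Rotating the object by theta in front of a fixed camera is equivalent to
    rotating the camera centre by -theta about the same axis.
    """
    return [mat_vec(transpose(rot_y(t)), c0) for t in angles]

c0 = [2.0, 0.5, 0.0]                               # hypothetical first camera centre
angles = [math.radians(10 * i) for i in range(36)]  # assumed 10-degree turntable steps
centers = camera_centers(c0, angles)

# Under circular motion every centre keeps the same distance from the
# rotation axis and the same height along it.
radii = [math.hypot(c[0], c[2]) for c in centers]
heights = [c[1] for c in centers]
```

A general-motion method ignores this structure; a circular-motion method can exploit the fact that only one angle per view is free.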

HKU Scholars Hub

GSLAM: Initialization-robust Monocular Visual SLAM via Global Structure-from-Motion

Author: Tan Ping
Tang Chengzhou
Wang Oliver
Publication date: 19/10/2017

Many monocular visual SLAM algorithms are derived from incremental structure-from-motion (SfM) methods. This work proposes a novel monocular SLAM method which integrates recent advances made in global SfM. In particular, we present two main contributions to visual SLAM. First, we solve the visual odometry problem by a novel rank-1 matrix factorization technique which is more robust to the errors in map initialization. Second, we adopt a recent global SfM method for the pose-graph optimization, which leads to a multi-stage linear formulation and enables L1 optimization for better robustness to false loops. The combination of these two approaches generates more robust reconstruction and is significantly faster (4X) than recent state-of-the-art SLAM systems. We also present a new dataset recorded with ground truth camera motion in a Vicon motion capture room, and compare our method to prior systems on it and established benchmark datasets. Comment: 3DV 2017 Project Page: https://frobelbest.github.io/gsla
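For readers unfamiliar with the term, a rank-1 factorization writes a matrix as an outer product of two vectors. The sketch below uses generic alternating least squares, which is a swapped-in textbook technique and not the factorization proposed in the paper; the function name `rank1_factor` and the test matrix are illustrative assumptions.

```python
def rank1_factor(m, iters=50):
    """Approximate m (a list of rows) by the outer product u v^T.

    Generic alternating least squares: fix v and solve for u, then fix u and
    solve for v. Assumes m is not orthogonal to the all-ones start vector.
    """
    rows, cols = len(m), len(m[0])
    v = [1.0] * cols
    u = [0.0] * rows
    for _ in range(iters):
        vv = sum(x * x for x in v)
        u = [sum(m[i][j] * v[j] for j in range(cols)) / vv for i in range(rows)]
        uu = sum(x * x for x in u)
        v = [sum(m[i][j] * u[i] for i in range(rows)) / uu for j in range(cols)]
    return u, v

# Hypothetical exactly rank-1 matrix built from two known vectors.
u_true = [1.0, 2.0, 3.0]
v_true = [4.0, 5.0, 6.0]
M = [[a * b for b in v_true] for a in u_true]

u, v = rank1_factor(M)
approx = [[a * b for b in v] for a in u]
```

The factors are only defined up to a scale exchanged between `u` and `v`, so the product `approx` is compared with `M` rather than the factors themselves.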

arXiv.org e-Print Archive

Crossref

An Enhanced Structure-from-Motion Paradigm based on the Absolute Dual Quadric and Images of Circular Points

Author: Calvet Lilian
Gurdjos Pierre
Publication venue: HAL CCSD
Publication date: 01/01/2013

This work aims at introducing a new unified Structure-from-Motion (SfM) paradigm in which images of circular point-pairs can be combined with images of natural points. An imaged circular point-pair encodes the 2D Euclidean structure of a world plane and can easily be derived from the image of a planar shape, especially those including circles. A classical SfM method generally runs two steps: first a projective factorization of all matched image points (into projective cameras and points) and second a camera self-calibration that updates the obtained world from projective to Euclidean. This work shows how to introduce images of circular points in these two SfM steps, while its key contribution is to provide the theoretical foundations for combining “classical” linear self-calibration constraints with additional ones derived from such images. We show that the two proposed SfM steps clearly contribute to better results than the classical approach. We validate our contributions on synthetic and real images.

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

Statistical Models and Optimization Algorithms for High-Dimensional Computer Vision Problems

Author: Mitra Kaushik
Publication date: 01/01/2011

Data-driven and computational approaches are showing significant promise in solving several challenging problems in various fields such as bioinformatics, finance and many branches of engineering. In this dissertation, we explore the potential of these approaches, specifically statistical data models and optimization algorithms, for solving several challenging problems in computer vision. In doing so, we contribute to the literatures of both statistical data models and computer vision. In the context of statistical data models, we propose principled approaches for solving robust regression problems, both linear and kernel, and missing data matrix factorization problem. In computer vision, we propose statistically optimal and efficient algorithms for solving the remote face recognition and structure from motion (SfM) problems. The goal of robust regression is to estimate the functional relation between two variables from a given data set which might be contaminated with outliers. Under the reasonable assumption that there are fewer outliers than inliers in a data set, we formulate the robust linear regression problem as a sparse learning problem, which can be solved using efficient polynomial-time algorithms. We also provide sufficient conditions under which the proposed algorithms correctly solve the robust regression problem. We then extend our robust formulation to the case of kernel regression, specifically to propose a robust version for relevance vector machine (RVM) regression. Matrix factorization is used for finding a low-dimensional representation for data embedded in a high-dimensional space. Singular value decomposition is the standard algorithm for solving this problem. However, when the matrix has many missing elements this is a hard problem to solve. 
We formulate the missing data matrix factorization problem as a low-rank semidefinite programming problem (essentially a rank constrained SDP), which allows us to find accurate and efficient solutions for large-scale factorization problems. Face recognition from remotely acquired images is a challenging problem because of variations due to blur and illumination. Using the convolution model for blur, we show that the set of all images obtained by blurring a given image forms a convex set. We then use convex optimization techniques to find the distances between a given blurred (probe) image and the gallery images to find the best match. Further, using a low-dimensional linear subspace model for illumination variations, we extend our theory in a similar fashion to recognize blurred and poorly illuminated faces. Bundle adjustment is the final optimization step of the SfM problem where the goal is to obtain the 3-D structure of the observed scene and the camera parameters from multiple images of the scene. The traditional bundle adjustment algorithm, based on minimizing the l_2 norm of the image re-projection error, has cubic complexity in the number of unknowns. We propose an algorithm, based on minimizing the l_infinity norm of the re-projection error, that has quadratic complexity in the number of unknowns. This is achieved by reducing the large-scale optimization problem into many small scale sub-problems each of which can be solved using second-order cone programming
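The two bundle adjustment objectives contrasted at the end of this abstract differ only in how per-point re-projection residuals are aggregated. The sketch below computes both costs for a toy pinhole camera; it does not implement the dissertation's second-order cone programming solver, and the camera, points and observations are made-up values.

```python
import math

def project(K, R, t, X):
    """Pinhole projection of a 3-D point X with intrinsics K, rotation R, translation t."""
    Xc = [sum(R[r][k] * X[k] for k in range(3)) + t[r] for r in range(3)]
    x = [sum(K[r][k] * Xc[k] for k in range(3)) for r in range(3)]
    return (x[0] / x[2], x[1] / x[2])

def reprojection_residuals(K, R, t, points3d, observations):
    """Per-point Euclidean distance between projected and observed image points."""
    residuals = []
    for X, (u, v) in zip(points3d, observations):
        pu, pv = project(K, R, t, X)
        residuals.append(math.hypot(pu - u, pv - v))
    return residuals

def l2_cost(residuals):
    """l_2 norm of the residual vector (the traditional objective)."""
    return math.sqrt(sum(r * r for r in residuals))

def linf_cost(residuals):
    """l_infinity norm: only the worst residual counts."""
    return max(residuals)

# Hypothetical camera and scene.
K = [[100.0, 0.0, 0.0], [0.0, 100.0, 0.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]
points3d = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (0.0, 1.0, 4.0)]
observations = [(3.0, 4.0), (50.0, 1.0), (0.0, 25.0)]

res = reprojection_residuals(K, R, t, points3d, observations)
cost_l2 = l2_cost(res)
cost_linf = linf_cost(res)
```

Because the l_infinity cost is driven entirely by the largest residual, it is sensitive to outliers, which is one reason such formulations are usually paired with outlier handling.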

CiteSeerX

Digital Repository at the University of Maryland

Conjugate epipole-based self-calibration of camera under circular motion

Author: Hung YS
Zhong H
Publication venue: IEEE.
Publication date: 01/01/2003

In this paper, we propose a new method to self-calibrate a camera with constant internal parameters under circular motion. The basis of our approach is to make use of the conjugate epipoles which are related to camera positions with rotation angles satisfying the conjugate constraint. A novel circular projective reconstruction is developed for computing the conjugate epipoles robustly. It is shown that for a camera with zero skew, two turntable sequences with different camera orientations are needed, and for a general camera three sequences with different camera orientations are required. The performance of the algorithm is tested with real images.

HKU Scholars Hub

Acquiring 3D scene information from 2D images

Author: Li Ping
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2011

In recent years, people have become increasingly acquainted with 3D technologies such as 3DTV, 3D movies and 3D virtual navigation of city environments in their daily life. Commercial 3D movies are now commonly available for consumers. Virtual navigation of our living environment as used on a personal computer has become a reality due to well-known web-based geographic applications using advanced imaging technologies. To enable such 3D applications, many technological challenges such as 3D content creation, 3D displaying technology and 3D content transmission need to be tackled and deployed at low cost. This thesis concentrates on the reconstruction of 3D scene information from multiple 2D images, aiming for an automatic and low-cost production of the 3D content. In this thesis, two multiple-view 3D reconstruction systems are proposed: a 3D modeling system for reconstructing the sparse 3D scene model from long video sequences captured with a hand-held consumer camcorder, and a depth reconstruction system for creating depth maps from multiple-view videos taken by multiple synchronized cameras. Both systems are designed to compute the 3D scene information in an automated way with minimal human intervention, in order to reduce the production cost of 3D content. Experimental results on real videos of hundreds and thousands of frames have shown that the two systems are able to accurately and automatically reconstruct the 3D scene information from 2D image data. The findings of this research are useful for emerging 3D applications such as 3D games, 3D visualization and 3D content production. Apart from designing and implementing the two proposed systems, we have developed three key scientific contributions to enable the two proposed 3D reconstruction systems. 
The first contribution is that we have designed a novel feature point matching algorithm that uses only a smoothness constraint for matching the points, which states that neighboring feature points in images tend to move with similar directions and magnitudes. The employed smoothness assumption is not only valid but also robust for most images with limited image motion, regardless of the camera motion and scene structure. Because of this, the algorithm obtains two major advantages. First, the algorithm is robust to illumination changes, as the employed smoothness constraint does not rely on any texture information. Second, the algorithm has a good capability to handle the drift of the feature points over time, as the drift can hardly lead to a violation of the smoothness constraint. This leads to a large number of feature points matched and tracked by the proposed algorithm, which significantly helps the subsequent 3D modeling process. Our feature point matching algorithm is specifically designed for matching and tracking feature points in image/video sequences where the image motion is limited. Our extensive experimental results show that the proposed algorithm is able to track at least 2.5 times as many feature points compared with the state-of-the-art algorithms, with a comparable or higher accuracy. This contributes significantly to the robustness of the 3D reconstruction process. The second contribution is that we have developed algorithms to detect critical configurations where the factorization-based 3D reconstruction degenerates. Based on the detection, we have proposed a sequence-dividing algorithm to divide a long sequence into subsequences, such that successful 3D reconstructions can be performed on individual subsequences with a high confidence. The partial reconstructions are merged later to obtain the 3D model of the complete scene. 
In the critical configuration detection algorithm, four critical configurations are detected: (1) coplanar 3D scene points, (2) pure camera rotation, (3) rotation around two camera centers, and (4) presence of excessive noise and outliers in the measurements. The configurations in cases (1), (2) and (4) will affect the rank of the Scaled Measurement Matrix (SMM). The number of camera centers in case (3) will affect the number of independent rows of the SMM. By examining the rank and the row space of the SMM, the abovementioned critical configurations are detected. Based on the detection results, the proposed sequence-dividing algorithm divides a long sequence into subsequences, such that each subsequence is free of the four critical configurations in order to obtain successful 3D reconstructions on individual subsequences. Experimental results on both synthetic and real sequences have demonstrated that the above four critical configurations are robustly detected, and a long sequence of thousands of frames is automatically divided into subsequences, yielding successful 3D reconstructions. The proposed critical configuration detection and sequence-dividing algorithms provide an essential processing block for an automatic 3D reconstruction on long sequences. The third contribution is that we have proposed a coarse-to-fine multiple-view depth labeling algorithm to compute depth maps from multiple-view videos, where the accuracy of resulting depth maps is gradually refined in multiple optimization passes. In the proposed algorithm, multiple-view depth reconstruction is formulated as an image-based labeling problem using the framework of Maximum A Posteriori (MAP) on Markov Random Fields (MRF). The MAP-MRF framework allows the combination of various objective and heuristic depth cues to define the local penalty and the interaction energies, which provides a straightforward and computationally tractable formulation. 
Furthermore, the global optimal MAP solution to depth labeling can be found by minimizing the local energies, using existing MRF optimization algorithms. The proposed algorithm contains the following three key contributions. (1) A graph construction algorithm is proposed to construct triangular meshes on over-segmentation maps, in order to exploit the color and the texture information for depth labeling. (2) Multiple depth cues are combined to define the local energies. Furthermore, the local energies are adapted to the local image content, in order to consider the varying nature of the image content for an accurate depth labeling. (3) Both the density of the graph nodes and the intervals of the depth labels are gradually refined in multiple labeling passes. By doing so, both the computational efficiency and the robustness of the depth labeling process are improved. The experimental results on real multiple-view videos show that the depth maps for the selected reference view are accurately reconstructed. Depth discontinuities are very well preserved.
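The rank test behind critical configuration (1), coplanar scene points, can be shown on noise-free toy data. This is a minimal sketch under stated assumptions, not the thesis's detector: the cameras and points are invented integer values, the projective depths are taken as known, and exact rational arithmetic replaces the noise-tolerant rank estimate a real system would need.

```python
from fractions import Fraction

def rank(matrix):
    """Exact matrix rank by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(m), len(m[0])
    rk = 0
    for c in range(cols):
        pivot = next((r for r in range(rk, rows) if m[r][c] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, rows):
            f = m[r][c] / m[rk][c]
            m[r] = [a - f * b for a, b in zip(m[r], m[rk])]
        rk += 1
        if rk == rows:
            break
    return rk

def scaled_measurement_matrix(cameras, points):
    """Stack P_i X_j: three rows per camera, one column per homogeneous point.

    Keeping the third row means the projective depths are already included.
    """
    W = []
    for P in cameras:
        for r in range(3):
            W.append([sum(P[r][k] * X[k] for k in range(4)) for X in points])
    return W

# Hypothetical 3x4 projective cameras.
cameras = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]],
    [[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 2]],
    [[0, 0, 1, 0], [0, 1, 0, 1], [-1, 0, 0, 3]],
]

# Homogeneous points (X, Y, Z, 1): a general set and a coplanar set (Z = 1).
general = [(0, 0, 1, 1), (1, 0, 2, 1), (0, 1, 3, 1),
           (1, 1, 1, 1), (2, 1, 4, 1), (1, 2, 2, 1)]
planar = [(0, 0, 1, 1), (1, 0, 1, 1), (0, 1, 1, 1),
          (1, 1, 1, 1), (2, 1, 1, 1), (1, 2, 1, 1)]

rank_general = rank(scaled_measurement_matrix(cameras, general))
rank_planar = rank(scaled_measurement_matrix(cameras, planar))
```

With general points the matrix reaches rank 4, while the coplanar set drops it to 3. With real, noisy measurements the decision would instead rest on singular values and a threshold.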

Repository TU/e

Pure OAI Repository

A Self-calibration Algorithm Based on a Unified Framework for Constraints on Multiple Views

Author: Hung YS
Tang AWK
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

In this paper, we propose a new self-calibration algorithm for upgrading projective space to Euclidean space. The proposed method aims to combine the most commonly used metric constraints, including zero skew and unit aspect-ratio, by formulating each constraint as a cost function within a unified framework. Additional constraints, e.g., constant principal points, can also be formulated in the same framework. The cost function is very flexible and can be composed of different constraints on different views. The upgrade process is then stated as a minimization problem which may be solved by minimizing an upper bound of the cost function. This proposed method is non-iterative. Experimental results on synthetic data and real data are presented to show the performance of the proposed method and accuracy of the reconstructed scene. © 2012 The Author(s). Springer Open Choice, 25 May 201

Springer - Publisher Connector

HKU Scholars Hub

Camera calibration of long image sequences with the presence of occlusions

Author: Bagherzadeh N
Susín Sánchez Antonio
Sáinz M.
Publication venue: www.icip2003.org
Publication date: 01/01/2003

Camera calibration is a critical problem in applications such as augmented reality and image based model reconstruction. When constructing a 3D model of an object from an uncalibrated video sequence, large numbers of frames and self occlusions of parts of the object are common and difficult problems. In this paper we present a fast and robust algorithm that uses a divide and conquer strategy to split the video sequence into sub-sequences containing only the most relevant frames. Then a robust stratified linear based algorithm is able to calibrate each of the subsequences to a metric structure and finally the subsequences are merged together and a final non-linear optimization refines the solution. Examples of real data reconstructions are presented. Postprint (author’s final draft

CiteSeerX

UPCommons. Portal del coneixement obert de la UPC

Affine Approximation for Direct Batch Recovery of Euclidean Motion From Sparse Data

Author: Bartoli Adrien
Guilbert Nicolas
Heyden Anders
Publication venue: Springer Verlag
Publication date: 01/01/2006

We present a batch method for recovering Euclidean camera motion from sparse image data. The main purpose of the algorithm is to recover the motion parameters using as much of the available information and as few computational steps as possible. The algorithm thus places itself in the gap between factorisation schemes, which make use of all available information in the initial recovery step, and sequential approaches which are able to handle sparseness in the image data. Euclidean camera matrices are approximated via the affine camera model, thus making the recovery direct in the sense that no intermediate projective reconstruction is made. Using a little known closure constraint, the FA-closure, we are able to formulate the camera coefficients linearly in the entries of the affine fundamental matrices. The novelty of the presented work is twofold: Firstly, the presented formulation allows for a particularly good conditioning of the estimation of the initial motion parameters and also for an unprecedented diversity in the choice of possible regularisation terms. Secondly, the new autocalibration scheme presented here is in practice guaranteed to yield a Least Squares Estimate of the calibration parameters. As a by-product, the affine camera model is rehabilitated as a useful model for most cameras and scene configurations, e.g. wide angle lenses observing a scene at close range. Experiments on real and synthetic data demonstrate the ability to reconstruct scenes which are very problematic for previous structure from motion techniques due to local ambiguities and error accumulation.

HAL Clermont Université