
    Robust Estimation of Trifocal Tensors Using Natural Features for Augmented Reality Systems

    Augmented reality (AR) deals with the problem of dynamically augmenting or enhancing the real world with computer-generated virtual scenes. Registration is one of the most pivotal problems currently limiting AR applications. In this paper, a novel registration method using natural features based on online estimation of trifocal tensors is proposed. The method consists of two stages: offline initialization and online registration. Initialization involves specifying four points in each of two reference images to build the world coordinate system on which a virtual object will be augmented. In online registration, the natural feature correspondences detected from the reference views are tracked in the current frame to build feature triples. These triples are then used to estimate the corresponding trifocal tensors in the image sequence, by which the four specified points are transferred to compute the registration matrix for augmentation. The estimated registration matrix is used as an initial estimate for a nonlinear optimization that minimizes the actual residual errors with the Levenberg-Marquardt (LM) method, making the results more robust and stable. This paper also proposes a robust method for estimating the trifocal tensors, in which a modified RANSAC algorithm is used to remove outliers. Compared with standard RANSAC, our method can significantly reduce computational complexity while overcoming the disturbance of mismatches. Experiments have been carried out to demonstrate the validity of the proposed approach.
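For readers unfamiliar with the outlier-rejection step mentioned in this abstract, the following is a minimal sketch of standard RANSAC, not the modified variant the paper proposes (the abstract gives no details of the modification). To keep it short it fits a 2D line instead of a trifocal tensor; the function name `ransac_line` and its parameters are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def ransac_line(points, n_iters=200, threshold=0.1, seed=0):
    """Fit y = a*x + b with plain RANSAC; returns (a, b, inlier_mask).

    Illustrative stand-in: the paper fits trifocal tensors to feature
    triples, whereas this sketch fits a line to 2D points.
    """
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        # Draw a minimal sample: two distinct points define a line.
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if abs(x2 - x1) < 1e-12:
            continue  # vertical line, not representable as y = a*x + b
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # Count points whose vertical residual is below the threshold.
        residuals = np.abs(points[:, 1] - (a * points[:, 0] + b))
        mask = residuals < threshold
        if mask.sum() > best_mask.sum():
            best_mask = mask
    if best_mask.sum() < 2:
        raise ValueError("RANSAC found no consensus set")
    # Refit on the largest consensus set with ordinary least squares.
    a, b = np.polyfit(points[best_mask, 0], points[best_mask, 1], deg=1)
    return a, b, best_mask
```

The same sample, score, and refit loop applies to any model with a minimal solver; only the sample size and the residual function change.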

    Integral Geometric Dual Distributions of Multilinear Models

    We propose an integral geometric approach for computing dual distributions for the parameter distributions of multilinear models. The dual distributions can be computed from, for example, the parameter distributions of conics, multiple view tensors, homographies, or entities as simple as points, lines, and planes. The dual distributions have analytical forms that follow from the asymptotic normality of the maximum likelihood estimator and the application of integral transforms, fundamentally the generalised Radon transforms, to the probability density of the parameters. The approach allows us, for instance, to examine the uncertainty distributions in feature distributions, which are essentially tied to the distribution of training data, and helps us to derive conditional distributions for variables of interest and to characterise confidence intervals of the estimates.

    Image Based View Synthesis

    This dissertation deals with the image-based approach to synthesizing a virtual scene from sparse images or a video sequence without the use of 3D models. In our scenario, a real dynamic or static scene is captured by a set of uncalibrated images from different viewpoints. After automatically recovering the geometric transformations between these images, a series of photo-realistic virtual views can be rendered and a virtual environment covered by these static cameras can be synthesized. This image-based approach has applications in object recognition, object transfer, video synthesis and video compression. In this dissertation, I have contributed to several sub-problems related to image-based view synthesis. Before image-based view synthesis can be performed, images need to be segmented into individual objects. Assuming that a scene can approximately be described by multiple planar regions, I have developed a robust and novel approach to automatically extract a set of affine or projective transformations induced by these regions, correctly detect the occlusion pixels over multiple consecutive frames, and accurately segment the scene into several motion layers. First, a number of seed regions are determined using correspondences in two frames, and the seed regions are expanded and outliers rejected using the graph cuts method integrated with a level set representation. Second, these initial regions are merged into several initial layers according to motion similarity. Third, the occlusion order constraints on multiple frames are explored, which guarantee that the occlusion area increases with the temporal order over a short period and effectively maintain segmentation consistency over multiple consecutive frames. Finally, the correct layer segmentation is obtained by using a graph cuts algorithm, and the occlusions between the overlapping layers are explicitly determined.
    Several experimental results show that our approach is effective and robust. Recovering the geometrical transformations among images of a scene is a prerequisite step for image-based view synthesis. I have developed a wide baseline matching algorithm to identify the correspondences between two uncalibrated images, and to further determine the geometric relationship between images, such as epipolar geometry or projective transformation. In our approach, a set of salient features, edge-corners, are detected to provide robust and consistent matching primitives. Then, based on the Singular Value Decomposition (SVD) of an affine matrix, we effectively quantize the search space into two independent subspaces for rotation angle and scaling factor, and then we use a two-stage affine matching algorithm to obtain robust matches between these two frames. The experimental results on a number of wide baseline images demonstrate that our matching method outperforms the state-of-the-art algorithms even under significant camera motion, illumination variation, occlusion, and self-similarity. Given the wide baseline matches among images, I have developed a novel method for dynamic view morphing. Dynamic view morphing deals with scenes containing moving objects in the presence of camera motion. The objects can be rigid or non-rigid, and each of them can move in any orientation or direction. The proposed method can generate a series of continuous and physically accurate intermediate views from only two reference images without any knowledge of the 3D structure. The procedure consists of three steps: segmentation, morphing and post-warping. Given a boundary connection constraint, the source and target scenes are segmented into several layers for morphing. Based on the decomposition of the affine transformation between corresponding points, we uniquely determine a physically correct path for post-warping by the least distortion method.
    I have successfully generalized the dynamic scene synthesis problem from the simple scene with only rotation to the dynamic scene containing non-rigid objects. My method can handle dynamic rigid or non-rigid objects, including complicated objects such as humans. Finally, I have also developed a novel algorithm for tri-view morphing. This is an efficient image-based method to navigate a scene based on only three wide-baseline uncalibrated images without the explicit use of a 3D model. After automatically recovering corresponding points between each pair of images using our wide baseline matching method, an accurate trifocal plane is extracted from the trifocal tensor implied by these three images. Next, employing a trinocular-stereo algorithm and a barycentric blending technique, we generate an arbitrary novel view to navigate the scene in a 2D space. Furthermore, after self-calibration of the cameras, a 3D model can also be correctly augmented into the virtual environment synthesized by the tri-view morphing algorithm. We have applied our view morphing framework to several applications: 4D video synthesis, automatic target recognition, and multi-view morphing.
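The abstract above mentions splitting an affine matrix, via its SVD, into a rotation angle and scaling factors. One common way to read this is the polar decomposition sketched below; it is an assumption about what is meant, not the dissertation's actual algorithm, and the function name `rotation_and_scales` is hypothetical. For a 2x2 linear part A with positive determinant and SVD A = U Σ Vᵀ, the rotation is R = U Vᵀ and the symmetric stretch is S = V Σ Vᵀ, so that A = R S.

```python
import numpy as np

def rotation_and_scales(A):
    """Split a 2x2 matrix with det > 0 into a rotation angle and scale factors.

    Returns (theta, scales, R, S) with A = R @ S, where R is a rotation by
    theta (radians), S is symmetric positive definite, and `scales` are the
    singular values of A (the stretch factors along S's principal axes).
    """
    A = np.asarray(A, dtype=float)
    if A.shape != (2, 2):
        raise ValueError("expected the 2x2 linear part of an affine transform")
    if np.linalg.det(A) <= 0:
        raise ValueError("reflections and singular matrices are not handled")
    U, s, Vt = np.linalg.svd(A)
    R = U @ Vt                      # closest rotation to A
    S = Vt.T @ np.diag(s) @ Vt      # symmetric stretch
    theta = np.arctan2(R[1, 0], R[0, 0])
    return theta, s, R, S
```

Because the angle and the scale factors come out as separate quantities, a matcher could search or quantize each one independently, which is the property the abstract appeals to.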

    Concentric mosaic(s), planar motion and 1D cameras

    General SFM methods give poor results for images captured under constrained motions such as the planar motion of concentric mosaics (CM). In this paper, we propose new SFM algorithms both for images captured by CM and for composite mosaic images from CM. We first introduce a 1D affine camera model to complete the family of 1D camera models. Then we show that a 2D image captured by CM can be decoupled into two 1D images, one 1D projective and one 1D affine, and that a composite mosaic image can be rebinned into a calibrated 1D panorama projective camera. Finally, we describe subspace reconstruction methods and demonstrate, both in theory and in experiments, the advantage of the decomposition method over general SFM methods, which comes from incorporating the constrained motion into the earliest stage of motion analysis.

    Collaborative Perception From Data Association To Localization

    During the last decade, visual sensors have become ubiquitous. One or more cameras can be found in devices ranging from smartphones to unmanned aerial vehicles and autonomous cars. During the same time, we have witnessed the emergence of large-scale networks ranging from sensor networks to robotic swarms. Assume multiple visual sensors perceive the same scene from different viewpoints. In order to achieve consistent perception, the problem of correspondences between observed features must first be solved. Then, it is often necessary to perform distributed localization, i.e. to estimate the pose of each agent with respect to a global reference frame. With everything set in the same coordinate system and having the same meaning for all agents, operation of the agents and interpretation of the jointly observed scene become possible. The questions we address in this thesis are the following: first, can a group of visual sensors agree on what they see, in a decentralized fashion? This is the problem of collaborative data association. Then, based on what they see, can the visual sensors agree on where they are, in a decentralized fashion as well? This is the problem of cooperative localization. The contributions of this work are five-fold. Firstly, we are the first to address the problem of consistent multiway matching in a decentralized setting. Secondly, we propose an efficient decentralized dynamical systems approach for computing any number of smallest eigenvalues and the associated eigenvectors of a weighted graph, with global convergence guarantees and direct applications in group synchronization problems, e.g. permutation or rotation synchronization. Thirdly, we propose a state-of-the-art framework for decentralized collaborative localization for mobile agents in the presence of unknown cross-correlations by solving a minimax optimization problem to account for the missing information.
    Fourthly, we are the first to present an approach to the 3-D rotation localization of a camera sensor network from relative bearing measurements. Lastly, we focus on the case of a group of three visual sensors. We propose a novel Riemannian geometric representation of the trifocal tensor, which relates projections of points and lines in three overlapping views. The aforementioned representation enables the use of state-of-the-art optimization methods on Riemannian manifolds and the use of robust averaging techniques for estimating the trifocal tensor.
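As a point of reference for the second contribution listed in this abstract, the sketch below computes the k smallest eigenvalues and eigenvectors of a weighted graph in the ordinary centralized way. It is not the decentralized dynamical-systems method of the thesis, and it assumes that "eigenvalues of a weighted graph" refers to the graph Laplacian L = D - W, which the abstract does not state. The function name `smallest_laplacian_eigs` is hypothetical.

```python
import numpy as np

def smallest_laplacian_eigs(W, k):
    """Return the k smallest eigenvalues/eigenvectors of L = D - W.

    W is a symmetric, non-negative weighted adjacency matrix. This is a
    centralized baseline; a decentralized scheme would aim to reach the
    same quantities using only neighbour-to-neighbour communication.
    """
    W = np.asarray(W, dtype=float)
    if W.ndim != 2 or W.shape[0] != W.shape[1]:
        raise ValueError("W must be a square matrix")
    if not np.allclose(W, W.T):
        raise ValueError("W must be symmetric")
    if not 1 <= k <= W.shape[0]:
        raise ValueError("k must be between 1 and the number of nodes")
    L = np.diag(W.sum(axis=1)) - W
    # eigh returns eigenvalues of a symmetric matrix in ascending order.
    vals, vecs = np.linalg.eigh(L)
    return vals[:k], vecs[:, :k]
```

For a connected graph the smallest eigenvalue is zero with a constant eigenvector, and the next eigenvectors carry the structure that spectral synchronization methods typically use.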