
    Multi-Scale 3D Scene Flow from Binocular Stereo Sequences

    Scene flow methods estimate the three-dimensional motion field for points in the world, using multi-camera video data. Such methods combine multi-view reconstruction with motion estimation. This paper describes an alternative formulation for dense scene flow estimation that provides reliable results using only two cameras, by fusing stereo and optical flow estimation into a single coherent framework. Internally, the proposed algorithm generates probability distributions for optical flow and disparity. Taking the uncertainty of these intermediate stages into account allows more reliable estimation of the 3D scene flow than previous methods achieve. To handle the aperture problems inherent in estimating optical flow and disparity, a multi-scale method along with a novel region-based technique is used within a regularized solution. This combined approach both preserves discontinuities and prevents over-regularization, two problems commonly associated with basic multi-scale approaches. Experiments with synthetic and real test data demonstrate the strength of the proposed approach. National Science Foundation (CNS-0202067, IIS-0208876); Office of Naval Research (N00014-03-1-0108)
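
    A minimal sketch (not the paper's probabilistic fusion, which carries distributions over flow and disparity through to the result) of the deterministic core that any two-camera scene flow formulation rests on: back-project a pixel through its disparity at two time steps and difference the 3D points. The rectified-stereo assumption and all names and parameters (focal length f, baseline B, principal point cx, cy) are illustrative.

    ```python
    import numpy as np

    def scene_flow_from_stereo(x, y, disp_t, disp_t1, flow_u, flow_v, f, B, cx, cy):
        """3D scene flow for pixels (x, y): back-project at time t, follow the
        optical flow (flow_u, flow_v) to time t+1, back-project again through
        the new disparity, and difference the two 3D points."""
        def backproject(px, py, d):
            Z = f * B / np.maximum(d, 1e-6)   # depth from disparity (rectified pair)
            X = (px - cx) * Z / f
            Y = (py - cy) * Z / f
            return np.stack([X, Y, Z], axis=-1)

        P0 = backproject(x, y, disp_t)                     # 3D point at time t
        P1 = backproject(x + flow_u, y + flow_v, disp_t1)  # matched point at t+1
        return P1 - P0                                     # per-pixel 3D motion
    ```

    The paper's contribution is precisely that it does not commit to single flow/disparity estimates as this sketch does, but propagates their uncertainty into the regularized scene flow solution.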

    Single View Modeling and View Synthesis

    This thesis develops new algorithms to produce 3D content from a single camera. Today, amateurs can use hand-held camcorders to capture and display the 3D world in 2D, using mature technologies. However, there is a strong desire to record and re-explore the 3D world in 3D. To achieve this goal, current approaches usually rely on a camera array, which suffers from tedious setup and calibration as well as poor portability, limiting its application to lab experiments. In this thesis, I produce 3D content using a single camera, making it as simple as shooting pictures. This requires a new front-end capture device rather than a regular camcorder, as well as more sophisticated algorithms. First, in order to capture highly detailed object surfaces, I designed and developed a depth camera based on a novel technique called light fall-off stereo (LFS). The LFS depth camera outputs color+depth image sequences at 30 fps, which is necessary for capturing dynamic scenes. Based on the output color+depth images, I developed a new approach that builds 3D models of dynamic and deformable objects. While the camera can only capture part of a whole object at any instant, partial surfaces are assembled into a complete 3D model by a novel warping algorithm. Inspired by the success of single-view 3D modeling, I extended my exploration to 2D-to-3D video conversion that does not use a depth camera. I developed a semi-automatic system that converts monocular videos into stereoscopic videos via view synthesis. It combines motion analysis with user interaction, aiming to transfer as much of the depth-inference work as possible from the user to the computer. I developed two new methods that analyze the optical flow in order to provide additional qualitative depth constraints. The automatically extracted depth information is presented in the user interface to assist the user's labeling work. In summary, depending on the input data, my algorithms can build high-fidelity 3D models of dynamic and deformable objects when depth maps are provided; otherwise, they can turn video clips into stereoscopic video.
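
    The abstract does not spell out the LFS math; the sketch below shows the standard inverse-square-law reading of "light fall-off" under my assumptions about the setup: two exposures of a static pose with a point light displaced by a known delta along the viewing axis, so that the intensity ratio cancels albedo and shading terms and leaves depth.

    ```python
    import numpy as np

    def depth_from_light_falloff(I_near, I_far, delta):
        """Per-pixel depth from two images lit by a point source at two
        positions separated by delta (metres) along the viewing axis.
        Uses only the inverse-square law I ~ 1/d^2, so surface albedo
        cancels in the ratio: d = delta / (sqrt(I_near / I_far) - 1)."""
        ratio = np.sqrt(np.maximum(I_near, 1e-9) / np.maximum(I_far, 1e-9))
        return delta / np.maximum(ratio - 1.0, 1e-9)
    ```

    Real capture at 30 fps, as the thesis reports, additionally has to interleave the two lighting states and cope with sensor noise, which this sketch ignores.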

    Accurately scaled 3-D scene reconstruction using a moving monocular camera and a single-point depth sensor

    Abstract: A 3-D reconstruction produced using only a single camera and Structure from Motion (SfM) is determined only up to scale, i.e. without real-world dimensions. Real-world dimensions are necessary for many applications of 3-D reconstruction, since decisions are made based on the accuracy of the reconstruction and the estimated camera poses. Current solutions to the absence of scale require prior knowledge of, or access to, the imaged environment in order to provide absolute scale to a reconstruction, yet it is often necessary to reconstruct an inaccessible or unknown environment. This research proposes the use of a basic SfM pipeline for 3-D reconstruction with a single camera, while augmenting the camera with a depth measurement for each image by way of a laser point marker. The marker is identified in the image and its ray is projected into the up-to-scale reconstruction, where its 3-D location is taken to be the point of highest point density along the projection. The known distance to this point provides a scale factor that can be applied to the up-to-scale reconstruction. The results obtained show that the proposed augmentation does provide better scale accuracy. The SfM pipeline has room for improvement, especially in the two-view geometry and structure estimation stages. A proof of concept is achieved that may open the door to improved algorithms for more demanding applications. M.Ing. (Electrical and Electronic Engineering)
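
    A hedged sketch of the scale-recovery step as described: one known metric distance between the camera centre and the reconstructed marker point fixes a global scale for the whole reconstruction. Function and variable names are illustrative, not the thesis' implementation.

    ```python
    import numpy as np

    def apply_metric_scale(points, cam_centers, marker_point, cam_center, laser_dist):
        """Rescale an up-to-scale SfM reconstruction to metric units.

        marker_point: 3D location of the laser marker in the up-to-scale
        reconstruction (selected as the densest point along the projected
        marker ray); laser_dist: the measured range in metres."""
        s = laser_dist / np.linalg.norm(marker_point - cam_center)
        return points * s, cam_centers * s    # one known distance fixes the scale
    ```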

    Restricted ability to recover three-dimensional global motion from one-dimensional local signals: Theoretical observations

    Recovering 3D information from a 2D time-varying image is a vital task which human observers face daily. Numerous models exist which compute global 3D structure and motion on the basis of 2D local motion measurements of point-like elements. On the other hand, both experimental and computational research of early visual motion mechanisms emphasize the role of oriented (1D) detectors. Therefore, it is important to find out whether indeed 1D motion signals can serve as primary cues for 3D global motion computation. We have addressed this question by combining mathematical results and perceptual observations. We show that given the 2D-projected 1D instantaneous velocity field, it is mathematically impossible to discriminate rigid rotations from non-rigid transformations and/or to recover the rotation parameters. We relate this fact to existing results in cases where localized (point-like) cues are present, and to our own experiments on human performance in global motion perception when only 1D cues are given. Taken together, the data suggest a necessary role for localized information in early motion mechanisms and call for further physiological and psychophysical research in that direction.
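
    A tiny illustration (mine, not the paper's formalism) of why 1D signals under-constrain the problem: an oriented detector measures only the component of image velocity normal to its orientation, so distinct true velocities collapse to the same measurement, and it is this local ambiguity that the paper shows cannot be resolved at the global 3D level for rotations.

    ```python
    import numpy as np

    def normal_flow(v, n):
        """The only part of image velocity v a 1D (oriented) detector sees:
        the projection of v onto the unit normal n of the detector's
        preferred orientation (the aperture problem)."""
        n = np.asarray(n, dtype=float)
        n = n / np.linalg.norm(n)
        return np.dot(v, n) * n

    # Two different true velocities are indistinguishable to this detector:
    n = [1.0, 0.0]
    print(normal_flow([2.0, 1.0], n))    # [2. 0.]
    print(normal_flow([2.0, -3.0], n))   # [2. 0.]
    ```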

    A topological solution to object segmentation and tracking

    The world is composed of objects, the ground, and the sky. Visual perception of objects requires solving two fundamental challenges: segmenting visual input into discrete units, and tracking the identities of these units despite appearance changes due to object deformation, changing perspective, and dynamic occlusion. Current computer vision methods for segmentation and tracking that approach human performance all require learning, raising the question: can objects be segmented and tracked without learning? Here, we show that the mathematical structure of light rays reflected from environment surfaces yields a natural representation of persistent surfaces, and that this surface representation provides a solution to both the segmentation and tracking problems. We describe how to generate this surface representation from continuous visual input, and demonstrate that our approach can segment and invariantly track objects in cluttered synthetic video despite severe appearance changes, without requiring learning. Comment: 21 pages, 6 main figures, 3 supplemental figures, and supplementary material containing mathematical proofs.

    A PCA approach to the object constancy for faces using view-based models of the face

    The analysis of object and face recognition by humans attracts a great deal of interest, mainly because of its many applications in fields including psychology, security, computer technology, medicine and computer graphics. The aim of this work is to investigate whether a PCA-based mapping approach can offer a new perspective on models of object constancy for faces in human vision. An existing system for facial motion capture and animation, developed for performance-driven animation of avatars, is adapted, improved and repurposed to study face representation in the context of viewpoint and lighting invariance. The main goal of the thesis is to develop and evaluate a new approach to viewpoint invariance that is view-based and allows facial variation to be mapped between different views to construct a multi-view representation of the face. The thesis describes a computer implementation of a model that uses PCA to generate example-based models of the face. The work explores the joint encoding of expression and viewpoint using PCA and the mapping between view-specific PCA spaces. Simultaneous, synchronised video recording of 6 views of the face was used to construct multi-view representations, which helped to investigate how well multiple views could be recovered from a single view via the content-addressable memory property of PCA. A similar approach was taken to lighting invariance. Finally, the possibility of constructing a multi-view representation from asynchronous view-based data was explored. The results of this thesis have implications for a continuing research problem in computer vision: recognising faces and objects from different viewpoints and under different lighting. They also provide a new approach to understanding viewpoint and lighting invariance in human observers.
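
    A minimal sketch, under my own assumptions about the representation, of the two ingredients named here: view-specific PCA spaces fitted to face images, and a linear mapping between their coefficient spaces learned from synchronised frames. The thesis' joint encoding of expression and viewpoint is richer than this.

    ```python
    import numpy as np

    def fit_pca(X, k):
        """k-component PCA of row-stacked face vectors X (n_samples x n_pixels)."""
        mu = X.mean(axis=0)
        _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
        return mu, Vt[:k]                      # mean face and principal axes

    def encode(x, mu, axes):
        return (x - mu) @ axes.T               # image -> PCA coefficients

    def decode(c, mu, axes):
        return mu + c @ axes                   # coefficients -> image

    def fit_view_mapping(C_a, C_b):
        """Least-squares linear map between two views' coefficient spaces,
        learned from corresponding frames: C_b ~ C_a @ M."""
        M, *_ = np.linalg.lstsq(C_a, C_b, rcond=None)
        return M
    ```

    Given a new frame seen only in view A, encode it in view A's space, multiply by M, and decode with view B's mean and axes to predict the unseen view, which is one way to read recovering multiple views from a single view.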

    Intraoperative Endoscopic Augmented Reality in Third Ventriculostomy

    In neurosurgery, brain shift causes the preoperative patient models used as an intraoperative reference to change, so meaningful use of the preoperative virtual models during the operation requires a model update. The NEAR project (Neuroendoscopy towards Augmented Reality) describes a new camera calibration model for highly distorted lenses and introduces the concept of active endoscopes endowed with navigation, camera calibration, augmented reality and triangulation modules.
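
    The NEAR calibration model itself is not given in this abstract; purely for orientation, here is the standard Brown-style polynomial radial model that endoscope calibration work typically starts from, inverted by fixed-point iteration. Coefficients k1..k3 and the iteration count are illustrative.

    ```python
    def undistort_radial(xd, yd, k1, k2, k3, iters=10):
        """Undo x_d = x_u * (1 + k1*r^2 + k2*r^4 + k3*r^6), with r the radius
        of the undistorted normalized point, by fixed-point iteration."""
        x, y = xd, yd                          # initial guess: no distortion
        for _ in range(iters):
            r2 = x * x + y * y
            f = 1.0 + r2 * (k1 + r2 * (k2 + r2 * k3))
            x, y = xd / f, yd / f              # refine with current radius
        return x, y
    ```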

    Deformable 3-D Modelling from Uncalibrated Video Sequences

    Submitted for the degree of Doctor of Philosophy, Queen Mary, University of London.

    Between Laws and Models: Some Philosophical Morals of Lagrangian Mechanics

    I extract some philosophical morals from some aspects of Lagrangian mechanics. (A companion paper will present similar morals from Hamiltonian mechanics and Hamilton-Jacobi theory.) One main moral concerns methodology: Lagrangian mechanics provides a level of description of phenomena which has been largely ignored by philosophers, since it falls between their accustomed levels, "laws of nature" and "models". Another main moral concerns ontology: the ontology of Lagrangian mechanics is both more subtle and more problematic than philosophers often realize. The treatment of Lagrangian mechanics provides an introduction to the subject for philosophers, and is technically elementary. In particular, it is confined to systems with a finite number of degrees of freedom and for the most part eschews modern geometry. But it includes a presentation of Routhian reduction and of Noether's "first theorem". Comment: 106 pages, no figures.
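
    For orientation, the two pieces of machinery the abstract leans on, in standard textbook form (the paper's own notation may differ):

    ```latex
    % Euler-Lagrange equations for L(q, \dot q, t) with finitely many
    % degrees of freedom q^1, \dots, q^n:
    \frac{d}{dt}\,\frac{\partial L}{\partial \dot q^i}
      - \frac{\partial L}{\partial q^i} = 0, \qquad i = 1, \dots, n.

    % Noether's first theorem (point symmetries): if L is invariant to
    % first order under q^i \mapsto q^i + \epsilon\,\xi^i(q, t), then
    % along any solution the quantity
    C = \sum_{i=1}^{n} \frac{\partial L}{\partial \dot q^i}\,\xi^i(q, t)
    % is a constant of the motion.
    ```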