172 research outputs found

    Affine reconstruction of curved surfaces from uncalibrated views of apparent contours


    Representations for Cognitive Vision: A Review of Appearance-Based, Spatio-Temporal, and Graph-Based Approaches

    The emerging discipline of cognitive vision requires a proper representation of visual information, including spatial and temporal relationships, scenes, events, semantics, and context. This review article summarizes existing representational schemes in computer vision which might be useful for cognitive vision, and discusses promising future research directions. The various approaches are categorized into appearance-based, spatio-temporal, and graph-based representations for cognitive vision. While the representation of objects has been covered extensively in computer vision research, both from a reconstruction and from a recognition point of view, cognitive vision will also require new ideas on how to represent scenes. We introduce new concepts for scene representations and discuss how these might be efficiently implemented in future cognitive vision systems.

    Ordinal depth from SFM and its application in robust scene recognition

    Ph.D. (Doctor of Philosophy) thesis

    Purposive three-dimensional reconstruction by means of a controlled environment

    Retrieving 3D data using imaging devices is a relevant task for many applications in medical imaging, surveillance, industrial quality control, and others. As soon as we gain procedural control over the parameters of the imaging device, we need well-defined reconstruction goals and methods to achieve them; hence, we enter next-best-view planning. In this work, we present a formalization of the abstract view planning problem and deal with different planning aspects, wherein we focus on using an intensity camera without active illumination. As one aspect of view planning, employing a controlled environment also provides the planning and reconstruction methods with additional information. We incorporate the additional knowledge of camera parameters into the Kanade-Lucas-Tomasi method used for feature tracking. The resulting Guided KLT tracking method benefits from a constrained optimization space and yields improved accuracy while accounting for the uncertainty of the additional input. Serving other planning tasks dealing with known objects, we propose a method for coarse registration of 3D surface triangulations. By means of exact surface moments of surface triangulations, we establish invariant surface descriptors based on moment invariants. These descriptors allow us to tackle tasks of surface registration, classification, retrieval, and clustering, which are also relevant to view planning. In the main part of this work, we present a modular, online approach to view planning for 3D reconstruction. Based on the outcome of Guided KLT tracking, we design a planning module for accuracy optimization with respect to an extended E-criterion. Further planning modules provide non-discrete surface estimation and visibility analysis. The modular nature of the proposed planning system allows us to address a wide range of specific instances of view planning. The theoretical findings in this work are supported by experiments evaluating the relevant criteria.
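The core of the Kanade-Lucas-Tomasi tracking mentioned above can be illustrated in a few lines. The sketch below is a minimal, single-level, single-window Lucas-Kanade step, not the thesis's Guided variant (which additionally constrains the optimization with known camera parameters); the function name and signature are hypothetical, chosen for illustration.

```python
import numpy as np

def lucas_kanade_step(I0, I1, x, y, win=9):
    """One Lucas-Kanade update: estimate the displacement (dx, dy) of the
    window centred at pixel (x, y) between frames I0 and I1 by solving the
    2x2 normal equations G d = b built from image gradients.
    Minimal single-level sketch; real KLT trackers iterate over pyramids."""
    r = win // 2
    Iy_full, Ix_full = np.gradient(I0.astype(float))   # spatial gradients
    sl = (slice(y - r, y + r + 1), slice(x - r, x + r + 1))
    Ix, Iy = Ix_full[sl], Iy_full[sl]
    It = (I1.astype(float) - I0.astype(float))[sl]     # temporal difference
    G = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
    return np.linalg.solve(G, b)                       # displacement (dx, dy)
```

The 2x2 matrix G is the structure tensor of the window; it is well conditioned only on textured patches, which is exactly the "good features to track" criterion.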

    Task-Driven Video Collection

    Vision systems are increasingly being deployed to perform complex surveillance tasks. While improved algorithms are being developed to perform these tasks, it is also important that data suitable for these algorithms be acquired - a non-trivial task in a dynamic and crowded scene viewed by multiple PTZ cameras. In this paper, we describe a multi-camera system that collects images and videos of moving objects in such scenes, subject to task constraints. The system constructs "task visibility intervals" that contain information about what can be sensed in future time intervals. Constructing these intervals requires prediction of future object motion and consideration of several factors such as object occlusion and camera control parameters. Using a plane-sweep algorithm, these atomic intervals can be combined to form multi-task intervals, during which a single camera can collect videos suitable for multiple tasks simultaneously. Although cameras can then be scheduled based on the constructed intervals, finding an optimal schedule is an NP-hard problem. Because of this, and because exact future information is unavailable in a dynamic environment, we propose several methods for fast camera scheduling that yield solutions within a small constant factor of optimal. Experimental results illustrate the system's capabilities for both real and more complicated simulated scenarios.
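The flavour of interval-based camera scheduling involved can be illustrated with the classic earliest-finish-time greedy, shown below for a simplified single-camera case where each interval serves one task. This is a generic textbook heuristic, not the paper's scheduler; the multi-camera, multi-task problem it addresses is NP-hard and handled there with constant-factor approximations.

```python
def greedy_schedule(intervals):
    """Pick a maximum-cardinality set of non-overlapping (start, end)
    intervals for one camera by repeatedly taking the interval that
    finishes earliest among those compatible with what is already chosen."""
    chosen, last_end = [], float("-inf")
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_end:          # compatible with the last chosen interval
            chosen.append((start, end))
            last_end = end
    return chosen
```

Sorting by finish time makes this greedy provably optimal for the single-camera count-maximization case, which is why it is a natural building block for fast approximate schedulers.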

    3D regularized B-spline surface reconstruction from occluding contours of a sequence of images

    The three-dimensional surface reconstruction of a non-polyhedral object is a difficult problem in computer vision. In this paper, a new method is presented for reconstructing three-dimensional surfaces from the recovered motion of occluding contours in calibrated image sequences. We use uniform bicubic B-spline surface patches to give a parametric representation of the object surface. The problem of reconstructing the three-dimensional B-spline surface patches then reduces to finding their control points by solving a nonlinear system. Two numerical methods are outlined: Levenberg-Marquardt and quasi-Newton. To avoid classic camera calibration, which needs a calibration pattern, we propose a direct nonlinear method for the autocalibration of a camera using stable points in the scene. Our approach applies when the camera is calibrated and the object is smooth, specifically when its surface is at least C2. Results for reconstruction based on synthetic and real data are presented.
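The first of the two solvers named above, Levenberg-Marquardt, can be sketched as a generic damped Gauss-Newton loop. This is an illustrative minimal implementation for an arbitrary residual function, not the paper's B-spline-specific system; the function names are hypothetical.

```python
import numpy as np

def levenberg_marquardt(resid, jac, x0, lam=1e-3, iters=50):
    """Minimize ||resid(x)||^2 by solving the damped normal equations
    (J^T J + lam I) step = -J^T r each iteration, halving lam when a step
    lowers the cost (trust the Gauss-Newton model) and doubling it when
    the step is rejected (fall back toward gradient descent)."""
    x = np.asarray(x0, dtype=float)
    cost = np.sum(resid(x) ** 2)
    for _ in range(iters):
        r, J = resid(x), jac(x)
        A = J.T @ J + lam * np.eye(len(x))
        step = np.linalg.solve(A, -J.T @ r)
        x_new = x + step
        cost_new = np.sum(resid(x_new) ** 2)
        if cost_new < cost:            # accept the step
            x, cost, lam = x_new, cost_new, lam * 0.5
        else:                          # reject the step, increase damping
            lam *= 2.0
    return x
```

In the paper's setting, x would collect the B-spline control points and resid would measure the discrepancy between predicted and observed occluding contours.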

    Active recognition through next view planning: a survey


    Depth Enhancement and Surface Reconstruction with RGB/D Sequence

    Surface reconstruction and 3D modeling is a challenging task that has been explored for decades by the computer vision, computer graphics, and machine learning communities. It is fundamental to many applications such as robot navigation, animation, scene understanding, industrial control, and medical diagnosis. In this dissertation, I take advantage of consumer depth sensors for surface reconstruction. Considering their limited ability to capture detailed surface geometry, a depth enhancement approach is first proposed to recover small, rich geometric details from captured depth and color sequences. In addition to enhancing spatial resolution, I present a hybrid camera to improve the temporal resolution of a consumer depth sensor and propose an optimization framework to capture high-speed motion and generate high-speed depth streams. Given the partial scans from the depth sensor, we also develop a novel fusion approach to build complete and watertight human models with a template-guided registration method. Finally, the problem of surface reconstruction for non-Lambertian objects, on which current depth sensors fail, is addressed by exploiting multi-view images captured with a hand-held color camera, and we propose a visual-hull-based approach to recover the 3D model.
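One common baseline for refining a noisy consumer depth map with an aligned color image is cross- (joint-) bilateral filtering: neighbours with similar color are assumed to lie on the same surface and are averaged, while color edges block smoothing. The sketch below is a generic illustration of that idea only, not the dissertation's actual enhancement method; all names are hypothetical.

```python
import numpy as np

def cross_bilateral(depth, color, r=2, sigma_s=2.0, sigma_c=10.0):
    """Edge-preserving smoothing of `depth` guided by `color`:
    each output pixel is a weighted average over a (2r+1)^2 window,
    with a spatial Gaussian weight and a color-similarity Gaussian
    weight (large color difference -> near-zero weight)."""
    H, W = depth.shape
    out = np.zeros((H, W), dtype=float)
    for y in range(H):
        for x in range(W):
            num = den = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < H and 0 <= xx < W:
                        ws = np.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
                        dc = float(color[yy, xx]) - float(color[y, x])
                        wc = np.exp(-dc * dc / (2 * sigma_c ** 2))
                        num += ws * wc * depth[yy, xx]
                        den += ws * wc
            out[y, x] = num / den
    return out
```

Production implementations vectorize or approximate this (e.g. with a bilateral grid); the double loop here is purely for clarity.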

    View Synthesis from Image and Video for Object Recognition Applications

    Object recognition is one of the most important and successful applications in the computer vision community. The varying appearance of the test object due to different poses or illumination conditions can make the object recognition problem very challenging. Using view synthesis techniques to generate pose-invariant or illumination-invariant images or videos of the test object is an appealing approach to counter the degradation of recognition performance under non-canonical views or lighting conditions. In this thesis, we first present a complete framework for better synthesis and understanding of the human pose from a limited number of available silhouette images. Pose-normalized silhouette images are generated using an active virtual camera and an image-based visual hull technique, with the silhouette turning-function distance used as the pose similarity measure. To overcome the inability of the shape-from-silhouettes method to reconstruct concave regions of human postures, a view synthesis algorithm is proposed for articulated humans using the visual hull and contour-based body part segmentation. These two components improve each other through correspondences across viewpoints built via the inner-distance shape context measure. Face recognition under varying pose is a challenging problem, especially when illumination variations are also present. We propose two algorithms to address this scenario. For a single light source, we demonstrate a pose-normalized face synthesis approach on a pixel-by-pixel basis from a single view by exploiting the bilateral symmetry of the human face. For more complicated illumination conditions, the spherical harmonic representation is extended to encode pose information. An efficient method is proposed for robust face synthesis and recognition with a very compact training set.
    Finally, we present an end-to-end moving-object verification system for airborne video, wherein a homography-based view synthesis algorithm is used to simultaneously handle the object's changes in aspect angle, depression angle, and resolution. Efficient integration of spatial and temporal model matching ensures the robustness of the verification step. As a byproduct, a robust two-camera tracking method using homographies is also proposed and demonstrated on challenging surveillance video sequences.
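The core operation behind homography-based view synthesis and two-camera tracking is mapping points through a 3x3 homography: a matrix multiply in homogeneous coordinates followed by division by the third coordinate. A minimal sketch (the helper name is illustrative; the thesis's full pipeline also estimates H and warps whole images):

```python
import numpy as np

def warp_points(H, pts):
    """Map an (N, 2) array of 2D points through a 3x3 homography H.
    Points are lifted to homogeneous coordinates, multiplied by H,
    then normalized by the resulting third coordinate."""
    ph = np.hstack([pts, np.ones((len(pts), 1))])  # (N, 3) homogeneous
    q = ph @ H.T
    return q[:, :2] / q[:, 2:3]                    # perspective divide
```

For a pure affine H the third row is (0, 0, 1) and the divide is a no-op; a nonzero first or second entry in the third row introduces the perspective foreshortening that models aspect- and depression-angle changes.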