84 research outputs found

    A study of Symmetric and Repetitive Structures in Image-Based Modeling

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    Higher-Order Regularization in Computer Vision

    Get PDF
    At the core of many computer vision models lies the minimization of an objective function consisting of a sum of functions with few arguments. The order of the objective function is defined as the highest number of arguments of any summand. To reduce ambiguity and noise in the solution, regularization terms are included into the objective function, enforcing different properties of the solution. The most commonly used regularization is penalization of boundary length, which requires a second-order objective function. Most of this thesis is devoted to introducing higher-order regularization terms and presenting efficient minimization schemes. One of the topics of the thesis covers a reformulation of a large class of discrete functions into an equivalent form. The reformulation is shown, both in theory and practical experiments, to be advantageous for higher-order regularization models based on curvature and second-order derivatives. Another topic is the parametric max-flow problem. An analysis is given, showing its inherent limitations for large-scale problems which are common in computer vision. The thesis also introduces a segmentation approach for finding thin and elongated structures in 3D volumes. Using a line-graph formulation, it is shown how to efficiently regularize with respect to higher-order differential geometric properties such as curvature and torsion. Furthermore, an efficient optimization approach for a multi-region model is presented which, in addition to standard regularization, is able to enforce geometric constraints such as inclusion or exclusion of different regions. The final part of the thesis deals with dense stereo estimation. A new regularization model is introduced, penalizing the second-order derivatives of a depth or disparity map. Compared to previous second-order approaches to dense stereo estimation, the new regularization model is shown to be more easily optimized

    Variable Resolution & Dimensional Mapping For 3d Model Optimization

    Get PDF
    Three-dimensional computer models, especially geospatial architectural data sets, can be visualized in the same way humans experience the world, providing a realistic, interactive experience. Scene familiarization, architectural analysis, scientific visualization, and many other applications would benefit from finely detailed, high resolution, 3D models. Automated methods to construct these 3D models traditionally has produced data sets that are often low fidelity or inaccurate; otherwise, they are initially highly detailed, but are very labor and time intensive to construct. Such data sets are often not practical for common real-time usage and are not easily updated. This thesis proposes Variable Resolution & Dimensional Mapping (VRDM), a methodology that has been developed to address some of the limitations of existing approaches to model construction from images. Key components of VRDM are texture palettes, which enable variable and ultra-high resolution images to be easily composited; texture features, which allow image features to integrated as image or geometry, and have the ability to modify the geometric model structure to add detail. These components support a primary VRDM objective of facilitating model refinement with additional data. This can be done until the desired fidelity is achieved as practical limits of infinite detail are approached. Texture Levels, the third component, enable real-time interaction with a very detailed model, along with the flexibility of having alternate pixel data for a given area of the model and this is achieved through extra dimensions. Together these techniques have been used to construct models that can contain GBs of imagery data

    Evaluation Method for Automotive Stereo-Vision Systems

    Get PDF
    International audienceSafe vehicle guidance under human or computer control requires a thorough understanding of the traversed environment. Consequently if perception systems are to be introduced into mass market vehicles as part of driving assistance systems, their proper operation throughout the vehicle working life is needed. Onboard stereo-vision systems can provide rich information in terms of range, feature recognition, etc., hence the interest by car OEMs. System performance depends on multiple factors like light conditions, algorithms and the mechanical apparatus. Due to inaccuracies produced by changes in the system physical properties due to vibrations, misalignment of fixtures, etc. through the vehicle operational life a reduction in performance will occur. In this paper, an evaluation framework to estimate the performance of a vehicle onboard stereo-vision system in terms of 3D measurements and re-projection errors is presented. The approach considers changes that might occur in the system during the vehicle working life. It includes means to evaluate the self-calibration process often used to correct the effects of physical changes in the stereo-vision system. The results provide key information for the design and geometrical specification of automotive stereo-vision systems. As the potential physical changes in the geometric configuration of the camera-pair over the vehicle life time are difficult to predict, it was necessary to simulate them to generate families of errors that these might trigger on the system performance

    Pedestrian detection and tracking using stereo vision techniques

    Get PDF
    Automated pedestrian detection, counting and tracking has received significant attention from the computer vision community of late. Many of the person detection techniques described so far in the literature work well in controlled environments, such as laboratory settings with a small number of people. This allows various assumptions to be made that simplify this complex problem. The performance of these techniques, however, tends to deteriorate when presented with unconstrained environments where pedestrian appearances, numbers, orientations, movements, occlusions and lighting conditions violate these convenient assumptions. Recently, 3D stereo information has been proposed as a technique to overcome some of these issues and to guide pedestrian detection. This thesis presents such an approach, whereby after obtaining robust 3D information via a novel disparity estimation technique, pedestrian detection is performed via a 3D point clustering process within a region-growing framework. This clustering process avoids using hard thresholds by using bio-metrically inspired constraints and a number of plan view statistics. This pedestrian detection technique requires no external training and is able to robustly handle challenging real-world unconstrained environments from various camera positions and orientations. In addition, this thesis presents a continuous detect-and-track approach, with additional kinematic constraints and explicit occlusion analysis, to obtain robust temporal tracking of pedestrians over time. These approaches are experimentally validated using challenging datasets consisting of both synthetic data and real-world sequences gathered from a number of environments. In each case, the techniques are evaluated using both 2D and 3D groundtruth methodologies

    Depth and IMU aided image deblurring based on deep learning

    Get PDF
    Abstract. With the wide usage and spread of camera phones, it becomes necessary to tackle the problem of the image blur. Embedding a camera in those small devices implies obviously small sensor size compared to sensors in professional cameras such as full-frame Digital Single-Lens Reflex (DSLR) cameras. As a result, this can dramatically affect the collected amount of photons on the image sensor. To overcome this, a long exposure time is needed, but with slight motions that often happen in handheld devices, experiencing image blur is inevitable. Our interest in this thesis is the motion blur that can be caused by the camera motion, scene (objects in the scene) motion, or generally the relative motion between the camera and scene. We use deep neural network (DNN) models in contrary to conventional (non DNN-based) methods which are computationally expensive and time-consuming. The process of deblurring an image is guided by utilizing the scene depth and camera’s inertial measurement unit (IMU) records. One of the challenges of adopting DNN solutions is that a relatively huge amount of data is needed to train the neural network. Moreover, several hyperparameters need to be tuned including the network architecture itself. To train our network, a novel and promising method of synthesizing spatially-variant motion blur is proposed that considers the depth variations in the scene, which showed improvement of results against other methods. In addition to the synthetic dataset generation algorithm, a real blurry and sharp dataset collection setup is designed. This setup can provide thousands of real blurry and sharp images which can be of paramount benefit in DNN training or fine-tuning
    • …
    corecore