82 research outputs found

    Robust Estimation of Motion Parameters and Scene Geometry : Minimal Solvers and Convexification of Regularisers for Low-Rank Approximation

    In the dawning age of autonomous driving, accurate and robust tracking of vehicles is essential. This is inextricably linked with the problem of Simultaneous Localisation and Mapping (SLAM), in which one tries to determine the position of a vehicle relative to its surroundings without prior knowledge of them. The more you know about the object you wish to track, whether through sensors or mechanical construction, the more likely you are to get good positioning estimates. In the first part of this thesis, we explore new ways of improving positioning for vehicles travelling on a planar surface. This is done in three ways: we generalise the work done for monocular vision to include two cameras, we propose ways of speeding up estimation with polynomial solvers, and we develop an auto-calibration method that copes with radially distorted images without requiring pre-calibration procedures. In the second part, we continue to investigate the case of constrained motion, this time using auxiliary data from inertial measurement units (IMUs) to improve positioning of unmanned aerial vehicles (UAVs). The proposed methods improve the state of the art for partially calibrated cases (with unknown focal length) for indoor navigation. Furthermore, we propose the first real-time compatible minimal solver for simultaneous estimation of radial distortion profile, focal length, and motion parameters while utilising the IMU data. In the third and final part of this thesis, we develop a bilinear framework for low-rank regularisation, with global optimality guarantees under certain conditions. We also show equivalence between the linear and the bilinear framework, in the sense that the objectives are equal. This enables users of the alternating direction method of multipliers (ADMM), or of other subgradient or splitting methods, to transition to the new framework while enjoying the benefits of second-order methods.
    Furthermore, we propose a novel regulariser fusing two popular methods. This way we are able to combine the best of two worlds by encouraging bias reduction while enforcing low-rank solutions.

    Cross-calibration of Time-of-flight and Colour Cameras

    Time-of-flight cameras provide depth information, which is complementary to the photometric appearance of the scene in ordinary images. It is desirable to merge the depth and colour information in order to obtain a coherent scene representation. However, the individual cameras will have different viewpoints, resolutions and fields of view, which means that they must be mutually calibrated. This paper presents a geometric framework for this multi-view and multi-modal calibration problem. It is shown that three-dimensional projective transformations can be used to align depth and parallax-based representations of the scene, with or without Euclidean reconstruction. A new evaluation procedure is also developed; this allows the reprojection error to be decomposed into calibration and sensor-dependent components. The complete approach is demonstrated on a network of three time-of-flight and six colour cameras. The applications of such a system to a range of automatic scene-interpretation problems are discussed. Comment: 18 pages, 12 figures, 3 tables
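
As a simplified illustration of the idea in the abstract above, the following sketch shows how one 4x4 projective transformation can map a parallax-based point (pixel coordinates plus disparity) to Euclidean coordinates. It assumes an ideal rectified stereo pair. The function names and the focal length, baseline and principal point values are hypothetical and are not taken from the paper, whose framework is more general.

```python
def apply_projective_3d(T, p):
    """Apply a 4x4 projective transformation T to a 3D point p = (a, b, c)."""
    h = (p[0], p[1], p[2], 1.0)  # lift to homogeneous coordinates
    out = [sum(T[r][c] * h[c] for c in range(4)) for r in range(4)]
    w = out[3]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity")
    return (out[0] / w, out[1] / w, out[2] / w)  # homogeneous division


def disparity_to_euclidean(f, b, cx, cy):
    """4x4 map from (x, y, disparity) to (X, Y, Z) for a rectified pair.

    Assumed model: focal length f in pixels, baseline b in metres,
    principal point (cx, cy), so that Z = f * b / disparity.
    """
    return [
        [1.0, 0.0, 0.0, -cx],
        [0.0, 1.0, 0.0, -cy],
        [0.0, 0.0, 0.0, f],
        [0.0, 0.0, 1.0 / b, 0.0],
    ]


# Hypothetical calibration values, chosen only for illustration.
Q = disparity_to_euclidean(f=500.0, b=0.1, cx=320.0, cy=240.0)

# A pixel 100 px right of the principal point with a disparity of 25 px.
X, Y, Z = apply_projective_3d(Q, (420.0, 240.0, 25.0))
```

Because the map is projective rather than affine, the final division by the fourth coordinate is what turns disparity into depth.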

    Perspective Preserving Solution for Quasi-Orthoscopic Video See-Through HMDs

    In non-orthoscopic video see-through (VST) head-mounted displays (HMDs), depth perception through stereopsis is adversely affected by sources of spatial perception errors. Solutions for parallax-free and orthoscopic VST HMDs have been considered to ensure proper space perception, but at the expense of increased bulk and weight. In this work, we present a hybrid video-optical see-through HMD whose geometry explicitly violates the rigorous conditions of orthostereoscopy. To recover natural stereo fusion of the scene within the personal space, in a region around a predefined distance from the observer, we partially resolve the eye-camera parallax by warping the camera images through a perspective-preserving homography that accounts for the geometry of the VST HMD and refers to that distance. To validate our solution, we conducted objective and subjective tests. The goal of the tests was to assess the efficacy of our solution in recovering natural depth perception in the space around the reference distance. The results showed that the quasi-orthoscopic setting of the HMD, together with the perspective-preserving image warping, allows a correct perception of relative depths to be recovered. The perceived distortion of space around the reference plane proved to be less severe than predicted by the mathematical models.
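
The warping step described above can be illustrated with the standard plane-induced homography, H = K_eye (R + t nᵀ / d) K_cam⁻¹, which maps camera pixels to eye pixels exactly for points on a reference plane nᵀX = d. The sketch below is a minimal example under assumed pinhole models. All names and numeric values are hypothetical, and the paper's actual homography may be parameterised differently.

```python
def matmul(A, B):
    """Multiply two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def intrinsics(f, cx, cy):
    """Pinhole intrinsic matrix with square pixels and no skew."""
    return [[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]]


def intrinsics_inv(f, cx, cy):
    """Closed-form inverse of the matrix returned by intrinsics()."""
    return [[1.0 / f, 0.0, -cx / f], [0.0, 1.0 / f, -cy / f], [0.0, 0.0, 1.0]]


def plane_homography(K_eye, K_cam_inv, R, t, n, d):
    """Homography induced by the plane n.X = d (camera frame).

    Assumes a camera-frame point X maps to the eye frame as R X + t.
    """
    M = [[R[i][j] + t[i] * n[j] / d for j in range(3)] for i in range(3)]
    return matmul(matmul(K_eye, M), K_cam_inv)


def warp_point(H, u, v):
    """Map pixel (u, v) through H, including the homogeneous division."""
    x, y, w = (H[r][0] * u + H[r][1] * v + H[r][2] for r in range(3))
    return (x / w, y / w)


def project(K, X):
    """Pinhole projection of a 3D point X = (X, Y, Z)."""
    return (K[0][0] * X[0] / X[2] + K[0][2], K[1][1] * X[1] / X[2] + K[1][2])


# Hypothetical geometry: camera and eye differ by a pure translation.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, -0.03, -0.05]
d = 0.5                  # reference distance in metres
n = [0.0, 0.0, 1.0]      # fronto-parallel reference plane
K_cam = intrinsics(800.0, 320.0, 240.0)
K_eye = intrinsics(600.0, 320.0, 240.0)
H = plane_homography(K_eye, intrinsics_inv(800.0, 320.0, 240.0), R, t, n, d)

# A point on the reference plane, seen by the camera and by the eye.
X_cam = [0.1, -0.05, 0.5]
X_eye = [X_cam[i] + t[i] for i in range(3)]
warped = warp_point(H, *project(K_cam, X_cam))
expected = project(K_eye, X_eye)
```

For points on the reference plane, `warped` and `expected` agree. For points off the plane a residual parallax remains, which is consistent with the abstract describing the parallax as only partially resolved.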

    Exploiting Structural Regularities and Beyond: Vision-based Localization and Mapping in Man-Made Environments

    Image-based estimation of camera motion, known as visual odometry (VO), plays a very important role in many robotic applications such as control and navigation of unmanned mobile robots, especially when no external navigation reference signal is available. The core problem of VO is the estimation of the camera’s ego-motion (i.e. tracking), either between successive frames (relative pose estimation) or with respect to a global map (absolute pose estimation). This thesis aims to develop efficient, accurate and robust VO solutions by taking advantage of structural regularities in man-made environments, such as piece-wise planar structures, Manhattan World and, more generally, contours and edges. Furthermore, to handle challenging scenarios that are beyond the limits of classical sensor-based VO solutions, we investigate a recently emerging sensor, the event camera, and study event-based mapping, one of the key problems in event-based VO/SLAM. The main achievements are summarized as follows. First, we revisit an old topic in relative pose estimation: accurately and robustly estimating the fundamental matrix given a collection of independently estimated homographies. Three classical methods are reviewed, and we then show that a simple but nontrivial two-step normalization within the direct linear method achieves similar performance to the less attractive and more computationally intensive method based on hallucinated points. Second, an efficient 3D rotation estimation algorithm for depth cameras in piece-wise planar environments is presented. It shows that by using surface normal vectors as input, planar modes in the corresponding density distribution function can be discovered and continuously tracked using efficient non-parametric estimation techniques. The relative rotation can be estimated by registering entire bundles of planar modes using robust L1-norm minimization.
    Third, an efficient alternative to the iterative closest point algorithm for real-time tracking of modern depth cameras in Manhattan Worlds is developed. We exploit the common orthogonal structure of man-made environments in order to decouple the estimation of the rotation and the three degrees of freedom of the translation. The derived camera orientation is absolute and thus free of long-term drift, which in turn benefits the accuracy of the translation estimation as well. Fourth, we look into a more general structural regularity: edges. A real-time VO system that uses Canny edges is proposed for RGB-D cameras. Two novel alternatives to classical distance transforms are developed, which significantly improve on classical Euclidean distance field based methods in terms of efficiency, accuracy and robustness. Finally, to deal with challenging scenarios that go beyond what standard RGB/RGB-D cameras can handle, we investigate the recently emerging event camera and focus on the problem of 3D reconstruction from data captured by a stereo event-camera rig moving in a static scene, such as in the context of stereo Simultaneous Localization and Mapping.
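
Linear methods for the first contribution above (a fundamental matrix from homographies) rest on a standard compatibility constraint: if H is a plane-induced homography between the same two views as F, then HᵀF is skew-symmetric, so HᵀF + FᵀH = 0 gives linear equations in the entries of F. The sketch below only checks this constraint on synthetic data in normalised image coordinates. It does not reproduce the thesis's normalization or solver, and every name and value in it is a hypothetical choice for illustration.

```python
import math


def matmul(A, B):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]


def skew(t):
    """Cross-product matrix [t]x, so that skew(t) v equals t x v."""
    return [[0.0, -t[2], t[1]], [t[2], 0.0, -t[0]], [-t[1], t[0], 0.0]]


def rot_y(angle):
    """Rotation about the y axis by the given angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def plane_homography(R, t, n, d):
    """Homography R + t n^T / d induced by the plane n.X = d (view-1 frame)."""
    return [[R[i][j] + t[i] * n[j] / d for j in range(3)] for i in range(3)]


def compatibility_residual(H, F):
    """Largest absolute entry of H^T F + F^T H (zero when H is compatible)."""
    M = matmul(transpose(H), F)
    return max(abs(M[i][j] + M[j][i]) for i in range(3) for j in range(3))


# Synthetic two-view geometry: X2 = R X1 + t, hence F = [t]x R.
R = rot_y(0.1)
t = [0.2, 0.0, 0.05]
F = matmul(skew(t), R)

# Two scene planes, e.g. a floor and a wall, each inducing a homography.
H_floor = plane_homography(R, t, [0.0, 1.0, 0.0], 1.5)
H_wall = plane_homography(R, t, [0.0, 0.0, 1.0], 4.0)
res_floor = compatibility_residual(H_floor, F)
res_wall = compatibility_residual(H_wall, F)

# The identity is not induced by any plane of this scene, so it fails.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
res_wrong = compatibility_residual(identity, F)
```

Stacking these equations for two or more homographies and taking the null vector of the resulting system is the usual direct linear estimate of F. With pixel coordinates the system is badly scaled, which is why normalization matters in practice.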

    Uncalibrated stereo vision applied to breast cancer treatment aesthetic assessment

    Integrated Master's (Mestrado Integrado). Informatics and Computing Engineering. Universidade do Porto. Faculdade de Engenharia. 201

    Graph-based Spatial Motion Tracking Using Affine-covariant Regions

    This thesis considers the task of spatial motion reconstruction from image sequences using a stereoscopic camera setup. In a variety of fields, such as flow analysis in physics or the measurement of oscillation characteristics and damping behavior in mechanical engineering, efficient and accurate methods for motion analysis are of great importance. This work discusses each algorithmic step of the motion reconstruction problem using a set of freely available image sequences. The presented concepts and evaluation results are of a generic nature and may thus be applied to a multitude of applications in various fields, where motion can be observed by two calibrated cameras. The first step in the processing chain of a motion reconstruction algorithm is concerned with the automated detection of salient locations (=features or regions) within each image of a given sequence. In this thesis, detection is directly performed on the natural texture of the observed objects instead of using artificial marker elements (as with many currently available methods). As one of the major contributions of this work, five well-known detection methods from the contemporary literature are compared to each other with regard to several performance measures, such as localization accuracy or the robustness under perspective distortions. The given results extend the available literature on the topic and facilitate the well-founded selection of appropriate detectors according to the requirements of specific target applications. In the second step, both spatial and temporal correspondences have to be established between features extracted from different images. With the former, two images taken at the same time instant but with different cameras are considered (stereo reconstruction) while with the latter, correspondences are sought between temporally adjacent images from the same camera instead (monocular feature tracking). 
    With most classical methods, an observed object is either spatially reconstructed at a single time instant, yielding a set of three-dimensional coordinates, or its motion is analyzed separately within each camera, yielding a set of two-dimensional trajectories. A major contribution of this thesis is a concept for unifying stereo reconstruction and monocular tracking. Based on sets of two-dimensional trajectories from each camera of a stereo setup, the proposed method uses a graph-based approach to find correspondences not between single features but between entire trajectories. This significantly mitigates the influence of locally ambiguous correspondences. The resulting spatial trajectories contain both the three-dimensional structure and the motion of the observed objects. To the best of the author's knowledge, a similar concept does not yet exist in the literature. In a detailed evaluation, the superiority of the new method is demonstrated.

    From light rays to 3D models

    Geometric and photometric affine invariant image registration

    This thesis aims to present a solution to the correspondence problem for the registration of wide-baseline images taken from uncalibrated cameras. We propose an affine invariant descriptor that combines the geometry and photometry of the scene to find correspondences between both views. The geometric affine invariant component of the descriptor is based on the affine arc-length metric, whereas the photometry is analysed by invariant colour moments. A graph structure represents the spatial distribution of the primitive features; i.e. nodes correspond to detected high-curvature points, whereas arcs represent connectivities by extracted contours. After matching, we refine the search for correspondences by using a maximum likelihood robust algorithm. We have evaluated the system over synthetic and real data. The method is endemic to propagation of errors introduced by approximations in the system. BAE Systems; Selex Sensors and Airborne Systems

    Image-Based Rendering Of Real Environments For Virtual Reality

    Raum-Zeit Interpolationstechniken (Space-Time Interpolation Techniques)

    The photo-realistic modeling and animation of complex scenes in 3D requires a lot of work and skill from artists, even with modern acquisition techniques. This is especially true if the rendering should additionally be performed in real-time. In this thesis we follow another direction in computer graphics to generate photo-realistic results based on recorded video sequences from one or multiple cameras. We propose several methods to handle scenes showing natural phenomena and also multi-view footage of general complex 3D scenes. In contrast to other approaches, we make use of relaxed geometric constraints and focus especially on image properties important for creating perceptually plausible in-between images. The results are novel photo-realistic video sequences rendered in real-time, allowing interactive manipulation and interactive exploration of novel views and time points.