    Advances in 3D reconstruction

    The thesis tackles the problem of reconstructing three-dimensional scenes from unstructured collections of photographs. The state of the art is advanced on several fronts. The first contribution is a robust formulation of the structure and motion problem based on a hierarchical approach, as opposed to the sequential one prevalent in the literature. This methodology reduces the total computational cost by one order of magnitude, is inherently parallelizable, minimizes the error accumulation that causes drift, and eliminates the critical dependence on the choice of the initial pair of views, which is common to all competing approaches. The second contribution is a novel self-calibration procedure, very robust and tailored to the structure and motion task. The proposed solution is a closed-form procedure for recovering the plane at infinity given a rough estimate of the intrinsic parameters of at least two cameras. This method is employed in an exhaustive search over the internal parameters, whose space is inherently bounded by the finiteness of acquisition devices. Finally, we investigated how to visualize the reconstruction results in an efficient and compelling way: to this end, several algorithms for the computation of stereo disparity are presented. Together with procedures for the automatic extraction of support planes, they have been employed to obtain a faithful, compact and semantically significant representation of the scene as a collection of textured planes, optionally augmented by depth information encoded in relief maps. Every result has been verified by a rigorous experimental validation, comprising both qualitative and quantitative comparisons.
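
    The parallelism claimed for the hierarchical approach can be illustrated with a toy merge schedule. This is only a sketch: the abstract does not say how views are clustered, so the adjacent pairing below, the function name `merge_rounds` and the integer view labels are all assumptions made for illustration. It shows that a balanced tree over n views needs about log2(n) dependent rounds, and that the merges inside a round are independent of one another, whereas a sequential pipeline needs n - 1 dependent steps.

```python
def merge_rounds(views):
    """Return the merge schedule of a balanced binary tree over `views`.

    Each round is a list of (left, right) cluster pairs. Merges within a
    round do not depend on each other, so they could run in parallel.
    The adjacent pairing is an illustrative assumption, not the
    clustering criterion used in the thesis.
    """
    level = [(v,) for v in views]      # start with one cluster per view
    rounds = []
    while len(level) > 1:
        # Pair neighbouring clusters; an odd one is carried over unchanged.
        merges = [(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        next_level = [left + right for left, right in merges]
        if len(level) % 2:
            next_level.append(level[-1])
        rounds.append(merges)
        level = next_level
    return rounds


schedule = merge_rounds(range(8))
sequential_steps = 8 - 1               # one dependent step per added view
hierarchical_rounds = len(schedule)    # dependent rounds in the balanced tree
```

    With 8 views the schedule has 3 rounds containing 4, 2 and 1 merges, compared with 7 dependent steps when views are added one at a time.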

    Rendering from unstructured collections of images

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. Includes bibliographical references (p. 157-163). Computer graphics researchers recently have turned to image-based rendering to achieve the goal of photorealistic graphics. Instead of constructing a scene with millions of polygons, the scene is represented by a collection of photographs along with a greatly simplified geometric model. This simple representation allows traditional light transport simulations to be replaced with basic image-processing routines that combine multiple images together to produce never-before-seen images from new vantage points. This thesis presents a new image-based rendering algorithm called unstructured lumigraph rendering (ULR). ULR is an image-based rendering algorithm that is specifically designed to work with unstructured (i.e., irregularly arranged) collections of images. The algorithm is unique in that it is capable of using any amount of geometric or image information that is available about a scene. Specifically, the research in this thesis makes the following contributions: * An enumeration of image-based rendering properties that an ideal algorithm should attempt to satisfy. An algorithm that satisfies these properties should work as well as possible with any configuration of input images or geometric knowledge. * An optimal formulation of the basic image-based rendering problem, the solution to which is designed to satisfy the aforementioned properties. * The unstructured lumigraph rendering algorithm, which is an efficient approximation to the optimal image-based rendering solution. * A non-metric ULR algorithm, which generalizes the basic ULR algorithm to work with uncalibrated images. * A time-dependent ULR algorithm, which generalizes the basic ULR algorithm to work with time-dependent data. by Christopher James Buehler. Ph.D.

    Accurate automatic localization of surfaces of revolution for self-calibration and metric reconstruction

    In this paper, we address the problem of the automatic metric reconstruction of a Surface of Revolution (SOR) from a single uncalibrated view. The apparent contour and the visible portions of the imaged SOR cross sections are extracted and classified. The harmonic homology that models the image projection of the SOR is also estimated. The special care devoted to accuracy and robustness with respect to outliers makes the approach suitable for automatic camera calibration and metric reconstruction from single uncalibrated views of a SOR. Robustness and accuracy are obtained by embedding a graph-based grouping strategy (Euclidean Minimum Spanning Tree) into an Iterative Closest Point framework for projective curve alignment at multiple scales. Classification of SOR curves is achieved through a 2-dof voting scheme based on a novel parametrization of a pencil of conics. The main contribution of this work is to extend the domain of automatic single view reconstruction from piecewise planar scenes to scenes including curved surfaces, thus allowing realistic image models of man-made objects to be created automatically. Experimental results with real images taken from the internet are reported, and the effectiveness and limitations of the approach are discussed.
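
    For reference, a harmonic homology can be written in its standard textbook form as below. The abstract does not give notation, so the symbols are assumptions: $\mathbf{l}_s$ denotes the axis of the homology (here, the imaged axis of symmetry of the SOR) and $\mathbf{v}_\infty$ its vertex, both as homogeneous 3-vectors.

```latex
\[
  \mathbf{H} \;=\; \mathbf{I} \;-\; 2\,
  \frac{\mathbf{v}_\infty\,\mathbf{l}_s^{\top}}{\mathbf{v}_\infty^{\top}\,\mathbf{l}_s},
  \qquad
  \mathbf{H}^2 = \mathbf{I}.
\]
% Points x on the axis (l_s^T x = 0) are fixed, the vertex v_inf is fixed
% up to scale, and H has 4 degrees of freedom (2 for l_s, 2 for v_inf).
```

    Because $\mathbf{H}$ is an involution, it maps each point of the apparent contour on one side of the imaged axis to its counterpart on the other side.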

    Robust Methods for Accurate and Efficient Reconstruction from Motion Imagery

    Creating virtual representations of real-world scenes has been a long-standing goal in photogrammetry and computer vision, and has high practical relevance in industries involved in creating intelligent urban solutions. This includes a wide range of applications such as urban and community planning, reconnaissance missions by the military and government, autonomous robotics, virtual reality, cultural heritage preservation, and many others. Over the last decades, image-based modeling emerged as one of the most popular solutions. The objective is to extract metric information directly from images. Many procedural techniques achieve good results in terms of robustness, accuracy, completeness, and efficiency. More recently, deep-learning-based techniques were proposed to tackle this problem by training on vast amounts of data to learn to associate features between images through deep convolutional neural networks and were shown to outperform traditional procedural techniques. However, many of the key challenges such as large displacement and scalability still remain, especially when dealing with large-scale aerial imagery. This thesis investigates image-based modeling and proposes robust and scalable methods for large-scale aerial imagery. First, we present a method for reconstructing large-scale areas from aerial imagery that formulates the solution as a single-step process, reducing the processing time considerably. Next, we address feature matching and propose a variational optical flow technique (HybridFlow) for dense feature matching that leverages the robustness of graph matching to large displacements. The proposed solution efficiently handles arbitrary-sized aerial images. Finally, for general-purpose image-based modeling, we propose a deep-learning-based approach, an end-to-end multi-view structure from motion employing hypercorrelation volumes for learning dense feature matches. 
We demonstrate the proposed techniques on several applications and report task-related measures.

    A New Computational Framework for Efficient Parallelization and Optimization of Large Scale Graph Matching

    Many applications in data fusion, comparison, and recognition require a robust and efficient algorithm for matching features across multiple images. To improve accuracy and obtain more stable results, it is important to take into account both the local appearance of features and their pairwise relationships. Graphs are a powerful and flexible data structure for describing complex relationships between data elements: nodes correspond to salient features and edges to relational aspects between features. The graph matching (GM) problem is then to find a mapping between two sets of nodes that preserves the relationships between them as much as possible. It is mathematically formulated as an integer quadratic programming (IQP) problem, which is NP-hard, so exact optima can be obtained only for very small instances. The ability to handle large-scale scientific visual data is therefore quite limited, which calls for both efficient serial algorithms and scalable parallel formulations. In this thesis, we first explore techniques to reduce the computational cost and memory usage of pairwise graph matching by adopting a heuristic pruning strategy together with a redundancy pattern suppression scheme. We also modify the structure of the affinity matrix to minimize memory requirements, and parallelize our algorithm using accelerated CPU and GPU libraries. Any pairs of features at similar distances in the first image yield the same sub-matrices; therefore, instead of constructing the whole affinity matrix, we build sub-blocks of the affinity matrix only for the distinct feature distances. This scheme not only saves a large amount of memory and greatly reduces computation time, but also allows the matrix-vector multiplication in the gradient computation to be performed in parallel, with each block-vector product computed independently and without synchronization. Accelerated libraries such as MKL, cuSPARSE, cuBLAS and Thrust are applied to solve the GM problem, following the scheme of the spectral matching algorithm. We also extend our work to the multi-graph setting, since many tasks require finding correspondences across multiple images, and considering more graphs improves the matching accuracy. Most algorithms obtain only approximate solutions to the NP-hard GM problem, which results in weakly optimal solutions. Therefore, we propose a new solver that iteratively modifies the affinity matrix and binarizes the solution by optimizing the original problem with its integer constraints.
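
    The spectral matching scheme mentioned above can be sketched on a toy problem. This is a minimal, dense, pure-Python illustration of the generic technique (pairwise affinity matrix, principal eigenvector by power iteration, greedy one-to-one discretization). It is not the thesis's pruned, block-structured or GPU-accelerated solver. The function name `spectral_match`, the Gaussian affinity on distance differences and the parameter `sigma` are assumptions chosen for illustration.

```python
import math
from itertools import product


def spectral_match(pts1, pts2, sigma=0.1, iters=100):
    """Match 2D points by spectral relaxation of the quadratic assignment.

    Candidate k = (i, a) means "point i of pts1 maps to point a of pts2".
    M[k][l] scores how well candidates k and l preserve pairwise distance.
    """
    cands = list(product(range(len(pts1)), range(len(pts2))))
    n = len(cands)
    M = [[0.0] * n for _ in range(n)]
    for k, (i, a) in enumerate(cands):
        for l, (j, b) in enumerate(cands):
            if i == j or a == b:
                continue  # conflicting (or identical) candidates get affinity 0
            diff = math.dist(pts1[i], pts1[j]) - math.dist(pts2[a], pts2[b])
            M[k][l] = math.exp(-(diff * diff) / (sigma * sigma))

    # Power iteration for the principal eigenvector of M.
    x = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        y = [sum(row[l] * x[l] for l in range(n)) for row in M]
        norm = math.sqrt(sum(v * v for v in y))
        if norm == 0.0:
            break
        x = [v / norm for v in y]

    # Greedy discretization: accept the strongest candidates first,
    # skipping any that would break the one-to-one constraint.
    matches, used = {}, set()
    for k in sorted(range(n), key=lambda k: -x[k]):
        i, a = cands[k]
        if x[k] <= 0.0:
            break
        if i in matches or a in used:
            continue
        matches[i] = a
        used.add(a)
    return matches


# Toy data: pts2 is pts1 reordered and translated by (5, -1).
pts1 = [(0, 0), (1, 0), (0, 2), (3, 3)]
pts2 = [(5, 1), (5, -1), (8, 2), (6, -1)]
result = spectral_match(pts1, pts2)  # {0: 1, 1: 3, 2: 0, 3: 2}
```

    The dense matrix here has (n1 * n2)^2 entries, which is the memory cost that motivates the sub-block construction described in the abstract.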

    Robust and affordable localization and mapping for 3D reconstruction. Application to architecture and construction

    Simultaneous localization and mapping from a single moving camera is known as monocular SLAM. This thesis addresses the problem with low-cost cameras, where the main challenge is to be robust to noise, blurring, and other artifacts that affect the image. The approach to the problem is discrete, using only salient image points to localize the camera and map the environment. The main contribution is a simplification of the pose graph that improves accuracy in the most common scenes, evaluated exhaustively on 4 datasets. The mapping results yield a 3D reconstruction of the scene that can be used in architecture and construction for Building Information Modeling (BIM). In the second part of the thesis we propose incorporating this information into an advanced visualization system based on WebGL that helps simplify the adoption of the BIM methodology. Departamento de Informática (Arquitectura y Tecnología de Computadores, Ciencias de la Computación e Inteligencia Artificial, Lenguajes y Sistemas Informáticos). Doctorado en Informática

    Long-range video motion estimation using point trajectories

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (leaves 97-104). This thesis describes a new approach to video motion estimation, in which motion is represented using a set of particles. Each particle is an image point sample with a long-duration trajectory and other properties. To optimize these particles, we measure point-based matching along the particle trajectories and distortion between the particles. The resulting motion representation is useful for a variety of applications and differs from optical flow, feature tracking, and parametric or layer-based models. We demonstrate the algorithm on challenging real-world videos that include complex scene geometry, multiple types of occlusion, regions with low texture, and non-rigid deformation. by Peter Sand. Ph.D.

    Contemporary Robotics

    This book is a collection of 18 chapters written by internationally recognized experts and well-known professionals in the field. The chapters contribute to diverse facets of contemporary robotics and autonomous systems. The volume is organized in four thematic parts according to the main subjects of recent advances in contemporary robotics. The first part is devoted to theoretical issues. It includes the development of algorithms for automatic trajectory generation using a redundancy resolution scheme, intelligent algorithms for robotic grasping, a modelling approach for reactive mode handling of flexible manufacturing, and the design of an advanced controller for robot manipulators. The second part deals with different aspects of robot calibration and sensing. It includes geometric and threshold calibration of a multiple robotic line-vision system, robot-based inline 2D/3D quality monitoring using picture-giving and laser triangulation, and a study of prospective polymer composite materials for flexible tactile sensors. The third part addresses mobile robots and multi-agent systems, including SLAM of mobile robots based on the fusion of odometry and visual data, configuration of a localization system by a team of mobile robots, development of a generic real-time motion controller for differential mobile robots, control of fuel cells of mobile robots, modelling of omni-directional wheel-based robots, building of a hunter-hybrid tracking environment, and design of cooperative control in a distributed population-based multi-agent approach. The fourth part presents recent approaches and results in humanoid and bio-inspired robotics. It deals with the design of adaptive control of anthropomorphic biped gait, building of a dynamics-based simulation for humanoid robot walking, building of a controller for the perceptual motor control dynamics of humans, and a biomimetic approach to controlling mechatronic structures using smart materials.

    Optical and hyperspectral image analysis for image-guided surgery

