992 research outputs found

    Grouping Uncertain Oriented Projective Geometric Entities with Application to Automatic Building Reconstruction

    The fully automatic reconstruction of 3D scenes from a set of 2D images has always been a key issue in photogrammetry and computer vision and has not been solved satisfactorily so far. Most current approaches match features between the images based on radiometric cues and then reconstruct the scene using the image geometry. The motivation for this work is the conjecture that, in the presence of highly redundant data, it should be possible to recover the scene structure by grouping geometric primitives in a bottom-up manner. Oriented projective geometry is used throughout this work; it allows geometric primitives such as points, lines and planes in 2D and 3D space, as well as projective cameras, to be represented together with their uncertainty. The first major contribution of the work is the use of uncertain oriented projective geometry, rather than uncertain projective geometry, which enables the representation of more complex compound entities, such as line segments and polygons in 2D and 3D space as well as 2D edgels and 3D facets. Within the uncertain oriented projective framework, a procedure is developed for testing pairwise relations between the various uncertain oriented projective entities. Again, the novelty lies in the ability to check relations between the new compound entities. The second major contribution of the work is a data structure specifically designed to perform these tests efficiently between large numbers of entities. Given the ability to test relations between the geometric entities efficiently, a framework for grouping those entities is developed, and several grouping methods are discussed.
The third major contribution of this work is a novel grouping method that analyzes the entropy change incurred by incrementally adding observations to an estimation, and thereby balances efficiency against robustness to achieve better grouping results. Finally, the applicability of the proposed representations, tests and grouping methods is demonstrated for purely geometry-based building reconstruction from oriented aerial images. It is shown that, in the presence of highly redundant datasets, reasonable reconstruction results can be achieved by grouping geometric primitives.
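As an illustration of what a pairwise relation test between uncertain entities can look like, the following is a minimal sketch of a statistical incidence test between a 2D point and a 2D line in homogeneous coordinates. It is not taken from the thesis: the function name, the first-order variance propagation, and the 95% chi-square threshold are assumptions chosen for illustration, and the sketch ignores orientation as well as the compound entities (segments, polygons, edgels, facets) the abstract describes.

```python
import numpy as np

# 95% quantile of the chi-square distribution with 1 degree of freedom.
CHI2_1DOF_95 = 3.841


def point_on_line(x, cov_x, l, cov_l, threshold=CHI2_1DOF_95):
    """Hypothetical incidence test: does uncertain point x lie on uncertain line l?

    x, l         : homogeneous 3-vectors (point and line in 2D).
    cov_x, cov_l : their 3x3 covariance matrices.
    The incidence residual d = l^T x should be zero; its variance is
    obtained by first-order error propagation.
    """
    x = np.asarray(x, dtype=float)
    l = np.asarray(l, dtype=float)
    cov_x = np.asarray(cov_x, dtype=float)
    cov_l = np.asarray(cov_l, dtype=float)

    d = float(l @ x)                              # incidence residual
    var_d = float(l @ cov_x @ l + x @ cov_l @ x)  # propagated variance
    if var_d <= 0.0:
        # No uncertainty at all: fall back to an exact test.
        return d == 0.0
    # Normalized squared residual, compared against the chi-square threshold.
    return (d * d) / var_d <= threshold


if __name__ == "__main__":
    line = [0.0, 1.0, 0.0]                        # the line y = 0
    cov_point = np.diag([1e-4, 1e-4, 0.0])        # 0.01 std. dev. in x and y
    cov_line = np.zeros((3, 3))                   # line assumed exact here
    print(point_on_line([1.0, 0.01, 1.0], cov_point, line, cov_line))
    print(point_on_line([1.0, 0.10, 1.0], cov_point, line, cov_line))
```

In this sketch a point one standard deviation from the line is accepted as incident, while a point ten standard deviations away is rejected.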

    Semantic Validation in Structure from Motion

    Structure from Motion (SfM) in computer vision is the process of recovering the 3D structure of a scene from a series of projective measurements calculated from a collection of 2D images taken from different perspectives. SfM consists of three main steps: feature detection and matching, camera motion estimation, and recovery of 3D structure from the estimated intrinsic and extrinsic parameters and features. A problem encountered in SfM is that scenes lacking texture or containing repetitive features can cause erroneous feature matching between frames. Semantic segmentation offers a route to validate and correct SfM models by labelling pixels in the input images using a deep convolutional neural network. The semantic and geometric properties associated with classes in the scene can be exploited to apply prior constraints to each class of object. The SfM pipeline COLMAP and the semantic segmentation pipeline DeepLab were used. These, along with planar reconstruction of the dense model, were used to identify erroneous points that may be occluded from the calculated camera position, given the semantic label and thus the prior constraint of the reconstructed plane. Herein, semantic segmentation is integrated into SfM to apply priors on the 3D point cloud, given the object detection in the 2D input images. Additionally, the semantic labels of matched keypoints are compared and points with inconsistent semantic labels are discarded. Furthermore, semantic labels on input images are used to remove objects associated with motion from the output SfM models. The proposed approach is evaluated on a dataset of 1102 images of a repetitive architecture scene. This project offers a novel method for improved validation of 3D SfM models.
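The label-consistency check on matched keypoints can be sketched as follows. This is only an illustration under stated assumptions: the function name, the `(x, y)` keypoint layout, and the use of per-pixel integer label maps are hypothetical, and the sketch does not call COLMAP or DeepLab.

```python
import numpy as np


def filter_matches_by_label(matches, kps_a, kps_b, labels_a, labels_b):
    """Keep only matches whose two keypoints share the same semantic label.

    matches  : iterable of (index_in_a, index_in_b) pairs.
    kps_a/b  : sequences of (x, y) keypoint positions in pixels.
    labels_* : 2D integer arrays of per-pixel class labels (rows = y, cols = x).
    """
    def label_at(labels, xy):
        # Round to the nearest pixel and clamp to the image bounds.
        h, w = labels.shape
        col = min(max(int(round(xy[0])), 0), w - 1)
        row = min(max(int(round(xy[1])), 0), h - 1)
        return int(labels[row, col])

    kept = []
    for ia, ib in matches:
        if label_at(labels_a, kps_a[ia]) == label_at(labels_b, kps_b[ib]):
            kept.append((ia, ib))
    return kept


if __name__ == "__main__":
    labels_a = np.array([[0, 1], [2, 3]])
    labels_b = np.array([[0, 1], [3, 3]])
    kps = [(0, 0), (1, 0), (0, 1)]  # (x, y) positions, same in both images
    print(filter_matches_by_label([(0, 0), (1, 1), (2, 2)], kps, kps, labels_a, labels_b))
```

In the toy example the third match is dropped because its pixel carries label 2 in the first image and label 3 in the second.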

    Representations for Cognitive Vision : a Review of Appearance-Based, Spatio-Temporal, and Graph-Based Approaches

    The emerging discipline of cognitive vision requires a proper representation of visual information including spatial and temporal relationships, scenes, events, semantics and context. This review article summarizes existing representational schemes in computer vision that might be useful for cognitive vision, and discusses promising future research directions. The various approaches are categorized into appearance-based, spatio-temporal, and graph-based representations for cognitive vision. While the representation of objects has been covered extensively in computer vision research, from both a reconstruction and a recognition point of view, cognitive vision will also require new ideas on how to represent scenes. We introduce new concepts for scene representations and discuss how these might be efficiently implemented in future cognitive vision systems.

    Convolutional neural network architecture for geometric matching

    We address the problem of determining correspondences between two images in agreement with a geometric model such as an affine or thin-plate spline transformation, and estimating its parameters. The contributions of this work are three-fold. First, we propose a convolutional neural network architecture for geometric matching. The architecture is based on three main components that mimic the standard steps of feature extraction, matching, and simultaneous inlier detection and model parameter estimation, while being trainable end-to-end. Second, we demonstrate that the network parameters can be trained from synthetically generated imagery without the need for manual annotation, and that our matching layer significantly increases generalization to never-before-seen images. Finally, we show that the same model can perform both instance-level and category-level matching, giving state-of-the-art results on the challenging Proposal Flow dataset. Comment: In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017)

    Coronary Artery Segmentation and Motion Modelling

    Conventional coronary artery bypass surgery requires invasive sternotomy and the use of cardiopulmonary bypass, which leads to a long recovery period and a high risk of infection. Totally endoscopic coronary artery bypass (TECAB) surgery, based on image-guided robotic surgical approaches, has been developed to allow clinicians to conduct bypass surgery off-pump with only three pinhole incisions in the chest cavity, through which two robotic arms and one stereo endoscopic camera are inserted. However, the restricted field of view of the stereo endoscopic images can lead to vessel misidentification and coronary artery mis-localization. This results in 20-30% conversion rates from TECAB surgery to the conventional approach. We have constructed patient-specific 3D + time coronary artery and left ventricle motion models from preoperative 4D Computed Tomography Angiography (CTA) scans. By temporally and spatially aligning this model with the intraoperative endoscopic views of the patient's beating heart, this work helps the surgeon identify and locate the correct coronaries during TECAB procedures. This work therefore has the prospect of reducing the conversion rate from TECAB to conventional coronary bypass procedures. This thesis mainly focuses on designing segmentation and motion tracking methods for the coronary arteries in order to build preoperative patient-specific motion models. Various vessel centreline extraction and lumen segmentation algorithms are presented, including intensity-based approaches, a geometric model matching method and a morphology-based method. A probabilistic atlas of the coronary arteries is formed from a group of subjects to facilitate the vascular segmentation and registration procedures. A non-rigid registration framework based on a free-form deformation model and multi-level multi-channel large deformation diffeomorphic metric mapping are proposed to track the coronary motion.
The methods are applied to 4D CTA images acquired from various groups of patients and quantitatively evaluated.

    Multimodal Three Dimensional Scene Reconstruction, The Gaussian Fields Framework

    The focus of this research is on building 3D representations of real-world scenes and objects using different imaging sensors: primarily range acquisition devices (such as laser scanners and stereo systems), which allow the recovery of 3D geometry, and multi-spectral image sequences, including visual and thermal IR images, which provide additional scene characteristics. The crucial technical challenge that we addressed is the automatic registration of point-sets. In this context, our main contribution is an optimization-based method at the core of which lies a unified criterion that simultaneously solves the dense point correspondence and transformation recovery problems. The new criterion has a straightforward expression in terms of the datasets and the alignment parameters and was used primarily for 3D rigid registration of point-sets. However, it also proved useful for feature-based multimodal image alignment. We derived our method from simple Boolean matching principles by approximation and relaxation. One of the main advantages of the proposed approach, compared with the widely used class of Iterative Closest Point (ICP) algorithms, is convexity in the neighborhood of the registration parameters and continuous differentiability, which allow the use of standard gradient-based optimization techniques. Physically, the criterion is interpreted in terms of a Gaussian force field exerted by one point-set on the other. This formulation proved useful for controlling and increasing the region of convergence, and hence allows more autonomy in correspondence tasks. Furthermore, the criterion can be computed with linear complexity using recently developed Fast Gauss Transform numerical techniques. In addition, we introduced a new local feature descriptor, derived from visual saliency principles, which significantly enhanced the performance of the registration algorithm.
The resulting technique was subjected to a thorough experimental analysis that highlighted its strengths and showed its limitations. Our current applications are in the field of 3D modeling for inspection, surveillance, and biometrics. However, since this matching framework can be applied to any type of data that can be represented as N-dimensional point-sets, the scope of the method is shown to reach many more pattern analysis applications.
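To make the idea of a smooth, differentiable registration criterion concrete, here is a minimal sketch of a Gaussian-weighted alignment score between two point-sets under a rigid transformation. It is a simplified, geometry-only form assumed for illustration, not the criterion as defined in the thesis: the function name and parameters are hypothetical, the feature descriptor term is omitted, and the sum is evaluated directly in O(N·M) time rather than with the Fast Gauss Transform.

```python
import numpy as np


def gaussian_alignment_score(P, Q, R, t, sigma):
    """Hypothetical Gaussian criterion: sum_ij exp(-||R p_i + t - q_j||^2 / sigma^2).

    P : (N, D) array, the point-set being moved.
    Q : (M, D) array, the fixed point-set.
    R : (D, D) rotation matrix, t : (D,) translation.
    sigma controls how far the attraction between points reaches.
    The score is smooth in R and t and is larger when the sets are better aligned.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    moved = P @ np.asarray(R, dtype=float).T + np.asarray(t, dtype=float)
    # Pairwise squared distances between moved points and fixed points.
    diff = moved[:, None, :] - Q[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return float(np.exp(-sq_dist / sigma ** 2).sum())


if __name__ == "__main__":
    pts = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
    identity = np.eye(3)
    aligned = gaussian_alignment_score(pts, pts, identity, np.zeros(3), sigma=1.0)
    shifted = gaussian_alignment_score(pts, pts, identity, np.array([1.0, 0.0, 0.0]), sigma=1.0)
    print(aligned > shifted)
```

Because the score is a sum of smooth exponentials, it can be handed to a standard gradient-based optimizer over the transformation parameters; a larger `sigma` widens the basin of attraction at the cost of a flatter peak.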

    Control de robots móviles mediante visión omnidireccional utilizando la geometría de tres vistas

    This work deals with the visual control of mobile robots. Within this broad field of research, we pay special attention to two elements: omnidirectional vision and multi-view geometric models. Omnidirectional cameras provide very precise angular information, although they exhibit significant distortion in the radial direction. Their wide field of view makes them well suited to robot navigation tasks. In turn, geometric models that relate different views of a scene make it possible to reject erroneous matches of visual features between images, and thereby make vision-based control more robust. Our work presents two visual control techniques for a robot moving on the ground plane. First, we propose a new method for visual homing that uses the information given by a set of reference images previously acquired in the environment, together with the images the robot takes along its motion. To exploit the strengths of omnidirectional vision, our homing method is purely angular and uses no distance information. This property, combined with the fact that the motion takes place in a plane, motivates the use of the geometric model given by the 1D trifocal tensor. In particular, the geometric constraints imposed by this tensor, which can be computed from point correspondences among three images, improve the robustness of the control in the presence of matching errors. The interest of our proposal lies in the fact that the control method computes the robot velocities from purely angular information, which is very precise in omnidirectional cameras.
In addition, we present a procedure that computes the angular relations between the available views indirectly, without requiring visual information shared among all of them. The technique described can be classified as image-based, since it does not require estimating the location and does not use 3D information. The robot converges to the target position without knowing metric information about the trajectory followed. For some applications, such as obstacle avoidance, more information about the 3D motion performed may be needed. With this in mind, we present a new visual control method based on sinusoidal inputs. Sinusoids are smoothly varying functions with well-known mathematical properties, which makes them suitable for vehicle parking maneuvers. From the sinusoidally varying velocities defined in our design, we obtain analytical expressions for the evolution of the robot's state variables. Based on these expressions, we also propose a state-feedback control method. The robot state is estimated from the 1D trifocal tensor computed between the target view, the initial view and the current view of the robot. With this sinusoidal control, the robot becomes aligned with the target position. In a second step, we correct the depth by means of a control law defined directly in terms of the 1D trifocal tensor. The performance of the two proposed controllers is illustrated through simulations, and to support their feasibility we present stability analyses and results of simulations and of experiments with real images.