23 research outputs found

    High-Level Facade Image Interpretation using Marked Point Processes

    Get PDF
    In this thesis, we address facade image interpretation as one essential ingredient for the generation of high-detailed, semantic meaningful, three-dimensional city-models. Given a single rectified facade image, we detect relevant facade objects such as windows, entrances, and balconies, which yield a description of the image in terms of accurate position and size of these objects. Urban digital three-dimensional reconstruction and documentation is an active area of research with several potential applications, e.g., in the area of digital mapping for navigation, urban planning, emergency management, disaster control or the entertainment industry. A detailed building model which is not just a geometric object enriched with texture, allows for semantic requests as the number of floors or the location of balconies and entrances. Facade image interpretation is one essential step in order to yield such models. In this thesis, we propose the interpretation of facade images by combining evidence for the occurrence of individual object classes which we derive from data, and prior knowledge which guides the image interpretation in its entirety. We present a three-step procedure which generates features that are suited to describe relevant objects, learns a representation that is suited for object detection, and that enables the image interpretation using the results of object detection while incorporating prior knowledge about typical configurations of facade objects, which we learn from training data. According to these three sub-tasks, our major achievements are: We propose a novel method for facade image interpretation based on a marked point process. Therefor, we develop a model for the description of typical configurations of facade objects and propose an image interpretation system which combines evidence derived from data and prior knowledge about typical configurations of facade objects. In order to generate evidence from data, we propose a feature type which we call shapelets. They are scale invariant and provide large distinctiveness for facade objects. Segments of lines, arcs, and ellipses serve as basic features for the generation of shapelets. Therefor, we propose a novel line simplification approach which approximates given pixel-chains by a sequence of lines, circular, and elliptical arcs. Among others, it is based on an adaption to Douglas-Peucker's algorithm, which is based on circles as basic geometric elements We evaluate each step separately. We show the effects of polyline segmentation and simplification on several images with comparable good or even better results, referring to a state-of-the-art algorithm, which proves their large distinctiveness for facade objects. Using shapelets we provide a reasonable classification performance on a challenging dataset, including intra-class variations, clutter, and scale changes. Finally, we show promising results for the facade interpretation system on several datasets and provide a qualitative evaluation which demonstrates the capability of complete and accurate detection of facade objectsHigh-Level Interpretation von Fassaden-Bildern unter Benutzung von Markierten PunktprozessenDas Thema dieser Arbeit ist die Interpretation von Fassadenbildern als wesentlicher Beitrag zur Erstellung hoch detaillierter, semantisch reichhaltiger dreidimensionaler Stadtmodelle. In rektifizierten Einzelaufnahmen von Fassaden detektieren wir relevante Objekte wie Fenster, Türen und Balkone, um daraus eine Bildinterpretation in Form von präzisen Positionen und Größen dieser Objekte abzuleiten. Die digitale dreidimensionale Rekonstruktion urbaner Regionen ist ein aktives Forschungsfeld mit zahlreichen Anwendungen, beispielsweise der Herstellung digitaler Kartenwerke für Navigation, Stadtplanung, Notfallmanagement, Katastrophenschutz oder die Unterhaltungsindustrie. Detaillierte Gebäudemodelle, die nicht nur als geometrische Objekte repräsentiert und durch eine geeignete Textur visuell ansprechend dargestellt werden, erlauben semantische Anfragen, wie beispielsweise nach der Anzahl der Geschosse oder der Position der Balkone oder Eingänge. Die semantische Interpretation von Fassadenbildern ist ein wesentlicher Schritt für die Erzeugung solcher Modelle. In der vorliegenden Arbeit lösen wir diese Aufgabe, indem wir aus Daten abgeleitete Evidenz für das Vorkommen einzelner Objekte mit Vorwissen kombinieren, das die Analyse der gesamten Bildinterpretation steuert. Wir präsentieren dafür ein dreistufiges Verfahren: Wir erzeugen Bildmerkmale, die für die Beschreibung der relevanten Objekte geeignet sind. Wir lernen, auf Basis abgeleiteter Merkmale, eine Repräsentation dieser Objekte. Schließlich realisieren wir die Bildinterpretation basierend auf der zuvor gelernten Repräsentation und dem Vorwissen über typische Konfigurationen von Fassadenobjekten, welches wir aus Trainingsdaten ableiten. Wir leisten dazu die folgenden wissenschaftlichen Beiträge: Wir schlagen eine neuartige Me-thode zur Interpretation von Fassadenbildern vor, die einen sogenannten markierten Punktprozess verwendet. Dafür entwickeln wir ein Modell zur Beschreibung typischer Konfigurationen von Fassadenobjekten und entwickeln ein Bildinterpretationssystem, welches aus Daten abgeleitete Evidenz und a priori Wissen über typische Fassadenkonfigurationen kombiniert. Für die Erzeugung der Evidenz stellen wir Merkmale vor, die wir Shapelets nennen und die skaleninvariant und durch eine ausgesprochene Distinktivität im Bezug auf Fassadenobjekte gekennzeichnet sind. Als Basismerkmale für die Erzeugung der Shapelets dienen Linien-, Kreis- und Ellipsensegmente. Dafür stellen wir eine neuartige Methode zur Vereinfachung von Liniensegmenten vor, die eine Pixelkette durch eine Sequenz von geraden Linienstücken und elliptischen Bogensegmenten approximiert. Diese basiert unter anderem auf einer Adaption des Douglas-Peucker Algorithmus, die anstelle gerader Linienstücke, Bogensegmente als geometrische Basiselemente verwendet. Wir evaluieren jeden dieser drei Teilschritte separat. Wir zeigen Ergebnisse der Liniensegmen-tierung anhand verschiedener Bilder und weisen dabei vergleichbare und teilweise verbesserte Ergebnisse im Vergleich zu bestehende Verfahren nach. Für die vorgeschlagenen Shapelets weisen wir in der Evaluation ihre diskriminativen Eigenschaften im Bezug auf Fassadenobjekte nach. Wir erzeugen auf einem anspruchsvollen Datensatz von skalenvariablen Fassadenobjekten, mit starker Variabilität der Erscheinung innerhalb der Klassen, vielversprechende Klassifikationsergebnisse, die die Verwendbarkeit der gelernten Shapelets für die weitere Interpretation belegen. Schließlich zeigen wir Ergebnisse der Interpretation der Fassadenstruktur anhand verschiedener Datensätze. Die qualitative Evaluation demonstriert die Fähigkeit des vorgeschlagenen Lösungsansatzes zur vollständigen und präzisen Detektion der genannten Fassadenobjekte

    Variable Resolution & Dimensional Mapping For 3d Model Optimization

    Get PDF
    Three-dimensional computer models, especially geospatial architectural data sets, can be visualized in the same way humans experience the world, providing a realistic, interactive experience. Scene familiarization, architectural analysis, scientific visualization, and many other applications would benefit from finely detailed, high resolution, 3D models. Automated methods to construct these 3D models traditionally has produced data sets that are often low fidelity or inaccurate; otherwise, they are initially highly detailed, but are very labor and time intensive to construct. Such data sets are often not practical for common real-time usage and are not easily updated. This thesis proposes Variable Resolution & Dimensional Mapping (VRDM), a methodology that has been developed to address some of the limitations of existing approaches to model construction from images. Key components of VRDM are texture palettes, which enable variable and ultra-high resolution images to be easily composited; texture features, which allow image features to integrated as image or geometry, and have the ability to modify the geometric model structure to add detail. These components support a primary VRDM objective of facilitating model refinement with additional data. This can be done until the desired fidelity is achieved as practical limits of infinite detail are approached. Texture Levels, the third component, enable real-time interaction with a very detailed model, along with the flexibility of having alternate pixel data for a given area of the model and this is achieved through extra dimensions. Together these techniques have been used to construct models that can contain GBs of imagery data

    La gestión del patrimonio construido a través del proyecto HBIM: un estudio de caso de pavimentos y alicatados

    Get PDF
    [EN] Building Information Modelling (BIM) is a collaborative system that has been fully developed in the design and management of industries involved in Architecture, Engineering and Construction (AEC) sectors. There are, however, very few studies aimed at managing information models in the field of architectural and cultural heritage interventions. This research therefore proposes an innovative methodology of analysis and treatment of the information based on a representative 3D graphic model of the flooring and wall tiling of a historic building. The objective is to set up a model of graphic information which guarantees the interoperability of the aforementioned information amongst the diverse disciplines intervening in the conservation and restoration process. The Pavillion of Charles V, a Renaissancecharacterised building located in outdoor areas of the Alcazar of Seville, Spain, was selected for the study. This work constitutes a project of intervention based on Heritage or Historic Building Information Modelling, called the “HBIM Project”.[ES] El Modelado de Información para la Construcción (BIM) es un sistema colaborativo que actualmente está plenamente desarrollado en el diseño y la gestión de las industrias involucradas en el sector de la Arquitectura, Ingeniería y Construcción (Architecture, Engineering and Contruction- AEC). En cambio, en el ámbito de las intervenciones en el Patrimonio Cultural y Arquitectónico, son muy pocos los estudios dedicados a gestionar modelos de información. Por ello, esta investigación propone una metodología innovadora de análisis y tratamiento de la información basada en un Modelo gráfico 3D representativo de pavimentos y alicatados de un Edificio Histórico. La finalidad es crear un modelo de información gráfica que garantice la interoperabilidad de dicha información entre las diversas disciplinas que intervienen en el Proceso de Conservación y Rehabilitación Patrimonial. Para el estudio, se ha seleccionado el Cenador de Carlos V (o de la Alcoba); edificio de carácter renacentista perteneciente a los espacios exteriores del Alcázar de Sevilla, España. Este trabajo constituye un proyecto de intervención basado en un modelo de información patrimonial o del edificio histórico, denominado “Proyecto HBIM”.The authors wish to thank Prof. Tabales of the Departamento de Construcciones Arquitectónicas II of the University of Seville for his major contribution and help, and Mr. Jacinto Pérez Elliot, Director-Conservador del Patronato de los Reales Alcázares de Sevilla, for granting access to the aforementioned buildings.Nieto, JE.; Moyano, JJ.; Rico Delgado, F.; Antón García, D. (2016). Management of built heritage via HBIM Project: A case of study of flooring and tiling. Virtual Archaeology Review. 7(14):1-12. doi:10.4995/var.2016.434911271

    Development of a SGM-based multi-view reconstruction framework for aerial imagery

    Get PDF
    Advances in the technology of digital airborne camera systems allow for the observation of surfaces with sampling rates in the range of a few centimeters. In combination with novel matching approaches, which estimate depth information for virtually every pixel, surface reconstructions of impressive density and precision can be generated. Therefore, image based surface generation meanwhile is a serious alternative to LiDAR based data collection for many applications. Surface models serve as primary base for geographic products as for example map creation, production of true-ortho photos or visualization purposes within the framework of virtual globes. The goal of the presented theses is the development of a framework for the fully automatic generation of 3D surface models based on aerial images - both standard nadir as well as oblique views. This comprises several challenges. On the one hand dimensions of aerial imagery is considerable and the extend of the areas to be reconstructed can encompass whole countries. Beside scalability of methods this also requires decent processing times and efficient handling of the given hardware resources. Moreover, beside high precision requirements, a high degree of automation has to be guaranteed to limit manual interaction as much as possible. Due to the advantages of scalability, a stereo method is utilized in the presented thesis. The approach for dense stereo is based on an adapted version of the semi global matching (SGM) algorithm. Following a hierarchical approach corresponding image regions and meaningful disparity search ranges are identified. It will be verified that, dependent on undulations of the scene, time and memory demands can be reduced significantly, by up to 90% within some of the conducted tests. This enables the processing of aerial datasets on standard desktop machines in reasonable times even for large fields of depth. Stereo approaches generate disparity or depth maps, in which redundant depth information is available. To exploit this redundancy, a method for the refinement of stereo correspondences is proposed. Thereby redundant observations across stereo models are identified, checked for geometric consistency and their reprojection error is minimized. This way outliers are removed and precision of depth estimates is improved. In order to generate consistent surfaces, two algorithms for depth map fusion were developed. The first fusion strategy aims for the generation of 2.5D height models, also known as digital surface models (DSM). The proposed method improves existing methods regarding quality in areas of depth discontinuities, for example at roof edges. Utilizing benchmarks designed for the evaluation of image based DSM generation we show that the developed approaches favorably compare to state-of-the-art algorithms and that height precisions of few GSDs can be achieved. Furthermore, methods for the derivation of meshes based on DSM data are discussed. The fusion of depth maps for 3D scenes, as e.g. frequently required during evaluation of high resolution oblique aerial images in complex urban environments, demands for a different approach since scenes can in general not be represented as height fields. Moreover, depths across depth maps possess varying precision and sampling rates due to variances in image scale, errors in orientation and other effects. Within this thesis a median-based fusion methodology is proposed. By using geometry-adaptive triangulation of depth maps depth-wise normals are extracted and, along the point coordinates are filtered and fused using tree structures. The output of this method are oriented points which then can be used to generate meshes. Precision and density of the method will be evaluated using established multi-view benchmarks. Beside the capability to process close range datasets, results for large oblique airborne data sets will be presented. The report closes with a summary, discussion of limitations and perspectives regarding improvements and enhancements. The implemented algorithms are core elements of the commercial software package SURE, which is freely available for scientific purposes

    Exploiting Repetitions for Image-Based Rendering of Facades

    Get PDF
    International audienceAbstract Street-level imagery is now abundant but does not have sufficient capture density to be usable for Image-Based Rendering (IBR) of facades. We present a method that exploits repetitive elements in facades – such as windows – to perform data augmentation, in turn improving camera calibration, reconstructed geometry and overall rendering quality for IBR. The main intuition behind our approach is that a few views of several instances of an element provide similar information to many views of a single instance of that element. We first select similar instances of an element from 3-4 views of a facade and transform them into a common coordinate system, creating a " platonic " element. We use this common space to refine the camera calibration of each view of each instance and to reconstruct a 3D mesh of the element with multi-view stereo, that we regularize to obtain a piecewise-planar mesh aligned with dominant image contours. Observing the same element under multiple views also allows us to identify reflective areas – such as glass panels – which we use at rendering time to generate plausible reflections using an environment map. Our detailed 3D mesh, augmented set of views, and reflection mask enable image-based rendering of much higher quality than results obtained using the input images directly

    Low-rank Based Algorithms for Rectification, Repetition Detection and De-noising in Urban Images

    Full text link
    In this thesis, we aim to solve the problem of automatic image rectification and repeated patterns detection on 2D urban images, using novel low-rank based techniques. Repeated patterns (such as windows, tiles, balconies and doors) are prominent and significant features in urban scenes. Detection of the periodic structures is useful in many applications such as photorealistic 3D reconstruction, 2D-to-3D alignment, facade parsing, city modeling, classification, navigation, visualization in 3D map environments, shape completion, cinematography and 3D games. However both of the image rectification and repeated patterns detection problems are challenging due to scene occlusions, varying illumination, pose variation and sensor noise. Therefore, detection of these repeated patterns becomes very important for city scene analysis. Given a 2D image of urban scene, we automatically rectify a facade image and extract facade textures first. Based on the rectified facade texture, we exploit novel algorithms that extract repeated patterns by using Kronecker product based modeling that is based on a solid theoretical foundation. We have tested our algorithms in a large set of images, which includes building facades from Paris, Hong Kong and New York

    Towards automatic reconstruction of indoor scenes from incomplete point clouds: door and window detection and regularization

    Get PDF
    In the last years, point clouds have become the main source of information for building modelling. Although a considerable amount of methodologies addressing the automated generation of 3D models from point clouds have been developed, indoor modelling is still a challenging task due to complex building layouts and the high presence of severe clutters and occlusions. Most of methodologies are highly dependent on data quality, often producing irregular and non-consistent models. Although manmade environments generally exhibit some regularities, they are not commonly considered. This paper presents an optimization-based approach for detecting regularities (i.e., same shape, same alignment and same spacing) in building indoor features. The methodology starts from the detection of openings based on a voxel-based visibility analysis to distinguish ‘occluded’ from ‘empty’ regions in wall surfaces. The extraction of regular patterns in windows is addressed from studying the point cloud from an outdoor perspective. The layout is regularized by minimizing deformations while respecting the detected constraints. The methodology applies for elements placed in the same planeXunta de Galicia | Ref. ED481B 2016/079-

    TOWARDS AUTOMATIC RECONSTRUCTION OF INDOOR SCENES FROM INCOMPLETE POINT CLOUDS: DOOR AND WINDOW DETECTION AND REGULARIZATION

    Get PDF
    In the last years, point clouds have become the main source of information for building modelling. Although a considerable amount of methodologies addressing the automated generation of 3D models from point clouds have been developed, indoor modelling is still a challenging task due to complex building layouts and the high presence of severe clutters and occlusions. Most of methodologies are highly dependent on data quality, often producing irregular and non-consistent models. Although manmade environments generally exhibit some regularities, they are not commonly considered. This paper presents an optimization-based approach for detecting regularities (i.e., same shape, same alignment and same spacing) in building indoor features. The methodology starts from the detection of openings based on a voxel-based visibility analysis to distinguish ‘occluded’ from ‘empty’ regions in wall surfaces. The extraction of regular patterns in windows is addressed from studying the point cloud from an outdoor perspective. The layout is regularized by minimizing deformations while respecting the detected constraints. The methodology applies for elements placed in the same plane

    Efficient volumetric reconstruction from multiple calibrated cameras

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, February 2005.Includes bibliographical references (p. 137-142).The automatic reconstruction of large scale 3-D models from real images is of significant value to the field of computer vision in the understanding of images. As a consequence, many techniques have emerged to perform scene reconstruction from calibrated images where the position and orientation of the camera are known. Feature based methods using points and lines have enjoyed much success and have been shown to be robust against noise and changing illumination conditions. The models produced by these techniques however, can often appear crude when untextured due to the sparse set of points from which they are created. Other reconstruction methods, such as volumetric techniques, use image pixel intensities rather than features, reconstructing the scene as small volumetric units called voxels. The direct use of pixel values in the images has restricted current methods to operating on scenes with static illumination conditions. Creating a volumetric representation of the scene may also require millions of interdependent voxels which must be efficiently processed. This has limited most techniques to constrained camera locations and small indoor scenes. The primary goal of this thesis is to perform efficient voxel-based reconstruction of urban environments using a large set of pose-instrumented images. In addition to the 3- D scene reconstruction, the algorithm will also generate estimates of surface reflectance and illumination. Designing an algorithm that operates in a discretized 3-D scene space allows for the recovery of intrinsic scene color and for the integration of visibility constraints, while avoiding the pitfalls of image based feature correspondence.(cont.) The algorithm demonstrates how in principle it is possible to reduce computational effort over more naive methods. The algorithm is intended to perform the reconstruction of large scale 3-D models from controlled imagery without human intervention.by Manish Jethwa.Ph.D
    corecore