102 research outputs found

    BuildMapper: A Fully Learnable Framework for Vectorized Building Contour Extraction

    Deep learning based methods have significantly boosted the study of automatic building extraction from remote sensing images. However, delineating vectorized and regular building contours like a human does remains very challenging, due to the difficulty of the methodology, the diversity of building structures, and the imperfect imaging conditions. In this paper, we propose the first end-to-end learnable building contour extraction framework, named BuildMapper, which can directly and efficiently delineate building polygons just as a human does. BuildMapper consists of two main components: 1) a contour initialization module that generates initial building contours; and 2) a contour evolution module that performs both contour vertex deformation and reduction, which removes the need for complex empirical post-processing used in existing methods. In both components, we provide new ideas, including a learnable contour initialization method to replace the empirical methods, dynamic predicted and ground truth vertex pairing for the static vertex correspondence problem, and a lightweight encoder for vertex information extraction and aggregation, which benefit a general contour-based method; and a well-designed vertex classification head for building corner vertices detection, which casts light on direct structured building contour extraction. We also built a suitable large-scale building dataset, the WHU-Mix (vector) building dataset, to benefit the study of contour-based building extraction methods. The extensive experiments conducted on the WHU-Mix (vector) dataset, the WHU dataset, and the CrowdAI dataset verified that BuildMapper can achieve a state-of-the-art performance, with a higher mask average precision (AP) and boundary AP than both segmentation-based and contour-based methods

    Deep Learning for Building Footprint Generation from Optical Imagery

    Deep learning-based methods have shown promising results for the task of building footprint generation, but they have two inherent limitations. First, the extracted buildings show blurred boundaries and blob-like shapes. Second, network training requires massive pixel-level annotations. This dissertation developed a series of methods to address these problems. In addition, the developed methods are put into practical applications

    GeoAI-enhanced Techniques to Support Geographical Knowledge Discovery from Big Geospatial Data

    Big data that contain geo-referenced attributes have significantly reformed the way that I process and analyze geospatial data. Compared with the benefits expected in a data-rich environment, more data have not always contributed to more accurate analysis. “Big but valueless” has become a critical concern to the community of GIScience and data-driven geography. As a highly utilized function of GeoAI techniques, deep learning models designed for processing geospatial data integrate powerful computing hardware and deep neural networks into various dimensions of geography to effectively discover the representation of data. However, limitations of these deep learning models have also been reported; for example, people may have to spend much time preparing training data before implementing a deep learning model. The objective of this dissertation research is to promote state-of-the-art deep learning models in discovering the representation, value and hidden knowledge of GIS and remote sensing data, through three research approaches. The first methodological framework uses convolutional neural network (CNN)-powered shape classification to unify multifarious shadow shapes into a limited number of representative shadow patterns for efficient shadow-based building height estimation. The second research focus integrates semantic analysis into a framework of various state-of-the-art CNNs to support human-level understanding of map content. The final research approach of this dissertation focuses on normalizing geospatial domain knowledge to promote the transferability of a CNN model to land-use/land-cover classification. This research reports a method designed to discover detailed land-use/land-cover types that might be challenging for a state-of-the-art CNN model that previously performed well on land-cover classification only. Dissertation/Thesis, Doctoral Dissertation, Geography 201

    Automated Building Information Extraction and Evaluation from High-resolution Remotely Sensed Data

    The two-dimensional (2D) footprints and three-dimensional (3D) structures of buildings are of great importance to city planning, natural disaster management, and virtual environmental simulation. As traditional manual methodologies for collecting 2D and 3D building information are often both time consuming and costly, automated methods are required for efficient large area mapping. It is challenging to extract building information from remotely sensed data, considering the complex nature of urban environments and their associated intricate building structures. Most 2D evaluation methods are focused on classification accuracy, while other dimensions of extraction accuracy are ignored. To assess 2D building extraction methods, a multi-criteria evaluation system has been designed. The proposed system consists of matched rate, shape similarity, and positional accuracy. Experimentation with four methods demonstrates that the proposed multi-criteria system is more comprehensive and effective, in comparison with traditional accuracy assessment metrics. Building height is critical for building 3D structure extraction. As data sources for height estimation, digital surface models (DSMs) that are derived from stereo images using existing software typically provide low accuracy results in terms of rooftop elevations. Therefore, a new image matching method is proposed by adding building footprint maps as constraints. Validation demonstrates that the proposed matching method can estimate building rooftop elevation with one third of the error encountered when using current commercial software. With an ideal input DSM, building height can be estimated by the elevation contrast inside and outside a building footprint. However, occlusions and shadows cause indistinct building edges in the DSMs generated from stereo images. 
Therefore, a “building-ground elevation difference model” (EDM) has been designed, which describes the trend of the elevation difference between a building and its neighbours, in order to find elevation values at bare ground. Experiments using this novel approach report estimated building heights with a 1.5 m residual, which outperforms conventional filtering methods. Finally, 3D buildings are digitally reconstructed and evaluated. Current 3D evaluation methods do not distinguish well between 2D and 3D evaluation; traditionally, wall accuracy is ignored. To address these problems, this thesis designs an evaluation system with three components: volume, surface, and point. As such, the resultant multi-criteria system provides an improved evaluation method for building reconstruction
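
The elevation-contrast idea in the abstract above (building height as the difference between rooftop elevation inside a footprint and ground elevation just outside it) can be illustrated with a minimal Python sketch. The grid layout, the `estimate_building_height` name, the one-cell ring, and the use of medians are all assumptions made for illustration; this is not the thesis's EDM method.

```python
from statistics import median


def estimate_building_height(dsm, footprint, ring=1):
    """Hypothetical helper: median DSM elevation inside the footprint minus
    the median elevation of non-footprint cells within `ring` cells of it."""
    rows, cols = len(dsm), len(dsm[0])
    inside = [dsm[r][c] for r in range(rows) for c in range(cols) if footprint[r][c]]
    outside = []
    for r in range(rows):
        for c in range(cols):
            if footprint[r][c]:
                continue
            # A ground cell is in the ring if any footprint cell lies
            # within `ring` cells of it (square neighbourhood).
            near = any(
                footprint[rr][cc]
                for rr in range(max(0, r - ring), min(rows, r + ring + 1))
                for cc in range(max(0, c - ring), min(cols, c + ring + 1))
            )
            if near:
                outside.append(dsm[r][c])
    if not inside or not outside:
        raise ValueError("footprint needs at least one cell and one neighbouring ground cell")
    return median(inside) - median(outside)


# Toy 4x4 DSM in metres: a 2x2 building on nearly flat ground (made-up values).
dsm = [
    [100.0, 100.0, 100.0, 100.0],
    [100.0, 112.0, 112.5, 100.5],
    [100.0, 112.0, 112.0, 100.0],
    [100.0, 100.0, 100.0, 100.0],
]
footprint = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
height = estimate_building_height(dsm, footprint)
```

Medians are used here only because they tolerate a few noisy cells near building edges. With occlusions and shadows, as the abstract notes, a simple ring like this would likely be unreliable.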

    A heterogeneous data-based proposal for procedural 3D cities visualization and generalization

    This thesis project was born from a collaborative project between the research team VORTEX / Visual objects: from reality to expression (now REVA: Real Expression Artificial Life) at IRIT (Institut de Recherche en Informatique de Toulouse) on the one hand and education professionals, companies and public entities on the other. The SCOLA collaborative project is essentially an online learning platform based on the use of serious games in schools. It helps users to acquire and track predefined skills. This platform provides teachers with a new flexible tool that creates pedagogical scenarios and personalizes student records. Several contributions have been attributed to IRIT. One of these is to suggest a solution for the automatic creation of 3D environments, to be integrated into the game scenario. This solution aims to spare 3D graphic designers from manually modeling detailed and large 3D environments, which can be very expensive and time consuming. Various applications and prototypes have been developed to allow users to generalize and visualize their own virtual world, primarily from a set of rules. Therefore, there is no single representation scheme in the virtual world, due to the heterogeneity and diversity of 3D content design, especially for city models. This constraint has led us to rely heavily in our project on real 3D urban data instead of custom data predefined by the game designer. Advances in computer graphics, high computing capabilities, and Web technologies have revolutionized data reconstruction and visualization techniques. These techniques are applied in a variety of areas, from video games and simulations to movies that use procedurally generated spaces and character animations. Although modern computer games do not have the same hardware and memory restrictions as older games, procedural generation is frequently used to create games, maps, levels, characters, or other random facets that are unique to each game. Currently, the trend is shifting towards Geographical Information Systems (GIS) to create urban worlds, especially after their successful implementation around the world in support of many application domains. GIS are more specifically dedicated to applications such as simulation, disaster management and urban planning, with more or less limited use in games; one example is the game "Minecraft", whose latest version offers mapping based on real-world cities (Geodata in Minecraft). The use of existing urban data is becoming more and more widespread in cartographic applications for two main reasons: first, it makes it possible to understand the spatial content of urban objects in a more logical way and, second, it provides a common platform to integrate city-level information from different environments or resources and make it available to users. A 3D virtual city model is a digital representation of urban space that describes the geometric, topological, semantic, and appearance properties of its components. In general, an MV3D (Modèle de Ville en 3D, 3D city model) serves as an integration platform for many facets of an urban information space, as Batty pointed out: "In short, the new models are not just the digital geometry of traditional models, but large-scale databases that can be visualized in 3D. As such, they already represent a way to merge more abstract symbolic or thematic data, even symbolic models, into this mode of representation"

    Pre-processing, classification and semantic querying of large-scale Earth observation spaceborne/airborne/terrestrial image databases: Process and product innovations.

    As defined by Wikipedia, “big data is the term adopted for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The big data challenges typically include capture, curation, storage, search, sharing, transfer, analysis and visualization”. Proposed by the intergovernmental Group on Earth Observations (GEO), the visionary goal of the Global Earth Observation System of Systems (GEOSS) implementation plan for years 2005-2015 is systematic transformation of multisource Earth Observation (EO) “big data” into timely, comprehensive and operational EO value-adding products and services, submitted to the GEO Quality Assurance Framework for Earth Observation (QA4EO) calibration/validation (Cal/Val) requirements. To date the GEOSS mission cannot be considered fulfilled by the remote sensing (RS) community. This is tantamount to saying that past and existing EO image understanding systems (EO-IUSs) have been outpaced by the rate of collection of EO sensory big data, whose quality and quantity are ever-increasing. This fact is supported by several observations. For example, no European Space Agency (ESA) EO Level 2 product has ever been systematically generated at the ground segment. By definition, an ESA EO Level 2 product comprises a single-date multi-spectral (MS) image radiometrically calibrated into surface reflectance (SURF) values corrected for geometric, atmospheric, adjacency and topographic effects, stacked with its data-derived scene classification map (SCM), whose thematic legend is general-purpose, user- and application-independent and includes quality layers, such as cloud and cloud-shadow. Since no GEOSS exists to date, present EO content-based image retrieval (CBIR) systems lack EO image understanding capabilities. 
Hence, no semantic CBIR (SCBIR) system exists to date either, where semantic querying is a synonym of semantics-enabled knowledge/information discovery in multi-source big image databases. In set theory, if set A is a strict superset of (or strictly includes) set B, then A ⊃ B. This doctoral project moved from the working hypothesis that SCBIR ⊃ computer vision (CV), where vision is a synonym of scene-from-image reconstruction and understanding, ⊃ EO image understanding (EO-IU) in operating mode, synonym of GEOSS, ⊃ ESA EO Level 2 product ⊃ human vision. Since this means that a necessary but not sufficient pre-condition for SCBIR is CV in operating mode, the working hypothesis has two corollaries. First, human visual perception, encompassing well-known visual illusions such as the Mach bands illusion, acts as a lower bound of CV within the multi-disciplinary domain of cognitive science, i.e., CV is conditioned to include a computational model of human vision. Second, a necessary but not sufficient pre-condition for a yet-unfulfilled GEOSS development is systematic generation at the ground segment of the ESA EO Level 2 product. Starting from this working hypothesis, the overarching goal of this doctoral project was to contribute in research and technical development (R&D) toward filling an analytic and pragmatic information gap from EO big sensory data to EO value-adding information products and services. This R&D objective was conceived to be twofold. First, to develop an original EO-IUS in operating mode, synonym of GEOSS, capable of systematic ESA EO Level 2 product generation from multi-source EO imagery. 
EO imaging sources vary in terms of: (i) platform, either spaceborne, airborne or terrestrial, (ii) imaging sensor, either: (a) optical, encompassing radiometrically calibrated or uncalibrated images, panchromatic or color images, either true- or false color red-green-blue (RGB), multi-spectral (MS), super-spectral (SS) or hyper-spectral (HS) images, featuring spatial resolution from low (> 1km) to very high (< 1m), or (b) synthetic aperture radar (SAR), specifically, bi-temporal RGB SAR imagery. The second R&D objective was to design and develop a prototypical implementation of an integrated closed-loop EO-IU for semantic querying (EO-IU4SQ) system as a GEOSS proof-of-concept in support of SCBIR. The proposed closed-loop EO-IU4SQ system prototype consists of two subsystems for incremental learning. A primary (dominant, necessary not sufficient) hybrid (combined deductive/top-down/physical model-based and inductive/bottom-up/statistical model-based) feedback EO-IU subsystem in operating mode requires no human-machine interaction to automatically transform in linear time a single-date MS image into an ESA EO Level 2 product as initial condition. A secondary (dependent) hybrid feedback EO Semantic Querying (EO-SQ) subsystem is provided with a graphic user interface (GUI) to streamline human-machine interaction in support of spatiotemporal EO big data analytics and SCBIR operations. EO information products generated as output by the closed-loop EO-IU4SQ system monotonically increase their value-added with closed-loop iterations

    2D and 3D surface image processing algorithms and their applications

    This doctoral dissertation develops algorithms for three applications: 2D image segmentation for solar filament disappearance detection, 3D mesh simplification, and 3D image warping in pre-surgery simulation. Filament area detection in solar images is an image segmentation problem. A combined thresholding and region growing method is proposed and applied in this application. Based on the filament area detection results, filament disappearances are reported in real time. The solar images from 1999 are processed with the proposed system and three statistical results on filaments are presented. 3D images can be obtained by passive and active range sensing. An image registration process finds the transformation between each pair of range views. To model an object, a common reference frame into which all views can be transformed must be defined. After registration, the range views should be integrated into a non-redundant model. Optimization is necessary to obtain a complete 3D model. A single surface representation can fit the data better. It may be further simplified for efficient rendering, storage and transmission, or the representation can be converted to other formats. This work proposes an efficient algorithm for solving the mesh simplification problem, approximating an arbitrary mesh by a simplified mesh. The algorithm uses a Root Mean Square distance error metric to decide the facet curvature. The two vertices of an edge and the surrounding vertices determine the average plane. The simplification results are excellent and the computation speed is fast. The algorithm is compared with six other major simplification algorithms. Image morphing covers all methods that gradually and continuously deform a source image into a target image while producing the in-between models. Image warping is a continuous deformation of a graphical object. A morphing process is usually composed of warping and interpolation. 
This work develops a direct-manipulation-of-free-form-deformation-based method and application for pre-surgical planning. The developed user interface provides a friendly interactive tool in the plastic surgery. Nose augmentation surgery is presented as an example. Displacement vector and lattices resulting in different resolution are used to obtain various deformation results. During the deformation, the volume change of the model is also considered based on a simplified skin-muscle model
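
To make the "thresholding and region growing combined method" mentioned in the abstract above more concrete, here is a minimal Python sketch of the general technique. It assumes filaments appear as dark pixels, seeds the region with a strict threshold, and grows it into 4-connected neighbours under a looser threshold. The function name, both threshold parameters, and the toy image are hypothetical and not taken from the dissertation.

```python
from collections import deque


def threshold_region_grow(image, seed_max, grow_max):
    """Seed on pixels <= seed_max, then grow into 4-connected neighbours
    whose value is <= grow_max. Returns a set of (row, col) pixels."""
    rows, cols = len(image), len(image[0])
    region = {
        (r, c) for r in range(rows) for c in range(cols) if image[r][c] <= seed_max
    }
    queue = deque(region)
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and image[nr][nc] <= grow_max:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


# Toy intensity image: one dark core (40) with dim neighbours, plus an
# isolated dim pixel (bottom right) that is not connected to any seed.
image = [
    [200, 200, 200, 200, 200],
    [200,  90,  40,  95, 200],
    [200, 200,  85, 200, 200],
    [200, 200, 200, 200,  90],
]
filament = threshold_region_grow(image, seed_max=50, grow_max=100)
```

The two-threshold design keeps isolated dim pixels out of the result: they pass the looser threshold but are never reached from a seed.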

    Management of spatial data for visualization on mobile devices

    Vector-based mapping is emerging as a preferred format in Location-based Services (LBS), because it can deliver an up-to-date and interactive map visualization. The Progressive Transmission (PT) technique has been developed to enable the efficient transmission of vector data over the internet by delivering various incremental levels of detail (LoD). However, it is still challenging to apply this technique in a mobile context due to many inherent limitations of mobile devices, such as small screen size, slow processors and limited memory. Taking account of these limitations, PT has been extended by developing a framework of efficient data management for the visualization of spatial data on mobile devices. A data generalization framework is proposed and implemented in a software application. This application can significantly reduce the volume of data for transmission and enable quick access to a simplified version of data while preserving appropriate visualization quality. Using volunteered geographic information as a case study, the framework shows flexibility in delivering up-to-date spatial information from dynamic data sources. Three models of PT are designed and implemented to transmit the additional LoD refinements: a full-scale PT as an inverse of generalisation, a view-dependent PT, and a heuristic optimised view-dependent PT. These models are evaluated with user trials and application examples. The heuristic optimised view-dependent PT has shown a significant enhancement over the traditional PT in terms of bandwidth saving and smoothness of transitions. A parallel data management strategy associated with three corresponding algorithms has been developed to handle LoD spatial data on mobile clients. This strategy enables the map rendering to be performed in parallel with a process which retrieves the data for the next map location the user will require. A view-dependent approach has been integrated to monitor the volume of each LoD for the visible area. The demonstration of a flexible rendering style shows its potential use in visualizing dynamic geoprocessed data. Future work may extend this to integrate topological constraints and semantic constraints for enhancing the vector map visualization
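
The levels of detail behind progressive transmission can be illustrated with a short Python sketch: the same polyline is generalized at several tolerances, so a coarse version could be sent first and finer ones later. The abstract does not name its generalization algorithm, so the sketch uses Douglas-Peucker line simplification as a stand-in. The function names, tolerances, and sample coordinates are assumptions.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection parameter so the nearest point stays on the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker: keep the endpoints, recurse on the farthest vertex
    if it lies more than `tolerance` away from the chord."""
    if len(points) < 3:
        return list(points)
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [points[0], points[-1]]
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split vertex


# One polyline generalized at three tolerances, finest to coarsest.
line = [(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7), (6, 8.1), (7, 9)]
levels = {tol: simplify(line, tol) for tol in (0.05, 0.5, 5.0)}
```

In a progressive scheme, a client would first receive the coarsest level and then only the vertices missing from it. That refinement step is not shown here.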

    Homotopy Based Reconstruction from Acoustic Images
