
    Irish Machine Vision and Image Processing Conference Proceedings 2017

    Robust Methods for Accurate and Efficient Reconstruction from Motion Imagery

    Creating virtual representations of real-world scenes has been a long-standing goal in photogrammetry and computer vision, and has high practical relevance in industries involved in creating intelligent urban solutions. This includes a wide range of applications such as urban and community planning, reconnaissance missions by the military and government, autonomous robotics, virtual reality, cultural heritage preservation, and many others. Over the last decades, image-based modeling emerged as one of the most popular solutions. The objective is to extract metric information directly from images. Many procedural techniques achieve good results in terms of robustness, accuracy, completeness, and efficiency. More recently, deep-learning-based techniques were proposed to tackle this problem by training on vast amounts of data to learn to associate features between images through deep convolutional neural networks and were shown to outperform traditional procedural techniques. However, many of the key challenges such as large displacement and scalability still remain, especially when dealing with large-scale aerial imagery. This thesis investigates image-based modeling and proposes robust and scalable methods for large-scale aerial imagery. First, we present a method for reconstructing large-scale areas from aerial imagery that formulates the solution as a single-step process, reducing the processing time considerably. Next, we address feature matching and propose a variational optical flow technique (HybridFlow) for dense feature matching that leverages the robustness of graph matching to large displacements. The proposed solution efficiently handles arbitrary-sized aerial images. Finally, for general-purpose image-based modeling, we propose a deep-learning-based approach, an end-to-end multi-view structure from motion employing hypercorrelation volumes for learning dense feature matches. 
We demonstrate the proposed techniques on several applications and report task-related measures.

    Development of a SGM-based multi-view reconstruction framework for aerial imagery

    Advances in the technology of digital airborne camera systems allow for the observation of surfaces with sampling rates in the range of a few centimeters. In combination with novel matching approaches, which estimate depth information for virtually every pixel, surface reconstructions of impressive density and precision can be generated. Image-based surface generation has therefore become a serious alternative to LiDAR-based data collection for many applications. Surface models serve as the primary basis for geographic products such as map creation, the production of true orthophotos, or visualization within virtual globes. The goal of the presented thesis is the development of a framework for the fully automatic generation of 3D surface models based on aerial images, covering both standard nadir and oblique views. This comprises several challenges. On the one hand, the dimensions of aerial imagery are considerable, and the extent of the areas to be reconstructed can encompass whole countries. Besides scalability of the methods, this also requires reasonable processing times and efficient handling of the given hardware resources. Moreover, besides high precision requirements, a high degree of automation has to be guaranteed to limit manual interaction as much as possible. Due to its advantages in scalability, a stereo method is utilized in the presented thesis. The approach for dense stereo is based on an adapted version of the semi-global matching (SGM) algorithm. Following a hierarchical approach, corresponding image regions and meaningful disparity search ranges are identified. It will be verified that, depending on the undulations of the scene, time and memory demands can be reduced significantly, by up to 90% in some of the conducted tests. This enables the processing of aerial datasets on standard desktop machines in reasonable time, even for large fields of depth. 
Stereo approaches generate disparity or depth maps in which redundant depth information is available. To exploit this redundancy, a method for the refinement of stereo correspondences is proposed. Redundant observations across stereo models are identified and checked for geometric consistency, and their reprojection error is minimized. This way, outliers are removed and the precision of depth estimates is improved. In order to generate consistent surfaces, two algorithms for depth map fusion were developed. The first fusion strategy aims at the generation of 2.5D height models, also known as digital surface models (DSM). The proposed method improves on existing methods regarding quality in areas of depth discontinuities, for example at roof edges. Utilizing benchmarks designed for the evaluation of image-based DSM generation, we show that the developed approaches compare favorably to state-of-the-art algorithms and that height precisions of a few GSDs can be achieved. Furthermore, methods for the derivation of meshes based on DSM data are discussed. The fusion of depth maps for 3D scenes, as frequently required, for example, during the evaluation of high-resolution oblique aerial images in complex urban environments, demands a different approach, since such scenes can in general not be represented as height fields. Moreover, depths across depth maps possess varying precision and sampling rates due to variances in image scale, errors in orientation and other effects. Within this thesis, a median-based fusion methodology is proposed. By using geometry-adaptive triangulation of depth maps, depth-wise normals are extracted and, along with the point coordinates, are filtered and fused using tree structures. The output of this method is a set of oriented points, which can then be used to generate meshes. Precision and density of the method will be evaluated using established multi-view benchmarks. 
Besides demonstrating the capability to process close-range datasets, the thesis presents results for large oblique airborne datasets. The report closes with a summary, a discussion of limitations, and perspectives regarding improvements and enhancements. The implemented algorithms are core elements of the commercial software package SURE, which is freely available for scientific purposes.
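
For readers unfamiliar with semi-global matching, the following is a minimal sketch of the standard SGM cost aggregation along a single scanline direction. It is not the adapted, hierarchical variant developed in the thesis, and it is unrelated to the SURE implementation. The function names, the toy cost values, and the penalty values `p1` and `p2` are assumptions chosen purely for illustration.

```python
def aggregate_path(costs, p1=1.0, p2=8.0):
    """Aggregate matching costs along one left-to-right scanline path.

    costs[x][d] is the matching cost of pixel x at disparity d.
    p1 penalizes disparity changes of one level, p2 penalizes larger jumps.
    Returns a list of the same shape holding the aggregated path costs.
    """
    num_disp = len(costs[0])
    aggregated = [list(costs[0])]  # the first pixel has no predecessor
    for x in range(1, len(costs)):
        prev = aggregated[-1]
        prev_min = min(prev)
        row = []
        for d in range(num_disp):
            best = prev[d]  # same disparity as predecessor: no penalty
            if d > 0:
                best = min(best, prev[d - 1] + p1)
            if d + 1 < num_disp:
                best = min(best, prev[d + 1] + p1)
            best = min(best, prev_min + p2)  # arbitrary jump
            # Subtracting prev_min keeps the path costs bounded.
            row.append(costs[x][d] + best - prev_min)
        aggregated.append(row)
    return aggregated


def winner_takes_all(cost_rows):
    """Pick the disparity with the lowest cost for every pixel."""
    return [min(range(len(row)), key=row.__getitem__) for row in cost_rows]


# Toy scanline (hypothetical values): pixel 2 has a noisy cost minimum at d=1.
toy_costs = [[0, 9, 9], [0, 9, 9], [4, 3, 9], [0, 9, 9]]
raw = winner_takes_all(toy_costs)
smoothed = winner_takes_all(aggregate_path(toy_costs, p1=2.0, p2=8.0))
```

In a full SGM pipeline, path costs like these are summed over several directions (commonly 8 or 16) before the per-pixel minimum is taken. Restricting the disparity range per pixel, as the hierarchical approach in the abstract does, would shrink the inner loop over `d`.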

    Videos in Context for Telecommunication and Spatial Browsing

    The research presented in this thesis explores the use of videos embedded in panoramic imagery to transmit spatial and temporal information describing remote environments and their dynamics. Virtual environments (VEs) through which users can explore remote locations are rapidly emerging as a popular medium of presence and remote collaboration. However, capturing a visual representation of locations to be used in VEs is usually a tedious process that requires either manual modelling of environments or the employment of specific hardware. Capturing environment dynamics is not straightforward either, and it is usually performed through specific tracking hardware. Similarly, browsing large unstructured video collections with available tools is difficult, as the abundance of spatial and temporal information makes them hard to comprehend. On a spectrum between 3D VEs and 2D images, panoramas lie in between, as they offer the accessibility of 2D images while preserving the surrounding representation of 3D virtual environments. For this reason, panoramas are an attractive basis for videoconferencing and browsing tools, as they can relate several videos temporally and spatially. This research explores methods to acquire, fuse, render and stream data coming from heterogeneous cameras, with the help of panoramic imagery. Three distinct but interrelated questions are addressed. First, the thesis considers how spatially localised video can be used to increase the spatial information transmitted during video-mediated communication, and whether this improves the quality of communication. Second, the research asks whether videos in panoramic context can be used to convey spatial and temporal information of a remote place and the dynamics within, and whether this improves users' performance in tasks that require spatio-temporal thinking. Finally, the thesis considers whether there is an impact of display type on reasoning about events within videos in panoramic context. 
These research questions were investigated over three experiments, covering scenarios common to computer-supported cooperative work and video browsing. To support the investigation, two distinct video+context systems were developed. The first telecommunication experiment compared our videos-in-context interface with fully panoramic video and conventional webcam video conferencing in an object placement scenario. The second experiment investigated the impact of videos in panoramic context on the quality of spatio-temporal thinking during localization tasks. To support the experiment, a novel interface to video collections in panoramic context was developed and compared with common video-browsing tools. The final experimental study investigated the impact of display type on reasoning about events. The study explored three adaptations of our video-collection interface to three display types. The overall conclusion is that videos in panoramic context offer a valid solution to spatio-temporal exploration of remote locations. Our approach presents a richer visual representation in terms of space and time than standard tools, showing that providing panoramic contexts to video collections makes spatio-temporal tasks easier. To this end, videos in context are a suitable alternative to more difficult and often expensive solutions. These findings are beneficial to many applications, including teleconferencing, virtual tourism and remote assistance.

    2019 EC3 July 10-12, 2019 Chania, Crete, Greece
