
    Discovering Regularity in Point Clouds of Urban Scenes

    Despite the apparent chaos of the urban environment, cities are actually replete with regularity. From the grid of streets laid out over the earth, to the lattice of windows thrown up into the sky, periodic regularity abounds in the urban scene. Just as salient, though less uniform, are the self-similar branching patterns of trees and vegetation that line streets and fill parks. We propose novel methods for discovering these regularities in 3D range scans acquired by a time-of-flight laser sensor. The applications of this regularity information are broad, and we present two original algorithms. The first exploits the efficiency of the Fourier transform for the real-time detection of periodicity in building facades. Periodic regularity is discovered online by doing a plane sweep across the scene and analyzing the frequency space of each column in the sweep. The simplicity and online nature of this algorithm allow it to be embedded in scanner hardware, making periodicity detection a built-in feature of future 3D cameras. We demonstrate the usefulness of periodicity in view registration, compression, segmentation, and facade reconstruction. The second algorithm leverages the hierarchical decomposition and locality in space of the wavelet transform to find stochastic parameters for procedural models that succinctly describe vegetation. These procedural models facilitate the generation of virtual worlds for architecture, gaming, and augmented reality. The self-similarity of vegetation can be inferred using multi-resolution analysis to discover the underlying branching patterns. We present a unified framework of these tools, enabling the modeling, transmission, and compression of high-resolution, accurate, and immersive 3D images.
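
The per-column frequency analysis described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `dominant_period`, the synthetic depth column, and the choice of a plain discrete Fourier transform (rather than a fast, hardware-friendly FFT) are all assumptions made for illustration.

```python
import cmath
import math


def dominant_period(column):
    """Return the dominant period (in samples) of a 1D profile, or None.

    Hypothetical helper: `column` stands in for one column of depth
    samples taken during a plane sweep across a facade.
    """
    n = len(column)
    mean = sum(column) / n
    centered = [v - mean for v in column]  # drop the DC component

    best_k, best_mag = None, 0.0
    # Scan the positive-frequency bins of a naive O(n^2) DFT.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * t / n)
            for t, v in enumerate(centered)
        )
        mag = abs(coeff)
        if mag > best_mag:
            best_k, best_mag = k, mag

    if best_k is None or best_mag < 1e-9:
        return None  # flat column: no periodicity to report
    return n / best_k


# Synthetic column (assumed data): a 3-sample window recess every 8 samples.
column = [1.0 if (t % 8) < 3 else 0.0 for t in range(64)]
period = dominant_period(column)
```

Picking the strongest non-DC bin is the simplest possible decision rule; a real detector would likely need a significance threshold and handling of harmonics.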

    Painting-to-3D Model Alignment Via Discriminative Visual Elements

    This paper describes a technique that can reliably align arbitrary 2D depictions of an architectural site, including drawings, paintings and historical photographs, with a 3D model of the site. This is a tremendously difficult task as the appearance and scene structure in the 2D depictions can be very different from the appearance and geometry of the 3D model, e.g., due to the specific rendering style, drawing error, age, lighting or change of seasons. In addition, we face a hard search problem: the number of possible alignments of the painting to a large 3D model, such as a partial reconstruction of a city, is huge. To address these issues, we develop a new compact representation of complex 3D scenes. The 3D model of the scene is represented by a small set of discriminative visual elements that are automatically learnt from rendered views. Similar to object detection, the set of visual elements, as well as the weights of individual features for each element, are learnt in a discriminative fashion. We show that the learnt visual elements are reliably matched in 2D depictions of the scene despite large variations in rendering style (e.g. watercolor, sketch, historical photograph) and structural changes (e.g. missing scene parts, large occluders) of the scene. We demonstrate an application of the proposed approach to automatic re-photography to find an approximate viewpoint of historical paintings and photographs with respect to a 3D model of the site. The proposed alignment procedure is validated via a human user study on a new database of paintings and sketches spanning several sites. The results demonstrate that our algorithm produces significantly better alignments than several baseline methods.

    Low-rank Based Algorithms for Rectification, Repetition Detection and De-noising in Urban Images

    In this thesis, we aim to solve the problems of automatic image rectification and repeated-pattern detection in 2D urban images, using novel low-rank based techniques. Repeated patterns (such as windows, tiles, balconies and doors) are prominent and significant features in urban scenes. Detection of these periodic structures is useful in many applications such as photorealistic 3D reconstruction, 2D-to-3D alignment, facade parsing, city modeling, classification, navigation, visualization in 3D map environments, shape completion, cinematography and 3D games. However, both image rectification and repeated-pattern detection are challenging due to scene occlusions, varying illumination, pose variation and sensor noise. Robust detection of these repeated patterns is therefore important for city scene analysis. Given a 2D image of an urban scene, we first automatically rectify the facade image and extract facade textures. Based on the rectified facade texture, we exploit novel algorithms that extract repeated patterns using Kronecker product based modeling, which rests on a solid theoretical foundation. We have tested our algorithms on a large set of images, which includes building facades from Paris, Hong Kong and New York.
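
To make the link between repetition and the Kronecker product concrete, the following is a small sketch under stated assumptions, not the thesis's algorithm. It only shows the forward model: an idealized rectified facade made of one tile repeated on a grid equals the Kronecker product of a layout matrix and that tile. The names `kron`, `layout`, and `tile` and the toy values are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a_val * b[i][j] for a_val in a_row for j in range(cols_b)]
        for a_row in a
        for i in range(rows_b)
    ]


# Assumed toy data: 2 floors x 3 windows, each window a 2x2 intensity tile.
layout = [
    [1, 1, 1],
    [1, 1, 1],
]
tile = [
    [0, 1],
    [1, 0],
]

# The idealized facade texture is the tile stamped at every layout position.
facade = kron(layout, tile)
```

Because the rank of a Kronecker product is the product of the ranks of its factors, an all-ones layout yields a facade whose rank equals that of the tile alone, which is one way to see why low-rank techniques suit repeated structures. Recovering `layout` and `tile` from a noisy, occluded image is the hard inverse problem the abstract refers to, and it is not attempted here.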

    Semantic Validation in Structure from Motion

    The Structure from Motion (SfM) challenge in computer vision is the process of recovering the 3D structure of a scene from a series of projective measurements that are calculated from a collection of 2D images, taken from different perspectives. SfM consists of three main steps: feature detection and matching, camera motion estimation, and recovery of 3D structure from estimated intrinsic and extrinsic parameters and features. A problem encountered in SfM is that scenes lacking texture or with repetitive features can cause erroneous feature matching between frames. Semantic segmentation offers a route to validate and correct SfM models by labelling pixels in the input images with the use of a deep convolutional neural network. The semantic and geometric properties associated with classes in the scene can be taken advantage of to apply prior constraints to each class of object. The SfM pipeline COLMAP and semantic segmentation pipeline DeepLab were used. These, along with planar reconstruction of the dense model, were used to identify erroneous points that may be occluded from the calculated camera position, given the semantic label and thus the prior constraint of the reconstructed plane. Herein, semantic segmentation is integrated into SfM to apply priors on the 3D point cloud, given the object detection in the 2D input images. Additionally, the semantic labels of matched keypoints are compared and points with inconsistent semantic labels are discarded. Furthermore, semantic labels on input images are used for the removal of objects associated with motion in the output SfM models. The proposed approach is evaluated on a dataset of 1102 images of a repetitive architecture scene. This project offers a novel method for improved validation of 3D SfM models.
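
The label-consistency check on matched keypoints, together with the removal of motion-related classes, can be sketched as below. This is an illustrative sketch only and does not use the COLMAP or DeepLab APIs: the function `filter_matches`, the label maps, and the class names are assumptions, with per-pixel labels represented as nested lists indexed `[y][x]`.

```python
def filter_matches(matches, labels_a, labels_b, dynamic_labels=("car",)):
    """Keep matches whose two endpoints agree on a static semantic label.

    matches: list of ((xa, ya), (xb, yb)) pixel pairs between images A and B.
    labels_a, labels_b: hypothetical per-pixel label maps, indexed [y][x].
    dynamic_labels: classes assumed to be associated with motion.
    """
    kept = []
    for (xa, ya), (xb, yb) in matches:
        label_a = labels_a[ya][xa]
        label_b = labels_b[yb][xb]
        if label_a != label_b:
            continue  # semantically inconsistent match: discard
        if label_a in dynamic_labels:
            continue  # consistent, but on a moving object: discard
        kept.append(((xa, ya), (xb, yb)))
    return kept


# Assumed toy label maps for two 2x2 images.
labels_a = [["wall", "window"], ["wall", "car"]]
labels_b = [["wall", "window"], ["road", "wall"]]
matches = [
    ((0, 0), (0, 0)),  # wall   -> wall   (kept)
    ((1, 0), (1, 0)),  # window -> window (kept)
    ((1, 1), (0, 1)),  # car    -> road   (discarded)
    ((0, 1), (1, 1)),  # wall   -> wall   (kept)
]
kept = filter_matches(matches, labels_a, labels_b)
```

In a real pipeline the filter would run between feature matching and geometric verification, so that mismatches caused by repetitive facades are removed before camera poses are estimated.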

    Geometry-driven feature detection

    Matching images taken from different viewpoints is a fundamental step for many computer vision applications including 3D reconstruction, scene recognition, virtual reality, robot localization, etc. The typical approaches detect feature keypoints based on local properties to achieve robustness to viewpoint changes, and establish correspondences between keypoints to recover the 3D geometry or determine the similarity between images. The complexity of perspective distortion challenges the detection of viewpoint invariant features; the lack of 3D geometric information about local features makes their matching inefficient. In this thesis, I explore feature detection based on 3D geometric information for improved projective invariance. The main novel research contributions of this thesis are as follows. First, I give a projective invariant feature detection method that exploits 3D structures recovered from simple stereo matching. By leveraging the rich geometric information of the detected features, I present an efficient 3D matching algorithm to handle large viewpoint changes. Second, I propose a compact high-level feature detector that robustly extracts repetitive structures in urban scenes, which allows efficient wide-baseline matching. I further introduce a novel single-view reconstruction approach to recover the 3D dense geometry of the repetition-based features.

    Holistic Multi-View Building Analysis in the Wild with Projection Pooling

    We address six different classification tasks related to fine-grained building attributes: construction type, number of floors, pitch and geometry of the roof, facade material, and occupancy class. Tackling such a remote building analysis problem became possible only recently due to growing large-scale datasets of urban scenes. To this end, we introduce a new benchmarking dataset, consisting of 49426 images (top-view and street-view) of 9674 buildings. These photos are further assembled, together with the geometric metadata. The dataset showcases various real-world challenges, such as occlusions, blur, partially visible objects, and a broad spectrum of buildings. We propose a new projection pooling layer, creating a unified, top-view representation of the top-view and the side views in a high-dimensional space. It allows us to utilize the building and imagery metadata seamlessly. Introducing this layer improves classification accuracy compared to highly tuned baseline models, indicating its suitability for building analysis. Comment: Accepted for publication at the 35th AAAI Conference on Artificial Intelligence (AAAI 2021).