    3D Scene Geometry Estimation from 360° Imagery: A Survey

    This paper provides a comprehensive survey of pioneering and state-of-the-art 3D scene geometry estimation methodologies based on single, two, or multiple images captured with omnidirectional optics. We first revisit the basic concepts of the spherical camera model, and review the most common acquisition technologies and representation formats suitable for omnidirectional (also called 360°, spherical or panoramic) images and videos. We then survey monocular layout and depth inference approaches, highlighting the recent advances in learning-based solutions suited for spherical data. The classical stereo matching is then revisited in the spherical domain, where methodologies for detecting and describing sparse and dense features become crucial. The stereo matching concepts are then extended to multiple-view camera setups, categorizing them among light fields, multi-view stereo, and structure from motion (or visual simultaneous localization and mapping). We also compile and discuss commonly adopted datasets and figures of merit indicated for each purpose and list recent results for completeness. We conclude this paper by pointing out current and future trends. Comment: Published in ACM Computing Surveys
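As a rough illustration of the spherical camera model and the equirectangular representation mentioned in this abstract, the sketch below maps a pixel to a unit viewing direction and back. The axis convention (y up, z forward), the pixel-centre offset of 0.5, and the function names `pixel_to_ray` and `ray_to_pixel` are assumptions made for this example; the survey itself may use a different convention.

```python
import math


def pixel_to_ray(u, v, width, height):
    """Map an equirectangular pixel (u, v) to a unit direction (x, y, z).

    Assumed convention: longitude spans [-pi, pi) left to right,
    latitude spans [pi/2, -pi/2] top to bottom, y is up, z is forward.
    """
    lon = ((u + 0.5) / width - 0.5) * 2.0 * math.pi
    lat = (0.5 - (v + 0.5) / height) * math.pi
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)


def ray_to_pixel(x, y, z, width, height):
    """Inverse mapping: project a 3D direction back to pixel coordinates."""
    norm = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(x, z)
    lat = math.asin(max(-1.0, min(1.0, y / norm)))
    u = (lon / (2.0 * math.pi) + 0.5) * width - 0.5
    v = (0.5 - lat / math.pi) * height - 0.5
    return (u, v)
```

Under these assumptions, the image centre looks along the forward axis, and every image column corresponds to one meridian of the sphere.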

    Data-driven depth and 3D architectural layout estimation of an interior environment from monocular panoramic input

    Recent years have seen significant interest in the automatic 3D reconstruction of indoor scenes, leading to a distinct and very active sub-field within 3D reconstruction. The main objective is to convert rapidly measured data representing real-world indoor environments into models encompassing geometric, structural, and visual abstractions. This thesis focuses on the particular subject of extracting geometric information from single panoramic images, using either visual data alone or sparse registered depth information. The appeal of this setup lies in the efficiency and cost-effectiveness of data acquisition using 360° images. The challenge, however, is that creating a comprehensive model from mostly visual input is extremely difficult, due to noise, missing data, and clutter. My research has concentrated on leveraging prior information, in the form of architectural and data-driven priors derived from large annotated datasets, to develop end-to-end deep learning solutions for specific tasks in the structured reconstruction pipeline. My first contribution consists of a deep neural network architecture for estimating a depth map from a single monocular indoor panorama, operating directly on the equirectangular projection. Leveraging the characteristics of indoor 360-degree images and recognizing the impact of gravity on indoor scene design, the network efficiently encodes the scene into vertical spherical slices. By exploiting long- and short-term relationships among these slices, it recovers an equirectangular depth map directly from the corresponding RGB image. My second contribution generalizes the approach to handle multimodal input, also covering the situation in which the equirectangular input image is paired with a sparse depth map, as provided by common capture setups. Depth is inferred using an efficient single-branch network with a dynamic gating system, processing both dense visual data and sparse geometric data. Additionally, a new augmentation strategy enhances the model's robustness to various types of sparsity, including those from structured light sensors and LiDAR setups. While the first two contributions focus on per-pixel geometric information, my third contribution addresses the recovery of the 3D shape of permanent room surfaces from a single panoramic image. Unlike previous methods, this approach tackles the problem in 3D, expanding the reconstruction space. It employs a graph convolutional network to directly infer the room structure as a 3D mesh, deforming a graph-encoded tessellated sphere mapped to the spherical panorama. Gravity-aligned features are actively incorporated using a projection layer with multi-head self-attention, and specialized losses guide plausible solutions in the presence of clutter and occlusions. The benchmarks on publicly available data show that all three methods provide significant improvements over the state of the art.
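To make concrete what an equirectangular depth map encodes, the following minimal sketch back-projects such a map into a 3D point list. It is not the thesis's network or pipeline. It assumes that each depth value is the Euclidean distance along the viewing ray, that non-positive values mark missing samples (as in a sparse depth input), and that y is the up axis; the function name `depth_to_points` is hypothetical.

```python
import math


def depth_to_points(depth):
    """Back-project an equirectangular depth map into 3D points.

    depth: list of rows, each a list of ray lengths (assumed to be
    Euclidean distance from the camera centre). Entries <= 0 are
    treated as missing and skipped.
    """
    height = len(depth)
    width = len(depth[0])
    points = []
    for v, row in enumerate(depth):
        # Latitude is constant along an image row.
        lat = (0.5 - (v + 0.5) / height) * math.pi
        for u, d in enumerate(row):
            if d <= 0:
                continue  # missing or invalid sample
            lon = ((u + 0.5) / width - 0.5) * 2.0 * math.pi
            points.append((
                d * math.cos(lat) * math.sin(lon),
                d * math.sin(lat),
                d * math.cos(lat) * math.cos(lon),
            ))
    return points
```

A map filled with one constant value would, under these assumptions, produce points lying on a sphere of that radius around the camera.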

    Sketching space

    In this paper, we present a sketch modelling system which we call Stilton. The program resembles a desktop VRML browser, allowing a user to navigate a three-dimensional model in a perspective projection, or panoramic photographs, which the program maps onto the scene as a 'floor' and 'walls'. We place an imaginary two-dimensional drawing plane in front of the user, and any geometric information that the user sketches onto this plane may be reconstructed to form solid objects through an optimization process. We show how the system can be used to reconstruct geometry from panoramic images, or to add new objects to an existing model. While panoramic imaging can greatly assist with some aspects of site familiarization and qualitative assessment of a site, without the addition of some foreground geometry it offers only limited utility in a design context. Therefore, we suggest that the system may be of use in 'just-in-time' CAD recovery of complex environments, such as shop floors or construction sites, by recovering objects through sketched overlays, where other methods such as automatic line-retrieval may be impossible. The result of using the system in this manner is the 'sketching of space' (sketching out a volume around the user), and once the geometry has been recovered, the designer is free to quickly sketch design ideas into the newly constructed context, or analyze the space around them. Although end-user trials have not yet been undertaken, we believe that this implementation may afford a user interface that is both accessible and robust, and that the rapid growth of pen-computing devices will further stimulate activity in this area.

    Semi-automatic 3D reconstruction of urban areas using epipolar geometry and template matching

    In this work we describe a novel technique for semi-automatic three-dimensional (3D) reconstruction of urban areas from airborne stereo-pair images, whose output is VRML or DXF. The main challenge is to compute the relevant information (building height and volume, roof description, and texture) algorithmically, because it is very time consuming and thus expensive to produce it manually for large urban areas. The algorithm requires some initial calibration input and is able to compute the above-mentioned building characteristics from the stereo pair, given the 2D CAD and the digital elevation model of the same area, with no knowledge of the camera pose or its intrinsic parameters. To achieve this, we have used epipolar geometry, homography computation, and automatic feature extraction, and we have solved the feature correspondence problem in the stereo pair by using template matching. (Web of Science accession number: WOS:000240143800002)
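The abstract does not say which similarity measure the template matching uses. As a hedged sketch of the general idea, the example below scores candidate patches with normalized cross-correlation (NCC) and searches along a single image row, which assumes a rectified pair in which epipolar lines coincide with rows. The functions `ncc`, `patch`, and `match_along_row` are hypothetical names for this illustration, not the authors' implementation.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equally sized flat patches."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return num / den if den > 0 else 0.0


def patch(image, row, col, half):
    """Flatten the (2*half+1)^2 window centred at (row, col)."""
    return [
        image[r][c]
        for r in range(row - half, row + half + 1)
        for c in range(col - half, col + half + 1)
    ]


def match_along_row(left, right, row, col, half=1, max_disp=4):
    """Find the column in `right` that best matches the left patch.

    Assumes a rectified pair, so the search is restricted to the same
    row, and that the match lies at or to the left of `col`.
    Returns (best_column, best_score).
    """
    template = patch(left, row, col, half)
    best_col, best_score = None, -2.0
    for d in range(max_disp + 1):
        c = col - d
        if c - half < 0:
            break  # window would leave the image
        score = ncc(template, patch(right, row, c, half))
        if score > best_score:
            best_col, best_score = c, score
    return best_col, best_score
```

Restricting the search to the epipolar line is what keeps the correspondence problem one-dimensional. In an unrectified pair the same scoring could be applied along the line predicted by the fundamental matrix instead of along a row.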

    Fisheye Photogrammetry to Survey Narrow Spaces in Architecture and a Hypogea Environment

    The increasing computational power of commercial-grade processors has led to the wide spread of image-based reconstruction software and to its application in different disciplines. As a result, new frontiers regarding the use of photogrammetry in a vast range of investigation activities are being explored. This paper investigates the use of fisheye lenses in non-classical survey activities, along with the related problems. Fisheye lenses stand out because of their large field of view. This characteristic alone can be a game changer in reducing the amount of data required, thus speeding up the photogrammetric process when needed. Although they come at a cost, field of view (FOV), speed and manoeuvrability are key to the success of those optics, as shown by two of the presented case studies: the survey of a very narrow spiral staircase located in the Duomo di Milano and the survey of a very narrow hypogea structure in Rome. A third case study, which deals with low-cost sensors, shows the metric evaluation of a commercial spherical camera equipped with fisheye lenses.

    Fracture mapping in challenging environment: a 3D virtual reality approach combining terrestrial LiDAR and high definition images

    This is the author accepted manuscript. The final version is available from Springer Verlag via the DOI in this record. The latest technological developments in computer vision allow the creation of georeferenced, non-immersive desktop virtual reality (VR) environments. VR uses a computer to produce a simulated three-dimensional world in which it is possible to interact with objects and derive metric and thematic data. In this context, modern geomatic tools enable the remote acquisition of information that can be used to produce georeferenced high-definition 3D models: these can be used to create a VR in support of rock mass data processing, analysis, and interpretation. Data from laser scanning and high-quality images were combined to deterministically map and characterise discontinuities with the aim of creating accurate rock mass models. Discontinuities were compared with data from traditional engineering-geological surveys in order to check the level of accuracy in terms of the attitude of individual joints and sets. The quality of data collected through geomatic surveys and field measurements in two marble quarries of the Apuan Alps (Italy) was very satisfactory. Some fundamental geotechnical indices (e.g. joint roughness, alteration, opening, moisture, and infill) were also included in the VR models. Data were grouped, analysed, and shared in a single repository for VR visualization and stability analysis in order to study the interaction between geology and human activities. The authors gratefully acknowledge the assistance of the personnel of the Romana Quarry and particularly Corniani M. This paper was possible because of support from the Tuscany Region Research Project known as “Health and safety in the quarries of ornamental stones—SECURECAVE”. The authors acknowledge Pellegri M and Gullì D (Local Sanitary Agency n.1, Mining Engineering Operative Unit—Department of Prevention) and Riccucci S (Centre of GeoTechnologies, University of Siena) for their support of this research.

    Dynamic 3D Urban Scene Modeling Using Multiple Pushbroom Mosaics

    In this paper, a unified, segmentation-based approach is proposed to deal with both stereo reconstruction and moving object detection problems using multiple stereo mosaics. Each set of parallel-perspective (pushbroom) stereo mosaics is generated from a video sequence captured by a single video camera. First, a color segmentation approach is used to extract the so-called natural matching primitives from a reference view of a pair of stereo mosaics to facilitate both 3D reconstruction of textureless urban scenes and detection of man-made moving targets (e.g. vehicles). Multiple pairs of stereo mosaics are used to improve the accuracy and robustness of 3D recovery and occlusion handling. Moving targets are detected by inspecting their 3D anomalies, either violating the epipolar geometry of the pushbroom stereo or exhibiting abnormal 3D structure. Experimental results on both simulated and real video sequences are provided to show the effectiveness of our approach.