
    Multi-frame scene-flow estimation using a patch model and smooth motion prior

    This paper addresses the problem of estimating the dense 3D motion of a scene over several frames using a set of calibrated cameras. Most current 3D motion estimation techniques are limited to estimating the motion over a single frame, unless a strong prior model of the scene (such as a skeleton) is introduced. Estimating the 3D motion of a general scene is difficult due to untextured surfaces, complex movements and occlusions. In this paper, we show that it is possible to track the surfaces of a scene over several frames by introducing an effective prior on the scene motion. Experimental results show that the proposed method estimates the dense scene flow over multiple frames, without the need for multiple-view reconstructions at every frame. Furthermore, the accuracy of the proposed method is demonstrated by comparing the estimated motion against ground truth.
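
    The abstract does not spell out the exact form of the smooth motion prior, but a common way to encode such an assumption over several frames is to penalize the acceleration of each tracked patch. The Python sketch below illustrates one plausible penalty of this kind; the names `smooth_motion_energy`, `tracks` and `lam` are illustrative, not taken from the paper.

    import numpy as np

    def smooth_motion_energy(tracks, lam=1.0):
        """One plausible smooth-motion prior (an assumption, not the
        paper's exact term): penalize the second temporal difference,
        i.e. the acceleration, of each patch trajectory.

        tracks: (num_patches, num_frames, 3) array of 3D patch centers.
        Returns a scalar energy; near-constant-velocity motion scores low.
        """
        accel = tracks[:, 2:] - 2.0 * tracks[:, 1:-1] + tracks[:, :-2]
        return lam * np.sum(accel ** 2)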

    Multi-View Stereo with Single-View Semantic Mesh Refinement

    While 3D reconstruction is a well-established and widely explored research topic, semantic 3D reconstruction has only recently witnessed an increasing share of attention from the Computer Vision community. Semantic annotations in fact allow one to enforce strong class-dependent priors, such as planarity for ground and walls, which can be exploited to refine the reconstruction, often resulting in non-trivial performance improvements. State-of-the-art methods propose volumetric approaches to fuse RGB image data with semantic labels; even if successful, they do not scale well and fail to output high-resolution meshes. In this paper we propose a novel method to refine both the geometry and the semantic labeling of a given mesh. We refine the mesh geometry by applying a variational method that optimizes a composite energy made of a state-of-the-art pairwise photometric term and a single-view term that models the semantic consistency between the labels of the 3D mesh and those of the segmented images. We also update the semantic labeling through a novel Markov Random Field (MRF) formulation that, together with the classical data and smoothness terms, takes into account class-specific priors estimated directly from the annotated mesh. This is in contrast to state-of-the-art methods that are typically based on handcrafted or learned priors. We are the first, jointly with the very recent and seminal work of [M. Blaha et al., arXiv:1706.08336, 2017], to propose the use of semantics inside a mesh refinement framework. Differently from [M. Blaha et al., arXiv:1706.08336, 2017], which adopts a more classical pairwise comparison to estimate the flow of the mesh, we apply a single-view comparison between the semantically annotated image and the current 3D mesh labels; this improves robustness in the case of noisy segmentations. (Comment: 3D Reconstruction Meets Semantics, ICCV workshop)
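
    As a rough illustration of the single-view semantic term described above, one can score how well the labels of the current mesh, rendered into a viewpoint, agree with that view's 2D segmentation. The sketch below assumes both are available as per-pixel label maps; the function and argument names are hypothetical, not the authors' formulation.

    import numpy as np

    def single_view_semantic_cost(rendered_labels, segmented_labels, valid_mask):
        """Hypothetical single-view consistency score: the fraction of
        visible pixels where the label rendered from the 3D mesh
        disagrees with the image's semantic segmentation.

        rendered_labels, segmented_labels: (H, W) integer label maps.
        valid_mask: (H, W) bool array, True where the mesh projects.
        """
        disagree = (rendered_labels != segmented_labels) & valid_mask
        return disagree.sum() / max(valid_mask.sum(), 1)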

    Semantically Informed Multiview Surface Refinement

    We present a method to jointly refine the geometry and semantic segmentation of 3D surface meshes. Our method alternates between updating the shape and the semantic labels. In the geometry refinement step, the mesh is deformed with variational energy minimization, such that it simultaneously maximizes photo-consistency and the compatibility of the semantic segmentations across a set of calibrated images. Label-specific shape priors account for interactions between the geometry and the semantic labels in 3D. In the semantic segmentation step, the labels on the mesh are updated with MRF inference, such that they are compatible with the semantic segmentations in the input images. Also, this step includes prior assumptions about the surface shape of different semantic classes. The priors induce a tight coupling, where semantic information influences the shape update and vice versa. Specifically, we introduce priors that favor (i) adaptive smoothing, depending on the class label; (ii) straightness of class boundaries; and (iii) semantic labels that are consistent with the surface orientation. The novel mesh-based reconstruction is evaluated in a series of experiments with real and synthetic data. We compare both to state-of-the-art, voxel-based semantic 3D reconstruction, and to purely geometric mesh refinement, and demonstrate that the proposed scheme yields improved 3D geometry as well as an improved semantic segmentation.
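
    To make prior (i), class-dependent adaptive smoothing, concrete, the sketch below weights a standard Laplacian smoothness penalty per vertex by the smoothing strength of its semantic class. The class names and weight values are illustrative assumptions, not numbers from the paper.

    import numpy as np

    # Illustrative per-class smoothing strengths: planar classes such as
    # ground get heavy smoothing, vegetation very little.
    CLASS_SMOOTHNESS = {"ground": 1.0, "building": 0.7, "vegetation": 0.1}

    def adaptive_smoothing_energy(laplacians, vertex_classes):
        """Class-dependent smoothness term: squared per-vertex Laplacian
        magnitudes, each weighted by the smoothing strength of the
        vertex's semantic class.

        laplacians: (num_vertices, 3) mesh Laplacian at each vertex.
        vertex_classes: sequence of num_vertices class-name strings.
        """
        w = np.array([CLASS_SMOOTHNESS[c] for c in vertex_classes])
        return np.sum(w * np.sum(laplacians ** 2, axis=1))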

    MVPNet: Multi-View Point Regression Networks for 3D Object Reconstruction from A Single Image

    In this paper, we address the problem of reconstructing an object's surface from a single image using generative networks. First, we represent a 3D surface with an aggregation of dense point clouds from multiple views. Each point cloud is embedded in a regular 2D grid aligned on an image plane of a viewpoint, making the point cloud convolution-friendly and ordered so as to fit into deep network architectures. The point clouds can be easily triangulated by exploiting the connectivity of the 2D grids to form mesh-based surfaces. Second, we propose an encoder-decoder network that generates such view-dependent point clouds from a single image by regressing their 3D coordinates and visibilities. We also introduce a novel geometric loss that measures discrepancy over 3D surfaces, as opposed to 2D projective planes, by resorting to the surface discretization provided by the constructed meshes. We demonstrate that the multi-view point regression network outperforms state-of-the-art methods with a significant improvement on challenging datasets. (Comment: 8 pages; accepted by AAAI 2019)
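
    The grid-based triangulation mentioned above is easy to make concrete: since each view-dependent point cloud lives on a regular H x W grid, every 2 x 2 cell of neighboring grid points can be split into two triangles. A minimal sketch, with an indexing convention assumed rather than taken from the paper:

    import numpy as np

    def grid_to_triangles(h, w):
        """Triangulate an h-by-w point grid by splitting each 2x2 cell
        into two triangles. Returns an (n_faces, 3) array of indices
        into the row-major flattened grid of points."""
        idx = np.arange(h * w).reshape(h, w)
        tl = idx[:-1, :-1].ravel()  # top-left corner of each cell
        tr = idx[:-1, 1:].ravel()   # top-right
        bl = idx[1:, :-1].ravel()   # bottom-left
        br = idx[1:, 1:].ravel()    # bottom-right
        return np.concatenate([np.stack([tl, bl, tr], axis=1),
                               np.stack([tr, bl, br], axis=1)])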

    Overlapping camera clustering through dominant sets for scalable 3D reconstruction


    Multiple View Stereo by Reflectance Modeling


    OBLIQUE MULTI-CAMERA SYSTEMS - ORIENTATION AND DENSE MATCHING ISSUES

    The use of oblique imagery has become a standard for many civil and mapping applications, thanks to the development of airborne digital multi-camera systems, as proposed by many companies (Blomoblique, IGI, Leica, Midas, Pictometry, Vexcel/Microsoft, VisionMap, etc.). The indisputable virtue of oblique photography lies in its simplicity of interpretation and understanding for inexperienced users, allowing oblique images to be used in very different applications, such as building detection and reconstruction, building structural damage classification, road land updating and administration services, etc. The paper reports an overview of the current commercial oblique systems and presents a workflow for the automated orientation and dense matching of large image blocks. Perspectives, potentialities, pitfalls and suggestions for achieving satisfactory results are given. Tests performed on two datasets acquired with two multi-camera systems over urban areas are also reported. [Figure 1: Large urban area pictured with an oblique multi-camera system. Once advanced image triangulation methods have retrieved the interior and exterior parameters of the cameras, dense point clouds can be derived for 3D city modelling, feature extraction and mapping purposes.]

    Structure and motion from scene registration

    We propose a method for estimating the 3D structure and the dense 3D motion (scene flow) of a dynamic nonrigid 3D scene, using a camera array. The core idea is to use a dense multi-camera array to construct a novel, dense 3D volumetric representation of the 3D space, where each voxel holds an estimated intensity value and a confidence measure of this value. The problem of 3D structure and 3D motion estimation of a scene is thus reduced to a nonrigid registration of two volumes - hence the term "Scene Registration". Registering two dense 3D scalar volumes does not require recovering the 3D structure of the scene as a preprocessing step, nor does it require explicit reasoning about occlusions. From this nonrigid registration we accurately extract the 3D scene flow and the 3D structure of the scene, and successfully recover the sharp discontinuities in both time and space. We demonstrate the advantages of our method on a number of challenging synthetic and real data sets.
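
    The abstract leaves the per-voxel estimators unspecified; as a rough sketch, each voxel's intensity could be taken as the mean of the samples the camera array contributes to it, with the confidence shrinking as those samples disagree. Everything below is an assumption for illustration, not the paper's estimator.

    import numpy as np

    def fuse_voxel(samples):
        """Illustrative per-voxel fusion for the volumetric representation
        described above: intensity is the mean of the observations, and
        confidence decays toward zero as the observations disagree.

        samples: 1D array of intensity observations for one voxel.
        """
        samples = np.asarray(samples, dtype=float)
        return samples.mean(), 1.0 / (1.0 + samples.var())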