Multi-frame scene-flow estimation using a patch model and smooth motion prior
This paper addresses the problem of estimating the dense 3D motion of a scene over several frames using a set of calibrated cameras. Most current 3D motion estimation techniques are limited to estimating the motion over a single frame, unless a strong prior model of the scene (such as a skeleton) is introduced. Estimating the 3D motion of a general scene is difficult due to untextured surfaces, complex movements and occlusions. In this paper, we show that it is possible to track the surfaces of a scene over several frames by introducing an effective prior on the scene motion. Experimental results show that the proposed method estimates the dense scene-flow over multiple frames, without the need for multiple-view reconstructions at every frame. Furthermore, the accuracy of the proposed method is demonstrated by comparing the estimated motion against a ground truth.
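The smooth motion prior can be illustrated with a toy regularizer: a constant-velocity penalty on tracked 3D point trajectories. The function below is an illustrative sketch, not the paper's actual patch-model energy; the weight `lam` and the trajectory layout are assumptions.

```python
import numpy as np

def smooth_motion_energy(traj, lam=1.0):
    """Toy smooth-motion prior: penalize deviation from constant velocity
    along each tracked 3D point trajectory.
    traj: (T, N, 3) positions of N surface points over T frames."""
    # Second temporal difference x_{t+1} - 2 x_t + x_{t-1} (discrete acceleration).
    accel = traj[2:] - 2.0 * traj[1:-1] + traj[:-2]
    return lam * float(np.sum(accel ** 2))

# A point moving with constant velocity incurs zero prior energy.
t = np.linspace(0.0, 1.0, 5).reshape(5, 1, 1)
straight = np.concatenate([t, 2.0 * t, 3.0 * t], axis=2)  # (5 frames, 1 point, xyz)
print(smooth_motion_energy(straight))  # → 0.0
```

Minimizing such a term jointly with a data term is what lets the motion of untextured or briefly occluded surfaces be propagated across frames.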
Multi-View Stereo with Single-View Semantic Mesh Refinement
While 3D reconstruction is a well-established and widely explored research
topic, semantic 3D reconstruction has only recently witnessed an increasing
share of attention from the Computer Vision community. Semantic annotations
in fact allow enforcing strong class-dependent priors, such as planarity for
ground and walls, which can be exploited to refine the reconstruction, often
resulting in non-trivial performance improvements. State-of-the-art methods
propose volumetric approaches to fuse RGB image data with semantic labels;
although successful, they do not scale well and fail to output high-resolution meshes.
In this paper we propose a novel method to refine both the geometry and the
semantic labeling of a given mesh. We refine the mesh geometry by applying a
variational method that optimizes a composite energy made of a state-of-the-art
pairwise photo-metric term and a single-view term that models the semantic
consistency between the labels of the 3D mesh and those of the segmented
images. We also update the semantic labeling through a novel Markov Random
Field (MRF) formulation that, together with the classical data and smoothness
terms, takes into account class-specific priors estimated directly from the
annotated mesh. This is in contrast to state-of-the-art methods that are
typically based on handcrafted or learned priors. We are the first, jointly
with the very recent and seminal work of [M. Blaha et al arXiv:1706.08336,
2017], to propose the use of semantics inside a mesh refinement framework.
Differently from [M. Blaha et al arXiv:1706.08336, 2017], which adopts a more
classical pairwise comparison to estimate the flow of the mesh, we apply a
single-view comparison between the semantically annotated image and the current
3D mesh labels; this improves robustness in the case of noisy segmentations. Comment: 3D Reconstruction Meets Semantics, ICCV workshop
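The single-view semantic term can be sketched as a per-pixel disagreement measure between the labels rendered from the current mesh and the segmented image. This is a minimal discrete stand-in for the paper's variational term; the function name, the uniform penalty, and the mask handling are assumptions for illustration.

```python
import numpy as np

def semantic_consistency_energy(rendered_labels, seg_labels, valid=None):
    """Fraction of pixels where the class labels rendered from the 3D mesh
    disagree with the single-view image segmentation.
    rendered_labels, seg_labels: (H, W) integer class maps.
    valid: optional boolean mask of pixels actually covered by the mesh."""
    if valid is None:
        valid = np.ones(rendered_labels.shape, dtype=bool)
    disagree = (rendered_labels != seg_labels) & valid
    return disagree.sum() / max(int(valid.sum()), 1)

mesh_view = np.array([[0, 0], [1, 1]])   # labels rendered from the mesh
seg_image = np.array([[0, 1], [1, 1]])   # labels from the segmented image
print(semantic_consistency_energy(mesh_view, seg_image))  # → 0.25
```

In the composite energy this term would be weighted against the pairwise photometric term, so gradients that reduce label disagreement also drive the mesh deformation.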
Semantically Informed Multiview Surface Refinement
We present a method to jointly refine the geometry and semantic segmentation
of 3D surface meshes. Our method alternates between updating the shape and the
semantic labels. In the geometry refinement step, the mesh is deformed with
variational energy minimization, such that it simultaneously maximizes
photo-consistency and the compatibility of the semantic segmentations across a
set of calibrated images. Label-specific shape priors account for interactions
between the geometry and the semantic labels in 3D. In the semantic
segmentation step, the labels on the mesh are updated with MRF inference, such
that they are compatible with the semantic segmentations in the input images.
Also, this step includes prior assumptions about the surface shape of different
semantic classes. The priors induce a tight coupling, where semantic
information influences the shape update and vice versa. Specifically, we
introduce priors that favor (i) adaptive smoothing, depending on the class
label; (ii) straightness of class boundaries; and (iii) semantic labels that
are consistent with the surface orientation. The novel mesh-based
reconstruction is evaluated in a series of experiments with real and synthetic
data. We compare both to state-of-the-art, voxel-based semantic 3D
reconstruction, and to purely geometric mesh refinement, and demonstrate that
the proposed scheme yields improved 3D geometry as well as an improved semantic
segmentation.
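One of the label-specific shape priors, adaptive smoothing depending on the class label, can be sketched as a Laplacian step whose strength is a per-class weight. The class ids, weights, and function below are made-up placeholders, not values from the paper.

```python
import numpy as np

# Hypothetical per-class smoothing strengths: planar classes (ground, wall)
# are smoothed strongly, vegetation is left almost untouched.
CLASS_SMOOTH = {0: 1.0, 1: 1.0, 2: 0.05}  # 0=ground, 1=wall, 2=vegetation

def adaptive_smoothing_step(verts, neighbors, labels, step=0.5):
    """One explicit Laplacian smoothing step whose strength depends on the
    semantic label of each vertex.
    verts: (V, 3) positions; neighbors: list of neighbor-index lists;
    labels: (V,) semantic class id per vertex."""
    out = verts.copy()
    for v, nbrs in enumerate(neighbors):
        if not nbrs:
            continue
        centroid = verts[nbrs].mean(axis=0)          # average of 1-ring neighbors
        w = CLASS_SMOOTH.get(int(labels[v]), 0.5)    # class-dependent strength
        out[v] = verts[v] + step * w * (centroid - verts[v])
    return out
```

A ground-labeled vertex is pulled firmly toward its neighbors' centroid, while a vegetation-labeled vertex barely moves; this is the kind of coupling through which the semantic labels influence the shape update.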
MVPNet: Multi-View Point Regression Networks for 3D Object Reconstruction from A Single Image
In this paper, we address the problem of reconstructing an object's surface
from a single image using generative networks. First, we represent a 3D surface
with an aggregation of dense point clouds from multiple views. Each point cloud
is embedded in a regular 2D grid aligned on an image plane of a viewpoint,
making the point cloud convolution-favored and ordered so as to fit into deep
network architectures. The point clouds can be easily triangulated by
exploiting connectivities of the 2D grids to form mesh-based surfaces. Second,
we propose an encoder-decoder network that generates such kind of multiple
view-dependent point clouds from a single image by regressing their 3D
coordinates and visibilities. We also introduce a novel geometric loss that is
able to interpret discrepancy over 3D surfaces as opposed to 2D projective
planes, resorting to the surface discretization on the constructed meshes. We
demonstrate that the multi-view point regression network outperforms
state-of-the-art methods with a significant improvement on challenging
datasets. Comment: 8 pages; accepted by AAAI 2019
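The claim that the regular 2D-grid embedding makes triangulation easy can be illustrated with a small helper that emits the two triangles implied by each grid cell. This is a generic construction from grid connectivity, not code from the paper.

```python
import numpy as np

def grid_faces(h, w):
    """Triangle faces implied by an h x w grid of points: each grid cell
    splits into two triangles over its four corner indices."""
    idx = np.arange(h * w).reshape(h, w)
    a = idx[:-1, :-1].ravel()  # top-left corner of each cell
    b = idx[:-1, 1:].ravel()   # top-right
    c = idx[1:, :-1].ravel()   # bottom-left
    d = idx[1:, 1:].ravel()    # bottom-right
    return np.concatenate([np.stack([a, b, c], 1), np.stack([b, d, c], 1)])

print(grid_faces(2, 2).tolist())  # → [[0, 1, 2], [1, 3, 2]]
```

Pairing these faces with the regressed per-pixel 3D coordinates of one view yields a view-dependent surface patch; the multi-view patches are then aggregated into the full object surface.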
OBLIQUE MULTI-CAMERA SYSTEMS - ORIENTATION AND DENSE MATCHING ISSUES
3D Optical Metrology (3DOM) unit, Bruno Kessler Foundation (FBK), Trento, Italy, http://3dom.fbk.eu, Commission III-WG4. The use of oblique imagery has become a standard for many civil and mapping applications, thanks to the development of airborne digital multi-camera systems, as proposed by many companies (Blomoblique, IGI, Leica, Midas, Pictometry, Vexcel/Microsoft, VisionMap, etc.). The indisputable virtue of oblique photography lies in its simplicity of interpretation and understanding for inexperienced users, allowing the use of oblique images in very different applications, such as building detection and reconstruction, building structural damage classification, road land updating and administration services. The paper reports an overview of the current commercial oblique systems and presents a workflow for the automated orientation and dense matching of large image blocks. Perspectives, potentialities, pitfalls and suggestions for achieving satisfactory results are given. Tests performed on two datasets acquired with two multi-camera systems over urban areas are also reported. (Figure 1: Large urban area pictured with an oblique multi-camera system.) Once advanced image triangulation methods have retrieved the interior and exterior parameters of the cameras, dense point clouds can be derived for 3D city modelling, feature extraction and mapping purposes.
Structure and motion from scene registration
We propose a method for estimating the 3D structure and the dense 3D motion (scene flow) of a dynamic nonrigid 3D scene, using a camera array. The core idea is to use a dense multi-camera array to construct a novel, dense 3D volumetric representation of the 3D space where each voxel holds an estimated intensity value and a confidence measure of this value. The problem of 3D structure and 3D motion estimation of a scene is thus reduced to a nonrigid registration of two volumes, hence the term "Scene Registration". Registering two dense 3D scalar volumes does not require recovering the 3D structure of the scene as a preprocessing step, nor does it require explicit reasoning about occlusions. From this nonrigid registration we accurately extract the 3D scene flow and the 3D structure of the scene, and successfully recover the sharp discontinuities in both time and space. We demonstrate the advantages of our method on a number of challenging synthetic and real data sets.
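The per-voxel intensity-plus-confidence representation can be sketched as follows: each voxel stores the mean intensity sampled by the cameras that see it, and a confidence that here is a simple inverse-variance agreement score. The agreement score is an illustrative choice; the paper's actual confidence measure may differ.

```python
import numpy as np

def voxel_statistics(samples):
    """Per-voxel intensity and confidence from a dense camera array.
    samples: (C, X, Y, Z) intensity that each of C cameras observes per voxel.
    Intensity is the mean across cameras; confidence grows as the cameras
    agree (an inverse-variance proxy in (0, 1])."""
    intensity = samples.mean(axis=0)
    confidence = 1.0 / (1.0 + samples.var(axis=0))
    return intensity, confidence

# Two cameras, a tiny 1x1x2 volume: the first voxel is observed consistently,
# the second is not (e.g. occluded or mis-projected in one view).
samples = np.array([[[[0.8, 0.2]]],
                    [[[0.8, 0.9]]]])
intensity, confidence = voxel_statistics(samples)
print(confidence[0, 0, 0] > confidence[0, 0, 1])  # → True
```

Down-weighting low-confidence voxels is what would let the subsequent volume-to-volume registration tolerate occlusions without modeling them explicitly.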