
    DA-RNN: Semantic Mapping with Data Associated Recurrent Neural Networks

    3D scene understanding is important for robots to interact with the 3D world in a meaningful way. Most previous work on 3D scene understanding focuses on recognizing geometric or semantic properties of the scene independently. In this work, we introduce Data Associated Recurrent Neural Networks (DA-RNNs), a novel framework for joint 3D scene mapping and semantic labeling. DA-RNNs use a new recurrent neural network architecture for semantic labeling on RGB-D videos. The output of the network is integrated with mapping techniques such as KinectFusion in order to inject semantic information into the reconstructed 3D scene. Experiments conducted on a real-world dataset and a synthetic dataset with RGB-D videos demonstrate the ability of our method in semantic 3D scene mapping.
    Comment: Published in RSS 2017
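    As a rough illustration of the map-integration step described above, the sketch below fuses per-frame semantic class probabilities into a voxel grid with a running average. The grid layout, the update rule, and all names are illustrative assumptions, not the DA-RNN implementation.

```python
# Hedged sketch: injecting per-frame semantic probabilities into a
# KinectFusion-style voxel map. Everything here (grid size, running-average
# update, data association given as voxel indices) is an assumption for
# illustration, not the paper's method.
import numpy as np

NUM_CLASSES = 5
GRID = (64, 64, 64)

# Per-voxel accumulated class probabilities and observation counts.
class_probs = np.zeros(GRID + (NUM_CLASSES,), dtype=np.float32)
obs_count = np.zeros(GRID, dtype=np.int32)

def fuse_labels(voxel_ids, pixel_probs):
    """Running-average update of voxel class distributions.

    voxel_ids:   (N, 3) integer voxel coordinates hit by the current frame
                 (the data association step, here assumed to be given).
    pixel_probs: (N, NUM_CLASSES) softmax outputs of the labeling network.
    """
    for (i, j, k), p in zip(voxel_ids, pixel_probs):
        n = obs_count[i, j, k]
        class_probs[i, j, k] = (class_probs[i, j, k] * n + p) / (n + 1)
        obs_count[i, j, k] = n + 1

# Example frame: two pixels associated with voxels, with network probabilities.
vox = np.array([[10, 12, 3], [10, 12, 4]])
probs = np.array([[0.10, 0.70, 0.10, 0.05, 0.05],
                  [0.20, 0.60, 0.10, 0.05, 0.05]], dtype=np.float32)
fuse_labels(vox, probs)
print(class_probs[10, 12, 3].argmax())  # most likely class for that voxel
```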

    Multi-Scale 3D Scene Flow from Binocular Stereo Sequences

    Scene flow methods estimate the three-dimensional motion field for points in the world, using multi-camera video data. Such methods combine multi-view reconstruction with motion estimation. This paper describes an alternative formulation for dense scene flow estimation that provides reliable results using only two cameras by fusing stereo and optical flow estimation into a single coherent framework. Internally, the proposed algorithm generates probability distributions for optical flow and disparity. Taking into account the uncertainty in these intermediate stages allows for more reliable estimation of the 3D scene flow than previous methods. To handle the aperture problems inherent in the estimation of optical flow and disparity, a multi-scale method along with a novel region-based technique is used within a regularized solution. This combined approach both preserves discontinuities and prevents over-regularization, two problems commonly associated with basic multi-scale approaches. Experiments with synthetic and real test data demonstrate the strength of the proposed approach.
    Funding: National Science Foundation (CNS-0202067, IIS-0208876); Office of Naval Research (N00014-03-1-0108)
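    For intuition about how flow and disparity combine into a 3D motion field, the sketch below computes per-pixel scene flow by triangulating points at two time steps for a calibrated, rectified stereo rig. It is a deterministic simplification under assumed camera parameters; it omits the probability distributions, multi-scale processing, and region-based regularization of the method above.

```python
# Hedged sketch: scene flow as the difference of triangulated 3D points at
# times t and t+1, given disparity maps and optical flow. A simplification,
# not the paper's probabilistic formulation.
import numpy as np

def backproject(u, v, disparity, f, baseline, cx, cy):
    """Triangulate pixels into 3D camera coordinates (rectified stereo)."""
    Z = f * baseline / np.maximum(disparity, 1e-6)
    X = (u - cx) * Z / f
    Y = (v - cy) * Z / f
    return np.stack([X, Y, Z], axis=-1)

def scene_flow(disp_t, disp_t1, flow, f, baseline, cx, cy):
    """3D motion field from two disparity maps and the optical flow between them.

    disp_t, disp_t1: (H, W) disparity maps at times t and t+1
    flow:            (H, W, 2) optical flow (du, dv) from t to t+1
    """
    H, W = disp_t.shape
    v, u = np.mgrid[0:H, 0:W].astype(np.float32)
    p_t = backproject(u, v, disp_t, f, baseline, cx, cy)
    # Look up disparity at t+1 at the flowed pixel locations (nearest neighbour).
    u1 = np.clip(np.round(u + flow[..., 0]).astype(int), 0, W - 1)
    v1 = np.clip(np.round(v + flow[..., 1]).astype(int), 0, H - 1)
    p_t1 = backproject(u + flow[..., 0], v + flow[..., 1],
                       disp_t1[v1, u1], f, baseline, cx, cy)
    return p_t1 - p_t  # (H, W, 3) metric motion per pixel

# Static scene with zero flow: the estimated scene flow is zero everywhere.
H, W = 4, 4
sf = scene_flow(np.full((H, W), 10.0), np.full((H, W), 10.0),
                np.zeros((H, W, 2)), f=500.0, baseline=0.1, cx=W / 2, cy=H / 2)
print(np.abs(sf).max())  # 0.0
```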

    DeepVoxels: Learning Persistent 3D Feature Embeddings

    In this work, we address the lack of 3D understanding of generative neural networks by introducing a persistent 3D feature embedding for view synthesis. To this end, we propose DeepVoxels, a learned representation that encodes the view-dependent appearance of a 3D scene without having to explicitly model its geometry. At its core, our approach is based on a Cartesian 3D grid of persistent embedded features that learn to make use of the underlying 3D scene structure. Our approach combines insights from 3D geometric computer vision with recent advances in learning image-to-image mappings based on adversarial loss functions. DeepVoxels is supervised, without requiring a 3D reconstruction of the scene, using a 2D re-rendering loss, and enforces perspective and multi-view geometry in a principled manner. We apply our persistent 3D scene representation to the problem of novel view synthesis, demonstrating high-quality results for a variety of challenging scenes.
    Comment: Video: https://www.youtube.com/watch?v=HM_WsZhoGXw | Supplemental material: https://drive.google.com/file/d/1BnZRyNcVUty6-LxAstN83H79ktUq8Cjp/view?usp=sharing | Code: https://github.com/vsitzmann/deepvoxels | Project page: https://vsitzmann.github.io/deepvoxels
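    The sketch below illustrates, in a much-simplified form, how a persistent Cartesian grid of features can be rendered into a 2D feature map by sampling along camera rays. Nearest-neighbour sampling, mean aggregation over depth, and all names are assumptions for illustration; they do not reproduce the learned lifting/projection and occlusion reasoning of DeepVoxels.

```python
# Hedged sketch: projecting a (D, H, W, C) feature volume into a target view by
# marching rays of a pinhole camera and averaging the sampled features.
# Illustrative only; not the DeepVoxels architecture.
import numpy as np

def project_features(volume, K, R, t, image_hw, n_samples=32,
                     vol_min=-1.0, vol_max=1.0):
    """Render an (h, w, C) feature image from a voxel grid of features.

    K: 3x3 intrinsics; R, t: world-to-camera rotation and translation.
    """
    D, H, W, C = volume.shape
    h, w = image_hw
    v, u = np.mgrid[0:h, 0:w].astype(np.float32)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)      # homogeneous pixels
    dirs_cam = pix @ np.linalg.inv(K).T                    # ray directions (camera)
    dirs_world = dirs_cam @ R                              # rotate into world frame
    origin = -R.T @ t                                      # camera centre in world
    depths = np.linspace(0.5, 3.0, n_samples, dtype=np.float32)
    feat = np.zeros((h, w, C), dtype=np.float32)
    for d in depths:
        pts = origin + d * dirs_world                      # (h, w, 3) world points
        # Map world coordinates to voxel indices (nearest neighbour).
        idx = (pts - vol_min) / (vol_max - vol_min) * np.array([W, H, D])
        idx = np.clip(np.round(idx).astype(int), 0, [W - 1, H - 1, D - 1])
        feat += volume[idx[..., 2], idx[..., 1], idx[..., 0]]
    return feat / n_samples

# Example: random 32^3 volume with 8 feature channels, identity camera pose.
vol = np.random.rand(32, 32, 32, 8).astype(np.float32)
K = np.array([[50.0, 0.0, 32.0], [0.0, 50.0, 24.0], [0.0, 0.0, 1.0]])
img = project_features(vol, K, np.eye(3), np.zeros(3), image_hw=(48, 64))
print(img.shape)  # (48, 64, 8)
```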

    The Modelling of Stereoscopic 3D Scene Acquisition

    The main goal of this work is to find a suitable method for calculating the best setting of a stereo pair of cameras viewing a scene so as to enable spatial imaging. The method is based on a geometric model of the stereo camera pairs currently used for the acquisition of 3D scenes. Based on selectable camera parameters and object positions in the scene, the resulting model allows the parameters of the stereo image pair that influence the quality of spatial imaging to be calculated. To present the properties of the model on a simple 3D scene, an interactive application was created that, in addition to setting the camera and scene parameters and displaying the calculated parameters, also displays the modelled scene using perspective views and the stereo pair modelled with the aid of anaglyph images. The resulting modelling method can be used in practice to determine appropriate parameters of the camera configuration based on the known arrangement of the objects in the scene. Analogously, for a given camera configuration, it can determine appropriate geometric limits for arranging the objects in the scene being displayed. This method ensures that the resulting stereoscopic recording will be of good quality and comfortable for the observer.
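    The sketch below shows the kind of geometric calculation such a model performs: the sensor disparity of an object seen by a parallel (shifted-sensor) stereo rig with a chosen convergence distance, scaled to on-screen parallax for a given display width. The pinhole formulas, function names, and example values are illustrative assumptions; the paper's model covers a richer set of camera, scene, and viewing parameters.

```python
# Hedged sketch: sensor disparity and on-screen parallax for a parallel stereo
# rig with sensor shift chosen so that objects at the convergence distance have
# zero disparity. Illustrative pinhole-model formulas, not the paper's model.

def image_disparity_mm(object_dist_m, interaxial_mm, focal_mm, convergence_dist_m):
    """Sensor disparity (mm); positive values appear behind the screen plane."""
    d_object = focal_mm * interaxial_mm / (object_dist_m * 1000.0)
    d_convergence = focal_mm * interaxial_mm / (convergence_dist_m * 1000.0)
    return d_convergence - d_object

def screen_parallax_mm(disp_mm, sensor_width_mm, screen_width_mm):
    """Scale sensor disparity to on-screen parallax for the target display."""
    return disp_mm * screen_width_mm / sensor_width_mm

# Example: 65 mm interaxial, 35 mm lens, 36 mm sensor, 2 m wide screen,
# convergence at 3 m, object at 5 m.
d = image_disparity_mm(5.0, 65.0, 35.0, 3.0)
print(round(screen_parallax_mm(d, 36.0, 2000.0), 1))  # ~16.9 mm positive parallax
```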

    Precision Enhancement of 3D Surfaces from Multiple Compressed Depth Maps

    In the texture-plus-depth representation of a 3D scene, depth maps from different camera viewpoints are typically lossily compressed via the classical transform coding / coefficient quantization paradigm. In this paper we propose to reduce the distortion of the decoded depth maps due to quantization. The key observation is that depth maps from different viewpoints constitute multiple descriptions (MD) of the same 3D scene. Considering the MD jointly, we perform a POCS-like iterative procedure to project a reconstructed signal from one depth map to the other and back, so that the converged depth maps have higher precision than the original quantized versions.
    Comment: This work was accepted as an ongoing-work paper at IEEE MMSP'201
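    To make the POCS-style idea concrete, the toy sketch below alternately projects a reconstruction onto the quantization intervals of two already-aligned quantized descriptions of the same depth signal; the intersection of the two intervals pins the value down more tightly than either description alone. The cross-view warping and transform-domain quantization of the actual method are abstracted away, and all names are illustrative.

```python
# Hedged sketch: projections onto convex sets (POCS) for two quantized
# descriptions of the same signal. Toy 1D version under the assumption that the
# two descriptions are already aligned; not the paper's transform-domain method.
import numpy as np

def project_to_bin(x, quantized, step):
    """Project x onto the quantization interval implied by its quantized value."""
    return np.clip(x, quantized - step / 2.0, quantized + step / 2.0)

def pocs_refine(q1, step1, q2, step2, n_iter=50):
    """Alternate projections onto the two quantization constraint sets."""
    x = 0.5 * (q1 + q2)                     # start from the averaged descriptions
    for _ in range(n_iter):
        x = project_to_bin(x, q1, step1)
        x = project_to_bin(x, q2, step2)
    return x

# Toy example: a smooth depth profile quantized with two different step sizes.
truth = np.linspace(1.0, 2.0, 16)
s1, s2 = 0.20, 0.15
q1 = np.round(truth / s1) * s1
q2 = np.round(truth / s2) * s2
refined = pocs_refine(q1, s1, q2, s2)
print(np.abs(q1 - truth).mean(), np.abs(refined - truth).mean())
```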