Capture, Reconstruction, and Representation of the Visual Real World for Virtual Reality
We provide an overview of the concerns, current practice, and limitations for capturing, reconstructing, and representing the real world visually within virtual reality. Given that our goals are to capture, transmit, and depict complex real-world phenomena to humans, these challenges cover the opto-electro-mechanical, computational, informational, and perceptual fields. Practically producing a system for real-world VR capture requires navigating a complex design space and pushing the state of the art in each of these areas. As such, we outline several promising directions for future work to improve the quality and flexibility of real-world VR capture systems.
OmniPhotos: Casual 360° VR Photography
Virtual reality headsets are becoming increasingly popular, yet it remains difficult for casual users to capture immersive 360° VR panoramas. State-of-the-art approaches require capture times of usually far more than a minute and are often limited in their supported range of head motion. We introduce OmniPhotos, a novel approach for quickly and casually capturing high-quality 360° panoramas with motion parallax. Our approach requires a single sweep with a consumer 360° video camera as input, which takes less than 3 seconds to capture with a rotating selfie stick or 10 seconds handheld. This is the fastest capture time for any VR photography approach supporting motion parallax by an order of magnitude. We improve the visual rendering quality of our OmniPhotos by alleviating vertical distortion using a novel deformable proxy geometry, which we fit to a sparse 3D reconstruction of captured scenes. In addition, the 360° input views significantly expand the available viewing area, and thus the range of motion, compared to previous approaches. We have captured more than 50 OmniPhotos and show video results for a large variety of scenes. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 66599
CAD-Estate: Large-scale CAD Model Annotation in RGB Videos
We propose a method for annotating videos of complex multi-object scenes with
a globally-consistent 3D representation of the objects. We annotate each object
with a CAD model from a database, and place it in the 3D coordinate frame of
the scene with a 9-DoF pose transformation. Our method is semi-automatic and
works on commonly-available RGB videos, without requiring a depth sensor. Many
steps are performed automatically, and the tasks performed by humans are
simple, well-specified, and require only limited reasoning in 3D. This makes
them feasible for crowd-sourcing and has allowed us to construct a large-scale
dataset by annotating real-estate videos from YouTube. Our dataset CAD-Estate
offers 101k instances of 12k unique CAD models placed in the 3D representations
of 20k videos. In comparison to Scan2CAD, the largest existing dataset with CAD
model annotations on real scenes, CAD-Estate has 7x more instances and 4x more
unique CAD models. We showcase the benefits of pre-training a Mask2CAD model on
CAD-Estate for the task of automatic 3D object reconstruction and pose
estimation, demonstrating that it leads to performance improvements on the
popular Scan2CAD benchmark. The dataset is available at
https://github.com/google-research/cad-estate.