Semantic modeling of indoor scenes with support inference from a single photograph
We present an automatic approach for the semantic modeling of indoor scenes from a single photograph, without relying on depth sensors. Instead of using handcrafted features, we guide indoor scene modeling with feature maps extracted by fully convolutional networks. Three parallel fully convolutional networks generate object instance masks, a depth map, and an edge map of the room layout. Based on these high-level features, support relationships between indoor objects can be efficiently inferred in a data-driven manner. Constrained by the support context, a global-to-local model matching strategy retrieves the whole indoor scene. We demonstrate that the proposed method can efficiently retrieve indoor objects, even in cases where objects are heavily occluded. This approach enables efficient semantics-based scene editing.
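The paper infers support relationships (e.g. "the lamp rests on the table") in a learned, data-driven way from network features. As a purely illustrative toy, a support relation between 2D object boxes can be sketched with a simple geometric heuristic; note this heuristic is an assumption for illustration and is not the paper's learned inference:

```python
# Illustrative toy: infer "A rests on B" from 2D bounding boxes.
# This geometric heuristic is NOT the paper's data-driven method;
# it only shows what a support relation looks like.

def infer_support(objects, tol=5):
    """objects: {name: (x_min, y_min, x_max, y_max)} in image coords,
    with y growing downward. Returns {name: supporter or 'floor'}."""
    support = {}
    for name, (x0, y0, x1, y1) in objects.items():
        best = "floor"
        for other, (ox0, oy0, ox1, oy1) in objects.items():
            if other == name:
                continue
            overlap = min(x1, ox1) - max(x0, ox0)  # horizontal overlap
            touching = abs(oy0 - y1) <= tol        # bottom meets other's top
            if overlap > 0 and touching:
                best = other
        support[name] = best
    return support

# Hypothetical scene: a lamp whose bottom edge sits on the table top.
scene = {
    "table": (100, 300, 400, 500),
    "lamp":  (150, 200, 200, 302),
    "chair": (450, 350, 550, 500),
}
print(infer_support(scene))  # {'table': 'floor', 'lamp': 'table', 'chair': 'floor'}
```

The paper's contribution is precisely that such relations are inferred from learned feature maps rather than hand-coded rules like the one above.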
Data-Driven Shape Analysis and Processing
Data-driven methods play an increasingly important role in discovering
geometric, structural, and semantic relationships between 3D shapes in
collections, and applying this analysis to support intelligent modeling,
editing, and visualization of geometric data. In contrast to traditional
approaches, a key feature of data-driven approaches is that they aggregate
information from a collection of shapes to improve the analysis and processing
of individual shapes. In addition, they are able to learn models that reason
about properties and relationships of shapes without relying on hard-coded
rules or explicitly programmed instructions. We provide an overview of the main
concepts and components of these techniques, and discuss their application to
shape classification, segmentation, matching, reconstruction, modeling and
exploration, as well as scene analysis and synthesis, through reviewing the
literature and relating the existing works with both qualitative and numerical
comparisons. We conclude our report with ideas that can inspire future research
in data-driven shape analysis and processing.
Comment: 10 pages, 19 figures
Neural Scene Decoration from a Single Photograph
Furnishing and rendering indoor scenes has been a long-standing task for
interior design, where artists create a conceptual design for the space, build
a 3D model of the space, decorate, and then perform rendering. Although the
task is important, it is tedious and requires tremendous effort. In this paper,
we introduce a new problem of domain-specific indoor scene image synthesis,
namely neural scene decoration. Given a photograph of an empty indoor space and
a list of decorations with a layout specified by the user, we aim to synthesize a
new image of the same space with desired furnishing and decorations. Neural
scene decoration can be applied to create conceptual interior designs in a
simple yet effective manner. We address this problem with a novel
scene generation architecture that transforms an empty scene and an object
layout into a realistic furnished scene photograph. We demonstrate the
performance of our proposed method by comparing it with conditional image
synthesis baselines built upon prevailing image translation approaches both
qualitatively and quantitatively. We conduct extensive experiments to further
validate the plausibility and aesthetics of our generated scenes. Our
implementation is available at
\url{https://github.com/hkust-vgd/neural_scene_decoration}.
Comment: ECCV 2022 paper. 14 pages of main content, 4 pages of references, and 11 pages of appendix
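The method conditions generation on an empty-room photograph plus a user-given decoration layout. One common way to feed such a layout to a generator (an illustrative assumption here, not necessarily the paper's exact encoding) is to rasterise per-class boxes into a one-hot layout map:

```python
import numpy as np

# Hypothetical sketch of the conditioning input: rasterise a user-given
# decoration layout (class id + box per object) into a one-hot layout map
# that a generator could consume alongside the empty-room photograph.
# The class list and map format are illustrative assumptions.

def layout_to_map(boxes, num_classes, height, width):
    """boxes: list of (class_id, x0, y0, x1, y1) in pixel coords.
    Returns a (num_classes, H, W) one-hot map; later boxes overwrite
    earlier ones where they overlap."""
    m = np.zeros((num_classes, height, width), dtype=np.float32)
    for cls, x0, y0, x1, y1 in boxes:
        m[:, y0:y1, x0:x1] = 0.0    # clear any earlier decoration here
        m[cls, y0:y1, x0:x1] = 1.0  # mark this decoration's class
    return m

layout = [(0, 10, 40, 60, 64),  # e.g. class 0 = sofa (assumed label)
          (1, 20, 10, 30, 25)]  # e.g. class 1 = painting (assumed label)
cond = layout_to_map(layout, num_classes=3, height=64, width=64)
print(cond.shape)  # (3, 64, 64)
```

A map like this can be concatenated channel-wise with the empty-scene image before being passed to a conditional image-synthesis network.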
Matterport3D: Learning from RGB-D Data in Indoor Environments
Access to large, diverse RGB-D datasets is critical for training RGB-D scene
understanding algorithms. However, existing datasets still cover only a limited
number of views or a restricted scale of spaces. In this paper, we introduce
Matterport3D, a large-scale RGB-D dataset containing 10,800 panoramic views
from 194,400 RGB-D images of 90 building-scale scenes. Annotations are provided
with surface reconstructions, camera poses, and 2D and 3D semantic
segmentations. The precise global alignment and comprehensive, diverse
panoramic set of views over entire buildings enable a variety of supervised and
self-supervised computer vision tasks, including keypoint matching, view
overlap prediction, normal prediction from color, semantic segmentation, and
region classification.