Example-based Image Recoloring in Indoor Environment
The color structure of a home scene image is closely related to the material properties of its local regions. Existing color migration methods typically fail to infer the correlation between the colors of local regions in a home scene, which leads to local blurring. In this paper, we propose a color migration framework for home scene images. It picks the coloring from a template image and transfers it to a home scene image through a simple interaction. Our framework comprises three main parts. First, we carry out an interactive segmentation to divide an image into local regions and extract their corresponding colors. Second, we generate a matching color table by sampling the template image according to the color structure of the original home scene image. Finally, we transfer colors from the matching color table to the target home scene image while maintaining the boundary transitions. Experimental results show that our method can effectively recolor a scene to match the color composition of a given natural or interior scene.
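The final step of the framework above, transferring colors from a matching color table to segmented regions, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `recolor_regions`, the integer label map, and the mean-shift rule are all assumptions, and the boundary-preserving blending mentioned in the abstract is left out.

```python
import numpy as np

def recolor_regions(image, labels, color_table):
    """Shift each labelled region so that its mean color matches the table entry.

    image       : (H, W, 3) float array with values in [0, 1]
    labels      : (H, W) integer region ids (hypothetical segmentation output)
    color_table : dict mapping region id -> target RGB triple in [0, 1]
    """
    out = image.astype(np.float64).copy()
    for region_id, target in color_table.items():
        mask = labels == region_id
        if not mask.any():
            continue  # region id not present in this image
        region_mean = out[mask].mean(axis=0)
        # Adding a constant offset keeps the texture inside the region intact.
        out[mask] += np.asarray(target, dtype=np.float64) - region_mean
    return np.clip(out, 0.0, 1.0)
```

Shifting by a per-region offset keeps local shading while changing the overall hue; a practical system would additionally feather the masks near region borders.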
Deep Depth Completion of a Single RGB-D Image
The goal of our work is to complete the depth channel of an RGB-D image.
Commodity-grade depth cameras often fail to sense depth for shiny, bright,
transparent, and distant surfaces. To address this problem, we train a deep
network that takes an RGB image as input and predicts dense surface normals and
occlusion boundaries. Those predictions are then combined with raw depth
observations provided by the RGB-D camera to solve for depths for all pixels,
including those missing in the original observation. This method was chosen
over others (e.g., inpainting depths directly) as the result of extensive
experiments with a new depth completion benchmark dataset, where holes are
filled in training data through the rendering of surface reconstructions
created from multiview RGB-D scans. Experiments with different network inputs,
depth representations, loss functions, optimization methods, inpainting
methods, and deep depth estimation networks show that our proposed approach
provides better depth completions than these alternatives.
Comment: Accepted by CVPR 2018 (Spotlight). Project webpage:
http://deepcompletion.cs.princeton.edu/ This version includes supplementary
materials which provide more implementation details, quantitative evaluation,
and qualitative results. Due to the file size limit, please check the project
website for the high-res paper.
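The idea of combining network predictions with raw depth observations to solve for all pixels can be illustrated with a small least-squares sketch. It is not the paper's solver: the function name `complete_depth`, the `data_weight` parameter, and the use of predicted neighbour depth differences (as a stand-in for constraints derived from surface normals) are assumptions made for illustration.

```python
import numpy as np

def complete_depth(raw, observed, grad_x, grad_y, boundary, data_weight=10.0):
    """Solve for a dense depth map by weighted linear least squares.

    raw      : (H, W) raw depth, only meaningful where `observed` is True
    observed : (H, W) bool mask of pixels that have a sensor reading
    grad_x/y : (H, W) predicted depth difference to the right / lower neighbour
               (assumed stand-in for constraints derived from surface normals)
    boundary : (H, W) occlusion-boundary probability in [0, 1]
    """
    h, w = raw.shape
    index = np.arange(h * w).reshape(h, w)
    rows, rhs = [], []

    def add_equation(terms, value, weight):
        row = np.zeros(h * w)
        for pixel, coeff in terms:
            row[pixel] = coeff * weight
        rows.append(row)
        rhs.append(value * weight)

    for y in range(h):
        for x in range(w):
            if observed[y, x]:
                # Data term: stay close to the sensor reading.
                add_equation([(index[y, x], 1.0)], raw[y, x], data_weight)
            if x + 1 < w:
                # Smoothness term, down-weighted across likely occlusion boundaries.
                weight = 1.0 - max(boundary[y, x], boundary[y, x + 1])
                add_equation([(index[y, x + 1], 1.0), (index[y, x], -1.0)],
                             grad_x[y, x], weight)
            if y + 1 < h:
                weight = 1.0 - max(boundary[y, x], boundary[y + 1, x])
                add_equation([(index[y + 1, x], 1.0), (index[y, x], -1.0)],
                             grad_y[y, x], weight)

    solution, *_ = np.linalg.lstsq(np.vstack(rows), np.array(rhs), rcond=None)
    return solution.reshape(h, w)
```

The dense matrix keeps the sketch short but only suits tiny images; a realistic solver would use a sparse system.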
C2Ideas: Supporting Creative Interior Color Design Ideation with Large Language Model
Interior color design is a creative process that endeavors to allocate colors
to furniture and other elements within an interior space. While much research
focuses on generating realistic interior designs, these automated approaches
often misalign with user intention and disregard design rationales. Informed by
a need-finding preliminary study, we develop C2Ideas, an innovative system for
designers to creatively ideate color schemes enabled by an intent-aligned and
domain-oriented large language model. C2Ideas integrates a three-stage process:
the Idea Prompting stage distills user intentions into color linguistic prompts;
the Word-Color Association stage transforms the prompts into semantically and
stylistically coherent color schemes; and the Interior Coloring stage assigns
colors to interior elements in compliance with design principles. We also develop
an interactive interface that enables flexible user refinement and
interpretable reasoning. C2Ideas has been evaluated in a series of indoor cases
and user studies, which demonstrate its effectiveness and designers' high regard
for its interactive functionality.
Comment: 26 pages, 11 figures
LEGO-Net: Learning Regular Rearrangements of Objects in Rooms
Humans universally dislike the task of cleaning up a messy room. If machines
were to help us with this task, they must understand human criteria for regular
arrangements, such as several types of symmetry, co-linearity or
co-circularity, spacing uniformity in linear or circular patterns, and further
inter-object relationships that relate to style and functionality. Previous
approaches for this task relied on human input to explicitly specify goal
state, or synthesized scenes from scratch -- but such methods do not address
the rearrangement of existing messy scenes without providing a goal state. In
this paper, we present LEGO-Net, a data-driven transformer-based iterative
method for learning regular rearrangement of objects in messy rooms. LEGO-Net
is partly inspired by diffusion models -- it starts with an initial messy state
and iteratively "de-noises" the position and orientation of objects to a
regular state while reducing the distance traveled. Given randomly perturbed
object positions and orientations in an existing dataset of
professionally-arranged scenes, our method is trained to recover a regular
re-arrangement. Results demonstrate that our method is able to reliably
rearrange room scenes and outperform other methods. We additionally propose a
metric for evaluating regularity in room arrangements using number-theoretic
machinery.
Comment: Project page: https://ivl.cs.brown.edu/projects/lego-ne
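The iterative "de-noising" loop described in this abstract can be sketched as below. The learned transformer is replaced here by a hand-written stand-in, `snap_to_grid`, so this shows only the shape of the iteration; `iterative_rearrange`, `steps`, and `step_size` are hypothetical names and not taken from the paper.

```python
import numpy as np

def snap_to_grid(positions, spacing=1.0):
    """Stand-in 'denoiser': proposes the nearest regular grid position."""
    return np.round(positions / spacing) * spacing

def iterative_rearrange(positions, denoiser, steps=50, step_size=0.2):
    """Move objects a small step towards the denoiser's proposal each iteration."""
    x = np.asarray(positions, dtype=np.float64).copy()
    for _ in range(steps):
        x += step_size * (denoiser(x) - x)
    return x

messy = np.array([[0.1, 0.9], [2.2, -0.1]])
tidy = iterative_rearrange(messy, snap_to_grid)
```

Taking many small steps instead of one jump mirrors the gradual refinement of diffusion-style models; in the paper the proposal comes from a trained network and also covers object orientation.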
Advances in Data-Driven Analysis and Synthesis of 3D Indoor Scenes
This report surveys advances in deep learning-based modeling techniques that
address four different 3D indoor scene analysis tasks, as well as synthesis of
3D indoor scenes. We describe different kinds of representations for indoor
scenes, various indoor scene datasets available for research in the
aforementioned areas, and discuss notable works employing machine learning
models for such scene modeling tasks based on these representations.
Specifically, we focus on the analysis and synthesis of 3D indoor scenes. With
respect to analysis, we focus on four basic scene understanding tasks -- 3D
object detection, 3D scene segmentation, 3D scene reconstruction and 3D scene
similarity. And for synthesis, we mainly discuss neural scene synthesis works,
though also highlighting model-driven methods that allow for human-centric,
progressive scene synthesis. We identify the challenges involved in modeling
scenes for these tasks and the kind of machinery that needs to be developed to
adapt to the data representation, and the task setting in general. For each of
these tasks, we provide a comprehensive summary of the state-of-the-art works
across different axes such as the choice of data representation, backbone,
evaluation metric, input, output, etc., providing an organized review of the
literature. Towards the end, we discuss some interesting research directions
that have the potential to make a direct impact on the way users interact and
engage with these virtual scene models, making them an integral part of the
metaverse.
Comment: Published in Computer Graphics Forum, Aug 202
Multi feature-rich synthetic colour to improve human visual perception of point clouds
Although point features have proven useful for classification with Machine Learning, point cloud visualization enhancement methods focus mainly on lighting. Visualizing point features helps to improve the perception of the 3D environment. This paper proposes Multi Feature-Rich Synthetic Colour (MFRSC) as a non-photorealistic colour alternative to natural-coloured point clouds. The method is based on the selection of nine features (reflectance, return number, inclination, depth, height, point density, linearity, planarity, and scattering) associated with five human perception descriptors (edges, texture, shape, size, depth, orientation). The features are reduced to fit the RGB display channels. All feature permutations are analysed according to their colour distance from the natural-coloured point cloud and according to Image Quality Assessment. As a result, the selected feature permutations allow a clear visualization of the objects rendered in the scene, highlighting edges, planes, and volumetric objects. MFRSC effectively replaces natural colour and even yields a less distorted visualization according to BRISQUE, NIQE and PIQE. In addition, assigning features to the RGB channels enables the use of MFRSC in software that does not support colorization based on point attributes (most commercially available software). MFRSC can be combined with other non-photorealistic techniques such as Eye-Dome Lighting or Ambient Occlusion.
Funding: Xunta de Galicia | Ref. ED481B-2019-061; Xunta de Galicia | Ref. ED431F 2022/08; Agencia Estatal de Investigación | Ref. PID2019-105221RB-C43; Universidade de Vigo/CISU
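Assigning point features to the RGB display channels can be illustrated with the short sketch below. It assumes that three features are chosen at a time and min-max normalised, which the abstract does not state; `features_to_rgb` and the dictionary layout are hypothetical.

```python
import numpy as np

def features_to_rgb(features, channels):
    """Min-max normalise three named per-point features and stack them as RGB.

    features : dict mapping feature name -> sequence of per-point values
    channels : three feature names, assigned to R, G and B in that order
    """
    columns = []
    for name in channels:
        values = np.asarray(features[name], dtype=np.float64)
        span = values.max() - values.min()
        # A constant feature carries no contrast, so map it to zero.
        scaled = (values - values.min()) / span if span > 0 else np.zeros_like(values)
        columns.append(scaled)
    return np.stack(columns, axis=1)

features = {"height": [0, 5, 10], "planarity": [0.2, 0.2, 0.2], "depth": [1, 3, 2]}
rgb = features_to_rgb(features, ("height", "planarity", "depth"))
```

Because the result is an ordinary per-point RGB array, it can be written into the colour fields of a point cloud file. Candidate channel assignments could be enumerated with `itertools.permutations(features, 3)`.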
Indoor Mapping and Reconstruction with Mobile Augmented Reality Sensor Systems
Augmented Reality (AR) makes it possible to display virtual, three-dimensional content
directly within the real environment. Rather than showing arbitrary virtual
objects at an arbitrary location, however, AR technology can also be used
to display geodata in situ at the very location to which the data
refer. AR thus opens up the possibility of enriching the real world with virtual,
location-related information. In this thesis, this variety of AR is defined as
"Fused Reality" and discussed in depth.
The practical added value of the Fused Reality concept can be demonstrated
well by its application to digital building models,
where building-specific information, for example the
routing of pipes and cables inside the walls, can be displayed in its correct
position on the real object. To realise the outlined concept of an indoor Fused
Reality application, some fundamental conditions must be met.
A given building can only be augmented with location-related information
if a digital model of that building is available.
Although larger construction projects today are often planned and carried out
with the aid of Building Information Modelling (BIM), so that a digital model
is created together with the real building, digital models are usually
not available for older existing buildings. Creating a digital model of an
existing building manually is possible but involves great
effort. If a suitable building model is available, an AR
device must also be able to determine its own position and orientation in the
building relative to this model in order to display augmentations in their
correct position.
This thesis investigates and discusses various aspects of the problem outlined
above. First, different options for capturing indoor building geometry
with sensor systems are discussed. Next,
an investigation is presented of the extent to which modern AR devices, which
generally also carry a large number of sensors, are likewise suitable
for use as indoor mapping systems. The resulting indoor
mapping datasets can then be used to reconstruct building models
automatically. To this end, an automated,
voxel-based indoor reconstruction method is presented. It is also
evaluated quantitatively on the basis of four datasets captured for this purpose
together with corresponding reference data. Furthermore, different
options for localising mobile AR devices within a building and the associated
building model are discussed. In this context,
the evaluation of a marker-based indoor localisation method is also presented.
Finally, a new approach for aligning indoor mapping datasets with the
axes of the coordinate system is presented.
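Aligning an indoor mapping dataset with the coordinate axes can be illustrated with a generic sketch. The abstract does not describe the thesis's actual approach, so the following is a common stand-in and an assumption: it estimates the dominant wall direction from horizontal surface normals, treating directions 90 degrees apart as equivalent, and rotates the data by the negative of that angle. The function names are hypothetical.

```python
import numpy as np

def dominant_wall_angle(normals_xy):
    """Estimate the rotation of a rectilinear floor plan, in (-pi/4, pi/4].

    Multiplying the normal angles by four maps all four wall directions of a
    rectilinear layout onto the same point of the unit circle before averaging.
    """
    n = np.asarray(normals_xy, dtype=np.float64)
    theta = np.arctan2(n[:, 1], n[:, 0])
    return np.angle(np.exp(4j * theta).mean()) / 4.0

def align_to_axes(points_xy, normals_xy):
    """Rotate 2D points so that the dominant wall directions follow the axes."""
    angle = dominant_wall_angle(normals_xy)
    c, s = np.cos(-angle), np.sin(-angle)
    rotation = np.array([[c, -s], [s, c]])
    return np.asarray(points_xy, dtype=np.float64) @ rotation.T
```

The sketch covers only the rotation about the vertical axis and assumes the vertical direction is already known, for example from the device's gravity estimate.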