Exploiting 2D Floorplan for Building-scale Panorama RGBD Alignment
This paper presents a novel algorithm that utilizes a 2D floorplan to align
panorama RGBD scans. While effective panorama RGBD alignment techniques exist,
such a system requires extremely dense RGBD image sampling. Our approach can
significantly reduce the number of necessary scans with the aid of a floorplan
image. We formulate a novel Markov Random Field inference problem as a scan
placement over the floorplan, as opposed to the conventional scan-to-scan
alignment. The technical contributions lie in multi-modal image correspondence
cues (between scans and schematic floorplan) as well as a novel coverage
potential avoiding an inherent stacking bias. The proposed approach has been
evaluated on five challenging large indoor spaces. To the best of our
knowledge, we present the first effective system that utilizes a 2D floorplan
image for building-scale 3D pointcloud alignment. The source code and the data
will be shared with the community to further enhance indoor mapping research.
PolyDiffuse: Polygonal Shape Reconstruction via Guided Set Diffusion Models
This paper presents PolyDiffuse, a novel structured reconstruction algorithm
that transforms visual sensor data into polygonal shapes with Diffusion Models
(DM), an emerging machinery amid exploding generative AI, while formulating
reconstruction as a generation process conditioned on sensor data. The task of
structured reconstruction poses two fundamental challenges to DM: 1) A
structured geometry is a ``set'' (e.g., a set of polygons for a floorplan
geometry), where a sample of elements has different but equivalent
representations, making the denoising highly ambiguous; and 2) A
``reconstruction'' task has a single solution, where an initial noise needs to
be chosen carefully, while any initial noise works for a generation task. Our
technical contribution is the introduction of a Guided Set Diffusion Model
where 1) the forward diffusion process learns guidance networks to control
noise injection so that one representation of a sample remains distinct from
its other permutation variants, thus resolving denoising ambiguity; and 2) the
reverse denoising process reconstructs polygonal shapes, initialized and
directed by the guidance networks, as a conditional generation process subject
to the sensor data. We have evaluated our approach for reconstructing two types
of polygonal shapes: floorplan as a set of polygons and HD map for autonomous
cars as a set of polylines. Through extensive experiments on standard
benchmarks, we demonstrate that PolyDiffuse significantly advances the current
state of the art and enables broader practical applications.
Comment: Project page: https://poly-diffuse.github.io
HouseDiffusion: Vector Floorplan Generation via a Diffusion Model with Discrete and Continuous Denoising
The paper presents a novel approach for vector-floorplan generation via a
diffusion model, which denoises 2D coordinates of room/door corners with two
inference objectives: 1) a single-step noise as the continuous quantity to
precisely invert the continuous forward process; and 2) the final 2D coordinate
as the discrete quantity to establish geometric incident relationships such as
parallelism, orthogonality, and corner-sharing. Our task is graph-conditioned
floorplan generation, a common workflow in floorplan design. We represent a
floorplan as 1D polygonal loops, each of which corresponds to a room or a door.
Our diffusion model employs a Transformer architecture at the core, which
controls the attention masks based on the input graph-constraint and directly
generates vector-graphics floorplans via a discrete and continuous denoising
process. We have evaluated our approach on the RPLAN dataset. The proposed
approach significantly outperforms the state of the art on all metrics, while
being capable of generating non-Manhattan structures and controlling the exact
number of corners per room. A project website with a supplementary video and
document is available at
https://aminshabani.github.io/housediffusion
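As a rough illustration of the continuous forward process that the abstract says the model learns to invert, the sketch below applies a standard DDPM-style noising step to the 2D corner coordinates of one room loop. It is not the authors' implementation: the function name `forward_diffuse`, the linear noise schedule, the step count, and the example room are all assumptions made for illustration.

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng):
    """Sample x_t ~ q(x_t | x_0) for a generic DDPM forward process.

    x0    : (N, 2) array of corner coordinates (one polygonal loop).
    t     : integer timestep index into the noise schedule.
    betas : 1D array of per-step noise variances (assumed schedule).
    """
    alpha_bar = np.cumprod(1.0 - betas)[t]       # cumulative signal retention
    eps = rng.standard_normal(x0.shape)          # Gaussian noise, same shape as x0
    x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    return x_t, eps

# Hypothetical room: a square loop of four corners, normalized to [-1, 1].
room = np.array([[-0.5, -0.5], [0.5, -0.5], [0.5, 0.5], [-0.5, 0.5]])
betas = np.linspace(1e-4, 0.02, 1000)            # assumed linear schedule
rng = np.random.default_rng(0)
x_t, eps = forward_diffuse(room, 500, betas, rng)
```

A denoiser trained to predict `eps` from `x_t` targets the continuous quantity; the discrete prediction of final coordinates described in the abstract is not covered by this sketch.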
Hierarchical Neural Memory Network for Low Latency Event Processing
This paper proposes a low latency neural network architecture for event-based
dense prediction tasks. Conventional architectures encode entire scene contents
at a fixed rate regardless of their temporal characteristics. Instead, the
proposed network encodes contents at a proper temporal scale depending on their
movement speed. We achieve this by constructing a temporal hierarchy using
stacked latent memories that operate at different rates. Given low latency
event streams, the multi-level memories gradually extract dynamic to static
scene contents by propagating information from the fast to the slow memory
modules. The architecture not only reduces the redundancy of conventional
architectures but also exploits long-term dependencies. Furthermore, an
attention-based event representation efficiently encodes sparse event streams
into the memory cells. We conduct extensive evaluations on three event-based
dense prediction tasks, where the proposed approach outperforms the existing
methods on accuracy and latency, while demonstrating effective event and image
fusion capabilities. The code is available at https://hamarh.github.io/hmnet/
Comment: Accepted to CVPR 202
Floor-SP: Inverse CAD for Floorplans by Sequential Room-wise Shortest Path
This paper proposes a new approach for automated floorplan reconstruction
from RGBD scans, a major milestone in indoor mapping research. The approach,
dubbed Floor-SP, formulates a novel optimization problem, where room-wise
coordinate descent sequentially solves dynamic programming to optimize the
floorplan graph structure. The objective function consists of data terms guided
by deep neural networks, consistency terms encouraging adjacent rooms to share
corners and walls, and the model complexity term. The approach does not require
corner/edge detection with thresholds, unlike most other methods. We have
evaluated our system on production-quality RGBD scans of 527 apartments or
houses, including many units with non-Manhattan structures. Qualitative and
quantitative evaluations demonstrate a significant performance boost over the
current state-of-the-art. Please refer to our project website
http://jcchen.me/floor-sp/ for code and data.
Comment: 10 pages, 9 figures, accepted to ICCV 201
JigsawPlan: Room Layout Jigsaw Puzzle Extreme Structure from Motion using Diffusion Models
This paper presents a novel approach to the Extreme Structure from Motion
(E-SfM) problem, which takes a set of room layouts as polygonal curves in the
top-down view, and aligns the room layout pieces by estimating their 2D
translations and rotations, akin to solving the jigsaw puzzle of room layouts.
The biggest discovery and surprise of the paper is that the simple use of a
Diffusion Model solves this challenging registration problem as a conditional
generation process. The paper presents a new dataset of room layouts and
floorplans for 98,780 houses. The qualitative and quantitative evaluations
demonstrate that the proposed approach outperforms the competing methods by
significant margins.
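To make the registration target concrete, the sketch below shows what "estimating a 2D translation and rotation" for one room layout amounts to: a rigid transform applied to the polygon's corners in the top-down view. It only illustrates the output parameterization, not the diffusion model itself; the function name `place_room` and the example room are hypothetical.

```python
import numpy as np

def place_room(corners, theta, translation):
    """Rotate a room polygon by theta (radians, counter-clockwise)
    about the origin, then translate it.

    corners     : (N, 2) array of corner coordinates.
    translation : length-2 offset (tx, ty).
    """
    c, s = np.cos(theta), np.sin(theta)
    rotation = np.array([[c, -s], [s, c]])
    return corners @ rotation.T + np.asarray(translation)

# Hypothetical 4 x 3 rectangular room, rotated 90 degrees and shifted in x.
room = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 3.0], [0.0, 3.0]])
placed = place_room(room, np.pi / 2, (10.0, 0.0))
```

Under this parameterization, solving the puzzle means choosing one `(theta, translation)` pair per room so that the placed polygons fit together into a consistent floorplan.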