32,710 research outputs found
Ship Deck Segmentation in Engineering Document Using Generative Adversarial Networks
Generative adversarial networks (GANs) have become very popular in recent years. GANs have proved to be successful in different computer vision tasks including image-translation, image super-resolution etc. In this paper, we have used GAN models for ship deck segmentation. We have used 2D scanned raster images of ship decks provided by US Navy Military Sealift Command (MSC) to extract necessary information including ship walls, objects etc. Our segmentation results will be helpful to get vector and 3D image of a ship that can be later used for maintenance of the ship. We applied the trained models to engineering documents provided by MSC and obtained very promising results, demonstrating that GANs can be potentially good candidates for this research area
Automatic 3D building model generation using deep learning methods based on cityjson and 2D floor plans
In the past decade, a lot of effort is put into applying digital innovations to building life cycles. 3D Models have been proven to be efficient for decision making, scenario simulation and 3D data analysis during this life cycle. Creating such digital representation of a building can be a labour-intensive task, depending on the desired scale and level of detail (LOD). This research aims at creating a new automatic deep learning based method for building model reconstruction. It combines exterior and interior data sources: 1) 3D BAG, 2) archived floor plan images. To reconstruct 3D building models from the two data sources, an innovative combination of methods is proposed. In order to obtain the information needed from the floor plan images (walls, openings and labels), deep learning techniques have been used. In addition, post-processing techniques are introduced to transform the data in the required format. In order to fuse the extracted 2D data and the 3D exterior, a data fusion process is introduced. From the literature review, no prior research on automatic integration of CityGML/JSON and floor plan images has been found. Therefore, this method is a first approach to this data integration
MuraNet: Multi-task Floor Plan Recognition with Relation Attention
The recognition of information in floor plan data requires the use of
detection and segmentation models. However, relying on several single-task
models can result in ineffective utilization of relevant information when there
are multiple tasks present simultaneously. To address this challenge, we
introduce MuraNet, an attention-based multi-task model for segmentation and
detection tasks in floor plan data. In MuraNet, we adopt a unified encoder
called MURA as the backbone with two separated branches: an enhanced
segmentation decoder branch and a decoupled detection head branch based on
YOLOX, for segmentation and detection tasks respectively. The architecture of
MuraNet is designed to leverage the fact that walls, doors, and windows usually
constitute the primary structure of a floor plan's architecture. By jointly
training the model on both detection and segmentation tasks, we believe MuraNet
can effectively extract and utilize relevant features for both tasks. Our
experiments on the CubiCasa5k public dataset show that MuraNet improves
convergence speed during training compared to single-task models like U-Net and
YOLOv3. Moreover, we observe improvements in the average AP and IoU in
detection and segmentation tasks, respectively.Our ablation experiments
demonstrate that the attention-based unified backbone of MuraNet achieves
better feature extraction in floor plan recognition tasks, and the use of
decoupled multi-head branches for different tasks further improves model
performance. We believe that our proposed MuraNet model can address the
disadvantages of single-task models and improve the accuracy and efficiency of
floor plan data recognition.Comment: Document Analysis and Recognition - ICDAR 2023 Workshops. ICDAR 2023.
Lecture Notes in Computer Science, vol 14193. Springer, Cha
HouseDiffusion: Vector Floorplan Generation via a Diffusion Model with Discrete and Continuous Denoising
The paper presents a novel approach for vector-floorplan generation via a
diffusion model, which denoises 2D coordinates of room/door corners with two
inference objectives: 1) a single-step noise as the continuous quantity to
precisely invert the continuous forward process; and 2) the final 2D coordinate
as the discrete quantity to establish geometric incident relationships such as
parallelism, orthogonality, and corner-sharing. Our task is graph-conditioned
floorplan generation, a common workflow in floorplan design. We represent a
floorplan as 1D polygonal loops, each of which corresponds to a room or a door.
Our diffusion model employs a Transformer architecture at the core, which
controls the attention masks based on the input graph-constraint and directly
generates vector-graphics floorplans via a discrete and continuous denoising
process. We have evaluated our approach on RPLAN dataset. The proposed approach
makes significant improvements in all the metrics against the state-of-the-art
with significant margins, while being capable of generating non-Manhattan
structures and controlling the exact number of corners per room. A project
website with supplementary video and document is here
https://aminshabani.github.io/housediffusion
LEGO-Net: Learning Regular Rearrangements of Objects in Rooms
Humans universally dislike the task of cleaning up a messy room. If machines
were to help us with this task, they must understand human criteria for regular
arrangements, such as several types of symmetry, co-linearity or
co-circularity, spacing uniformity in linear or circular patterns, and further
inter-object relationships that relate to style and functionality. Previous
approaches for this task relied on human input to explicitly specify goal
state, or synthesized scenes from scratch -- but such methods do not address
the rearrangement of existing messy scenes without providing a goal state. In
this paper, we present LEGO-Net, a data-driven transformer-based iterative
method for learning regular rearrangement of objects in messy rooms. LEGO-Net
is partly inspired by diffusion models -- it starts with an initial messy state
and iteratively "de-noises'' the position and orientation of objects to a
regular state while reducing the distance traveled. Given randomly perturbed
object positions and orientations in an existing dataset of
professionally-arranged scenes, our method is trained to recover a regular
re-arrangement. Results demonstrate that our method is able to reliably
rearrange room scenes and outperform other methods. We additionally propose a
metric for evaluating regularity in room arrangements using number-theoretic
machinery.Comment: Project page: https://ivl.cs.brown.edu/projects/lego-ne
- …