MuraNet: Multi-task Floor Plan Recognition with Relation Attention
The recognition of information in floor plan data requires the use of
detection and segmentation models. However, relying on several single-task
models can result in ineffective utilization of relevant information when there
are multiple tasks present simultaneously. To address this challenge, we
introduce MuraNet, an attention-based multi-task model for segmentation and
detection tasks in floor plan data. In MuraNet, we adopt a unified encoder
called MURA as the backbone with two separate branches: an enhanced
segmentation decoder branch and a decoupled detection head branch based on
YOLOX, for segmentation and detection tasks respectively. The architecture of
MuraNet is designed to leverage the fact that walls, doors, and windows usually
constitute the primary structure of a floor plan's architecture. By jointly
training the model on both detection and segmentation tasks, we believe MuraNet
can effectively extract and utilize relevant features for both tasks. Our
experiments on the CubiCasa5k public dataset show that MuraNet improves
convergence speed during training compared to single-task models like U-Net and
YOLOv3. Moreover, we observe improvements in the average AP and IoU in
detection and segmentation tasks, respectively. Our ablation experiments
demonstrate that the attention-based unified backbone of MuraNet achieves
better feature extraction in floor plan recognition tasks, and the use of
decoupled multi-head branches for different tasks further improves model
performance. We believe that our proposed MuraNet model can address the
disadvantages of single-task models and improve the accuracy and efficiency of
floor plan data recognition.
Comment: Document Analysis and Recognition - ICDAR 2023 Workshops. ICDAR 2023.
Lecture Notes in Computer Science, vol 14193. Springer, Cha
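
The abstract reports results in terms of IoU for segmentation and AP for detection, where AP is conventionally computed by matching predicted and ground-truth boxes on their IoU. The following is a minimal sketch of both IoU variants, not the authors' evaluation code: the function names `mask_iou` and `box_iou`, the list-based mask format, the `(x1, y1, x2, y2)` box layout, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
def mask_iou(pred, target):
    """IoU of two binary masks given as equally sized 2D lists of 0/1."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def box_iou(a, b):
    """IoU of two axis-aligned boxes in (x1, y1, x2, y2) form (assumed layout)."""
    # Corners of the overlap rectangle; width/height clamp to 0 if disjoint.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

For example, a predicted wall mask `[[1, 1], [0, 0]]` against a ground truth `[[1, 0], [1, 0]]` shares one pixel out of three covered pixels, and two 2x2 boxes offset by one unit in each direction overlap in one unit of area out of seven.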