3D Room Layout Estimation from a Cubemap of Panorama Image via Deep Manhattan Hough Transform

Author: Gao Yue
Wen Chao
Xue Zhou
Zhao Yining
Publication date: 19/07/2022

Significant geometric structures can be compactly described by global wireframes in the estimation of 3D room layout from a single panoramic image. Based on this observation, we present an alternative approach to estimate the walls in 3D space by modeling long-range geometric patterns in a learnable Hough Transform block. We transform the image features from a cubemap tile to the Hough space of a Manhattan world and directly map the features to the geometric output. The convolutional layers not only learn the local gradient-like line features, but also use global information to predict occluded walls with a simple network structure. Unlike most previous work, the predictions are performed individually on each cubemap tile and then assembled to obtain the layout estimation. Experimental results show that we achieve results comparable to recent state-of-the-art methods in prediction accuracy and performance. Code is available at https://github.com/Starrah/DMH-Net. Comment: Accepted by ECCV 202

arXiv.org e-Print Archive
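
For readers unfamiliar with the Hough space mentioned in the abstract above, the following is a minimal sketch of the classical (non-learnable) Hough line transform on a binary edge map. It is not the DMH-Net block, which the abstract describes as learnable and specific to a Manhattan world; the function name `hough_lines`, the parameter `n_theta`, and the one-degree angular resolution are assumptions made for illustration only.

```python
import numpy as np


def hough_lines(edges, n_theta=180):
    """Classical Hough line voting (illustrative; not the paper's method).

    Each edge pixel (x, y) votes for every line
    rho = x*cos(theta) + y*sin(theta) that passes through it.
    """
    h, w = edges.shape
    # Largest possible |rho| is the image diagonal.
    diag = int(np.ceil(np.hypot(h, w)))
    # Angles sampled uniformly in [0, pi).
    thetas = np.deg2rad(np.arange(n_theta) * (180.0 / n_theta))
    # Rows index rho in [-diag, diag]; columns index theta.
    acc = np.zeros((2 * diag + 1, n_theta), dtype=np.int64)
    ys, xs = np.nonzero(edges)
    for t_idx, theta in enumerate(thetas):
        rhos = np.rint(xs * np.cos(theta) + ys * np.sin(theta)).astype(int)
        # Shift by diag so negative rho values map to valid row indices.
        acc[:, t_idx] = np.bincount(rhos + diag, minlength=2 * diag + 1)
    return acc, thetas, diag


# Usage: a vertical line at x = 5 in a 20x20 edge map.
edges = np.zeros((20, 20), dtype=bool)
edges[:, 5] = True
acc, thetas, diag = hough_lines(edges)
votes = acc[diag + 5, 0]  # votes for the line rho = 5, theta = 0
```

All pixels of a straight line vote into the same accumulator cell, which is why a global structure such as a wall boundary can be detected even when parts of it are missing from the edge map.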

PanoContext-Former: Panoramic Total Scene Understanding with a Transformer

Author: Bo Liefeng
Dong Yuan
Dong Zilong
Fang Chuan
Tan Ping
Publication date: 05/06/2023

Panoramic images enable a deeper understanding and more holistic perception of the $360^\circ$ surrounding environment, and they naturally encode richer scene context information than standard perspective images. Previous work has mostly approached the scene understanding task in a bottom-up fashion, so each sub-task is processed separately and few correlations among them are explored. In this paper, we propose a novel method using a depth prior for holistic indoor scene understanding, which recovers the objects' shapes, oriented bounding boxes, and the 3D room layout simultaneously from a single panorama. To fully utilize the rich context information, we design a transformer-based context module to predict the representation of, and relationships among, the components of the scene. In addition, we introduce a real-world dataset for scene understanding, including photo-realistic panoramas, high-fidelity depth images, accurately annotated room layouts, and oriented object bounding boxes and shapes. Experiments on the synthetic and real-world datasets demonstrate that our method outperforms previous panoramic scene understanding methods in terms of both layout estimation and 3D object detection.

arXiv.org e-Print Archive