Semantic scene completion (SSC) jointly predicts the semantics and geometry
of the entire 3D scene, which plays an essential role in 3D scene understanding
for autonomous driving systems. SSC has achieved rapid progress with the help
of semantic context in segmentation. However, how to effectively exploit the
relationships between the semantic context in semantic segmentation and the
geometric structure in scene completion remains underexplored. In this
paper, we propose to solve outdoor SSC from the perspective of representation
separation and bird's-eye-view (BEV) fusion. Specifically, we present a
network, named SSC-RS,
which uses separate branches with deep supervision to explicitly disentangle
the learning procedure of the semantic and geometric representations. A BEV
fusion network equipped with the proposed Adaptive Representation Fusion (ARF)
module then aggregates the multi-scale features effectively and
efficiently. Owing to its low computational burden and powerful representation
ability, our model generalizes well while running in real time. Extensive
experiments on SemanticKITTI demonstrate that SSC-RS achieves state-of-the-art
performance.

Comment: 8 pages, 5 figures, IROS202