Neural 3D scene reconstruction methods have achieved impressive performance
when reconstructing complex geometry and low-textured regions in indoor scenes.
However, these methods rely heavily on 3D data, which is costly and
time-consuming to obtain in the real world. In this paper, we propose a novel
neural reconstruction method that reconstructs scenes using sparse depth under
plane constraints, without 3D supervision. We introduce a signed distance
function field, a color field, and a probability field to represent a scene. We
optimize these fields to reconstruct the scene by using differentiable ray
marching with accessible 2D images as supervision. We improve the
reconstruction quality of regions with complex geometry by using sparse depth
obtained from geometric constraints. The geometric constraints project
3D points on the surface to regions with similar appearance and features in
different 2D images. We impose plane constraints to make large planes
parallel or perpendicular to the floor. Both constraints help reconstruct
accurate and smooth geometry of the scene. Without 3D supervision,
our method achieves competitive performance on the ScanNet dataset compared
with existing methods that use 3D supervision.

Comment: 10 pages, 6 figures
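
The plane constraint described above can be illustrated with a minimal sketch. It assumes each large plane is summarized by a normal vector and that the floor normal is known. The function names and the specific penalty, min(c, 1 - c) with c = |n · f|, are hypothetical choices for illustration and are not taken from the paper. The penalty is zero when a plane is parallel to the floor (its normal is aligned with the floor normal) or perpendicular to the floor (its normal is orthogonal to the floor normal), and positive otherwise.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("zero-length normal")
    return tuple(x / norm for x in v)


def plane_penalty(plane_normal, floor_normal):
    """Hypothetical penalty: 0 if the plane is parallel or perpendicular to the floor."""
    n = normalize(plane_normal)
    f = normalize(floor_normal)
    # c = |cos(angle between normals)|: 1 for a floor-parallel plane, 0 for a wall.
    c = abs(sum(a * b for a, b in zip(n, f)))
    c = min(c, 1.0)  # guard against floating-point overshoot
    return min(c, 1.0 - c)


floor = (0.0, 0.0, 1.0)
table_top = (0.0, 0.0, 2.0)   # parallel to the floor
wall = (1.0, 0.0, 0.0)        # perpendicular to the floor
tilted = (1.0, 0.0, 1.0)      # 45 degrees from the floor, penalized
penalties = [plane_penalty(p, floor) for p in (table_top, wall, tilted)]
```

In a learned setting, a term of this kind would typically be averaged over the detected large planes and added to the rendering loss with a weight; the abstract does not specify how the constraint is actually formulated.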