Three-dimensional (3D) object reconstruction based on differentiable
rendering (DR) is an active research topic in computer vision. DR-based methods
minimize the difference between the rendered and target images by optimizing
both the shape and appearance, thereby achieving high visual fidelity.
However, most approaches perform poorly on textureless objects because of
geometric ambiguity: for such objects, multiple shapes can produce the same
rendered image. To overcome this problem, we introduce active
sensing with structured light (SL) into multi-view 3D object reconstruction
based on DR to learn the unknown geometry and appearance of arbitrary scenes
together with the camera poses. More specifically, our framework uses the
pixel correspondences between different views, obtained with structured
light, as an additional constraint in the DR-based optimization of the implicit
surface representation, the color representation, and the camera poses. Because camera poses can be
optimized simultaneously, our method achieves high reconstruction accuracy in
textureless regions and reduces the camera pose calibration effort that
conventional SL-based methods require. Experimental results on both
synthetic and real data demonstrate that our system outperforms conventional
DR- and SL-based methods in surface reconstruction quality, particularly
for challenging objects with textureless or shiny surfaces.

Comment: Accepted by BMVC 202
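
To illustrate how pixel correspondences can act as an additional constraint alongside a photometric term, the following is a minimal NumPy sketch under stated assumptions. It is not the formulation used in this work: the function names, the squared-distance form of the correspondence term, and the `weight` parameter are all hypothetical. The idea shown is only that two pixels identified as corresponding by structured light should back-project to the same 3D surface point, so any gap between the two back-projected points can be penalized.

```python
import numpy as np

def backproject(origin, directions, depths):
    """Return the 3D points origin + depth * direction, one per ray."""
    return origin[None, :] + depths[:, None] * directions

def correspondence_loss(o_a, d_a, t_a, o_b, d_b, t_b):
    """Mean squared distance between the surface points predicted along
    corresponding rays of view A and view B (hypothetical form)."""
    p_a = backproject(o_a, d_a, t_a)
    p_b = backproject(o_b, d_b, t_b)
    return float(np.mean(np.sum((p_a - p_b) ** 2, axis=1)))

def photometric_loss(rendered, target):
    """Mean absolute difference between rendered and target colors."""
    return float(np.mean(np.abs(rendered - target)))

def total_loss(rendered, target, corr, weight=1.0):
    """Photometric term plus a weighted correspondence term (assumed weighting)."""
    return photometric_loss(rendered, target) + weight * corr
```

In a DR pipeline, the depths would come from the rendered implicit surface and the ray origins and directions from the current camera pose estimates, so a term of this kind would pass gradients to both geometry and poses. That requires an automatic differentiation framework rather than plain NumPy, which is used here only to keep the sketch self-contained.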